Use OpenAI when you want the simplest setup and broad model coverage. It is a good default if your application already calls OpenAI in production, since your evals will run against the same models your users interact with.
Add an `openai` block under `providers` and set `eval.default_provider` so evalflow knows which provider to use when you run `evalflow eval` without a `--provider` flag:
```yaml
providers:
  openai:
    api_key_env: "OPENAI_API_KEY"
    default_model: "gpt-4o-mini"

eval:
  default_provider: "openai"
```
`api_key_env` is the name of the environment variable that holds your key. evalflow reads the variable at runtime and never stores the key itself.
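That resolution step can be sketched in Python. This is an illustration of the described behavior, not evalflow's actual code, and the function name `resolve_api_key` is hypothetical:

```python
import os

def resolve_api_key(api_key_env: str) -> str:
    """Look up the variable named in config at call time; never persist the value."""
    key = os.environ.get(api_key_env)
    if not key:
        raise RuntimeError(f"environment variable {api_key_env} is not set")
    return key
```

Only the variable *name* ever appears in `evalflow.yaml`; the secret itself lives solely in your shell environment.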
### Set your API key
```shell
export OPENAI_API_KEY="your-key-here"
```
Add this line to your shell profile (`~/.bashrc`, `~/.zshrc`, etc.) or a `.env` file so you do not have to re-export it each session. Never commit your `.env` file to version control.
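For reference, `.env` loading can be sketched minimally in Python. Real projects typically use a library such as python-dotenv; this illustrative parser handles only simple `NAME=value` lines:

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal .env loader: skips blanks and comments, keeps existing exports."""
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            # setdefault: values already exported in the shell take precedence
            os.environ.setdefault(name.strip(), value.strip().strip('"'))
```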
### Verify the connection
Run `evalflow doctor` to confirm evalflow can see the key before running any evals.
### Run evals
```shell
$ evalflow eval --provider openai
Running test cases against gpt-4o-mini...
Quality Gate: PASS
```
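The `Quality Gate: PASS` verdict is an aggregate over individual test-case results. As a hypothetical sketch, a gate might pass when the pass rate meets a threshold; the rule and the 0.9 default below are assumptions, not evalflow's documented behavior:

```python
def quality_gate(results: list[bool], threshold: float = 0.9) -> str:
    # Assumed rule: PASS when the fraction of passing cases meets the threshold.
    pass_rate = sum(results) / len(results)
    return "PASS" if pass_rate >= threshold else "FAIL"
```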
If `eval.default_provider` is already set to `openai` in your `evalflow.yaml`, you can omit the `--provider` flag and run `evalflow eval` on its own.
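The precedence described above (flag wins, config is the fallback) can be sketched as follows; `pick_provider` is an illustrative helper, not part of evalflow:

```python
from typing import Optional

def pick_provider(config: dict, cli_provider: Optional[str] = None) -> str:
    # An explicit --provider flag wins; otherwise use eval.default_provider.
    if cli_provider:
        return cli_provider
    return config["eval"]["default_provider"]
```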
### Provider notes
- Default model: `gpt-4o-mini`. You can override it by setting `default_model` to any model name your account has access to (for example, `gpt-4o` or `o1-mini`).
- API key required: OpenAI requires a paid account or active free-tier credits. Requests without a valid key will fail immediately.
- Judge model: By default, evalflow uses Groq as the LLM judge. If you want OpenAI to serve as both the model under test and the judge, update the `judge` block in `evalflow.yaml`:

  ```yaml
  judge:
    provider: "openai"
    model: "gpt-4o-mini"
  ```
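Conceptually, an LLM judge receives the test case and the model's answer and returns a verdict. The sketch below shows prompt construction only; the actual prompt evalflow sends is internal and will differ, and `build_judge_prompt` is a hypothetical name:

```python
def build_judge_prompt(question: str, answer: str, criteria: str) -> str:
    # Illustrative only: evalflow's real judge prompt is not documented here.
    return (
        "You are an impartial judge. Evaluate the answer below.\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        f"Criteria: {criteria}\n"
        "Reply with PASS or FAIL and a one-sentence reason."
    )
```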