Use Anthropic when your production stack already depends on Claude models. Running evals against the same model family your app ships with keeps eval results closer to real-world behavior.

Configure evalflow.yaml

Add an anthropic block under providers and set default_provider to anthropic:
providers:
  anthropic:
    api_key_env: "ANTHROPIC_API_KEY"
    default_model: "claude-3-5-haiku-20241022"

eval:
  default_provider: "anthropic"
api_key_env is the name of the environment variable that holds your key — evalflow reads the variable at runtime and never stores the key itself.
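The indirection is worth spelling out: the config file stores only the *name* of the variable, and the value is resolved from the process environment when evalflow runs. A minimal shell sketch of that lookup (the placeholder key below is illustrative, not a real key):

```shell
# Simulate the runtime lookup: the config stores only the variable NAME;
# the actual secret lives in the process environment.
export ANTHROPIC_API_KEY="sk-placeholder-not-a-real-key"

key_env="ANTHROPIC_API_KEY"          # the value of api_key_env in evalflow.yaml
key_value="$(printenv "$key_env")"   # what evalflow would read at runtime

if [ -n "$key_value" ]; then
  echo "key resolved via \$$key_env"
fi
```

Because the config only ever names the variable, evalflow.yaml stays safe to commit.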

Set your API key

export ANTHROPIC_API_KEY="your-key-here"
Add this line to your shell profile (~/.bashrc, ~/.zshrc, etc.) or a .env file so you do not have to re-export it each session. Never commit your .env file to version control.
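One way to wire up the .env approach while keeping the key out of git (a sketch; adjust paths to your project, and note that "your-key-here" is a placeholder):

```shell
# Write the key to .env ("your-key-here" is a placeholder, not a real key)
# and make sure git ignores the file.
printf 'ANTHROPIC_API_KEY=your-key-here\n' > .env
grep -qx '.env' .gitignore 2>/dev/null || echo '.env' >> .gitignore
```

Running the second line is idempotent: it appends `.env` to .gitignore only if the entry is not already present.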

Verify the connection

Run evalflow doctor to confirm evalflow can see the key before running any evals:
evalflow doctor
✓ ANTHROPIC_API_KEY set

Run evals

evalflow eval --provider anthropic
Running test cases against claude-3-5-haiku-20241022...
Quality Gate: PASS
If eval.default_provider is already set to anthropic in your evalflow.yaml, you can omit the --provider flag:
evalflow eval

Provider notes

  • Default model: claude-3-5-haiku-20241022. You can override it by setting default_model to any Claude model your account can access.
  • API key required: An Anthropic account and API key are required. Requests without a valid key will fail immediately.
  • Judge model: By default, evalflow uses Groq as the LLM judge. If you want Anthropic to serve as both the model under test and the judge, update the judge block in evalflow.yaml:
judge:
  provider: "anthropic"
  model: "claude-3-5-haiku-20241022"
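Putting the pieces together, a full evalflow.yaml with Anthropic as both the model under test and the judge would look like this. This is a sketch assembled from the fragments shown above; it introduces no keys beyond those already covered in this guide:

```yaml
providers:
  anthropic:
    api_key_env: "ANTHROPIC_API_KEY"
    default_model: "claude-3-5-haiku-20241022"

eval:
  default_provider: "anthropic"

judge:
  provider: "anthropic"
  model: "claude-3-5-haiku-20241022"
```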