Use Ollama when you want to run evals entirely on your own machine without sending data to an external API. Ollama runs models locally, so it works offline and requires no API key or billing account. It is useful for local development, demos, air-gapped environments, and teams that want to avoid external model calls during iteration.
The Ollama provider requires Ollama to be installed and running on your machine before you can use it with evalflow. Pull the model you intend to use with ollama pull <model> before running evals.

Configure evalflow.yaml

Add an ollama block under providers. Ollama runs locally, so you do not need a real API key, but the api_key_env field must still be present for evalflow to validate your config:
providers:
  ollama:
    api_key_env: "OLLAMA_API_KEY"
    default_model: "llama3.2"

eval:
  default_provider: "ollama"
Even though Ollama does not require an API key, evalflow’s config schema requires the api_key_env field. Set the environment variable to any non-empty placeholder value so evalflow doctor passes its check.

Set the placeholder environment variable

export OLLAMA_API_KEY="local"
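The doctor check only requires that the variable exist and be non-empty. As an illustration of that kind of validation (a hypothetical sketch, not evalflow's actual code), the check reduces to:

```python
import os

def check_api_key_env(var_name: str) -> bool:
    """Pass if the environment variable is set to any non-empty value.
    Hypothetical sketch of a config-validation check, not evalflow internals."""
    return bool(os.environ.get(var_name, "").strip())

os.environ["OLLAMA_API_KEY"] = "local"  # placeholder, as in the export above
print(check_api_key_env("OLLAMA_API_KEY"))  # → True
```

Any non-empty string works; the value itself is never sent to Ollama.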

Start the Ollama server

evalflow communicates with Ollama over its local HTTP server. Start it before running any evals:
ollama serve
Leave this running in a separate terminal, or run it as a background service.
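You can also probe the server yourself before involving evalflow. The sketch below (not part of evalflow) calls Ollama's /api/tags endpoint, which lists the models you have pulled, on Ollama's default port 11434:

```python
import json
import urllib.request
import urllib.error

def ollama_is_up(base_url: str = "http://127.0.0.1:11434") -> bool:
    """Probe a local Ollama server by listing installed models via /api/tags."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=2) as resp:
            models = json.load(resp).get("models", [])
            print("Ollama is up; models:", [m["name"] for m in models])
            return True
    except (urllib.error.URLError, OSError):
        print("Ollama is not reachable -- did you run `ollama serve`?")
        return False
```

If this returns False, start ollama serve and confirm nothing else is bound to port 11434.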

Verify the connection

Run evalflow doctor with the --check-providers flag to confirm evalflow can reach the local Ollama server:
evalflow doctor --check-providers
✓ ollama health check

Run evals

evalflow eval --provider ollama
Running test cases against llama3.2...
Quality Gate: PASS
If eval.default_provider is already set to ollama in your evalflow.yaml, you can omit the --provider flag:
evalflow eval

Provider notes

  • Default model: llama3.2. You can set default_model to any model you have pulled with ollama pull.
  • Offline support: Once a model is pulled, evals run fully offline. No network connection is required.
  • No billing: Ollama is free and open source. There are no API costs or rate limits.
  • Performance: Inference speed depends on your hardware. Running large models on a CPU is significantly slower than running them on a GPU.
  • Judge model: By default, evalflow uses Groq as the LLM judge, which requires network access. To keep everything local, configure Ollama as the judge provider too:
judge:
  provider: "ollama"
  model: "llama3.2"
For a fully offline eval pipeline, set both eval.default_provider and judge.provider to ollama.
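Putting the pieces together, a complete evalflow.yaml for a fully offline pipeline might look like this (model names are examples; use any model you have pulled):

```yaml
providers:
  ollama:
    api_key_env: "OLLAMA_API_KEY"
    default_model: "llama3.2"

eval:
  default_provider: "ollama"

judge:
  provider: "ollama"
  model: "llama3.2"
```

With this config, both the evaluated model and the judge run against the local Ollama server, so no network access is needed once the models are pulled.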