1. Install evalflow in your repo

Run the following commands at the root of your repository to install evalflow and create the initial config and dataset files.
pip install evalflow
evalflow init
evalflow init creates two files:
evalflow.yaml created
evals/dataset.json created
Run evalflow doctor locally to confirm your setup is valid before pushing.
2. Create the workflow file

Create .github/workflows/evalflow.yml in your repository with the following content:
# .github/workflows/evalflow.yml
name: LLM Quality Gate

on:
  pull_request:
    paths:
      - "prompts/**"
      - "evals/**"
      - "**.py"
      - "evalflow.yaml"

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install evalflow
      - run: evalflow doctor --no-provider-check
      - run: evalflow eval
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
The paths filter limits the workflow to pull requests that touch prompts, evals, Python files, or your evalflow config. Adjust the patterns to match your repository layout, or remove the paths block to run on every pull request.
Run evalflow doctor --no-provider-check before evalflow eval so setup errors surface as a distinct step in the GitHub Actions log.
3. Add repository secrets

Store your provider API key as a GitHub Actions secret so it is never visible in logs or workflow files.
  1. Go to your repository on GitHub.
  2. Click Settings → Secrets and variables → Actions.
  3. Click New repository secret.
  4. Set the name and value:
Name:  OPENAI_API_KEY
Value: your real provider key
If you use a different provider, name the secret to match your evalflow.yaml configuration (for example GROQ_API_KEY or ANTHROPIC_API_KEY) and reference it the same way in the env block of your workflow file.
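The same secret can also be created from a terminal. The sketch below assumes the GitHub CLI (gh) is installed, authenticated, and run from inside the repository; set_provider_secret is a hypothetical helper name, not part of evalflow.

```shell
# Store a provider key as a repository secret via the GitHub CLI.
# Assumption: `gh` is installed, authenticated, and run inside the repository.
set_provider_secret() {
  # Default to the secret name used in the workflow file above.
  name="${1:-OPENAI_API_KEY}"
  # Without --body, gh prompts for the value interactively, so the key
  # does not end up in shell history.
  gh secret set "$name"
}
```

Calling set_provider_secret ANTHROPIC_API_KEY prompts for the key and stores it under that name. The web UI steps above achieve the same result.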

How blocking merges works

evalflow uses exit codes to communicate the result of a run:
0  — all evals passed
1  — quality regression detected
2  — setup or provider error
GitHub Actions treats any non-zero exit code as a step failure. When the evalflow eval step exits with 1 or 2, the job fails and the pull request shows a failing check. A failing check alone does not prevent merging. To block the merge, go to Settings → Branches → Branch protection rules and add the eval job as a required status check for your default branch.
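When running evalflow locally or in a custom script, it can help to translate the exit code into the meanings listed above. The following is a minimal sketch; describe_exit is a hypothetical helper, and the evalflow invocation is left commented out so the snippet runs without evalflow installed.

```shell
# Map an evalflow exit code to the meaning documented above.
# describe_exit is a hypothetical helper, not an evalflow command.
describe_exit() {
  case "$1" in
    0) echo "all evals passed" ;;
    1) echo "quality regression detected" ;;
    2) echo "setup or provider error" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage sketch:
# evalflow eval
# code=$?
# describe_exit "$code"
# exit "$code"   # propagate the original code so CI still fails
```

Propagating the original exit code matters: if the wrapper script exits with 0, the workflow step passes even when a regression was detected.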

Security

Never hardcode API keys in your workflow YAML. Anyone with read access to the repository can see workflow files.
  • Always store provider keys in GitHub Secrets or organization secrets.
  • Keep .env out of version control by adding it to .gitignore.
  • Use evalflow doctor locally before pushing so CI failures reflect model quality, not missing setup.
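The .gitignore advice above can be applied with a small idempotent helper. This is a sketch only; ensure_ignored is a hypothetical name and is not part of evalflow or git.

```shell
# Append a pattern to a .gitignore file only if it is not already listed.
# ensure_ignored is a hypothetical helper, not part of evalflow or git.
ensure_ignored() {
  pattern="$1"
  file="${2:-.gitignore}"
  touch "$file"
  # Nothing to do when the exact line is already present.
  if grep -qxF -- "$pattern" "$file"; then
    return 0
  fi
  # Add a newline first if the file does not end with one.
  if [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  printf '%s\n' "$pattern" >> "$file"
}
```

Run ensure_ignored ".env" at the repository root. If .env was committed earlier, ignoring it does not untrack it; git rm --cached .env removes it from the index while keeping the local file.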