You've built images in CI and pushed them to registries. But before deploying an image to production, someone needs to verify it works. In Chapter 1, you learned that the test stage is the quality gate: the checkpoint that prevents broken code from reaching users.
This chapter teaches you how to implement that test stage in GitHub Actions. You'll write tests with pytest, measure code coverage, enforce coverage thresholds, lint your code, and configure your workflow to fail if any test fails. No exceptions. No warnings that get ignored. A single failing test stops the entire pipeline.
By the end of this chapter, you'll understand how automated tests become a safety net that developers trust, and how quality gates make deployments safer.
Without tests in CI, broken code merges silently: a change that passes on one developer's machine breaks another's feature, and nobody finds out until a user does. With tests in CI, every push is verified automatically, and a failing test blocks the change before the damage spreads.
The test stage is your defense against shipping broken code. In a team setting, tests are the only thing preventing one person's mistake from taking down everyone's work.
Python projects typically use pytest for unit testing. Let's look at what a test for your FastAPI agent looks like.
Here's your FastAPI agent with a simple endpoint:
Now here's a test for this endpoint:
The test creates a client, calls your endpoint, and asserts the response status code and body. If any assertion fails, the test fails.
Before CI, you run tests on your machine while developing: `pytest -v` discovers the test files, prints one line per test, and ends with a summary such as `1 passed in 0.03s`.
If a test fails, pytest prints the failing assertion, the expected and actual values, and the exact line number, then exits with a non-zero status code.
pytest shows exactly which assertion failed and why. This feedback loop—write code, run tests, see failures, fix code—is how developers build confidence.
A passing test is good, but does it actually test the important parts of your code? Code coverage measures what percentage of your code is executed during tests.
Use the pytest-cov plugin to measure coverage: running pytest with `--cov` (for example, `pytest --cov=main --cov-report=term-missing`) prints a per-file table of coverage percentages, including the line numbers that never executed.
Coverage shows which statements ran during your tests and which never did. If you have 86% coverage, 14% of your code isn't tested, and that untested slice is often exactly the edge cases and error handling that only trigger in production.
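For instance, in this hypothetical handler the `except` branch never runs under a happy-path test, so the coverage report flags it as missed:

```python
# A hypothetical helper whose error branch is easy to leave untested
def parse_task_id(raw: str) -> int:
    try:
        return int(raw)
    except ValueError:
        # Only executes for malformed input; stays uncovered unless
        # a test deliberately sends bad data
        return -1
```

A happy-path test (`parse_task_id("7")`) covers the `try` branch only; a second test with malformed input is needed to cover the `except` branch.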
A quality gate enforces a minimum coverage threshold. The `--cov-fail-under=80` flag makes pytest exit with a failure code if total coverage doesn't reach 80%: a run at 86% passes normally, while a run at 75% prints a message like `FAIL Required test coverage of 80% not reached` and exits non-zero. In CI, this failure stops the pipeline.
Beyond functional tests, linting checks code style and catches common mistakes. Tools like ruff or flake8 scan your code for issues without running it.
Running `ruff check .` on clean code prints `All checks passed!` and exits zero; on code with issues, it lists each violation with its file, line number, and rule code, and exits non-zero.
Linting catches unused imports and variables, undefined names, suspicious comparisons, and style violations. In CI, a linting failure stops the pipeline just like a test failure.
Now let's put this together in a GitHub Actions workflow. Your CI pipeline needs to check out the code, install dependencies, lint, run the tests with a coverage threshold, and only then build the image.
Here's a workflow that runs tests and enforces quality gates:
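A sketch of such a workflow; the file paths, package names, Python version, and the 80% threshold are assumptions you'd adapt to your project:

```yaml
# .github/workflows/ci.yml -- test gate before build (names are illustrative)
name: CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Lint
        run: ruff check .
      - name: Run tests with coverage gate
        run: pytest --cov=main --cov-fail-under=80

  build:
    runs-on: ubuntu-latest
    needs: test          # build only starts if the test job succeeds
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t my-agent:${{ github.sha }} .
```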
Explanation: This workflow defines two jobs (test and build) where build only runs if test succeeds (via needs: test).
When tests pass, the test job goes green and the build job runs. When any test fails, the test job is marked failed, the build job shows as skipped, and the workflow as a whole fails.
Notice the needs: test line in the build job. This creates a dependency: build only starts if test passes. If test fails, build never runs.
Unit tests verify individual functions. Integration tests verify components work together—especially when external services are involved. For a FastAPI agent that uses PostgreSQL, integration tests need a real database.
GitHub Actions supports service containers—temporary databases or services that spin up for your tests, then tear down.
Here's an integration test that reads from a database:
This test needs a real PostgreSQL running. GitHub Actions can provide it:
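A sketch of the `services` configuration; the image tag, credentials, database name, and schema step are assumptions:

```yaml
jobs:
  integration-test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
          POSTGRES_DB: agent_test
        ports:
          - 5432:5432
        # Don't start the steps until the database answers health checks
        options: >-
          --health-cmd "pg_isready -U test"
          --health-interval 5s
          --health-timeout 5s
          --health-retries 5
    env:
      DATABASE_URL: postgresql://test:test@localhost:5432/agent_test
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - name: Create schema
        run: psql "$DATABASE_URL" -f schema.sql
      - name: Run integration tests
        run: pytest test_db.py
```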
Explanation: This workflow adds a services section with PostgreSQL, waits for health checks, creates the database schema, then runs tests with database connectivity.
The service container automatically starts before your steps, waits until its health checks pass, exposes its port to the job, and is torn down when the job ends, so every run gets a fresh, isolated database.
Your pipeline should stop immediately when something fails. Don't continue building images and pushing to registries if tests fail.
The workflow above uses `needs: test` to enforce dependencies: the test job runs first, and build starts only after test succeeds.
If any job fails, all dependent jobs are skipped. This is fail-fast behavior.
By default, if a step fails, the job stops: when the linter step exits non-zero, the test step after it never runs and the job is marked failed.
You can override this with continue-on-error: true, but you shouldn't for quality gates. Failures should block the pipeline.
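For reference, the override looks like this (step shown in isolation):

```yaml
      - name: Lint
        run: ruff check .
        continue-on-error: true   # job keeps going even if linting fails; avoid for gates
```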
GitHub Actions can upload test reports and coverage reports as artifacts. These are stored and accessible through the GitHub UI.
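A sketch of the upload step, assuming coverage was written to `htmlcov/` (pytest-cov's default directory for `--cov-report=html`):

```yaml
      - name: Generate HTML coverage report
        run: pytest --cov=main --cov-report=html
      - name: Upload coverage report
        uses: actions/upload-artifact@v4
        with:
          name: coverage-report
          path: htmlcov/
```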
After the workflow runs, the run's summary page in the GitHub UI lists the uploaded artifacts with download links. Developers can download the HTML coverage report and see exactly which code wasn't tested.
Here's a complete workflow combining everything:
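A sketch combining everything above; image names, paths, credentials, and the threshold are assumptions:

```yaml
# .github/workflows/ci.yml -- lint, test, coverage gate, artifacts, then build
name: CI

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
          POSTGRES_DB: agent_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd "pg_isready -U test"
          --health-interval 5s
          --health-timeout 5s
          --health-retries 5
    env:
      DATABASE_URL: postgresql://test:test@localhost:5432/agent_test
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - name: Lint
        run: ruff check .
      - name: Unit and integration tests with coverage gate
        run: pytest --cov=main --cov-report=html --cov-fail-under=80
      - name: Upload coverage report
        if: always()            # upload even on failure, for debugging
        uses: actions/upload-artifact@v4
        with:
          name: coverage-report
          path: htmlcov/

  build:
    runs-on: ubuntu-latest
    needs: test               # fail-fast: never build on red tests
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t my-agent:${{ github.sha }} .
```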
When everything passes, both jobs go green and the image is built. When a test fails, the run stops in the test job, the build job is skipped, and the workflow is marked red.
Quality Gate: An automated checkpoint that must pass before the pipeline continues. If any test fails, coverage drops, or linter finds issues, the pipeline stops.
Test Coverage: The percentage of code executed by your tests. Higher coverage (80%+) reduces the risk of uncaught bugs reaching production.
Fail-Fast: Stop immediately when a quality gate fails. Don't waste resources building and pushing images if tests will reject the code.
Service Containers: Temporary databases or services that spin up for tests and tear down automatically, ensuring tests are isolated and repeatable.
Artifacts: Files (like coverage reports) uploaded to GitHub for review. Developers can download and inspect what tests covered.
Ask Claude: "I have a FastAPI application with 80% test coverage. Add quality gates to my GitHub Actions workflow that fail if coverage drops below 80% or if any linting errors are found."
Before accepting the output, read the workflow yourself: confirm the coverage flag is `--cov-fail-under=80`, that the lint step has no `continue-on-error`, and that a failure in any quality gate actually stops the pipeline.
After Claude provides the workflow, ask: "Now add integration tests that require a PostgreSQL database using GitHub Actions service containers. The tests should verify that tasks are persisted to the database."
Verify the response includes a `services` block with a PostgreSQL image, health-check options so the tests wait for the database, and an environment variable (such as `DATABASE_URL`) that the tests use to connect.
Finally, ask: "Ensure the workflow has three jobs—test, build, and push—where build only runs if test passes, and push only runs if build succeeds. Show me the complete workflow with all three jobs."
Check that build declares `needs: test`, push declares `needs: build`, and that a failure anywhere upstream leaves every downstream job skipped.
You built a gitops-deployment skill in Chapter 0. Test and improve it based on what you learned.
Ask yourself: does the skill cover quality gates, coverage thresholds, fail-fast job dependencies, and service containers? If you found gaps, update the skill with what this chapter added, then re-test it against a fresh prompt.