Testing Guide
Last updated: 2026-05-08
This document outlines the testing strategy for the project, including how to run existing tests, write new tests, and best practices for ensuring code quality.
Philosophy: Sociable Integration Tests
The test suite has been transitioned to sociable integration tests for nodes and services. Rather than mocking every dependency, tests exercise real interactions between components (e.g., a node test that calls the real i18n manager and builds a real prompt), while still mocking external I/O boundaries like the LLM or the database.
This approach catches integration bugs that unit tests miss, while remaining fast enough for CI.
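As a minimal sketch of what "sociable" means here, the example below runs a node against a real collaborator and replaces only the LLM call. `I18nManager`, `greeting_node`, and the `complete` method are hypothetical stand-ins written for this illustration, not the project's actual classes. In the suite itself such a check would be an async test function marked with pytest.mark.asyncio.

```python
import asyncio
from unittest.mock import AsyncMock


class I18nManager:
    """Hypothetical stand-in for the real i18n manager."""

    _templates = {"en": {"greeting": "Hello {name}, how can I help?"}}

    def render(self, key: str, locale: str, **values: str) -> str:
        return self._templates[locale][key].format(**values)


async def greeting_node(state: dict, i18n: I18nManager, llm) -> dict:
    """Hypothetical node: builds a real prompt, then calls the LLM boundary."""
    prompt = i18n.render("greeting", state["locale"], name=state["user_name"])
    reply = await llm.complete(prompt)
    return {**state, "prompt": prompt, "reply": reply}


async def exercise_greeting_node() -> None:
    # Only the external I/O boundary is mocked; the i18n manager is real.
    llm = AsyncMock()
    llm.complete.return_value = "Hi Ada!"

    result = await greeting_node(
        {"locale": "en", "user_name": "Ada"}, I18nManager(), llm
    )

    # The prompt comes from the real template, so a broken template fails here.
    llm.complete.assert_awaited_once_with("Hello Ada, how can I help?")
    assert result["reply"] == "Hi Ada!"


if __name__ == "__main__":
    asyncio.run(exercise_greeting_node())
```

A fully mocked unit test would stub the i18n manager as well and would keep passing even if the template and the node disagreed about placeholder names.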
Running Tests
cd api
pytest
To run a specific test module or directory:
pytest tests/agents/
pytest tests/chat/test_app_data_service.py
pytest tests/auth/ -v
To run with coverage:
pytest --cov=. --cov-report=html
Coverage configuration is in api/.coveragerc.
Test Structure
Tests live under api/tests/, organised by domain:
api/tests/
├── conftest.py # Global fixtures and SQLite engine patches
├── fixtures/ # Modular, reusable pytest fixtures
│ ├── clients.py # HTTP test client fixtures
│ ├── data.py # Seed data factories
│ ├── database.py # In-memory SQLite DB setup / teardown
│ ├── infrastructure.py # Service mocks (LLM, RAG, etc.)
│ └── services.py # Real service instances wired to test DB
├── agents/ # Agent graph and node tests
│ ├── service1/memory/ # MemoryLake unit tests
│ ├── test_dynamic_context.py
│ └── test_nodes_injection.py
├── auth/ # Authentication flow tests
│ ├── test_auth_flow.py
│ └── test_mock_idp.py
├── chat/ # AppDataService, chat init, and consent tests
│ ├── test_app_data_service.py
│ └── test_consent_flow.py
├── feedback/ # Feedback endpoint tests
├── i18n/ # i18n manager and placeholder tests
├── routers/ # API router integration tests
│ └── test_chat_init.py
└── services/ # Service-layer tests
    ├── test_chat_negotiation.py
    └── test_memory_service.py
Modular Fixtures (tests/fixtures/)
All reusable test infrastructure is defined in tests/fixtures/ and imported via conftest.py. This avoids duplication and keeps individual test files focused on assertions.
database.py
Provides an in-memory SQLite database for each test session. Key fixtures:
- db_session: async SQLAlchemy session connected to the in-memory DB.
- init_db: auto-used fixture that creates all tables before tests run and drops them after.
Running the full test suite against PostgreSQL would require a live database and make tests slow and environment-dependent. Three monkeypatches in conftest.py and one in tests/fixtures/database.py make the production database code run unmodified against SQLite:
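One property of this kind of monkeypatch is worth keeping in mind: it rebinds an attribute on a module, so code that ran `from sqlalchemy import create_engine` before the patch keeps the original function. The sketch below illustrates that behaviour with a throwaway module named `fake_sqlalchemy` (hypothetical, built in-process so the example is self-contained). It is presumably one reason the patches are applied from conftest.py, which pytest loads before it imports the test modules.

```python
import sys
import types

# Throwaway module standing in for a library (hypothetical, not SQLAlchemy).
lib = types.ModuleType("fake_sqlalchemy")
lib.create_engine = lambda url, **kwargs: ("original", url, kwargs)
sys.modules["fake_sqlalchemy"] = lib

# An "early" consumer binds the function object at import time.
from fake_sqlalchemy import create_engine as early_binding

_orig = lib.create_engine


def _patched(url, **kwargs):
    kwargs.pop("pool_size", None)  # strip a pool option, as Patch A does
    return ("patched",) + _orig(url, **kwargs)[1:]


lib.create_engine = _patched

# A "late" consumer imports after the patch and receives the wrapper.
from fake_sqlalchemy import create_engine as late_binding

early = early_binding("sqlite://", pool_size=5)  # still the original function
late = late_binding("sqlite://", pool_size=5)    # goes through the wrapper
```

Here `early` still carries `pool_size`, while `late` has had it stripped, so the order of imports relative to the patch decides which behaviour a module sees.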
Patch A: create_engine kwargs strip (conftest.py)
The production engine is configured with PostgreSQL-specific pool parameters (pool_size, max_overflow, pool_timeout, pool_recycle, pool_pre_ping). SQLite raises a TypeError if these are passed through. The patch strips them when SQLite is detected:
import sqlalchemy

_orig_create_engine = sqlalchemy.create_engine

def _mock_create_engine(url, **kwargs):
    # Drop PostgreSQL-only pool options when the target is SQLite.
    if "sqlite" in str(url):
        for key in ["pool_size", "max_overflow", "pool_timeout",
                    "pool_recycle", "pool_pre_ping"]:
            kwargs.pop(key, None)
    return _orig_create_engine(url, **kwargs)

sqlalchemy.create_engine = _mock_create_engine
Patch B: create_async_engine URL redirect (conftest.py)
Production code calls create_async_engine with a hardcoded PostgreSQL URL from settings. This patch ignores that URL entirely and always redirects to the test SQLite database:
import os
import sqlalchemy.ext.asyncio

_orig_async_engine = sqlalchemy.ext.asyncio.create_async_engine

def _mock_async_engine(url, **kwargs):
    # Discard engine options set by production code.
    kwargs.pop("async_creator", None)
    kwargs.pop("poolclass", None)
    # Ignore the production URL and always use the test database.
    return _orig_async_engine(os.environ["DATABASE_URL"], **kwargs)

sqlalchemy.ext.asyncio.create_async_engine = _mock_async_engine
Patch C: Base.metadata.schema reset (tests/fixtures/database.py)
Base sets MetaData(schema="app") at class-definition time. SQLite has no schema namespacing — the app. prefix causes sqlite3.OperationalError. The fixture sets Base.metadata.schema = None before calling create_all / drop_all:
import os

def _reset_schema_for_sqlite():
    # Base: the project's declarative base (import not shown).
    if os.environ.get("DATABASE_URL", "").startswith("sqlite"):
        Base.metadata.schema = None
conftest.py also sets DB_SCHEMA="" as an environment variable so that Base is initialised with schema=None rather than "app" from the very first import.
For the production-side Enum inherit_schema patch that ensures PostgreSQL ENUM types are created in the schema of each table that references them, see Data Layer → Database Migrations.
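The failure that the schema reset avoids can be reproduced with the standard-library sqlite3 module alone. This is a small illustration rather than project code; the `users` table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite reads a qualified name as "<attached database>.<table>".
# No database named "app" is attached, so the statement fails.
try:
    conn.execute("CREATE TABLE app.users (id INTEGER PRIMARY KEY)")
    schema_error = None
except sqlite3.OperationalError as exc:
    schema_error = str(exc)

# Without the prefix (the effect of schema=None) the same DDL succeeds.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
conn.close()
```

After running this, `schema_error` holds the message from the failed statement and `tables` contains only the unqualified table.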
services.py
Provides real service instances (not mocks) wired to the test database:
- app_data_service: AppDataService instance.
- feature_flag_service: FeatureFlagService instance.
- user_auth_service: UserAuthService instance.
data.py
Factory functions for creating seed data (users, sessions, conversations, collectives, consents) in the test database.
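A seed-data factory typically supplies unique defaults and lets each test override only the fields it cares about. The sketch below shows that pattern under stated assumptions: the `User` dataclass and `make_user` function are hypothetical, and unlike the real factories this version builds in-memory objects rather than writing rows to the test database.

```python
import itertools
from dataclasses import dataclass, field

_ids = itertools.count(1)


@dataclass
class User:
    """Hypothetical model used only for this sketch."""

    id: int
    email: str
    locale: str = "en"
    consents: list[str] = field(default_factory=list)


def make_user(**overrides) -> User:
    """Build a user with unique defaults; keyword arguments override them."""
    n = next(_ids)
    defaults = {"id": n, "email": f"user{n}@example.test"}
    return User(**{**defaults, **overrides})


alice = make_user(locale="de")
bob = make_user(consents=["analytics"])
```

Because every call draws a fresh id and email, two users created in the same test never collide on unique columns, and the test body only states the fields relevant to its assertion.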
consent.py (New)
- seed_consent_flags: seeds the enforce_consent_policy and enforce_consent_gate flags at the desired values.
- app_data_service_with_consent: AppDataService fixture pre-loaded with consent-related repositories.
clients.py
- test_client: an httpx.AsyncClient configured with the FastAPI test app and a base URL.
infrastructure.py
Mocks for external dependencies (LLM, RAG, SSE manager) to prevent real API calls in tests.
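One way to build such a boundary mock is with `unittest.mock.AsyncMock` and a `spec`, so that the mock only exposes methods the real client has. The `LLMClient` class and its `generate` method below are hypothetical placeholders for whatever interface the project's LLM client exposes.

```python
import asyncio
from unittest.mock import AsyncMock


class LLMClient:
    """Hypothetical interface of the real LLM client."""

    async def generate(self, prompt: str) -> str:
        raise RuntimeError("real network call; must not run in tests")


def make_llm_mock(reply: str = "stubbed reply") -> AsyncMock:
    # spec= restricts the mock to LLMClient's attributes, so a typo such as
    # llm.genrate(...) raises AttributeError instead of silently passing.
    llm = AsyncMock(spec=LLMClient)
    llm.generate.return_value = reply
    return llm


llm = make_llm_mock("42")
answer = asyncio.run(llm.generate("What is 6 x 7?"))
```

The mock returns the canned reply without touching the network, and `llm.generate.assert_awaited_once_with(...)` can then verify exactly which prompt reached the boundary.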
Key Test Categories
Auth Tests (tests/auth/)
- test_auth_flow.py: End-to-end auth: mock IDP token generation → POST /auth/login → consent verification → AppSession creation → UserHistory returned. Tests both consent-enforced and consent-disabled paths.
- test_mock_idp.py: Validates mock IDP token signing and the POST /mock-idp/authenticate endpoint.
Agent Tests (tests/agents/)
- test_nodes_injection.py: Sociable tests that invoke each node with a realistic Service1State and assert on the output. The LLM is mocked at the boundary.
- test_dynamic_context.py: Tests context extraction with the QAFactory dynamic Pydantic model.
- memory/test_lake.py: Unit tests for the MemoryLake queue, deduplication logic, and worker behavior.
Chat Tests (tests/chat/)
- test_app_data_service.py: Tests AppDataService methods: user creation, session init, composite-key IDPLogin lookup, UserHistory shape.
- test_consent_flow.py: Tests the full consent pipeline: user registration with consent, scope assignment, enforce_consent_policy flag behavior, invitation processing with implicit consent.
Service Tests (tests/services/)
- test_chat_negotiation.py: Tests capability negotiation: explicit capabilities, registry lookup, safe defaults.
- test_memory_service.py: Tests MemoryService read/write operations against the PostgresStore (mocked).
Router Tests (tests/routers/)
- test_chat_init.py: Integration tests for POST /chat/init, validating request/response schema and service orchestration.
i18n Tests (tests/i18n/)
- test_i18n.py: Tests YAML loading, fallback behaviour, and translation completeness.
- test_placeholders.py: Ensures all placeholder variables in YAML prompt templates have corresponding values.
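A placeholder completeness check can be sketched with the standard library, assuming the templates use `str.format`-style `{name}` placeholders (the guide does not state the syntax, so treat that as an assumption). The helper names `placeholders` and `missing_values` are made up for this example.

```python
from string import Formatter


def placeholders(template: str) -> set[str]:
    """Return the field names used in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_values(template: str, values: dict[str, str]) -> set[str]:
    """Placeholders that appear in the template but have no value supplied."""
    return placeholders(template) - set(values)


template = "Hello {user_name}, your session {session_id} is ready."
missing = missing_values(template, {"user_name": "Ada"})
```

A test built on this idea would load each YAML template, call `missing_values` with the values the node supplies, and assert that the result is empty.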
Writing New Tests
- Place tests in the appropriate domain directory under tests/.
- Import fixtures from tests/fixtures/ via conftest.py; do not redefine infrastructure fixtures in test files.
- Use pytest.mark.asyncio for all async tests.
- Mock only external I/O (LLM API, network calls). Use real service instances with the test database where possible.
- Keep assertions focused: one logical assertion per test function where practical.