Testing Guide

Last updated: 2026-05-08

This document outlines the project's testing strategy: how to run the existing tests, how to write new ones, and which practices keep code quality high.

Philosophy: Sociable Integration Tests

The test suite has been transitioned to sociable integration tests for nodes and services. Rather than mocking every dependency, tests exercise real interactions between components (e.g., a node test that calls the real i18n manager and builds a real prompt), while still mocking external I/O boundaries like the LLM or the database.

This approach catches integration bugs that unit tests miss, while remaining fast enough for CI.
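As an illustration, the sketch below tests a node together with a real translation helper and mocks only the LLM. Every name in it (I18nManager, greeting_node, the catalog keys, llm.invoke) is a hypothetical stand-in chosen for the example, not the project's actual code.

```python
from unittest.mock import Mock


class I18nManager:
    """Hypothetical stand-in for the real i18n manager (not the project's class)."""

    def __init__(self, catalog):
        self._catalog = catalog

    def t(self, key, locale):
        # Fall back to English when the locale or the key is missing.
        return self._catalog.get(locale, {}).get(key) or self._catalog["en"][key]


def greeting_node(state, i18n, llm):
    """Hypothetical node: builds a localised prompt, then calls the LLM."""
    prompt = i18n.t("greeting_prompt", state["locale"]).format(name=state["name"])
    return {**state, "reply": llm.invoke(prompt)}


def test_greeting_node_builds_localised_prompt():
    # Real collaborator: the prompt is genuinely looked up and formatted.
    i18n = I18nManager({
        "en": {"greeting_prompt": "Greet {name} warmly."},
        "fr": {"greeting_prompt": "Salue {name} chaleureusement."},
    })
    # Mocked I/O boundary: no network call is made.
    llm = Mock()
    llm.invoke.return_value = "Bonjour Ada !"

    result = greeting_node({"locale": "fr", "name": "Ada"}, i18n, llm)

    # The assertion covers the interaction between node and i18n manager.
    llm.invoke.assert_called_once_with("Salue Ada chaleureusement.")
    assert result["reply"] == "Bonjour Ada !"
```

A fully mocked unit test would stub the translation call as well, so it would not notice a broken template or a missing catalog key.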

Running Tests

cd api
pytest

To run a specific test module or directory:

pytest tests/agents/
pytest tests/chat/test_app_data_service.py
pytest tests/auth/ -v

To run with coverage:

pytest --cov=. --cov-report=html

Coverage configuration is in api/.coveragerc.

Test Structure

Tests live under api/tests/, organised by domain:

api/tests/
├── conftest.py              # Global fixtures and SQLite engine patches
├── fixtures/                # Modular, reusable pytest fixtures
│   ├── clients.py           # HTTP test client fixtures
│   ├── data.py              # Seed data factories
│   ├── database.py          # In-memory SQLite DB setup / teardown
│   ├── infrastructure.py    # Service mocks (LLM, RAG, etc.)
│   └── services.py          # Real service instances wired to test DB
├── agents/                  # Agent graph and node tests
│   ├── service1/memory/     # MemoryLake unit tests
│   ├── test_dynamic_context.py
│   └── test_nodes_injection.py
├── auth/                    # Authentication flow tests
│   ├── test_auth_flow.py
│   └── test_mock_idp.py
├── chat/                    # AppDataService, chat init, and consent tests
│   ├── test_app_data_service.py
│   └── test_consent_flow.py
├── feedback/                # Feedback endpoint tests
├── i18n/                    # i18n manager and placeholder tests
├── routers/                 # API router integration tests
│   └── test_chat_init.py
└── services/                # Service-layer tests
    ├── test_chat_negotiation.py
    └── test_memory_service.py

Modular Fixtures (tests/fixtures/)

All reusable test infrastructure is defined in tests/fixtures/ and imported via conftest.py. This avoids duplication and keeps individual test files focused on assertions.

database.py

Provides an in-memory SQLite database for each test session. Key fixtures:

  • db_session — async SQLAlchemy session connected to the in-memory DB.
  • init_db — auto-used fixture that creates all tables before tests run and drops them after.

Running the full test suite against PostgreSQL would require a live database and make tests slow and environment-dependent. Instead, two monkeypatches in conftest.py (Patches A and B) and one in tests/fixtures/database.py (Patch C) let the production database code run unmodified against SQLite:

Patch A — create_engine kwargs strip (conftest.py)
The production engine is configured with PostgreSQL-oriented pool parameters (pool_size, max_overflow, pool_timeout, pool_recycle, pool_pre_ping). SQLAlchemy raises a TypeError when the pool class it selects for SQLite is handed arguments it does not accept, so the patch strips them whenever the URL refers to SQLite:

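import sqlalchemy

# Keep a reference to the real factory so the wrapper can delegate to it.
_orig_create_engine = sqlalchemy.create_engine

def _mock_create_engine(url, **kwargs):
    # Drop the PostgreSQL-oriented pool settings only when targeting SQLite.
    if "sqlite" in str(url):
        for key in ["pool_size", "max_overflow", "pool_timeout",
                    "pool_recycle", "pool_pre_ping"]:
            kwargs.pop(key, None)
    return _orig_create_engine(url, **kwargs)

sqlalchemy.create_engine = _mock_create_engine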
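The same kwargs-stripping technique can also be applied for a limited scope with unittest.mock.patch.object, which restores the original on exit. The sketch below is an alternative, not the project's implementation. It uses a stand-in namespace (fake_sqlalchemy, fake_create_engine, both hypothetical) so it runs without SQLAlchemy installed.

```python
from types import SimpleNamespace
from unittest.mock import patch

POOL_ONLY_KWARGS = ("pool_size", "max_overflow", "pool_timeout",
                    "pool_recycle", "pool_pre_ping")


def fake_create_engine(url, **kwargs):
    """Stand-in for sqlalchemy.create_engine: records how it was called."""
    return {"url": str(url), "kwargs": kwargs}


# Hypothetical namespace playing the role of the sqlalchemy module.
fake_sqlalchemy = SimpleNamespace(create_engine=fake_create_engine)


def strip_pool_kwargs(original):
    """Wrap an engine factory so pool settings are dropped for SQLite URLs."""
    def wrapper(url, **kwargs):
        if "sqlite" in str(url):
            for key in POOL_ONLY_KWARGS:
                kwargs.pop(key, None)
        return original(url, **kwargs)
    return wrapper


wrapped = strip_pool_kwargs(fake_sqlalchemy.create_engine)
with patch.object(fake_sqlalchemy, "create_engine", wrapped):
    sqlite_engine = fake_sqlalchemy.create_engine(
        "sqlite:///:memory:", pool_size=5, echo=True)
    pg_engine = fake_sqlalchemy.create_engine(
        "postgresql://host/db", pool_size=5)

# Outside the with-block the original factory is back in place.
restored = fake_sqlalchemy.create_engine is fake_create_engine
```

Only the SQLite call loses its pool settings. Unrelated arguments such as echo pass through, and non-SQLite URLs are left untouched.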

Patch B — create_async_engine URL redirect (conftest.py)
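Production code calls create_async_engine with a PostgreSQL URL taken from settings. This patch ignores that URL and always connects to the database named by the DATABASE_URL environment variable, which the test setup points at SQLite. It also drops the async_creator and poolclass arguments before delegating: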

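import os

import sqlalchemy.ext.asyncio

# Keep a reference to the real factory so the wrapper can delegate to it.
_orig_async_engine = sqlalchemy.ext.asyncio.create_async_engine

def _mock_async_engine(url, **kwargs):
    # Discard arguments that only make sense for the production engine.
    kwargs.pop("async_creator", None)
    kwargs.pop("poolclass", None)
    # Ignore the caller's URL; always use the test database URL.
    return _orig_async_engine(os.environ["DATABASE_URL"], **kwargs)

sqlalchemy.ext.asyncio.create_async_engine = _mock_async_engine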
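Replacing a module attribute in this way only affects lookups made after the patch. A name bound earlier with a from-import keeps pointing at the original function, which is presumably why the patch lives in conftest.py, where it runs before application modules are imported. The sketch below demonstrates the effect with a stand-in module (lib, hypothetical) instead of SQLAlchemy.

```python
import types

# Stand-in for a library module; it plays the role of sqlalchemy.ext.asyncio.
lib = types.ModuleType("lib")
lib.create_async_engine = lambda url, **kwargs: f"engine({url})"

# Equivalent to `from lib import create_async_engine` executed BEFORE patching.
early_bound = lib.create_async_engine

_orig = lib.create_async_engine


def _redirect(url, **kwargs):
    # Ignore the requested URL and always use the test database.
    return _orig("sqlite+aiosqlite:///:memory:", **kwargs)


lib.create_async_engine = _redirect

# Attribute lookup at call time sees the patch ...
late = lib.create_async_engine("postgresql+asyncpg://prod/db")
# ... but the name bound before the patch still calls the original.
early = early_bound("postgresql+asyncpg://prod/db")
```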

Patch C — Base.metadata.schema reset (tests/fixtures/database.py)
Base sets MetaData(schema="app") at class-definition time. SQLite has no schema namespacing — the app. prefix causes sqlite3.OperationalError. The fixture sets Base.metadata.schema = None before calling create_all / drop_all:

def _reset_schema_for_sqlite():
    if os.environ.get("DATABASE_URL", "").startswith("sqlite"):
        Base.metadata.schema = None

conftest.py also sets DB_SCHEMA="" as an environment variable so Base is initialised with schema=None rather than "app" from the very first import.
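The failure that Patch C avoids can be reproduced with the standard-library sqlite3 module alone. In this minimal sketch the table name users is made up for the example. SQLite reads the app. prefix as the name of an attached database, and no such database exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A schema-qualified name fails because no database named "app" is attached.
try:
    conn.execute("CREATE TABLE app.users (id INTEGER PRIMARY KEY)")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# The same statement without the prefix succeeds.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
tables = [row[0] for row in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
conn.close()
```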

For the production-side Enum inherit_schema patch that ensures PostgreSQL ENUM types are created in the schema of each table that references them, see Data Layer → Database Migrations.

services.py

Provides real service instances (not mocks) wired to the test database:

  • app_data_service: AppDataService instance.
  • feature_flag_service: FeatureFlagService instance.
  • user_auth_service: UserAuthService instance.

data.py

Factory functions for creating seed data (users, sessions, conversations, collectives, consents) in the test database.
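A seed-data factory usually supplies sensible defaults and lets each test override only the fields it cares about. The sketch below shows that pattern with the standard-library sqlite3 module. The function make_user, the users table, and its columns are hypothetical and do not reflect the project's actual models or its async SQLAlchemy session.

```python
import sqlite3
from itertools import count

_seq = count(1)  # gives every generated user a unique default email


def make_user(conn, **overrides):
    """Insert a user row with default values, overridden per test as needed."""
    n = next(_seq)
    row = {"email": f"user{n}@example.test", "locale": "en"}
    row.update(overrides)
    cur = conn.execute(
        "INSERT INTO users (email, locale) VALUES (:email, :locale)", row)
    conn.commit()
    return {"id": cur.lastrowid, **row}


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, email TEXT NOT NULL, locale TEXT NOT NULL)")

alice = make_user(conn, email="alice@example.test")  # override one field
other = make_user(conn, locale="de")                 # keep the default email
```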

consent.py (New)

  • seed_consent_flags — seeds enforce_consent_policy and enforce_consent_gate flags at desired values.
  • app_data_service_with_consent: AppDataService fixture pre-loaded with consent-related repositories.

clients.py

  • test_client — an httpx.AsyncClient configured with the FastAPI test app and a base URL.

infrastructure.py

Mocks for external dependencies (LLM, RAG, SSE manager) to prevent real API calls in tests.
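A boundary mock for an async LLM client can be built with unittest.mock.AsyncMock, as in the sketch below. The helper make_llm_mock, the method name ainvoke, and the answer function are assumptions made for the example. The project's real client interface may differ.

```python
import asyncio
from unittest.mock import AsyncMock


def make_llm_mock(reply="stubbed reply"):
    """Return an async LLM double that never touches the network."""
    llm = AsyncMock(name="llm")
    llm.ainvoke.return_value = reply  # value produced when ainvoke is awaited
    return llm


async def answer(llm, question):
    """Hypothetical code under test: formats a prompt and awaits the LLM."""
    return await llm.ainvoke(f"Q: {question}")


llm = make_llm_mock()
result = asyncio.run(answer(llm, "ping"))

# The mock records awaits, so the exact prompt can be asserted on.
llm.ainvoke.assert_awaited_once_with("Q: ping")
```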

Key Test Categories

Auth Tests (tests/auth/)

  • test_auth_flow.py — End-to-end auth: mock IDP token generation → POST /auth/login → consent verification → AppSession creation → UserHistory returned. Tests both consent-enforced and consent-disabled paths.
  • test_mock_idp.py — Validates mock IDP token signing and POST /mock-idp/authenticate endpoint.

Agent Tests (tests/agents/)

  • test_nodes_injection.py — Sociable tests that invoke each node with a realistic Service1State and assert on the output. The LLM is mocked at the boundary.
  • test_dynamic_context.py — Tests context extraction with the QAFactory dynamic Pydantic model.
  • memory/test_lake.py — Unit tests for the MemoryLake queue, deduplication logic, and worker behavior.

Chat Tests (tests/chat/)

  • test_app_data_service.py — Tests AppDataService methods: user creation, session init, composite-key IDPLogin lookup, UserHistory shape.
  • test_consent_flow.py — Tests the full consent pipeline: user registration with consent, scope assignment, enforce_consent_policy flag behavior, invitation processing with implicit consent.

Service Tests (tests/services/)

  • test_chat_negotiation.py — Tests capability negotiation: explicit capabilities, registry lookup, safe defaults.
  • test_memory_service.py — Tests MemoryService read/write operations against the PostgresStore (mocked).

Router Tests (tests/routers/)

  • test_chat_init.py — Integration tests for POST /chat/init, validating request/response schema and service orchestration.

i18n Tests (tests/i18n/)

  • test_i18n.py — Tests YAML loading, fallback behaviour, and translation completeness.
  • test_placeholders.py — Ensures all placeholder variables in YAML prompt templates have corresponding values.
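A placeholder check of the kind described for test_placeholders.py can be sketched with string.Formatter from the standard library. This assumes the templates use str.format-style {name} placeholders, which the guide does not state. The helper names are hypothetical.

```python
from string import Formatter


def placeholders(template):
    """Return the set of named fields used in a str.format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def missing_placeholders(template, values):
    """Return the placeholders for which no value has been supplied."""
    return placeholders(template) - set(values)


template = "Hello {name}, you have {count} new messages. Use {{braces}} freely."
found = placeholders(template)
missing = missing_placeholders(template, {"name": "Ada"})
```

Doubled braces are treated as literal text, so they are not reported as placeholders.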

Writing New Tests

  1. Place tests in the appropriate domain directory under tests/.
  2. Import fixtures from tests/fixtures/ via conftest.py — do not redefine infrastructure fixtures in test files.
  3. Use pytest.mark.asyncio for all async tests.
  4. Mock only external I/O (LLM API, network calls). Use real service instances with the test database where possible.
  5. Keep assertions focused — one logical assertion per test function where practical.