Agentic framework and structure
Last updated: 2026-05-08

service1 agent. This diagram is automatically generated by the graph definition in api/agents/service1/graph.py.

Implementation details
The main agent logic is implemented using langgraph and is located in the api/agents/service1/ directory. The structure is modular, breaking down the agent’s functionality into distinct components:
- graph.py: Defines the langgraph StateGraph, connecting all the nodes and edges that constitute the agent’s logic. It also includes code to automatically generate a Mermaid diagram of the graph’s structure.
- core/: Contains the core components of the agent.
  - state.py: Defines the Service1State TypedDict, which tracks the agent’s state throughout the conversation, including messages, context, and actions.
  - llm_client.py: Manages the lazy-loaded LLM client (ChatGoogleGenerativeAI) and other services like RAG and Analytics.
  - registry.py: Holds the tool registry, mapping tool names (used in YAML manifests) to their callable implementations.
- nodes/: Each file in this directory corresponds to a specific node in the graph, encapsulating a particular piece of logic (e.g., giving advice, collecting context).
  - nodes/manifests/: YAML manifest files, one per node, declaring the prompt structure, tools, and output schema for that node (see NodeEnv below).
- memory/: Contains the MemoryLake component for asynchronous, persistent agent memory (see Persistent Memory Lake below).
- routers/: Contains the conditional routing logic that directs the flow of the conversation between different nodes based on the current state.
- tools/: LangGraph tool definitions (research_tools.py, contact_tools.py) that nodes can invoke via the research_tools tool node.
- utils/: Provides helper functions, the NodeEnv compiler (node_env.py), context extraction helpers (context.py), and output formatters (formatters.py).
The agent is designed as a state machine where each node transition is determined by the output of the previous node and the conditional logic in the routers.
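To make the state-machine framing concrete, here is a minimal, framework-free sketch. It deliberately does not use langgraph, and it is not the project's code: `SketchState`, the two node functions, `next_node`, and `run_turn` are hypothetical stand-ins, and the state holds only a few of the fields mentioned on this page.

```python
from typing import Callable, Dict, Optional, TypedDict


class SketchState(TypedDict):
    """Simplified stand-in for Service1State (the field set is an assumption)."""
    messages: list
    context_complete: bool
    action: Optional[str]


def collect_context(state: SketchState) -> SketchState:
    # Pretend the user's last message answered every context question.
    return {**state, "context_complete": True}


def give_advice(state: SketchState) -> SketchState:
    reply = "advice based on collected context"
    return {**state, "messages": state["messages"] + [reply], "action": "user_feedback"}


NODES: Dict[str, Callable[[SketchState], SketchState]] = {
    "collect_context": collect_context,
    "give_advice": give_advice,
}


def next_node(current: str, state: SketchState) -> Optional[str]:
    """Router: the next node depends on the state the previous node produced."""
    if current == "collect_context":
        return "give_advice" if state["context_complete"] else None
    return None  # give_advice ends the turn in this sketch


def run_turn(state: SketchState, entry: str = "collect_context") -> SketchState:
    """Run nodes one after another until a router returns None."""
    node: Optional[str] = entry
    while node is not None:
        state = NODES[node](state)
        node = next_node(node, state)
    return state
```

Each node returns a new state rather than mutating its input, which keeps transitions easy to test in isolation; whether the real nodes follow the same convention is not stated here.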
Node descriptions
- agent1: A central router node that uses the LLM to determine the next high-level action (e.g., give_advice, collect_context) based on the current conversation history. Its prompt and output schema are declared in nodes/manifests/agent1.yaml.
- collect_context: This node manages the initial phase of the conversation, guiding the user through a series of questions defined in the i18n configuration (see Dynamic Context Collection). It continues to ask questions until all required context (context_complete flag) has been collected.
- ask_for_context: A supplementary node to collect_context. The conversation is routed here if agent1 determines that the user’s situation requires further clarification outside the standard initial questions.
- give_advice: The core node responsible for generating a helpful, empathetic response to the user’s situation. It can trigger the research_tools tool node to enrich its answer with information from ingested RAG documents.
- research_tools: A LangGraph ToolNode that executes tool calls emitted by give_advice. Currently integrates research_educational_strategies for RAG-backed document search. The node generates multiple diverse RAG queries, performs parallelised vector searches, and filters results by an LLM-based relevance score before returning them to give_advice.
- ongoing_support: After the initial advice is given, this node handles the continuing conversation. It provides follow-up support, answers additional questions, and maintains context from the conversation summary stored in the state.
- summarize_conversation: Triggered at the end of a conversational loop. It generates a concise summary of the interaction. The summary is persisted asynchronously via the MemoryLake (see below), allowing context to be retained across sessions.
- user_feedback: A terminal node in the main advice-giving flow. It allows the conversation to end gracefully, awaiting further input from the user. If the user continues the conversation, the flow restarts through the appropriate router.
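The research_tools flow (several query variants, concurrent searches, relevance filtering) can be sketched as follows. This is an illustration under loud assumptions: the corpus, `expand_queries`, `vector_search`, and `relevance` are hypothetical stand-ins that use simple keyword overlap where the real node would call the LLM and the vector database.

```python
import asyncio
from typing import List, Tuple

# Hypothetical in-memory corpus standing in for the vector database.
CORPUS = [
    "Strategies for talking with a teenager about school stress",
    "Setting screen-time boundaries as a parent",
    "Recipes for quick weeknight dinners",
]


def expand_queries(question: str) -> List[str]:
    """Stand-in for LLM query generation: produce a few query variants."""
    return [question, f"{question} strategies", f"how to handle {question}"]


async def vector_search(query: str) -> List[str]:
    """Stand-in for a vector search: naive keyword overlap."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    words = set(query.lower().split())
    return [doc for doc in CORPUS if words & set(doc.lower().split())]


def relevance(question: str, doc: str) -> float:
    """Stand-in for the LLM relevance score: fraction of question words found."""
    words = set(question.lower().split())
    return len(words & set(doc.lower().split())) / len(words)


async def research(question: str, min_score: float = 0.5) -> List[Tuple[str, float]]:
    queries = expand_queries(question)
    # Run all searches concurrently, then de-duplicate while preserving order.
    batches = await asyncio.gather(*(vector_search(q) for q in queries))
    unique = list(dict.fromkeys(doc for batch in batches for doc in batch))
    scored = [(doc, relevance(question, doc)) for doc in unique]
    return [(doc, score) for doc, score in scored if score >= min_score]


results = asyncio.run(research("school stress"))
```

De-duplicating before scoring means each document is scored once even when several query variants retrieve it, which matters when scoring is an LLM call.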
Routings
Routing is managed by conditional edges that evaluate the agent’s state (Service1State) to determine the next node.
- should_collect_context: The main entry-point router. Routes to agent1 if context is already complete or if the session is in ongoing-support mode; otherwise defaults to collect_context.
- give_advice_after_context_collection: Fired after collect_context completes. Routes directly to give_advice when all context questions have been answered, bypassing agent1 for speed.
- router: The primary action router after agent1. Takes the action field set by agent1 and routes to the corresponding node (give_advice, ongoing_support, user_feedback, or summarize_conversation). Also handles the fast-path to ongoing_support when the session is already in support mode and the last message is from the human.
- advice_router: Manages the flow after give_advice. Uses tools_condition to detect pending tool calls and route to research_tools; proceeds to summarize_conversation when summarisation is flagged (should_summarize); otherwise returns the action from state.
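As a rough illustration of the first and third routers, the functions below return the name of the next node from a plain dictionary state. In LangGraph, functions of this shape are typically registered with `StateGraph.add_conditional_edges`. Apart from `action`, the field names (`context_complete` as a state key, `ongoing_support`, `last_message_from_human`) are assumptions made for this sketch, not the actual Service1State schema.

```python
from typing import Optional, TypedDict


class RouterState(TypedDict, total=False):
    """Subset of the state used by this sketch; field names are assumptions."""
    context_complete: bool
    ongoing_support: bool
    last_message_from_human: bool
    action: Optional[str]


ACTION_TARGETS = {
    "give_advice",
    "ongoing_support",
    "user_feedback",
    "summarize_conversation",
}


def should_collect_context(state: RouterState) -> str:
    """Entry-point router: skip context collection when it is not needed."""
    if state.get("context_complete") or state.get("ongoing_support"):
        return "agent1"
    return "collect_context"


def router(state: RouterState) -> str:
    """Action router after agent1, including the ongoing-support fast path."""
    if state.get("ongoing_support") and state.get("last_message_from_human"):
        return "ongoing_support"
    action = state.get("action")
    if action in ACTION_TARGETS:
        return action
    # Failing loudly on an unexpected action is a choice made for this sketch.
    raise ValueError(f"unknown action: {action!r}")
```

Keeping routers as pure functions of the state makes each branch checkable with a one-line assertion, without running the LLM or the graph.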
Advanced Concepts
NodeEnv — Declarative Prompt System
Nodes no longer hard-code their prompts or tool bindings in Python. Instead, each node declares its requirements in a YAML manifest stored in nodes/manifests/:

    # Example: nodes/manifests/give_advice.yaml
    node: give_advice
    prompts:
      base: give_advice_system_prompt
      snippets:
        - first_advice_extension
    tools:
      catalog:
        - tool: research_educational_strategies
          condition: first_advice
    response_schema: GiveAdviceAnswer

The NodeEnv class (utils/node_env.py) reads the manifest at runtime and:
- Resolves the base system prompt from the i18n layer for the current language and user type.
- Appends any declared supplementary snippets (e.g., first_advice_extension).
- Filters the tools catalog against optional state conditions (e.g., only bind the research tool for the very first advice turn).
- Binds the resolved tool list or response schema to the LLM using bind_tools / with_structured_output.
This design means prompt content can be adjusted in the YAML files or i18n layer without any Python changes. A refresh_manifest_schema.py utility keeps the JSON Schema (node_manifest.schema.json) in sync with the manifest format for IDE validation.
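The resolution steps can be sketched in plain Python. This is not the NodeEnv implementation: the manifest is given as an already-parsed dictionary (its nesting is inferred from the example manifest), and `PROMPTS`, `TOOL_REGISTRY`, `resolve_prompt`, and `resolve_tools` are hypothetical names. The sketch also assumes a condition is a state key that must be truthy; the final binding step on the LLM is left out.

```python
from typing import Any, Callable, Dict, List

# Manifest as it might look after YAML parsing (nesting is an assumption).
MANIFEST: Dict[str, Any] = {
    "node": "give_advice",
    "prompts": {
        "base": "give_advice_system_prompt",
        "snippets": ["first_advice_extension"],
    },
    "tools": {
        "catalog": [
            {"tool": "research_educational_strategies", "condition": "first_advice"},
        ],
    },
    "response_schema": "GiveAdviceAnswer",
}

# Hypothetical i18n content and tool registry.
PROMPTS = {
    "give_advice_system_prompt": "You are a supportive advisor.",
    "first_advice_extension": "This is the first advice turn; be thorough.",
}
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "research_educational_strategies": lambda query: f"results for {query}",
}


def resolve_prompt(manifest: Dict[str, Any]) -> str:
    """Base prompt first, then each declared snippet, separated by blank lines."""
    prompts = manifest["prompts"]
    parts = [PROMPTS[prompts["base"]]]
    parts += [PROMPTS[name] for name in prompts.get("snippets", [])]
    return "\n\n".join(parts)


def resolve_tools(manifest: Dict[str, Any], state: Dict[str, Any]) -> List[Callable[..., str]]:
    """Keep catalog entries whose condition is absent or truthy in the state."""
    tools = []
    for entry in manifest.get("tools", {}).get("catalog", []):
        condition = entry.get("condition")
        if condition is None or state.get(condition):
            tools.append(TOOL_REGISTRY[entry["tool"]])
    return tools
```

With this shape, `resolve_tools(MANIFEST, {"first_advice": False})` yields an empty list, so the research tool is simply not offered on later turns.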
Persistent Memory Lake
Conversation summaries are now persisted asynchronously via the MemoryLake (memory/lake.py), decoupling the agent’s hot path from database I/O:
- After summarize_conversation generates a summary, it drops a QueuedSummary into the lake and returns immediately; the agent does not wait for the write.
- The lake maintains a two-layer worker pool (3 + 3 async workers). The first layer deduplicates pending writes per (user_id, session_id) key, keeping only the most recent summary. The second layer flushes these to the MemoryService (which writes to PostgreSQL).
- On failure the key is re-queued, providing at-least-once delivery semantics without blocking the conversation.
The MemoryLake is initialised once on application startup and accessed as a singleton via get_memory_lake().
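A minimal asyncio sketch of this pattern is shown below. It is an assumption-laden illustration, not the contents of memory/lake.py: `SketchLake`, `drop`, `drain`, `stop`, and the `write` callable (standing in for MemoryService) are hypothetical, and the retry is immediate and unbounded, which a production system would likely replace with backoff.

```python
import asyncio
from typing import Dict, Tuple

Key = Tuple[str, str]  # (user_id, session_id)


class SketchLake:
    def __init__(self, write, workers_per_layer: int = 3) -> None:
        self._write = write  # async callable(key, summary)
        self._intake: asyncio.Queue = asyncio.Queue()
        self._ready: asyncio.Queue = asyncio.Queue()
        self._pending: Dict[Key, str] = {}
        self._workers = workers_per_layer
        self._tasks: list = []

    def start(self) -> None:
        for _ in range(self._workers):
            self._tasks.append(asyncio.create_task(self._dedupe_worker()))
            self._tasks.append(asyncio.create_task(self._flush_worker()))

    def drop(self, key: Key, summary: str) -> None:
        """Called on the agent's hot path: enqueue and return immediately."""
        self._intake.put_nowait((key, summary))

    async def _dedupe_worker(self) -> None:
        # Layer 1: keep only the newest summary per key.
        while True:
            key, summary = await self._intake.get()
            already_queued = key in self._pending
            self._pending[key] = summary
            if not already_queued:
                self._ready.put_nowait(key)
            self._intake.task_done()

    async def _flush_worker(self) -> None:
        # Layer 2: write the pending summary; re-queue the key on failure.
        while True:
            key = await self._ready.get()
            summary = self._pending.pop(key)
            try:
                await self._write(key, summary)
            except Exception:
                # If a newer summary arrived meanwhile, it is already queued.
                if key not in self._pending:
                    self._pending[key] = summary
                    self._ready.put_nowait(key)
            finally:
                self._ready.task_done()

    async def drain(self) -> None:
        await self._intake.join()
        await self._ready.join()

    async def stop(self) -> None:
        for task in self._tasks:
            task.cancel()
        await asyncio.gather(*self._tasks, return_exceptions=True)


async def demo() -> Dict[Key, str]:
    stored: Dict[Key, str] = {}
    attempts = {"n": 0}

    async def flaky_write(key: Key, summary: str) -> None:
        attempts["n"] += 1
        if attempts["n"] == 1:
            raise ConnectionError("simulated database outage")
        stored[key] = summary

    lake = SketchLake(flaky_write)
    lake.start()
    lake.drop(("u1", "s1"), "first summary")
    lake.drop(("u1", "s1"), "newer summary")
    await lake.drain()
    await lake.stop()
    return stored
```

In `demo`, the first write attempt fails and is retried, and only the newer of the two summaries for the same key ends up stored.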
Internationalization (i18n)
The agent is multi-lingual from the ground up. The api/i18n/ directory and i18n_manager.py are central to this capability.
- YAML-Only Content: All user-facing strings (system prompts, UI messages, and context-collection questions) are now stored in YAML files (e.g., context_questions/FR.yaml, prompts/teenager/FR/). The older JSON format has been fully retired.
- Multi-Tenant Prompt Structure: Prompts are organised by user_type (teenager/parent) and language, allowing each audience to receive appropriately tailored language and framing.
- Dynamic Database Overrides: The system supports dynamic overrides via the LocalizedContent database table. This allows updating prompts, UI messages, and snippets in production without requiring a code redeployment. Database content always takes precedence over filesystem YAML content.
- Two-Layer Caching: The i18nManager employs a file-level cache layered on top of a memory-based database override cache, with additional LRU caching on public getter methods to minimise repeated database or file reads.
- Synchronization Tool: A sync_to_db.py CLI script is provided to synchronize YAML content into the database, facilitating bulk updates and CI/CD integration.
- State-Driven Language: The language field in Service1State drives all translation lookups, making it straightforward to add new languages without changing the agent’s Python code.
- XML-Style Tagging: Dynamic content injected into prompts (collected context, previous conversations, privacy guidelines) is wrapped in XML tags (e.g., <collected_context>, <previous_conversations>) so the LLM can reliably distinguish information types.
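Two of these ideas, override precedence and XML-style tagging, are easy to show in a few lines. The sketch below is illustrative only: `FILE_CONTENT` and `DB_OVERRIDES` are hypothetical dictionaries standing in for the parsed YAML files and the LocalizedContent table, and `get_text` and `wrap` are invented helper names rather than methods of the real manager.

```python
from functools import lru_cache
from typing import Dict, Tuple

LookupKey = Tuple[str, str, str]  # (user_type, language, content key)

# Hypothetical stand-ins for parsed YAML files and LocalizedContent rows.
FILE_CONTENT: Dict[LookupKey, str] = {
    ("teenager", "FR", "greeting"): "Salut !",
    ("parent", "FR", "greeting"): "Bonjour.",
}
DB_OVERRIDES: Dict[LookupKey, str] = {
    ("parent", "FR", "greeting"): "Bonjour et bienvenue.",
}


@lru_cache(maxsize=256)
def get_text(user_type: str, language: str, key: str) -> str:
    """Database override wins; otherwise fall back to the file content."""
    lookup = (user_type, language, key)
    if lookup in DB_OVERRIDES:
        return DB_OVERRIDES[lookup]
    return FILE_CONTENT[lookup]


def wrap(tag: str, content: str) -> str:
    """Wrap dynamic prompt content in an XML-style tag."""
    return f"<{tag}>\n{content}\n</{tag}>"
```

One consequence of caching the getter is that a changed override only becomes visible after the cache is invalidated, for example with `get_text.cache_clear()`; how the real manager handles invalidation is not described here.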
Consent-Aware Session State
The agent inherits the user’s consent status from the AppSession.scope stored in the database. During session initialisation, the ChatService reads this scope and propagates it into Service1State. Nodes can react to consent state when determining response behavior — for example, the collect_context node may adjust its approach for users in ONBOARDING scope who have not yet completed the consent flow. See Backend API — Consent-Based Session Scoping for the full consent pipeline.
Dynamic Context Collection
The collect_context node extracts structured information from a user’s initial message by combining the following mechanisms:
- Structured Output: It uses the LLM’s structured output capability (with_structured_output).
- Dynamic Pydantic Models: The utils/context.py file contains a QAFactory function that dynamically creates a Pydantic BaseModel from the list of context questions for the user’s language. For multiple-choice questions, Literal types constrain the output to valid values.
- Confident Extraction: Each answer is accompanied by the model’s reasoning and confidence level, so only explicitly stated information is used, which reduces hallucinations.
- AI-Driven Dynamic Questions (feature-flagged): When the enable_dynamic_question_generation feature flag is enabled, the node can ask the LLM to generate additional follow-up questions beyond the static set, collecting richer situational context. This is controlled at runtime without code changes.
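The dynamic-model idea can be sketched with Pydantic's `create_model`. This is a guess at the general shape, not the actual QAFactory: the question dictionaries, the three confidence levels, and the helper names `build_answer_model`, `qa_factory_sketch`, and `confident_answers` are all assumptions made for illustration.

```python
from typing import List, Literal, Optional

from pydantic import ValidationError, create_model

# Hypothetical question definitions, as they might be loaded from the i18n YAML.
QUESTIONS = [
    {"id": "age", "choices": None},
    {"id": "role", "choices": ["parent", "teenager"]},
]


def build_answer_model(question: dict) -> type:
    """One answer model per question: value plus reasoning and confidence."""
    choices = question.get("choices")
    # Multiple-choice questions are constrained to their valid values.
    value_type = Literal[tuple(choices)] if choices else str
    return create_model(
        f"{question['id'].title()}Answer",
        value=(Optional[value_type], None),
        reasoning=(str, ""),
        confidence=(Literal["low", "medium", "high"], "low"),
    )


def qa_factory_sketch(questions: List[dict]) -> type:
    """Build one top-level model with a required field per question."""
    fields = {q["id"]: (build_answer_model(q), ...) for q in questions}
    return create_model("ContextAnswers", **fields)


def confident_answers(answers, questions: List[dict]) -> dict:
    """Keep only answers that have a value and were given with high confidence."""
    result = {}
    for q in questions:
        answer = getattr(answers, q["id"])
        if answer.value is not None and answer.confidence == "high":
            result[q["id"]] = answer.value
    return result
```

A model built this way could be passed to `with_structured_output`; an out-of-range choice such as `role={"value": "uncle"}` then fails validation with `ValidationError` instead of silently entering the state.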
Observability with Langfuse
The agent is integrated with Langfuse for tracing and monitoring, configured in api/agents/service1/utils/observability.py.
- Callback Handler: A custom AsyncCallbackHandler, ErrorFlagger, is attached to the LLM client.
- Error Flagging: The on_llm_end method inspects the LLM response metadata. If it finds a block_reason (response blocked by safety filters), it updates the corresponding trace in Langfuse.
- Debugging: This provides immediate visibility into when and why the LLM refuses to respond, which is important given the sensitive domain.
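The metadata inspection step might look roughly like the helper below. It is a sketch only: the exact layout of the response metadata is not specified on this page, so the function searches nested dictionaries and lists for any truthy `block_reason` value instead of assuming a fixed path. The `find_block_reason` name and the sample metadata in the usage note are hypothetical, and the Langfuse update itself is omitted.

```python
from typing import Any, Optional


def find_block_reason(metadata: Any) -> Optional[str]:
    """Depth-first search for a truthy `block_reason` in nested dicts and lists."""
    if isinstance(metadata, dict):
        reason = metadata.get("block_reason")
        if reason:
            return str(reason)
        children = list(metadata.values())
    elif isinstance(metadata, (list, tuple)):
        children = list(metadata)
    else:
        return None
    for child in children:
        found = find_block_reason(child)
        if found:
            return found
    return None
```

For example, `find_block_reason({"prompt_feedback": {"block_reason": "SAFETY"}})` returns `"SAFETY"`, while metadata without the key, or with a falsy value for it, returns `None`.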
Service and LLM Client Configuration
The api/agents/service1/core/llm_client.py file centralises LLM and service initialisation.
- Lazy Loading: get_llm(), get_rag_service(), and get_analytics_service() are lazy-loaded: initialised on first call rather than at startup.
- LLM Safety Settings: The ChatGoogleGenerativeAI client deliberately disables all default safety filters (HarmBlockThreshold.BLOCK_NONE) to handle sensitive topics. Content safety is managed through prompt design and the ErrorFlagger observability layer.
- User-Type-Aware RAG Service: get_rag_service initialises a RAGService aware of the user_type (teenager/parent), routing it to the correct vector-database collection.
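One common way to get this lazy, per-user-type behaviour is to memoise the factory function. The sketch below assumes that approach; the real llm_client.py may do it differently. `FakeRAGService`, `get_rag_service_sketch`, and the `documents_<user_type>` collection naming are hypothetical.

```python
from functools import lru_cache


class FakeRAGService:
    """Stand-in for RAGService; the collection naming scheme is an assumption."""

    instances_created = 0

    def __init__(self, user_type: str) -> None:
        FakeRAGService.instances_created += 1
        self.user_type = user_type
        self.collection = f"documents_{user_type}"


@lru_cache(maxsize=None)
def get_rag_service_sketch(user_type: str) -> FakeRAGService:
    """Create the service on first use and reuse it for later calls."""
    return FakeRAGService(user_type)
```

Nothing is constructed at import time; the first call for "teenager" builds one instance, later calls return the same object, and "parent" gets its own instance bound to a different collection.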