Backend
This document describes the FastAPI backend in detail: entry point, middleware, configuration, API structure, and core modules. The backend is the authority for identity, chat state, and AI orchestration.
Entry point and application creation
The FastAPI application is created in backend/app/main.py. It:
- Configures logging (stdout for all loggers; an optional file handler for the `app` logger only, so platform/uvicorn logs stay on stdout).
- Defines a lifespan context manager: on shutdown it closes the Redis client so that connections are released cleanly.
- Creates the `FastAPI` instance with title, version, docs at `/docs`, redoc at `/redoc`, and the lifespan.
- Registers a global exception handler that logs errors with a unique `errorId` and returns a safe message to the client (no stack traces or internal details in responses).
- Adds middleware in order (first added = outermost): CORS, rate limit (if enabled), CSRF, cache-prevention. A small HTTP middleware adds `X-Session-Version` to `/api/auth/me` responses for MSAL deploy-time invalidation.
- Includes all v1 routers under their prefixes (see API routers).
- Exposes `/health` and `/` for liveness and basic info; optionally a `/test-log` endpoint for verifying logging.
The application does not start the database or Redis in the lifespan; those are used on first request via dependencies (get_db, Redis client). The lifespan is used only for cleanup on shutdown.
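The cleanup-only lifespan can be sketched with the standard library alone. This is a minimal illustration, not the code in main.py: `FakeRedisClient`, `redis_client`, and `simulate_app_run` are hypothetical names, and the real client is replaced by a stub that only records whether it was closed.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeRedisClient:
    """Hypothetical stand-in for the real Redis client; records whether it was closed."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


redis_client = FakeRedisClient()


@asynccontextmanager
async def lifespan(app):
    # Startup: intentionally empty. The database and Redis are reached
    # lazily through request dependencies, not opened here.
    try:
        yield
    finally:
        # Shutdown: release Redis connections.
        await redis_client.close()


async def simulate_app_run() -> bool:
    """Enter and leave the lifespan the way an ASGI server does around the app's lifetime."""
    async with lifespan(app=None):
        closed_while_running = redis_client.closed
    return closed_while_running


if __name__ == "__main__":
    print(asyncio.run(simulate_app_run()), redis_client.closed)
```

A function of this shape is what FastAPI accepts through its `lifespan=` constructor argument; everything before `yield` runs at startup and everything after it runs at shutdown.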
Middleware (order: first added = outermost)
- CORS: `CORSMiddleware` with `allow_origins` from settings, `allow_credentials=True`, and explicit methods/headers. Wildcard origins are not allowed when credentials are true (per spec and security).
- Rate limit: When `RATE_LIMIT_ENABLED` is true, `RateLimitMiddleware` limits requests per client (by user id if authenticated, else by IP) per window; returns 429 when exceeded.
- CSRF: Validates `Origin`/`Referer` for state-changing requests (POST, PUT, PATCH, DELETE) when configured (`CSRF_REQUIRE_ORIGIN_ALWAYS` and origin check).
- Cache prevention: Sets `Cache-Control: no-store` (and related headers) on all responses so that browsers and proxies do not cache sensitive or dynamic content.
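To make the rate-limit rule concrete, here is a small sketch of per-client counting. It assumes a fixed-window counter held in memory; the document does not say which windowing strategy or storage `RateLimitMiddleware` uses, so `FixedWindowRateLimiter`, `client_key`, and `status_for` are illustrative names only.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FixedWindowRateLimiter:
    """Allows at most max_requests per client key in each fixed time window (assumed strategy)."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        # Maps client key -> (window index, requests seen in that window).
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._counters.get(key, (window, 0))
        if seen_window != window:
            count = 0  # A new window has started, so the counter resets.
        if count >= self.max_requests:
            return False
        self._counters[key] = (window, count + 1)
        return True


def client_key(user_id: Optional[str], ip: str) -> str:
    # Authenticated clients are tracked by user id, anonymous ones by IP.
    return f"user:{user_id}" if user_id else f"ip:{ip}"


def status_for(limiter: FixedWindowRateLimiter, user_id: Optional[str], ip: str) -> int:
    # 429 Too Many Requests once the client exceeds its budget for the window.
    return 200 if limiter.allow(client_key(user_id, ip)) else 429
```

Keying by user id first means several users behind one NAT address do not share a budget, while anonymous traffic is still bounded per IP.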
Configuration
Configuration is centralized in backend/app/config.py using Pydantic settings loaded from the environment (and optional .env file). Key groups:
- Database: `POSTGRES_HOST`, `POSTGRES_PORT`, `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`; optional `POSTGRES_ALEMBIC_*` and `POSTGRES_URL_SYNC`. `POSTGRES_URL` and `ALEMBIC_POSTGRES_URL` are computed (async URLs); credentials are URL-encoded in the computed URLs.
- JWT: `JWT_SECRET_KEY`, optional `JWT_SECRET_KEY_OLD` (for rotation), `JWT_ALGORITHM`, `JWT_ACCESS_TOKEN_EXPIRE_MINUTES`. Optional `SESSION_SECRET_KEY` for opaque session tokens; `SESSION_VERSION` for deploy-time session invalidation.
- App: `ENVIRONMENT`, `CORS_ORIGINS`, `NEXTJS_URL`, `CSRF_REQUIRE_ORIGIN_ALWAYS`, `COOKIE_DOMAIN`, logging (`LOG_FILE`, `LOG_LEVEL`, `LOG_MAX_BYTES`, `LOG_BACKUP_COUNT`), file upload limits, `USER_CACHE_TTL_SECONDS`, guest session TTL, `FEEDBACK_REVIEWER_EMAILS`.
- Auth: `AUTH_PROVIDER`, `MSAL_AUTH_COOKIE_NAME`, `AZURE_AD_TENANT_ID`, `AZURE_AD_CLIENT_ID`, `AZURE_AD_VALID_AUDIENCES`, `AZURE_AD_SKIP_SIGNATURE_VERIFY`.
- Redis: `REDIS_URL` (optional).
- Rate limit: `RATE_LIMIT_ENABLED`, `RATE_LIMIT_REQUESTS`, `RATE_LIMIT_WINDOW_SECONDS`.
- AI: Nested `ModelSettings`: `MODEL_PROVIDER`, `CHAT_MODEL`, `CHAT_MODEL_REASONING`, `TITLE_MODEL`, `ARTIFACT_MODEL`. Routing: `ROUTING_MODEL`, `ROUTING_HISTORY_LIMIT`.
- MCP: `MCPSettings` (env prefix `MCP_`): `server_url`, `ssl_verify`, `timeout`, `load_timeout`, `tools_cache_ttl_seconds`.
See Deployment and operations for a full list of environment variables.
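The reason credentials are URL-encoded in the computed database URLs is that characters such as `@`, `:` and `/` would otherwise be parsed as URL delimiters. The sketch below shows the idea with the standard library; `build_postgres_url` is a hypothetical helper, and the `postgresql+asyncpg` driver prefix is an assumption (the document only says the URLs are async).

```python
from urllib.parse import quote


def build_postgres_url(
    user: str,
    password: str,
    host: str,
    port: int,
    db: str,
    driver: str = "postgresql+asyncpg",  # Assumed driver; not confirmed by the document.
) -> str:
    # safe="" percent-encodes every reserved character, including "/",
    # so the credentials cannot be mistaken for URL structure.
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"{driver}://{encoded_user}:{encoded_password}@{host}:{port}/{db}"


if __name__ == "__main__":
    print(build_postgres_url("app", "p@ss:w/rd", "db", 5432, "chat"))
```

Without the encoding step, a password containing `@` would make everything after it look like the host portion of the URL.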
API routers
All API routes are mounted under /api with the following prefixes. Routers live in backend/app/api/v1/ (and are imported in main.py).
| Prefix | Router module | Purpose |
|---|---|---|
| `/api/auth` | `auth.py` | Login, register, logout, refresh, guest, guest/reset, me, password-reset/request, password-reset/confirm; MSAL set-token is handled by frontend + backend cookie endpoint. |
| `/api/chat` | `chat.py` | POST (create/continue conversation and stream), GET /{id}, GET /{id}/messages/latest, DELETE (chat or messages), PATCH (visibility), GET suggestions. |
| `/api/v1/chat` | `chat_stream.py` | POST /stream (stream-only endpoint). |
| `/api/chat` | `chat_resume.py` | Resumable stream (e.g. resume by stream id). |
| `/api/history` | `history.py` | Chat history listing. |
| `/api/vote` | `vote.py` | Get and patch message votes. |
| `/api/feedback` | `feedback.py` | Submit and (for reviewers) list app feedback. |
| `/api/document` | `document.py` | Document CRUD. |
| `/api/files` | `files.py` | File upload and get by id. |
| `/api/v1/charts` | `charts.py` | Create and get chart records. |
| `/api/v1/mcp` | `mcp_tools.py` | MCP tool list and proxy. |
| `/api/models` | `models.py` | List available models. |
Note: thinking_stream.py defines a router but is not included in main.py by default; if mounted, chat-ownership validation should be added.
Core modules and dependencies
app/core/
- `database.py`: Creates the async SQLAlchemy engine and `async_sessionmaker` from `settings.POSTGRES_URL`. Exposes a `get_db()` dependency that yields an `AsyncSession` and ensures rollback on exception.
- `redis.py`: Redis client (optional). Used for caching and for storing stream chunks when resumable streams are enabled. Closed in app lifespan.
- `auth/`: JWT encode/decode (`jwt.py`), session token generation/validation (`session_token.py`), password hashing and verification (`password.py`), Azure AD token validation (`azure.py`). Used by auth routes and by `get_current_user`.
- `csrf.py`: CSRF middleware implementation (Origin/Referer check).
- `rate_limit.py`: Rate limit middleware (per user or per IP).
- `cache_headers.py`: Cache-prevention middleware.
- `cookie_utils.py`: Helpers for setting/clearing auth cookies (domain, httpOnly, etc.).
- `password_validation.py`: Password strength rules; optional HIBP check.
- `errors.py`: Shared exception types.
- `logging_utils.py`: Logging helpers.
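As an illustration of the Origin/Referer check in csrf.py, the decision logic might look like the sketch below. It rests on assumptions: header names are taken to be lowercased, `allowed_origins` is taken to hold lowercase `scheme://host[:port]` strings, and the strict flag is interpreted as "reject state-changing requests that carry neither header". The exact semantics of `CSRF_REQUIRE_ORIGIN_ALWAYS` are not given in this document, and `is_request_allowed` is a hypothetical name.

```python
from typing import Mapping, Optional, Set
from urllib.parse import urlsplit

STATE_CHANGING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def _origin_of(url: str) -> Optional[str]:
    """Reduce an Origin or Referer value to scheme://host[:port], or None if unparseable."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc}".lower()


def is_request_allowed(
    method: str,
    headers: Mapping[str, str],
    allowed_origins: Set[str],
    require_origin_always: bool,
) -> bool:
    # Safe methods are not subject to the check.
    if method.upper() not in STATE_CHANGING_METHODS:
        return True
    # Prefer Origin; fall back to Referer when Origin is absent.
    source = headers.get("origin") or headers.get("referer")
    if not source:
        # Neither header present: reject only in strict mode (assumed meaning of the flag).
        return not require_origin_always
    origin = _origin_of(source)
    return origin is not None and origin in allowed_origins
```

Comparing only the scheme, host, and port means a Referer with a path such as `https://app.example.com/chat/1` still matches its allowed origin.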
app/api/deps.py
- `get_current_user`: Resolves the authenticated user for the request. Tries in order: JWT from `auth_token` cookie or `Authorization` header; if that fails and Azure AD is configured, validates MSAL cookie; fallback to session tokens (`guest_session_id` / `user_session_id`) validated against the database. Checks revoked tokens and password-change invalidation. Can cache the resolved user for `USER_CACHE_TTL_SECONDS`. Used by almost all protected routes.
- `get_current_user_optional`: Same as above but returns `None` when no valid auth; used for routes that behave differently when logged in vs anonymous.
- Feedback reviewer: Dependency that allows access to the feedback list only for users whose email is in `FEEDBACK_REVIEWER_EMAILS`.
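The "tries in order" behaviour of the user-resolution dependencies can be pictured as a chain of resolvers where the first success wins. The sketch below is a simplified model under stated assumptions: the three resolver functions, the `msal_token` key, the literal token values, and `NotAuthenticated` are all hypothetical, and real validation (signatures, revocation, database lookups, caching) is replaced by string comparisons.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence


@dataclass(frozen=True)
class User:
    id: str
    source: str  # Which mechanism authenticated the user.


class NotAuthenticated(Exception):
    """Raised by the strict variant when no resolver succeeds."""


Resolver = Callable[[Mapping[str, str]], Optional[User]]


def resolve_jwt(creds: Mapping[str, str]) -> Optional[User]:
    # Placeholder for JWT validation of the auth_token cookie or Authorization header.
    return User("u1", "jwt") if creds.get("auth_token") == "valid-jwt" else None


def resolve_msal(creds: Mapping[str, str]) -> Optional[User]:
    # Placeholder for Azure AD validation of the MSAL cookie (key name is hypothetical).
    return User("u1", "msal") if creds.get("msal_token") == "valid-msal" else None


def resolve_session(creds: Mapping[str, str]) -> Optional[User]:
    # Placeholder for the database-backed session token check.
    token = creds.get("user_session_id") or creds.get("guest_session_id")
    return User("u1", "session") if token == "valid-session" else None


def resolve_user(creds: Mapping[str, str], resolvers: Sequence[Resolver]) -> Optional[User]:
    """Optional variant: returns None when every resolver fails."""
    for resolver in resolvers:
        user = resolver(creds)
        if user is not None:
            return user
    return None


def require_user(creds: Mapping[str, str], resolvers: Sequence[Resolver]) -> User:
    """Strict variant: raises instead of returning None."""
    user = resolve_user(creds, resolvers)
    if user is None:
        raise NotAuthenticated("no valid credentials")
    return user
```

Building the strict variant on top of the optional one keeps the ordering logic in a single place, so the two dependencies cannot drift apart.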
app/models/
SQLAlchemy declarative models for all persistent entities. See Data and persistence for the list and relationships.
app/db/queries/
Async query layer: one module per domain (e.g. chat_queries.py, which also covers messages, plus user_queries.py, session_queries.py, document_queries.py, vote_queries.py, chart_queries.py, suggestion_queries.py, revoked_token_queries.py, password_reset_queries.py, password_reset_attempt_queries.py, login_attempt_queries.py). Handlers and AI code call these functions with a db: AsyncSession instead of writing raw SQL. See Data and persistence.
app/schemas/ and app/api/v1/schemas/
Pydantic models for request and response bodies. Re-exported or used from v1 routers.
Request lifecycle (typical protected route)
- Request hits FastAPI (after CORS, rate limit, CSRF, cache-prevention).
- Route dependency `get_current_user` runs: reads cookie/header, validates JWT or MSAL or session, loads user from DB if needed, caches for TTL, attaches user to request state.
- Route handler runs: may use `get_db()` for DB access, calls query layer and/or AI code, returns response or streams.
- If an unhandled exception is raised, the global exception handler logs it and returns a safe JSON response with `errorId`.
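The last step, a safe error response correlated with the server log through `errorId`, can be sketched as a plain function. This is an illustration under assumptions: `build_error_response` is a hypothetical helper, and the 500 status code, the `detail` field, and the 32-character hex id format are not confirmed by this document.

```python
import logging
import uuid
from typing import Dict, Tuple

logger = logging.getLogger("app")


def build_error_response(exc: Exception) -> Tuple[int, Dict[str, str]]:
    """Log full details server-side and return only a generic message plus an errorId."""
    error_id = uuid.uuid4().hex
    # The exception and its traceback go to the log only, tagged with the id.
    logger.error("Unhandled exception errorId=%s", error_id, exc_info=exc)
    # The client sees no exception text, stack trace, or internal detail.
    body = {"detail": "An unexpected error occurred.", "errorId": error_id}
    return 500, body
```

A user who reports the `errorId` gives operators a direct search key into the logs without the response ever exposing what went wrong internally.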
The next document, Frontend, describes the Next.js application structure and how it interacts with this API.