John Lancaster
2026-06-17 21:49:57 -05:00
parent 22e5357ffb
commit 9b02007216
7 changed files with 583 additions and 0 deletions
@@ -33,18 +33,37 @@ Use these concepts as the planning backbone:
1. Engine lifecycle and ownership:
One AsyncEngine per process for each DB URL, created once and disposed explicitly when the app lifecycle ends.
See: [references/engine.md](references/engine.md)
2. Session factory and scope:
Use async_sessionmaker for configuration; create one AsyncSession per request or unit-of-work, never shared across concurrent tasks.
See: [references/session.md](references/session.md)
3. Transaction boundaries:
Prefer context-managed begin blocks for write units and explicit read-only sessions for queries.
See: [references/transactions.md](references/transactions.md)
4. Lifespan composition:
Compose startup/shutdown resources with AsyncExitStack so cleanup is deterministic and ordered.
See: [references/engine.md](references/engine.md)
5. Dependency injection:
Provide sessions via FastAPI dependencies with async generators/context managers, not globals.
See: [references/session.md](references/session.md)
6. Implicit I/O control in ORM:
Avoid accidental lazy loads; use explicit eager-loading/refresh strategies for asyncio safety.
See: [references/implicit_io.md](references/implicit_io.md)
7. Observability and resilience:
Add pool/connection settings, logging, timeout, and health checks as first-class plan items.
See: [references/observability.md](references/observability.md)
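
Concept 4 (lifespan composition) can be sketched with the standard library alone. The `resource` helper and the names `"engine"` and `"cache"` are hypothetical stand-ins, not part of this skill; in a real app the first resource would create and dispose the AsyncEngine. The sketch only illustrates that `AsyncExitStack` closes resources in the reverse of the order they were entered.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


@asynccontextmanager
async def resource(name: str, events: list[str]):
    # Hypothetical stand-in for a real resource such as an AsyncEngine.
    events.append(f"start:{name}")
    try:
        yield name
    finally:
        events.append(f"stop:{name}")


@asynccontextmanager
async def lifespan(events: list[str]):
    # Resources are entered in order and closed in reverse (LIFO) order.
    async with AsyncExitStack() as stack:
        engine = await stack.enter_async_context(resource("engine", events))
        cache = await stack.enter_async_context(resource("cache", events))
        yield {"engine": engine, "cache": cache}


async def main() -> list[str]:
    events: list[str] = []
    async with lifespan(events) as state:
        events.append("serving:" + ",".join(state))
    return events


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because the stack unwinds in reverse, a resource that depends on the engine is always released before the engine itself is disposed.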
### Concept Reference Map
| Concept | Reference |
|---|---|
| Engine lifecycle and ownership | [references/engine.md](references/engine.md) |
| Session factory and scope | [references/session.md](references/session.md) |
| Transaction boundaries | [references/transactions.md](references/transactions.md) |
| Lifespan composition | [references/engine.md](references/engine.md) |
| Dependency injection | [references/session.md](references/session.md) |
| Implicit I/O control in ORM | [references/implicit_io.md](references/implicit_io.md) |
| Observability and resilience | [references/observability.md](references/observability.md) |
## Decision Points
@@ -0,0 +1,106 @@
# Preventing Implicit ORM I/O (Asyncio)
Source:
- https://docs.sqlalchemy.org/en/21/orm/extensions/asyncio.html#preventing-implicit-io-when-using-asyncsession
- https://docs.sqlalchemy.org/en/21/orm/queryguide/relationships.html
Status: adopted
Decision level: advisory
Applies to: api-runtime, workers, tests
Last reviewed: 2026-06-17
---
## Purpose
Minimize unexpected database round-trips caused by attribute access in async ORM code.
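In asyncio applications, a lazy load triggered by plain attribute access cannot be awaited, so it tends to fail at runtime (typically with SQLAlchemy's `MissingGreenlet` error) instead of quietly issuing a query. This guide defines explicit-loading defaults and progressive enforcement practices.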
---
## Scope and Non-Goals
- In scope: relationship loading strategy, post-commit attribute access, explicit refresh/awaitable access patterns.
- Out of scope: full ORM performance tuning and domain-specific query architecture.
---
## Rules
- Prefer explicit eager loading for data required by endpoint/service outputs.
- Avoid relying on implicit lazy-load behavior in request critical paths.
- Keep `expire_on_commit=False` unless strict expiration behavior is intentionally required.
- Use explicit refresh or awaitable-attribute access when loading deferred state is necessary.
---
## Recommended Patterns
### Pattern A: Eager-load what you need
```python
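from sqlalchemy import select
from sqlalchemy.orm import selectinload

# Load every user's roles with one extra SELECT up front,
# so later access to `user.roles` performs no I/O.
stmt = select(User).options(selectinload(User.roles))
users = (await session.scalars(stmt)).all()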
```
### Pattern B: Explicit refresh of named attributes
```python
user = await session.get(User, user_id)
await session.refresh(user, ["roles"])
```
### Pattern C: Awaitable attribute access where needed
```python
# Requires AsyncAttrs mixin on mapped base or class.
roles = await user.awaitable_attrs.roles
```
---
## Practical Enforcement Model
Use phased enforcement:
1. High-traffic and latency-sensitive routes: enforce explicit eager loading.
2. Background tasks and less critical paths: track and progressively tighten.
3. Add review checks to prevent newly introduced implicit-load hotspots.
This keeps modernization pragmatic while reducing hidden I/O over time.
---
## Anti-Patterns
- Returning ORM objects from handlers and triggering lazy loads during serialization.
- Assuming attributes accessed after commit are still loaded when no explicit loading strategy is in place.
- Relying on broad expiration + implicit reload behavior in async request flows.
- Enabling relationship patterns that hide SQL behavior in critical code paths.
---
## Operational Checks
- Endpoint query blocks define loader options for returned related data.
- Critical handlers do not depend on incidental lazy loads.
- Known exceptions are documented with rationale and follow-up items.
---
## Testing Checks
- Integration tests cover endpoints that return related objects.
- Tests verify that expected related data is present without relying on implicit secondary queries.
- Regression tests exist for routes previously affected by implicit-load failures.
---
## Migration Notes
- Start advisory: target high-risk paths first.
- As coverage improves, elevate selected rules to mandatory in code review policy.
@@ -0,0 +1,32 @@
# FastAPI Async SQLAlchemy References Index
Purpose: concept registry for modernization guidance used by this skill.
---
## Concepts
| Concept | File | Status | Decision Level | Owner | Last Reviewed |
|---|---|---|---|---|---|
| Engine lifecycle and ownership | [engine.md](engine.md) | adopted | mandatory | platform/backend | 2026-06-17 |
| Session factory and scope | [session.md](session.md) | adopted | mandatory | platform/backend | 2026-06-17 |
| Transaction boundaries | [transactions.md](transactions.md) | adopted | mandatory | platform/backend | 2026-06-17 |
| Implicit ORM I/O under asyncio | [implicit_io.md](implicit_io.md) | adopted | advisory | platform/backend | 2026-06-17 |
| Observability and resilience | [observability.md](observability.md) | adopted | mandatory | platform/backend | 2026-06-17 |
---
## How to Use This Folder
- `SKILL.md` defines the planning workflow and migration procedure.
- Each concept doc defines policy-level guidance for one concern.
- Use the template in [template.md](template.md) for new concept docs.
- Keep references source-linked and implementation snippets minimal.
---
## Update Rules
- If a PR changes database lifecycle/session/ORM loading behavior, update the relevant concept file.
- Keep `Status`, `Decision Level`, and `Last Reviewed` current.
- Use `advisory` only when incremental rollout is intended; use `mandatory` for required runtime policy.
@@ -0,0 +1,113 @@
# DB Observability and Resilience
Source:
- https://docs.sqlalchemy.org/en/21/core/pooling.html
- https://docs.sqlalchemy.org/en/21/core/engines.html
- https://docs.sqlalchemy.org/en/21/core/events.html
- https://fastapi.tiangolo.com/advanced/events/
Status: adopted
Decision level: mandatory
Applies to: api-runtime, workers, tests
Last reviewed: 2026-06-17
---
## Purpose
Define baseline observability and resilience practices for DB connectivity in async FastAPI + SQLAlchemy apps.
Goals:
- detect and recover from stale/disconnected connections,
- expose useful diagnostics for pool/engine behavior,
- make readiness/liveness signals meaningful.
---
## Scope and Non-Goals
- In scope: pool health, connection liveness, SQL/pool logging hygiene, readiness checks, failure handling.
- Out of scope: full APM stack design and vendor-specific monitoring platform setup.
---
## Rules
- Enable a connection liveness strategy (`pool_pre_ping=True`) for long-running services.
- Keep DB health checks out of liveness; include dependency checks in readiness.
- Centralize engine options and logging configuration.
- Avoid noisy SQL debug logging in production defaults.
- Treat disconnect handling as a first-class test scenario.
---
## Recommended Baseline
```python
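from sqlalchemy.ext.asyncio import create_async_engine

# `settings` is the application's own configuration object.
engine = create_async_engine(
    settings.database_url,
    pool_pre_ping=True,
    # Tune only from measured behavior:
    # pool_size=10,
    # max_overflow=20,
    # pool_timeout=30,
    # pool_recycle=1800,
)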
```
Operational guidance:
- Set `pool_pre_ping=True` so connections are tested on checkout and stale ones are replaced before use.
- Introduce `pool_recycle` where backend/network idle timeout behavior warrants it.
- Use structured app logs with request correlation and error context.
---
## Health Endpoint Policy
- `/healthz`: process is alive; no DB call required.
- `/readyz`: application can currently serve traffic; include DB connectivity verification.
Readiness checks should be lightweight and bounded (timeouts), not heavy diagnostic queries.
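
A bounded readiness check can be sketched as follows. This is a minimal illustration, not a prescribed implementation: `is_ready` and the three probe functions are hypothetical names, and the probes stand in for a trivial database round-trip (for example a `SELECT 1` on a pooled connection). The point is only that the check has a hard time budget and treats both timeouts and errors as "not ready".

```python
import asyncio
from collections.abc import Awaitable, Callable


async def is_ready(
    probe: Callable[[], Awaitable[object]],
    timeout_s: float = 1.0,
) -> bool:
    """Return True only if the probe completes within the time budget."""
    try:
        await asyncio.wait_for(probe(), timeout=timeout_s)
    except Exception:  # a timeout or any connectivity error means "not ready"
        return False
    return True


async def healthy_probe() -> None:
    await asyncio.sleep(0)  # stands in for a trivial DB round-trip


async def hanging_probe() -> None:
    await asyncio.sleep(10)  # stands in for an unresponsive database


async def failing_probe() -> None:
    raise ConnectionError("database unreachable")
```

A `/readyz` handler could return a 503 status whenever `is_ready` returns `False`, while `/healthz` stays free of any DB call.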
---
## Failure Handling Guidance
- Handle transient disconnects with pool invalidation/reconnect semantics.
- Keep one failed request from cascading into broad app instability.
- Capture and log contextual DB errors with enough metadata for debugging.
---
## Anti-Patterns
- No readiness check for DB-dependent services.
- Permanent debug SQL echo in production.
- Per-handler ad hoc pool settings.
- Assuming disconnect events are too rare to test.
---
## Operational Checks
- Engine creation is centralized and configured once.
- Liveness/readiness behavior is documented and validated.
- Pool settings are explicit, versioned, and reviewed.
- DB-related errors produce actionable logs.
---
## Testing Checks
- Readiness endpoint test covers healthy and unhealthy DB states.
- Integration test simulates disconnect/reconnect behavior.
- Load/concurrency tests validate pool behavior under stress.
---
## Migration Notes
- Start with resilient defaults (`pool_pre_ping`) and simple health policy.
- Add deeper metrics/event hooks incrementally once baseline reliability is in place.
@@ -0,0 +1,140 @@
# Async SQLAlchemy Session Management
Source:
- https://docs.sqlalchemy.org/en/21/orm/extensions/asyncio.html
- https://docs.sqlalchemy.org/en/21/orm/session_basics.html
- https://fastapi.tiangolo.com/tutorial/dependencies/dependencies-with-yield/
Status: adopted
Decision level: mandatory
Applies to: api-runtime, workers, tests
Last reviewed: 2026-06-17
---
## Purpose
Define one canonical session model for FastAPI + SQLAlchemy asyncio:
- configure one shared session factory,
- create one AsyncSession per request or per unit-of-work,
- never share one AsyncSession across concurrent tasks.
---
## Scope and Non-Goals
- In scope: session factory creation, FastAPI dependency wiring, request/task scoping, transaction demarcation.
- Out of scope: ORM model design, query optimization strategy, schema migration tooling.
---
## Rules
- Create `async_sessionmaker` once from the app-owned AsyncEngine.
- Use a fresh AsyncSession for each request or explicit unit-of-work.
- Do not share AsyncSession across `asyncio.gather()` or parallel tasks.
- Prefer direct dependency injection over global scoped-session patterns in new code.
- Use explicit transaction boundaries (`async with session.begin():`) for writes.
---
## Canonical FastAPI Dependency Pattern
```python
from collections.abc import AsyncIterator

from fastapi import Depends, Request
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker


def get_session_factory(request: Request) -> async_sessionmaker[AsyncSession]:
    return request.app.state.session_factory


async def get_db_session(
    session_factory: async_sessionmaker[AsyncSession] = Depends(get_session_factory),
) -> AsyncIterator[AsyncSession]:
    async with session_factory() as session:
        yield session
```
Route usage:
```python
from fastapi import APIRouter, Depends
from sqlalchemy.ext.asyncio import AsyncSession

router = APIRouter()


@router.post("/items")
async def create_item(session: AsyncSession = Depends(get_db_session)) -> dict:
    async with session.begin():
        # write operations here
        ...
    return {"status": "ok"}
```
---
## Configuration Guidance
Typical session factory setup:
```python
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker

session_factory = async_sessionmaker(
    engine,
    class_=AsyncSession,
    expire_on_commit=False,
)
```
Notes:
- `expire_on_commit=False` is commonly preferred in asyncio applications to reduce accidental post-commit reload behavior.
- `AsyncSession.refresh()` is preferred over broad expiration patterns when state refresh is needed.
---
## Concurrency Rules
- One session per concurrent task.
- If work fans out into parallel tasks, each task receives its own AsyncSession.
- Pass sessions explicitly to service functions; avoid mutable global session state.
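
The fan-out rule can be sketched with standard-library stand-ins. `FakeSession`, `session_factory`, `process_item`, and `fan_out` are hypothetical names used only for illustration; in application code the factory would be the `async_sessionmaker` and the session an `AsyncSession`. The sketch shows each task opening its own session rather than receiving a shared one.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeSession:
    """Hypothetical stand-in for AsyncSession; not a SQLAlchemy class."""


@asynccontextmanager
async def session_factory():
    # Mirrors the `async with session_factory() as session:` usage pattern.
    session = FakeSession()
    try:
        yield session
    finally:
        pass  # a real AsyncSession would be closed here


async def process_item(item_id: int) -> tuple[int, FakeSession]:
    # Each task opens its own session instead of receiving a shared one.
    async with session_factory() as session:
        await asyncio.sleep(0)  # yield control so the tasks interleave
        return item_id, session


async def fan_out(item_ids: list[int]) -> list[tuple[int, FakeSession]]:
    return await asyncio.gather(*(process_item(i) for i in item_ids))
```

The design choice is that the session is created inside the task body, so no caller can accidentally hand the same instance to several concurrent tasks.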
---
## Anti-Patterns
- A singleton/global AsyncSession reused across requests.
- Sharing one AsyncSession across parallel tasks.
- Hidden session creation in lower repository helpers with no caller control.
- Mixing commit/rollback ownership across layers without a declared boundary.
---
## Operational Checks
- Exactly one `async_sessionmaker` is registered in app lifecycle.
- Request handlers receive sessions from one canonical dependency.
- No code path creates AsyncSession in module import side effects.
- Background jobs and API handlers each create task-local sessions.
---
## Testing Checks
- Dependency override exists for test session factory.
- Rollback behavior is verified for failed write units.
- Parallel-task tests verify no shared AsyncSession instances.
- Lifespan tests confirm session factory is initialized and teardown-safe.
---
## Migration Notes
- If current code uses global/shared sessions, fix scope first before refactoring query style.
- If legacy sync patterns are present, keep session boundary rules stable while migrating incrementally.
@@ -0,0 +1,63 @@
# <Concept Title>
Source:
- <primary source url>
- <secondary source url>
Status: draft|adopted|deprecated
Decision level: advisory|mandatory
Applies to: api-runtime|workers|tests
Last reviewed: YYYY-MM-DD
---
## Purpose
Describe what this concept governs and why it exists.
## Scope and Non-Goals
- In scope:
- Out of scope:
---
## Rules
- Rule 1
- Rule 2
---
## Recommended Pattern
```python
# minimal example
```
---
## Anti-Patterns
- Anti-pattern 1
- Anti-pattern 2
---
## Operational Checks
- Check 1
- Check 2
---
## Testing Checks
- Test 1
- Test 2
---
## Migration Notes
- Staged rollout notes and compatibility caveats.
@@ -0,0 +1,110 @@
# Async Transaction Boundaries
Source:
- https://docs.sqlalchemy.org/en/21/orm/session_transaction.html
- https://docs.sqlalchemy.org/en/21/orm/extensions/asyncio.html
- https://docs.sqlalchemy.org/en/21/core/connections.html
Status: adopted
Decision level: mandatory
Applies to: api-runtime, workers, tests
Last reviewed: 2026-06-17
---
## Purpose
Define consistent transaction demarcation for async SQLAlchemy so write behavior is predictable, rollback semantics are clear, and concurrent request flows remain safe.
---
## Scope and Non-Goals
- In scope: transaction ownership, write/read policy, exception and rollback behavior, nested transaction guidance.
- Out of scope: business-domain validation rules and cross-service distributed transactions.
---
## Rules
- Every mutating use case must run inside an explicit transaction boundary.
- Prefer `async with session.begin():` for write units.
- Keep transaction ownership at service/use-case boundary, not deep in helper internals.
- Read paths should not auto-upgrade into hidden write behavior.
- On exception in a transaction block, rely on rollback semantics and propagate or map exceptions intentionally.
---
## Recommended Patterns
### Pattern A: Single write unit
```python
async def create_order(session: AsyncSession, payload: OrderIn) -> Order:
    async with session.begin():
        order = Order(...)
        session.add(order)
        # additional writes...
    return order
```
### Pattern B: Explicit read flow
```python
async def get_order(session: AsyncSession, order_id: UUID) -> Order | None:
    stmt = select(Order).where(Order.id == order_id)
    return await session.scalar(stmt)
```
### Pattern C: Nested transaction (only when required)
```python
async with session.begin():
    # outer transaction
    async with session.begin_nested():
        # savepoint-scoped operation
        ...
```
Use nested transactions only when partial failure semantics are explicitly required.
---
## Exception and Rollback Policy
- Write block fails: transaction context rolls back.
- Caller decides whether to translate exception (for example to domain/API errors).
- Do not swallow DB exceptions silently; map or re-raise intentionally.
---
## Anti-Patterns
- Multiple commits scattered across one logical use case.
- Helper functions that commit/rollback without caller awareness.
- Mixing implicit and explicit transaction styles in confusing ways.
- Using savepoints as a default pattern rather than a targeted tool.
---
## Operational Checks
- All mutating service functions declare one clear transaction boundary.
- No repository/helper performs hidden commit calls.
- Transaction style is consistent across handlers and workers.
---
## Testing Checks
- Success path test verifies expected durable writes.
- Failure path test verifies rollback behavior.
- Tests cover concurrency-sensitive write flows.
- Savepoint usage (if present) has dedicated behavior tests.
---
## Migration Notes
- First stabilize session scope, then normalize transaction ownership.
- Replace ad hoc commit patterns incrementally with bounded write units.