Types that Pay for Themselves: Pydantic v2, mypy/pyright, and Runtime Contracts

Published: August 20, 2020 (14 min read)

Updated: December 1, 2024

You add types for editors and code reviews; you keep them for reliability, onboarding, and speed. In Python circa 3.11–3.13, types can do more than color your IDE—they can enforce contracts before and during runtime with minimal friction. This post is a production-first guide to that stack: static checkers (mypy/pyright) for design-time guarantees and Pydantic v2 for fast, precise runtime validation.

We’ll stay practical: mental models that scale, copy‑pasteable snippets, and diagrams you can hand to a teammate. Benchmarks and a references section arrive later; today we establish the contracts and their ergonomics.

Who this is for

  • You run services or data pipelines where invalid inputs become tickets—or outages.
  • You want static types without ceremony and runtime validation without latency cliffs.
  • You need patterns that are easy to teach and hard to misuse.

Key takeaways

  • Types should live at two layers: design-time (mypy/pyright) and runtime (Pydantic v2). Together they prevent whole classes of bugs with little boilerplate.
  • Validate at the edges. Keep hot paths free of repeated validators; pay the cost once.
  • Pydantic v2’s pydantic-core (Rust) makes validation 4–50× faster than v1 for many shapes, so you can afford strictness where it matters.
  • Prefer expressive standard typing (TypedDict, Protocol, Annotated) and teach your checker to be strict; loosen only with intent.

flowchart LR
    A[External Inputs<br/>HTTP • CLI • Queue • File] -->|Raw bytes/JSON| B[Parsing & Validation<br/>Pydantic v2]
    B --> C[Typed Domain Objects]
    C --> D[Business Logic]
    D --> E[Typed Outputs]
    subgraph Guardrails
        F[Design-time Checks<br/>mypy / pyright]
    end
    C -.checked by.-> F
    D -.checked by.-> F

A minimal contract that pays for itself

Pydantic v2 turns type hints into fast runtime validators; static checkers prove you use those types consistently elsewhere.

# app/types.py
from typing import Annotated
from pydantic import BaseModel, Field
 
NonEmptyStr = Annotated[str, Field(min_length=1)]
 
class User(BaseModel):
    id: int
    email: NonEmptyStr
    is_active: bool = True
 
# app/handlers.py
import json
 
from pydantic import TypeAdapter
 
from app.types import User
 
# Instantiate the adapter (call it, don't subscript it); built once, reused per call.
UserList = TypeAdapter(list[User])
 
def ingest_users(payload: bytes) -> list[User]:
    # Validate once at the edge, return typed objects for the rest of the code.
    data = json.loads(payload)
    return UserList.validate_python(data)

Why this shape works:

  • Contracts sit at boundaries (parsers, request bodies, message consumers).
  • Everything past the edge works with plain Python objects with static types.
  • TypeAdapter compiles validators once and reuses them (cheap per call).

Strictness without surprises

In v2, you control coercion precisely. Favor strict where correctness matters, allow gentle coercions where you really want them.

from pydantic import BaseModel, ConfigDict, field_validator
 
class Invoice(BaseModel):
    model_config = ConfigDict(strict=True)  # turn off implicit coercions
    id: int
    total_cents: int
    currency: str
 
    @field_validator("currency")
    @classmethod
    def uppercase_iso4217(cls, v: str) -> str:
        if len(v) != 3:
            raise ValueError("currency must be 3 letters")
        return v.upper()

With strict=True, strings like "123" will not coerce to integers; you’ll get a clear validation error at the edge instead of a latent bug downstream.
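To make the difference concrete, here is a minimal sketch assuming Pydantic v2 is installed. The model names LaxInvoice and StrictInvoice and the accepts helper are illustrative only; they are not part of the Invoice example above.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LaxInvoice(BaseModel):
    # Default (lax) mode: numeric strings are coerced to int.
    total_cents: int


class StrictInvoice(BaseModel):
    model_config = ConfigDict(strict=True)
    total_cents: int


def accepts(model: type[BaseModel], data: dict[str, object]) -> bool:
    """Return True if the payload validates against the model."""
    try:
        model.model_validate(data)
    except ValidationError:
        return False
    return True


payload = {"total_cents": "123"}
lax_ok = accepts(LaxInvoice, payload)        # the string is coerced to 123
strict_ok = accepts(StrictInvoice, payload)  # rejected at the edge
```

The same payload passes one model and fails the other, which is why the strictness decision belongs on the model rather than in scattered call sites.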

Don’t re-validate hot paths

You can validate function calls with @validate_call, but keep it off tight loops—use it as a guard on integration boundaries or administrative code paths.

from pydantic import validate_call
 
@validate_call
def transfer(user_id: int, amount_cents: int) -> None:
    ...
 
# Great at the boundary (CLI/admin/cron): correctness beats a few µs.

For inner loops or high-QPS handlers, validate once (payload → model) and pass typed objects.
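As a rough illustration of where the cost lands, the sketch below puts a decorated boundary function next to an undecorated inner one. The function core_transfer and the return strings are hypothetical, added only so the behavior is observable.

```python
from pydantic import ValidationError, validate_call


@validate_call
def transfer(user_id: int, amount_cents: int) -> str:
    # Arguments are validated (and, in the default lax mode, coerced) on every call.
    return f"user={user_id} amount={amount_cents}"


def core_transfer(user_id: int, amount_cents: int) -> str:
    # Undecorated: relies on static checking only, so there is no per-call validation cost.
    return f"user={user_id} amount={amount_cents}"


ok = transfer("7", 1500)  # lax coercion: "7" becomes 7 at runtime

try:
    transfer("not-a-number", 1500)
    rejected = False
except ValidationError:
    rejected = True
```

A static checker would flag both string calls before they run; the decorator is the safety net for callers the checker cannot see, such as a CLI or a cron job.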

sequenceDiagram
    participant Client
    participant API as API Edge
    participant Core as Core Logic
    Client->>API: JSON payload
    API->>API: Pydantic v2 validate (once)
    API-->>Core: Typed models (no revalidation)
    Core-->>API: Result
    API-->>Client: Response

Pydantic v2 mental model (fast and precise)

  • Validation is powered by pydantic-core (Rust). Think: a compiled tree of small validators—fast to run, consistent to reason about.
  • Use Annotated[...] with Field(...) to express constraints alongside types (lengths, regexes, ge/le ranges) without inventing new names.
  • Reach for TypeAdapter when you need validation for "bare" types (e.g., list[EmailStr]) without wrapping them in a BaseModel.
  • Favor field_validator/model_validator for invariants; prefer pure functions that transform or reject.
  • For serialization, call model_dump/model_dump_json and keep transport formats at the edges.

from typing import Annotated
from pydantic import BaseModel, Field, model_validator
 
PositiveInt = Annotated[int, Field(gt=0)]
 
class LineItem(BaseModel):
    sku: str
    quantity: PositiveInt
    unit_price_cents: PositiveInt
 
class Order(BaseModel):
    items: list[LineItem]
 
    @model_validator(mode="after")
    def non_empty(self) -> "Order":
        # mode="after" validators are instance methods: they receive the built model as self.
        if not self.items:
            raise ValueError("order must have at least one line item")
        return self
 
    @property
    def total_cents(self) -> int:
        return sum(i.quantity * i.unit_price_cents for i in self.items)
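The bullets above on Annotated constraints, TypeAdapter for bare types, and serialization can be exercised together in a few lines. This is a sketch assuming Pydantic v2; QUANTITIES and the single-field LineItem are illustrative names chosen for this example.

```python
import json
from typing import Annotated

from pydantic import BaseModel, Field, TypeAdapter, ValidationError

PositiveInt = Annotated[int, Field(gt=0)]
QUANTITIES = TypeAdapter(list[PositiveInt])  # bare type, no wrapper model


class LineItem(BaseModel):
    sku: str
    quantity: PositiveInt


good = QUANTITIES.validate_python([1, 2, 3])

try:
    QUANTITIES.validate_python([1, 0])
    bad_location = None
except ValidationError as exc:
    # "loc" points at the offending index inside the list.
    bad_location = exc.errors()[0]["loc"]

item = LineItem(sku="A-1", quantity=2)
wire = item.model_dump_json()   # JSON string for transport
round_trip = json.loads(wire)   # plain dict again
```

The constraint is declared once in the PositiveInt alias and reused by both the adapter and the model, so there is no second place to keep in sync.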

Static checkers as force multipliers

Static analysis makes the rest of the codebase honest about the types your validators produce. Turn on strict modes early; relax where needed with intent.

# pyproject.toml (mypy)
[tool.mypy]
python_version = "3.12"
strict = true
warn_unused_ignores = true
warn_redundant_casts = true
disallow_any_generics = true
plugins = ["pydantic.mypy"]
 
# pyproject.toml (pyright)
[tool.pyright]
pythonVersion = "3.12"
typeCheckingMode = "strict"
reportMissingTypeStubs = true
reportUnknownParameterType = true
reportUnknownMemberType = true
reportUnknownArgumentType = true

Guidance:

  • Treat unknowns as bugs, not noise. Add precise annotations or narrow with isinstance guards.
  • Prefer TypedDict for unvalidated mapping shapes at module boundaries; convert to models on ingress.
  • Encode protocols for pluggable components (typing.Protocol) instead of relying on comments.

from typing import Protocol, runtime_checkable
 
@runtime_checkable
class EmailSender(Protocol):
    def send(self, to: str, subject: str, body: str) -> None: ...
 
def notify(sender: EmailSender, user: "User") -> None:
    sender.send(user.email, "Welcome", "Hi!")
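One payoff of the protocol is that test doubles need no inheritance. The sketch below uses only the standard library; RecordingSender is a hypothetical fake, and notify takes a plain email string here to keep the example self-contained.

```python
from dataclasses import dataclass, field
from typing import Protocol, runtime_checkable


@runtime_checkable
class EmailSender(Protocol):
    def send(self, to: str, subject: str, body: str) -> None: ...


@dataclass
class RecordingSender:
    """Test double: satisfies EmailSender structurally, no inheritance needed."""

    sent: list[tuple[str, str, str]] = field(default_factory=list)

    def send(self, to: str, subject: str, body: str) -> None:
        self.sent.append((to, subject, body))


def notify(sender: EmailSender, email: str) -> None:
    sender.send(email, "Welcome", "Hi!")


fake = RecordingSender()
notify(fake, "a@b.com")
```

Note that isinstance checks against a runtime_checkable protocol only confirm that the method exists, not that its signature matches; the static checker is what verifies the signature.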

Edge-first validation: a small, durable pattern

Keep these invariants and you’ll avoid most footguns:

  1. Validate on ingress. Never trust input beyond the edge without a model.

  2. Don’t re-validate. Pass typed objects inward; static types guard the rest.

  3. Make strict the default. Opt into coercion when it’s an intentional UX choice.

  4. Measure the hot path. If validation shows up in profiles, move it outward or compile adapters once.

# fastapi-style example (applies to any stack)
from fastapi import FastAPI
from pydantic import BaseModel, ConfigDict
 
class CreateUser(BaseModel):
    model_config = ConfigDict(strict=True)
    email: str
 
app = FastAPI()
 
@app.post("/users")
def create_user(payload: CreateUser):
    # payload already validated; use payload.email as str safely
    return {"ok": True}

In the next sections we’ll deepen the toolbox: when to choose TypedDicts vs models, how to structure validators for composability, and how to wire strict static configs into CI without breaking developer flow—all with repeatable micro/macro benchmarks.

Designing types that scale (precision without ceremony)

Static types shine when they communicate intent clearly and guide editors to the right completions. Start with precise shapes and cheap narrowing.

Narrowing patterns you’ll use every day

from typing import Literal, Union, assert_never
 
def handle(kind: Literal["welcome", "reset"], payload: dict[str, str]) -> str:
    if kind == "welcome":
        return f"welcome:{payload['id']}"
    if kind == "reset":
        return f"reset:{payload['token']}"
    assert_never(kind)  # type checkers ensure we handled all cases
 
Num = Union[int, float]
 
def mean(xs: list[Num]) -> float:
    if not xs:
        raise ValueError("mean() requires at least one value")
    total = 0.0
    for x in xs:
        total += float(x)  # explicit conversion documents intent
    return total / len(xs)

Use Literal to disallow arbitrary strings, assert_never to force exhaustive handling, and explicit conversions such as float(x) when widening between numeric types. Prefer local narrowing over sprinkling # type: ignore.

Choosing the right container

At boundaries you often begin with unvalidated dicts, then promote to validated domain objects. Use this decision guide:

flowchart TD
    A[Incoming data] --> B{Trusted internal shape?}
    B -- No --> C["TypedDict (document keys, minimal friction)"]
    B -- Yes --> D{Performance critical?}
    D -- Yes --> E["dataclass / slots tuple (no validation)"]
    D -- No --> F["Pydantic v2 BaseModel (validate once at edge)"]
    C --> F

Patterns:

  • TypedDict for lightweight, unvalidated shapes (good for glue code and pre-parse).
  • BaseModel for validated domain objects with invariants and serialization.
  • dataclass/NamedTuple/slotted tuples for hot, purely-internal structures where speed/size matter and inputs are already trusted.

from typing import NotRequired, Required, TypedDict
 
class RawUser(TypedDict):
    id: Required[int]
    email: Required[str]
    name: NotRequired[str]

Convert to a model at the boundary and stop passing dicts any further in.

Pydantic v2 best practices (ergonomics and performance)

Pydantic v2 is fast, but you still own placement and reuse. Aim for strictness by default and compile once.

Make models small, immutable, and predictable

from pydantic import BaseModel, ConfigDict
 
class User(BaseModel):
    model_config = ConfigDict(
        strict=True,
        frozen=True,        # immutable instances (also makes them hashable)
        extra="forbid",     # reject unknown fields
    )
    id: int
    email: str

Frozen models are hashable where needed and resilient to accidental mutation. ConfigDict has no slots option for BaseModel; if per-instance memory matters, reach for a slotted dataclass (@dataclass(slots=True)) on trusted internal data instead. For ORMs or attribute-based sources, opt in explicitly:

class FromORM(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    id: int
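Both settings can be seen working in one short sketch, assuming Pydantic v2. UserRow is a hypothetical stand-in for an ORM row; any object exposing the right attributes would do.

```python
from dataclasses import dataclass

from pydantic import BaseModel, ConfigDict, ValidationError


@dataclass
class UserRow:
    """Hypothetical stand-in for an ORM row object."""

    id: int
    email: str


class User(BaseModel):
    model_config = ConfigDict(frozen=True, from_attributes=True)
    id: int
    email: str


# from_attributes=True lets model_validate read fields off an arbitrary object.
user = User.model_validate(UserRow(id=1, email="a@b.com"))

try:
    user.email = "changed@b.com"  # frozen=True blocks assignment
    mutation_blocked = False
except ValidationError:
    mutation_blocked = True

seen = {user}  # frozen models are hashable, so they can live in sets
```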

Reuse compiled validators and parse JSON directly

from functools import lru_cache
from pydantic import TypeAdapter
 
@lru_cache(maxsize=32)
def user_list_adapter() -> TypeAdapter[list[User]]:
    return TypeAdapter(list[User])
 
def parse_users_json(payload: bytes) -> list[User]:
    return user_list_adapter().validate_json(payload)

TypeAdapter compiles a validator tree once and can parse JSON natively via validate_json, skipping the intermediate Python json.loads where appropriate.

Prefer Annotated + Field over ad-hoc validators for simple constraints

from typing import Annotated
from pydantic import BaseModel, Field
 
# Named Email so it does not shadow pydantic.EmailStr
Email = Annotated[str, Field(min_length=3, pattern=r".+@.+\..+")]
 
class Signup(BaseModel):
    email: Email
    password: Annotated[str, Field(min_length=12)]

Reserve field_validator/model_validator for cross-field invariants or transforms that can’t be stated declaratively.

Construct vs validate (know the difference)

u = User.model_validate({"id": 1, "email": "a@b.com"})     # full validation
fast_u = User.model_construct(id=1, email="a@b.com")         # trust caller, skip validation

Use model_construct only inside trusted code paths (e.g., after you just validated the same payload or when you hydrate from your own persistence format).
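The risk is easy to demonstrate. In this sketch (Pydantic v2, default lax mode, a pared-down User model for illustration), the same input takes both routes:

```python
from pydantic import BaseModel


class User(BaseModel):
    id: int
    email: str


# Validation runs: the numeric string is coerced to an int.
validated = User.model_validate({"id": "1", "email": "a@b.com"})

# No validation: whatever is passed in is stored as-is.
constructed = User.model_construct(id="1", email="a@b.com")
```

After this, validated.id is the integer 1 while constructed.id is still the string "1", even though the annotation says int. The static type and the runtime value have diverged, which is exactly the class of bug the edge validation was meant to prevent.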

Validation placement and caching

Edge-first remains the rule. Practical tips:

  • Parse bytes directly with validate_json for hot JSON ingress.
  • Build one adapter per frequent shape and reuse it (memoize factories).
  • If parsing heterogeneous messages, pre-dispatch on a discriminant and validate only the selected variant.

from typing import Annotated, Literal
from pydantic import BaseModel, Field, TypeAdapter
 
class MsgA(BaseModel):
    kind: Literal["A"]
    v: int
 
class MsgB(BaseModel):
    kind: Literal["B"]
    s: str
 
AnyMsg = MsgA | MsgB
# The discriminator makes Pydantic dispatch on "kind" instead of trying each variant.
ANY_MSG = TypeAdapter(Annotated[AnyMsg, Field(discriminator="kind")])  # cache and reuse
 
def parse_any(msg: bytes) -> AnyMsg:
    return ANY_MSG.validate_json(msg)

CI gates: make it stick

Automate the guardrails so they don’t drift.

# .github/workflows/types.yml
name: types
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.12' }
      - run: pip install mypy pyright pydantic
      - name: mypy strict
        run: mypy --strict --warn-unused-ignores --warn-redundant-casts .
      - name: pyright strict
        run: pyright --warnings --stats  # uses the pip-installed pyright from the step above

Guidance:

  • Fail the build on unknowns. Keep a short allowlist of legacy files if needed; shrink it over time.
  • Enforce extra="forbid" on models at the edges; explicitly opt in to permissive modes.
  • Track type debt as a number (unknowns, ignores). Make it visible in PRs.
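For the last point, a small script is often enough to turn type debt into a number. The sketch below uses only the standard library and counts suppression comments; the function names and the choice to count per line are assumptions for illustration, not a standard tool.

```python
import re
from pathlib import Path

# Matches both mypy-style and pyright-style suppression comments.
SUPPRESSION = re.compile(r"#\s*(type:\s*ignore|pyright:\s*ignore)")


def count_suppressions(source: str) -> int:
    """Count lines in one file's source that carry a suppression comment."""
    return sum(1 for line in source.splitlines() if SUPPRESSION.search(line))


def type_debt(root: Path) -> dict[str, int]:
    """Map each .py file under root to its suppression count (non-zero only)."""
    debt: dict[str, int] = {}
    for path in sorted(root.rglob("*.py")):
        n = count_suppressions(path.read_text(encoding="utf-8"))
        if n:
            debt[str(path.relative_to(root))] = n
    return debt
```

Printing sum(type_debt(Path(".")).values()) in CI and comparing it against the main branch is one way to make the trend visible in a PR.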

Static checkers, deeply: mypy vs pyright in practice

Both are excellent; choose one as primary and keep the other as a periodic audit. Highlights:

  • mypy
    • Pros: mature, rich plugin system (use pydantic.mypy), granular flags
    • Cons: slower on huge repos; sometimes needs reveal_type spelunking
  • pyright
    • Pros: fast, great editor feedback, aggressive unknown reporting
    • Cons: no third‑party plugins; relies on library type hints being precise

Minimal configs that bite early:

# pyproject.toml (mypy)
[tool.mypy]
strict = true
warn_unused_ignores = true
disallow_any_unimported = true
plugins = ["pydantic.mypy"]
 
[tool.pydantic-mypy]
init_forbid_extra = true
warn_required_dynamic_aliases = true
 
// pyrightconfig.json
{
  "typeCheckingMode": "strict",
  "reportUnknownVariableType": true,
  "reportUnknownParameterType": true,
  "pythonVersion": "3.12"
}

Use backports (typing_extensions) for cutting-edge features while you straddle 3.11/3.12.

Advanced typing patterns for contracts

from typing import NewType, Annotated, Literal, TypedDict, Protocol, runtime_checkable
from typing_extensions import Required, NotRequired, Self
 
UserId = NewType("UserId", int)       # domain-distinct ids
NonEmpty = Annotated[str, ...]         # pair with Pydantic Field later
 
class UserRow(TypedDict):              # unvalidated raw row
    id: Required[int]
    email: Required[str]
    name: NotRequired[str]
 
@runtime_checkable
class Clock(Protocol):
    def now(self) -> float: ...
 
class Timer:
    def __init__(self, clock: Clock) -> None:
        self._c = clock
        self._t0 = clock.now()
    def reset(self) -> Self:
        self._t0 = self._c.now()
        return self

For generics on 3.12+, the new parameter syntax (PEP 695) keeps types readable:

def pick[T](xs: list[T]) -> T:
    return xs[0]

Pydantic v2: advanced features you’ll actually use

Discriminated unions (no more hand-rolled switches)

from typing import Annotated, Literal
from pydantic import BaseModel, Field, TypeAdapter
 
class A(BaseModel): kind: Literal["A"]; v: int
class B(BaseModel): kind: Literal["B"]; s: str
 
Message = A | B
# The discriminator is declared with Field on the annotated union, not through config.
DISPATCHED = TypeAdapter(Annotated[Message, Field(discriminator="kind")])
 
def parse(msg: bytes) -> Message:
    return DISPATCHED.validate_json(msg)

Or declare the discriminator with Field when the union is a field of another model:

class Event(BaseModel):
    payload: Message = Field(discriminator="kind")

RootModel for bare types

from pydantic import RootModel
 
class Ints(RootModel[list[int]]):
    pass
 
assert Ints.model_validate([1, 2]).root == [1, 2]

Computed properties and custom serializers

from pydantic import BaseModel, computed_field, field_serializer
 
class Product(BaseModel):
    price_cents: int
    tax_rate: float
 
    @computed_field  # appears in dumps but not in inputs
    @property
    def price_with_tax_cents(self) -> int:
        return int(round(self.price_cents * (1 + self.tax_rate), 0))
 
    @field_serializer("tax_rate")
    def serialize_rate(self, v: float) -> str:
        return f"{v:.3f}"

Aliases and back-compat without clutter

from pydantic import BaseModel, Field
 
class LegacyUser(BaseModel):
    user_id: int = Field(alias="id")     # accept id, expose user_id
    email: str
 
u = LegacyUser.model_validate({"id": 1, "email": "a@b.com"})
assert u.model_dump(by_alias=False) == {"user_id": 1, "email": "a@b.com"}
assert u.model_dump(by_alias=True)  == {"id": 1, "email": "a@b.com"}

Evolving schemas safely (versioning and migration)

Schema changes should be boring. Keep these rules:

  1. Add fields with defaults. Make them required only after producers are updated.
  2. Rename via aliases; drop old names after a deprecation window.
  3. For breaking shape changes, version at the envelope and dispatch.

flowchart LR
    A[Producer v1] -- old field --> B[Gateway]
    A2[Producer v2] -- new field --> B
    B -- alias/normalize --> C[Unified Model]
    C --> D[Core Service]
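For the rename rule, one option in Pydantic v2 is AliasChoices, which lets a single field accept both the old and the new key during the deprecation window. The Customer model below is a hypothetical example.

```python
from pydantic import AliasChoices, BaseModel, Field


class Customer(BaseModel):
    # Accept the new key first, then fall back to the legacy one.
    full_name: str = Field(validation_alias=AliasChoices("full_name", "name"))


old = Customer.model_validate({"name": "Ada Lovelace"})       # legacy producer
new = Customer.model_validate({"full_name": "Ada Lovelace"})  # updated producer
unified = old.model_dump()
```

Both payloads normalize to the same model, and the dump uses only the new field name. Once the old producers are gone, dropping "name" from the choices is a one-line change.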

Example envelope versioning with discriminant:

from typing import Literal
from pydantic import BaseModel, TypeAdapter
 
class V1(BaseModel): version: Literal[1]; name: str
class V2(BaseModel): version: Literal[2]; full_name: str
Msg = V1 | V2
ADAPTER = TypeAdapter(Msg)
 
def normalize(payload: bytes) -> dict:
    m = ADAPTER.validate_json(payload)
    return {"name": m.name if isinstance(m, V1) else m.full_name}

Observability for contracts

Track validation outcomes and surface context for triage.

from dataclasses import dataclass
from pydantic import TypeAdapter, ValidationError
 
@dataclass
class ValidationMetrics:
    ok: int = 0
    failed: int = 0
 
METRICS = ValidationMetrics()
 
def try_parse[T](adapter: TypeAdapter[T], raw: bytes) -> T | None:
    # No "global" needed: we mutate attributes on METRICS, we never rebind the name.
    try:
        v = adapter.validate_json(raw)
        METRICS.ok += 1
        return v
    except ValidationError as e:
        METRICS.failed += 1
        # log e.errors() with request id / tenant / endpoint
        return None

Expose METRICS via your metrics stack (Prometheus/OpenTelemetry) and alert on sustained failures.

Measuring overhead honestly (micro and macro)

Use a tiny harness to compare strategies; keep it repeatable.

# bench/typed_edge.py
import json, time
from pydantic import BaseModel, TypeAdapter
 
class U(BaseModel): id: int; email: str
AD = TypeAdapter(list[U])
PAYLOAD = json.dumps([{"id": i, "email": f"e{i}@x"} for i in range(10_000)]).encode()
 
def t(fn, n=5):
    best = 9e9
    for _ in range(n):
        t0 = time.perf_counter(); fn(); dt = time.perf_counter() - t0
        best = min(best, dt)
    return best
 
def parse_adapter(): AD.validate_json(PAYLOAD)
def parse_json_then_validate(): AD.validate_python(json.loads(PAYLOAD))
 
print("adapter json:", t(parse_adapter))
print("json+validate:", t(parse_json_then_validate))

Guidance:

  • Pin Python version, CPU governor, and run multiple times. Inspect allocations if memory is hot.
  • Expect validate_json to win for large payloads by avoiding an intermediate Python object graph.

Anti-patterns (and better patterns)

  • Over-validating: running @validate_call and then validating payload models again in the same path. Validate once at ingress.
  • Permissive by default: leaving coercions on everywhere. Default strict=True; opt into coercion where UX demands it.
  • Passing dicts around: promote to models early; use plain dataclasses for hot, trusted internals.
  • Silent growth: allowing extras by default. Prefer extra="forbid" at edges to surface breaking producers.
  • Sprinkled # type: ignore: favor precise annotations and local narrowing; isolate unavoidable ignores and track them.
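The "silent growth" item is the easiest to miss, because nothing fails. A sketch assuming Pydantic v2, with LooseUser and StrictUser as illustrative names:

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LooseUser(BaseModel):
    email: str  # default extra="ignore": unknown keys are dropped silently


class StrictUser(BaseModel):
    model_config = ConfigDict(extra="forbid")
    email: str


payload = {"email": "a@b.com", "role": "admin"}  # "role" is not in the schema

loose = LooseUser.model_validate(payload)

try:
    StrictUser.model_validate(payload)
    offending = None
except ValidationError as exc:
    # "loc" names the unexpected key.
    offending = exc.errors()[0]["loc"]
```

The loose model accepts the payload and quietly discards "role"; the strict one names the offending key, so a producer that starts sending new fields shows up in your error logs instead of in a postmortem.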

Migration quickstart (v1 → v2)

  1. Replace .parse_obj/.parse_raw with model_validate/model_validate_json.
  2. Replace .dict()/.json() with model_dump/model_dump_json.
  3. Swap @validator for @field_validator; @root_validator for @model_validator.
  4. Prefer Annotated[T, Field(...)] over con* types; keep typing_extensions handy.
  5. For bare types, introduce TypeAdapter or RootModel.

Minimal example:

# v1
from pydantic import BaseModel, validator
 
class U(BaseModel):
    id: int
 
    @validator("id")
    def pos(cls, v):
        assert v > 0
        return v
 
U.parse_obj({"id": 1}).dict()
 
# v2
from pydantic import BaseModel, field_validator
 
class U(BaseModel):
    id: int
 
    @field_validator("id")
    @classmethod
    def pos(cls, v: int) -> int:
        if v <= 0:
            raise ValueError("id must be > 0")
        return v  # must sit outside the if, or valid ids come back as None
 
U.model_validate({"id": 1}).model_dump()

End-to-end shape: typed edge, typed core, strict CI

# app/contracts.py
from typing import Annotated
from pydantic import BaseModel, ConfigDict, Field, TypeAdapter
 
Email = Annotated[str, Field(min_length=3, pattern=r".+@.+\..+")]
 
class CreateUser(BaseModel):
    model_config = ConfigDict(strict=True, extra="forbid")
    email: Email
 
USERS = TypeAdapter(list[CreateUser])
 
# app/handler.py
from app.contracts import USERS, CreateUser
 
def handle(payload: bytes) -> list[CreateUser]:
    return USERS.validate_json(payload)

CI enforces strict static types, and runtime validation guards edges. Everything inside speaks plain, typed Python.

Production checklist (printable)

  • Validate once at the edge; pass typed objects inward.
  • Default to strict=True, extra="forbid", and frozen=True where feasible; use slotted dataclasses for hot, trusted internal structures.
  • Reuse TypeAdapter instances; prefer validate_json for large JSON.
  • Keep static checking strict; track unknowns/ignores as debt.
  • Version schemas at the envelope; use aliases for renames.
  • Log validation errors with tenant/request context; export failure rates.

Key takeaways

  • Design-time + runtime types catch whole bug classes with minimal ceremony.
  • Pydantic v2’s speed makes strict validation practical on hot ingress.
  • Place validators at boundaries, not in inner loops.
  • Strict static configs keep the rest of the code honest and self-documenting.

References