05 · Enrich

Enrich is the fifth pipeline stage. It calls the configured LLM (and, optionally, a VLM) once per document to add the AI-derived metadata that makes a knowledge space searchable and skimmable: a detected type, a list of topics, tags, and a summary. It is the only stage that can send your content to a cloud service — and the only stage you routinely drop entirely for a pure-graph, no-LLM run.

By default, this stage uses the cloud-backed LLM openai:gpt-5-mini, and the VLM is disabled (none). For air-gapped runs, switch the LLM to ollama:qwen2.5 or none; the stage contract is the same.

What Enrich produces

Enrich reads the Documents and Chunks assembled by the earlier stages from the shared SpaceContext and writes its results back onto the same context (the stage contract is run(ctx: SpaceContext) -> SpaceContext). The enrichments land on each Document and propagate into the relevant Chunk.metadata:

Field	Where it lands	Phase	Description
`type`	`Document.type` (and `Source.type`)	P1	Detected/refined document type, e.g. `policy`, `guide`, `table`. Type-aware enrichment tailors the prompt to the document kind.
`topics`	`Document.topics`, `Chunk.metadata.topics`	P0	Salient subjects covered by the document.
`tags`	`Document.tags`	P0	Short, keyword-style labels for filtering.
`summary`	`Document.summary`, `Chunk.metadata.summary`	P0	A concise natural-language summary.
`references`	`Document.references` (typed `Relation`s)	P1	LLM-assisted reference resolution that complements the Relate stage.

A resulting Document looks like this in index.json:

{
  "id": "doc_0007",
  "path": "policies/data/retention.pdf",
  "type": "policy",
  "topics": ["retention", "compliance"],
  "tags": ["gdpr", "data"],
  "summary": "Defines the 90-day retention rule…"
}

See index.json and the data models for the full shape.
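
As a rough illustration of how these fields support filtering, the sketch below treats Document records like the one above as plain dictionaries and selects those carrying a given tag. The helper name `with_tag` and the second sample record are hypothetical, and nothing is assumed about how documents are nested inside index.json.

```python
import json

# Two Document records shaped like the example above; the second one is
# made up purely for illustration.
RAW = """
[
  {"id": "doc_0007", "path": "policies/data/retention.pdf", "type": "policy",
   "topics": ["retention", "compliance"], "tags": ["gdpr", "data"],
   "summary": "Defines the 90-day retention rule…"},
  {"id": "doc_0008", "path": "guides/setup.md", "type": "guide",
   "topics": ["installation"], "tags": ["onboarding"],
   "summary": "Hypothetical second record."}
]
"""

def with_tag(documents, tag):
    """Return the documents whose `tags` list contains `tag`."""
    return [doc for doc in documents if tag in doc.get("tags", [])]

documents = json.loads(RAW)
gdpr_docs = with_tag(documents, "gdpr")
print([doc["id"] for doc in gdpr_docs])
```

The same pattern applies to `type` or `topics`; only the key being inspected changes.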

The components it uses

Enrich is the only stage bound to the LLM and VLM component slots. Both are swappable adapters behind typed protocols — swap them by name in indx.toml or by passing an instance via the SDK (see bring your own stack).

The `LLM` protocol

The text model drives type, topics, tags, and summaries.

@runtime_checkable
class LLM(Protocol):
    """Text generation for enrichment (type, topics, tags, summaries).
    Default: openai:gpt-5-mini."""
    def complete(self, prompt: str, *, system: str | None = None,
                 max_tokens: int = 512, temperature: float = 0.0) -> str: ...

The default openai:gpt-5-mini uses the OpenAI adapter. The name string carries an optional :model suffix, so openai:gpt-5-mini selects the openai adapter with the gpt-5-mini model, while ollama:qwen2.5 selects the local Ollama adapter with the qwen2.5 model. Setting the LLM to none resolves to the null adapter and effectively skips text enrichment.
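
The `<name>[:model]` convention can be sketched as a small parser. This is only an illustration of the naming scheme, not the library's own resolver: `ComponentSpec` and `parse_component_name` are hypothetical names, and splitting at the first colon is an assumption chosen so that a model identifier may itself contain colons.

```python
from typing import NamedTuple, Optional

class ComponentSpec(NamedTuple):
    adapter: str            # e.g. "openai", "ollama", or "none"
    model: Optional[str]    # e.g. "gpt-5-mini"; None when no suffix is given

def parse_component_name(name: str) -> ComponentSpec:
    """Split '<name>[:model]' at the first colon (hypothetical helper)."""
    adapter, _, model = name.partition(":")
    if not adapter:
        raise ValueError(f"missing adapter name in {name!r}")
    # An absent or empty suffix is reported as None.
    return ComponentSpec(adapter, model or None)

print(parse_component_name("openai:gpt-5-mini"))
print(parse_component_name("ollama:qwen2.5"))
print(parse_component_name("none"))
```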

The `VLM` protocol

The optional vision-language model describes images and layout captured during Parse (carried on ParsedDoc.images). It is off by default.

@runtime_checkable
class VLM(Protocol):
    """Vision-language enrichment for images/layout. Default: none (disabled)."""
    def describe(self, image: bytes, *, prompt: str | None = None) -> str: ...
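
Because both protocols are runtime-checkable, any object with the right method can stand in for a model. The sketch below redeclares the two protocols locally so it runs on its own (using `Optional[str]` in place of `str | None` for older Python versions) and defines two hypothetical test doubles, `CannedLLM` and `SizeVLM`, that make no network calls.

```python
from typing import Optional, Protocol, runtime_checkable

# Local copies of the two protocols so this sketch is self-contained.
@runtime_checkable
class LLM(Protocol):
    def complete(self, prompt: str, *, system: Optional[str] = None,
                 max_tokens: int = 512, temperature: float = 0.0) -> str: ...

@runtime_checkable
class VLM(Protocol):
    def describe(self, image: bytes, *, prompt: Optional[str] = None) -> str: ...

class CannedLLM:
    """Hypothetical test double: returns a fixed reply."""
    def __init__(self, reply: str) -> None:
        self.reply = reply

    def complete(self, prompt: str, *, system: Optional[str] = None,
                 max_tokens: int = 512, temperature: float = 0.0) -> str:
        return self.reply

class SizeVLM:
    """Hypothetical test double: reports the image size in bytes."""
    def describe(self, image: bytes, *, prompt: Optional[str] = None) -> str:
        return f"image of {len(image)} bytes"

llm = CannedLLM("A short summary.")
vlm = SizeVLM()
# isinstance() against a runtime-checkable protocol only verifies that the
# method exists; it does not validate the signature.
print(isinstance(llm, LLM), isinstance(vlm, VLM))
print(llm.complete("Summarize this document."))
```

Doubles like these are handy for exercising a pipeline in tests without sending content anywhere.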

Choosing which enrichments run

The [enrich] section of indx.toml controls both the models and, through the metadata key, exactly which enrichments are produced:

[enrich]
llm      = "openai:gpt-5-mini" # LLM name[:model] or "none"
vlm      = "none"             # VLM name or "none"
metadata = ["type", "topics", "tags", "summary"]

Key	Type	Default	Allowed values
`llm`	string	`openai:gpt-5-mini`	`<name>[:model]`, `none`
`vlm`	string	`none`	`<name>`, `none`
`metadata`	list of strings	`["type","topics","tags","summary"]`	any subset of those four

Trim metadata to skip work you don’t need: for example, metadata = ["summary"] produces summaries only, reducing the work requested from the LLM. The llm and vlm values can also be overridden on the CLI with --llm and --vlm (see the CLI reference).
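
A minimal sketch of how the metadata list could be validated against the four allowed values. The helper `resolve_metadata` is hypothetical, and `enrich_config` stands for the already-parsed [enrich] table; how the tool itself reports an invalid value is not specified here.

```python
ALLOWED_METADATA = ("type", "topics", "tags", "summary")

def resolve_metadata(enrich_config: dict) -> list:
    """Return the requested enrichments, defaulting to all four.

    Hypothetical helper: `enrich_config` is the parsed [enrich] table.
    """
    requested = enrich_config.get("metadata", list(ALLOWED_METADATA))
    unknown = [name for name in requested if name not in ALLOWED_METADATA]
    if unknown:
        raise ValueError(f"unknown metadata fields: {unknown}")
    # Keep a stable order and drop duplicates.
    return [name for name in ALLOWED_METADATA if name in requested]

print(resolve_metadata({"llm": "openai:gpt-5-mini", "metadata": ["summary"]}))
print(resolve_metadata({}))
```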

Concurrency

Enrich issues per-document model calls with bounded concurrency, defaulting to a maximum of 4 concurrent calls. This default is deliberately conservative: it keeps a laptop responsive and respects provider rate limits when a cloud LLM is configured.

Stage	Parameter	Default
Enrich	max concurrency	4

Tune concurrency through the adapter’s indx.toml sub-table when a backend can handle more throughput. For the broader picture of how each stage parallelizes, see the performance guide.
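
To make "bounded concurrency" concrete, here is a sketch using a thread pool capped at four workers. It is not the stage's actual scheduler; `fake_enrich` is a hypothetical stand-in for one per-document model call, and the counter exists only to show that no more than four calls are ever in flight.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENCY = 4  # the documented default

_lock = threading.Lock()
_active = 0
peak = 0  # highest number of calls observed in flight at once

def fake_enrich(doc_id: str) -> str:
    """Stand-in for one per-document model call (hypothetical)."""
    global _active, peak
    with _lock:
        _active += 1
        peak = max(peak, _active)
    time.sleep(0.01)  # simulate network latency
    with _lock:
        _active -= 1
    return f"summary for {doc_id}"

doc_ids = [f"doc_{i:04d}" for i in range(12)]
# The pool size is the concurrency bound; map() yields results in input order.
with ThreadPoolExecutor(max_workers=MAX_CONCURRENCY) as pool:
    summaries = list(pool.map(fake_enrich, doc_ids))

print(len(summaries), "documents enriched; peak concurrency:", peak)
```

Raising the bound only helps when the backend can absorb the extra requests; against a rate-limited provider it mostly converts waiting into retries.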

Determinism

Enrichment is built to be reproducible:

temperature=0.0 by default. The LLM.complete signature defaults the temperature to zero so output is as stable as the provider allows.
Recorded provenance. Because some providers are not bit-for-bit reproducible, the resolved model name and version are written into index.json.metadata and the .indx archive manifest for auditability — a loaded archive records exactly which model produced its metadata.

Re-running with identical inputs, config, and component versions yields a byte-identical index.json (modulo the created_at timestamp), provided the model provider itself returns identical output. See reproducibility for the full determinism contract.
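
One way to check two builds against each other is to hash a canonical form of index.json with the timestamp removed. The sketch below is an assumption-laden illustration: `strip_key` and `stable_digest` are hypothetical helpers, the `documents` key in the sample data is made up, and `created_at` is stripped wherever it appears because its exact location is not assumed. Note that this compares content rather than raw bytes.

```python
import hashlib
import json

def strip_key(value, key="created_at"):
    """Recursively drop `key` from nested dicts and lists."""
    if isinstance(value, dict):
        return {k: strip_key(v, key) for k, v in value.items() if k != key}
    if isinstance(value, list):
        return [strip_key(item, key) for item in value]
    return value

def stable_digest(index: dict) -> str:
    """SHA-256 over a canonical JSON form with timestamps removed."""
    canonical = json.dumps(strip_key(index), sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Two hypothetical builds that differ only in their timestamp.
run_a = {"created_at": "2025-01-01T00:00:00Z",
         "documents": [{"id": "doc_0007", "type": "policy"}]}
run_b = {"created_at": "2025-01-02T09:30:00Z",
         "documents": [{"id": "doc_0007", "type": "policy"}]}
print(stable_digest(run_a) == stable_digest(run_b))
```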

Privacy: the one stage that may egress

The default LLM is cloud-backed, so Enrich can egress unless you switch to ollama:qwen2.5 or none. When the local profile is used, enrichment stays on your machine; when a cloud LLM or VLM is configured, treat this stage as the egress boundary.

If you do point Enrich at a cloud provider, you can still control what leaves your machine:

Insert a redaction stage before Enrich so sensitive content is stripped before any egress-capable component sees it. Because stages are ordinary objects that mutate the shared context, a custom stage slots cleanly into the pipeline:

# redact() is a placeholder for your own redaction function;
# pipeline is an existing pipeline instance.
class PiiRedactStage:
    name = "pii-redact"

    def run(self, ctx: SpaceContext) -> SpaceContext:
        for chunk in ctx.chunks:
            chunk.text = redact(chunk.text)   # strip sensitive content in place
        return ctx            # MUST return the same context

# Lands after Chunk (index 2) and before Relate (which shifts from index 3 to 4), and therefore before Enrich.
pipeline.insert(3, PiiRedactStage())

See writing a custom stage for the full recipe.
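
The `redact` function used by the stage is left to you. As a starting point only, here is a deliberately simple sketch that masks email addresses and phone-like digit runs with regular expressions. The patterns and placeholder tokens are assumptions for illustration; real PII detection needs considerably more than this.

```python
import re

# Deliberately simple patterns; they will miss many kinds of PII and may
# over-match other digit sequences.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

print(redact("Contact jane.doe@example.com or +1 (555) 010-9999."))
```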

Running with no LLM

Enrich is fully optional. If no LLM is available or wanted, drop the stage and the pipeline produces a complete knowledge space with the document graph, chunks, relations, and embeddings — just without AI-derived metadata.

from indx import DirectoryPipeline

pipeline = (
    DirectoryPipeline(embedder="bge-m3", store="qdrant")
    .drop("enrich")          # no LLM calls at all
)
space = pipeline.run("./docs", "./ai-ready")

On the CLI, set --llm none to disable text enrichment while keeping the stage in place. Per-item failures inside Enrich (for example, an LLM call that times out for a single document) are recorded as a skip error on ctx.errors and the pipeline continues; the --strict flag promotes those skips to fatal.

Error handling

Like Parse, Enrich defaults to per-item resilience. A model call that fails for one document appends a StageError(kind="skip") to the context and processing continues, so one bad document never aborts the build. Those non-fatal errors surface on the result under space.metadata["errors"]. Misconfiguration — an unknown LLM name, for instance — is fatal and aborts before any stage runs. See errors and exit codes.
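
The skip-and-continue behaviour, and how a strict mode changes it, can be sketched as follows. This is not the stage's real implementation: only `kind="skip"` comes from the description above, while the `item` and `message` fields, the `Outcome` container, and `enrich_all` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class StageError:
    kind: str          # "skip" for non-fatal per-item failures
    item: str          # hypothetical field: which document failed
    message: str       # hypothetical field: why it failed

@dataclass
class Outcome:
    summaries: Dict[str, str] = field(default_factory=dict)
    errors: List[StageError] = field(default_factory=list)

def enrich_all(doc_ids: List[str], call: Callable[[str], str],
               strict: bool = False) -> Outcome:
    """Enrich each document; record failures as skips unless strict."""
    outcome = Outcome()
    for doc_id in doc_ids:
        try:
            outcome.summaries[doc_id] = call(doc_id)
        except Exception as exc:
            if strict:
                raise  # strict mode: the first failure aborts the run
            outcome.errors.append(StageError("skip", doc_id, str(exc)))
    return outcome

def flaky_call(doc_id: str) -> str:
    """Hypothetical model call that times out for one document."""
    if doc_id == "doc_0002":
        raise TimeoutError("model call timed out")
    return f"summary for {doc_id}"

result = enrich_all(["doc_0001", "doc_0002", "doc_0003"], flaky_call)
print(sorted(result.summaries), [e.item for e in result.errors])
```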

Next stage

With metadata attached, the context flows into the final stage, which vectorizes every chunk, writes it to the store, and seals the archive.

→ 06 · Embed + Pack