
05 · Enrich

Enrich is the fifth pipeline stage. It calls the configured LLM (and, optionally, a VLM) once per document to add the AI-derived metadata that makes a knowledge space searchable and skimmable: a detected type, a list of topics, tags, and a summary. It is the only stage that can send your content to a cloud service — and the only stage you routinely drop entirely for a pure-graph, no-LLM run.

By default, Enrich uses the cloud-backed LLM openai:gpt-5-mini and leaves the VLM set to none (disabled). For air-gapped runs, switch the LLM to ollama:qwen2.5 or none; the stage contract is the same.

Enrich reads the Documents and Chunks assembled by the earlier stages from the shared SpaceContext and writes its results back onto the same context (the stage contract is run(ctx: SpaceContext) -> SpaceContext). The enrichments land on each Document and propagate into the relevant Chunk.metadata:

| Field | Where it lands | Phase | Description |
| --- | --- | --- | --- |
| type | Document.type (and Source.type) | P1 | Detected/refined document type, e.g. policy, guide, table. Type-aware enrichment tailors the prompt to the document kind. |
| topics | Document.topics, Chunk.metadata.topics | P0 | Salient subjects covered by the document. |
| tags | Document.tags | P0 | Short, keyword-style labels for filtering. |
| summary | Document.summary, Chunk.metadata.summary | P0 | A concise natural-language summary. |
| references | Document.references (typed Relations) | P1 | LLM-assisted reference resolution that complements the Relate stage. |

A resulting Document looks like this in index.json:

{
  "id": "doc_0007",
  "path": "policies/data/retention.pdf",
  "type": "policy",
  "topics": ["retention", "compliance"],
  "tags": ["gdpr", "data"],
  "summary": "Defines the 90-day retention rule…"
}

See index.json and the data models for the full shape.
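As a rough illustration of how document-level enrichments might propagate into chunk metadata, here is a minimal sketch. The `Document` and `Chunk` dataclasses and the `propagate` helper are hypothetical stand-ins written for this example, not the library's actual data models; only the field names come from the table above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Chunk:
    # Hypothetical stand-in for the real Chunk model.
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Document:
    # Hypothetical stand-in for the real Document model.
    id: str
    path: str
    type: Optional[str] = None
    topics: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    summary: Optional[str] = None
    chunks: List[Chunk] = field(default_factory=list)


def propagate(doc: Document) -> None:
    """Copy topics and summary onto every chunk of the document.

    Per the table, tags stay on the Document only.
    """
    for chunk in doc.chunks:
        chunk.metadata["topics"] = list(doc.topics)  # copy, do not alias
        chunk.metadata["summary"] = doc.summary


doc = Document(
    id="doc_0007",
    path="policies/data/retention.pdf",
    type="policy",
    topics=["retention", "compliance"],
    tags=["gdpr", "data"],
    summary="Defines the 90-day retention rule",
    chunks=[Chunk("Section 1"), Chunk("Section 2")],
)
propagate(doc)
```

After `propagate(doc)`, each chunk carries its parent's topics and summary, which is what lets a retrieved chunk be filtered or skimmed without loading the whole document.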

Enrich is the only stage bound to the LLM and VLM component slots. Both are swappable adapters behind typed protocols — swap them by name in indx.toml or by passing an instance via the SDK (see bring your own stack).

The text model drives type, topics, tags, and summaries.

from typing import Protocol, runtime_checkable


@runtime_checkable
class LLM(Protocol):
    """Text generation for enrichment (type, topics, tags, summaries).

    Default: openai:gpt-5-mini."""

    def complete(self, prompt: str, *, system: str | None = None,
                 max_tokens: int = 512, temperature: float = 0.0) -> str: ...

The default openai:gpt-5-mini uses the OpenAI adapter. The name string carries an optional :model suffix, so openai:gpt-5-mini selects the openai adapter with the gpt-5-mini model, while ollama:qwen2.5 selects the local Ollama adapter. Setting the LLM to none resolves the null adapter and effectively skips text enrichment.
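The name[:model] convention can be sketched as a small parser. This is an assumption about how such a string could be split, not the library's actual resolver; `ComponentName` and `parse_component_name` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class ComponentName(NamedTuple):
    adapter: str            # e.g. "openai", "ollama", "none"
    model: Optional[str]    # e.g. "gpt-5-mini", or None when no suffix is given


def parse_component_name(name: str) -> ComponentName:
    """Split 'adapter[:model]' at the first colon (hypothetical helper)."""
    adapter, sep, model = name.strip().partition(":")
    if not adapter:
        raise ValueError(f"component name has no adapter part: {name!r}")
    # An empty suffix ("openai:") is treated the same as no suffix.
    return ComponentName(adapter, model if sep and model else None)


default_llm = parse_component_name("openai:gpt-5-mini")
local_llm = parse_component_name("ollama:qwen2.5")
disabled = parse_component_name("none")
```

Splitting at the first colon only means a model identifier that itself contains a colon stays intact in the `model` part.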

The optional vision-language model describes images and layout captured during Parse (carried on ParsedDoc.images). It is off by default.

from typing import Protocol, runtime_checkable


@runtime_checkable
class VLM(Protocol):
    """Vision-language enrichment for images/layout. Default: none (disabled)."""

    def describe(self, image: bytes, *, prompt: str | None = None) -> str: ...
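Because both slots are typed protocols, any object with the right method shape should qualify as an adapter. The sketch below restates the VLM protocol from above and adds `StaticCaptionVLM`, a hypothetical test double invented for this example; it is not an adapter shipped with the library.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class VLM(Protocol):
    """Restated from the protocol above so the example is self-contained."""

    def describe(self, image: bytes, *, prompt: Optional[str] = None) -> str: ...


class StaticCaptionVLM:
    """Hypothetical stand-in that returns a fixed caption, useful in tests."""

    def __init__(self, caption: str = "figure") -> None:
        self.caption = caption

    def describe(self, image: bytes, *, prompt: Optional[str] = None) -> str:
        # No model call: report the caption and the payload size.
        return f"{self.caption} ({len(image)} bytes)"


class NotAVLM:
    """Lacks a describe() method, so it does not satisfy the protocol."""

    def caption(self, image: bytes) -> str:
        return ""


vlm = StaticCaptionVLM("architecture diagram")
description = vlm.describe(b"\x89PNG")
```

Note that `isinstance(obj, VLM)` on a runtime-checkable protocol only verifies that a `describe` attribute exists; it does not check the signature, so a static type checker is still the stronger guard.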

The [enrich] section of indx.toml controls both the models and exactly which enrichments are produced via metadata:

[enrich]
llm = "openai:gpt-5-mini" # LLM name[:model] or "none"
vlm = "none" # VLM name or "none"
metadata = ["type", "topics", "tags", "summary"]

| Key | Type | Default | Allowed values |
| --- | --- | --- | --- |
| llm | string | openai:gpt-5-mini | <name>[:model], none |
| vlm | string | none | <name>, none |
| metadata | list of strings | ["type","topics","tags","summary"] | any subset of those four |

Trim metadata to skip work you don’t need — for example metadata = ["summary"] produces summaries only, saving LLM calls. The same values can be overridden on the CLI with --llm and --vlm (see the CLI reference).
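To make the defaults and the "any subset" rule concrete, here is a minimal sketch of validating an already-parsed [enrich] section. `resolve_enrich_config` is a hypothetical helper for illustration; how the tool actually validates its configuration is not shown in this page.

```python
from typing import Any, Dict

ALLOWED_METADATA = ("type", "topics", "tags", "summary")


def resolve_enrich_config(section: Dict[str, Any]) -> Dict[str, Any]:
    """Apply documented defaults and reject unknown metadata names.

    `section` is assumed to be the [enrich] table after TOML parsing.
    """
    config: Dict[str, Any] = {
        "llm": "openai:gpt-5-mini",
        "vlm": "none",
        "metadata": list(ALLOWED_METADATA),
    }
    config.update(section)
    unknown = [m for m in config["metadata"] if m not in ALLOWED_METADATA]
    if unknown:
        raise ValueError(f"unknown metadata fields: {unknown}")
    return config


summaries_only = resolve_enrich_config({"metadata": ["summary"]})
air_gapped = resolve_enrich_config({"llm": "ollama:qwen2.5"})
```

Failing fast on an unknown field name matches the page's description of misconfiguration as fatal, although the exact error raised by the real tool is an assumption here.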

Enrich issues per-document model calls with bounded concurrency, defaulting to a maximum of 4 concurrent calls. This default is deliberately conservative: it keeps a laptop responsive and respects provider rate limits when a cloud LLM is configured.

| Stage | Parameter | Default |
| --- | --- | --- |
| Enrich | max concurrency | 4 |

Tune concurrency through the adapter’s indx.toml sub-table when a backend can handle more throughput. For the broader picture of how each stage parallelizes, see the performance guide.
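Bounded per-document concurrency can be sketched with a fixed-size thread pool. This is one plausible way to cap in-flight calls at 4, not necessarily how Enrich is implemented; `enrich_all` and `ConcurrencyProbe` are hypothetical names, and the probe merely simulates a model call so the cap can be observed.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

MAX_CONCURRENCY = 4  # the documented default


def enrich_all(doc_ids: Iterable[str], enrich_one: Callable[[str], str],
               max_concurrency: int = MAX_CONCURRENCY) -> List[str]:
    """Run enrich_one per document with at most max_concurrency in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(enrich_one, doc_ids))


class ConcurrencyProbe:
    """Fake model call that records the peak number of simultaneous calls."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __call__(self, doc_id: str) -> str:
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.01)  # stand-in for network latency
        with self._lock:
            self.active -= 1
        return f"summary of {doc_id}"


probe = ConcurrencyProbe()
summaries = enrich_all([f"doc_{i}" for i in range(12)], probe)
```

Threads suit this workload because the calls are I/O-bound; the pool size is the only knob, so raising it for a higher-throughput backend is a one-line change.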

Enrichment is built to be reproducible:

  • temperature=0.0 by default. The LLM.complete signature defaults the temperature to zero so output is as stable as the provider allows.
  • Recorded provenance. Because some providers are not bit-for-bit reproducible, the resolved model name and version are written into index.json.metadata and the .indx archive manifest for auditability — a loaded archive records exactly which model produced its metadata.

Re-running with identical inputs, config, and component versions yields a byte-identical index.json (modulo the created_at timestamp). See reproducibility for the full determinism contract.
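One way to check that claim between two runs is to compare the two index.json files with the timestamp removed. The sketch below assumes `created_at` may appear as a key anywhere in the JSON tree, which is a guess about the file layout; `strip_volatile` and `same_index` are hypothetical helpers.

```python
import json
from typing import Any, Tuple


def strip_volatile(value: Any, volatile: Tuple[str, ...] = ("created_at",)) -> Any:
    """Recursively drop volatile keys from parsed JSON."""
    if isinstance(value, dict):
        return {k: strip_volatile(v, volatile)
                for k, v in value.items() if k not in volatile}
    if isinstance(value, list):
        return [strip_volatile(v, volatile) for v in value]
    return value


def same_index(a_json: str, b_json: str) -> bool:
    """Compare two index.json payloads, ignoring created_at."""
    return strip_volatile(json.loads(a_json)) == strip_volatile(json.loads(b_json))


run_a = '{"created_at": "2025-01-01T00:00:00Z", "documents": [{"id": "doc_0007", "tags": ["gdpr"]}]}'
run_b = '{"created_at": "2025-01-02T09:30:00Z", "documents": [{"id": "doc_0007", "tags": ["gdpr"]}]}'
run_c = '{"created_at": "2025-01-02T09:30:00Z", "documents": [{"id": "doc_0007", "tags": ["data"]}]}'
```

This compares parsed content rather than raw bytes, so it is a weaker check than the byte-identical guarantee stated above; it is meant as a quick diagnostic, not a replacement for the determinism contract.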

The default LLM is cloud-backed, so Enrich can egress unless you switch to ollama:qwen2.5 or none. When the local profile is used, enrichment stays on your machine; when a cloud LLM or VLM is configured, treat this stage as the egress boundary.

If you do point Enrich at a cloud provider, two patterns keep you in control:

Enrich is fully optional. If no LLM is available or wanted, drop the stage and the pipeline produces a complete knowledge space with the document graph, chunks, relations, and embeddings — just without AI-derived metadata.

from indx import DirectoryPipeline

pipeline = (
    DirectoryPipeline(embedder="bge-m3", store="qdrant")
    .drop("enrich")  # no LLM calls at all
)
space = pipeline.run("./docs", "./ai-ready")

On the CLI, set --llm none to disable text enrichment while keeping the stage in place. Per-item failures inside Enrich (for example, an LLM call that times out for a single document) are recorded as a skip error on ctx.errors and the pipeline continues; the --strict flag promotes those skips to fatal.

Like Parse, Enrich defaults to per-item resilience. A model call that fails for one document appends a StageError(kind="skip") to the context and processing continues, so one bad document never aborts the build. Those non-fatal errors surface on the result under space.metadata["errors"]. Misconfiguration — an unknown LLM name, for instance — is fatal and aborts before any stage runs. See errors and exit codes.
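The skip-versus-strict behavior described above can be sketched as a loop that records failures instead of raising. Only `StageError(kind="skip")` comes from this page; the other `StageError` fields, `StrictModeError`, and `enrich_documents` are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class StageError:
    # Only `kind` is documented; the remaining fields are assumptions.
    stage: str
    item: str
    message: str
    kind: str = "skip"


class StrictModeError(RuntimeError):
    """Hypothetical error raised when strict mode promotes a skip to fatal."""


def enrich_documents(
    doc_ids: Iterable[str],
    enrich_one: Callable[[str], str],
    *,
    strict: bool = False,
) -> Tuple[Dict[str, str], List[StageError]]:
    """Enrich each document; record per-item failures unless strict is set."""
    results: Dict[str, str] = {}
    errors: List[StageError] = []
    for doc_id in doc_ids:
        try:
            results[doc_id] = enrich_one(doc_id)
        except Exception as exc:
            if strict:
                raise StrictModeError(f"enrich failed for {doc_id}") from exc
            errors.append(StageError(stage="enrich", item=doc_id, message=str(exc)))
    return results, errors


def flaky_model(doc_id: str) -> str:
    # Simulated model call that times out for one document.
    if doc_id == "doc_0002":
        raise TimeoutError("LLM call timed out")
    return f"summary of {doc_id}"


results, errors = enrich_documents(["doc_0001", "doc_0002", "doc_0003"], flaky_model)
```

With `strict=False` the failing document is skipped and the other two are enriched; passing `strict=True` would instead stop at the first failure, mirroring what the --strict flag is described as doing.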

With metadata attached, the context flows into the final stage, which vectorizes every chunk, writes it to the store, and seals the archive.

06 · Embed + Pack