
Installation

indx ships as a deliberately tiny core plus a matrix of optional extras. You install the small base with pip install indx, then add only the backends your stack actually needs — a parser, an LLM, an embedder, a store. This page covers the requirements, the install model, every extra, the convenience bundles, and how indx tells you exactly which extra to add when one is missing.

  • Python 3.11+ (supported on 3.11, 3.12, and 3.13). 3.11 is the floor because indx relies on stdlib tomllib, modern typing primitives (Self, LiteralString), and ExceptionGroup / except* for fan-out error handling — all without backports.
  • A virtual environment is recommended (python -m venv .venv && . .venv/bin/activate).
  • The base install works on Linux, macOS, and Windows with no native build step.

Install the base package:

pip install indx

The core install is intentionally light. It pulls in only:

| Dependency | Role |
|---|---|
| Typer | Type-annotated CLI surface |
| Rich | Terminal rendering: progress, tables, tracebacks |
| Click | Underlying CLI argument parser that Typer builds on |
| Pydantic v2 | All boundary data models and validation |
| pydantic-settings | Layered config: defaults, indx.toml, environment, CLI |

TOML parsing uses the standard library’s tomllib, so there is no parsing dependency. No parser toolchain, no Torch, no vector-DB client, and no cloud SDK is installed by the base package.

indx composes parsers and models; it does not bundle them. Heavy local runtimes and cloud dependencies (Docling, Torch, Qdrant clients, OpenAI/Anthropic SDKs) are each an optional extra, never a core requirement. This keeps pip install indx small and fast, makes the cloud-backed default explicit, preserves a first-class local profile, and avoids vendor lock-in — swapping a backend is a config change, not a reinstall of the world. See the architecture overview and design principles for the reasoning.

Even with nothing but the core installed, a complete build runs offline, because the core ships zero-dependency fallbacks for the slots that would otherwise need a backend:

| Slot | Built-in fallback |
|---|---|
| Parser | plaintext |
| Store | jsonl (no database) |
| VLM | none (skips vision enrichment) |
| Output writer | .indx and jsonl |

That means indx ./docs --out ./ai-ready produces a usable, self-contained .indx archive with no extras, no database, and no network. The fallbacks trade quality and scale for portability — the plaintext parser only reads text, and the jsonl store does a brute-force linear scan suited to small corpora.
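
To show why a brute-force linear scan only suits small corpora, here is a minimal sketch of one over JSONL records. The record schema (`id`, `vector`) and the function names are assumptions made for the example; the actual on-disk layout of the jsonl store is not described on this page.

```python
import json
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def linear_scan(jsonl_text: str, query: list[float], k: int = 2) -> list[tuple[float, str]]:
    """Score every record against the query and keep the top k."""
    scored = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)  # assumed schema: {"id": ..., "vector": [...]}
        scored.append((cosine(query, record["vector"]), record["id"]))
    return sorted(scored, reverse=True)[:k]


records = "\n".join(
    json.dumps(r)
    for r in [
        {"id": "a", "vector": [1.0, 0.0]},
        {"id": "b", "vector": [0.0, 1.0]},
        {"id": "c", "vector": [1.0, 1.0]},
    ]
)
top = linear_scan(records, query=[1.0, 0.0])
print([doc_id for _, doc_id in top])
```

Every query touches every record, so cost grows linearly with corpus size. A dedicated vector store avoids that by using an index.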

The recommended cloud-backed stack uses Docling for parsing, OpenAI for text enrichment and embeddings, and Qdrant for storage:

pip install "indx[docling,openai,qdrant]"

Set OPENAI_API_KEY before running:

export OPENAI_API_KEY="..."

Each extra enables one or more implementations and pulls only that backend’s dependencies. Pick per slot.

| Install | Enables | Slot |
|---|---|---|
| pip install "indx[docling]" | Docling parser (the default; high-fidelity layout, local) | parser |
| pip install "indx[markitdown]" | MarkItDown parser (lightest local option) | parser |

See choosing a parser for the trade-offs.

| Install | Enables | Slot |
|---|---|---|
| pip install "indx[openai]" | OpenAI LLM (default gpt-5-mini), embedder, and GPT-4o VLM | llm / vlm / embed |
| pip install "indx[ollama]" | Ollama LLM (qwen2.5 in the local profile) | llm |
| pip install "indx[anthropic]" | Anthropic LLM | llm |

VLM defaults to none; enable a vision model only when you want figure and image descriptions. See enrichment with LLMs and VLMs.

| Install | Enables | Slot |
|---|---|---|
| pip install "indx[openai]" | OpenAI embedder (default text-embedding-3-small, API, light) | embed |
| pip install "indx[bge]" | BGE-M3 embedder (local profile, dim 1024; pulls Torch) | embed |

Local embedding is the heaviest optional path because it pulls Torch and model weights. See choosing an embedder.

| Install | Enables | Slot |
|---|---|---|
| pip install "indx[qdrant]" | Qdrant store (the default; embedded or server) | store |
| pip install "indx[pgvector]" | pgvector store (Postgres) | store |
| pip install "indx[chroma]" | Chroma store | store |
| pip install "indx[lancedb]" | LanceDB store (file-based, columnar) | store |

The built-in jsonl store needs no extra at all. See choosing a store.

| Install | Enables | Slot |
|---|---|---|
| pip install "indx[langchain]" | LangChain Document writer | output |
| pip install "indx[llamaindex]" | LlamaIndex node writer | output |

The .indx and jsonl writers ship in core. See output formats.

| Install | What you get |
|---|---|
| pip install "indx[local]" / pip install "indx[defaults]" | The recommended local / air-gapped stack: Docling + Ollama + BGE-M3 + Qdrant |
| pip install "indx[all]" | Every extra above, unioned |

For the complete, authoritative list of extras and the exact packages each pulls, see the extras reference.

Extras compose — request several at once with commas:

pip install "indx[docling,bge,qdrant,openai]"

This installs the Docling parser, the local BGE-M3 embedder, the Qdrant store, and the OpenAI SDK (for an OpenAI LLM/embedder) — a typical hybrid local-parse / cloud-LLM setup.
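
After installing a combination of extras, it can help to confirm which backends are actually importable in the active environment. The sketch below does that with `importlib.util.find_spec`, which looks a module up without importing it. The mapping from extra name to top-level module is an assumption for illustration; the extras reference remains the authoritative list of what each extra pulls.

```python
from importlib.util import find_spec

# Assumed mapping from indx extra name to the top-level module it provides.
EXTRA_MODULES = {
    "docling": "docling",
    "openai": "openai",
    "qdrant": "qdrant_client",
}


def available_extras(mapping: dict[str, str]) -> dict[str, bool]:
    """Report which backends are importable, without importing them."""
    return {extra: find_spec(module) is not None for extra, module in mapping.items()}


for extra, ok in available_extras(EXTRA_MODULES).items():
    print(f"{extra:10} {'installed' if ok else 'missing'}")
```

Because nothing is imported, the check stays fast even when a heavy backend such as Torch is present.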

Because backends are optional, indx is designed so that a missing dependency produces a precise, actionable message — and only when that slot is actually selected.

  • If you select a backend whose extra is not installed (for example store = "qdrant" without indx[qdrant]), the run fails fast with an error that names the exact command to fix it:

    MissingDependencyError: store 'qdrant' requires: pip install indx[qdrant]
  • The error is raised lazily. The registry resolves a backend only when its slot is chosen, so an installed-but-unused plugin — or simply not having an extra you do not use — never breaks an unrelated run. A plaintext + jsonl core-only build is never affected by missing parser or store extras.

This behaviour (surfaced through MissingDependencyError) is the contract that lets the core stay light without trapping you later. See errors and exit codes for the full hierarchy.
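
The lazy-resolution idea described above can be sketched in a few lines. This is not indx's registry: the `REGISTRY` layout, the `resolve` function, and the choice of `ImportError` as a base class are assumptions. Only the error name and the message format come from this page.

```python
from importlib import import_module


class MissingDependencyError(ImportError):
    """Stand-in for the error class named above; the real base class may differ."""


# Hypothetical registry: slot -> backend name -> (module to import, extra to suggest).
REGISTRY = {
    "store": {
        "jsonl": ("json", None),  # built in, nothing extra to install
        "qdrant": ("qdrant_client", "qdrant"),
    },
}


def resolve(slot: str, name: str):
    """Import a backend only when its slot is actually selected."""
    module_name, extra = REGISTRY[slot][name]
    try:
        return import_module(module_name)
    except ModuleNotFoundError as exc:
        raise MissingDependencyError(
            f"{slot} '{name}' requires: pip install indx[{extra}]"
        ) from exc


# Selecting the built-in store never touches the Qdrant import.
backend = resolve("store", "jsonl")
print(backend.__name__)
```

The import happens inside `resolve`, so a missing Qdrant client can only fail a run that asks for the Qdrant store.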

  1. Confirm the CLI is on your path and prints a version:

    indx --version
  2. Run a zero-config build on a small directory — this works even with a core-only install via the built-in fallbacks:

    indx ./docs --out ./ai-ready
  3. Inspect the resulting archive:

    indx inspect ./ai-ready/handbook.indx

If indx inspect prints a document tree and counts, your install is working. The CLI reference documents every command and flag.