FAQ

Short answers to the questions people ask most about indx. The throughline: indx turns a directory into an AI-ready knowledge space by composing the tools you already use — it does not try to replace them. For deeper dives, follow the links into the rest of the docs.

indx is an open-source Python CLI and SDK that turns a directory of documents into a portable, queryable knowledge space — structure, relationships, and semantic metadata that agents and RAG systems can reason over — in one command. See What is indx and the concepts overview.

indx is not a parser, and it does not try to build a better PDF or DOCX extractor. It composes parsers (Docling by default, or Unstructured, LlamaParse, MarkItDown, or your own) behind a single Parser slot, then adds the directory-level structure that file parsers throw away: folder lineage, file-to-file relationships, document types, topics, summaries, and a chunk graph.

indx is not a vector database. It writes to one through the Store slot (Qdrant by default, or pgvector, Chroma, LanceDB, or a zero-dependency jsonl store), and it does not store or serve vectors at runtime. Pick the right backend in Choosing a store.

Is indx a hosted retrieval or agent runtime?

No. indx builds the knowledge space; serving, retrieval at scale, and agent orchestration are downstream concerns. The .indx archive it produces is the handoff point — load it into your own runtime, or feed its outputs to LangChain / LlamaIndex. indx is also not an OCR/embedding model vendor (it calls models through adapters, it does not train them) and not a general ETL framework.

Does it work offline? Does it send my data anywhere?

Yes — there are two distinct offline configurations, and both run with no network egress:

  • Local profile (opt-in) — a local database: parser=docling, llm=ollama:qwen2.5, vlm=none, embedder=bge-m3 (dim 1024), store=qdrant running in its embedded local mode, output=.indx. This is the recommended offline stack when you want ANN performance; opt in with indx[local].
  • Fully air-gapped, no-DB: no database at all. Swap the store for jsonl (--store jsonl), which inlines vectors into the archive so nothing needs to be installed or served. Ideal for truly air-gapped hosts or small corpora.

The general zero-config defaults use cloud-backed OpenAI model components, so choose the local profile (or the no-DB variant) when no data may leave your machine. There is no telemetry by default; any telemetry would be strictly opt-in.

With the local profile or the no-database variant, data leaves your network only if you explicitly name a cloud component (e.g. --llm openai, a LlamaParse parser, or a Cohere embedder). For both recipes (the embedded-DB local profile and the no-database run), see Local & air-gapped and, for the store trade-off, Choosing a store.

indx is free and open-source under Apache-2.0 — fine for commercial and on-prem use. Install with:

pip install indx

The repository lives at github.com/indx/indx. Install details and optional extras are in Installation.

indx requires Python 3.11 or later and is tested on 3.11, 3.12, and 3.13. The 3.11 floor lets indx rely on stdlib tomllib, modern typing primitives, and ExceptionGroup/except* without backports. 3.10 and earlier are not supported.

What’s the difference between the .indx archive and the expanded output?

A build (indx ./docs --out ./ai-ready) writes both forms side by side:

Form | What it is | When to use it
handbook.indx | A single self-contained ZIP archive (manifest, index.json, chunks/, embeddings/) | Ship, diff, move between machines, or load() without re-processing
Expanded ./ai-ready/ | The same data unpacked: index.json, chunks/, embeddings/ | Inspect or stream the files directly with ordinary tooling

They carry the same knowledge graph — the archive is the portable, sealed version. See the .indx archive reference and the index.json reference.
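
Because the archive is described as a ZIP, the standard library can peek inside one without indx installed. The following is a minimal sketch, not indx's own loader: the member names mirror the table above, manifest.json is only a guess at the manifest's filename, and the toy contents are made up so the example runs on its own.

```python
import io
import json
import zipfile


def top_level_entries(archive) -> list[str]:
    """Return the sorted top-level names inside a ZIP-based archive."""
    with zipfile.ZipFile(archive) as zf:
        return sorted({name.split("/", 1)[0] for name in zf.namelist()})


# Build a toy archive in memory so the sketch runs without a real build.
# Every member name and payload below is an illustrative assumption.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("manifest.json", json.dumps({"embedder": "bge-m3", "dim": 1024}))
    zf.writestr("index.json", json.dumps({"documents": []}))
    zf.writestr("chunks/0001.json", json.dumps({"text": "hello"}))
    zf.writestr("embeddings/0001.json", json.dumps([0.1, 0.2]))

buf.seek(0)
entries = top_level_entries(buf)
print(entries)
```

Pointing top_level_entries at a real path such as handbook.indx should work the same way, provided the archive really is a plain ZIP as the table states.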

indx runs without an LLM, and there are two equivalent ways to turn enrichment off. From the CLI, pass --llm none. From the SDK, drop the enrich stage entirely:

from indx import DirectoryPipeline

space = (
    DirectoryPipeline(embedder="bge-m3", store="jsonl")
    .drop("enrich")  # no topics/tags/summaries; everything else still runs
    .run("./docs", "./out")
)

You still get walk, parse, chunk, relate, and embed+pack. More on enrichment in Enrichment with LLM & VLM.

To run without a database, use the jsonl store (--store jsonl), which performs a brute-force scan and inlines vectors so the .indx archive is fully self-contained: no database installed, no server running. It is ideal for air-gapped hosts or small corpora; for non-trivial spaces the default Qdrant ANN index is faster. Details in Choosing a store and Local & air-gapped.
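
To make "brute-force scan" concrete, here is a minimal sketch of scoring every record in a JSONL stream against a query vector. It is an illustration of the general technique only: the record fields (id, vector), the cosine metric, and the function names are assumptions, not the jsonl store's actual format or code.

```python
import io
import json
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_search(lines, query, k=2):
    """Score every record against the query and return the top-k (id, score) pairs."""
    scored = []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the file
        rec = json.loads(line)
        scored.append((rec["id"], cosine(rec["vector"], query)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical records; a real store would read them from a file on disk.
records = io.StringIO(
    '{"id": "a", "vector": [1.0, 0.0]}\n'
    '{"id": "b", "vector": [0.0, 1.0]}\n'
    '{"id": "c", "vector": [0.7, 0.7]}\n'
)
hits = brute_force_search(records, query=[1.0, 0.1], k=2)
print(hits)
```

Every query touches every vector, so cost grows linearly with corpus size, which is why an ANN index pays off on larger spaces.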

How do I add my own parser, store, or embedder?

Every slot is a typed Protocol, so an implementation needs no base class and no fork. Two paths:

  • Bring-your-own (in code): pass any object satisfying the protocol to the constructor or .use(...). See Bring your own stack and Custom components.
  • Plugins (packaged): publish a distribution that advertises a Python entry point under groups like indx.parsers, indx.stores, indx.embedders; indx discovers it at runtime so backend = "myparser" just works. See Authoring a plugin.

You can also add a whole pipeline step — see Custom stage.

How is indx different from LangChain or LlamaIndex?

They are outputs and consumers, not competitors. indx focuses narrowly on turning a directory into a structured knowledge space; LangChain and LlamaIndex are application frameworks that consume that knowledge to build chains and agents. indx ships langchain and llamaindex output writers precisely so the space you build feeds straight into them. Because every component is swappable, indx also works as a neutral migration layer between stacks. See Output formats and Use cases.

indx is designed for 10k+ files. It streams and processes incrementally rather than loading the whole estate into memory, parses files in parallel, and batches embedding and store upserts (the single biggest performance lever). Concurrency and batch sizes are tunable per backend. See Performance.
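
The streaming-plus-batching idea can be sketched in a few lines. This is a generic illustration rather than indx's implementation: batched, fake_chunks, and the batch size of 4 are all hypothetical, and the loop body only records batch sizes where a real run would embed and upsert.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield lists of up to `size` items without materializing the whole input."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def fake_chunks(n: int) -> Iterator[str]:
    """Stand-in for a lazy stream of chunks coming out of earlier stages."""
    for i in range(n):
        yield f"chunk-{i}"


upsert_calls = []
for batch in batched(fake_chunks(10), size=4):
    # A real run would embed the batch and upsert it to the store here.
    upsert_calls.append(len(batch))

print(upsert_calls)
```

Only one batch is held in memory at a time, and the number of round trips to the embedder and store drops from one per chunk to one per batch.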

Given the same inputs, config, and component versions, runs are reproducible: chunk and document ids are assigned in a deterministic traversal order, enrichment defaults to temperature=0.0, and the resolved config plus the embedder’s name/dim are recorded in the manifest. Inherently non-deterministic LLM steps are flagged. See Reproducibility.
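
As an illustration of what a deterministic traversal order buys, the sketch below walks a directory in sorted order and numbers files as it goes. The doc-0000 id format and the assign_ids helper are assumptions made for the example; indx's actual id scheme is documented in Reproducibility.

```python
import os
import tempfile


def assign_ids(root: str) -> dict[str, str]:
    """Walk `root` in sorted order and number each file as it is visited."""
    ids: dict[str, str] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort fixes the order os.walk descends in
        for name in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")  # same keys on every platform
            ids[rel] = f"doc-{len(ids):04d}"
    return ids


with tempfile.TemporaryDirectory() as root:
    # Create files in a scrambled order to show that creation order is irrelevant.
    os.makedirs(os.path.join(root, "b"))
    os.makedirs(os.path.join(root, "a"))
    for rel in ("b/two.md", "a/one.md", "zero.md"):
        with open(os.path.join(root, *rel.split("/")), "w") as fh:
            fh.write("x")
    first = assign_ids(root)
    second = assign_ids(root)

print(first == second)
```

Without the two sorts, the order would depend on whatever the filesystem returns, and ids could shift between machines even for identical inputs.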