CLI Reference
The indx command line exposes three commands (build, inspect, and query) over the same pipeline and data model as the SDK. This page documents every flag, the stdout shapes, and all exit codes.
Install with pip install indx (Python 3.11–3.13). For the programmatic equivalent of every command here, see the SDK reference.
Synopsis

    indx <dir> --out <dir> [--config indx.toml] [options]   # build a knowledge space
    indx inspect <archive.indx> [options]                   # summarize an archive
    indx query <archive.indx> "<text>" [options]            # semantic search

indx <dir> (build)
Process a directory (or a .zip) through the six-stage pipeline and write an AI-ready knowledge space to --out. The output directory receives the portable handbook.indx archive plus the expanded index.json, chunks/, and embeddings/ layout.
Build is the implicit default subcommand — there is no indx build keyword. Passing a directory (or .zip) as the first positional triggers a build; inspect and query are the only named subcommands.
| Flag | Type | Default | Description |
|---|---|---|---|
| <dir> (positional) | path | (required) | Directory or .zip to process. |
| --out, -o | path | (required) | Output directory; receives handbook.indx, index.json, chunks/, embeddings/. |
| --config, -c | path | ./indx.toml if present | Configuration file. See the configuration reference. |
| --parser | str | docling | Override the parser engine. |
| --llm | str | openai:gpt-5-mini | Override the enrichment LLM (none to disable, ollama:qwen2.5 for local). |
| --vlm | str | none | Override the vision model. |
| --embedder | str | openai:text-embedding-3-small | Override the embedder (bge-m3 for local). |
| --store | str | qdrant | Override the vector store backend. |
| --format | str | .indx | Output writer: .indx, jsonl, langchain, or llamaindex. |
| --name | str | handbook | Archive base name (produces handbook.indx). |
| --strict | flag | off | Promote per-item skips to fatal failures. |
| --resume | flag | off | Reuse cached stage outputs for unchanged files and config. |
| --jobs, -j | int | CPU count | Parallel workers for parse/embed. |
| --no-embed | flag | off | Skip stage 06 vectorization (produce a graph-only space). |
| --quiet / --verbose | flag | normal | Decrease / increase log verbosity. |
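When a build is launched from a script, it can help to assemble the command line from structured options rather than by string concatenation. The following is a minimal sketch using only the flags listed in the table above; `build_argv` and its parameter names are hypothetical helpers for illustration and are not part of indx.

```python
import os
import shlex


def build_argv(src, out, *, config=None, strict=False, resume=False,
               no_embed=False, jobs=None, overrides=None):
    """Assemble an argv list for an `indx <dir>` build (hypothetical helper)."""
    # Build has no keyword: the source directory is the first positional.
    argv = ["indx", os.fspath(src), "--out", os.fspath(out)]
    if config is not None:
        argv += ["--config", os.fspath(config)]
    # Component overrides, e.g. {"llm": "none", "embedder": "bge-m3"}.
    for flag in ("parser", "llm", "vlm", "embedder", "store", "format", "name"):
        if overrides and flag in overrides:
            argv += [f"--{flag}", str(overrides[flag])]
    if jobs is not None:
        argv += ["--jobs", str(jobs)]
    # Boolean flags are emitted only when switched on.
    for enabled, flag in ((strict, "--strict"), (resume, "--resume"),
                          (no_embed, "--no-embed")):
        if enabled:
            argv.append(flag)
    return argv


argv = build_argv("./docs", "./ai-ready", strict=True, jobs=4,
                  overrides={"llm": "none", "embedder": "bge-m3"})
print(shlex.join(argv))
```

The resulting list can be passed directly to `subprocess.run(argv)`, which avoids shell quoting problems with paths that contain spaces.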
Output
By default the build prints one progress line per stage, then a summary:

    indx ./docs → ./ai-ready
      01 walk    128 files, 14 folders
      02 parse   128 ok, 0 skipped
      03 chunk   1042 chunks
      04 relate  380 relations
      05 enrich  128 documents (openai:gpt-5-mini)
      06 embed   1042 vectors → qdrant, sealed handbook.indx
    done: 1042 chunks, 128 docs, embed_dim=1536 (12.4s)

--quiet suppresses the per-stage lines (the summary still prints); --verbose adds detail such as per-stage cache hits/misses when --resume is active.
SDK equivalent:

    from indx import DirectoryPipeline

    space = DirectoryPipeline(
        parser="docling",
        llm="openai:gpt-5-mini",
        embedder="openai:text-embedding-3-small",
        store="qdrant",
    ).run("./docs", "./ai-ready")

The --strict flag corresponds to strict=True in the SDK; --no-embed corresponds to dropping the embed-pack stage (pipeline.drop("embed-pack")). Full details are in the SDK reference.
indx inspect <archive.indx>
Summarize a sealed .indx archive without re-running the pipeline. By default it prints space stats, a document-type histogram, and a sample of relations.
| Flag | Type | Default | Description |
|---|---|---|---|
| <archive.indx> (positional) | path | (required) | The .indx archive to inspect. |
| --json | flag | off | Emit the full space.stats object as JSON instead of the human-readable summary. |
| --documents [type] | str (optional) | (unset) | List documents, optionally filtered by detected type. |
The --json output mirrors the SpaceStats model — documents, chunks, relations, embeddings, embed_dim, the per-type types histogram, and bytes_source. See data models for field meanings.
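For scripted checks, the `--json` output can be loaded into a small typed record. The sketch below assumes the payload is a flat JSON object keyed by the field names listed above; the exact serialization is not confirmed here. `InspectStats` and `parse_stats` are hypothetical names, and the sample values for `embeddings` and `bytes_source` are made up for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class InspectStats:
    """Local mirror of the documented stats fields (hypothetical class)."""
    documents: int
    chunks: int
    relations: int
    embeddings: int
    embed_dim: int
    types: dict
    bytes_source: int


def parse_stats(payload: str) -> InspectStats:
    """Parse `indx inspect --json` output, ignoring any unknown keys."""
    data = json.loads(payload)
    return InspectStats(**{f.name: data[f.name] for f in fields(InspectStats)})


# Illustrative payload; the shape and some values are assumptions.
sample = json.dumps({
    "documents": 128, "chunks": 1042, "relations": 380,
    "embeddings": 1042, "embed_dim": 1536,
    "types": {"policy": 41, "guide": 33, "reference": 29, "faq": 25},
    "bytes_source": 5242880,
})

stats = parse_stats(sample)
# Sanity check: the per-type histogram should account for every document.
histogram_ok = sum(stats.types.values()) == stats.documents
print(stats.documents, stats.embed_dim, histogram_ok)
```

In practice the payload would come from the standard output of `indx inspect <archive.indx> --json` rather than from an inline string.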
Output
By default inspect prints the space stats, a document-type histogram, and a sample of relations:

    handbook.indx (indx 1.0, produced by indx 0.4.2)
      documents  128
      chunks     1042
      relations  380
      embed_dim  1536
      types
        policy     41
        guide      33
        reference  29
        faq        25
      relations (sample)
        chunk:0a1f → chunk:9c3e    follows
        chunk:7b22 → chunk:1d80    references
        doc:contracts → doc:terms  cross-references

With --documents [type] each row lists the document id, detected type, source path, and chunk count; passing a type filters the listing to that detected type.
SDK equivalent: inspect reads the KnowledgeSpace you get from KnowledgeSpace.load("./ai-ready/handbook.indx") — space.stats for the summary and space.documents(type=...) for the document listing.
indx query <archive.indx> "<text>"
Run a semantic search against a sealed archive and return the most similar chunks. The query text is embedded with the same embedder pinned in the archive manifest, guaranteeing query-time compatibility.
| Flag | Type | Default | Description |
|---|---|---|---|
| <archive.indx> (positional) | path | (required) | The .indx archive to search. |
| "<text>" (positional) | str | (required) | The query string. |
| -k | int | 5 | Number of hits to return. |
| --type | str | (unset) | Restrict results to a single document type. |
| --json | flag | off | Emit the results as a SearchHit[] JSON array (including .chunk, .neighbors, and .source). |
Output
Default output is human-readable: for each hit, the rank, similarity score, source path, and chunk text along with its neighbor chunk ids (the context window). With --json, each element is a serialized SearchHit carrying the matched chunk, its score, and resolved neighbor chunks.

    indx query ./ai-ready/handbook.indx "how long is data retained?" -k 3 --type policy

SDK equivalent:

    from indx import KnowledgeSpace

    space = KnowledgeSpace.load("./ai-ready/handbook.indx")
    for hit in space.search("how long is data retained?", k=3):
        print(hit.score, hit.source.path)
        print(hit.chunk.text)

-k maps to the k argument of space.search(query, k=...). See the SDK reference.
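The `--json` results can also be post-processed without the SDK. The sketch below assumes each array element is an object with `score`, `chunk`, `neighbors`, and `source` keys, where `chunk.text` and `source.path` mirror the SDK attributes shown above. The nested `id` key, the sample data, and the `summarize_hits` helper are assumptions for illustration only.

```python
import json

# Illustrative payload; the real serialization may differ.
sample = json.dumps([
    {
        "score": 0.83,
        "chunk": {"id": "chunk:0a1f", "text": "Data is retained for 90 days."},
        "neighbors": [{"id": "chunk:9c3e", "text": "Backups follow the same schedule."}],
        "source": {"path": "policies/retention.md"},
    },
    {
        "score": 0.41,
        "chunk": {"id": "chunk:7b22", "text": "Contact support to request deletion."},
        "neighbors": [],
        "source": {"path": "faq/privacy.md"},
    },
])


def summarize_hits(payload, min_score=0.0):
    """Return (score, source path, text) tuples for hits at or above min_score."""
    rows = []
    for hit in json.loads(payload):
        if hit["score"] < min_score:
            continue  # drop weak matches
        rows.append((hit["score"], hit["source"]["path"], hit["chunk"]["text"]))
    return rows


rows = summarize_hits(sample, min_score=0.5)
for score, path, text in rows:
    print(f"{score:.2f}  {path}  {text}")
```

Filtering on a score threshold after the fact is a client-side choice in this sketch; the CLI itself only documents `-k` and `--type` for narrowing results.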
Exit codes
Every command returns one of these process exit codes:
| Code | Meaning |
|---|---|
| 0 | Success. |
| 1 | Fatal pipeline/runtime error (including a --strict skip promoted to fatal). |
| 2 | Usage error (bad flags or arguments). |
| 3 | Configuration error (invalid indx.toml or an unknown component name). |
| 4 | Archive error (missing, corrupt, or incompatible .indx). |
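A wrapper script can turn these codes into named outcomes. The following is a minimal sketch built on the table above; the `IndxExit` enum and the `describe_exit` and `run_indx` helpers are hypothetical names, not part of indx.

```python
import enum
import subprocess


class IndxExit(enum.IntEnum):
    """Exit codes from the table above (names are this sketch's own)."""
    SUCCESS = 0
    FATAL = 1
    USAGE = 2
    CONFIG = 3
    ARCHIVE = 4


def describe_exit(code: int) -> str:
    """Map a process return code to a name, tolerating undocumented values."""
    try:
        return IndxExit(code).name
    except ValueError:
        return "UNKNOWN"


def run_indx(argv):
    """Run an indx command line and return (exit name, stdout)."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    return describe_exit(proc.returncode), proc.stdout


print(describe_exit(0), describe_exit(4), describe_exit(99))
```

With this in place, a caller could, for example, treat `CONFIG` and `USAGE` as errors to fix before retrying, and report `ARCHIVE` as a problem with the input file rather than with the invocation.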
See also
- Configuration reference: every indx.toml key and the precedence rules CLI flags participate in.
- SDK reference: the programmatic counterpart of every command above.
- Inspect and query guide: task-oriented walkthrough of the read-side commands.
- Output formats: what --format produces.