Authoring a Plugin

A plugin is an ordinary Python distribution that adds a new backend — a parser, LLM, VLM, embedder, store, output writer, or even a whole pipeline stage — and advertises it through a Python entry point. Once a user runs pip install your-package, indx discovers the backend lazily and it becomes usable by name, exactly like a built-in, with no fork of indx and no edits to the core.

This guide shows the end-to-end recipe using a worked example, indx-weaviate, which adds a Weaviate vector store.

indx maintains a per-slot registry that maps a short name (the string you write in config) to a class. First-party built-ins are registered lazily inside indx; third-party backends are found at runtime through importlib.metadata entry points.

The first time a slot is resolved, the registry scans the matching entry-point group from the table below and merges anything it finds into that slot’s registry:

| Entry-point group | Slot | Protocol it must satisfy |
| --- | --- | --- |
| indx.parsers | parser | Parser |
| indx.llms | llm | LLM |
| indx.vlms | vlm | VLM |
| indx.embedders | embedder | Embedder |
| indx.stores | store | Store |
| indx.outputs | output | OutputWriter |
| indx.stages | pipeline | Stage |

The protocol signatures are defined in the protocols reference; the registry and the full list of built-in names live in registry and defaults.

A plugin uses the standard src-layout and contains just the adapter module plus a pyproject.toml:

indx-weaviate/
├── pyproject.toml
└── src/
    └── indx_weaviate/
        ├── __init__.py
        └── store.py    # class WeaviateStore (satisfies the Store protocol)

The adapter implements the relevant protocol structurally — no base class to subclass, you just match the method signatures. For a store, that means upsert, query, and persist:

src/indx_weaviate/store.py
from __future__ import annotations

from urllib.parse import urlparse

from indx.core.errors import MissingDependencyError


class WeaviateStore:
    """Vector store backed by Weaviate. Satisfies the Store protocol."""

    def __init__(self, url: str = "http://localhost:8080") -> None:
        # Keep construction cheap; defer the real client to first use.
        self.url = url
        self._client = None

    def _connect(self) -> None:
        try:
            import weaviate  # lazy: only when actually used
        except ModuleNotFoundError as exc:
            raise MissingDependencyError(
                "The Weaviate store requires its dependency. "
                "Install it with: pip install indx-weaviate[weaviate]"
            ) from exc
        # connect_to_local() takes a host and a port rather than a URL,
        # so split the configured URL before connecting.
        parsed = urlparse(self.url)
        self._client = weaviate.connect_to_local(
            host=parsed.hostname or "localhost",
            port=parsed.port or 8080,
        )

    def upsert(
        self,
        ids: list[str],
        vectors: list[list[float]],
        payloads: list[dict],
    ) -> None:
        if self._client is None:
            self._connect()
        # convert core ids/vectors/payloads -> Weaviate objects HERE, at the edge
        ...

    def query(
        self,
        vector: list[float],
        k: int = 5,
        filter: dict | None = None,
    ) -> list[tuple[str, float]]:
        if self._client is None:
            self._connect()
        ...

    def persist(self, dest: str) -> None:
        """Flush/export vectors into the output `embeddings/` layout so the
        sealed `.indx` archive stays portable regardless of backend."""
        ...

What makes your backend discoverable is a single entry in a [project.entry-points."<group>"] table in pyproject.toml. The left-hand side is the registry name users will type; the right-hand side is the module:Class target.

indx-weaviate/pyproject.toml
[project]
name = "indx-weaviate"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = ["indx"]

[project.optional-dependencies]
# Declare the heavy/optional backend dependency as an extra named after the
# registry key, so users can `pip install indx-weaviate[weaviate]`.
weaviate = ["weaviate-client>=4.0"]

[project.entry-points."indx.stores"]
weaviate = "indx_weaviate.store:WeaviateStore"

That weaviate = "indx_weaviate.store:WeaviateStore" line is the whole contract: it says “register the class WeaviateStore from indx_weaviate.store under the name weaviate in the store slot.”
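
To see how a name = "module:Class" entry turns into a class, the standard library's EntryPoint object can be built by hand and loaded. The sketch below is illustrative only: it points at a standard-library class (collections:OrderedDict) as a stand-in target so that it runs without indx-weaviate installed, and the registry name "ordered" is made up for the demonstration.

```python
from importlib.metadata import EntryPoint

# Same shape as the pyproject.toml line: <name> = "<module>:<attribute>".
# The target here is a stdlib stand-in, not a real indx backend.
ep = EntryPoint(
    name="ordered",
    value="collections:OrderedDict",
    group="indx.stores",
)

# The value is split at the colon into an importable module and an attribute.
module_path, attribute = ep.module, ep.attr

# load() imports the module and returns the named attribute (here, a class).
backend_class = ep.load()
```

With the real plugin installed, the equivalent entry would carry name "weaviate" and value "indx_weaviate.store:WeaviateStore", and load() would return WeaviateStore.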

A package may register several backends, in one slot or across several: just add more lines or more [project.entry-points."..."] tables (e.g. an indx.parsers table alongside indx.stores).

After pip install indx-weaviate, the name weaviate is available everywhere a built-in store name is — with no change to indx itself. In configuration:

indx.toml
[store]
backend = "weaviate"

Or directly in the SDK:

from indx import DirectoryPipeline
pipeline = DirectoryPipeline(store="weaviate")
space = pipeline.run("./docs", "./ai-ready")

Or on the CLI:

indx ./docs --out ./ai-ready --store weaviate

The registry validates that WeaviateStore actually satisfies the Store protocol when it resolves the name. If it does not, resolution fails loudly with an actionable error rather than failing deep inside a stage.
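
A resolution-time check of this kind might look like the sketch below. It is an assumption about the general approach, not indx's implementation: validate_store and ProtocolMismatchError are hypothetical names, and the required method list is taken from the Store methods named in this guide (upsert, query, persist).

```python
# Methods this guide names as the Store protocol surface.
REQUIRED_STORE_METHODS = ("upsert", "query", "persist")


class ProtocolMismatchError(TypeError):
    """Hypothetical error type; indx's real exception may differ."""


def validate_store(cls: type, name: str) -> type:
    """Return cls unchanged if it exposes every required method, else raise."""
    missing = [
        method
        for method in REQUIRED_STORE_METHODS
        if not callable(getattr(cls, method, None))
    ]
    if missing:
        raise ProtocolMismatchError(
            f"Backend {name!r} ({cls.__qualname__}) does not satisfy the Store "
            f"protocol: missing method(s) {', '.join(missing)}."
        )
    return cls


class CompleteStore:
    def upsert(self, ids, vectors, payloads): ...
    def query(self, vector, k=5, filter=None): ...
    def persist(self, dest): ...


class IncompleteStore:
    def upsert(self, ids, vectors, payloads): ...
```

Here validate_store(CompleteStore, "complete") returns the class, while validate_store(IncompleteStore, "incomplete") raises an error that names query and persist, which is the kind of actionable message the paragraph above describes.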

Every backend, whether built-in or third-party, follows the same fixed recipe. Plugins must additionally be safe to discover.

Match the method signatures in the protocols reference precisely. Structural typing means you do not inherit from anything — you satisfy the interface. The @runtime_checkable protocols are available if you want an isinstance check in your own tests.
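
A test along these lines can catch a missing method before you publish. The sketch defines a local stand-in for the Store protocol, copied from the method signatures used in this guide's WeaviateStore example, because the import path of the real protocol is not shown here; in your own test suite you would import indx's protocol instead. NullStore and NotAStore are made-up classes for the demonstration.

```python
from __future__ import annotations

from typing import Protocol, runtime_checkable


@runtime_checkable
class Store(Protocol):
    """Local stand-in for indx's Store protocol (assumption, see lead-in)."""

    def upsert(
        self,
        ids: list[str],
        vectors: list[list[float]],
        payloads: list[dict],
    ) -> None: ...

    def query(
        self,
        vector: list[float],
        k: int = 5,
        filter: dict | None = None,
    ) -> list[tuple[str, float]]: ...

    def persist(self, dest: str) -> None: ...


class NullStore:
    """No base class: it satisfies Store purely by having the right methods."""

    def upsert(self, ids, vectors, payloads) -> None:
        pass

    def query(self, vector, k=5, filter=None):
        return []

    def persist(self, dest) -> None:
        pass


class NotAStore:
    def upsert(self, ids, vectors, payloads) -> None:
        pass


def test_null_store_satisfies_protocol() -> None:
    assert isinstance(NullStore(), Store)
    assert not isinstance(NotAStore(), Store)
```

Note that isinstance against a runtime-checkable protocol only verifies that the methods exist. It does not compare parameter names or types, so a static type checker remains the better tool for confirming exact signatures.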

Accept an optional model/config keyword and read backend-specific options from the matching indx.toml sub-table. For a store named weaviate, indx passes the contents of [store.weaviate] verbatim to your constructor; those keys are opaque to the core.

[store]
backend = "weaviate"
[store.weaviate]
url = "http://localhost:8080" # passed straight to WeaviateStore(url=...)

Defer heavy or optional dependencies to construction or method call time, never to module import. Discovery imports your module to read the class; if that import triggers import weaviate at the top level, a user who installed your plugin but not its backend would see unrelated runs break. Import inside the method, and raise MissingDependencyError with a pip install hint (see the example above).
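
If several methods need the same optional dependency, the lazy import can be factored into one small helper. This is a sketch, not part of indx: require is a hypothetical function, and MissingDependencyError is redefined locally as a stand-in for indx.core.errors.MissingDependencyError so the snippet runs on its own.

```python
import importlib
from types import ModuleType


class MissingDependencyError(Exception):
    """Local stand-in for indx.core.errors.MissingDependencyError."""


def require(module_name: str, install_hint: str) -> ModuleType:
    """Import a module at call time, or fail with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise MissingDependencyError(
            f"Missing optional dependency {module_name!r}. "
            f"Install it with: {install_hint}"
        ) from exc


# Nothing heavy is imported when this module loads; the import happens
# only when a method that needs the dependency calls require().
def connect() -> ModuleType:
    return require("weaviate", "pip install indx-weaviate[weaviate]")
```

Importing a module written this way never touches weaviate, so discovery stays safe for users who installed the plugin without its extra.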

For a Store, implement persist() so vectors can be materialized into the standard embeddings/ layout. This is what lets a .indx archive remain portable and loadable regardless of which backend produced it — see the .indx archive format.

Rule 5: Custom stages return the same context

If you ship a Stage under indx.stages, its run(ctx) must return the same SpaceContext instance it received, mutated in place, and append per-item failures to ctx.errors rather than raising (unless the failure is genuinely fatal). See Custom Stage for the full pattern.
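
The contract can be illustrated with a toy stage. Everything about the context shape below is an assumption made for the example: this guide only names ctx.errors, so the SpaceContext stand-in, its items field, the string format of the error entries, and WordCountStage itself are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SpaceContext:
    """Stand-in context; only `errors` is named in this guide."""

    items: list[dict] = field(default_factory=list)  # hypothetical field
    errors: list[str] = field(default_factory=list)


class WordCountStage:
    """Toy stage: annotates each item with a word count."""

    def run(self, ctx: SpaceContext) -> SpaceContext:
        for item in ctx.items:
            try:
                item["word_count"] = len(item["text"].split())
            except (KeyError, AttributeError) as exc:
                # Per-item failure: record it and keep going instead of raising.
                ctx.errors.append(f"{item.get('id', '?')}: {exc!r}")
        # Return the very same object that was passed in, mutated in place.
        return ctx


ctx = SpaceContext(items=[{"id": "a", "text": "one two"}, {"id": "b"}])
result = WordCountStage().run(ctx)
```

After the run, result is the same object as ctx, the first item carries a word count, and the second item's failure is recorded in ctx.errors rather than aborting the stage.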

A copy-paste plugin template lives in the indx repository under examples/plugin-template/. It contains a minimal src-layout package, a ready-to-edit pyproject.toml with an entry-point table, and a skeleton adapter — clone it, rename the package and registry name, fill in the protocol methods, and publish.

# from a checkout of https://github.com/indx/indx
cp -r examples/plugin-template ../indx-mybackend

Plugins are insulated from churn by indx’s compatibility promise:

  • Protocols and the index.json / .indx schemas are versioned with indx’s major version. Within a major version, fields are only added, never removed or retyped.
  • Additive changes are safe. New RelationType members or metadata keys may appear in a minor release; well-behaved adapters ignore values they do not recognize rather than failing.
  • An adapter built against an earlier minor version keeps working against later minors of the same major version. A major version bump is the only place a protocol may change incompatibly.

Pin your dependency accordingly — for example dependencies = ["indx>=0.4,<1.0"] — and you can rely on your backend continuing to resolve and run across minor upgrades.

  • Custom Components — pass a backend as an in-process object without packaging it.
  • Custom Stage — insert your own work into the pipeline.
  • Registry and Defaults — every built-in name and how resolution precedence works.
  • Protocols — the exact method signatures your backend must satisfy.
  • Adding a Backend — contributing a backend to indx core instead of shipping a separate package.