Adding a Backend
Adding a new backend — a Parser, LLM, VLM, Embedder, Store, or OutputWriter — is the most common contribution to indx, and the recipe is deliberately fixed. Follow these six steps and your adapter drops straight into the registry, resolves by name, and never weighs down the dependency-light core.
The whole design rests on two ideas: backends satisfy a typed Protocol by structure (no subclassing), and heavy dependencies live behind optional extras that are imported lazily. This page is the contributor-facing checklist; for shipping an adapter as a separate PyPI package, see Authoring a Plugin.
The six-step recipe
Every backend follows the same path. The steps below are normative — the pull-request checklist enforces them.
| # | Step | Why it matters |
|---|---|---|
| 1 | Implement the Protocol exactly | Structural typing means any object that fits “drops in” |
| 2 | Convert at the edge | Vendor types never leak into core models |
| 3 | Lazy-import the heavy dependency | The bare pip install indx stays light and usable in air-gapped environments |
| 4 | Declare the extra in pyproject.toml | Users get one clear install command |
| 5 | Register via entry point | No edits to core/; resolves by name |
| 6 | Write the adapter contract test | Proves the Protocol fit and the round-trip |
1. Implement the Protocol — exactly
Match the method signatures in core protocols exactly. indx uses typing.Protocol (structural typing), not abstract base classes, so you do not subclass anything — you just write a class whose methods satisfy the interface.
For a Store, that means implementing upsert, query, and persist:

    from __future__ import annotations


    class Qdrant:  # satisfies the Store protocol structurally
        """Vector store backed by Qdrant. Registry key: 'qdrant'."""

        def __init__(self, url: str = "http://localhost:6333") -> None:
            self.url = url

        def upsert(
            self,
            ids: list[str],
            vectors: list[list[float]],
            payloads: list[dict],
        ) -> None: ...

        def query(
            self,
            vector: list[float],
            k: int = 5,
            filter: dict | None = None,
        ) -> list[tuple[str, float]]: ...

        def persist(self, dest: str) -> None: ...

2. Convert at the edge — never leak vendor types
Your adapter accepts and returns only core domain types (ParsedDoc, Chunk, vectors as list[float], and so on). The vendor SDK exists only inside your adapter module. A Document must never store a qdrant_client.PointStruct, and a Protocol method must never return a raw provider response.
This is the rule that keeps the dependency graph a DAG pointing inward at core/: convert core types into vendor types on the way in, and vendor results back into core types on the way out, right there at the adapter boundary.

    # ✅ vendor type built here, at the edge, and discarded here
    def upsert(self, ids, vectors, payloads):
        from qdrant_client.models import PointStruct  # vendor type, local to this method

        points = [
            PointStruct(id=i, vector=v, payload=p)
            for i, v, p in zip(ids, vectors, payloads)
        ]
        self._client.upsert(collection_name="indx", points=points)

3. Lazy-import the heavy dependency
The bare install must work with no network and no GPU. So the vendor SDK is imported inside the method that needs it — never at module top level — and a missing dependency raises MissingDependencyError carrying the exact pip install indx[<extra>] hint. The extra is named after the registry key.

    from indx.core.errors import MissingDependencyError


    class Qdrant:
        def __init__(self, url: str = "http://localhost:6333") -> None:
            self.url = url

        def connect(self) -> None:
            try:
                from qdrant_client import QdrantClient  # lazy import
            except ModuleNotFoundError as exc:
                raise MissingDependencyError(
                    "The Qdrant store requires the 'qdrant' extra. "
                    "Install it with: pip install indx[qdrant]"
                ) from exc
            self._client = QdrantClient(url=self.url)

Because the import is deferred to runtime, plugin discovery never fails just because a backend’s package is absent — the error only surfaces when that slot is actually selected.
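The same pattern can be tried without indx or Qdrant installed. In the sketch below, MissingDependencyError is a local stand-in for indx.core.errors.MissingDependencyError (its base class is an assumption), and require and LazyBackend are hypothetical names used only to show that construction succeeds while the error appears at first use.

```python
import importlib
from types import ModuleType


class MissingDependencyError(ImportError):
    """Stand-in for indx.core.errors.MissingDependencyError (base class assumed)."""


def require(module: str, extra: str) -> ModuleType:
    """Import `module` on demand, or fail with an actionable install hint."""
    try:
        return importlib.import_module(module)
    except ModuleNotFoundError as exc:
        raise MissingDependencyError(
            f"This backend requires the '{extra}' extra. "
            f"Install it with: pip install indx[{extra}]"
        ) from exc


class LazyBackend:
    """Hypothetical adapter: constructing it never touches the heavy dependency."""

    def __init__(self, module: str, extra: str) -> None:
        self._module, self._extra = module, extra

    def connect(self) -> ModuleType:
        return require(self._module, self._extra)  # the import happens only here


backend = LazyBackend("a_module_that_is_not_installed", "demo")  # succeeds
try:
    backend.connect()
except MissingDependencyError as err:
    hint = str(err)  # carries the pip install hint
```

Chaining with `from exc` keeps the original ModuleNotFoundError in the traceback, which helps when the missing module is a transitive dependency rather than the SDK itself.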
4. Declare the extra in pyproject.toml
Add an entry under [project.optional-dependencies], keyed by the registry key, listing the packages your adapter needs:

    [project.optional-dependencies]
    qdrant = ["qdrant-client>=1.7"]

Now pip install indx[qdrant] pulls exactly what the Qdrant store needs, and nothing more. See Extras for the full install matrix and the defaults / all bundles.
5. Register via entry point — don’t hard-wire it
indx resolves backends by name through per-slot registries. This recipe targets third-party plugins shipped as their own PyPI package — wire your class in with an entry point so it resolves by name, and never hard-code it into core/. (A first-party adapter living in this repository instead registers in registry/builtins.py; entry points are reserved for out-of-tree plugins. See Authoring a Plugin for the standalone-package path.)

    [project.entry-points."indx.stores"]
    qdrant = "indx.store.qdrant:Qdrant"  # registry key → "module:Class"

Each slot has its own entry-point group:
| Entry-point group | Slot | Protocol |
|---|---|---|
| indx.parsers | parser | Parser |
| indx.llms | llm | LLM |
| indx.vlms | vlm | VLM |
| indx.embedders | embedder | Embedder |
| indx.stores | store | Store |
| indx.outputs | output | OutputWriter |
| indx.stages | pipeline | Stage |
Once registered, the backend is usable by name anywhere a built-in is — in indx.toml, on the CLI, or in code:
In indx.toml:

    [store]
    backend = "qdrant"

In code:

    from indx import DirectoryPipeline

    DirectoryPipeline(store="qdrant")

See Registry and Defaults for resolution order and how first-party built-ins relate to discovered plugins.
6. Write the adapter contract test
Every Protocol implementation ships with a contract test that proves two things: the adapter satisfies the Protocol, and it round-trips core types. The test layout mirrors src/ — so src/indx/store/qdrant.py is tested by tests/store/test_qdrant.py (unit tests under tests/unit/).
Network and model calls are mocked: real provider calls never run in the default offline suite. Use a fake client or a recorded HTTP cassette, and seed any randomness so the test is deterministic.
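For the seeding advice, one option is to generate test vectors from a private random.Random instance rather than the global generator. The helper name seeded_vectors is hypothetical; this is a sketch of the idea, not an indx test utility.

```python
import random


def seeded_vectors(n: int, dim: int, seed: int = 0) -> list[list[float]]:
    """Return n pseudo-random vectors that are identical on every run."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]


first = seeded_vectors(3, 4)
second = seeded_vectors(3, 4)  # same seed, so the same vectors
```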

    from indx.store import Store
    from indx.store.qdrant import Qdrant


    def test_qdrant_satisfies_store_protocol():
        store = Qdrant(url="http://localhost:6333")
        assert isinstance(store, Store)  # @runtime_checkable protocol


    def test_qdrant_round_trips_core_types(fake_qdrant_client):
        store = Qdrant()
        store._client = fake_qdrant_client  # network mocked
        store.upsert(["chunk_0001"], [[0.1, 0.2]], [{"path": "a.md"}])
        hits = store.query([0.1, 0.2], k=1)
        assert hits == [("chunk_0001", 1.0)]  # returns core types, not vendor objects

For Store adapters specifically, also exercise persist() so the archive’s embeddings/ layout is materialized — this keeps sealed .indx archives portable regardless of which backend produced them. See Testing for the full contract-test conventions, fakes, and golden-file rules.
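The fake_qdrant_client fixture used in that test is not defined on this page. A minimal in-memory fake might look like the sketch below. It assumes the adapter only calls upsert(collection_name=..., points=...) and search(collection_name=..., query_vector=..., limit=...), as the Qdrant examples on this page do; FakeQdrantClient and _cosine are hypothetical names.

```python
import math
from types import SimpleNamespace


class FakeQdrantClient:
    """In-memory stand-in exposing only the two calls the adapter makes."""

    def __init__(self) -> None:
        self._points: dict[str, dict] = {}  # collection name -> {point id: point}

    def upsert(self, collection_name: str, points: list) -> None:
        for p in points:  # each point only needs .id, .vector and .payload
            self._points.setdefault(collection_name, {})[p.id] = p

    def search(self, collection_name: str, query_vector: list[float], limit: int) -> list:
        hits = [
            SimpleNamespace(id=p.id, score=_cosine(p.vector, query_vector))
            for p in self._points.get(collection_name, {}).values()
        ]
        return sorted(hits, key=lambda h: h.score, reverse=True)[:limit]


def _cosine(a: list[float], b: list[float]) -> float:
    if a == b:
        return 1.0  # exact match, avoids floating-point rounding in assertions
    norm = math.hypot(*a) * math.hypot(*b)
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0
```

To use it as the fixture, a conftest.py could define a function named fake_qdrant_client, decorated with @pytest.fixture, that returns FakeQdrantClient(). Returning exactly 1.0 for identical vectors is a deliberate shortcut so that equality assertions stay deterministic.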
Full Qdrant example
Putting the pieces together, here is the minimal shape of a backend — pyproject.toml declarations plus the adapter skeleton with its lazy import and edge conversion:
pyproject.toml:

    [project.optional-dependencies]
    qdrant = ["qdrant-client>=1.7"]

    [project.entry-points."indx.stores"]
    qdrant = "indx.store.qdrant:Qdrant"  # registry key → class

src/indx/store/qdrant.py:

    from __future__ import annotations

    from indx.core.errors import MissingDependencyError


    class Qdrant:  # satisfies the Store protocol structurally
        """Vector store backed by Qdrant. Registry key: 'qdrant'."""

        def __init__(self, url: str = "http://localhost:6333") -> None:
            self.url = url
            self._client = None

        def _connect(self) -> None:
            try:
                from qdrant_client import QdrantClient  # lazy import
            except ModuleNotFoundError as exc:
                raise MissingDependencyError(
                    "The Qdrant store requires the 'qdrant' extra. "
                    "Install it with: pip install indx[qdrant]"
                ) from exc
            self._client = QdrantClient(url=self.url)

        def upsert(
            self,
            ids: list[str],
            vectors: list[list[float]],
            payloads: list[dict],
        ) -> None:
            if self._client is None:
                self._connect()
            from qdrant_client.models import PointStruct  # vendor type, local

            # convert core ids/vectors/payloads → vendor PointStruct *here*, never in core/
            # Caveat: a Qdrant server accepts only unsigned integers or UUIDs as point
            # ids, so a production adapter would map core string ids (for example with
            # uuid.uuid5) and keep the original id in the payload.
            points = [
                PointStruct(id=i, vector=v, payload=p)
                for i, v, p in zip(ids, vectors, payloads)
            ]
            self._client.upsert(collection_name="indx", points=points)

        def query(
            self,
            vector: list[float],
            k: int = 5,
            filter: dict | None = None,
        ) -> list[tuple[str, float]]:
            if self._client is None:
                self._connect()
            results = self._client.search(
                collection_name="indx", query_vector=vector, limit=k
            )
            # convert vendor results → core tuples at the edge
            return [(str(r.id), float(r.score)) for r in results]

        def persist(self, dest: str) -> None:
            """Flush/export vectors into the output `embeddings/` layout."""
            ...

That is the entire contract. The adapter satisfies Store structurally, never leaks qdrant_client types out of the module, imports the SDK lazily with an actionable hint, declares its extra, registers by entry point, and is covered by a contract test.
Pull-request checklist
Before a new backend can merge, the following must be true:
- The class satisfies its Protocol exactly (signatures match the protocols reference).
- Only core types cross the boundary; vendor types stay inside the adapter module.
- The heavy dependency is imported lazily and raises MissingDependencyError with a pip install indx[<extra>] hint.
- The extra is declared under [project.optional-dependencies], named after the registry key.
- The class is registered via the correct [project.entry-points."indx.<group>"].
- An adapter contract test proves the Protocol fit and core-type round-trip, with network/model mocked.
- ruff, mypy --strict, pyright (strict), and the offline pytest suite all pass.
See also
- Component protocols reference — the exact signatures every backend must satisfy.
- Authoring a Plugin — ship a backend as a standalone PyPI package.
- Testing — contract tests, fakes, and golden files.
- Registry and Defaults — how names resolve to classes.
- Extras — the optional-dependency install matrix.