Contributing Overview
indx is open-source (Apache-2.0) and built to be extended. Whether you are fixing a bug, adding a vector store, or writing docs, this page is your starting point: the dev stack, the workflow, and exactly what a pull request has to satisfy before it merges.
The stack
indx is a strictly-typed, dependency-light Python project. Knowing the toolchain up front saves round-trips with CI.
| Area | Choice | Notes |
|---|---|---|
| Language | Python 3.11+ (3.11–3.13) | tomllib, Self, ExceptionGroup/except* all require 3.11 as the floor. |
| Typing | Strict, full hints | from __future__ import annotations in every module; mypy --strict is the gate, pyright (strict) for editor feedback. No bare Any. |
| Data models | Pydantic v2 | All boundary types (config, index.json, domain records); plain dataclasses only for hot-loop internals. |
| CLI | Typer + Rich | Typer builds the command tree from typed signatures; Rich renders progress, tables, and panels. |
| Lint + format | Ruff | One tool replaces flake8 + black + isort. |
| Type check | mypy (gate) + pyright (editor) | mypy is canonical when the two disagree. |
| Tests | pytest | Offline by default; VCR/mocking for network calls; golden-file tests for artifacts. |
| Plugins | Entry points + registry | Built-ins register internally; third parties advertise entry points under indx.parsers, indx.stores, etc. |
| Extras | Optional dependencies | Heavy/cloud backends ship behind pip install indx[<extra>]; the core stays light. |
| Build | Hatchling + PEP 621 | python -m build must produce a clean light-core install. |
| Project docs | Astro Starlight (this site) + MkDocs API reference | Two surfaces — the user-facing docs you are reading are built with Astro Starlight (npm run build/npm run dev to preview); the SDK/API reference is generated from typed docstrings with MkDocs + Material + mkdocstrings (mkdocs build). Starlight content edits need a Starlight build; docstring/API changes need an MkDocs build. |
For the full normative rules behind these choices, see Coding Standards, and for the rationale and trade-offs see the architecture overview.
Architecture you are working within
A few load-bearing rules shape almost every contribution:
- Protocol-first. Behaviour is defined as a typed Protocol (Parser, LLM, VLM, Embedder, Store, OutputWriter, plus the Stage protocol) before any implementation exists. Code depends on interfaces, never on concrete backends.
- Dependency direction points inward. core/ imports nothing internal; implementations import only their own slot’s protocol and core, never a sibling or another slot. An import-linter contract in CI enforces this.
- Convert at the edge. Vendor types stay inside the adapter module. A Document never holds a raw provider response.
- Stages share one context. Every stage obeys run(ctx: SpaceContext) -> SpaceContext and returns the same mutated context. See Pipeline and Stages.
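The protocol-first and shared-context rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the SpaceContext fields, the embed method signature, and the LengthEmbedder and EmbedStage classes are hypothetical stand-ins, and only the run(ctx: SpaceContext) -> SpaceContext shape comes from this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class SpaceContext:
    """Hypothetical stand-in; the real SpaceContext carries more state."""

    texts: list[str] = field(default_factory=list)
    vectors: list[list[float]] = field(default_factory=list)


class Embedder(Protocol):
    """Interface first: stages depend on this, never on a concrete backend."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class Stage(Protocol):
    """Every stage takes the shared context and hands it back."""

    def run(self, ctx: SpaceContext) -> SpaceContext: ...


class LengthEmbedder:
    """Toy backend that satisfies Embedder structurally (no inheritance)."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        return [[float(len(t))] for t in texts]


class EmbedStage:
    """Stage that knows only the Embedder protocol, not the backend class."""

    def __init__(self, embedder: Embedder) -> None:
        self._embedder = embedder

    def run(self, ctx: SpaceContext) -> SpaceContext:
        # Mutate and return the same context object, per the stage contract.
        ctx.vectors = self._embedder.embed(ctx.texts)
        return ctx


ctx = SpaceContext(texts=["alpha", "be"])
out = EmbedStage(LengthEmbedder()).run(ctx)
print(out is ctx, out.vectors)  # True [[5.0], [2.0]]
```

Because EmbedStage is typed against Embedder rather than LengthEmbedder, swapping in another backend needs no change to the stage, which is the point of keeping the dependency direction inward.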
Git and contribution workflow
Branch naming
Branches follow type/short-slug:
- feat/lancedb-store
- fix/parse-empty-pdf
- docs/coding-standards
Conventional Commits
Commit messages use Conventional Commits. The recognised types are:
| Type | Use for |
|---|---|
| feat | A new feature or backend |
| fix | A bug fix |
| docs | Documentation only |
| refactor | A code change that neither fixes a bug nor adds a feature |
| test | Adding or fixing tests |
| chore | Tooling, deps, housekeeping |
| perf | A performance improvement |
Breaking changes append ! to the type (for example feat!:) and add a BREAKING CHANGE: footer. Keep the subject in imperative mood and 72 characters or fewer.
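Most of these rules can be checked mechanically. The following is a rough sketch, not a tool the project ships: check_commit is a hypothetical helper, the optional (scope) group comes from the Conventional Commits specification rather than this page, and the 72-character limit is applied to the whole first line as an assumption. Imperative mood is not checked.

```python
from __future__ import annotations

import re

TYPES = ("feat", "fix", "docs", "refactor", "test", "chore", "perf")

# type, optional (scope), optional "!" for breaking changes, then ": subject"
_HEADER = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?: (?P<subject>\S.*)$"
)


def check_commit(message: str) -> list[str]:
    """Return a list of problems; an empty list means the message passes."""
    lines = message.splitlines()
    header = lines[0] if lines else ""
    problems: list[str] = []
    match = _HEADER.match(header)
    if match is None:
        problems.append("header must be 'type: subject' with a recognised type")
    if len(header) > 72:
        problems.append("first line is longer than 72 characters")
    has_footer = any(line.startswith("BREAKING CHANGE: ") for line in lines[1:])
    if match is not None and match["breaking"] and not has_footer:
        problems.append("'!' requires a BREAKING CHANGE: footer")
    return problems


print(check_commit("fix: handle empty PDF pages"))   # []
print(check_commit("feat!: drop the legacy key"))    # one problem: missing footer
print(check_commit("feature: add a store"))          # one problem: unknown type
```

A check like this could run from a commit-msg hook. For reference, a complete breaking-change message looks like this: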
feat!: rename the `store` registry key for the qdrant backend

BREAKING CHANGE: `store = "qdrant-local"` is now `store = "qdrant"`.

The PR checklist
Every pull request must satisfy this checklist before merge. CI re-runs all of it, so check it locally first.
- ruff check and ruff format --check pass
- mypy --strict and pyright (strict) pass with zero errors
- pytest (offline suite) passes, and new code is covered
- Golden files updated deliberately if the index.json / .indx shape changed, with a schema_version bump and a documented migration
- New backend? Its extra is declared in pyproject.toml, its entry point is registered, and an adapter contract test is added
- Public API is documented (Google-style docstrings), and CLI ⇔ SDK parity is preserved
- Starlight content changed? npm run build succeeds (preview locally with npm run dev)
- mkdocs build succeeds if the SDK/API reference (docstrings) changed
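One way to run the tool-based items locally in a single pass is a small driver script. This is a sketch under assumptions: the commands are the ones named in the checklist, but the bare invocations and the "." path passed to mypy are guesses at how the repository is configured, and run_checks is a hypothetical helper. The runner is injectable so the example executes without the tools installed.

```python
from __future__ import annotations

import subprocess
from collections.abc import Callable, Sequence

# Commands named in the checklist. Paths and flags beyond those shown on this
# page are assumptions; the project's config may expect something different.
CHECKS: tuple[tuple[str, ...], ...] = (
    ("ruff", "check"),
    ("ruff", "format", "--check"),
    ("mypy", "--strict", "."),
    ("pyright",),
    ("pytest",),
)


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> list[str]:
    """Run every check (no early exit) and return the commands that failed."""
    failed: list[str] = []
    for cmd in checks:
        if runner(cmd) != 0:
            failed.append(" ".join(cmd))
    return failed


def fake_runner(cmd: Sequence[str]) -> int:
    """Demo runner: pretend only mypy fails."""
    return 1 if cmd[0] == "mypy" else 0


print(run_checks(runner=fake_runner))  # ['mypy --strict .']
```

Calling run_checks() with the default runner executes the real tools. Running all checks instead of stopping at the first failure shows every problem in one pass, which is closer to what a CI run reports.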
What CI enforces
CI is the same checklist, automated, and a red check blocks merge. The GitHub Actions matrix runs across Python 3.11 / 3.12 / 3.13 (Linux primary, macOS/Windows smoke) and enforces:
- Ruff: lint and format-check.
- mypy + pyright: strict type checking, with mypy as the canonical gate.
- pytest with coverage: the offline suite only; real provider calls never run in the default suite. Coverage targets are ≥ 90% for core/ and stages/, ≥ 80% for adapters, and new code must not lower coverage.
- python -m build: the package builds and the light core installs cleanly with no extras.
A separate, manually triggered job runs the opt-in --live integration tests against real backends (Qdrant, Ollama, and downloaded models). These are gated by @pytest.mark.integration and skip automatically when the required extra or service is absent.
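The "skip when the extra or service is absent" behaviour can be approximated with standard-library checks. This is a sketch of the idea only, not the project's actual test fixtures: extra_installed, service_up, and skip_reason are hypothetical helpers, and the module name in the demo is a placeholder.

```python
from __future__ import annotations

import socket
from importlib.util import find_spec


def extra_installed(module: str) -> bool:
    """True if the top-level module an extra provides can be imported."""
    return find_spec(module) is not None


def service_up(host: str, port: int, timeout: float = 0.5) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def skip_reason(module: str, host: str, port: int) -> str | None:
    """Return why a live test should be skipped, or None if it can run."""
    if not extra_installed(module):
        return f"extra providing {module!r} is not installed"
    if not service_up(host, port):
        return f"no service listening on {host}:{port}"
    return None


# The module check comes first, so this demo never touches the network.
print(skip_reason("no_such_module_for_demo", "localhost", 6333))
# extra providing 'no_such_module_for_demo' is not installed
```

In a test module, the result would typically feed pytest.mark.skipif(reason is not None, reason=...) alongside the integration marker; pytest.importorskip covers the missing-module case on its own.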
Where to go next
- Adding a Backend: the fixed recipe for a new parser, LLM, VLM, embedder, store, or writer.
- Coding Standards: the enforceable rules on typing, models, errors, and layout.
- Testing: unit vs integration, fakes and cassettes, and the golden-file workflow.
- Architecture Overview: the dependency graph and design principles you build within.
Found a bug or have an idea? Open an issue or PR on the GitHub repository. If a rule here gets in your way, propose a change to the rule in a PR rather than routing around it.