Configuration¶

txi reads configuration from a TOML file. Settings are looked up in this order, with earlier sources overriding later ones:

1. The --config PATH flag, if passed.
2. The TRANSCRIPT_INDEXER_CONFIG environment variable.
3. $XDG_CONFIG_HOME/transcript-indexer/config.toml (or ~/.config/transcript-indexer/config.toml on macOS/Linux, %APPDATA%\transcript-indexer\config.toml on Windows).
4. Built-in defaults.
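The lookup order above can be sketched in Python as follows. This is an illustration only, not txi's actual code: the function names `user_config_path` and `resolve_config_path` are hypothetical, and the sketch assumes that `~/.config` is used when `XDG_CONFIG_HOME` is unset and that `~/AppData/Roaming` stands in when `%APPDATA%` is missing.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional

APP_DIR = "transcript-indexer"


def user_config_path(env: Mapping[str, str], platform: str = sys.platform) -> Path:
    """Return the user-level config location (third entry in the lookup order)."""
    if platform.startswith("win"):
        # %APPDATA%\transcript-indexer\config.toml; the fallback is an assumption.
        appdata = env.get("APPDATA")
        base = Path(appdata) if appdata else Path.home() / "AppData" / "Roaming"
    else:
        # Assumption: ~/.config is used only when XDG_CONFIG_HOME is unset or empty.
        xdg = env.get("XDG_CONFIG_HOME")
        base = Path(xdg) if xdg else Path.home() / ".config"
    return base / APP_DIR / "config.toml"


def resolve_config_path(
    cli_path: Optional[str],
    env: Mapping[str, str] = os.environ,
    platform: str = sys.platform,
) -> Path:
    """Pick the config file path; earlier sources win over later ones."""
    if cli_path:  # 1. --config PATH
        return Path(cli_path).expanduser()
    env_path = env.get("TRANSCRIPT_INDEXER_CONFIG")
    if env_path:  # 2. environment variable
        return Path(env_path).expanduser()
    return user_config_path(env, platform)  # 3. user-level file
```

Built-in defaults (the fourth entry) are not a file path, so they are left out of this sketch; they would apply to any key the chosen file does not set.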

Inspecting the active config¶

Show the path of the file txi is reading:

txi config path

Print its contents with TOML syntax highlighting:

txi config view

Validate the file's TOML syntax and field types:

txi config validate

All three commands accept --config PATH to inspect an alternate file.

First run¶

On first invocation, if no config file exists at the user-level path and TRANSCRIPT_INDEXER_CONFIG is unset, txi writes a fully populated default config to that path. Edit any value to override it; remove a key (or a whole section) to fall back to the built-in default for that key.
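The fallback behaviour (a missing key or section takes its built-in default) amounts to a per-section merge. The following Python sketch shows one way to express it. `DEFAULTS` lists only the defaults documented on this page for `[chunking]` and `[server]`, and `apply_defaults` is a hypothetical helper, not part of txi.

```python
import copy
from typing import Any, Dict

# Only the defaults documented on this page; the real set is larger.
DEFAULTS: Dict[str, Dict[str, Any]] = {
    "chunking": {
        "turns_per_chunk": 4,
        "overlap_turns": 1,
        "min_chunk_tokens": 200,
        "max_chunk_tokens": 0,
    },
    "server": {"host": "127.0.0.1", "port": 8765},
}


def apply_defaults(
    user: Dict[str, Dict[str, Any]],
    defaults: Dict[str, Dict[str, Any]] = DEFAULTS,
) -> Dict[str, Dict[str, Any]]:
    """Overlay user settings on the defaults, one section at a time."""
    merged = copy.deepcopy(defaults)  # never mutate the shared defaults
    for section, fields in user.items():
        merged.setdefault(section, {}).update(fields)
    return merged
```

With this merge, a file that contains only `[server]` with `port = 9000` still yields `host = "127.0.0.1"` and the full set of `[chunking]` defaults.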

Example¶

[paths]
transcripts = "~/transcripts"
db          = "~/.local/share/transcript-indexer/index.db"

[embedding]
provider     = "voyage"
model        = "voyage-3"
dimensions   = 1024
batch_size   = 64
api_key_env  = "VOYAGE_API_KEY"

[chunking]
strategy         = "turn_group"
turns_per_chunk  = 4
overlap_turns    = 1
min_chunk_tokens = 200
max_chunk_tokens = 8000  # set to your model's token limit; 0 = no cap

[indexing]
strict_format = false

[server]
host = "127.0.0.1"
port = 8765

Sections¶

`[paths]`¶

transcripts — directory containing raw transcript files (.txt). Default ~/transcripts.
db — SQLite database file. Default ~/.local/share/transcript-indexer/index.db.

`[embedding]`¶

Embedding-provider settings used when populating the vector index. The api_key_env field names the environment variable that holds the provider's API key; the key itself never lives in the config file.
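The indirection through `api_key_env` can be sketched in Python like this. The helper name `load_api_key` and the error message are illustrative assumptions, not txi's API.

```python
import os
from typing import Any, Mapping


def load_api_key(
    embedding_cfg: Mapping[str, Any],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Look up the API key in the environment variable named by the config."""
    var_name = embedding_cfg["api_key_env"]  # e.g. "VOYAGE_API_KEY"
    key = env.get(var_name)
    if not key:
        # Fail early with the variable name, never with the key itself.
        raise RuntimeError(f"environment variable {var_name} is not set")
    return key
```

Because the config stores only the variable name, the file can be shared or committed without exposing the secret.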

`[chunking]`¶

How transcripts are split before embedding. The turn_group strategy groups turns_per_chunk consecutive turns into a window, with overlap_turns turns shared between adjacent windows.

turns_per_chunk — number of turns per window. Default 4.
overlap_turns — turns shared between consecutive windows. Default 1.
min_chunk_tokens — windows below this count are dropped (requires a token counter). Default 200.
max_chunk_tokens — windows exceeding this count are shrunk by removing turns from the tail until they fit. 0 disables the cap. Set this to your model's token limit (e.g. 8000 for OpenAI text-embedding-3-small, which has an 8192-token ceiling). Default 0.
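The four settings above interact as in the following Python sketch. It is a reading of the description on this page, not txi's implementation. Three points are assumptions: the whitespace-based `count_tokens` stand-in, keeping a single turn that exceeds the cap by itself, and starting the next window relative to the turns actually kept after shrinking, so that no turn is skipped.

```python
from typing import Callable, List, Sequence


def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): whitespace-separated words."""
    return len(text.split())


def turn_group_chunks(
    turns: Sequence[str],
    turns_per_chunk: int = 4,
    overlap_turns: int = 1,
    min_chunk_tokens: int = 200,
    max_chunk_tokens: int = 0,
    counter: Callable[[str], int] = count_tokens,
) -> List[List[str]]:
    """Group consecutive turns into overlapping windows."""
    if not 0 <= overlap_turns < turns_per_chunk:
        raise ValueError("overlap_turns must be in [0, turns_per_chunk)")

    chunks: List[List[str]] = []
    start = 0
    while start < len(turns):
        window = list(turns[start:start + turns_per_chunk])
        # Shrink from the tail until the window fits; 0 disables the cap.
        if max_chunk_tokens > 0:
            while len(window) > 1 and sum(map(counter, window)) > max_chunk_tokens:
                window.pop()
        # Drop windows that are too small to be useful.
        if sum(map(counter, window)) >= min_chunk_tokens:
            chunks.append(window)
        if start + len(window) >= len(turns):
            break  # the last turn has been covered
        # Advance so that overlap_turns turns are shared with the next window.
        start += max(1, len(window) - overlap_turns)
    return chunks
```

With six two-word turns and the default `turns_per_chunk = 4`, `overlap_turns = 1`, passing `min_chunk_tokens=0` produces two windows, turns 1 to 4 and turns 4 to 6, which share turn 4.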

`[indexing]`¶

strict_format — abort sync on the first malformed source instead of warning.
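The difference between the two modes can be sketched in Python as follows. Everything here is hypothetical: the `MalformedSource` exception, the `parse_source` rule that every non-empty line needs a "Speaker: text" shape, and the assumption that a malformed source is skipped after the warning. The sketch only illustrates the abort-versus-warn control flow.

```python
import warnings
from typing import Dict, List


class MalformedSource(Exception):
    """Raised when a transcript does not match the expected format."""


def parse_source(text: str) -> List[str]:
    """Hypothetical parser: each non-empty line must contain a speaker label."""
    turns = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if ":" not in line:
            raise MalformedSource(f"missing speaker label: {line!r}")
        turns.append(line)
    return turns


def sync(sources: Dict[str, str], strict_format: bool = False) -> Dict[str, List[str]]:
    """Parse every source; abort on the first bad one only in strict mode."""
    indexed: Dict[str, List[str]] = {}
    for name, text in sources.items():
        try:
            indexed[name] = parse_source(text)
        except MalformedSource as exc:
            if strict_format:
                raise  # strict: stop the whole sync
            warnings.warn(f"skipping {name}: {exc}")  # lenient: warn and move on
    return indexed
```

Strict mode suits automated pipelines where a silent gap in the index would be worse than a failed run; the lenient mode suits interactive use on messy archives.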

`[server]`¶

Settings used by txi serve.

host — host interface to bind. Default 127.0.0.1.
port — TCP port to bind. Default 8765.

See the config API reference for the underlying Pydantic models.
