Sync

transcript_indexer.sync

Idempotent two-pass sync engine.

Pass 1 walks conversation sources (currently Otter .txt files), reads and hashes each file, and applies new, changed, renamed, and deleted conversations to the database along with their turns and participants. Pass 2 walks note sources (.md summaries) and pairs each note with its parent conversation by filename stem.

This phase deliberately omits LLM extraction (Phase 4) and embedding (Phase 5). Sync ingests structural rows only.

sync(conn, cfg, *, only=None, strict=False, keep_orphans=False)

Run a full sync against the given DB connection.

Source code in src/transcript_indexer/sync.py
def sync(
    conn: sqlite3.Connection,
    cfg: Config,
    *,
    only: Iterable[Path] | None = None,
    strict: bool = False,
    keep_orphans: bool = False,
) -> SyncReport:
    """Run a full sync against the given DB connection."""
    report = SyncReport()
    # Materialize once so both passes can iterate over the same paths.
    only_list = list(only) if only is not None else None
    started = datetime.now(UTC)

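    # An explicit strict=True wins; otherwise fall back to the configured default.
    if not strict:
        strict = cfg.indexing.strict_format

    # Pass 1: conversations, with their turns and participants.
    _sync_conversations(
        conn, cfg, only=only_list, strict=strict, keep_orphans=keep_orphans, report=report
    )
    # Pass 2: notes, paired to their parent conversations.
    _sync_notes(conn, cfg, only=only_list, keep_orphans=keep_orphans, report=report)
    resolve_people(conn)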

    report.elapsed_seconds = (datetime.now(UTC) - started).total_seconds()
    return report
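One detail of the listing that is easy to miss is how `strict` is resolved: the argument is consulted first, and the config value `cfg.indexing.strict_format` is used only when the argument is falsy. The sketch below isolates that logic. `IndexingConfig` and `resolve_strict` are hypothetical stand-ins written for illustration, not names from the package.

```python
from dataclasses import dataclass


@dataclass
class IndexingConfig:
    """Hypothetical stand-in for the `cfg.indexing` section."""

    strict_format: bool = False


def resolve_strict(strict: bool, indexing: IndexingConfig) -> bool:
    """Mirror the fallback in sync(): use the config value unless strict is truthy."""
    if not strict:
        strict = indexing.strict_format
    return strict
```

The effective value is the logical OR of the argument and the config setting. A consequence is that passing `strict=False` cannot switch strict mode off when the config enables it; a caller wanting a lenient run would have to change the config instead.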