Engineering

Thoth and the Akashic Records: When the Scribe Meets the Library

In Egyptian mythology, Thoth maintained the Akashic Records — the cosmic library of all existence. In our stack, the same relationship emerged organically. A documentation engine that writes knowledge and a vector store that makes it searchable. The scribe feeds the library.

March 1, 2026
10 min read
#thoth #akashic-records #knowledge-management

I didn't plan the mythology. The mythology planned itself.

When I built a documentation synchronization engine that scans 47 GitHub repos every night, extracts knowledge from merged pull requests, and writes it into a centralized knowledge base — I called it Thoth. The Egyptian god of writing, wisdom, and the moon. The ibis-headed deity who invented hieroglyphics and served as scribe to the gods.

When I built a semantic search system that indexes 100+ markdown files into a vector store and serves natural-language queries to every consumer in the stack — I called it the Akashic Records. The cosmic library of all human experience, encoded in the fabric of existence itself.

It wasn't until both systems were running in production that I realized what I'd actually built: the exact relationship described in ancient mythology. Thoth maintains the Akashic Records. The scribe feeds the library. And in our stack, the same loop exists — Thoth writes knowledge, the Akashic Records indexes it, consumers query it, work gets done, PRs merge, and Thoth writes again.

The names weren't a metaphor. They were a specification.

The Mythology

In the Egyptian pantheon, Thoth occupied a unique position. He wasn't a warrior god like Horus or a creator deity like Ra. Thoth was the infrastructure. He invented the writing system that allowed civilization to record and transmit knowledge. He maintained the divine library where all events, thoughts, and deeds were inscribed. He stood in the Hall of Ma'at during the judgment of the dead, recording the outcome as Anubis weighed the heart.

Thoth didn't generate knowledge. He organized it, recorded it, and made it retrievable.

The Akashic Records, drawn from the Sanskrit word akasha meaning "sky" or "aether," represent the totality of all information that exists. Every event, every discovery, every lesson — encoded in a universal medium accessible to those who know how to query it. In theosophical tradition, the Akashic Records aren't a place you visit. They're a field you tap into. The knowledge is always there. The question is whether you have the interface to reach it.

DOCTRINE

Thoth doesn't own the knowledge. The Akashic Records aren't his creation. His role is to ensure that knowledge flows from where it's generated to where it can be found. He is the pipeline between experience and memory. The scribe between action and archive.

That's a system architecture document written 4,000 years ago.

The Scribe: Thoth in Production

Thoth — the system, not the god — runs every night at 11:30 PM. It has one job: make sure that engineering work gets documented. Not with AI hallucinations. Not with generated summaries that sound right but aren't. With the actual structured content that engineers already wrote in their pull request descriptions.

THOTH — DOCUMENTATION SYNC PIPELINE
Daily cron · 11:30 PM ET · No LLM required
Source: GitHub · Invictus-Labs org · 47 repos · merged PRs · gh CLI

Stage 01 · Scanner · gh pr list · 5s timeout · skip archived
Stage 02 · Analyzer · classify PRs · ≥50 adds · no .md changes
Stage 03 · Generator · extract + template · parse body · sanitize tables
Stage 04 · Publisher · gh pr create · max 5 per run · human review

Outputs:
Knowledge Base: kb/projects/*.md · doc PRs
report.json: scan results · KB health metrics
Mission Control: /doc-health panel · stat cards

The pipeline has four stages, and none of them require an LLM.

Stage 1 — Scanner. Thoth queries the GitHub API for every non-archived repository in the Invictus-Labs organization. Currently 47 repos. For each repo, it fetches all pull requests merged that day. The raw material for documentation already exists — it's in the PR bodies that engineers wrote to explain their changes.
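The scanner stage can be sketched in a few lines around the real `gh pr list` flags (`--repo`, `--state`, `--search`, `--json`). The repo name, date format, and field list below are illustrative assumptions, not Thoth's actual code:

```python
import json
import subprocess


def merged_pr_command(repo: str, since: str) -> list[str]:
    """Build the gh CLI invocation that lists PRs merged on or after `since`.

    `merged:>=DATE` is a standard GitHub search qualifier; the JSON field
    names are valid `gh pr list --json` fields.
    """
    return [
        "gh", "pr", "list",
        "--repo", repo,
        "--state", "merged",
        "--search", f"merged:>={since}",
        "--json", "number,title,body,additions,author",
    ]


def scan(repo: str, since: str, timeout: float = 5.0) -> list[dict]:
    """Run the command with the article's 5-second timeout and parse the JSON."""
    out = subprocess.run(
        merged_pr_command(repo, since),
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return json.loads(out.stdout)
```

Looping this over every non-archived repo in the org yields the day's raw material.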

Stage 2 — Analyzer. Not every PR needs documentation. A 15-line dependency bump doesn't belong in the knowledge base. Thoth classifies each PR against a set of rules: minimum 50 additions, not from dependabot or renovate, doesn't already include doc changes. PRs that pass classification get priority-scored — feat: prefixed PRs or those with 200+ additions rank HIGH, fix: and refactor: rank MEDIUM, everything else LOW.
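The classification and scoring rules above translate almost directly into code. A minimal sketch, assuming a flat dict shape for each PR (the real pipeline's data model may differ):

```python
BOT_AUTHORS = {"dependabot", "renovate"}


def needs_docs(pr: dict) -> bool:
    """Classification rules: ≥50 additions, not a bot, no .md changes already."""
    if pr["additions"] < 50:
        return False
    if pr["author"] in BOT_AUTHORS:
        return False
    if any(path.endswith(".md") for path in pr.get("files", [])):
        return False
    return True


def priority(pr: dict) -> str:
    """feat: prefix or 200+ additions → HIGH; fix:/refactor: → MEDIUM; else LOW."""
    if pr["title"].startswith("feat:") or pr["additions"] >= 200:
        return "HIGH"
    if pr["title"].startswith(("fix:", "refactor:")):
        return "MEDIUM"
    return "LOW"
```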

Stage 3 — Generator. Here's where the zero-LLM constraint matters. Every PR in the Tesseract Intelligence ecosystem follows a structured format: ## Summary, ## Architecture, ## Test plan. Thoth extracts these sections and templates them into knowledge base markdown. For new repos, it creates the full project page. For existing repos, it appends a changelog row. The information is already correct because it was written by the engineer who built the feature.
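Verbatim extraction of those `## Summary` / `## Architecture` / `## Test plan` sections needs nothing more than a regex split. A sketch of the idea (not Thoth's actual parser):

```python
import re

# Matches markdown level-2 headings at the start of a line.
SECTION_RE = re.compile(r"^##\s+(.+?)\s*$", re.MULTILINE)


def extract_sections(body: str) -> dict[str, str]:
    """Split a PR body on its `## Heading` lines, preserving content as-written."""
    sections: dict[str, str] = {}
    matches = list(SECTION_RE.finditer(body))
    for i, m in enumerate(matches):
        start = m.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        sections[m.group(1)] = body[start:end].strip()
    return sections
```

Because the section bodies are carried through untouched, the only failure mode is a missing heading, not a hallucinated detail.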

Stage 4 — Publisher. Thoth clones the knowledge-base repo, creates a branch per source repo, commits the generated docs, pushes, and opens a PR via gh pr create. Rate-limited to 5 doc PRs per run to keep the review queue manageable.
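The 5-PR cap interacts with the priority scores from Stage 2: when more than five PRs qualify, the highest-priority ones should win. One plausible selection step (the ordering logic is my assumption; the article only specifies the cap):

```python
RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
MAX_DOC_PRS = 5


def select_for_publish(classified: list[dict]) -> list[dict]:
    """Take the top candidates by priority, capped at 5 doc PRs per run.

    `sorted` is stable, so PRs of equal priority keep their scan order.
    """
    ordered = sorted(classified, key=lambda pr: RANK[pr["priority"]])
    return ordered[:MAX_DOC_PRS]
```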

Repos Scanned: 47 · nightly across the entire Invictus-Labs org
LLM Cost: $0 · pure extraction and templating, zero inference
KB Coverage: 18% → 46% · first week of production (Feb 21-27)

The result: documentation that tracks engineering velocity automatically. When a feature ships at 3 PM, Thoth documents it at 11:30 PM. The knowledge base stays current without anyone remembering to update it.

The Library: Akashic Records in Production

The previous article covered the Akashic Records architecture in depth. The short version: it's a FastAPI service on port 8002 that indexes 104 markdown files across 7 source locations into a ChromaDB vector store. Sentence-transformer embeddings convert text into 384-dimensional vectors. Cosine similarity finds semantically related content regardless of how the query is phrased versus how the knowledge was originally written.
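The retrieval math at the core of that service is just cosine similarity over embedding vectors. A self-contained sketch with toy 2-dimensional vectors standing in for the real 384-dimensional sentence-transformer embeddings (chunk names here are hypothetical):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| · |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], chunks: dict[str, list[float]], k: int = 3) -> list[str]:
    """Rank stored chunk vectors against a query embedding, highest first."""
    scored = sorted(chunks.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [name for name, _ in scored[:k]]
```

ChromaDB does this ranking internally; the point is that the transformation is purely geometric, which matters for the integrity argument later in the article.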

Total Chunks: 1,211 · across 6 knowledge categories
Query Latency: <100ms · embed + similarity search + ranked retrieval

The critical detail for this article: one of those 7 source locations is the knowledge-base repository. The same repo that Thoth writes to every night. The Akashic Records indexes it every 6 hours via incremental reindex.

Which means Thoth's output becomes searchable knowledge within hours of being written.
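The article doesn't specify how the incremental reindex detects modified files, so treat this as one plausible mechanism: snapshot each file's mtime and re-embed only what changed since the last run.

```python
import json
from pathlib import Path


def changed_files(root: Path, state_file: Path) -> list[Path]:
    """Return markdown files whose mtime differs from the last-seen snapshot.

    Only these files need re-chunking and re-embedding; everything else
    keeps its existing vectors. The snapshot is persisted as JSON.
    """
    seen = json.loads(state_file.read_text()) if state_file.exists() else {}
    current: dict[str, float] = {}
    changed: list[Path] = []
    for md in sorted(root.rglob("*.md")):
        mtime = md.stat().st_mtime
        current[str(md)] = mtime
        if seen.get(str(md)) != mtime:
            changed.append(md)
    state_file.write_text(json.dumps(current))
    return changed
```

This is why the incremental pass costs seconds rather than minutes: unchanged files never touch the embedding model.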

The Convergence

Here's where the mythology stops being a naming convention and starts being an architecture diagram.

Engineering Work (47 repos)
        │
        ▼
   ┌─────────┐     PR merges
   │  GitHub  │────────────────┐
   └─────────┘                 │
                               ▼
                    ┌──────────────────┐
                    │      THOTH       │  11:30 PM nightly
                    │   (Doc Sync)     │  Extract → Template → PR
                    └────────┬─────────┘
                             │
                             ▼ writes docs to
                    ┌──────────────────┐
                    │  Knowledge Base  │  ~/Documents/Dev/knowledge-base/
                    │     (GitHub)     │
                    └────────┬─────────┘
                             │
                             ▼ indexed by
                    ┌──────────────────┐
                    │ AKASHIC RECORDS  │  Every 6 hours
                    │  (Vector Store)  │  Chunk → Embed → Store
                    └────────┬─────────┘
                             │
                             ▼ queried by
              ┌──────────────┼──────────────┐
              │              │              │
        Claude Code    Mission Control   OpenClaw
         (MCP)          (REST API)      (curl/cron)
              │              │              │
              └──────────────┼──────────────┘
                             │
                             ▼ informs
                    ┌──────────────────┐
                    │  Engineering     │
                    │    Decisions     │──── which produce PRs ───┐
                    └──────────────────┘                          │
                                                                  │
                    ┌─────────────────────────────────────────────┘
                    │
                    ▼
               Back to GitHub → Back to Thoth → Back to Akashic Records
INSIGHT

This is a closed knowledge loop. Work generates PRs. Thoth extracts documentation from PRs. The Akashic Records indexes that documentation. Consumers query the indexed knowledge to inform new work. New work generates new PRs. The cycle continues — every iteration adding to the total knowledge available to the system.

The loop has no manual steps. No one needs to remember to document a feature. No one needs to remember to reindex the knowledge base. No one needs to know which file contains the answer to their question. Thoth writes. The Akashic Records indexes. The query interface serves.

What This Looks Like in Practice

A concrete example. On February 27th, three PRs merged across the ecosystem:

  1. A feat: PR in polymarket-bot adding a new early-window predictive scoring path
  2. A fix: PR in indecision-discord-bot resolving a WebSocket reconnection issue
  3. A refactor: PR in mission-control restructuring the portfolio API

At 11:30 PM, Thoth scanned all 47 repos. The analyzer classified all three PRs as needing documentation — the polymarket-bot PR ranked HIGH (feat prefix, 200+ additions), the other two ranked MEDIUM. The generator extracted the structured content from each PR body and either created or updated the corresponding knowledge-base project pages. Three doc PRs were opened, reviewed by CodeRabbit, and merged.

Six hours later, the Akashic Records ran its incremental reindex. It detected the three modified files in the knowledge-base mount, re-chunked and re-embedded them, and updated the vector store. Total incremental cost: under 2 seconds.

The next morning, when a Claude Code session needed to understand "how does the early window predictive scoring work?" — the Akashic Records returned the freshly indexed chunk from the polymarket-bot knowledge-base page, which Thoth had generated from the actual PR description written by the engineer who built the feature. The information was accurate because it was never generated — it was extracted and preserved.

ALPHA

The knowledge that answered the query was less than 12 hours old. It traveled from a PR merge → through Thoth's extraction pipeline → into the knowledge base → through the Akashic Records' embedding pipeline → into the vector store → back to a consumer. Automatically. At zero marginal cost.

Why Zero-LLM Matters for the Scribe

There's a temptation in every AI-adjacent system to throw an LLM at the problem. Let the model summarize the PR. Let it generate documentation from the diff. Let it write the knowledge-base page from scratch.

I deliberately built Thoth without an LLM, and the InDecision Framework taught me why. When you're building systems that other systems depend on for ground truth, the information chain must be lossless. An LLM summarizing a PR will produce something that sounds right. It will use correct-seeming technical terms. It will structure the output beautifully. And it will occasionally hallucinate a detail that was never in the PR, or omit a critical constraint that was.

Thoth doesn't summarize. It extracts. The ## Summary section from the PR body goes into the knowledge base as-written. The ## Architecture section is preserved verbatim. The information is correct because it is the original information — reformatted, not rewritten.

The Akashic Records is where AI enters the chain. The embedding model converts text to vectors for similarity search. But embedding is a mathematical transformation, not a generative one. The content isn't altered — it's projected into a searchable space. The information integrity survives the entire pipeline from PR body to query result.

WARNING

Generative AI is powerful for synthesis and creation. It is dangerous for transcription and archival. The scribe must be faithful. The library can be intelligent.

The Convergence Pattern

CONVERGENCE PATTERN — BATCH MERGE
Multiple PRs editing the same file require sequential merge passes

Pass     Created   Merged   Conflicts   Status
Pass 1      35       11        24       CONFLICTS
Pass 2      12        9         3       CONFLICTS
Pass 3       5        4         1       CONFLICTS
Pass 4       2        2         0       CONVERGED

Cycle: Create PRs → CodeRabbit Review → Merge → Close Conflicts → Re-run
Budget N/5 merge passes where N = total PRs. Each merge invalidates overlapping branches.

This isn't just a two-system integration. It's a pattern that applies anywhere knowledge is generated and consumed:

Separate the writer from the reader. Thoth writes. The Akashic Records reads. Neither system does both. This separation means each can be optimized independently — Thoth for extraction accuracy and coverage, the Akashic Records for retrieval speed and semantic relevance.

Automate the boring middle. The hard part of knowledge management isn't writing docs or searching them. It's the transfer — getting knowledge from where it's generated (PRs, conversations, decisions) into where it's needed (search results, dashboards, agent context). Thoth automates the transfer. The Akashic Records automates the retrieval. The middle disappears.

Close the loop. A knowledge system that doesn't feed back into work is a library nobody visits. The MCP bridge, the REST API, the Docker network integration — these are the feedback paths that ensure indexed knowledge flows back into the engineering process that generates more knowledge.

The Numbers

The combined system, after one week of production:

Knowledge Pipeline: 47 → 1,211 · repos scanned nightly → searchable chunks available
Documentation Velocity: 0 → 5 · doc PRs generated per night, automatically
Total Infrastructure Cost: $0/month · CPU embeddings, no cloud vector DB, no LLM inference
Test Coverage: 92% + 96% · Akashic Records (76 tests) + Thoth (74 tests)

Zero ongoing cost. 150 tests between the two systems. The entire knowledge pipeline — from PR merge to semantic query result — runs on a Mac Mini with no external API dependencies at inference time.

What Comes Next

The mythology has one more layer I haven't built yet. In the Egyptian tradition, Thoth didn't just write in the Akashic Records. He also read from them. He consulted the cosmic library to make judgments, resolve disputes, and advise other gods. The scribe wasn't just an input mechanism — he was a bidirectional interface.

In our stack, Thoth currently writes to the knowledge base but doesn't read from the Akashic Records. The next evolution is giving Thoth awareness of what's already documented. Before generating a new knowledge-base page, query the Akashic Records for existing coverage. Before creating a doc PR, check if the semantic content already exists in a different source. Use the library to make the scribe smarter — reduce redundancy, identify gaps, prioritize what actually needs to be written.

Rewired Minds explores how cognitive systems compound over time. This is the technical manifestation of that thesis. Every cycle through the loop makes the system more complete. Every document Thoth writes becomes searchable context that improves the next cycle's output. The knowledge base doesn't just grow — it compounds.

The scribe maintains the library. The library informs the scribe. The loop tightens with every iteration.

Thoth and the Akashic Records. The mythology was the architecture all along.
