Project Archaeology

You've got an existing project with months or years of history. Can you build a decision graph retroactively? Absolutely—and it's one of the most valuable things you can do.

The Scenario

Picture this: You're joining a project that's been running for two years. There's code everywhere, scattered documentation, and that one engineer who knew everything just left. The codebase works, but why does it work this way?

Or maybe it's your own project. You built it six months ago. You made decisions. But now you look at the code and think: "Why did I use MongoDB here? Was there a reason?"

This is where archaeology comes in—building a decision graph by mining your existing history.

The One Root Approach

Unlike forward logging, where you create many independent goals, archaeology typically flows from one origin point because you're reconstructing a single project's history.

            ┌─────────────────────┐
            │   Project Origin    │   Root goal
            │  "Build Acme Corp"  │
            └──────────┬──────────┘
                       │
      ┌────────────────┼────────────────┐
      ▼                ▼                ▼
┌───────────┐   ┌───────────┐   ┌───────────┐
│   Auth    │   │ Database  │   │   API     │   Major themes
│  System   │   │  Choice   │   │  Design   │
└─────┬─────┘   └─────┬─────┘   └─────┬─────┘
      │               │               │
┌─────┴─────┐   ┌─────┴─────┐   ┌─────┴─────┐
│  Options  │   │  Options  │   │  Options  │   Branches of
│  Actions  │   │  Actions  │   │  Actions  │   exploration
│  Outcomes │   │  Outcomes │   │  Outcomes │
└───────────┘   └───────────┘   └───────────┘

Step 1: Initialize and Set the Root

Watch how we establish the foundation for reconstructing a project's decision history:

Setting Up Archaeology
user
cd ~/projects/acme-corp
user
deciduous init
output
Initialized deciduous in /Users/you/projects/acme-corp
Created .deciduous/deciduous.db
user
deciduous add goal "Acme Corp Platform" -c 100 --prompt-stdin << 'EOF'
Original project goal: Build a B2B SaaS platform for inventory management.
Started January 2023. Team of 3 engineers initially.
Key constraints: Must integrate with existing ERP systems,
handle 10k concurrent users, SOC2 compliance required.
EOF
output
Created node #1: goal "Acme Corp Platform"
user
deciduous show 1
output
goal #1: Acme Corp Platform
Status: active | Confidence: 100%
Branch: main
Prompt: Original project goal: Build a B2B SaaS platform for inventory management.
Started January 2023. Team of 3 engineers initially.
Key constraints: Must integrate with existing ERP systems,
handle 10k concurrent users, SOC2 compliance required.

The root node captures everything known about the project's original intent—requirements, constraints, team size, timeline pressures.

The Root Node

Capture everything you know about the project's original intent. Requirements, constraints, team size, timeline pressures. This context will inform everything that branches from it.

Step 2: Mine the Git Log for Major Themes

Your git history is a goldmine. Watch how we extract patterns:

Mining Git History
user
git log --oneline --reverse | head -15
output
abc1234 Initial commit
def5678 Add user authentication
ghi9012 Implement JWT tokens
jkl3456 Add PostgreSQL schema
mno7890 First API endpoints
pqr1234 Add inventory model
stu5678 Implement ERP sync
vwx9012 Add multi-tenant support
yza3456 Webhook system
bcd7890 MongoDB to PostgreSQL migration
user
git branch -a | grep -E 'feature|release' | head -10
output
feature/auth-overhaul
feature/multi-tenant
feature/realtime-sync
feature/graphql-api
release/v1.0
release/v2.0
user
gh pr list --state merged --limit 20 --json title | jq '.[].title'
output
"Add multi-tenant support"
"Migrate to PostgreSQL from MongoDB"
"Implement webhook system"
"Add realtime inventory updates"
"GraphQL API for mobile clients"
"Auth overhaul - OAuth2 support"

From this mining, we can identify the major themes: authentication, database choices, multi-tenancy, API design, and integrations.
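A first pass over the themes can be semi-automated. The sketch below tallies commit subjects by keyword. The function name tally_themes and the keyword-to-theme mapping are illustrative assumptions, not part of deciduous, and will need tuning for your project.

```shell
# Tally commit subjects by theme. Reads one subject per line on stdin
# and prints "<count> <theme>" lines, most frequent theme first.
# The keyword lists are assumptions for the Acme example; edit them freely.
tally_themes() {
  awk '
    { line = tolower($0) }
    line ~ /auth|jwt|oauth/               { n["auth"]++ }
    line ~ /mongo|postgres|schema|migrat/ { n["database"]++ }
    line ~ /tenant/                       { n["multi-tenancy"]++ }
    line ~ /api|graphql|endpoint/         { n["api"]++ }
    line ~ /erp|webhook|sync/             { n["integrations"]++ }
    END { for (t in n) print n[t], t }
  ' | LC_ALL=C sort -k1,1nr -k2,2
}

# Typical use inside a repository:
#   git log --format=%s | tally_themes
```

A subject that matches several keyword groups is counted once per group, so treat the counts as a rough guide to where the history is dense rather than as exact figures.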

Step 3: Create Decision Points for Major Themes

Now we translate those themes into decision nodes:

Building the Theme Structure
user
deciduous add decision "Database architecture choice" -c 85
output
Created node #2: decision "Database architecture choice"
user
deciduous link 1 2 -r "Foundational infrastructure decision"
output
Created edge: #1 -> #2 (leads_to)
user
deciduous add option "MongoDB - document store" -c 70
output
Created node #3: option "MongoDB - document store"
user
deciduous add option "PostgreSQL - relational" -c 85
output
Created node #4: option "PostgreSQL - relational"
user
deciduous link 2 3 -r "Initially chosen for flexible schema"
user
deciduous link 2 4 -r "Better for complex queries, ACID compliance"
output
Created edge: #2 -> #3 (leads_to)
Created edge: #2 -> #4 (leads_to)
user
deciduous add action "Migrated from MongoDB to PostgreSQL" -c 90 \
  --commit bcd7890 --date "2023-06-15"
output
Created node #5: action "Migrated from MongoDB to PostgreSQL"
commit: bcd7890
date: 2023-06-15
user
deciduous link 2 5 --edge-type chosen \
  -r "Complex reporting needs, transaction requirements"
output
Created edge: #2 -> #5 (chosen)

We've created a decision point with options and linked it to the actual commit where the migration happened.
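When replaying many historical decisions, typing node IDs by hand gets error-prone. Here is a minimal sketch, assuming the "Created node #N: ..." output format shown in the transcript stays stable. The helper name node_id is hypothetical; it pulls the ID out of that line so later link commands can reuse it.

```shell
# Extract the node ID from deciduous output such as:
#   Created node #5: action "Migrated from MongoDB to PostgreSQL"
# Reads the command output on stdin and prints the first ID found.
node_id() {
  sed -n 's/^Created node #\([0-9][0-9]*\):.*/\1/p' | head -n 1
}

# Typical use (assumes the output format above):
#   decision=$(deciduous add decision "Database architecture choice" -c 85 | node_id)
#   deciduous link 1 "$decision" -r "Foundational infrastructure decision"
```

If the output format differs in your version, the function prints nothing, so check that the captured variable is non-empty before linking.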

Step 4: Mine Issues and PRs for Context

GitHub issues and PRs often contain the why that's missing from commits:

Extracting Context from Issues
user
gh issue list --state all --search "database postgres mongodb" --limit 10
output
#42 MongoDB performance issues with complex queries (closed)
#67 Investigate PostgreSQL migration (closed)
#89 Transaction support needed for inventory ops (closed)
user
gh issue view 67 --json body,comments
output
{
  "body": "We're hitting MongoDB limitations on reporting.",
  "comments": [
    {"body": "Tried aggregation pipeline, still too slow"},
    {"body": "PostgreSQL POC shows 10x improvement"}
  ]
}
user
deciduous add observation "MongoDB hit scaling limits on reporting" -c 90 \
  --prompt-stdin << 'EOF'
From issue #67: Complex inventory reports took 30+ seconds.
MongoDB aggregation pipeline couldn't optimize the queries.
PostgreSQL POC with proper indexes showed 10x improvement.
Decision point: migrate or live with slow reports.
EOF
output
Created node #6: observation "MongoDB hit scaling limits on reporting"
user
deciduous link 6 5 -r "This discovery triggered the migration decision"
output
Created edge: #6 -> #5 (leads_to)

Issues and PRs contain discussions that explain the reasoning. Capture these as observations with full context.
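Copying issue text into a prompt by hand works, but the raw discussion can also be flattened mechanically and then edited. Here is a sketch, assuming jq is installed and the JSON has the body and comments fields shown in the transcript. The function name issue_to_prompt is hypothetical.

```shell
# Flatten the JSON from `gh issue view N --json body,comments` (on stdin)
# into plain text: the issue body first, then each comment as a "- " bullet.
issue_to_prompt() {
  jq -r '.body, (.comments[] | "- " + .body)'
}

# Typical use; piping into --prompt-stdin is assumed to behave like the
# heredoc form, since both supply the prompt on standard input:
#   gh issue view 67 --json body,comments | issue_to_prompt \
#     | deciduous add observation "MongoDB hit scaling limits on reporting" -c 90 --prompt-stdin
```

A hand-written summary is usually better than the raw thread, so it may be worth saving the flattened text to a file and trimming it before logging.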

Step 5: Trace Evolutionary Patterns

Projects evolve. Look for patterns where early decisions were revisited:

Tracking Architecture Evolution
user
git log --oneline --all --grep="refactor\|migrate\|overhaul" | head -10
output
abc123 refactor: extract auth into microservice
def456 migrate: move to event-driven architecture
ghi789 overhaul: replace REST with GraphQL for mobile
jkl012 refactor: split monolith into services
analysis
This tells a story of evolution: Monolith -> Microservices -> Event-driven
user
deciduous add decision "System architecture approach" -c 80
output
Created node #10: decision "System architecture approach"
user
deciduous add option "Monolithic Rails app" -c 70
user
deciduous add option "Microservices" -c 75
user
deciduous add option "Event-driven architecture" -c 85
output
Created node #11: option "Monolithic Rails app"
Created node #12: option "Microservices"
Created node #13: option "Event-driven architecture"
user
deciduous link 10 11 -r "V1: Fast to build, single deployment"
user
deciduous link 11 12 -r "V2: Auth extracted due to scaling needs"
user
deciduous link 12 13 -r "V3: Async processing for ERP integrations"
output
Created edge: #10 -> #11 (leads_to)
Created edge: #11 -> #12 (leads_to)
Created edge: #12 -> #13 (leads_to)

The edges between options show the evolution path: V1 -> V2 -> V3, each transition driven by specific needs.
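Ordering matters when reconstructing an evolution path, and the default git log output is newest first and undated. The sketch below lists the same kind of "revisit" commits oldest first with their dates. The function name revisit_commits and the default keyword list are assumptions; the git options used are standard.

```shell
# List commits whose message mentions a revisit keyword, oldest first,
# as "<YYYY-MM-DD> <short-hash> <subject>". Pass your own pattern
# (extended regex) as the first argument to override the default.
revisit_commits() {
  git log --all --reverse -i -E --date=short --format='%ad %h %s' \
    --grep="${1:-refactor|migrate|overhaul}"
}

# Typical use inside a repository:
#   revisit_commits
#   revisit_commits 'rewrite|replace|deprecate'
```

The date column can be passed straight to --date when you create the matching nodes, which keeps the V1 -> V2 -> V3 chain in historical order.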

The Archaeology Checklist

A systematic approach to mining your project history:

Source              What to Look For                          Creates
Initial commits     Original tech stack, early architecture   decisions, options
Feature branches    Major initiatives and their scope         goals, actions
PR descriptions     Why changes were made                     observations, rationales
Issues/tickets      Problems encountered, discussions         observations, context
Release notes       Major milestones                          outcomes
README history      How the project description evolved       goal refinements
Dependency changes  Library swaps, version upgrades           decisions, actions
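For the dependency row of the checklist, git can narrow the history to commits that touched dependency manifests. Here is a minimal sketch; the function name dependency_history and the default manifest file names are assumptions, so pass the files your project actually uses.

```shell
# List commits that touched dependency manifests, oldest first,
# as "<YYYY-MM-DD> <short-hash> <subject>".
# With no arguments, a few common manifest names are tried (an assumption).
dependency_history() {
  if [ "$#" -eq 0 ]; then
    set -- Gemfile package.json Cargo.toml requirements.txt
  fi
  git log --reverse --date=short --format='%ad %h %s' -- "$@"
}

# Typical use inside a repository:
#   dependency_history
#   dependency_history go.mod
```

The same pattern covers the README row: git log --follow -p -- README.md shows how the project description changed over time.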

A Real Example: Deciduous Itself

Watch archaeology in action on the deciduous project:

Archaeology on Deciduous
user
git log --oneline --reverse | head -10
output
a1b2c3d Initial commit - decision graph CLI
e4f5g6h Add SQLite storage with Diesel
i7j8k9l Implement TUI with Ratatui
m0n1o2p Add web viewer with React
q3r4s5t Add deciduous serve command
u6v7w8x Export to DOT format
user
deciduous add goal "Decision graph tooling for AI development" -c 95 \
  --prompt-stdin << 'EOF'
Build a tool that captures decisions made during AI-assisted development.
Core problem: LLM context loss destroys institutional knowledge.
Solution: Persistent decision graph that survives sessions.
Must be: CLI-native, fast, integrates with Claude Code.
EOF
output
Created node #1: goal "Decision graph tooling for AI development"
user
deciduous add decision "Data storage approach" -c 90
user
deciduous add option "Plain JSON files" -c 50
user
deciduous add option "SQLite with Diesel ORM" -c 90
output
Created node #2: decision "Data storage approach"
Created node #3: option "Plain JSON files"
Created node #4: option "SQLite with Diesel ORM"
user
deciduous link 1 2 -r "Foundation decision"
user
deciduous link 2 3 -r "Simple but no querying capability"
user
deciduous link 2 4 --edge-type chosen \
  -r "Query support needed for /recover, branch filtering"
output
Created edge: #1 -> #2 (leads_to)
Created edge: #2 -> #3 (leads_to)
Created edge: #2 -> #4 (chosen)
user
deciduous nodes
output
ID   TYPE      STATUS  TITLE
#1   goal      active  Decision graph tooling for AI development
#2   decision  active  Data storage approach
#3   option    active  Plain JSON files
#4   option    active  SQLite with Diesel ORM

In just a few commands, we've reconstructed the foundational decisions behind deciduous itself.

When to Do Archaeology

Archaeology vs Forward Logging

Aspect           Archaeology                    Forward Logging
Root nodes       Usually one (the project)      Many (each feature/session)
Detail level     High-level themes              Granular steps
Prompts          Reconstructed from issues/PRs  Captured verbatim in real time
Accuracy         Best-effort reconstruction     Ground truth
Time investment  One-time deep dive             Continuous small logs

The Hybrid Approach

Most projects benefit from both: Do archaeology once to capture historical context, then switch to forward logging for all new work. The graph becomes a living document that spans your project's entire history.

Tips for Effective Archaeology

  1. Start with major themes, not details. Don't try to capture every commit. Focus on the big decisions first.
  2. Use commit hashes liberally. Link nodes to commits with --commit for traceability.
  3. Backdate nodes to when decisions were made. Use --date "2023-06-15" to place nodes at their historical point in time. Supports YYYY-MM-DD, YYYY-MM-DD HH:MM:SS, and RFC3339 formats.
  4. Capture uncertainty. If you're not sure why something was done, say so: -c 50 for low confidence.
  5. Interview people. If the original developers are available, ask them. Capture their answers as prompts.
  6. Don't over-connect. Some decisions are independent. Not everything needs to link to everything.
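Tips 2 and 3 combine naturally: once you know the commit, git can supply the date. Here is a small sketch, where commit_date is a hypothetical helper name. It prints the author date in the YYYY-MM-DD form listed above as supported.

```shell
# Print the author date of a commit as YYYY-MM-DD.
# Accepts anything git can resolve to a commit (hash, tag, HEAD, ...).
commit_date() {
  git show -s --date=short --format=%ad "$1"
}

# Typical use, reusing the flags shown earlier in this guide:
#   deciduous add action "Migrated from MongoDB to PostgreSQL" -c 90 \
#     --commit bcd7890 --date "$(commit_date bcd7890)"
```

The commit date marks when the change landed, which can be later than when the decision was made, so lower the confidence or adjust the date if issues or PRs show the decision came earlier.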