Knowledge Graph Documentation Index¶
Welcome to the knowledge graph system! This directory contains everything you need to maintain, query, and visualize your textbook's knowledge structure.
Quick Links¶
New to the system? Start here:
- TUTORIAL.md — Complete walkthrough covering setup, queries, validation, and visualization
Need specific documentation?
- README.md — Overview, ID conventions, update workflow, API reference
- graph.yaml — The knowledge graph itself (human-editable source of truth)
- schema.yaml — Schema definition for nodes and edges
Tools Reference¶
Core Tools¶
| Tool | Purpose | Usage |
|---|---|---|
| kg_tools.py | Query library with 20+ methods | python kg_tools.py or from kg_tools import KnowledgeGraph |
| validate_kg.py | Validation and consistency checking | python validate_kg.py or make validate |
| visualize_kg.py | Generate dependency graphs (PNG) | python visualize_kg.py --chapter CH-1 --output ch01.png |
| example_queries.py | Working examples of common tasks | python example_queries.py or make examples |
| Makefile | Quick command shortcuts | make stats, make validate, make visualize |
Example Commands¶
# Show graph statistics
make stats
# Validate before committing
make validate
# Generate visualizations
make visualize
# Run example queries
make examples
# Quick setup help
make help
Common Tasks¶
Planning a New Chapter¶
from kg_tools import KnowledgeGraph
kg = KnowledgeGraph("graph.yaml")
# What needs to be done first?
blockers = kg.find_blockers("CH-12")
# What does it depend on?
deps = kg.transitive_dependencies("CH-12")
# What's the full scope?
summary = kg.chapter_summary("CH-12")
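As a rough illustration of what the planning queries above compute, the sketch below walks depends_on edges with a breadth-first search and treats any dependency that is not yet complete as a blocker. It does not use kg_tools.py: the sample data, the edge direction, and this definition of a blocker are assumptions made for illustration, and the real methods may behave differently.

```python
from collections import deque

# Hypothetical stand-in for graph.yaml: node ID -> status, plus
# (source, target) pairs meaning "source depends_on target".
STATUS = {
    "CH-1": "complete",
    "CH-5": "in_progress",
    "CH-11": "planned",
    "CH-12": "planned",
}
DEPENDS_ON = [("CH-12", "CH-11"), ("CH-11", "CH-5"), ("CH-5", "CH-1")]


def transitive_dependencies(node_id, edges):
    """Return every node reachable from node_id along depends_on edges."""
    direct = {}
    for src, dst in edges:
        direct.setdefault(src, []).append(dst)
    seen = set()
    queue = deque([node_id])
    while queue:
        current = queue.popleft()
        for dep in direct.get(current, []):
            if dep not in seen:  # visit each dependency only once
                seen.add(dep)
                queue.append(dep)
    return seen


def find_blockers(node_id, edges, status):
    """Return the dependencies of node_id that are not yet complete."""
    deps = transitive_dependencies(node_id, edges)
    return sorted(d for d in deps if status.get(d) != "complete")


print(sorted(transitive_dependencies("CH-12", DEPENDS_ON)))
print(find_blockers("CH-12", DEPENDS_ON, STATUS))
```

Here CH-12 depends on three chapters in total, and the two that are still planned or in progress are reported as blockers.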
Checking Test Coverage¶
# Which equations need tests?
untested = kg.find_untested_equations()
# Overall coverage report
coverage = kg.coverage_report()
# Implementation status
impl_status = kg.implementation_status()
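To make the idea of test coverage concrete, here is a minimal sketch that treats an equation as tested when at least one tested_by edge starts from it. The sample nodes, the edge direction (equation to test), and the helper names are assumptions; the actual coverage_report() output format is not shown in this index.

```python
# Hypothetical sample data; the tested_by direction (equation -> test)
# is an assumption, not taken from schema.yaml.
NODES = {
    "EQ-1.1": "equation",
    "EQ-1.2": "equation",
    "EQ-2.1": "equation",
    "TEST-tests/test_ch01.py": "test",
}
EDGES = [
    ("EQ-1.1", "TEST-tests/test_ch01.py", "tested_by"),
    ("EQ-1.2", "TEST-tests/test_ch01.py", "tested_by"),
]


def find_untested_equations(nodes, edges):
    """Return equation IDs that have no outgoing tested_by edge."""
    tested = {src for src, _dst, edge_type in edges if edge_type == "tested_by"}
    return sorted(
        node_id
        for node_id, kind in nodes.items()
        if kind == "equation" and node_id not in tested
    )


def equation_coverage(nodes, edges):
    """Return the fraction of equations that have at least one test."""
    equations = [n for n, kind in nodes.items() if kind == "equation"]
    if not equations:
        return 1.0  # nothing to test counts as fully covered
    untested = find_untested_equations(nodes, edges)
    return (len(equations) - len(untested)) / len(equations)


print(find_untested_equations(NODES, EDGES))
print(f"{equation_coverage(NODES, EDGES):.0%}")
```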
Before Committing Changes to graph.yaml¶
# 1. Validate the graph
python validate_kg.py
# 2. Check statistics
python kg_tools.py
# 3. Generate visualizations (optional)
python visualize_kg.py --chapter CH-1 --output ch01.png
# 4. Commit
git add graph.yaml ch01.png # Include visualization
git commit -m "docs: update knowledge graph"
Generating Visualizations¶
# Single chapter with dependencies
python visualize_kg.py --chapter CH-1 --depth 2 --output ch01.png
# Show what blocks a chapter
python visualize_kg.py --deps CH-11 --output ch11_deps.png
# Implementation map (modules → equations)
python visualize_kg.py --impl-map --output impl.png
File Organization¶
docs/knowledge_graph/
├── INDEX.md # This file
├── TUTORIAL.md # Complete tutorial (start here)
├── README.md # Detailed documentation
├── graph.yaml # Knowledge graph (source of truth)
├── schema.yaml # Schema definition
├── kg_tools.py # Query library (620 lines)
├── validate_kg.py # Validator (340 lines)
├── visualize_kg.py # Visualizer (440 lines)
├── example_queries.py # Usage examples
└── Makefile # Command shortcuts
Key Concepts¶
Node Types¶
- chapter (CH-X) — Major sections of the textbook
- equation (EQ-X.Y) — Mathematical formulas
- theorem (THM-X.Y.Z) — Mathematical theorems
- definition (DEF-X.Y.Z) — Concept definitions
- concept (CN-slug) — Abstract ideas
- algorithm — Named algorithms
- module (MOD-path.to.module) — Code modules
- test (TEST-path) — Test files
- plan (PLN-slug) — Research plans
- doc — Documentation
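The ID prefixes listed above lend themselves to a simple pattern check. The sketch below guesses a node's kind from its ID. The regular expressions are assumptions (for example, that X, Y, and Z are integers and that slugs are lowercase); README.md remains the authority on ID conventions. The algorithm and doc kinds are omitted because no prefix is listed for them here.

```python
import re

# Assumed patterns derived from the prefixes in the list above.
ID_PATTERNS = {
    "chapter": re.compile(r"CH-\d+"),
    "equation": re.compile(r"EQ-\d+\.\d+"),
    "theorem": re.compile(r"THM-\d+\.\d+\.\d+"),
    "definition": re.compile(r"DEF-\d+\.\d+\.\d+"),
    "concept": re.compile(r"CN-[a-z0-9_-]+"),
    "module": re.compile(r"MOD-[A-Za-z_][\w.]*"),
    "test": re.compile(r"TEST-\S+"),
    "plan": re.compile(r"PLN-[a-z0-9_-]+"),
}


def kind_from_id(node_id):
    """Return the kind whose pattern matches the whole ID, or None."""
    for kind, pattern in ID_PATTERNS.items():
        if pattern.fullmatch(node_id):
            return kind
    return None


for candidate in ["CH-1", "EQ-3.2", "THM-1.2.3", "MOD-path.to.module", "ch-1"]:
    print(candidate, "->", kind_from_id(candidate))
```

Because the match is case-sensitive, a lowercase ID such as ch-1 is not recognized.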
Edge Types¶
- defines — Chapter defines an equation/theorem/concept
- proves — Chapter proves a theorem
- uses — Module uses another module
- implements — Module implements an equation/algorithm
- tested_by — Test validates an equation/module
- depends_on — Node depends on another
- refers_to_future — Forward hook to planned content
- superseded_by — Archived node replaced by new one
Status Values¶
- planned — Placeholder, content pending
- in_progress — Partial content/code exists
- complete — Fully implemented and documented
- archived — Superseded by newer content
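The three vocabularies above (node kinds, edge types, status values) can be checked mechanically. The following sketch flags any value outside those lists. The field names id, kind, status, src, dst, and type are hypothetical; schema.yaml defines the real layout.

```python
NODE_KINDS = {
    "chapter", "equation", "theorem", "definition", "concept",
    "algorithm", "module", "test", "plan", "doc",
}
EDGE_TYPES = {
    "defines", "proves", "uses", "implements", "tested_by",
    "depends_on", "refers_to_future", "superseded_by",
}
STATUSES = {"planned", "in_progress", "complete", "archived"}


def schema_errors(nodes, edges):
    """Return a list of human-readable schema problems (empty if none)."""
    errors = []
    for node in nodes:
        if node.get("kind") not in NODE_KINDS:
            errors.append(f"{node.get('id')}: unknown kind {node.get('kind')!r}")
        if node.get("status") not in STATUSES:
            errors.append(f"{node.get('id')}: unknown status {node.get('status')!r}")
    for edge in edges:
        if edge.get("type") not in EDGE_TYPES:
            errors.append(
                f"{edge.get('src')} -> {edge.get('dst')}: "
                f"unknown edge type {edge.get('type')!r}"
            )
    return errors


# Hypothetical records with one bad status and one bad edge type.
SAMPLE_NODES = [
    {"id": "CH-1", "kind": "chapter", "status": "complete"},
    {"id": "EQ-1.1", "kind": "equation", "status": "done"},
]
SAMPLE_EDGES = [
    {"src": "CH-1", "dst": "EQ-1.1", "type": "defines"},
    {"src": "CH-1", "dst": "EQ-1.1", "type": "explains"},
]

for problem in schema_errors(SAMPLE_NODES, SAMPLE_EDGES):
    print(problem)
```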
API Quick Reference¶
Loading the Graph¶
from kg_tools import KnowledgeGraph
kg = KnowledgeGraph("graph.yaml")
Finding Things¶
equations = kg.nodes_by_kind('equation')
completed = kg.nodes_by_status('complete')
data = kg.get_node_data('CH-1')
Analysis Queries¶
untested = kg.find_untested_equations()
unimpl = kg.find_unimplemented_equations()
blockers = kg.find_blockers('CH-11')
deps = kg.transitive_dependencies('CH-11')
dependents = kg.transitive_dependents('CH-1')
Validation¶
missing = kg.find_missing_refs()
orphans = kg.find_orphan_nodes()
cycles = kg.find_cycles()
coverage = kg.coverage_report()
status = kg.status_summary()
impl_status = kg.implementation_status()
stats = kg.export_stats()
Visualization¶
from visualize_kg import KGVisualizer
viz = KGVisualizer(kg)
viz.visualize_chapter('CH-1', output_path='ch01.png')
viz.visualize_dependencies('CH-11', output_path='deps.png')
viz.visualize_implementation_map(output_path='impl.png')
Validation Checks¶
The validator (validate_kg.py) automatically checks:
- ✅ Referential integrity — No dangling references
- ✅ File existence — All referenced files exist
- ✅ Anchor presence — Declared anchors are present in the referenced files
- ✅ Status consistency — Logical status transitions
- ✅ Circular dependencies — No cycles detected
- ✅ Orphan nodes — Isolated nodes flagged
- ✅ Coverage gaps — Untested equations identified
- ✅ Schema compliance — Valid kinds and edge types
Always run make validate before committing changes to graph.yaml.
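Two of the checks above, referential integrity and orphan detection, are simple enough to sketch in a few lines. The version below is a simplified stand-in rather than the code in validate_kg.py. The sample IDs are invented, and the choice to fail on dangling references while only reporting orphans is an assumption.

```python
# Hypothetical sample data: EQ-1.9 is referenced but never declared,
# and CN-entropy is declared but never connected.
NODE_IDS = ["CH-1", "EQ-1.1", "CN-entropy"]
EDGES = [("CH-1", "EQ-1.1"), ("CH-1", "EQ-1.9")]


def find_missing_refs(node_ids, edges):
    """Return IDs used as an edge endpoint but not declared as a node."""
    known = set(node_ids)
    return sorted({end for edge in edges for end in edge if end not in known})


def find_orphan_nodes(node_ids, edges):
    """Return declared nodes that appear in no edge at all."""
    connected = {end for edge in edges for end in edge}
    return sorted(set(node_ids) - connected)


missing = find_missing_refs(NODE_IDS, EDGES)
orphans = find_orphan_nodes(NODE_IDS, EDGES)

# Assumed policy: dangling references fail the run, orphans only warn.
exit_code = 1 if missing else 0

print("missing refs:", missing)
print("orphans:", orphans)
print("exit code:", exit_code)
```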
Architecture¶
The knowledge graph works by:
- Loading — kg_tools.py reads graph.yaml into a NetworkX MultiDiGraph
- Querying — Methods like transitive_dependencies() use graph algorithms
- Validating — validate_kg.py checks consistency using NetworkX operations
- Visualizing — visualize_kg.py generates PNG graphs using matplotlib
The in-memory graph is ephemeral: the NetworkX graph is created fresh each time you load the YAML. This keeps your source of truth (YAML) simple and Git-friendly.
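The loading step can be pictured with a small dependency-free sketch. It builds an adjacency structure from an already-parsed mapping (standing in for the result of reading graph.yaml) and keeps parallel edges between the same pair of nodes, which is the property a multigraph provides. The keys nodes, edges, src, dst, and type are assumptions about the file layout, and the real loader uses NetworkX rather than plain dictionaries.

```python
from collections import defaultdict

# Stand-in for the parsed contents of graph.yaml (layout is assumed).
PARSED = {
    "nodes": [
        {"id": "CH-1", "kind": "chapter"},
        {"id": "THM-1.1.1", "kind": "theorem"},
    ],
    "edges": [
        # Two edges between the same pair, distinguished by type.
        {"src": "CH-1", "dst": "THM-1.1.1", "type": "defines"},
        {"src": "CH-1", "dst": "THM-1.1.1", "type": "proves"},
    ],
}


def build_graph(parsed):
    """Build a fresh (nodes, out_edges) pair from parsed graph data."""
    nodes = {node["id"]: dict(node) for node in parsed["nodes"]}
    out_edges = defaultdict(list)
    for edge in parsed["edges"]:
        # Keep every edge, even if the same pair is already connected.
        out_edges[edge["src"]].append((edge["dst"], edge["type"]))
    return nodes, dict(out_edges)


nodes, out_edges = build_graph(PARSED)
print(out_edges["CH-1"])
```

Each call returns new objects, so edits to one in-memory graph never leak into the next load.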
Troubleshooting¶
"Module not found: yaml"¶
source .venv/bin/activate
pip install pyyaml networkx matplotlib
"File not found" in validation¶
Check that your graph.yaml has correct file paths relative to the repo root.
Visualization produces empty graph¶
Make sure the node ID exists (e.g., CH-1 not ch-1). Node IDs are case-sensitive.
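One way to catch a casing mistake before generating an image is to compare the requested ID against the known IDs case-insensitively. The helper below is hypothetical and not part of visualize_kg.py.

```python
def suggest_id(requested, known_ids):
    """Return the exact ID, a case-corrected match, or None."""
    if requested in known_ids:
        return requested
    # Map lowercase forms back to the canonical spelling.
    by_lowercase = {node_id.lower(): node_id for node_id in known_ids}
    return by_lowercase.get(requested.lower())


KNOWN_IDS = ["CH-1", "CH-11", "EQ-1.1"]  # hypothetical sample IDs

print(suggest_id("ch-1", KNOWN_IDS))
print(suggest_id("CH-99", KNOWN_IDS))
```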
"Circular dependency detected"¶
Run kg.find_cycles() to see the exact cycle, then restructure your edges to break it.
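For intuition about what a reported cycle looks like, here is a minimal depth-first search that returns one cycle as a list of node IDs. It is a sketch on invented data, not the implementation behind kg.find_cycles(), and its return format is an assumption.

```python
def find_cycle(edges):
    """Return one cycle as [a, b, ..., a], or None if the graph is acyclic."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}
    path = []  # nodes on the current DFS branch, in order

    def visit(node):
        state[node] = in_progress
        path.append(node)
        for nxt in graph[node]:
            if state[nxt] == in_progress:
                # Back edge: the cycle runs from nxt to node and back.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == unvisited:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = done
        return None

    for node in graph:
        if state[node] == unvisited:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical depends_on edges containing a three-chapter loop.
CYCLIC = [("CH-2", "CH-3"), ("CH-3", "CH-4"), ("CH-4", "CH-2")]
print(find_cycle(CYCLIC))
```

Removing or reversing any single edge on the reported path breaks the loop. The recursion here is fine for small graphs, though a very deep dependency chain would call for an iterative version.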
Contributing¶
When you modify graph.yaml:
- Update nodes/edges with new content
- Run python validate_kg.py (exit 0 = success)
- Run python kg_tools.py to check statistics
- Generate visualizations if topology changes significantly
- Commit both graph.yaml and visualization PNGs
Resources¶
- NetworkX Documentation — Graph algorithms reference
- YAML Syntax — YAML format reference
- Matplotlib Documentation — Visualization reference
Next Steps¶
- Start with TUTORIAL.md if you're new to the system
- Run make examples to see the tools in action
- Try python validate_kg.py to check your graph
- Modify graph.yaml with your new content
- Query the graph from Python, using the examples as a starting point
Questions? Check TUTORIAL.md for detailed explanations and practical examples.
Happy graphing! 🎉