Knowledge Graph & Knowledge Base Tools for Markdown Files
Context: Solanasis has 457+ markdown files in an Obsidian vault (394 with frontmatter). The vault contains playbooks, meeting notes, client files, research, and operational docs. It’s getting hard to navigate and query effectively.
1. Obsidian-Native Solutions (Start Here)
Smart Connections Plugin + MCP Server
- What it does: Generates vector embeddings for every note, enables semantic search (find by meaning, not just keywords)
- MCP integration: There’s a dedicated Smart Connections MCP server that exposes your vault to Claude Code via semantic search
- Key benefit: No re-indexing required — reuses existing Obsidian Smart Connections embeddings
- Uses: TaylorAI/bge-micro-v2 model, 384-dimensional vectors, cosine similarity
- Setup: Point Claude Desktop’s MCP config at your vault path; the server auto-starts
- Verdict: Best first step. Install Smart Connections plugin + MCP server = instant semantic search from Claude Code
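To make "find by meaning" concrete, here is a minimal sketch of cosine-similarity ranking over embeddings. It is not the plugin's code: the note names and the toy 3-dimensional vectors are made up for illustration (real vectors would have 384 dimensions), and `rank_notes` is a hypothetical helper.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)

def rank_notes(query_vec, note_vecs, top_k=3):
    """Return (note_name, score) pairs sorted by descending similarity."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in note_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical notes; toy vectors stand in for real 384-dimensional embeddings.
notes = {
    "sales-playbook.md": [0.9, 0.1, 0.0],
    "meeting-notes.md": [0.0, 1.0, 0.0],
    "pricing-research.md": [0.7, 0.0, 0.7],
}
results = rank_notes([1.0, 0.0, 0.0], notes, top_k=2)
```

Because the score depends only on vector direction, a long note and a short note on the same topic can still rank close together.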
Dataview Plugin
- What it does: SQL-like queries across your vault using frontmatter metadata
- Example: TABLE file.name, tags FROM "playbooks" WHERE contains(tags, "sales")
- Best for: Structured queries when you have good frontmatter discipline
- Current state: 394/457 files already have frontmatter — good foundation
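For intuition about what a query like the example does, here is a rough Python equivalent: read each note's frontmatter, keep notes under a folder, and filter on a tag. This is a sketch, not how Dataview works internally. The parser is deliberately simplified (scalars and inline lists only, not full YAML), and the file names in `vault` are hypothetical.

```python
def parse_frontmatter(text):
    """Parse a leading '---' block of simple 'key: value' lines.

    Simplified on purpose: handles scalars and inline lists like [a, b],
    not full YAML (no nested maps, no multi-line lists).
    Returns None when the note has no complete frontmatter block.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'key: value'
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        meta[key.strip()] = value
    return None  # opening '---' without a closing one

def query_by_tag(notes, folder, tag):
    """Return (path, tags) rows for notes under folder whose tags include tag."""
    rows = []
    for path, text in notes.items():
        if not path.startswith(folder + "/"):
            continue
        meta = parse_frontmatter(text) or {}
        tags = meta.get("tags", [])
        if isinstance(tags, list) and tag in tags:
            rows.append((path, tags))
    return sorted(rows)

# Hypothetical vault contents, held in memory to keep the sketch self-contained.
vault = {
    "playbooks/cold-outreach.md": "---\ntitle: Cold Outreach\ntags: [sales, outreach]\n---\nBody",
    "playbooks/onboarding.md": "---\ntitle: Onboarding\ntags: [operations]\n---\nBody",
    "playbooks/untagged.md": "No frontmatter here",
    "reference/pricing.md": "---\ntitle: Pricing\ntags: [sales]\n---\nBody",
}
matches = query_by_tag(vault, "playbooks", "sales")
with_frontmatter = sum(1 for text in vault.values() if parse_frontmatter(text) is not None)
```

The same parser gives a quick frontmatter coverage count (`with_frontmatter`), which is how a figure like 394/457 could be re-checked over time.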
Graph View + Graph Analysis Plugin
- What it does: Visual knowledge graph showing note connections via links
- Limitation: Only shows explicit [[links]], not semantic relationships
- Best for: Understanding document structure, finding orphan notes
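Orphan detection can also be scripted outside Obsidian. The sketch below assumes an orphan is a note with no inbound or outbound links to another existing note; the note names are hypothetical, and the regex covers only plain, aliased, and heading wikilinks.

```python
import re

# Captures the target of [[Note]], [[Note|alias]] and [[Note#Heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def extract_links(text):
    """Return the set of note names targeted by wikilinks in text."""
    return {match.strip() for match in WIKILINK.findall(text)}

def find_orphans(notes):
    """notes maps note name (without .md) to its body text.

    A note counts as connected if it links to, or is linked from,
    another note that exists in the vault. Everything else is an orphan.
    """
    connected = set()
    for name, body in notes.items():
        for target in extract_links(body):
            if target in notes and target != name:
                connected.add(name)
                connected.add(target)
    return sorted(set(notes) - connected)

# Hypothetical notes for illustration.
notes = {
    "MOC - Sales": "See [[Cold Outreach|outreach]] and [[Pricing#Tiers]]",
    "Cold Outreach": "Plain text, no links",
    "Pricing": "Back to [[MOC - Sales]]",
    "Scratchpad": "No links here",
    "Dangling": "Points to [[Missing Note]]",
}
orphans = find_orphans(notes)
```

Links to notes that do not exist (like `Missing Note`) do not count as connections here, so `Dangling` is reported alongside `Scratchpad`.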
Obsidian Copilot / AI Plugins
- What they do: In-vault AI chat that uses your notes as context
- Options: Copilot (uses OpenAI/local models), Smart Connections chat mode
- Best for: Asking questions about your own notes
Linter Plugin
- What it does: Enforces consistent YAML frontmatter formatting
- Best for: Standardizing metadata across all 457 files (cleanup tool)
2. MCP-Based Solutions (Best for Claude Code Integration)
markdown-vault-mcp
- What it does: Full MCP server for indexing, searching, and managing markdown collections
- Source: github.com/pvliesdonk/markdown-vault-mcp
- Search modes:
  - FTS5 (Full-Text Search): SQLite-based keyword search with BM25 scoring + porter stemming
  - Semantic search: Vector embeddings via FastEmbed (local), Ollama (local/remote), or OpenAI
  - Hybrid search: Combines both using Reciprocal Rank Fusion for best results
- Additional features:
  - Frontmatter-aware indexing with required field enforcement
  - Incremental indexing (hash-based change detection; only modified files are reprocessed)
  - Git integration with auto-commits
  - Backlink detection, broken link identification, connection mapping
  - Read, write, and edit tools over MCP
- Setup: Install via PyPI, Docker, or from source. Set MARKDOWN_VAULT_MCP_SOURCE_DIR to the vault path
- Verdict: Most full-featured option. Best if you want read+write+search all through Claude Code
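Reciprocal Rank Fusion is simple enough to sketch: each document earns 1 / (k + rank) from every ranking it appears in, and the sums decide the final order. The code below shows the general technique, not markdown-vault-mcp's implementation. The constant k=60 is the commonly cited default and is an assumption here, as are the file names.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common default; assumed, not taken
       from any specific tool's configuration).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by name for stable output.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))

# Hypothetical results from a keyword search and a semantic search.
keyword_hits = ["a.md", "b.md", "c.md"]
semantic_hits = ["b.md", "d.md", "a.md"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Only ranks are used, never raw scores, which is why BM25 scores and cosine similarities can be combined without normalizing them against each other.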
obsidian-mcp-tools
- Source: github.com/jacksteamdev/obsidian-mcp-tools
- What it does: Obsidian integrations including semantic search + custom Templater prompts for Claude
- Best for: Deep Obsidian integration with Claude
3. RAG-Based Solutions (For Deep Querying)
Khoj (Self-Hosted)
- What it does: AI personal assistant that indexes your markdown files for chat-based querying
- Supports: Markdown, org-mode, PDF, images
- Self-hosted: Yes, runs locally
- Best for: Chat interface over your knowledge base
PrivateGPT / LocalGPT
- What they do: Local RAG pipelines — ingest docs, create embeddings, chat with them
- Best for: Privacy-sensitive document querying
- Downside: More setup, less Obsidian-specific
Vector Database + Custom Embeddings
- Architecture: Embed all 457 docs → store in ChromaDB/Pinecone/Qdrant → query via API
- Pro: Maximum flexibility, works with any model
- Con: Most engineering effort, need to maintain embedding pipeline
- When to use: Only if MCP-based solutions don’t meet needs
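Much of the maintenance cost in a custom pipeline is knowing which files need re-embedding. One common approach is the hash-based change detection mentioned for markdown-vault-mcp above. The sketch below is generic and assumes file contents are already loaded into a dict; `plan_reindex` and the file names are hypothetical.

```python
import hashlib

def content_hash(text):
    """Stable fingerprint of a note's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def plan_reindex(current_files, stored_hashes):
    """Decide which notes to (re-)embed and which to drop from the index.

    current_files: path -> current text
    stored_hashes: path -> hash recorded at the last indexing run
    Returns (to_embed, to_delete, new_hashes).
    """
    new_hashes = {path: content_hash(text) for path, text in current_files.items()}
    to_embed = sorted(path for path, digest in new_hashes.items()
                      if stored_hashes.get(path) != digest)
    to_delete = sorted(path for path in stored_hashes if path not in new_hashes)
    return to_embed, to_delete, new_hashes

# First run: nothing stored yet, so every file is embedded.
first = plan_reindex({"a.md": "alpha", "b.md": "beta"}, {})
# Second run: b.md was edited and c.md is new; a.md is skipped.
second = plan_reindex({"a.md": "alpha", "b.md": "beta edited", "c.md": "gamma"}, first[2])
# Third run: b.md and c.md were removed from the vault.
third = plan_reindex({"a.md": "alpha"}, second[2])
```

In a real pipeline, `to_embed` would feed the embedding model and `to_delete` would remove stale vectors from the store.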
4. Alternative Knowledge Management Platforms
Logseq
- What it does: Obsidian alternative with built-in knowledge graph, works on markdown
- Key difference: Outliner-first (block-based), not document-first
- Bidirectional sync: No native sync with Obsidian, but both use markdown files
Dendron
- What it does: Hierarchical knowledge management on top of VS Code
- Key benefit: Better for large vaults with schema-based organization
- Status: In maintenance mode; active development wound down in 2023
Foam
- What it does: VS Code-based personal knowledge management (inspired by Roam)
- Works on: Standard markdown files with wikilinks
5. Organizational Best Practices (Do This Regardless)
Frontmatter Standardization
Your vault already has 394/457 files with frontmatter. Standardize on these fields:
---
title: Document Title
created: YYYY-MM-DD
updated: YYYY-MM-DD
type: playbook | reference | meeting-note | client-file | daily-note | outreach
tags: [tag1, tag2, tag3]
status: active | draft | archived | superseded
related: "[[Other Note]]"
---
Pick one value each for type and status. Quote wikilinks in frontmatter so YAML does not read the brackets as a nested list.
Pro tip: Use Obsidian’s Linter plugin to enforce these fields across all files. Frontmatter tags (without #) are preferred over inline tags for vault-level classification.
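A small validator can report which notes drift from this schema. The sketch below checks already-parsed frontmatter (a plain dict). Treating every field except `related` as required is an assumption, and the sample values are made up.

```python
import re

# Assumption: every field except 'related' is mandatory.
REQUIRED_FIELDS = ("title", "created", "updated", "type", "tags", "status")
ALLOWED_TYPES = {"playbook", "reference", "meeting-note",
                 "client-file", "daily-note", "outreach"}
ALLOWED_STATUSES = {"active", "draft", "archived", "superseded"}
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def validate_frontmatter(meta):
    """Return a list of human-readable problems; empty means the note passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in meta:
            problems.append(f"missing field: {field}")
    if "type" in meta and meta["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type: {meta['type']}")
    if "status" in meta and meta["status"] not in ALLOWED_STATUSES:
        problems.append(f"unknown status: {meta['status']}")
    for field in ("created", "updated"):
        if field in meta and not DATE_PATTERN.match(str(meta[field])):
            problems.append(f"{field} is not YYYY-MM-DD: {meta[field]}")
    if "tags" in meta and not isinstance(meta["tags"], list):
        problems.append("tags must be a list")
    return problems

# Hypothetical examples: one compliant note, one that needs cleanup.
good = {"title": "Cold Outreach", "created": "2025-01-15", "updated": "2025-02-01",
        "type": "playbook", "tags": ["sales"], "status": "active"}
bad = {"title": "Notes", "created": "Jan 15", "type": "memo",
       "tags": "sales", "status": "active"}
good_problems = validate_frontmatter(good)
bad_problems = validate_frontmatter(bad)
```

Running this across the vault gives a cleanup list to work through before relying on metadata-driven queries.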
MOC (Maps of Content) Pattern
- Create “hub” notes that link to related documents by topic
- Example: MOC - Sales Playbooks.md, which links to all sales-related playbooks
- These become your navigation layer on top of the flat file structure
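Hub notes can be generated rather than hand-maintained. Here is a minimal sketch that renders a MOC body from a list of note names. The helper `build_moc`, the layout of the generated note, and the playbook names are all illustrative assumptions.

```python
def build_moc(title, notes):
    """Render a hub note that links to each note in notes.

    notes: list of (note_name, one_line_summary); summary may be empty.
    Returns the note body as a string, with links sorted by name.
    """
    lines = [f"# {title}", ""]
    for name, summary in sorted(notes):
        if summary:
            lines.append(f"- [[{name}]]: {summary}")
        else:
            lines.append(f"- [[{name}]]")
    return "\n".join(lines) + "\n"

# Hypothetical playbooks for illustration.
moc_text = build_moc(
    "MOC - Sales Playbooks",
    [
        ("Discovery Call Playbook", "first-call script"),
        ("Cold Outreach Playbook", ""),
    ],
)
```

The list of notes could come from a tag or folder filter, so the hub stays in sync when it is regenerated.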
PARA-Inspired Organization
You’re already mostly doing this:
- Projects = solanasis-client-files/, active playbooks
- Areas = playbooks/, operations/, content-creation/
- Resources = reference/, misc/, ai-training/
- Archives = archive/
Tagging Taxonomy
Based on vault scan, current tags are minimal. Suggested taxonomy:
- Type tags: #playbook, #reference, #meeting, #outreach, #client
- Domain tags: #sales, #operations, #marketing, #legal, #technical
- Status tags: #active, #draft, #archived
- Vertical tags: #ria, #nonprofit, #smb, #wealth-management
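A taxonomy only helps if tags outside it get caught. The sketch below groups a note's tags by category and reports anything unrecognized. It encodes the suggested taxonomy above; normalizing by stripping `#` and lowercasing is an assumption about how you would want tags compared.

```python
TAXONOMY = {
    "type": {"playbook", "reference", "meeting", "outreach", "client"},
    "domain": {"sales", "operations", "marketing", "legal", "technical"},
    "status": {"active", "draft", "archived"},
    "vertical": {"ria", "nonprofit", "smb", "wealth-management"},
}

def classify_tags(tags):
    """Group tags by taxonomy category and collect unknown ones.

    Tags are normalized first: surrounding whitespace and a leading '#'
    are dropped, and the rest is lowercased.
    Returns (grouped, unknown).
    """
    grouped = {category: [] for category in TAXONOMY}
    unknown = []
    for raw in tags:
        tag = raw.strip().lstrip("#").lower()
        for category, allowed in TAXONOMY.items():
            if tag in allowed:
                grouped[category].append(tag)
                break
        else:
            unknown.append(tag)  # no category matched
    return grouped, unknown

grouped, unknown = classify_tags(["#Playbook", "sales", "ria", "#misc"])
```

Unknown tags are either typos to fix or candidates for extending the taxonomy.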
Top 3 Recommendations
Immediate (This Week)
- Install Smart Connections plugin + MCP server — Instant semantic search from Claude Code with zero re-indexing. This is the highest-ROI move.
Short-Term (This Month)
- Set up markdown-vault-mcp — Gives Claude Code full read/write/search over the vault with hybrid search (FTS5 + semantic). More powerful than Smart Connections alone.
Ongoing
- Standardize frontmatter + add MOC hub notes — Makes both human navigation and AI querying dramatically better. Run Linter plugin to enforce consistency across all 457 files.
What NOT to Do
- Don’t abandon markdown/Obsidian — The ecosystem is strong and getting stronger with MCP
- Don’t migrate to Notion/Confluence — Loss of local control, markdown portability
- Don’t build a custom vector DB pipeline — MCP servers already do this
- Don’t split into multiple vaults — One vault with good metadata beats multiple small ones for AI search