GTM Knowledge Base Platform Research

Date: 2026-03-27
Purpose: Research findings supporting the GTM Knowledge Base deep plan
Related Plan: .claude-plans/deep-plan-gtm-knowledge-base-2026-03-27.md


Current State Assessment

Document Inventory (as of 2026-03-27)

| Category | Count | Location | Has Metadata |
| --- | --- | --- | --- |
| Playbooks | 192 | playbooks/ + playbooks/misc/ | Blockquote (inconsistent) |
| Meeting notes | 72 | meeting-notes/ | Date in filename only |
| Daily notes | 24 | daily-notes/ | None |
| Reference | 15 | reference/ | Minimal |
| Content creation | 11 | content-creation/ | Mixed |
| Operations | 10 | operations/ | Some version/date |
| Sales | 11 | sales/ | None |
| Outreach | 5 | outreach/ | None |
| Website config | 8 | website-config/ | None |
| Total | 589 | solanasis-docs/ | ~10% have any metadata |

Existing Infrastructure

  • Supabase Postgres (localhost:5433) — docs_documents table exists with pgvector, HNSW index, search_docs() function
  • Directus (localhost:8055) — running, introspects all tables
  • Quartz (docs.solanasis.com) — static site reader for Obsidian vault
  • SilverBullet (edit.solanasis.com) — web-based markdown editor
  • SupabaseClient — Python client with full CRUD, already includes docs_documents

Metadata Patterns Found

Playbooks use blockquote-style frontmatter (not YAML):

> **Version:** 2.0
> **Date:** 2026-03-15
> **Type:** GTM Strategy
> **Owner:** Dmitri Zasage
> **Purpose:** Master GTM playbook
> **Status:** Active
> **Companion docs:** Link1 | Link2

Only 1 file had YAML frontmatter (--- delimited). Quartz supports both.
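
The blockquote convention above can be read mechanically, which matters if the fields are to be migrated into database columns. The following is a minimal sketch, assuming every metadata line follows the `> **Key:** value` shape shown in the example; the function name `parse_blockquote_metadata` and the key normalization (lowercase, underscores) are illustrative choices, not part of the existing tooling.

```python
import re

# Matches lines such as: > **Version:** 2.0
FIELD_RE = re.compile(r"^>\s*\*\*(?P<key>[^*:]+):\*\*\s*(?P<value>.*)$")


def parse_blockquote_metadata(text: str) -> dict:
    """Collect key/value pairs from the first run of blockquote metadata lines."""
    metadata = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            # Hypothetical normalization: "Companion docs" -> "companion_docs"
            key = match.group("key").strip().lower().replace(" ", "_")
            metadata[key] = match.group("value").strip()
        elif metadata:
            # The metadata block has ended; ignore the rest of the document.
            break
    return metadata


sample = """# Master GTM Playbook

> **Version:** 2.0
> **Date:** 2026-03-15
> **Status:** Active
> **Companion docs:** Link1 | Link2

Body text starts here.
"""

meta = parse_blockquote_metadata(sample)
print(meta)
```

Files with inconsistent blockquote formatting would need a looser pattern or a manual pass; this sketch only handles the canonical shape.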


Platform Comparison Summary

Evaluated Options

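| Option | Score | Cost | Self-Hosted | Comments | RAG | Verdict |
| --- | --- | --- | --- | --- | --- | --- |
| Supabase + Directus | 84/90 | $0/mo | Yes (deployed) | Record-level | Yes | WINNER |
| Notion | 62/90 | Free 1yr → $8/mo | No | Best inline | No | Phase 5 option |
| Outline | 67/90 | $0 | Yes (+1.5GB RAM) | Inline paragraphs | No | Consider later |
| Docmost | 65/90 | $0 | Yes (+1GB RAM) | Page + inline | No | Phase 5 option |
| Coda | 56/90 | $10/mo+ | No | Inline | No | Too limited |
| WikiJS | 63/90 | $0 | Yes | None | No | No comments |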

Why Supabase + Directus Wins

  1. Already deployed — zero new infrastructure for Phases 1-4
  2. $0/month — no SaaS subscriptions
  3. Full SQL control — can build any query, function, or view
  4. pgvector + HNSW — semantic search and RAG from day 1
  5. SupabaseClient — Claude Code has native Python access
  6. Directus comments — per-record comments sufficient for single-user
  7. Extensible — can add Docmost or Notion as a commenting layer later
  8. Backup included — existing daily pg_dump covers new tables

When to Add a Richer UI (Phase 5 Criteria)

Add Docmost, Notion, or Outline when:

  • Directus record-level comments feel insufficient for playbook review
  • 1099 contractors need to collaboratively edit playbooks
  • Dmitri needs inline text highlighting/annotation (not just whole-doc comments)

Evaluation period: 2 weeks after Phase 3 completes.

Supabase Schema Design (Key Additions)

New Columns on docs_documents

  • status — document lifecycle (draft/active/review_needed/archived/superseded)
  • category — document type (playbook/research/sop/template/reference/outreach)
  • slug — URL-friendly unique identifier
  • description — short blurb for search results
  • Review tracking: last_reviewed_at, last_reviewed_by, review_interval_days, next_review_due
  • Sync tracking: checksum (SHA-256), synced_at, file_modified_at
  • metadata (jsonb) — flexible key-value pairs
  • fts (generated tsvector) — auto-updating full-text search vector
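
The sync-tracking columns suggest a simple change-detection loop: hash the file, compare against the stored checksum, and re-sync only on a mismatch. The sketch below illustrates that idea under stated assumptions; `content_checksum` and `needs_sync` are hypothetical helper names, and the stored value is assumed to be a hex-encoded SHA-256 digest of the UTF-8 file content.

```python
import hashlib
from typing import Optional


def content_checksum(content: str) -> str:
    """Hex-encoded SHA-256 of the document body (assumed UTF-8)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def needs_sync(file_content: str, stored_checksum: Optional[str]) -> bool:
    """True when the file is new (no stored checksum) or its content changed."""
    return stored_checksum != content_checksum(file_content)


original = "# Playbook\n\nFirst draft.\n"
stored = content_checksum(original)

print(needs_sync(original, stored))            # unchanged file
print(needs_sync(original + "Edit.\n", stored))  # edited file
print(needs_sync(original, None))              # never synced
```

Comparing checksums rather than file_modified_at avoids re-embedding files whose timestamp changed but whose content did not.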

New Tables

  • docs_versions — content version history
  • docs_reviews — review feedback log
  • docs_chunks — document chunks with embeddings for RAG

New Functions

  • search_kb() — full-text search with category/status filtering
  • search_chunks() — semantic search against chunk embeddings
  • search_kb_hybrid() — Reciprocal Rank Fusion combining FTS + semantic
  • docs_needing_review() — overdue document list
  • record_review() — record review + auto-update lifecycle timestamps
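
The review functions are planned as Postgres functions; the in-memory Python sketch below only illustrates the intended lifecycle logic. It assumes that next_review_due is last_reviewed_at plus review_interval_days, that only documents with status "active" are candidates, and that a document that was never reviewed counts as overdue. None of these rules is confirmed by the schema notes, and the `Doc` class is hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Doc:
    slug: str
    status: str
    review_interval_days: int
    last_reviewed_at: Optional[date] = None
    last_reviewed_by: Optional[str] = None
    next_review_due: Optional[date] = None


def record_review(doc: Doc, reviewer: str, on: date) -> None:
    """Record a review and roll the lifecycle timestamps forward."""
    doc.last_reviewed_at = on
    doc.last_reviewed_by = reviewer
    doc.next_review_due = on + timedelta(days=doc.review_interval_days)


def docs_needing_review(docs: List[Doc], today: date) -> List[Doc]:
    """Active docs that are overdue, oldest due date first (never-reviewed first)."""
    overdue = [
        d for d in docs
        if d.status == "active"
        and (d.next_review_due is None or d.next_review_due <= today)
    ]
    return sorted(overdue, key=lambda d: d.next_review_due or date.min)


docs = [
    Doc("gtm-master", "active", 30),
    Doc("pricing-sop", "active", 90),
    Doc("old-outreach", "archived", 30),
]
record_review(docs[1], "dmitri", date(2026, 3, 1))

print([d.slug for d in docs_needing_review(docs, date(2026, 3, 27))])
print([d.slug for d in docs_needing_review(docs, date(2026, 6, 1))])
```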

RAG Architecture Notes

Chunking Strategy

  • Method: RecursiveCharacterTextSplitter with markdown-aware separators
  • Chunk size: 512 tokens with 50-token overlap
  • Context enrichment: Prepend document title + section heading to each chunk (+18% accuracy)
  • Separators (in priority order): "\n## ", "\n### ", "\n#### ", "\n\n", "\n", ". ", " ", ""
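
To make the chunking strategy concrete, here is a hand-rolled stand-in for the recursive splitter. It is not the LangChain RecursiveCharacterTextSplitter class itself. Assumptions: tokens are approximated by whitespace-separated words (a real pipeline would use the embedding model's tokenizer), heading separators are handled by a regex section splitter, whitespace inside a chunk is normalized to single spaces, and the prepended title/heading prefix is not counted against the chunk budget. All function names are illustrative.

```python
import re

HEADING_RE = re.compile(r"^#{2,4} (.+)$", re.MULTILINE)
BODY_SEPARATORS = ["\n\n", "\n", ". ", " "]


def count_tokens(text):
    # Assumption: one whitespace-separated word stands in for one token.
    return len(text.split())


def split_sections(markdown):
    """Return (heading, body) pairs; text before the first heading gets ''."""
    sections, heading, last_end = [], "", 0
    for match in HEADING_RE.finditer(markdown):
        body = markdown[last_end:match.start()]
        if body.strip():
            sections.append((heading, body))
        heading, last_end = match.group(1).strip(), match.end()
    if markdown[last_end:].strip():
        sections.append((heading, markdown[last_end:]))
    return sections


def split_recursive(text, max_tokens, separators=BODY_SEPARATORS):
    """Split on the coarsest separator first, recursing into oversized parts."""
    if count_tokens(text) <= max_tokens:
        return [text] if text.strip() else []
    if not separators:
        words = text.split()
        return [" ".join(words[i:i + max_tokens])
                for i in range(0, len(words), max_tokens)]
    sep, rest = separators[0], separators[1:]
    parts = text.split(sep)
    pieces = []
    for index, part in enumerate(parts):
        if index < len(parts) - 1:
            part += sep  # keep the separator so punctuation is not lost
        pieces.extend(split_recursive(part, max_tokens, rest))
    return pieces


def merge_with_overlap(pieces, max_tokens, overlap_tokens):
    """Pack pieces into chunks, carrying the last few words into the next chunk."""
    chunks, current = [], []
    for piece in pieces:
        words = piece.split()
        if current and len(current) + len(words) > max_tokens:
            chunks.append(" ".join(current))
            current = current[-overlap_tokens:] if overlap_tokens else []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


def chunk_document(title, markdown, max_tokens=512, overlap_tokens=50):
    chunks = []
    for heading, body in split_sections(markdown):
        prefix = f"{title} > {heading}" if heading else title
        # Pieces are capped below the budget so overlap never overflows a chunk.
        pieces = split_recursive(body, max_tokens - overlap_tokens)
        for chunk in merge_with_overlap(pieces, max_tokens, overlap_tokens):
            chunks.append(f"{prefix}\n\n{chunk}")
    return chunks


doc = "Intro text.\n\n## Pricing\n" + " ".join(f"w{i}" for i in range(50)) + "\n"
chunks = chunk_document("GTM Playbook", doc, max_tokens=20, overlap_tokens=5)
print(len(chunks))
```

The usage example uses a 20-token budget with 5-token overlap so the behavior is easy to inspect; the defaults match the 512/50 settings above.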

Embedding Model

  • Model: text-embedding-3-small (1536 dimensions, matches existing schema)
  • Cost: $0.02 per million tokens
  • Full corpus estimate: ~384K tokens → ~$0.008 per full re-embed run
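
The cost estimate follows directly from the two figures above. As a quick check of the arithmetic (the function name is illustrative):

```python
PRICE_PER_MILLION_TOKENS_USD = 0.02  # text-embedding-3-small, per the note above
CORPUS_TOKENS = 384_000              # full-corpus estimate from the note above


def embedding_cost_usd(tokens, price_per_million=PRICE_PER_MILLION_TOKENS_USD):
    """Cost of embedding `tokens` tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


cost = embedding_cost_usd(CORPUS_TOKENS)
print(f"${cost:.5f} per full re-embed run")  # 0.384 * 0.02 = 0.00768
```

At under one cent per run, re-embedding the whole corpus after a chunking change is effectively free; the checksum-based sync mainly saves time, not money.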

Search Strategy

  • Hybrid search (RRF): Combines full-text ranking (30% weight) + semantic similarity (70% weight)
  • Threshold: 0.7 cosine similarity minimum for semantic results
  • Reranking: Reciprocal Rank Fusion for combining result sets
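
A weighted form of Reciprocal Rank Fusion is one way to combine the two result sets. The sketch below assumes the 30/70 weights scale each list's RRF term, that the 0.7 similarity threshold is applied to semantic hits before fusion, and that the smoothing constant is k = 60 (a conventional choice, not a value taken from this plan). It is an illustration of the technique, not the body of search_kb_hybrid().

```python
def filter_semantic(hits, threshold=0.7):
    """Keep (doc_id, similarity) hits at or above the threshold, best first."""
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    return [doc_id for doc_id, similarity in ranked if similarity >= threshold]


def weighted_rrf(fts_ranked, semantic_ranked,
                 fts_weight=0.3, semantic_weight=0.7, k=60):
    """Fuse two ranked ID lists; each contributes weight / (k + rank)."""
    scores = {}
    for weight, ranked in ((fts_weight, fts_ranked),
                           (semantic_weight, semantic_ranked)):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fts_results = ["a", "b", "c"]  # full-text ranking, best first
semantic_hits = [("b", 0.91), ("d", 0.82), ("a", 0.74), ("e", 0.55)]

fused = weighted_rrf(fts_results, filter_semantic(semantic_hits))
print([doc_id for doc_id, _ in fused])
```

Because RRF works on ranks rather than raw scores, the full-text and cosine scores never need to be normalized onto a common scale.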

Security Model

  • Database: localhost-only (not exposed to internet)
  • Web UIs: Cloudflare Tunnel + Access OTP (same pattern as sm.solanasis.com)
  • API keys: Stored in Infisical, retrieved via secret get
  • RLS (row-level security): Not needed for single-user use; available if contractors need scoped access
  • Backup: Existing daily pg_dump → AES-256-CBC → Cloudflare R2

Sources Consulted

  • Supabase Full Text Search docs
  • Supabase Vector Columns / pgvector docs
  • PostgreSQL v18 textsearch documentation
  • Firecrawl: Best Chunking Strategies for RAG in 2025
  • LangCopilot: Document Chunking for RAG Practical Guide
  • Docmost self-hosting guide + API docs
  • Directus Knowledge Base tutorial + Flexible Editor extension
  • AFFiNE / AppFlowy / WikiJS / BookStack self-hosting docs
  • pgvector HNSW vs IVFFlat performance comparisons
  • Cloudflare Access OTP setup guides
  • OpenAI text-embedding-3-small dimension performance analysis