GTM Knowledge Base Platform Research

Date: 2026-03-27
Purpose: Research findings supporting the GTM Knowledge Base deep plan
Related Plan: .claude-plans/deep-plan-gtm-knowledge-base-2026-03-27.md

Current State Assessment

Document Inventory (as of 2026-03-27)

Category	Count	Location	Has Metadata
Playbooks	192	`playbooks/` + `playbooks/misc/`	Blockquote (inconsistent)
Meeting notes	72	`meeting-notes/`	Date in filename only
Daily notes	24	`daily-notes/`	None
Reference	15	`reference/`	Minimal
Content creation	11	`content-creation/`	Mixed
Operations	10	`operations/`	Some version/date
Sales	11	`sales/`	None
Outreach	5	`outreach/`	None
Website config	8	`website-config/`	None
Total	589	solanasis-docs/	~10% have any metadata

Existing Infrastructure

Supabase Postgres (localhost:5433) — docs_documents table exists with pgvector, HNSW index, search_docs() function
Directus (localhost:8055) — running, introspects all tables
Quartz (docs.solanasis.com) — static site reader for Obsidian vault
SilverBullet (edit.solanasis.com) — web-based markdown editor
SupabaseClient — Python client with full CRUD, already includes docs_documents

Metadata Patterns Found

Playbooks use blockquote-style frontmatter (not YAML):

> **Version:** 2.0
> **Date:** 2026-03-15
> **Type:** GTM Strategy
> **Owner:** Dmitri Zasage
> **Purpose:** Master GTM playbook
> **Status:** Active
> **Companion docs:** Link1 | Link2

Only one file uses YAML frontmatter (delimited by `---`); every other file with metadata uses the blockquote style. Quartz supports both.
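A minimal sketch of how the blockquote metadata shown above could be parsed into key/value pairs, for example before loading it into a database column. The function name `parse_blockquote_metadata` and the key normalization (lowercase with underscores) are assumptions for illustration; they are not part of the existing tooling.

```python
import re

# Matches lines such as "> **Version:** 2.0". The pattern is inferred from
# the sample block above and may need loosening for inconsistent playbooks.
META_LINE = re.compile(r"^>\s*\*\*(?P<key>[^:*]+):\*\*\s*(?P<value>.*)$")


def parse_blockquote_metadata(text: str) -> dict[str, str]:
    """Collect key/value pairs from the first run of blockquote metadata lines."""
    metadata: dict[str, str] = {}
    for line in text.splitlines():
        match = META_LINE.match(line.strip())
        if match:
            # Hypothetical normalization: "Companion docs" -> "companion_docs".
            key = match.group("key").strip().lower().replace(" ", "_")
            metadata[key] = match.group("value").strip()
        elif metadata:
            # Stop at the first non-matching line once the block has started.
            break
    return metadata
```

Files without a metadata block return an empty dict, which makes it easy to count how many documents still need metadata.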

Platform Comparison Summary

Evaluated Options

Option	Score	Cost	Self-Hosted	Comments	RAG	Verdict
Supabase + Directus	84/90	$0/mo	Yes (deployed)	Record-level	Yes	WINNER
Notion	62/90	Free 1yr → $8/mo	No	Best inline	No	Phase 5 option
Outline	67/90	$0	Yes (+1.5GB RAM)	Inline paragraphs	No	Consider later
Docmost	65/90	$0	Yes (+1GB RAM)	Page + inline	No	Phase 5 option
Coda	56/90	$10/mo+	No	Inline	No	Too limited
WikiJS	63/90	$0	Yes	None	No	No comments

Why Supabase + Directus Wins

Already deployed — zero new infrastructure for Phases 1-4
$0/month — no SaaS subscriptions
Full SQL control — can build any query, function, or view
pgvector + HNSW — semantic search and RAG from day 1
SupabaseClient — Claude Code has native Python access
Directus comments — per-record comments sufficient for single-user
Extensible — can add Docmost or Notion as a commenting layer later
Backup included — existing daily pg_dump covers new tables

When to Add a Richer UI (Phase 5 Criteria)

Add Docmost, Notion, or Outline when:

Directus record-level comments feel insufficient for playbook review
1099 contractors need to collaboratively edit playbooks
Dmitri needs inline text highlighting/annotation (not just whole-doc comments)

Evaluation period: 2 weeks after Phase 3 completes

Supabase Schema Design (Key Additions)

New Columns on docs_documents

status — document lifecycle (draft/active/review_needed/archived/superseded)
category — document type (playbook/research/sop/template/reference/outreach)
slug — URL-friendly unique identifier
description — short blurb for search results
Review tracking: last_reviewed_at, last_reviewed_by, review_interval_days, next_review_due
Sync tracking: checksum (SHA-256), synced_at, file_modified_at
metadata (jsonb) — flexible key-value pairs
fts (generated tsvector) — auto-updating full-text search vector
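The sync-tracking and slug columns could be populated along the following lines. This is a sketch only: the helper names `content_checksum`, `needs_sync`, and `slugify` are hypothetical, and the slug rule (lowercase, hyphen-separated) is an assumption rather than a documented convention.

```python
import hashlib
import re
from typing import Optional


def content_checksum(content: str) -> str:
    """SHA-256 hex digest of the document body, encoded as UTF-8."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def needs_sync(content: str, stored_checksum: Optional[str]) -> bool:
    """True when the file on disk differs from the row's stored checksum."""
    return stored_checksum != content_checksum(content)


def slugify(title: str) -> str:
    """Assumed slug rule: lowercase, runs of non-alphanumerics become one hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")
```

Comparing checksums rather than file modification times avoids re-syncing files that were touched but not changed.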

New Tables

docs_versions — content version history
docs_reviews — review feedback log
docs_chunks — document chunks with embeddings for RAG

New Functions

search_kb() — full-text search with category/status filtering
search_chunks() — semantic search against chunk embeddings
search_kb_hybrid() — Reciprocal Rank Fusion combining FTS + semantic
docs_needing_review() — overdue document list
record_review() — record review + auto-update lifecycle timestamps
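The review logic behind docs_needing_review() and record_review() can be illustrated in Python. This sketch assumes that next_review_due is derived as last_reviewed_at plus review_interval_days; the source lists those columns but does not state the formula, and the helper names here are hypothetical.

```python
from datetime import date, timedelta


def next_review_due(last_reviewed_at: date, review_interval_days: int) -> date:
    """Assumed rule: the next review falls one interval after the last one."""
    return last_reviewed_at + timedelta(days=review_interval_days)


def is_overdue(last_reviewed_at: date, review_interval_days: int, today: date) -> bool:
    """A document is overdue once its due date lies strictly in the past."""
    return next_review_due(last_reviewed_at, review_interval_days) < today
```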

RAG Architecture Notes

Chunking Strategy

Method: RecursiveCharacterTextSplitter with markdown-aware separators
Chunk size: 512 tokens with 50-token overlap
Context enrichment: Prepend document title + section heading to each chunk (+18% accuracy)
Separators: "\n## ", "\n### ", "\n#### ", "\n\n", "\n", ". ", " ", ""
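The chunking strategy above can be sketched in plain Python. This is not LangChain's RecursiveCharacterTextSplitter but a simplified stand-in that follows the same idea: split on the coarsest separator first and recurse into finer ones only for oversized pieces. It assumes whitespace-separated words as a rough proxy for tokens, omits the final empty-string separator, and uses hypothetical names (`split_recursive`, `chunk_document`) and a hypothetical "title > heading" prefix format.

```python
import re

# Separators from the list above, coarsest first (empty-string fallback omitted).
SEPARATORS = ["\n## ", "\n### ", "\n#### ", "\n\n", "\n", ". ", " "]
HEADING = re.compile(r"^#{1,6}[ \t]+(.+)$", re.MULTILINE)


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate model tokens.
    return len(text.split())


def split_recursive(text: str, max_tokens: int, separators=SEPARATORS) -> list[str]:
    """Split on the coarsest separator, recursing only into oversized pieces."""
    if count_tokens(text) <= max_tokens or not separators:
        return [text] if text.strip() else []
    head, *rest = separators
    pieces = text.split(head)
    if len(pieces) == 1:
        return split_recursive(text, max_tokens, rest)
    chunks, current = [], ""
    for i, piece in enumerate(pieces):
        # Re-attach the separator so headings keep their "## " markers.
        piece = piece if i == 0 else head + piece
        if count_tokens(current + piece) <= max_tokens:
            current += piece
            continue
        if current.strip():
            chunks.append(current)
        if count_tokens(piece) <= max_tokens:
            current = piece
        else:
            chunks.extend(split_recursive(piece, max_tokens, rest))
            current = ""
    if current.strip():
        chunks.append(current)
    return chunks


def chunk_document(title: str, body: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Return chunks with a context prefix and an overlap tail from the previous chunk."""
    # Reserve room for the overlap so tail + chunk stays within chunk_size.
    raw = split_recursive(body, chunk_size - overlap)
    enriched, heading, previous = [], "", ""
    for chunk in raw:
        starts_with_heading = HEADING.match(chunk.lstrip())
        if starts_with_heading:
            heading = starts_with_heading.group(1).strip()
        tail = " ".join(previous.split()[-overlap:])
        prefix = f"{title} > {heading}" if heading else title
        text = f"{tail} {chunk.strip()}" if tail else chunk.strip()
        enriched.append(f"{prefix}\n\n{text}")
        # Carry the last heading seen forward to chunks that lack their own.
        found = HEADING.findall(chunk)
        if found:
            heading = found[-1].strip()
        previous = chunk
    return enriched
```

The context prefix is added on top of the chunk budget, so enriched chunks can slightly exceed chunk_size; a production version would count real tokens with the embedding model's tokenizer.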

Embedding Model

Model: text-embedding-3-small (1536 dimensions, matches existing schema)
Cost: $0.02 per million tokens
Full corpus estimate: ~384K tokens → ~$0.008 per full re-embed run
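The cost estimate follows directly from the per-token price. The short check below reproduces the arithmetic using only the figures stated above; the function name `embedding_cost` is illustrative.

```python
PRICE_PER_MILLION_TOKENS = 0.02  # USD, text-embedding-3-small price quoted above
CORPUS_TOKENS = 384_000          # approximate full-corpus size quoted above


def embedding_cost(tokens: int, price_per_million: float = PRICE_PER_MILLION_TOKENS) -> float:
    """Cost in USD of embedding the given number of tokens once."""
    return tokens / 1_000_000 * price_per_million


# 384,000 / 1,000,000 * 0.02 = 0.00768, which rounds to the ~$0.008 quoted.
full_run_cost = embedding_cost(CORPUS_TOKENS)
```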

Search Strategy

Hybrid search (RRF): Combines full-text ranking (30% weight) + semantic similarity (70% weight)
Threshold: 0.7 cosine similarity minimum for semantic results
Reranking: Reciprocal Rank Fusion for combining result sets
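A minimal sketch of weighted Reciprocal Rank Fusion using the 30/70 weights and the 0.7 similarity threshold above. Each document scores the sum of weight / (k + rank) over the result lists it appears in. The smoothing constant k = 60 is a commonly used default and an assumption here, since the source does not specify it; the function names are hypothetical and this is not the body of search_kb_hybrid().

```python
def filter_semantic(hits: list[tuple[str, float]], threshold: float = 0.7) -> list[str]:
    """Keep semantic hits at or above the similarity threshold, best first."""
    ranked = sorted(hits, key=lambda hit: -hit[1])
    return [doc_id for doc_id, similarity in ranked if similarity >= threshold]


def rrf_fuse(
    fts_ranked: list[str],
    semantic_ranked: list[str],
    fts_weight: float = 0.3,
    semantic_weight: float = 0.7,
    k: int = 60,  # assumed smoothing constant, not taken from the source
) -> list[str]:
    """Fuse two ranked ID lists; score = sum of weight / (k + rank)."""
    scores: dict[str, float] = {}
    for weight, ranked in ((fts_weight, fts_ranked), (semantic_weight, semantic_ranked)):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Sort by descending score, breaking ties by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


fts_results = ["a", "b", "c"]
semantic_results = filter_semantic([("c", 0.91), ("a", 0.82), ("d", 0.55)])
fused = rrf_fuse(fts_results, semantic_results)  # ['c', 'a', 'b']
```

Document "d" is dropped by the threshold before fusion, and "c" edges out "a" because its top semantic rank carries the larger weight.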

Security Model

Database: localhost-only (not exposed to internet)
Web UIs: Cloudflare Tunnel + Access OTP (same pattern as sm.solanasis.com)
API keys: Stored in Infisical, retrieved via secret get
RLS: Not needed for single-user; available if contractors need scoped access
Backup: Existing daily pg_dump → AES-256-CBC → Cloudflare R2

Sources Consulted

Supabase Full Text Search docs
Supabase Vector Columns / pgvector docs
PostgreSQL v18 textsearch documentation
Firecrawl: Best Chunking Strategies for RAG in 2025
LangCopilot: Document Chunking for RAG Practical Guide
Docmost self-hosting guide + API docs
Directus Knowledge Base tutorial + Flexible Editor extension
AFFiNE / AppFlowy / WikiJS / BookStack self-hosting docs
pgvector HNSW vs IVFFlat performance comparisons
Cloudflare Access OTP setup guides
OpenAI text-embedding-3-small dimension performance analysis
