PRD: VC Portfolio Discount Cold Outreach Pipeline

Owner: Dmitri Zasage, Solanasis LLC
Date: 2026-03-22
Status: Ready for implementation (senior-reviewed, all must-fix items resolved)
Execution note: This PRD is designed to be self-contained. It can be handed to another Claude Code instance on a different server with access to the same repos (solanasis-scripts/, solanasis-docs/, solanasis-site/).
Senior review: APPROVED WITH NOTES (2026-03-22). All 4 must-fix items resolved in this version.

1. Problem Statement

Solanasis provides cybersecurity assessments and operational resilience services to small and mid-size organizations. Early-stage startups are chronically underserved in this space; they skip basic security hygiene until a customer asks for SOC 2 or a breach forces the conversation. By then, it costs 3x what it would have at Series A.

VCs have a vested interest in preventing this. Many funds now have portfolio-support teams, platform functions, or operating partners whose job is to help portfolio companies avoid preventable mistakes. Solanasis can position itself as a go-to resource these teams recommend to portfolio companies.

The play: Reach out to VCs with a portfolio discount offer. “We’ll give your portfolio companies a discounted Startup Security Baseline at $2,500 fixed fee (normally $3,500). Five-day turnaround, investor-ready deliverable. Helps your companies get proper cybersecurity practices in place before it becomes an expensive problem.”

This is a B2B2B channel: VC becomes a distribution channel; Solanasis delivers to portfolio companies.

2. Product Definition: Two-Tier Startup Offering

Tier 1: Startup Security Baseline (SSB)

Field	Value
Duration	5 business days
Portfolio rate (via VC)	$2,500 fixed
Direct startup rate	$3,500 fixed
Target	Post-funding startups (Seed through Series B, 5-100 employees)
Scope	Access hygiene review, backup/DR verification (real restore test), cloud infrastructure posture scan, vendor/tool inventory, 30/60/90-day remediation roadmap
Deliverable	Investor-ready security posture summary (1 page) + detailed findings report (10-15 pages)
NOT included	Full compliance mapping (SOC 2, HIPAA, etc.), penetration testing, ongoing monitoring

Tier 2: Full Compliance Readiness Assessment (CRA)

Field	Value
Duration	10 business days
Portfolio rate (via VC)	$5,000 fixed (discounted from $5K-$7.5K standard)
Direct startup rate	$5,000-$7,500
Scope	Everything in SSB + full regulatory compliance mapping, incident response documentation, board-ready assessment report

VC Incentive Model

The discount IS the value. No referral fee paid to VCs.
This avoids “paying VCs to sell to their own companies” optics.
Exception: if a VC explicitly wants referral partner status, standard 10% applies but portfolio discount is removed.

Template Messaging

Emails lead with $2,500 SSB as the entry point
Mention full CRA as upgrade option for more mature portfolio companies
Frame as “a menu your platform team can share”

3. Technical Architecture

3.1 Overview

Build solanasis-scripts/vc-pipeline/ with 11 Python scripts + tests, following the Import → Enrich → Score → Template → Outreach → Daily Brief architecture used by the existing fCTO pipeline (solanasis-scripts/fcto-pipeline/).

Note: The fCTO pipeline has a discover_prospects.py (DuckDuckGo-based discovery). The VC pipeline does NOT need this; SEC Form D bulk data replaces the discovery step entirely. download_form_d.py + import_form_d.py + import_free_databases.py cover discovery and import in one pass. Do NOT fork discover_prospects.py.

3.2 Directory Structure

solanasis-scripts/vc-pipeline/
  config.py
  download_form_d.py
  import_form_d.py
  import_free_databases.py
  merge_sources.py
  enrich_websites.py
  enrich_emails.py
  score_prospects.py
  templates.py
  generate_outreach_csv.py
  generate_daily_outreach.py
  requirements.txt
  data/
    raw/
      form_d/           # SEC quarterly ZIPs and extracted TSVs
      openvc/           # OpenVC CSV exports
      ramp/             # Ramp VC Database exports
      other/            # Other free database exports
    intermediate/
      vc_funds_imported.csv
      vc_free_sources_imported.csv
      vc_prospects_merged.csv
      vc_prospects_enriched.csv
      vc_prospects_scored.csv
      website_cache/    # Per-domain JSON cache for scraped websites
    output/
      vc_outreach_full.csv
      vc_outreach_queue.csv
      pipeline_report.md
      outreach_tracker.json
  tests/
    __init__.py
    conftest.py
    test_import_form_d.py
    test_score_prospects.py
    test_templates.py
    test_merge_sources.py
    fixtures/
      sample_offering.tsv
      sample_relatedpersons.tsv
      sample_formdsubmission.tsv
      sample_openvc.csv

3.3 Dependencies

# requirements.txt — same as fCTO pipeline; httpx handles Hunter.io API calls too
httpx>=0.27.0
python-dotenv>=1.0.0
trafilatura>=2.0.0

4. Data Sources (Detailed)

4.1 Primary: SEC Form D Data Sets (Free, Legal, Structured)

URL: https://www.sec.gov/data-research/sec-markets-data/form-d-data-sets

Every VC fund that raises money under Regulation D must file Form D with the SEC. Quarterly bulk downloads are available as ZIP files containing tab-delimited text files.

Files per quarterly ZIP:

FORMDSUBMISSION.tsv — Filing metadata (CIK, entity name, filing date, entity type)
OFFERING.tsv — Offering details (contains INVESTMENTFUNDTYPE field for filtering)
RELATEDPERSONS.tsv — Executive officers, directors, promoters (names + addresses)
RECIPIENTS.tsv — States/countries where securities were offered
SIGNATURES.tsv — Signatory names and dates

Key fields to extract:

From OFFERING.tsv:

ACCESSIONNUMBER (join key)
INVESTMENTFUNDTYPE — Filter: “Venture Capital Fund”
ISSUERENTITYNAME — Fund name
STREET1, STREET2, CITY, STATEORCOUNTRY, ZIPCODE — Address
PHONENUMBER — Phone
INDUSTRYGROUPTYPEVALUE — Industry classification
REVENUERANGE — Fund size indicator
TOTALOFFERINGAMOUNT — Target raise
TOTALAMOUNTSOLD — Amount raised to date
TOTALNUMBERALREADYINVESTED — Investor count

From RELATEDPERSONS.tsv:

ACCESSIONNUMBER (join key)
FIRSTNAME, MIDDLENAME, LASTNAME — Person name
STREET1, CITY, STATEORCOUNTRY, ZIPCODE — Person address
RELATIONSHIPLIST — Roles (Executive Officer, Director, Promoter)

From FORMDSUBMISSION.tsv:

ACCESSIONNUMBER (join key)
FILEDDATE — When filed (for recency scoring)
SUBMISSIONTYPE — New filing vs. amendment

Download strategy: Last 8 quarters (2 years of filings). SEC updates quarterly; filings after 5:30 PM ET on the last business day of a quarter roll to the next posting.

API access (supplemental):

EDGAR Full-Text Search: POST https://efts.sec.gov/LATEST/search-index with forms=D, startdt, enddt params. 10 req/sec, no auth, returns JSON.
data.sec.gov REST: https://data.sec.gov/submissions/CIK{number}.json for individual entity lookups. Must include User-Agent header with email.

What SEC does NOT provide: Email addresses. This is the critical gap.

Expected yield: ~4,000-6,000 unique VC fund entities after deduplication across 8 quarters.

4.2 Supplemental Free Databases

Source	URL	Size	Has Email	Export Format	Notes
OpenVC	openvc.app/investor-database	20,000+ investors	Yes (preferred contact method)	CSV	Best free source for emails. Manual export.
Ramp	ramp.com/vc-database	1,200+ VCs, 2,200+ angels	Yes (verified)	CSV/Excel/JSON	Clean data. Email required to access.
FreeStartupFunding	freestartupfunding.com	7,884 SEC-verified funds	Yes (some)	Web interface	Analyst verification layer on SEC data.
findfunding.vc	findfunding.vc	Pre-Seed to Series A	Some	Web interface	Open-source, VC-verified profiles.
VC Sheet	vcsheet.com	“Largest open list”	Some	Web interface	Filter by sector, stage, geography.

Acquisition order: SEC Form D first (bulk, structured), OpenVC second (has emails), Ramp third (clean CSV), others supplemental.

4.3 Email Gap Resolution Strategy

SEC Form D has zero email addresses. Resolution layers:

Website team page scraping (30-40% hit rate, free) — enrich_websites.py scrapes /team, /about, /contact pages for email patterns
OpenVC preferred contact methods — merge in emails from OpenVC CSV export
Ramp verified emails — merge in from Ramp CSV
Hunter.io free tier — 25 domain searches/month, reserve for A-tier only
Manual research queue — daily brief flags A/B-tier funds without email for 5-minute research (check firm website, Google “{fund name} email contact”)

Realistic total coverage: 40-60% of all funds, higher (60-75%) for A/B-tier because established firms have public team pages.

5. Script Specifications

5.1 `config.py`

Fork from: solanasis-scripts/fcto-pipeline/config.py
Changes: All content is new; structure is identical.

# Key constants to define:
from pathlib import Path  # required by the path constants below
 
# Paths (same pattern as fCTO)
PIPELINE_DIR = Path(__file__).parent
DATA_DIR = PIPELINE_DIR / "data"
RAW_DIR = DATA_DIR / "raw"
INTERMEDIATE_DIR = DATA_DIR / "intermediate"
OUTPUT_DIR = DATA_DIR / "output"
WEBSITE_CACHE_DIR = INTERMEDIATE_DIR / "website_cache"
DAILY_OUTREACH_DIR = PIPELINE_DIR.parent.parent / "solanasis-docs" / "daily-outreach"
 
# SEC Form D
FORM_D_DATA_URL = "https://www.sec.gov/data-research/sec-markets-data/form-d-data-sets"
SEC_USER_AGENT = "Solanasis-Research/1.0 (contact@solanasis.com)"
INVESTMENT_FUND_TYPE_FILTER = "Venture Capital Fund"
QUARTERS_TO_DOWNLOAD = 8  # Last 2 years
 
# Keyword categories (for website enrichment)
PLATFORM_KEYWORDS = [
    r"\bplatform\b", r"\bportfolio.?support\b", r"\bvalue.?creation\b",
    r"\boperating\s+partner\b", r"\bfounder\s+success\b",
    r"\bventure\s+studio\b", r"\bventure\s+creation\b",
    r"\bportfolio\s+growth\b", r"\bportfolio\s+services\b",
]
CYBER_FUND_KEYWORDS = [
    r"\bcybersecurity\b", r"\bcyber\b", r"\binfosec\b",
    r"\binformation\s+security\b", r"\bciso\b", r"\bsecurity\s+focused\b",
]
ENTERPRISE_SAAS_KEYWORDS = [
    r"\benterprise\b", r"\bb2b\s+saas\b", r"\binfrastructure\b",
    r"\bcloud\b", r"\bdevops\b", r"\bfintech\b",
]
EARLY_STAGE_KEYWORDS = [
    r"\bearly.stage\b", r"\bseed\b", r"\bpre.seed\b",
    r"\bseries\s+[ab]\b", r"\bstartup\b", r"\bfounder\b",
]
DEFUNCT_KEYWORDS = [
    r"\bformerly\b", r"\bacquired\b", r"\bshut\s*down\b",
    r"\bwind(?:ing)?\s+down\b", r"\bdissolved\b",
]
 
# Scoring model (see section 6)
SCORING = { ... }
 
# Tier thresholds
TIERS = {"A": 40, "B": 25, "C": 12}
 
# Template rules (priority order, first match wins)
TEMPLATE_RULES = [
    (1, "Platform Team Portfolio Offer", {"has_platform": True, "tier_in": ("A",)}),
    (2, "CISO Fund Operator", {"cyber_focus": True}),
    (3, "Founder Safety Net", {"tier_in": ("A", "B"), "early_stage": True}),
    (4, "Colorado VC Peer", {"colorado": True}),
    (5, "Generic VC Outreach", {}),  # fallback
]
 
# CRM output columns
CRM_COLUMNS = [
    "prospect_id", "fund_name", "fund_type", "city", "state",
    "phone", "website", "managing_partners", "partner_emails",
    "fund_size_range", "investment_stage", "industry_focus",
    "filing_date", "data_sources",
    "has_platform_team", "cyber_focus", "enterprise_focus", "early_stage",
    "prospect_score", "prospect_tier", "score_breakdown",
    "template_variant", "mailto_link",
    "outreach_status", "date_sent", "notes", "last_updated",
]
 
# Website scraping (same as fCTO)
SCRAPE_CONCURRENCY = 5
SCRAPE_DELAY = 0.5
SCRAPE_TIMEOUT = 15
JINA_READER_BASE = "https://r.jina.ai/"
 
# Tracker
TRACKER_FILE = OUTPUT_DIR / "outreach_tracker.json"  # matches data/output/ in section 3.2

5.2 `download_form_d.py`

Fork from: Pattern from foundation-pipeline/download_bmf.py and msp-pipeline/download_sec_adv.py
This is genuinely new; SEC Form D ZIPs are a different format than IRS BMF CSVs.

Logic:

Navigate to SEC Form D data sets page
Parse available quarterly ZIP links (pattern: quarterly data sets for last 8 quarters)
Download each ZIP to data/raw/form_d/
Extract tab-delimited files from each ZIP
Print download summary (file count, row counts per file, total size)
If files exist, prompt before re-download
Set User-Agent: "Solanasis-Research/1.0 (contact@solanasis.com)" (SEC requires this)
Handle 403/rate-limit gracefully with manual download fallback instructions

Important SEC-specific detail: The exact URL pattern for quarterly ZIPs may change. The script should first try to parse the data sets page for download links. If that fails, print manual instructions pointing the user to the SEC page.

Output: Raw TSV files in data/raw/form_d/YYYY_QN/ subdirectories.
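
The YYYY_QN subdirectory names can be derived from the run date. The sketch below is one way to do that, assuming the quarter still in progress is skipped because SEC posts each data set after the quarter closes; the function name last_n_quarters is hypothetical and not part of the fCTO codebase.

```python
from datetime import date


def last_n_quarters(today: date, n: int = 8) -> list[str]:
    """Return labels such as '2025_Q4' for the n most recent completed quarters, newest first."""
    year = today.year
    quarter = (today.month - 1) // 3 + 1  # quarter containing `today` (assumed not yet published)
    labels = []
    for _ in range(n):
        quarter -= 1
        if quarter == 0:  # step back across a year boundary
            year -= 1
            quarter = 4
        labels.append(f"{year}_Q{quarter}")
    return labels
```

Each label maps to data/raw/form_d/<label>/, so the "files exist, prompt before re-download" check can be a simple directory test.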

5.3 `import_form_d.py`

Fork from: fcto-pipeline/import_prospects.py (for structure)
Genuinely new parsing logic for tab-delimited SEC data with cross-file joins.

Logic:

Load all OFFERING.tsv files across quarters
Filter rows where INVESTMENTFUNDTYPE == "Venture Capital Fund"
Join with RELATEDPERSONS.tsv on ACCESSIONNUMBER to get managing partner names/addresses
Join with FORMDSUBMISSION.tsv on ACCESSIONNUMBER for filing dates
Deduplicate by entity name (or CIK if available); keep most recent filing per fund
Generate prospect_id (hash of normalized fund name + state)
Extract fields: fund_name, city, state, phone, managing_partners (pipe-separated names), fund_size_range (from revenue range or offering amount), filing_date
Output: data/intermediate/vc_funds_imported.csv
Print quality report: total funds, with phone %, geographic distribution, filing date distribution

Edge cases:

Some funds file multiple amendments per quarter; keep only the most recent
RELATEDPERSONS may have multiple people per fund; concatenate as pipe-separated list
Phone may be in various formats; normalize to digits-only for storage, display formatted
Some OFFERING rows may have NULL entity names; skip these
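
A minimal sketch of the filter, join, and dedup steps above, using only the standard library. Column names follow section 4.1. The helper names, the SHA-1 based ID, and the assumption that FILEDDATE sorts correctly as an ISO string (YYYY-MM-DD) are illustrative; real SEC files may need date parsing first.

```python
import csv
import hashlib
import io
import re

VC_TYPE = "Venture Capital Fund"


def read_tsv(text: str) -> list[dict]:
    """Parse tab-delimited text (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def normalize_phone(raw) -> str:
    """Keep digits only for storage."""
    return re.sub(r"\D", "", raw or "")


def make_prospect_id(fund_name: str, state: str) -> str:
    """Stable ID from normalized fund name + state (hash choice is an assumption)."""
    name_key = re.sub(r"[^a-z0-9]+", " ", fund_name.lower()).strip()
    key = f"{name_key}|{state.strip().upper()}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


def import_funds(offerings, persons, submissions) -> list[dict]:
    """Filter VC funds, attach partners and filing dates, keep the latest filing per fund."""
    filed = {s["ACCESSIONNUMBER"]: s.get("FILEDDATE", "") for s in submissions}
    partners = {}
    for p in persons:
        parts = (p.get("FIRSTNAME"), p.get("MIDDLENAME"), p.get("LASTNAME"))
        full = " ".join(x for x in parts if x)
        if full:
            partners.setdefault(p["ACCESSIONNUMBER"], []).append(full)

    latest = {}
    for row in offerings:
        name = (row.get("ISSUERENTITYNAME") or "").strip()
        if row.get("INVESTMENTFUNDTYPE") != VC_TYPE or not name:
            continue  # non-VC fund, or NULL entity name
        acc = row["ACCESSIONNUMBER"]
        state = row.get("STATEORCOUNTRY") or ""
        record = {
            "prospect_id": make_prospect_id(name, state),
            "fund_name": name,
            "city": row.get("CITY") or "",
            "state": state,
            "phone": normalize_phone(row.get("PHONENUMBER")),
            "managing_partners": "|".join(partners.get(acc, [])),
            "filing_date": filed.get(acc, ""),
        }
        current = latest.get(record["prospect_id"])
        # Assumes ISO-formatted dates, so string comparison orders filings correctly.
        if current is None or record["filing_date"] > current["filing_date"]:
            latest[record["prospect_id"]] = record
    return list(latest.values())
```

Deduplicating on prospect_id rather than raw entity name means amendments with small punctuation differences collapse into one record.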

5.4 `import_free_databases.py`

Fork from: fcto-pipeline/import_prospects.py (column mapping pattern)

Logic:

Accept CSV files via CLI flags: --openvc PATH, --ramp PATH, --vcsheet PATH, --custom PATH
Column mappings per source (each source has different column names):

OPENVC_MAP = {
    "Investor Name": "fund_name",
    "Website": "website",
    "Preferred Contact": "preferred_contact",
    "Email": "contact_email",
    "Stage": "investment_stage",
    "Industry": "industry_focus",
    "Location": "location",  # needs parsing into city/state
}
RAMP_MAP = {
    "Name": "fund_name",
    "Website": "website",
    "Email": "contact_email",
    "Stage": "investment_stage",
    "Industry Focus": "industry_focus",
    "City": "city",
    "State": "state",
}
# etc.

Normalize to unified schema matching CRM_COLUMNS
Deduplicate by normalized fund_name + domain
Output: data/intermediate/vc_free_sources_imported.csv
Print source breakdown stats
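
The mapping step can be sketched as below. SAMPLE_MAP is a trimmed copy of OPENVC_MAP; the helper names and the "City, State" shape of the Location field are assumptions to verify against a real export.

```python
import re

# Trimmed copy of OPENVC_MAP for illustration.
SAMPLE_MAP = {
    "Investor Name": "fund_name",
    "Website": "website",
    "Email": "contact_email",
    "Location": "location",
}


def apply_column_map(row: dict, column_map: dict) -> dict:
    """Rename source columns to the unified schema; missing columns become ''."""
    return {target: (row.get(source) or "").strip() for source, target in column_map.items()}


def split_location(location: str) -> tuple[str, str]:
    """Split 'City, State' into (city, state); assumes a comma-separated value."""
    parts = [p.strip() for p in location.split(",") if p.strip()]
    if len(parts) >= 2:
        return parts[0], parts[1]
    return (parts[0], "") if parts else ("", "")


def domain_of(website: str) -> str:
    """Reduce a URL to a bare domain for the fund_name + domain dedup key."""
    host = re.sub(r"^https?://", "", website.strip().lower()).split("/")[0]
    return host[4:] if host.startswith("www.") else host
```

Normalizing the domain before deduplication keeps "https://www.example.com/team" and "example.com" from counting as two funds.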

5.5 `merge_sources.py`

New script. Light version exists in foundation pipeline but needs adaptation for multi-source VC data.

Logic:

Load vc_funds_imported.csv (SEC) and vc_free_sources_imported.csv (free databases)
Fuzzy-match on normalized fund name (strip “LLC”, “LP”, “Fund”, “Capital”, “Ventures” suffixes, lowercase, remove punctuation)
For matches: merge fields, preferring SEC for entity info (legal name, address, phone, filing data), free sources for contact info (email, preferred contact, investment stage)
For SEC-only records: keep as-is, flag data_sources: "sec_form_d"
For free-source-only records: keep as-is, flag with source name
Track provenance in data_sources column (comma-separated: “sec_form_d,openvc,ramp”)
Output: data/intermediate/vc_prospects_merged.csv
Print merge stats: matched, SEC-only, free-only, total
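
One possible shape for the name normalization and field-preference rules above. The 0.9 similarity threshold, the use of difflib, and the function names are assumptions; the suffix list is the one given in the logic.

```python
import re
from difflib import SequenceMatcher

SUFFIXES = {"llc", "lp", "fund", "capital", "ventures"}
SEC_FIELDS = ("fund_name", "city", "state", "phone", "filing_date")
CONTACT_FIELDS = ("contact_email", "preferred_contact", "investment_stage")


def normalize_fund_name(name: str) -> str:
    """Lowercase, drop punctuation, strip trailing legal/generic suffixes."""
    tokens = re.sub(r"[^a-z0-9\s]", "", name.lower()).split()
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def names_match(a: str, b: str, threshold: float = 0.9) -> bool:
    """Fuzzy match on normalized names (threshold is an assumed starting point)."""
    na, nb = normalize_fund_name(a), normalize_fund_name(b)
    return na == nb or SequenceMatcher(None, na, nb).ratio() >= threshold


def merge_pair(sec: dict, free: dict, free_source: str) -> dict:
    """Prefer SEC for entity fields and the free source for contact fields."""
    merged = {}
    for field in SEC_FIELDS:
        merged[field] = sec.get(field) or free.get(field, "")
    for field in CONTACT_FIELDS:
        merged[field] = free.get(field) or sec.get(field, "")
    merged["data_sources"] = f"sec_form_d,{free_source}"
    return merged
```

Stripping "Capital" and "Ventures" is aggressive ("Alpha Ventures LP" and "Alpha Capital" both reduce to "alpha"), so the merge stats are worth spot-checking for false matches; adding state or domain to the match key is one way to tighten it.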

5.6 `enrich_websites.py`

Fork from: solanasis-scripts/fcto-pipeline/enrich_websites.py — nearly identical, change keyword lists and scrape targets.

Changes from fCTO version:

Scrape additional pages: /team, /about, /portfolio, /platform, /founders, /resources, /contact
Use keyword categories from config: PLATFORM_KEYWORDS, CYBER_FUND_KEYWORDS, ENTERPRISE_SAAS_KEYWORDS, EARLY_STAGE_KEYWORDS, DEFUNCT_KEYWORDS
Extract email addresses from team/contact pages (regex: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})
Same caching mechanism (per-domain JSON in website_cache/)
Same Jina Reader API fallback for JS-heavy sites
Same concurrency/rate-limiting (5 concurrent, 0.5s delay)

New boolean columns added:

has_platform_team — any PLATFORM_KEYWORDS match
cyber_focus — any CYBER_FUND_KEYWORDS match
enterprise_focus — any ENTERPRISE_SAAS_KEYWORDS match
early_stage — any EARLY_STAGE_KEYWORDS match
appears_defunct — any DEFUNCT_KEYWORDS match
website_emails — comma-separated emails found on site
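
A small sketch of how scraped page text could be turned into these columns. The keyword lists are shortened copies of the config lists, and enrich_flags is a hypothetical helper name; the email regex is the one specified above.

```python
import re

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Shortened copies of the config keyword lists, for illustration only.
PLATFORM_KEYWORDS = [r"\bplatform\b", r"\boperating\s+partner\b"]
CYBER_FUND_KEYWORDS = [r"\bcybersecurity\b", r"\binfosec\b"]


def matches_any(text: str, patterns: list[str]) -> bool:
    """True if any regex pattern matches, case-insensitively."""
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)


def extract_emails(text: str) -> list[str]:
    """Return unique lowercased emails in first-seen order."""
    seen = []
    for email in EMAIL_RE.findall(text):
        email = email.lower()
        if email not in seen:
            seen.append(email)
    return seen


def enrich_flags(page_text: str) -> dict:
    """Compute a subset of the enrichment columns from scraped text."""
    return {
        "has_platform_team": matches_any(page_text, PLATFORM_KEYWORDS),
        "cyber_focus": matches_any(page_text, CYBER_FUND_KEYWORDS),
        "website_emails": ",".join(extract_emails(page_text)),
    }
```

The remaining flags (enterprise_focus, early_stage, appears_defunct) follow the same matches_any pattern with their own keyword lists.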

5.7 `enrich_emails.py`

New script. Inspired by foundation-pipeline/enrich_emails.py.

Logic:

Load enriched prospects
For prospects WITHOUT email (from merge or website scraping):
a. Check if website_emails were found in the enrich_websites step; if so, pick the best one (prefer partner name matches, then info@, then first found)
b. If still no email and a Hunter.io API key exists in .env: try a domain search (free tier: 25/month). Only for A-tier prospects.
c. If still no email: flag in the email_research_needed column for the manual research queue
Prefer: operating_partner@ > platform@ > info@ > first partner name
Track email source: email_source column (values: “openvc”, “ramp”, “website_scrape”, “hunter”, “manual”)
Output: updated data/intermediate/vc_prospects_enriched.csv
Print email coverage stats by tier

Environment variable: HUNTER_API_KEY in .env (optional; script works without it, just skips Hunter enrichment)
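
The selection order in step (a) and the Hunter gate in step (b) could look like the sketch below. It implements the step (a) order (partner name match, then info@, then first found); the function names and the 3-character minimum for name tokens are assumptions.

```python
def pick_best_email(emails: list[str], partner_names: list[str]) -> str:
    """Choose one email: partner-name match first, then info@, then the first found."""
    if not emails:
        return ""
    # Name tokens of 3+ characters, so middle initials do not cause false matches.
    tokens = {t.lower() for name in partner_names for t in name.split() if len(t) >= 3}
    for email in emails:
        local = email.split("@", 1)[0].lower()
        if any(tok in local for tok in tokens):
            return email
    for email in emails:
        if email.lower().startswith("info@"):
            return email
    return emails[0]


def should_try_hunter(tier: str, has_email: bool, api_key) -> bool:
    """Spend a Hunter.io free-tier search only on A-tier prospects that still lack an email."""
    return tier == "A" and not has_email and bool(api_key)
```

Because the free tier allows 25 searches per month, gating on tier before calling the API keeps the quota for the prospects most likely to be contacted.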

5.8 `score_prospects.py`

Fork from: solanasis-scripts/fcto-pipeline/score_prospects.py — same scoring engine structure, different signal evaluation logic.

Scoring model (18 signals):

SCORING = {
    # Segment signals (highest weight — determines fit)
    "has_platform_team": 18,        # Website mentions platform/value-creation/founder-success
    "cyber_security_focus": 15,     # Cybersecurity-focused fund (highest natural fit)
    "enterprise_saas_focus": 10,    # Enterprise/B2B SaaS fund (portfolio needs security)
    "early_stage_focus": 8,         # Pre-seed/Seed/Series A (startups most underserved)
 
    # Contact quality (determines reachability)
    "has_email": 8,                 # Found a contact email (critical for this pipeline)
    "has_website": 4,               # Has an active website
    "has_partner_names": 3,         # Named managing partners (from SEC RELATEDPERSONS)
    "has_phone": 2,                 # Has phone number (from SEC filing)
 
    # Fund characteristics
    "fund_size_sweet_spot": 6,      # $50M-$500M AUM (big enough to have portfolio, small enough to care)
    "fund_size_large": 3,           # $500M-$2B (has platform team, harder to reach)
    "recent_filing": 5,             # Filed in last 12 months (active fund)
 
    # Geographic
    "colorado_based": 5,            # Colorado HQ (local angle for Dmitri)
    "us_based": 2,                  # US-based (vs international)
 
    # Negative signals
    "no_us_presence": -15,          # International fund with no US office
    "appears_defunct": -10,         # No recent filings, dead website, shutdown language
    "mega_fund": -5,                # >$2B AUM (won't care about $2,500 assessment)
    "pure_fintech_fund": -3,        # Invests IN security/fintech companies (portfolio already knows this)
    "no_contact_path": -8,          # No email, no phone, no website = unreachable
}
 
TIERS = {
    "A": 40,   # High-priority: platform-oriented or cyber-focused funds with email
    "B": 25,   # Good fit: right sector, some contact path
    "C": 12,   # Maybe: has some signals but lower confidence
    # D = everything below 12
}

Expected tier distribution: A ~5-8%, B ~15-20%, C ~30-40%, D = remainder. With 4,000-6,000 total funds: A = 200-480, B = 600-1,200.

Template assignment: Uses TEMPLATE_RULES from config (priority order, first match wins). Same pattern as fCTO.
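
To make the arithmetic concrete, here is a minimal scoring sketch using the weights and thresholds above. It assumes each prospect has already been reduced to a dict of boolean signals keyed by the SCORING names; score_prospect and the breakdown format are illustrative, not the fCTO implementation.

```python
SCORING = {
    "has_platform_team": 18, "cyber_security_focus": 15,
    "enterprise_saas_focus": 10, "early_stage_focus": 8,
    "has_email": 8, "has_website": 4, "has_partner_names": 3, "has_phone": 2,
    "fund_size_sweet_spot": 6, "fund_size_large": 3, "recent_filing": 5,
    "colorado_based": 5, "us_based": 2,
    "no_us_presence": -15, "appears_defunct": -10, "mega_fund": -5,
    "pure_fintech_fund": -3, "no_contact_path": -8,
}
TIERS = {"A": 40, "B": 25, "C": 12}


def score_prospect(signals: dict) -> tuple[int, str, str]:
    """Sum the weights of the signals that fired and map the total to a tier."""
    fired = [(name, weight) for name, weight in SCORING.items() if signals.get(name)]
    score = sum(weight for _, weight in fired)
    tier = "D"  # everything below the C threshold
    for label, threshold in sorted(TIERS.items(), key=lambda kv: kv[1], reverse=True):
        if score >= threshold:
            tier = label
            break
    breakdown = ", ".join(f"{name}({weight:+d})" for name, weight in fired)
    return score, tier, breakdown
```

For example, platform team + cyber focus + email scores 18 + 15 + 8 = 41 (A), while platform team + email + recent filing scores 18 + 8 + 5 = 31 (B). A single strong segment signal does not reach A-tier without supporting contact or fund signals.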

5.9 `templates.py`

Fork from: solanasis-scripts/fcto-pipeline/templates.py — same structure, function signatures, voice rules. All email copy is new.

Voice rules (non-negotiable, from solanasis-voice-profile.md):

NO em dashes (use semicolons, periods, commas, parentheses)
NO “seamless”, “frictionless”, “audit”, “genuinely”, “SMBs”
Under 125 words (50-80 ideal)
End with a specific question (not generic CTA)
Plain text only, no HTML, no images, no tracking
Signature: Dmitri Sunshine\nFounder, Solanasis\nsolanasis.com | 303-900-8969

Template 1: Platform Team Portfolio Offer

When: A-tier, has platform/value-creation language on website
Subject: Portfolio security resource for {fund_name}'s teams
Body: Lead with the portfolio support angle. “I help early-stage teams avoid preventable security and operational mistakes before they become expensive distractions. I’d like to offer {fund_name}’s portfolio companies a discounted Startup Security Baseline; a 5-day assessment at $2,500 fixed rate, designed for post-funding teams. Your portfolio companies get operational resilience practices established early. You get a resource that makes your platform team more valuable. For more mature companies, we also run a 10-day full assessment. Would it make sense for me to send the one-pager to whoever manages portfolio support?”
~95 words

Template 2: CISO Fund Operator

When: Cyber-focused fund or CISO angel network
Subject: Operational resilience for your portfolio, {first_name}
Body: Peer-to-peer operator angle. “Fellow security practitioner here. I keep seeing the same pattern: funded startups skip basic security hygiene until a customer asks for a SOC 2 or a breach forces the conversation. By then, it costs 3x what it would have at Series A. Solanasis runs a 5-day Startup Security Baseline; access hygiene, real backup restore test, cloud posture review, remediation roadmap. $2,500 fixed fee for portfolio companies. Has this come up with any of your portfolio companies?”
~80 words

Template 3: Founder Safety Net

When: A- or B-tier funds with an early-stage focus that did not match Template 1 or 2 (per TEMPLATE_RULES)
Subject: Quick question about {fund_name}'s post-investment support
Body: Lighter touch, asks a question first. “Curious whether {fund_name} has a go-to resource when portfolio companies need cybersecurity or DR verification. Most startups don’t think about this until something breaks. We run a 5-day Startup Security Baseline; specifically designed for post-funding teams. $2,500 fixed fee, investor-ready deliverable. Full compliance assessment also available for more mature companies. Is this something that would be useful for your portfolio?”
~70 words

Template 4: Colorado VC Peer

When: Colorado-based fund
Subject: Colorado resource for your portfolio teams
Body: Local peer angle. “Fellow Colorado business owner here. I run Solanasis; we do cybersecurity assessments and operational resilience work. I’m looking to partner with a small number of Colorado-based funds to offer their portfolio companies a discounted security baseline. 5-day turnaround, $2,500 fixed fee, investor-ready deliverable. Full assessment available for companies further along. Coffee or a quick call?”
~60 words

Template 5: Generic VC Outreach

When: B/C-tier, no strong segment signal
Subject: Security baseline offer for portfolio companies
Body: Shortest, most direct. “We run a 5-day Startup Security Baseline for early-stage companies. Backup verification, access hygiene, security posture review. $2,500 fixed fee for portfolio companies; investor-ready report. I’d like to offer this to {fund_name}’s portfolio. Worth a look?”
~45 words

Follow-Up Day 5:

Subject: Quick follow-up; portfolio security resource
Body: References original offer. Lists what baseline covers in 3-4 bullets. “Want the one-pager?”

Follow-Up Day 10:

Subject: Last note; portfolio resilience resource
Body: One paragraph graceful close. “If any of your portfolio companies ever need a security baseline done quickly, we’re here.”

Functions (same signatures as fCTO):

extract_first_name(full_name) → first name from “Jeffrey B Smith”
assign_template(row) → variant ID based on TEMPLATE_RULES
build_personalized_first_line(row) → “Hi {first_name}, I came across {fund_name} while researching {segment} investors in {location}.”
format_template(variant_id, **kwargs) → (subject, body)
format_follow_up(day, **kwargs) → (subject, body)
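
A sketch of extract_first_name and assign_template against the TEMPLATE_RULES structure from config.py. How each rule condition maps to a row column (for example, "colorado" meaning state == "CO") is an assumption, as is the expectation that boolean columns are parsed to real booleans before this step.

```python
TEMPLATE_RULES = [
    (1, "Platform Team Portfolio Offer", {"has_platform": True, "tier_in": ("A",)}),
    (2, "CISO Fund Operator", {"cyber_focus": True}),
    (3, "Founder Safety Net", {"tier_in": ("A", "B"), "early_stage": True}),
    (4, "Colorado VC Peer", {"colorado": True}),
    (5, "Generic VC Outreach", {}),  # fallback
]

# Assumed mapping from rule condition keys to row columns.
CONDITION_CHECKS = {
    "has_platform": lambda row: bool(row.get("has_platform_team")),
    "cyber_focus": lambda row: bool(row.get("cyber_focus")),
    "early_stage": lambda row: bool(row.get("early_stage")),
    "colorado": lambda row: row.get("state") == "CO",
}


def extract_first_name(full_name: str) -> str:
    """'Jeffrey B Smith' -> 'Jeffrey'; empty input returns ''."""
    parts = full_name.strip().split()
    return parts[0] if parts else ""


def rule_matches(row: dict, conditions: dict) -> bool:
    """True when every condition in the rule holds for the row."""
    for key, expected in conditions.items():
        if key == "tier_in":
            if row.get("prospect_tier") not in expected:
                return False
        elif CONDITION_CHECKS[key](row) != expected:
            return False
    return True


def assign_template(row: dict) -> int:
    """Return the variant ID of the first matching rule (priority order)."""
    for variant_id, _name, conditions in TEMPLATE_RULES:
        if rule_matches(row, conditions):
            return variant_id
    return 5
```

Because the first match wins, a B-tier fund with platform language falls through rule 1 and lands on whichever later rule applies, which is worth covering in test_score_prospects.py.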

5.10 `generate_outreach_csv.py`

Fork from: solanasis-scripts/fcto-pipeline/generate_outreach_csv.py — nearly identical.

Changes:

CRM_COLUMNS use VC-specific fields (fund_name instead of company_name, managing_partners, fund_size_range, etc.)
Pipeline report text references VC-specific next steps
Outreach queue filters: A/B-tier, has_email, outreach_status != “sent”

Output:

data/output/vc_outreach_full.csv — all scored prospects
data/output/vc_outreach_queue.csv — A/B-tier with email, ready for daily brief
data/output/pipeline_report.md — summary stats
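
The queue filter described above can be sketched as follows. Rows are assumed to be dicts read from the scored CSV, with partner_emails as the email column; sorting by score (highest first) is an added assumption, not something the fCTO version is confirmed to do.

```python
def build_outreach_queue(rows: list[dict]) -> list[dict]:
    """Keep A/B-tier prospects with an email that have not been sent yet."""
    queue = [
        row for row in rows
        if row.get("prospect_tier") in ("A", "B")
        and row.get("partner_emails")
        and row.get("outreach_status") != "sent"
    ]
    # Assumed ordering: highest score first, so the daily brief pulls the best prospects.
    return sorted(queue, key=lambda row: int(row.get("prospect_score") or 0), reverse=True)
```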

5.11 `generate_daily_outreach.py`

Fork from: solanasis-scripts/fcto-pipeline/generate_daily_outreach.py — same tracker, same markdown format.

Changes:

Output file: solanasis-docs/daily-outreach/YYYY-MM-DD-vc.md
Prospect cards include: fund name, segment (platform/cyber/enterprise/early-stage), fund size, managing partners, contact info
Research queue section flags A/B-tier funds needing email discovery
Follow-up templates from templates.py (Day 5 + Day 10)
Safety reminder in header: “Plain text only. Max 5-10/day total across ALL pipelines.”

CLI interface (same as fCTO):

python generate_daily_outreach.py              # Default 3 prospects
python generate_daily_outreach.py --count 5    # More prospects
python generate_daily_outreach.py --mark-sent prospect_id1,prospect_id2
python generate_daily_outreach.py --status     # Pipeline stats
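
A minimal argparse sketch that accepts the four invocations above. The flag names and the default of 3 come from this section; build_parser and parse_mark_sent are hypothetical helper names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser covering the default run, --count, --mark-sent, and --status."""
    parser = argparse.ArgumentParser(description="Generate the daily VC outreach brief.")
    parser.add_argument("--count", type=int, default=3, help="number of prospects in the brief")
    parser.add_argument("--mark-sent", default="", help="comma-separated prospect IDs to mark as sent")
    parser.add_argument("--status", action="store_true", help="print pipeline stats and exit")
    return parser


def parse_mark_sent(value: str) -> list[str]:
    """Split 'id1,id2' into a clean list, ignoring blanks and stray spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]
```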

6. Scoring Model Rationale

The scoring model prioritizes two dimensions:

Dimension 1: Does this fund have a reason to care about portfolio security?

Platform team (+18) = explicit portfolio-support function; they’re paid to help portfolio companies
Cyber focus (+15) = they understand the problem natively; shortest sales cycle
Enterprise SaaS (+10) = their portfolio companies sell to enterprises that demand security posture
Early stage (+8) = their portfolio companies are most likely to lack security practices

Dimension 2: Can we actually reach them?

Email (+8) = the only realistic channel for cold outreach at 5-10/day manual volume
Website (+4) = indicates an active fund (vs. a filing-only entity)
Partner names (+3) = enables personalization
Phone (+2) = backup channel for follow-up

Negative signals prevent wasted effort:

No US presence (-15) = the US compliance context doesn’t translate, and CAN-SPAM is a US law that does not map cleanly onto international recipients
Defunct (-10) = no point reaching out
Mega fund (-5) = a $2,500 assessment is noise to a $5B fund’s operations
Pure fintech (-3) = their portfolio companies are literally in the security space

7. `/for/startups` Landing Page

Fork from: solanasis-site/src/pages/for/foundations.astro

URL: /for/startups

Sections (mirroring /for/foundations structure):

Hero — “Startup Security Baseline. Cybersecurity practices before they become expensive problems.” CTA: “Get Started” → /#compliance-assessment
Pain Points (4 cards):
- “No one owns security” — Everyone’s too busy building product
- “Customer due diligence” — Enterprise customers will ask; better to be ready
- “Investor confidence” — Show your board you’re not a liability
- “Preventable breaches” — The basics stop 90% of incidents
Two-Tier Assessment (2-column):
- Left: SSB ($2,500 portfolio / $3,500 direct, 5 days, scope bullets)
- Right: Full CRA ($5,000 portfolio / standard, 10 days, scope bullets)
Who This Is For (4 cards):
- Seed-stage startups (5-20 employees)
- Series A/B companies (20-100 employees)
- VC portfolio companies (any stage)
- Startups approaching enterprise sales
How We Work (5 steps):
- 15-min scoping call → Access provisioning → Assessment execution → Report delivery → Remediation roadmap review
FAQ (4 items):
- “What’s the difference between SSB and full CRA?”
- “Do you work with pre-revenue startups?”
- “What do we need to provide for the assessment?”
- “Can our VC arrange this for multiple portfolio companies?”
Contact CTA — “Schedule a 15-minute call” → /contact

Nav + Footer: Add “Startups” to Industries dropdown (desktop + mobile) and footer.

Voice: 3 signature terms (blind spot, false comfort, risk debt), 1 transition (Here’s the thing), parenthetical aside, 2+ semicolons.

Tests: 7 startup-specific + 2 cross-linking tests in vertical-pages.spec.ts (mirrors foundation test pattern).

8. Testing Specification (30-40 pytest tests)

8.1 `tests/conftest.py`

Fixtures (pattern from foundation-pipeline/tests/conftest.py):

tmp_data_dir — creates raw/intermediate/output subdirectories in temp dir
sample_offering_tsv — 10 rows: 6 VC funds (various sizes, stages) + 4 non-VC (hedge fund, PE, real estate, other)
sample_relatedpersons_tsv — managing partner data for the 6 VC funds (12 people total, ~2 per fund)
sample_formdsubmission_tsv — filing dates and submission types for all 10
sample_openvc_csv — 5 rows of mock OpenVC data (3 overlap with SEC, 2 unique)
sample_scored_prospects — 10 rows with pre-computed scores and tiers (2 A, 3 B, 3 C, 2 D)

8.2 `tests/test_import_form_d.py` (~10 tests)

Filters only INVESTMENTFUNDTYPE == "Venture Capital Fund" (expects 6 of 10)
Joins OFFERING + RELATEDPERSONS correctly on ACCESSIONNUMBER
Handles missing/null RELATEDPERSONS fields gracefully
Deduplicates by fund name (keeps most recent filing)
Extracts phone from SEC filing and normalizes format
Handles multi-quarter data (same fund files amendments)
Generates stable prospect_id from name + state
Rejects rows with NULL entity names
Produces correct column schema in output CSV
Quality report counts match expectations

8.3 `tests/test_score_prospects.py` (~10 tests)

A-tier scoring: platform team (18) + cyber focus (15) + email (8) = 41 (A; threshold 40)
B-tier scoring: enterprise focus (10) + email (8) + website (4) + recent filing (5) = 27 (B; 25-39)
C-tier scoring: early-stage focus (8) + website (4) + US-based (2) = 14 (C; 12-24)
D-tier scoring: defunct international fund = <12 (D)
Negative signals reduce score correctly (each one)
Tier thresholds produce expected distribution on sample data
Template assignment matches segment (platform → Template 1, cyber → Template 2, etc.)
Fallback template (5) assigned when no rules match
Score breakdown string is human-readable
Empty/null fields don’t crash scoring

8.4 `tests/test_templates.py` (~9 tests)

Each template variant (1-5) renders without KeyError
Personalized first lines work with partial data (name only, name+fund, full)
NO em dashes (— or –) in any template output
NO forbidden words (“seamless”, “frictionless”, “audit”, “genuinely”, “SMBs”) in any template
All template bodies are under 125 words
All subject lines are under 60 characters
mailto: URL generation produces valid URLs under 2000 chars
Follow-up templates (Day 5, Day 10) render correctly
Signature block is present in all initial templates
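
Several of these checks can share two small helpers. The sketch below assumes a voice_violations function that returns a list of problems and a build_mailto function based on urllib.parse.quote; both names are hypothetical, and the rules are the ones from section 5.9.

```python
import re
from urllib.parse import quote

FORBIDDEN_WORDS = ("seamless", "frictionless", "audit", "genuinely", "smbs")


def voice_violations(body: str) -> list[str]:
    """Return the voice-rule problems found in a template body (empty list = clean)."""
    problems = []
    if "\u2014" in body or "\u2013" in body:  # em dash or en dash
        problems.append("dash")
    lowered = body.lower()
    for word in FORBIDDEN_WORDS:
        if re.search(rf"\b{re.escape(word)}\b", lowered):
            problems.append(f"forbidden:{word}")
    if len(body.split()) >= 125:  # rule: under 125 words
        problems.append("too_long")
    return problems


def build_mailto(to: str, subject: str, body: str) -> str:
    """Build a mailto: link with a percent-encoded subject and body."""
    return f"mailto:{to}?subject={quote(subject)}&body={quote(body)}"
```

In the tests, each rendered template can be asserted with voice_violations(body) == [] and len(build_mailto(...)) < 2000.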

8.5 `tests/test_merge_sources.py` (~5 tests)

SEC + OpenVC merge deduplicates correctly (3 overlapping records become 3, not 6)
Prefers SEC entity name over free-source name
Prefers free-source email over empty SEC email
Source provenance tracked correctly (“sec_form_d,openvc”)
One-sided records preserved (SEC-only and OpenVC-only)

9. Additional Deliverables

9.1 SSB One-Pager PDF

Location: solanasis-site/public/downloads/solanasis-startup-security-baseline.pdf
Content: SSB product overview, two-tier pricing, scope bullets, “How it works” (5 steps), Solanasis contact info
Design: match existing PDF assets (clean, professional, Solanasis brand)
Generation: python-pptx → PDF (same pipeline as pitch deck) or direct PDF via reportlab

9.2 Operational Playbook

Location: solanasis-docs/playbooks/VC_Portfolio_Outreach_Playbook.md
Content: Setup steps, daily ritual, 30-day pilot design, compliance checklist, success metrics, what to do when a VC responds

9.3 Short URLs

go.solanasis.com/startup-security → /for/startups
go.solanasis.com/ssb → one-pager PDF
Create via short.io API (tag: claude-managed)
Log in solanasis-docs/operations/short-url-tracking-log.md

9.4 Updated Daily Outreach Ritual

Update Foundation_Daily_Outreach_Ritual.md to reference VC pipeline
Add python generate_daily_outreach.py command for VC pipeline
Note: capacity allocation across pipelines to be decided after prospect quality assessed

10. Compliance Requirements

From the existing VC ecosystem playbook and cold outreach cheat sheet:

CAN-SPAM: Accurate sender identity (Dmitri Sunshine, Solanasis), physical address, functioning opt-out mechanism. CAN-SPAM makes no exception for B2B email.
Domain safety: Max 5-10 emails/day from solanasis.com across ALL pipelines. Plain text only, no HTML, no tracking pixels, no images. Stagger sends 2+ hours apart.
LinkedIn: No scraping, no automation, no prohibited extensions. Manual only if used at all.
Attachments: No attachments in first email. Offer to send one-pager in reply.
Red flags: 3+ bounces in one day → stop immediately. Spam complaints → investigate and adjust.
Manual sending only from solanasis.com (no Instantly, no GMass, no automation tools for primary domain)
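
Because sending stays manual, the domain-safety rules above are easiest to honor with a quick pre-send check. The sketch below is one possible encoding, not part of the specified pipeline: `may_send` is a hypothetical helper, the cap uses the upper end of the 5-10/day range, and "2+ hours apart" is read as a minimum gap since the most recent send.

```python
from datetime import datetime, timedelta

MAX_EMAILS_PER_DAY = 10        # upper end of 5-10/day, all pipelines combined
MIN_GAP = timedelta(hours=2)   # stagger sends 2+ hours apart
BOUNCE_STOP_THRESHOLD = 3      # 3+ bounces in one day: stop immediately


def may_send(now: datetime, sent_today: list[datetime],
             bounces_today: int) -> tuple[bool, str]:
    """Decide whether one more manual send stays within the safety rules."""
    if bounces_today >= BOUNCE_STOP_THRESHOLD:
        return False, "stop: 3+ bounces today"
    if len(sent_today) >= MAX_EMAILS_PER_DAY:
        return False, "stop: daily cap reached"
    if sent_today and now - max(sent_today) < MIN_GAP:
        return False, "wait: last send was under 2 hours ago"
    return True, "ok"
```

`sent_today` should include sends from every pipeline sharing solanasis.com, since the daily cap is domain-wide.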

11. 30-Day Pilot Design

Week 1: Infrastructure + First Sends

Build and test all pipeline scripts
Download SEC Form D data + at least one free database (OpenVC or Ramp)
Run full pipeline end-to-end
Review top 20 A-tier prospects for quality
Create SSB one-pager PDF
Build /for/startups landing page
Send first 15-20 emails

Week 2: Expand + Learn

Send 20-30 more emails
Track which segments respond (platform teams vs. GPs vs. operating partners)
Identify which template variant gets the highest reply rate (open rates cannot be measured, since emails are plain text with no tracking pixels)
Process Day 5 follow-ups from Week 1

Week 3: Refine

Review response quality (not just reply rate)
Tighten subject lines and messaging based on data
If any VC meeting booked, prepare portfolio presentation
Process Day 10 follow-ups from Week 1

Week 4: Decide

Total expected: 60-80 VC emails over 4 weeks (note: 5-10/day limit is shared across ALL pipelines; VC allocation decided post-build based on prospect quality)
Expected reply rate: 3-8% for cold B2B email to VCs (2-6 replies)
Expected meeting rate: 1-3% (1-2 meetings)
Decision: scale VC channel, pivot to founder-direct (Motion B), or adjust offer/pricing

Success Metrics

Minimum viable: 1 meeting booked from 80 emails (1.25% meeting rate)
Good: 3+ meetings, 1 pilot SSB delivered via VC introduction
Great: Ongoing portfolio relationship with 1+ fund, repeat referrals
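
The Week 4 estimates and the minimum-viable threshold follow from simple range arithmetic, reproduced here as a sanity check. The helper name `expected_range` is illustrative; the inputs are the figures stated above.

```python
def expected_range(emails: tuple[int, int],
                   rate: tuple[float, float]) -> tuple[float, float]:
    """Multiply the low ends and the high ends of two ranges."""
    return round(emails[0] * rate[0], 1), round(emails[1] * rate[1], 1)


EMAILS = (60, 80)  # total VC emails over the 4-week pilot

replies = expected_range(EMAILS, (0.03, 0.08))
meetings = expected_range(EMAILS, (0.01, 0.03))
min_viable_rate = round(1 / 80 * 100, 2)  # 1 meeting from 80 emails, in percent

print(replies)          # (1.8, 6.4)  -> roughly 2-6 replies
print(meetings)         # (0.6, 2.4)  -> roughly 1-2 meetings
print(min_viable_rate)  # 1.25
```

At this volume the low end of the meeting estimate is below one, so a pilot that books zero meetings is within the expected range and not by itself evidence that the channel fails.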

12. Implementation Sequence

Build in this order (dependencies flow top-to-bottom):

Step	Script	Effort	Fork Source	Notes
1	`config.py`	30 min	`fcto-pipeline/config.py`	All new content, same structure
2	`download_form_d.py`	1 hr	New (pattern from `download_bmf.py`)	Then run it to get data
3	`import_form_d.py` + tests	1.5 hr	New (TSV parsing + joins)	Core data processing
4	`import_free_databases.py`	1 hr	`import_prospects.py` pattern	Can parallelize with step 3
5	`merge_sources.py` + tests	45 min	New (fuzzy match)	Depends on 3 + 4
6	`enrich_websites.py`	45 min	`fcto-pipeline/enrich_websites.py`	Fork, change keywords. Then run (1-2 hr execution)
7	`enrich_emails.py`	1 hr	New (inspired by foundation)	Depends on 6
8	`score_prospects.py` + tests	45 min	`fcto-pipeline/score_prospects.py`	Fork, change signal evaluation
9	`templates.py` + tests	1.5 hr	`fcto-pipeline/templates.py`	Writing email copy is the bottleneck
10	`generate_outreach_csv.py`	30 min	`fcto-pipeline/generate_outreach_csv.py`	Mostly fork
11	`generate_daily_outreach.py`	45 min	`fcto-pipeline/generate_daily_outreach.py`	Mostly fork
12	`requirements.txt`	5 min	Copy from fCTO
13	`/for/startups` landing page	2 hr	`for/foundations.astro`	Fork, new content
14	SSB one-pager PDF	1 hr	New
15	Operational playbook	45 min	Pattern from existing playbooks
16	Short URLs	10 min	short.io API

Total estimated effort: ~12-14 hours
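
As a cross-check on that estimate, the per-step effort values from the table can be summed directly. The minutes below are copied from the table; treating steps 3 and 4 as fully parallel is an assumption used only to show the lower figure.

```python
# (step, minutes) copied from the Effort column of the table above.
STEP_MINUTES = [
    (1, 30), (2, 60), (3, 90), (4, 60), (5, 45), (6, 45), (7, 60), (8, 45),
    (9, 90), (10, 30), (11, 45), (12, 5), (13, 120), (14, 60), (15, 45), (16, 10),
]

total_minutes = sum(minutes for _, minutes in STEP_MINUTES)
print(total_minutes / 60)  # 14.0 hours if every step runs sequentially

# Assumption: step 4 (60 min) overlaps completely with step 3.
parallel_minutes = total_minutes - 60
print(parallel_minutes / 60)  # 13.0 hours
```

These totals cover build effort only; script run time noted in the table (for example the 1-2 hr execution of `enrich_websites.py`) is additional.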

13. Decisions Made

Pricing: $2,500 portfolio / $3,500 direct for SSB; full CRA at $5,000 portfolio discount
Scope: Both tiers offered (SSB 5-day entry + full CRA 10-day upgrade)
Landing page: Yes, build /for/startups
Capacity: Decide allocation after pipeline is built based on prospect quality
VC incentive: Discount = the value; no referral fees to VCs

14. Critical Source Files

These files must be read before implementation begins:

File	Purpose
`solanasis-scripts/fcto-pipeline/config.py`	Primary fork source for pipeline config
`solanasis-scripts/fcto-pipeline/templates.py`	Fork source for email templates + voice rules
`solanasis-scripts/fcto-pipeline/generate_daily_outreach.py`	Fork source for daily briefs
`solanasis-scripts/fcto-pipeline/score_prospects.py`	Fork source for scoring engine
`solanasis-scripts/fcto-pipeline/generate_outreach_csv.py`	Fork source for outreach CSV
`solanasis-scripts/fcto-pipeline/enrich_websites.py`	Fork source for website enrichment (543 lines, verified)
`solanasis-scripts/fcto-pipeline/import_prospects.py`	Reference for multi-source importer structure (column mapping pattern)
`solanasis-docs/playbooks/solanasis-investor-list-vc-ecosystem-outbound-playbook-handoff-2026-03-20.md`	Strategy playbook (segments, messaging, compliance)
`solanasis-docs/playbooks/Manual_Cold_Outreach_Cheat_Sheet.md`	Daily outreach safety rules
`solanasis-site/src/pages/for/foundations.astro`	Fork source for `/for/startups` landing page (verified exists)
`solanasis-scripts/foundation-pipeline/tests/conftest.py`	Pattern for pytest fixtures
`solanasis-docs/solanasis-voice-profile.md`	Voice rules for all copy

15. Fork Warnings and Out-of-Scope Items

Fork bugs to fix during implementation

generate_daily_outreach.py ~line 336: The fCTO version contains “SMBs” (forbidden word) and an em dash in a LinkedIn connection message template. When forking, replace “SMBs” with “small and mid-size organizations” and replace the em dash with a semicolon.
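
If the same two problems turn up in other forked strings, the fix can be applied mechanically. A minimal sketch follows; `clean_fork_copy` is a hypothetical helper, not an existing function in either pipeline.

```python
import re


def clean_fork_copy(text: str) -> str:
    """Apply the two fixes described above to copy inherited from the fCTO fork."""
    text = re.sub(r"\bSMBs\b", "small and mid-size organizations", text)
    # Replace an em dash (U+2014), plus any surrounding spaces, with "; ".
    return re.sub(r"\s*\u2014\s*", "; ", text)
```

The result still needs a manual read, since a semicolon does not suit every sentence where the dash appeared.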

Explicitly out of scope (defer to post-pilot)

migrate_to_baserow.py: The fCTO pipeline has a Baserow migration script. The VC pipeline does NOT need this in Phase 1. If the pilot succeeds (Week 4 decision), add Baserow migration as a Phase 2 item. Table name would be “VC Partners.”
discover_prospects.py: Not needed; SEC Form D bulk data replaces web discovery.
Motion B (founder-direct outreach): Targeting portfolio companies directly is a separate Phase 2 effort that would require Crunchbase or PitchBook data for portfolio company lists.

SCORING dict in config.py

The config.py skeleton in section 5.1 shows SCORING = { ... }. The full scoring dict is defined in section 5.8. When implementing, copy the complete dict from section 5.8 into config.py.

Template 1 word count

Template 1 is ~95 words (within the 125-word limit but above the 50-80 ideal). The personalized first line adds ~20 words. If total exceeds 125 after personalization, trim the “For more mature companies” sentence.
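
The trimming rule can be expressed as a small post-render step. The sketch below assumes the optional sentence begins with the phrase "For more mature companies" and ends at its first period; `fit_to_limit` is a hypothetical name, and the real template text may need a more careful sentence boundary.

```python
import re

MAX_WORDS = 125
OPTIONAL_SENTENCE_START = "For more mature companies"


def fit_to_limit(first_line: str, body: str) -> str:
    """Join first line and body; drop the optional sentence if over the limit."""
    full = f"{first_line}\n\n{body}"
    if len(full.split()) <= MAX_WORDS:
        return full
    # Remove the optional sentence, from its opening phrase to the first period.
    pattern = rf"\s*{re.escape(OPTIONAL_SENTENCE_START)}[^.]*\."
    trimmed = re.sub(pattern, "", body, count=1)
    return f"{first_line}\n\n{trimmed}"
```

If the result is still over 125 words after trimming, the word-count test in `tests/test_templates.py` should fail rather than silently truncate further.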
