PRD: VC Portfolio Discount Cold Outreach Pipeline

Owner: Dmitri Zasage, Solanasis LLC
Date: 2026-03-22
Status: Ready for implementation (senior-reviewed, all must-fix items resolved)
Execution note: This PRD is designed to be self-contained. It can be handed to another Claude Code instance on a different server with access to the same repos (solanasis-scripts/, solanasis-docs/, solanasis-site/).
Senior review: APPROVED WITH NOTES (2026-03-22). All 4 must-fix items resolved in this version.


1. Problem Statement

Solanasis provides cybersecurity assessments and operational resilience services to small and mid-size organizations. Early-stage startups are chronically underserved in this space; they skip basic security hygiene until a customer asks for SOC 2 or a breach forces the conversation. By then, it costs 3x what it would have at Series A.

VCs have a vested interest in preventing this. Many funds now have portfolio-support teams, platform functions, or operating partners whose job is to help portfolio companies avoid preventable mistakes. Solanasis can position itself as a go-to resource these teams recommend to portfolio companies.

The play: Reach out to VCs with a portfolio discount offer. “We’ll give your portfolio companies a discounted Startup Security Baseline at $2,500 (normally $3,500). Five-day turnaround, investor-ready deliverable. Helps your companies get proper cybersecurity practices in place before it becomes an expensive problem.”

This is a B2B2B channel: VC becomes a distribution channel; Solanasis delivers to portfolio companies.


2. Product Definition: Two-Tier Startup Offering

Tier 1: Startup Security Baseline (SSB)

Field | Value
--- | ---
Duration | 5 business days
Portfolio rate (via VC) | $2,500 fixed
Direct startup rate | $3,500 fixed
Target | Post-funding startups (Seed through Series B, 5-100 employees)
Scope | Access hygiene review, backup/DR verification (real restore test), cloud infrastructure posture scan, vendor/tool inventory, 30/60/90-day remediation roadmap
Deliverable | Investor-ready security posture summary (1 page) + detailed findings report (10-15 pages)
NOT included | Full compliance mapping (SOC 2, HIPAA, etc.), penetration testing, ongoing monitoring

Tier 2: Full Compliance Readiness Assessment (CRA)

Field | Value
--- | ---
Duration | 10 business days
Portfolio rate (via VC) | $5,000 (vs. $5K-$7.5K standard)
Direct startup rate | $5,000-$7,500
Scope | Everything in SSB + full regulatory compliance mapping, incident response documentation, board-ready assessment report

VC Incentive Model

  • The discount IS the value. No referral fee paid to VCs.
  • This avoids “paying VCs to sell to their own companies” optics.
  • Exception: if a VC explicitly wants referral partner status, the standard 10% referral fee applies, but the portfolio discount is removed.

Template Messaging

  • Emails lead with $2,500 SSB as the entry point
  • Mention full CRA as upgrade option for more mature portfolio companies
  • Frame as “a menu your platform team can share”

3. Technical Architecture

3.1 Overview

Build solanasis-scripts/vc-pipeline/ with 11 Python scripts + tests, following the Import → Enrich → Score → Template → Outreach → Daily Brief architecture used by the existing fCTO pipeline (solanasis-scripts/fcto-pipeline/).

Note: The fCTO pipeline has a discover_prospects.py (DuckDuckGo-based discovery). The VC pipeline does NOT need this; SEC Form D bulk data replaces the discovery step entirely. download_form_d.py + import_form_d.py + import_free_databases.py cover discovery and import in one pass. Do NOT fork discover_prospects.py.

3.2 Directory Structure

solanasis-scripts/vc-pipeline/
  config.py
  download_form_d.py
  import_form_d.py
  import_free_databases.py
  merge_sources.py
  enrich_websites.py
  enrich_emails.py
  score_prospects.py
  templates.py
  generate_outreach_csv.py
  generate_daily_outreach.py
  requirements.txt
  data/
    raw/
      form_d/           # SEC quarterly ZIPs and extracted TSVs
      openvc/           # OpenVC CSV exports
      ramp/             # Ramp VC Database exports
      other/            # Other free database exports
    intermediate/
      vc_funds_imported.csv
      vc_free_sources_imported.csv
      vc_prospects_merged.csv
      vc_prospects_enriched.csv
      vc_prospects_scored.csv
      website_cache/    # Per-domain JSON cache for scraped websites
    output/
      vc_outreach_full.csv
      vc_outreach_queue.csv
      pipeline_report.md
      outreach_tracker.json
  tests/
    __init__.py
    conftest.py
    test_import_form_d.py
    test_score_prospects.py
    test_templates.py
    test_merge_sources.py
    fixtures/
      sample_offering.tsv
      sample_relatedpersons.tsv
      sample_formdsubmission.tsv
      sample_openvc.csv

3.3 Dependencies

# requirements.txt — same as fCTO pipeline; httpx handles Hunter.io API calls too
httpx>=0.27.0
python-dotenv>=1.0.0
trafilatura>=2.0.0

4. Data Sources (Detailed)

4.1 SEC Form D Bulk Data

URL: https://www.sec.gov/data-research/sec-markets-data/form-d-data-sets

Every VC fund that raises money under Regulation D must file Form D with the SEC. Quarterly bulk downloads are available as ZIP files containing tab-delimited text files.

Files per quarterly ZIP:

  • FORMDSUBMISSION.tsv — Filing metadata (CIK, entity name, filing date, entity type)
  • OFFERING.tsv — Offering details (contains INVESTMENTFUNDTYPE field for filtering)
  • RELATEDPERSONS.tsv — Executive officers, directors, promoters (names + addresses)
  • RECIPIENTS.tsv — States/countries where securities were offered
  • SIGNATURES.tsv — Signatory names and dates

Key fields to extract:

From OFFERING.tsv:

  • ACCESSIONNUMBER (join key)
  • INVESTMENTFUNDTYPE (filter: “Venture Capital Fund”)
  • ISSUERENTITYNAME — Fund name
  • STREET1, STREET2, CITY, STATEORCOUNTRY, ZIPCODE — Address
  • PHONENUMBER — Phone
  • INDUSTRYGROUPTYPEVALUE — Industry classification
  • REVENUERANGE — Fund size indicator
  • TOTALOFFERINGAMOUNT — Target raise
  • TOTALAMOUNTSOLD — Amount raised to date
  • TOTALNUMBERALREADYINVESTED — Investor count

From RELATEDPERSONS.tsv:

  • ACCESSIONNUMBER (join key)
  • FIRSTNAME, MIDDLENAME, LASTNAME — Person name
  • STREET1, CITY, STATEORCOUNTRY, ZIPCODE — Person address
  • RELATIONSHIPLIST — Roles (Executive Officer, Director, Promoter)

From FORMDSUBMISSION.tsv:

  • ACCESSIONNUMBER (join key)
  • FILEDDATE — When filed (for recency scoring)
  • SUBMISSIONTYPE — New filing vs. amendment

Download strategy: Last 8 quarters (2 years of filings). SEC updates quarterly; filings after 5:30 PM ET on the last business day of a quarter roll to the next posting.

API access (supplemental):

  • EDGAR Full-Text Search: POST https://efts.sec.gov/LATEST/search-index with forms=D, startdt, enddt params. 10 req/sec, no auth, returns JSON.
  • data.sec.gov REST: https://data.sec.gov/submissions/CIK{number}.json for individual entity lookups. Must include User-Agent header with email.
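
A minimal sketch of the data.sec.gov entity lookup, assuming the CIK is zero-padded to 10 digits in the URL and that the User-Agent string from config.py is reused. `submissions_url` and `sec_headers` are hypothetical helper names, not existing pipeline functions; the network call is left commented out so the block runs offline.

```python
SEC_USER_AGENT = "Solanasis-Research/1.0 (contact@solanasis.com)"


def submissions_url(cik) -> str:
    """Build the data.sec.gov submissions URL for one entity (CIK as int or str)."""
    digits = str(cik).strip()
    if not digits.isdigit():
        raise ValueError(f"CIK must be numeric, got {cik!r}")
    # Assumption: the endpoint expects a 10-digit, zero-padded CIK.
    return f"https://data.sec.gov/submissions/CIK{int(digits):010d}.json"


def sec_headers() -> dict:
    """Headers for SEC requests; the User-Agent carries a contact email."""
    return {"User-Agent": SEC_USER_AGENT, "Accept-Encoding": "gzip, deflate"}


# Usage (needs network access and httpx installed):
# import httpx
# resp = httpx.get(submissions_url(1234567), headers=sec_headers(), timeout=15)
# resp.raise_for_status()
# entity = resp.json()
```

Keeping URL construction separate from the HTTP call makes the padding rule testable without touching the SEC servers.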

What SEC does NOT provide: Email addresses. This is the critical gap.

Expected yield: ~4,000-6,000 unique VC fund entities after deduplication across 8 quarters.

4.2 Supplemental Free Databases

Source | URL | Size | Has Email | Export Format | Notes
--- | --- | --- | --- | --- | ---
OpenVC | openvc.app/investor-database | 20,000+ investors | Yes (preferred contact method) | CSV | Best free source for emails. Manual export.
Ramp | ramp.com/vc-database | 1,200+ VCs, 2,200+ angels | Yes (verified) | CSV/Excel/JSON | Clean data. Email required to access.
FreeStartupFunding | freestartupfunding.com | 7,884 SEC-verified funds | Yes (some) | Web interface | Analyst verification layer on SEC data.
findfunding.vc | findfunding.vc | Pre-Seed to Series A | Some | Web interface | Open-source, VC-verified profiles.
VC Sheet | vcsheet.com | “Largest open list” | Some | Web interface | Filter by sector, stage, geography.

Acquisition order: SEC Form D first (bulk, structured), OpenVC second (has emails), Ramp third (clean CSV), others supplemental.

4.3 Email Gap Resolution Strategy

SEC Form D has zero email addresses. Resolution layers:

  1. Website team page scraping (30-40% hit rate, free) — enrich_websites.py scrapes /team, /about, /contact pages for email patterns
  2. OpenVC preferred contact methods — merge in emails from OpenVC CSV export
  3. Ramp verified emails — merge in from Ramp CSV
  4. Hunter.io free tier — 25 domain searches/month, reserve for A-tier only
  5. Manual research queue — daily brief flags A/B-tier funds without email for 5-minute research (check firm website, Google “{fund name} email contact”)

Realistic total coverage: 40-60% of all funds, higher (60-75%) for A/B-tier because established firms have public team pages.


5. Script Specifications

5.1 config.py

Fork from: solanasis-scripts/fcto-pipeline/config.py
Changes: All content is new; structure is identical.

# Key constants to define:
from pathlib import Path

# Paths (same pattern as fCTO)
PIPELINE_DIR = Path(__file__).parent
DATA_DIR = PIPELINE_DIR / "data"
RAW_DIR = DATA_DIR / "raw"
INTERMEDIATE_DIR = DATA_DIR / "intermediate"
OUTPUT_DIR = DATA_DIR / "output"
WEBSITE_CACHE_DIR = INTERMEDIATE_DIR / "website_cache"
DAILY_OUTREACH_DIR = PIPELINE_DIR.parent.parent / "solanasis-docs" / "daily-outreach"
 
# SEC Form D
FORM_D_DATA_URL = "https://www.sec.gov/data-research/sec-markets-data/form-d-data-sets"
SEC_USER_AGENT = "Solanasis-Research/1.0 (contact@solanasis.com)"
INVESTMENT_FUND_TYPE_FILTER = "Venture Capital Fund"
QUARTERS_TO_DOWNLOAD = 8  # Last 2 years
 
# Keyword categories (for website enrichment)
PLATFORM_KEYWORDS = [
    r"\bplatform\b", r"\bportfolio.?support\b", r"\bvalue.?creation\b",
    r"\boperating\s+partner\b", r"\bfounder\s+success\b",
    r"\bventure\s+studio\b", r"\bventure\s+creation\b",
    r"\bportfolio\s+growth\b", r"\bportfolio\s+services\b",
]
CYBER_FUND_KEYWORDS = [
    r"\bcybersecurity\b", r"\bcyber\b", r"\binfosec\b",
    r"\binformation\s+security\b", r"\bciso\b", r"\bsecurity\s+focused\b",
]
ENTERPRISE_SAAS_KEYWORDS = [
    r"\benterprise\b", r"\bb2b\s+saas\b", r"\binfrastructure\b",
    r"\bcloud\b", r"\bdevops\b", r"\bfintech\b",
]
EARLY_STAGE_KEYWORDS = [
    r"\bearly.stage\b", r"\bseed\b", r"\bpre.seed\b",
    r"\bseries\s+[ab]\b", r"\bstartup\b", r"\bfounder\b",
]
DEFUNCT_KEYWORDS = [
    r"\bformerly\b", r"\bacquired\b", r"\bshut\s*down\b",
    r"\bwind(?:ing)?\s+down\b", r"\bdissolved\b",
]
 
# Scoring model (see section 6)
SCORING = { ... }
 
# Tier thresholds
TIERS = {"A": 40, "B": 25, "C": 12}
 
# Template rules (priority order, first match wins)
TEMPLATE_RULES = [
    (1, "Platform Team Portfolio Offer", {"has_platform": True, "tier_in": ("A",)}),
    (2, "CISO Fund Operator", {"cyber_focus": True}),
    (3, "Founder Safety Net", {"tier_in": ("A", "B"), "early_stage": True}),
    (4, "Colorado VC Peer", {"colorado": True}),
    (5, "Generic VC Outreach", {}),  # fallback
]
 
# CRM output columns
CRM_COLUMNS = [
    "prospect_id", "fund_name", "fund_type", "city", "state",
    "phone", "website", "managing_partners", "partner_emails",
    "fund_size_range", "investment_stage", "industry_focus",
    "filing_date", "data_sources",
    "has_platform_team", "cyber_focus", "enterprise_focus", "early_stage",
    "prospect_score", "prospect_tier", "score_breakdown",
    "template_variant", "mailto_link",
    "outreach_status", "date_sent", "notes", "last_updated",
]
 
# Website scraping (same as fCTO)
SCRAPE_CONCURRENCY = 5
SCRAPE_DELAY = 0.5
SCRAPE_TIMEOUT = 15
JINA_READER_BASE = "https://r.jina.ai/"
 
# Tracker
TRACKER_FILE = OUTPUT_DIR / "outreach_tracker.json"  # data/output/, matching the directory structure in 3.2

5.2 download_form_d.py

Fork from: Pattern from foundation-pipeline/download_bmf.py and msp-pipeline/download_sec_adv.py
This is genuinely new: SEC Form D ZIPs are a different format than IRS BMF CSVs.

Logic:

  1. Navigate to SEC Form D data sets page
  2. Parse available quarterly ZIP links (pattern: quarterly data sets for last 8 quarters)
  3. Download each ZIP to data/raw/form_d/
  4. Extract tab-delimited files from each ZIP
  5. Print download summary (file count, row counts per file, total size)
  6. If files exist, prompt before re-download
  7. Set User-Agent: "Solanasis-Research/1.0 (contact@solanasis.com)" (SEC requires this)
  8. Handle 403/rate-limit gracefully with manual download fallback instructions

Important SEC-specific detail: The exact URL pattern for quarterly ZIPs may change. The script should first try to parse the data sets page for download links. If that fails, print manual instructions pointing the user to the SEC page.

Output: Raw TSV files in data/raw/form_d/YYYY_QN/ subdirectories.
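
The quarter selection and link parsing in steps 2-3 could be sketched as below. This is a sketch under assumptions: it treats the in-progress quarter as not yet posted, names folders with the `YYYY_QN` pattern from the Output line, and finds ZIPs by scanning for any `href` ending in `.zip`. `last_completed_quarters` and `find_zip_links` are hypothetical names, and the real page markup may differ, which is why the manual fallback in the logic above still matters.

```python
import re
from datetime import date


def last_completed_quarters(today: date, n: int = 8) -> list:
    """Return the n most recent completed quarters as 'YYYY_QN' labels, newest first."""
    year, quarter = today.year, (today.month - 1) // 3 + 1
    labels = []
    for _ in range(n):
        quarter -= 1  # the first step skips the quarter still in progress
        if quarter == 0:
            year, quarter = year - 1, 4
        labels.append(f"{year}_Q{quarter}")
    return labels


# Assumption: download links appear as plain double-quoted href attributes.
ZIP_HREF = re.compile(r'href="([^"]+\.zip)"', re.IGNORECASE)


def find_zip_links(html: str) -> list:
    """Extract .zip hrefs from page HTML, in page order, without duplicates."""
    seen, links = set(), []
    for href in ZIP_HREF.findall(html):
        if href not in seen:
            seen.add(href)
            links.append(href)
    return links
```

If `find_zip_links` returns an empty list, the script should fall through to the manual download instructions rather than guessing a URL pattern.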

5.3 import_form_d.py

Fork from: fcto-pipeline/import_prospects.py (for structure)
Genuinely new parsing logic for tab-delimited SEC data with cross-file joins.

Logic:

  1. Load all OFFERING.tsv files across quarters
  2. Filter rows where INVESTMENTFUNDTYPE == "Venture Capital Fund"
  3. Join with RELATEDPERSONS.tsv on ACCESSIONNUMBER to get managing partner names/addresses
  4. Join with FORMDSUBMISSION.tsv on ACCESSIONNUMBER for filing dates
  5. Deduplicate by entity name (or CIK if available); keep most recent filing per fund
  6. Generate prospect_id (hash of normalized fund name + state)
  7. Extract fields: fund_name, city, state, phone, managing_partners (pipe-separated names), fund_size_range (from revenue range or offering amount), filing_date
  8. Output: data/intermediate/vc_funds_imported.csv
  9. Print quality report: total funds, with phone %, geographic distribution, filing date distribution

Edge cases:

  • Some funds file multiple amendments per quarter; keep only the most recent
  • RELATEDPERSONS may have multiple people per fund; concatenate as pipe-separated list
  • Phone may be in various formats; normalize to digits-only for storage, display formatted
  • Some OFFERING rows may have NULL entity names; skip these
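
The filter, join, and dedupe steps above might look roughly like this. It is a sketch, not the final implementation: it uses the column names listed in section 4, assumes FILEDDATE values sort correctly as strings (for example ISO `YYYY-MM-DD`; real data may need parsing first), and deduplicates on the generated prospect_id. All function names are hypothetical.

```python
import csv
import hashlib
import io
import re

VC_FUND_TYPE = "Venture Capital Fund"


def read_tsv(text: str) -> list:
    """Parse tab-delimited text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE))


def normalize_phone(raw) -> str:
    """Keep digits only for storage."""
    return re.sub(r"\D", "", raw or "")


def make_prospect_id(fund_name: str, state: str) -> str:
    """Stable ID: hash of normalized fund name + state."""
    name = re.sub(r"[^a-z0-9]+", " ", fund_name.lower()).strip()
    return hashlib.sha1(f"{name}|{state.strip().upper()}".encode("utf-8")).hexdigest()[:12]


def import_funds(offering: list, persons: list, submissions: list) -> list:
    filed = {r["ACCESSIONNUMBER"]: r.get("FILEDDATE") or "" for r in submissions}

    # Group related persons by filing so they can be pipe-joined per fund.
    partners = {}
    for p in persons:
        parts = (p.get("FIRSTNAME"), p.get("MIDDLENAME"), p.get("LASTNAME"))
        name = " ".join(x.strip() for x in parts if x and x.strip())
        if name:
            partners.setdefault(p["ACCESSIONNUMBER"], []).append(name)

    latest = {}
    for row in offering:
        name = (row.get("ISSUERENTITYNAME") or "").strip()
        if row.get("INVESTMENTFUNDTYPE") != VC_FUND_TYPE or not name:
            continue  # non-VC filing, or missing entity name
        acc = row["ACCESSIONNUMBER"]
        state = (row.get("STATEORCOUNTRY") or "").strip()
        record = {
            "prospect_id": make_prospect_id(name, state),
            "fund_name": name,
            "city": (row.get("CITY") or "").strip(),
            "state": state,
            "phone": normalize_phone(row.get("PHONENUMBER")),
            "managing_partners": "|".join(partners.get(acc, [])),
            "filing_date": filed.get(acc, ""),
        }
        current = latest.get(record["prospect_id"])
        # Keep only the most recent filing per fund (string comparison; see assumption above).
        if current is None or record["filing_date"] > current["filing_date"]:
            latest[record["prospect_id"]] = record
    return list(latest.values())
```

Working on parsed rows rather than file paths keeps the join logic easy to exercise against the small fixtures in tests/fixtures/.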

5.4 import_free_databases.py

Fork from: fcto-pipeline/import_prospects.py (column mapping pattern)

Logic:

  1. Accept CSV files via CLI flags: --openvc PATH, --ramp PATH, --vcsheet PATH, --custom PATH
  2. Column mappings per source (each source has different column names):
OPENVC_MAP = {
    "Investor Name": "fund_name",
    "Website": "website",
    "Preferred Contact": "preferred_contact",
    "Email": "contact_email",
    "Stage": "investment_stage",
    "Industry": "industry_focus",
    "Location": "location",  # needs parsing into city/state
}
RAMP_MAP = {
    "Name": "fund_name",
    "Website": "website",
    "Email": "contact_email",
    "Stage": "investment_stage",
    "Industry Focus": "industry_focus",
    "City": "city",
    "State": "state",
}
# etc.
  3. Normalize to unified schema matching CRM_COLUMNS
  4. Deduplicate by normalized fund_name + domain
  5. Output: data/intermediate/vc_free_sources_imported.csv
  6. Print source breakdown stats
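
As an illustration of the column-mapping step, here is a small sketch that renames one source row into the unified schema and splits a location string. It assumes OpenVC locations look like "City, ST", which should be checked against a real export; `apply_column_map` and `split_location` are hypothetical helper names.

```python
OPENVC_MAP = {
    "Investor Name": "fund_name",
    "Website": "website",
    "Email": "contact_email",
    "Stage": "investment_stage",
    "Location": "location",
}


def apply_column_map(row: dict, mapping: dict, source: str) -> dict:
    """Rename source columns to unified names; missing columns become empty strings."""
    out = {target: (row.get(src) or "").strip() for src, target in mapping.items()}
    out["data_sources"] = source  # provenance tag used later by merge_sources.py
    return out


def split_location(location: str) -> tuple:
    """Split 'Denver, CO' into ('Denver', 'CO'); text without a comma is treated as city only."""
    parts = [p.strip() for p in location.split(",") if p.strip()]
    if len(parts) >= 2:
        return parts[0], parts[1]
    return (parts[0] if parts else ""), ""


row = {"Investor Name": " Acme Ventures ", "Email": "hi@acme.vc", "Location": "Denver, CO"}
unified = apply_column_map(row, OPENVC_MAP, "openvc")
unified["city"], unified["state"] = split_location(unified.pop("location"))
```

Defaulting missing columns to empty strings means a source with fewer fields still produces rows that fit the shared schema.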

5.5 merge_sources.py

New script. Light version exists in foundation pipeline but needs adaptation for multi-source VC data.

Logic:

  1. Load vc_funds_imported.csv (SEC) and vc_free_sources_imported.csv (free databases)
  2. Fuzzy-match on normalized fund name (strip “LLC”, “LP”, “Fund”, “Capital”, “Ventures” suffixes, lowercase, remove punctuation)
  3. For matches: merge fields, preferring SEC for entity info (legal name, address, phone, filing data), free sources for contact info (email, preferred contact, investment stage)
  4. For SEC-only records: keep as-is, flag data_sources: "sec_form_d"
  5. For free-source-only records: keep as-is, flag with source name
  6. Track provenance in data_sources column (comma-separated: “sec_form_d,openvc,ramp”)
  7. Output: data/intermediate/vc_prospects_merged.csv
  8. Print merge stats: matched, SEC-only, free-only, total
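
One way to sketch the name normalization and field-preference rules above, using the standard library's `difflib.SequenceMatcher` as the fuzzy matcher. The 0.9 similarity threshold and the helper names are assumptions for illustration, not values from the fCTO or foundation pipelines.

```python
import re
from difflib import SequenceMatcher

SUFFIXES = {"llc", "lp", "fund", "capital", "ventures"}
FREE_SOURCE_FIELDS = ("contact_email", "preferred_contact", "investment_stage")


def normalize_fund_name(name: str) -> str:
    """Lowercase, drop punctuation, strip trailing entity suffixes (keeps at least one word)."""
    words = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while len(words) > 1 and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)


def is_match(a: str, b: str, threshold: float = 0.9) -> bool:
    """Fuzzy comparison on normalized names; threshold is a tunable assumption."""
    ratio = SequenceMatcher(None, normalize_fund_name(a), normalize_fund_name(b)).ratio()
    return ratio >= threshold


def merge_records(sec: dict, free: dict) -> dict:
    """SEC wins for entity fields; free sources win for contact fields."""
    merged = dict(free)
    merged.update({k: v for k, v in sec.items() if v})  # non-empty SEC values override
    for field in FREE_SOURCE_FIELDS:
        if free.get(field):
            merged[field] = free[field]
    sources = [s for s in (sec.get("data_sources"), free.get("data_sources")) if s]
    merged["data_sources"] = ",".join(sources)
    return merged
```

Stripping "Capital" and "Ventures" makes matching aggressive, so two distinct funds sharing a root name can collide; adding state or domain as a tiebreaker before merging is worth considering.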

5.6 enrich_websites.py

Fork from: solanasis-scripts/fcto-pipeline/enrich_websites.py — nearly identical, change keyword lists and scrape targets.

Changes from fCTO version:

  • Scrape additional pages: /team, /about, /portfolio, /platform, /founders, /resources, /contact
  • Use keyword categories from config: PLATFORM_KEYWORDS, CYBER_FUND_KEYWORDS, ENTERPRISE_SAAS_KEYWORDS, EARLY_STAGE_KEYWORDS, DEFUNCT_KEYWORDS
  • Extract email addresses from team/contact pages (regex: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})
  • Same caching mechanism (per-domain JSON in website_cache/)
  • Same Jina Reader API fallback for JS-heavy sites
  • Same concurrency/rate-limiting (5 concurrent, 0.5s delay)

New boolean columns added:

  • has_platform_team — any PLATFORM_KEYWORDS match
  • cyber_focus — any CYBER_FUND_KEYWORDS match
  • enterprise_focus — any ENTERPRISE_SAAS_KEYWORDS match
  • early_stage — any EARLY_STAGE_KEYWORDS match
  • appears_defunct — any DEFUNCT_KEYWORDS match
  • website_emails — comma-separated emails found on site
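
The keyword flags and email extraction can be sketched on a single block of page text as follows. The keyword lists are shortened copies of the config.py lists and the email regex is the one given above; `matches_any` and `enrich_page_text` are hypothetical names, and scraping, caching, and the Jina fallback are out of scope here.

```python
import re

# Shortened copies of the config.py lists, for illustration only.
PLATFORM_KEYWORDS = [r"\bplatform\b", r"\bportfolio.?support\b", r"\bvalue.?creation\b"]
CYBER_FUND_KEYWORDS = [r"\bcybersecurity\b", r"\bcyber\b", r"\binfosec\b"]
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def matches_any(patterns: list, text: str) -> bool:
    """True if any regex pattern matches, case-insensitively."""
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)


def enrich_page_text(text: str) -> dict:
    """Derive boolean flags and a comma-separated email list from scraped text."""
    emails = sorted({e.lower() for e in EMAIL_RE.findall(text)})
    return {
        "has_platform_team": matches_any(PLATFORM_KEYWORDS, text),
        "cyber_focus": matches_any(CYBER_FUND_KEYWORDS, text),
        "website_emails": ",".join(emails),
    }
```

Lowercasing and de-duplicating emails before joining keeps the website_emails column stable across re-scrapes of the same page.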

5.7 enrich_emails.py

New script. Inspired by foundation-pipeline/enrich_emails.py.

Logic:

  1. Load enriched prospects
  2. For prospects WITHOUT email (from merge or website scraping):
    a. Check if website_emails were found in enrich_websites step; if so, pick best one (prefer partner name matches, then info@, then first found)
    b. If still no email and Hunter.io API key exists in .env: try domain search (free tier: 25/month). Only for A-tier prospects.
    c. If still no email: flag in email_research_needed column for manual research queue
  3. Prefer: operating_partner@ > platform@ > info@ > first partner name
  4. Track email source: email_source column (values: “openvc”, “ramp”, “website_scrape”, “hunter”, “manual”)
  5. Output: updated data/intermediate/vc_prospects_enriched.csv
  6. Print email coverage stats by tier

Environment variable: HUNTER_API_KEY in .env (optional; script works without it, just skips Hunter enrichment)
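
A sketch of the "pick best email" choice. The logic above gives two slightly different orderings; this version follows the step 3 ordering (operating_partner@, then platform@, then info@, then a partner-name match), with "first found" as the final fallback. That reading is an assumption, as is matching partners by last name inside the local part. `pick_best_email` is a hypothetical name.

```python
ROLE_PRIORITY = ("operating_partner", "platform", "info")


def pick_best_email(emails, partner_names=()) -> str:
    """Choose one contact email from candidates; returns '' when there are none."""
    cleaned = [e.strip().lower() for e in emails if e and e.strip()]
    if not cleaned:
        return ""

    # 1. Role mailboxes, in the priority order from step 3.
    by_local = {}
    for email in cleaned:
        by_local.setdefault(email.split("@", 1)[0], email)
    for role in ROLE_PRIORITY:
        if role in by_local:
            return by_local[role]

    # 2. An address whose local part contains a partner's last name (assumed heuristic).
    last_names = [n.split()[-1].lower() for n in partner_names if n.split()]
    for email in cleaned:
        local = email.split("@", 1)[0]
        if any(last in local for last in last_names):
            return email

    # 3. Otherwise the first address found.
    return cleaned[0]
```

Whichever ordering is chosen, recording the winner's origin in email_source keeps later coverage stats honest.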

5.8 score_prospects.py

Fork from: solanasis-scripts/fcto-pipeline/score_prospects.py — same scoring engine structure, different signal evaluation logic.

Scoring model (18 signals):

SCORING = {
    # Segment signals (highest weight — determines fit)
    "has_platform_team": 18,        # Website mentions platform/value-creation/founder-success
    "cyber_security_focus": 15,     # Cybersecurity-focused fund (highest natural fit)
    "enterprise_saas_focus": 10,    # Enterprise/B2B SaaS fund (portfolio needs security)
    "early_stage_focus": 8,         # Pre-seed/Seed/Series A (startups most underserved)
 
    # Contact quality (determines reachability)
    "has_email": 8,                 # Found a contact email (critical for this pipeline)
    "has_website": 4,               # Has an active website
    "has_partner_names": 3,         # Named managing partners (from SEC RELATEDPERSONS)
    "has_phone": 2,                 # Has phone number (from SEC filing)
 
    # Fund characteristics
    "fund_size_sweet_spot": 6,      # $50M-$500M AUM (big enough to have portfolio, small enough to care)
    "fund_size_large": 3,           # $500M-$2B (has platform team, harder to reach)
    "recent_filing": 5,             # Filed in last 12 months (active fund)
 
    # Geographic
    "colorado_based": 5,            # Colorado HQ (local angle for Dmitri)
    "us_based": 2,                  # US-based (vs international)
 
    # Negative signals
    "no_us_presence": -15,          # International fund with no US office
    "appears_defunct": -10,         # No recent filings, dead website, shutdown language
    "mega_fund": -5,                # >$2B AUM (won't care about $2,500 assessment)
    "pure_fintech_fund": -3,        # Invests IN security/fintech companies (portfolio already knows this)
    "no_contact_path": -8,          # No email, no phone, no website = unreachable
}
 
TIERS = {
    "A": 40,   # High-priority: platform-oriented or cyber-focused funds with email
    "B": 25,   # Good fit: right sector, some contact path
    "C": 12,   # Maybe: has some signals but lower confidence
    # D = everything below 12
}

Expected tier distribution: A ~5-8%, B ~15-20%, C ~30-40%, D = remainder. With 4,000-6,000 total funds: A = 200-480, B = 600-1,200.

Template assignment: Uses TEMPLATE_RULES from config (priority order, first match wins). Same pattern as fCTO.
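
To make the arithmetic concrete, here is a sketch of summing fired signals and mapping the total to a tier, using the weights and thresholds above. How each signal is detected is out of scope; the input is simply the set of signal names that fired. `score_prospect` and `assign_tier` are hypothetical names rather than the fCTO engine's actual functions.

```python
SCORING = {
    "has_platform_team": 18, "cyber_security_focus": 15,
    "enterprise_saas_focus": 10, "early_stage_focus": 8,
    "has_email": 8, "has_website": 4, "has_partner_names": 3, "has_phone": 2,
    "fund_size_sweet_spot": 6, "fund_size_large": 3, "recent_filing": 5,
    "colorado_based": 5, "us_based": 2,
    "no_us_presence": -15, "appears_defunct": -10, "mega_fund": -5,
    "pure_fintech_fund": -3, "no_contact_path": -8,
}
TIERS = {"A": 40, "B": 25, "C": 12}


def score_prospect(signals: set) -> tuple:
    """Return (total score, human-readable breakdown) for the signals that fired."""
    hits = [(name, weight) for name, weight in SCORING.items() if name in signals]
    total = sum(weight for _, weight in hits)
    breakdown = ", ".join(f"{name} {weight:+d}" for name, weight in hits)
    return total, breakdown


def assign_tier(score: int) -> str:
    """Highest tier whose threshold the score meets; D is everything below C."""
    for tier, floor in sorted(TIERS.items(), key=lambda kv: kv[1], reverse=True):
        if score >= floor:
            return tier
    return "D"


total, breakdown = score_prospect(
    {"has_platform_team", "has_email", "recent_filing", "has_website", "has_partner_names", "us_based"}
)
```

With these weights, a platform-team fund with an email, website, named partners, a recent filing, and a US base totals 18 + 8 + 4 + 3 + 5 + 2 = 40, which sits exactly on the A-tier threshold.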

5.9 templates.py

Fork from: solanasis-scripts/fcto-pipeline/templates.py — same structure, function signatures, voice rules. All email copy is new.

Voice rules (non-negotiable, from solanasis-voice-profile.md):

  • NO em dashes (use semicolons, periods, commas, parentheses)
  • NO “seamless”, “frictionless”, “audit”, “genuinely”, “SMBs”
  • Under 125 words (50-80 ideal)
  • End with a specific question (not generic CTA)
  • Plain text only, no HTML, no images, no tracking
  • Signature: Dmitri Sunshine\nFounder, Solanasis\nsolanasis.com | 303-900-8969
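
The voice rules lend themselves to an automated check. The sketch below is one possible lint, assuming the body is passed in without the signature block and that forbidden words should also catch inflected forms such as "audits" (prefix matching). `lint_email_body` is a hypothetical helper, not an existing function in templates.py.

```python
import re

FORBIDDEN_WORDS = ("seamless", "frictionless", "audit", "genuinely", "smbs")
MAX_WORDS = 125
EM_DASH = "\u2014"


def lint_email_body(body: str) -> list:
    """Return a list of voice-rule violations; an empty list means the body passes."""
    problems = []
    if EM_DASH in body:
        problems.append("contains an em dash")
    lowered = body.lower()
    for word in FORBIDDEN_WORDS:
        # Prefix match at a word boundary, so 'audits' and 'seamlessly' are caught too.
        if re.search(rf"\b{re.escape(word)}", lowered):
            problems.append(f"forbidden word: {word}")
    if len(body.split()) >= MAX_WORDS:
        problems.append(f"body must be under {MAX_WORDS} words")
    if not body.rstrip().endswith("?"):
        problems.append("does not end with a question")
    return problems
```

Running this over every rendered template in test_templates.py would cover the em dash, forbidden word, and word count tests in one pass.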

Template 1: Platform Team Portfolio Offer

  • When: A-tier, has platform/value-creation language on website
  • Subject: Portfolio security resource for {fund_name}'s teams
  • Body: Lead with the portfolio support angle. “I help early-stage teams avoid preventable security and operational mistakes before they become expensive distractions. I’d like to offer {fund_name}’s portfolio companies a discounted Startup Security Baseline; a 5-day assessment at $2,500 fixed rate, designed for post-funding teams. Your portfolio companies get operational resilience practices established early. You get a resource that makes your platform team more valuable. For more mature companies, we also run a 10-day full assessment. Would it make sense for me to send the one-pager to whoever manages portfolio support?”
  • ~95 words

Template 2: CISO Fund Operator

  • When: Cyber-focused fund or CISO angel network
  • Subject: Operational resilience for your portfolio, {first_name}
  • Body: Peer-to-peer operator angle. “Fellow security practitioner here. I keep seeing the same pattern: funded startups skip basic security hygiene until a customer asks for a SOC 2 or a breach forces the conversation. By then, it costs 3x what it would have at Series A. Solanasis runs a 5-day Startup Security Baseline; access hygiene, real backup restore test, cloud posture review, remediation roadmap. $2,500 fixed fee for portfolio companies. Has this come up with any of your portfolio companies?”
  • ~80 words

Template 3: Founder Safety Net

  • When: B-tier funds, generalist or early-stage focus
  • Subject: Quick question about {fund_name}'s post-investment support
  • Body: Lighter touch, asks a question first. “Curious whether {fund_name} has a go-to resource when portfolio companies need cybersecurity or DR verification. Most startups don’t think about this until something breaks. We run a 5-day Startup Security Baseline; specifically designed for post-funding teams. $2,500 fixed fee, investor-ready deliverable. Full compliance assessment also available for more mature companies. Is this something that would be useful for your portfolio?”
  • ~70 words

Template 4: Colorado VC Peer

  • When: Colorado-based fund
  • Subject: Colorado resource for your portfolio teams
  • Body: Local peer angle. “Fellow Colorado business owner here. I run Solanasis; we do cybersecurity assessments and operational resilience work. I’m looking to partner with a small number of Colorado-based funds to offer their portfolio companies a discounted security baseline. 5-day turnaround, $2,500 fixed fee, investor-ready deliverable. Full assessment available for companies further along. Coffee or a quick call?”
  • ~60 words

Template 5: Generic VC Outreach

  • When: B/C-tier, no strong segment signal
  • Subject: Security baseline offer for portfolio companies
  • Body: Shortest, most direct. “We run a 5-day Startup Security Baseline for early-stage companies. Backup verification, access hygiene, security posture review. $2,500 fixed fee for portfolio companies; investor-ready report. I’d like to offer this to {fund_name}’s portfolio. Worth a look?”
  • ~45 words

Follow-Up Day 5:

  • Subject: Quick follow-up; portfolio security resource
  • Body: References original offer. Lists what baseline covers in 3-4 bullets. “Want the one-pager?”

Follow-Up Day 10:

  • Subject: Last note; portfolio resilience resource
  • Body: One paragraph graceful close. “If any of your portfolio companies ever need a security baseline done quickly, we’re here.”

Functions (same signatures as fCTO):

  • extract_first_name(full_name) → first name from “Jeffrey B Smith”
  • assign_template(row) → variant ID based on TEMPLATE_RULES
  • build_personalized_first_line(row) → “Hi {first_name}, I came across {fund_name} while researching {segment} investors in {location}.”
  • format_template(variant_id, **kwargs) → (subject, body)
  • format_follow_up(day, **kwargs) → (subject, body)
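
Two of these functions are simple enough to sketch directly. The version below assumes boolean columns have already been parsed from the CSV, and it maps rule keys to row fields in one plausible way (has_platform to has_platform_team, colorado to state == "CO"); the fCTO implementation may differ in both respects.

```python
TEMPLATE_RULES = [
    (1, "Platform Team Portfolio Offer", {"has_platform": True, "tier_in": ("A",)}),
    (2, "CISO Fund Operator", {"cyber_focus": True}),
    (3, "Founder Safety Net", {"tier_in": ("A", "B"), "early_stage": True}),
    (4, "Colorado VC Peer", {"colorado": True}),
    (5, "Generic VC Outreach", {}),  # fallback
]


def extract_first_name(full_name: str) -> str:
    """'Jeffrey B Smith' -> 'Jeffrey'; empty input -> ''."""
    parts = full_name.split()
    return parts[0] if parts else ""


def rule_matches(row: dict, cond: dict) -> bool:
    """Check one rule's conditions against a prospect row (key mapping is an assumption)."""
    if "tier_in" in cond and row.get("prospect_tier") not in cond["tier_in"]:
        return False
    if cond.get("has_platform") and not row.get("has_platform_team"):
        return False
    if cond.get("cyber_focus") and not row.get("cyber_focus"):
        return False
    if cond.get("early_stage") and not row.get("early_stage"):
        return False
    if cond.get("colorado") and row.get("state") != "CO":
        return False
    return True


def assign_template(row: dict) -> int:
    """Return the variant ID of the first matching rule (priority order)."""
    for variant_id, _name, cond in TEMPLATE_RULES:
        if rule_matches(row, cond):
            return variant_id
    return TEMPLATE_RULES[-1][0]
```

Because the last rule has no conditions, it matches every row and serves as the fallback.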

5.10 generate_outreach_csv.py

Fork from: solanasis-scripts/fcto-pipeline/generate_outreach_csv.py — nearly identical.

Changes:

  • CRM_COLUMNS use VC-specific fields (fund_name instead of company_name, managing_partners, fund_size_range, etc.)
  • Pipeline report text references VC-specific next steps
  • Outreach queue filters: A/B-tier, has_email, outreach_status != “sent”

Output:

  • data/output/vc_outreach_full.csv — all scored prospects
  • data/output/vc_outreach_queue.csv — A/B-tier with email, ready for daily brief
  • data/output/pipeline_report.md — summary stats
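
The queue filter and the mailto_link column could be sketched as follows. It assumes the contact address lives in the partner_emails column and uses `urllib.parse.quote` for percent-encoding; `build_queue` and `build_mailto` are hypothetical names.

```python
from urllib.parse import quote


def build_mailto(to: str, subject: str, body: str) -> str:
    """Build a mailto: link with percent-encoded subject and body."""
    return (
        f"mailto:{quote(to, safe='@')}"
        f"?subject={quote(subject, safe='')}"
        f"&body={quote(body, safe='')}"
    )


def build_queue(rows: list) -> list:
    """Keep A/B-tier prospects that have an email and have not been sent to yet."""
    return [
        r for r in rows
        if r.get("prospect_tier") in ("A", "B")
        and r.get("partner_emails")
        and r.get("outreach_status") != "sent"
    ]


rows = [
    {"prospect_id": "p1", "prospect_tier": "A", "partner_emails": "jane@acme.vc", "outreach_status": ""},
    {"prospect_id": "p2", "prospect_tier": "B", "partner_emails": "", "outreach_status": ""},
    {"prospect_id": "p3", "prospect_tier": "A", "partner_emails": "raj@beta.vc", "outreach_status": "sent"},
    {"prospect_id": "p4", "prospect_tier": "C", "partner_emails": "x@c.vc", "outreach_status": ""},
]
queue = build_queue(rows)
link = build_mailto("jane@acme.vc", "Quick question", "Hi Jane,\nWorth a look?")
```

Percent-encoding grows the body noticeably, so the link length should be checked against the 2000-character limit after encoding, not before.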

5.11 generate_daily_outreach.py

Fork from: solanasis-scripts/fcto-pipeline/generate_daily_outreach.py — same tracker, same markdown format.

Changes:

  • Output file: solanasis-docs/daily-outreach/YYYY-MM-DD-vc.md
  • Prospect cards include: fund name, segment (platform/cyber/enterprise/early-stage), fund size, managing partners, contact info
  • Research queue section flags A/B-tier funds needing email discovery
  • Follow-up templates from templates.py (Day 5 + Day 10)
  • Safety reminder in header: “Plain text only. Max 5-10/day total across ALL pipelines.”

CLI interface (same as fCTO):

python generate_daily_outreach.py              # Default 3 prospects
python generate_daily_outreach.py --count 5    # More prospects
python generate_daily_outreach.py --mark-sent prospect_id1,prospect_id2
python generate_daily_outreach.py --status     # Pipeline stats
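
A minimal argparse sketch matching the flags above. The default count of 3 comes from the first example; the help text and the `build_parser` name are illustrative assumptions.

```python
import argparse


def parse_id_list(value: str) -> list:
    """Split 'id1,id2' into ['id1', 'id2'], ignoring blanks."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Generate the daily VC outreach brief")
    parser.add_argument("--count", type=int, default=3,
                        help="number of prospects to include (default: 3)")
    parser.add_argument("--mark-sent", type=parse_id_list, default=[], metavar="IDS",
                        help="comma-separated prospect IDs to mark as sent")
    parser.add_argument("--status", action="store_true",
                        help="print pipeline stats and exit")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

argparse exposes `--mark-sent` as `args.mark_sent`, already split into a list by `parse_id_list`.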

6. Scoring Model Rationale

The scoring model prioritizes two dimensions:

Dimension 1: Does this fund have a reason to care about portfolio security?

  • Platform team (+18) = explicit portfolio-support function; they’re paid to help portfolio companies
  • Cyber focus (+15) = they understand the problem natively; shortest sales cycle
  • Enterprise SaaS (+10) = their portfolio companies sell to enterprises that demand security posture
  • Early stage (+8) = their portfolio companies are most likely to lack security practices

Dimension 2: Can we actually reach them?

  • Email (+8) = the only realistic channel for cold outreach at 5-10/day manual volume
  • Website (+4) = indicates an active fund (vs. a filing-only entity)
  • Partner names (+3) = enables personalization
  • Phone (+2) = backup channel for follow-up

Negative signals prevent wasted effort:

  • No US presence (-15) = compliance context doesn’t translate; CAN-SPAM is a US statute and does not map cleanly onto international recipients
  • Defunct (-10) = no point reaching out
  • Mega fund (-5) = a $2,500 assessment is noise at the scale of a $5B fund’s operations
  • Pure fintech (-3) = their portfolio companies are literally in the security space

7. /for/startups Landing Page

Fork from: solanasis-site/src/pages/for/foundations.astro

URL: /for/startups

Sections (mirroring /for/foundations structure):

  1. Hero — “Startup Security Baseline. Cybersecurity practices before they become expensive problems.” CTA: “Get Started” → /#compliance-assessment
  2. Pain Points (4 cards):
    • “No one owns security” — Everyone’s too busy building product
    • “Customer due diligence” — Enterprise customers will ask; better to be ready
    • “Investor confidence” — Show your board you’re not a liability
    • “Preventable breaches” — The basics stop 90% of incidents
  3. Two-Tier Assessment (2-column):
    • Left: SSB ($2,500 portfolio / $3,500 direct, 5 days, scope bullets)
    • Right: Full CRA ($5,000 portfolio / $5K-$7.5K standard, 10 days, scope bullets)
  4. Who This Is For (4 cards):
    • Seed-stage startups (5-20 employees)
    • Series A/B companies (20-100 employees)
    • VC portfolio companies (any stage)
    • Startups approaching enterprise sales
  5. How We Work (5 steps):
    • 15-min scoping call → Access provisioning → Assessment execution → Report delivery → Remediation roadmap review
  6. FAQ (4 items):
    • “What’s the difference between SSB and full CRA?”
    • “Do you work with pre-revenue startups?”
    • “What do we need to provide for the assessment?”
    • “Can our VC arrange this for multiple portfolio companies?”
  7. Contact CTA — “Schedule a 15-minute call” → /contact

Nav + Footer: Add “Startups” to Industries dropdown (desktop + mobile) and footer.

Voice: 3 signature terms (blind spot, false comfort, risk debt), 1 transition (Here’s the thing), parenthetical aside, 2+ semicolons.

Tests: 7 startup-specific + 2 cross-linking tests in vertical-pages.spec.ts (mirrors foundation test pattern).


8. Testing Specification (30-40 pytest tests)

8.1 tests/conftest.py

Fixtures (pattern from foundation-pipeline/tests/conftest.py):

  • tmp_data_dir — creates raw/intermediate/output subdirectories in temp dir
  • sample_offering_tsv — 10 rows: 6 VC funds (various sizes, stages) + 4 non-VC (hedge fund, PE, real estate, other)
  • sample_relatedpersons_tsv — managing partner data for the 6 VC funds (12 people total, ~2 per fund)
  • sample_formdsubmission_tsv — filing dates and submission types for all 10
  • sample_openvc_csv — 5 rows of mock OpenVC data (3 overlap with SEC, 2 unique)
  • sample_scored_prospects — 10 rows with pre-computed scores and tiers (2 A, 3 B, 3 C, 2 D)

8.2 tests/test_import_form_d.py (~10 tests)

  • Filters only INVESTMENTFUNDTYPE == "Venture Capital Fund" (expects 6 of 10)
  • Joins OFFERING + RELATEDPERSONS correctly on ACCESSIONNUMBER
  • Handles missing/null RELATEDPERSONS fields gracefully
  • Deduplicates by fund name (keeps most recent filing)
  • Extracts phone from SEC filing and normalizes format
  • Handles multi-quarter data (same fund files amendments)
  • Generates stable prospect_id from name + state
  • Rejects rows with NULL entity names
  • Produces correct column schema in output CSV
  • Quality report counts match expectations

8.3 tests/test_score_prospects.py (~10 tests)

  • A-tier scoring: platform team + email + recent filing = 40+ (A)
  • B-tier scoring: enterprise focus + website + email = 25-39 (B)
  • C-tier scoring: US-based + phone only = 12-24 (C)
  • D-tier scoring: defunct international fund = <12 (D)
  • Negative signals reduce score correctly (each one)
  • Tier thresholds produce expected distribution on sample data
  • Template assignment matches segment (platform → Template 1, cyber → Template 2, etc.)
  • Fallback template (5) assigned when no rules match
  • Score breakdown string is human-readable
  • Empty/null fields don’t crash scoring

8.4 tests/test_templates.py (~8 tests)

  • Each template variant (1-5) renders without KeyError
  • Personalized first lines work with partial data (name only, name+fund, full)
  • NO em dashes or en dashes (U+2014 or U+2013) in any template output
  • NO forbidden words (“seamless”, “frictionless”, “audit”, “genuinely”, “SMBs”) in any template
  • All template bodies are under 125 words
  • All subject lines are under 60 characters
  • mailto: URL generation produces valid URLs under 2000 chars
  • Follow-up templates (Day 5, Day 10) render correctly
  • Signature block is present in all initial templates

8.5 tests/test_merge_sources.py (~5 tests)

  • SEC + OpenVC merge deduplicates correctly (3 overlapping records become 3, not 6)
  • Prefers SEC entity name over free-source name
  • Prefers free-source email over empty SEC email
  • Source provenance tracked correctly (“sec_form_d,openvc”)
  • One-sided records preserved (SEC-only and OpenVC-only)

9. Additional Deliverables

9.1 Startup Security Baseline One-Pager (PDF)

  • Location: solanasis-site/public/downloads/solanasis-startup-security-baseline.pdf
  • Content: SSB product overview, two-tier pricing, scope bullets, “How it works” (5 steps), Solanasis contact info
  • Design: match existing PDF assets (clean, professional, Solanasis brand)
  • Generation: python-pptx → PDF (same pipeline as pitch deck) or direct PDF via reportlab

9.2 Operational Playbook

  • Location: solanasis-docs/playbooks/VC_Portfolio_Outreach_Playbook.md
  • Content: Setup steps, daily ritual, 30-day pilot design, compliance checklist, success metrics, what to do when a VC responds

9.3 Short URLs

  • go.solanasis.com/startup-security → /for/startups landing page
  • go.solanasis.com/ssb → one-pager PDF
  • Create via short.io API (tag: claude-managed)
  • Log in solanasis-docs/operations/short-url-tracking-log.md

9.4 Updated Daily Outreach Ritual

  • Update Foundation_Daily_Outreach_Ritual.md to reference VC pipeline
  • Add python generate_daily_outreach.py command for VC pipeline
  • Note: capacity allocation across pipelines will be decided after prospect quality is assessed

10. Compliance Requirements

From the existing VC ecosystem playbook and cold outreach cheat sheet:

  • CAN-SPAM: Accurate sender identity (Dmitri Sunshine, Solanasis), physical address, functioning opt-out mechanism. CAN-SPAM makes no exception for B2B email.
  • Domain safety: Max 5-10 emails/day from solanasis.com across ALL pipelines. Plain text only, no HTML, no tracking pixels, no images. Stagger sends 2+ hours apart.
  • LinkedIn: No scraping, no automation, no prohibited extensions. Manual only if used at all.
  • Attachments: No attachments in first email. Offer to send one-pager in reply.
  • Red flags: 3+ bounces in one day → stop immediately. Spam complaints → investigate and adjust.
  • Manual sending only from solanasis.com (no Instantly, no GMass, no automation tools for primary domain)
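
Because sending stays manual, the domain-safety limits above are easiest to honor if the daily brief refuses to list another prospect once a limit is hit. The guard below is a sketch under stated assumptions: `may_send` is a hypothetical helper, the hard cap uses the upper bound of the 5-10/day rule, and the counts are assumed to cover all pipelines, not just the VC one.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

DAILY_CAP = 10                # assumption: upper bound of the 5-10/day rule
MIN_GAP = timedelta(hours=2)  # stagger sends 2+ hours apart
BOUNCE_STOP = 3               # 3+ bounces in one day means stop immediately


def may_send(
    sent_today: int,
    bounces_today: int,
    last_sent_at: Optional[datetime],
    now: datetime,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for one more manual send from the primary domain."""
    if bounces_today >= BOUNCE_STOP:
        return False, "bounce limit hit: stop for the day"
    if sent_today >= DAILY_CAP:
        return False, "daily cap reached"
    if last_sent_at is not None and now - last_sent_at < MIN_GAP:
        return False, "wait for the 2-hour gap"
    return True, "ok"
```

The bounce check runs first on purpose: a bounce spike should halt sending even when the daily count is still low.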

11. 30-Day Pilot Design

Week 1: Infrastructure + First Sends

  • Build and test all pipeline scripts
  • Download SEC Form D data + at least one free database (OpenVC or Ramp)
  • Run full pipeline end-to-end
  • Review top 20 A-tier prospects for quality
  • Create SSB one-pager PDF
  • Build /for/startups landing page
  • Send first 15-20 emails

Week 2: Expand + Learn

  • Send 20-30 more emails
  • Track which segments respond (platform teams vs. GPs vs. operating partners)
  • Identify which template variant gets highest open/reply rates
  • Process Day 5 follow-ups from Week 1

Week 3: Refine

  • Review response quality (not just reply rate)
  • Tighten subject lines and messaging based on data
  • If any VC meeting booked, prepare portfolio presentation
  • Process Day 10 follow-ups from Week 1

Week 4: Decide

  • Total expected: 60-80 VC emails over 4 weeks (note: 5-10/day limit is shared across ALL pipelines; VC allocation decided post-build based on prospect quality)
  • Expected reply rate: 3-8% for cold B2B email to VCs (2-6 replies)
  • Expected meeting rate: 1-3% (1-2 meetings)
  • Decision: scale VC channel, pivot to founder-direct (Motion B), or adjust offer/pricing
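
The reply and meeting counts above follow from multiplying the email volume range by the rate range. A quick check of that arithmetic, using a hypothetical `expected_range` helper:

```python
from typing import Tuple


def expected_range(
    emails_low: int, emails_high: int, pct_low: int, pct_high: int
) -> Tuple[float, float]:
    """Lowest and highest expected count for a volume range and a percentage range."""
    return emails_low * pct_low / 100, emails_high * pct_high / 100


replies = expected_range(60, 80, 3, 8)   # reply rate 3-8%
meetings = expected_range(60, 80, 1, 3)  # meeting rate 1-3%

print(f"replies: {replies[0]:.1f} to {replies[1]:.1f}")     # replies: 1.8 to 6.4
print(f"meetings: {meetings[0]:.1f} to {meetings[1]:.1f}")  # meetings: 0.6 to 2.4
```

The unrounded low end for meetings is 0.6, so the "1-2 meetings" figure rounds up. At 60 emails and a 1% meeting rate, finishing the pilot with zero meetings is a plausible outcome and should not by itself be read as a failed channel.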

Success Metrics

  • Minimum viable: 1 meeting booked from 80 emails (1.25% meeting rate)
  • Good: 3+ meetings, 1 pilot SSB delivered via VC introduction
  • Great: Ongoing portfolio relationship with 1+ fund, repeat referrals

12. Implementation Sequence

Build in this order (dependencies flow top-to-bottom):

| Step | Script | Effort | Fork Source | Notes |
|---|---|---|---|---|
| 1 | config.py | 30 min | fcto-pipeline/config.py | All new content, same structure |
| 2 | download_form_d.py | 1 hr | New (pattern from download_bmf.py) | Then run it to get data |
| 3 | import_form_d.py + tests | 1.5 hr | New (TSV parsing + joins) | Core data processing |
| 4 | import_free_databases.py | 1 hr | import_prospects.py pattern | Can parallelize with step 3 |
| 5 | merge_sources.py + tests | 45 min | New (fuzzy match) | Depends on 3 + 4 |
| 6 | enrich_websites.py | 45 min | fcto-pipeline/enrich_websites.py | Fork, change keywords. Then run (1-2 hr execution) |
| 7 | enrich_emails.py | 1 hr | New (inspired by foundation) | Depends on 6 |
| 8 | score_prospects.py + tests | 45 min | fcto-pipeline/score_prospects.py | Fork, change signal evaluation |
| 9 | templates.py + tests | 1.5 hr | fcto-pipeline/templates.py | Writing email copy is the bottleneck |
| 10 | generate_outreach_csv.py | 30 min | fcto-pipeline/generate_outreach_csv.py | Mostly fork |
| 11 | generate_daily_outreach.py | 45 min | fcto-pipeline/generate_daily_outreach.py | Mostly fork |
| 12 | requirements.txt | 5 min | Copy from fCTO | |
| 13 | /for/startups landing page | 2 hr | for/foundations.astro | Fork, new content |
| 14 | SSB one-pager PDF | 1 hr | New | |
| 15 | Operational playbook | 45 min | Pattern from existing playbooks | |
| 16 | Short URLs | 10 min | short.io API | |

Total estimated effort: ~12-14 hours
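
As a sanity check on that total, the per-step estimates from the table above can be summed directly. The dictionary below simply restates the Effort column in minutes; the short labels are shorthand for the table rows.

```python
# Effort column from the implementation table, converted to minutes.
STEP_MINUTES = {
    "config.py": 30,
    "download_form_d.py": 60,
    "import_form_d.py + tests": 90,
    "import_free_databases.py": 60,
    "merge_sources.py + tests": 45,
    "enrich_websites.py": 45,
    "enrich_emails.py": 60,
    "score_prospects.py + tests": 45,
    "templates.py + tests": 90,
    "generate_outreach_csv.py": 30,
    "generate_daily_outreach.py": 45,
    "requirements.txt": 5,
    "/for/startups landing page": 120,
    "SSB one-pager PDF": 60,
    "Operational playbook": 45,
    "Short URLs": 10,
}

total_minutes = sum(STEP_MINUTES.values())
total_hours = total_minutes / 60
print(f"{total_minutes} min = {total_hours:.1f} hr")  # 840 min = 14.0 hr
```

The build estimates alone land at the top of the 12-14 hour range, and that figure excludes the 1-2 hour execution time noted for the enrich_websites.py run.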


13. Decisions Made

  • Pricing: $2,500 portfolio / $3,500 direct for SSB; full CRA at $5,000 portfolio discount
  • Scope: Both tiers offered (SSB 5-day entry + full CRA 10-day upgrade)
  • Landing page: Yes, build /for/startups
  • Capacity: Decide allocation after pipeline is built based on prospect quality
  • VC incentive: Discount = the value; no referral fees to VCs

14. Critical Source Files

These files must be read before implementation begins:

| File | Purpose |
|---|---|
| solanasis-scripts/fcto-pipeline/config.py | Primary fork source for pipeline config |
| solanasis-scripts/fcto-pipeline/templates.py | Fork source for email templates + voice rules |
| solanasis-scripts/fcto-pipeline/generate_daily_outreach.py | Fork source for daily briefs |
| solanasis-scripts/fcto-pipeline/score_prospects.py | Fork source for scoring engine |
| solanasis-scripts/fcto-pipeline/generate_outreach_csv.py | Fork source for outreach CSV |
| solanasis-scripts/fcto-pipeline/enrich_websites.py | Fork source for website enrichment (543 lines, verified) |
| solanasis-scripts/fcto-pipeline/import_prospects.py | Reference for multi-source importer structure (column mapping pattern) |
| solanasis-docs/playbooks/solanasis-investor-list-vc-ecosystem-outbound-playbook-handoff-2026-03-20.md | Strategy playbook (segments, messaging, compliance) |
| solanasis-docs/playbooks/Manual_Cold_Outreach_Cheat_Sheet.md | Daily outreach safety rules |
| solanasis-site/src/pages/for/foundations.astro | Fork source for /for/startups landing page (verified exists) |
| solanasis-scripts/foundation-pipeline/tests/conftest.py | Pattern for pytest fixtures |
| solanasis-docs/solanasis-voice-profile.md | Voice rules for all copy |

15. Fork Warnings and Out-of-Scope Items

Fork bugs to fix during implementation

  • generate_daily_outreach.py ~line 336: The fCTO version contains “SMBs” (forbidden word) and an em dash in a LinkedIn connection message template. When forking, replace “SMBs” with “small and mid-size organizations” and replace the em dash with a semicolon.

Explicitly out of scope (defer to post-pilot)

  • migrate_to_baserow.py: The fCTO pipeline has a Baserow migration script. The VC pipeline does NOT need this in Phase 1. If the pilot succeeds (Week 4 decision), add Baserow migration as a Phase 2 item. Table name would be “VC Partners.”
  • discover_prospects.py: Not needed; SEC Form D bulk data replaces web discovery.
  • Motion B (founder-direct outreach): Targeting portfolio companies directly is a separate Phase 2 effort that would require Crunchbase or PitchBook data for portfolio company lists.

SCORING dict in config.py

The config.py skeleton in section 5.1 shows SCORING = { ... }. The full scoring dict is defined in section 5.8. When implementing, copy the complete dict from section 5.8 into config.py.

Template 1 word count

Template 1 is ~95 words (within the 125-word limit but above the 50-80 ideal). The personalized first line adds ~20 words. If total exceeds 125 after personalization, trim the “For more mature companies” sentence.


16. Verification Checklist

  1. download_form_d.py downloads and extracts SEC ZIPs without errors
  2. import_form_d.py produces 4,000-6,000 unique VC funds
  3. No PE, hedge, or real estate funds in filtered output (spot-check 20 records)
  4. merge_sources.py deduplicates correctly (merged count < sum of inputs)
  5. enrich_websites.py caches results and respects rate limits
  6. Email coverage is 40%+ for A/B-tier prospects after all enrichment
  7. score_prospects.py produces tier distribution: A ~5-8%, B ~15-20%
  8. All templates render without errors; no forbidden words or em dashes
  9. All subject lines < 60 chars; all bodies < 125 words
  10. python -m pytest tests/ passes (all 30-40 tests)
  11. generate_daily_outreach.py produces markdown brief with working mailto: links
  12. Click 3 mailto: links → verify subject + body render in email client
  13. Top 10 A-tier prospects are real, active VC funds (manual review)
  14. /for/startups page renders correctly, passes accessibility checks
  15. Short URLs resolve correctly
  16. Pipeline report includes meaningful stats (tier distribution, email coverage, top prospects)