Estate Planning Attorney Pipeline — Continuation Prompt

Purpose: Paste this entire document as the first message in a new Claude Code session (on the server or any machine with the same repo layout) to build and run the full estate planning attorney cold email prospecting pipeline.

Last updated: 2026-03-22

Session workspace: The _solanasis container folder (same layout as C:\Users\zasya\Documents\_solanasis)

Estimated build time: 3-4 hours for full pipeline + first discovery run


What to Build

Build a complete estate planning attorney cold email prospecting pipeline at solanasis-scripts/attorney-pipeline/. This pipeline discovers estate planning attorneys nationally, enriches them with emails and signals, scores and tiers them, and generates daily outreach briefs with pre-written emails.

This is NOT a greenfield build. Clone heavily from the existing fcto-pipeline/ (same data shape: people-at-firms). Reuse the foundation pipeline’s email enrichment patterns. Adapt the Cold Outreach Kit v1 email templates.


Strategic Context (Why Estate Planning Attorneys)

Estate planning attorneys are the #1 smartcut into the wealth management ecosystem:

  • They sit at the center of every wealth team (RIA + CPA + attorney)
  • One attorney client = $170K Year 1 potential via the flywheel (attorney → CPA referral → RIA referral)
  • March-May is the buying window (malpractice insurance renewals, bar conferences, summer associate urgency)
  • They handle the MOST sensitive data (SSNs, financial accounts, trust documents, medical info, beneficiary designations)
  • 66% of law firms lack incident response plans (ABA TechReport 2023)
  • ABA Rules 1.6(c) and 1.1 require “reasonable efforts” to protect client data (nationwide, enforceable)
  • Malpractice carriers are now asking cybersecurity questions on renewal applications

Positioning: “Data Protection for Client Trust” — NOT “cybersecurity assessment.” Frame everything as compliance with ABA ethical obligations, not IT security.

Key playbooks to reference:

  • solanasis-docs/playbooks/Estate_Attorney_Cold_Outreach_Kit_v1.md — email templates, language shifts, phone scripts, objection handling
  • solanasis-docs/playbooks/Estate_Planning_Attorney_Smartcut_Playbook.md — full 9-part strategy

Data Acquisition Strategy: Multi-Layer Free Scraping

No paid tools required. Stack multiple free data sources, each handled by a Python script Claude Code builds and runs.

Layer 1: DuckDuckGo Search (PRIMARY, runs locally, free)

Clone from fcto-pipeline/discover_prospects.py. Already proven pattern. Law firm websites are well-indexed; this works even better for attorneys than it did for fCTOs.

Attorney-specific queries (run per city for each target state):

  • "estate planning attorney" {city} {state}
  • "trusts and estates" attorney {city} law firm
  • "elder law attorney" {city} {state}
  • "estate planning" "managing partner" {city}
  • "wealth transfer attorney" {city}
  • site:linkedin.com "estate planning attorney" {city}
  • site:justia.com estate planning attorney {state}
  • site:avvo.com estate planning attorney {city}

Expected yield: 50-100 unique firm/attorney records per state.
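A minimal sketch of how the query list above could be expanded per city. The function name build_queries and the cities_by_state shape are assumptions for illustration, not the existing discover_prospects.py interface. Templates that use only {state} would repeat for every city, so the sketch drops duplicates.

```python
# Query templates copied from the list above; {city}/{state} are filled per target city.
QUERY_TEMPLATES = [
    '"estate planning attorney" {city} {state}',
    '"trusts and estates" attorney {city} law firm',
    '"elder law attorney" {city} {state}',
    '"estate planning" "managing partner" {city}',
    '"wealth transfer attorney" {city}',
    'site:linkedin.com "estate planning attorney" {city}',
    "site:justia.com estate planning attorney {state}",
    "site:avvo.com estate planning attorney {city}",
]


def build_queries(cities_by_state, templates=QUERY_TEMPLATES):
    """Expand every template for every city, dropping duplicates but keeping order.

    cities_by_state is a hypothetical mapping such as {"CO": ["Denver", "Boulder"]}.
    """
    seen = set()
    queries = []
    for state, cities in cities_by_state.items():
        for city in cities:
            for template in templates:
                query = template.format(city=city, state=state)
                if query not in seen:  # state-only templates repeat across cities
                    seen.add(query)
                    queries.append(query)
    return queries
```

For two Colorado cities this yields 15 queries rather than 16, because the Justia query is state-level.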

Layer 2: gosom/google-maps-scraper (BULK, Docker, free, no API key)

This machine should have Docker. The gosom/google-maps-scraper (3.5K+ GitHub stars, MIT license, actively maintained) is the single highest-yield free tool.

  • Input: text file with one search query per line (e.g., “estate planning attorney in Denver CO”)
  • Output: CSV with 33+ fields including name, address, phone, website, rating, review count, email
  • Built-in email extraction from business websites
  • No API key needed, no rate limits (self-hosted)
docker pull gosom/google-maps-scraper
# Run with query file:
docker run -v $(pwd)/data:/data gosom/google-maps-scraper -input /data/queries.txt -results /data/output.csv

Build server/discover_gmaps_scraper.py to:

  1. Generate query files from the city list (one query per city: “estate planning attorney in {city} {state}”)
  2. Run the Docker container
  3. Import the CSV output into the pipeline
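Steps 1 and 2 could be sketched roughly as follows. The helper names write_query_file and docker_command are hypothetical; the -input and -results flags are the ones shown in the docker run command above. The sketch only builds the command list, so it can be inspected before anything is executed.

```python
from pathlib import Path


def write_query_file(cities_by_state, path):
    """Write one "estate planning attorney in {city} {state}" query per line."""
    lines = [
        f"estate planning attorney in {city} {state}"
        for state, cities in cities_by_state.items()
        for city in cities
    ]
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


def docker_command(data_dir, query_name, results_name):
    """Build the docker run argument list; data_dir is mounted at /data."""
    return [
        "docker", "run",
        "-v", f"{Path(data_dir).resolve()}:/data",
        "gosom/google-maps-scraper",
        "-input", f"/data/{query_name}",
        "-results", f"/data/{results_name}",
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) once the query file is in place.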

Layer 3: Website Scraping for Emails (free, unlimited)

For every firm with a website URL from Layers 1-2, scrape their contact/about/team pages.

Reuse these exact patterns:

  • foundation-pipeline/enrich_emails.py — Jina Reader (r.jina.ai) for clean text, regex email extraction, email quality classification (named_person / role_based / generic), SKIP_PREFIXES, PREFERRED_PREFIXES
  • fcto-pipeline/enrich_websites.py — trafilatura for text extraction, httpx async scraping, website cache system, Jina Reader as fallback for JS-heavy sites, keyword signal detection with regex

Scrape these pages on each firm website: /contact, /about, /attorneys, /our-team, /people, /staff
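A rough sketch of the page list and email extraction, assuming the page text has already been fetched. The prefix sets and the rule separating role_based from generic are illustrative guesses; the real SKIP_PREFIXES and PREFERRED_PREFIXES live in foundation-pipeline/enrich_emails.py and should take precedence.

```python
import re
from urllib.parse import urljoin

CONTACT_PATHS = ["/contact", "/about", "/attorneys", "/our-team", "/people", "/staff"]
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical prefix sets, for illustration only.
GENERIC_PREFIXES = {"noreply", "no-reply", "webmaster", "postmaster"}
ROLE_PREFIXES = {"info", "contact", "office", "admin", "intake", "hello"}


def candidate_urls(website):
    """Return the six pages to scrape for a firm website (scheme added if missing)."""
    base = website if website.startswith(("http://", "https://")) else f"https://{website}"
    return [urljoin(base, path) for path in CONTACT_PATHS]


def extract_emails(text):
    """Return unique, lowercased email addresses found in page text."""
    return sorted({match.lower() for match in EMAIL_RE.findall(text)})


def classify_email(email):
    """Classify an address as generic, role_based, or named_person by its local part."""
    local = email.split("@", 1)[0].lower()
    if local in GENERIC_PREFIXES:
        return "generic"
    if local in ROLE_PREFIXES:
        return "role_based"
    return "named_person"
```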

Layer 4: Email Pattern Guessing + DNS Verification (free, unlimited)

When website scraping finds attorney names but no email, generate candidates using law firm email patterns.

Most common law firm email formats (in order of frequency):

  1. {first_initial}{last}@domain — most common for law firms (e.g., jsmith@smithlaw.com)
  2. {first}.{last}@domain
  3. {first}{last}@domain
  4. {first}@domain

Verification:

  1. DNS MX record lookup (does the domain accept email?) — dnspython package
  2. Optional: SMTP RCPT TO check (does the specific address exist?)

Build guess_emails.py — takes name + domain, generates candidates, verifies via DNS MX, outputs best guess with confidence score.
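As a starting point, candidate generation and the MX check might look like the sketch below. The weights mirror the four formats listed above and are priors, not measured frequencies. The name handling (first token as first name, last token as last name) is a simplifying assumption that will mishandle suffixes such as "Jr." The MX check assumes dnspython 2.x and imports it lazily, so candidate generation works without it.

```python
import re

# (pattern, prior weight) for the four most common formats listed above.
PATTERNS = [
    ("{f}{last}", 0.35),
    ("{first}.{last}", 0.25),
    ("{first}{last}", 0.15),
    ("{first}", 0.10),
]


def generate_candidates(full_name, domain):
    """Return [(email, weight), ...] ordered by pattern priority."""
    tokens = [re.sub(r"[^a-z]", "", t) for t in full_name.lower().split()]
    tokens = [t for t in tokens if t]
    if len(tokens) < 2:
        return []  # need at least a first and last name
    first, last = tokens[0], tokens[-1]
    return [
        (pattern.format(f=first[0], first=first, last=last) + "@" + domain, weight)
        for pattern, weight in PATTERNS
    ]


def domain_has_mx(domain):
    """True if the domain publishes MX records (requires the dnspython package)."""
    import dns.exception
    import dns.resolver

    try:
        dns.resolver.resolve(domain, "MX")  # raises if there is no MX answer
        return True
    except dns.exception.DNSException:
        return False
```

An MX record only shows that the domain accepts mail; it says nothing about whether a specific mailbox exists.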

Layer 5: theHarvester (domain OSINT, free)

Open-source tool (15.9K GitHub stars). Queries 40+ sources to find emails for a domain.

theHarvester -d smithlawfirm.com -b all

Use on high-value A-tier firms where Layers 3-4 didn’t find an email. Install: pip install theHarvester (requires Python 3.12+).

Layer 6: Outscraper Google Maps API (500 free records/month)

Managed API, no subscription. 500 free business records/month, then $3/1K. Install: pip install outscraper. Use as supplemental data when DDG + gmaps scraper don’t cover a city well enough.

Layer 7: Apify Justia Scraper ($5/mo free credits)

Pre-built scraper for Justia’s lawyer directory. Estate planning practice area filter. Returns name, firm, phone, email, bio, practice areas, social links.

$5/month free Apify platform credits = ~1,250 attorney listings/month.

Direct scraping of Justia returns 403 (Cloudflare). Apify handles this.


Pipeline Architecture

solanasis-scripts/attorney-pipeline/
  ## Core scripts (LOCAL)
  config.py                      # Scoring, keywords, paths, tiers, city lists
  templates.py                   # 5 email templates + 2 follow-ups
  discover_ddg.py                # DuckDuckGo discovery
  import_prospects.py            # Import from all sources (DDG, gmaps, Justia CSVs)
  enrich_websites.py             # Scrape firm websites for emails + signals
  guess_emails.py                # Email pattern generation + DNS verification
  score_prospects.py             # Attorney-specific scoring model
  generate_outreach_csv.py       # CRM-ready CSV with mailto links
  generate_daily_outreach.py     # Daily markdown brief (3-5 prospects/day)
  migrate_to_baserow.py          # Optional CRM migration
  requirements.txt

  ## Server scripts (DOCKER)
  server/
    discover_gmaps_scraper.py    # Generates query files + runs Docker scraper
    SERVER_SETUP.md              # Deployment guide
    queries/                     # Generated query files (one per tier, e.g. tier1.txt)

  data/
    raw/                         # Source CSVs from each discovery method
    intermediate/                # Imported, enriched, scored; website_cache/
    output/                      # Final outreach queue, outreach_tracker.json

What to Clone From (Exact File Paths)

| New Script | Clone From | Key Adaptations |
|---|---|---|
| config.py | fcto-pipeline/config.py | Replace keywords, scoring model, city lists, add practice area regex |
| templates.py | fcto-pipeline/templates.py | Replace with 5 attorney templates from Cold Outreach Kit v1 |
| discover_ddg.py | fcto-pipeline/discover_prospects.py | Replace QUERIES list, adapt name extraction for law firm patterns |
| import_prospects.py | fcto-pipeline/import_prospects.py | Add gmaps scraper and Justia column mappings |
| enrich_websites.py | fcto-pipeline/enrich_websites.py + foundation-pipeline/enrich_emails.py | Combine: trafilatura + Jina, email regex + quality classification, attorney keywords |
| guess_emails.py | New (see spec below) | Email pattern generation, DNS MX verification |
| score_prospects.py | fcto-pipeline/score_prospects.py | Attorney-specific signals and thresholds |
| generate_outreach_csv.py | fcto-pipeline/generate_outreach_csv.py | Attorney column names |
| generate_daily_outreach.py | fcto-pipeline/generate_daily_outreach.py | 4-touch cadence (Day 1/4/8/14), phone scripts |
| migrate_to_baserow.py | fcto-pipeline/migrate_to_baserow.py | Attorney table/fields |
| server/discover_gmaps_scraper.py | New | Query file generation + Docker runner |

Scoring Model

SCORING = {
    # Core fit signals
    "estate_planning_focus": 15,       # Practice area confirmed via website/directory
    "firm_size_sweet_spot": 12,        # 2-15 attorneys (has budget, no in-house IT)
    "has_email": 8,                    # Direct email discovered
 
    # Market signals
    "top_wealth_state": 5,             # CA, NY, FL, TX, CO
    "has_website": 5,                  # Enables enrichment
    "no_security_vendor": 5,           # No cybersecurity/CISO/SOC 2 mentions on site
    "no_it_team": 5,                   # No IT director/CISO listed on team page
 
    # Authority signals
    "actec_fellow": 5,                 # ACTEC membership (premium prospect)
    "mentions_confidentiality": 3,     # Website mentions client data protection (buying signal)
    "high_google_rating": 3,           # 4.5+ stars on Google
    "has_phone": 3,                    # Phone available for Day 8 follow-up
    "free_consultation": 2,            # Justia flag (accessible/open to conversations)
 
    # Negative signals
    "solo_practitioner": -10,          # Rarely has budget
    "has_security_vendor": -8,         # Already has cybersecurity covered
    "large_firm": -8,                  # 20+ attorneys = has in-house IT
    "litigation_focus": -5,            # Primary practice is litigation, not estate planning
}
 
TIERS = {"A": 35, "B": 22, "C": 10}  # D = below 10
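One way the weights and thresholds could be applied is sketched below. It assumes each prospect carries a list of detected signal names; that representation and the function name score_prospect are assumptions, not the fcto-pipeline interface. Only a subset of SCORING is repeated to keep the example short.

```python
# Subset of the SCORING table above, repeated so the example runs on its own.
SCORING = {
    "estate_planning_focus": 15,
    "firm_size_sweet_spot": 12,
    "has_email": 8,
    "has_website": 5,
    "solo_practitioner": -10,
}
TIERS = {"A": 35, "B": 22, "C": 10}  # D = below 10


def score_prospect(signals, scoring=SCORING, tiers=TIERS):
    """Return (score, tier, breakdown) for a list of detected signal names."""
    hits = [name for name in signals if name in scoring]
    score = sum(scoring[name] for name in hits)
    breakdown = ", ".join(f"{name} ({scoring[name]:+d})" for name in hits)
    # Check thresholds from highest to lowest; anything below the lowest is D.
    ordered = sorted(tiers.items(), key=lambda item: item[1], reverse=True)
    tier = next((name for name, minimum in ordered if score >= minimum), "D")
    return score, tier, breakdown
```

For example, a confirmed estate practice in the size sweet spot with an email and a website scores 40 (tier A), while a solo practitioner with a confirmed practice and an email scores 13 (tier C). The breakdown string can feed the score_breakdown CRM column.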

Geographic Scope: National (Top Wealth States)

Target Cities by Tier

Tier 1 (highest wealth density):

  • CA: Los Angeles, San Francisco, San Diego, San Jose, Palo Alto, Beverly Hills, Newport Beach
  • NY: Manhattan, Brooklyn, White Plains, Garden City, Long Island
  • FL: Miami, Palm Beach, Fort Lauderdale, Naples, Tampa, Jacksonville
  • TX: Houston, Dallas, Austin, San Antonio, Fort Worth
  • CO: Denver, Boulder, Colorado Springs, Fort Collins, Lakewood

Tier 2 (strong wealth markets):

  • IL: Chicago, Naperville, Evanston
  • MA: Boston, Cambridge, Newton, Wellesley
  • CT: Greenwich, Stamford, Hartford, New Haven
  • NJ: Princeton, Morristown, Hackensack, Newark
  • PA: Philadelphia, Pittsburgh, King of Prussia
  • WA: Seattle, Bellevue, Tacoma

Tier 3 (secondary markets):

  • AZ: Scottsdale, Phoenix, Tucson
  • GA: Atlanta, Buckhead, Savannah
  • NC: Charlotte, Raleigh, Durham
  • VA: McLean, Arlington, Richmond
  • MN: Minneapolis, St. Paul
  • OH: Cleveland, Columbus, Cincinnati
  • MD: Bethesda, Baltimore, Annapolis

Email Templates (from Cold Outreach Kit v1)

Critical Voice Rules (MUST follow in all templates)

  • NO em dashes (use semicolons, periods, commas, parentheses)
  • NO “seamless”, “frictionless”, “audit”, “genuinely”, “SMBs”
  • Language shifts: “data protection review” not “cybersecurity assessment”; “reasonable efforts verification” not “compliance audit”; “exposure points for client data” not “vulnerabilities”; “breach notification readiness” not “incident response plan”
  • Under 125 words per email body
  • End with a specific question, not generic CTA
  • Plain text only, no HTML, no images, no tracking pixels
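These rules lend themselves to an automated check before anything is sent. The sketch below is one possible linter. The function name and the exact banned-term list (the listed words plus the "not" side of the language shifts) are assumptions, and it expects the email body without the signature.

```python
import re

EM_DASH = "\u2014"  # written as an escape so this file itself stays clean
BANNED_TERMS = [
    "seamless", "frictionless", "audit", "genuinely", "SMBs",
    "cybersecurity assessment", "vulnerabilities", "incident response plan",
]
MAX_WORDS = 125


def voice_violations(body):
    """Return a list of voice-rule problems found in an email body (empty if clean)."""
    problems = []
    if EM_DASH in body:
        problems.append("contains em dash")
    for term in BANNED_TERMS:
        if re.search(rf"\b{re.escape(term)}\b", body, flags=re.IGNORECASE):
            problems.append(f"banned term: {term}")
    word_count = len(body.split())
    if word_count >= MAX_WORDS:
        problems.append(f"{word_count} words (limit is under {MAX_WORDS})")
    if not body.rstrip().endswith("?"):
        problems.append("does not end with a question")
    return problems
```

"compliance audit" needs no separate entry because "audit" already catches it. A check like this cannot judge whether the closing question is specific; that still needs a human read.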

5 Template Variants

Template 1: Ethical Duty (default A/B-tier)

  • Subject: Quick question about client data protection at {firm_name}
  • Hook: ABA Rule 1.6(c) “reasonable efforts” + “66% of law firms lack incident response plans”
  • When: Default for any attorney with email, A/B-tier

Template 2: Malpractice Insurance (insurance renewal signals)

  • Subject: Malpractice insurance and cybersecurity at {firm_name}
  • Hook: “Your malpractice carrier is asking 3 specific cybersecurity questions this renewal”
  • When: Firms with insurance renewal timing, or mentions on website

Template 3: Data Sensitivity (trusts/wealth/elder law)

  • Subject: Protecting {firm_name}'s trust and estate documents
  • Hook: Estate data = SSNs + financial accounts + family relationships + medical info of multiple generations
  • When: Practice areas include trusts, wealth transfer, elder law, special needs

Template 4: Peer Proof (B-tier, casual)

  • Subject: How estate firms are handling the cybersecurity question
  • Hook: What we’re seeing from similar firms + specific question about their approach
  • When: B-tier, casual opener

Template 5: Industry Authority (out-of-state, national positioning)

  • Subject: Estate planning data protection, {first_name}
  • Hook: Position as a specialist firm serving estate practices nationally with specific stats
  • When: Out-of-state prospects where “local peer” doesn’t apply

Follow-Up Cadence

  • Day 4: Follow-up email (malpractice insurance angle from Kit Email 2)
  • Day 8: Phone call (90-second script from Kit: lead with ABA obligation + documented proof)
  • Day 14: Breakup email (“closing the loop” + $200K average breach cost for professional services)
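The cadence can be turned into concrete dates with a few lines. This sketch assumes "Day N" means N-1 calendar days after the initial send (Day 1) and does not skip weekends; both are assumptions to adjust against the Kit.

```python
from datetime import date, timedelta

# (day number, action) for the 4-touch cadence; Day 1 is the initial email.
CADENCE = [
    (1, "initial email"),
    (4, "follow-up email"),
    (8, "phone call"),
    (14, "breakup email"),
]


def touch_schedule(first_send):
    """Return [(date, action), ...] for every touch, given the Day 1 send date."""
    return [(first_send + timedelta(days=day - 1), action) for day, action in CADENCE]
```

For a first send on 2026-03-23, the follow-up lands on 2026-03-26, the call on 2026-03-30, and the breakup email on 2026-04-05.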

Template Assignment Logic

def assign_template(row):
    tier = row["prospect_tier"]
    state = row["state"]
    practice = row.get("practice_keywords_found", "")
 
    # Trusts/wealth/elder law + A/B tier -> Data Sensitivity
    if any(k in practice for k in ["trust", "wealth", "elder"]) and tier in ("A", "B"):
        return 3
    # A-tier default -> Ethical Duty
    if tier == "A":
        return 1
    # Colorado -> Template 1 (Ethical Duty with local angle in first line)
    if state == "CO":
        return 1
    # B-tier -> Peer Proof
    if tier == "B":
        return 4
    # Out-of-state C-tier -> Industry Authority
    return 5

guess_emails.py Spec (New Script)

"""Email pattern generation + DNS MX verification for law firms.
 
Given an attorney name + firm domain, generates likely email addresses
using common law firm patterns, verifies the domain has MX records,
and returns the best candidate with a confidence score.
 
Usage:
    python guess_emails.py --csv data/intermediate/prospects_enriched.csv
    python guess_emails.py --test "John Smith" "smithlaw.com"
"""
 
# Pattern priority (law firm specific):
PATTERNS = [
    ("{f}{last}", 0.35),          # jsmith@domain (most common for law firms)
    ("{first}.{last}", 0.25),     # john.smith@domain
    ("{first}{last}", 0.15),      # johnsmith@domain
    ("{first}", 0.10),            # john@domain
    ("{f}.{last}", 0.08),         # j.smith@domain
    ("{last}", 0.05),             # smith@domain
    ("{first}_{last}", 0.02),     # john_smith@domain
]
 
# Verification chain:
# 1. Check domain has MX records (dnspython) — if no MX, skip entire domain
# 2. Generate all pattern candidates
# 3. If theHarvester found emails for this domain, match pattern to known emails
# 4. Output best guess with confidence score (0.0-1.0)
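Step 3 of the verification chain could work roughly as follows: if one address at the domain is known and its owner's name is known (for example from the team page), infer which pattern the firm uses and apply it to the attorney who lacks an email. The helper names are hypothetical, and the first-token/last-token name split is a simplification.

```python
import re

# Pattern templates from PATTERNS above, without the weights.
PATTERN_TEMPLATES = [
    "{f}{last}", "{first}.{last}", "{first}{last}", "{first}",
    "{f}.{last}", "{last}", "{first}_{last}",
]


def name_parts(full_name):
    """Split a name into the fields the templates use, or None if too short."""
    tokens = [re.sub(r"[^a-z]", "", t) for t in full_name.lower().split()]
    tokens = [t for t in tokens if t]
    if len(tokens) < 2:
        return None
    return {"f": tokens[0][0], "first": tokens[0], "last": tokens[-1]}


def infer_pattern(known_email, known_name):
    """Return the template that reproduces known_email for known_name, else None."""
    parts = name_parts(known_name)
    if parts is None:
        return None
    local = known_email.split("@", 1)[0].lower()
    for template in PATTERN_TEMPLATES:
        if template.format(**parts) == local:
            return template
    return None


def apply_pattern(template, full_name, domain):
    """Build an address for full_name using an inferred template."""
    parts = name_parts(full_name)
    return f"{template.format(**parts)}@{domain}" if parts else None
```

A guess produced this way arguably deserves a higher confidence score than the pattern's prior weight, since it matches the firm's observed convention.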

server/discover_gmaps_scraper.py Spec (New Script)

"""Generate query files for gosom/google-maps-scraper and import results.
 
Modes:
    python discover_gmaps_scraper.py --generate    # Create query files in queries/
    python discover_gmaps_scraper.py --import FILE  # Import CSV from gmaps scraper
 
The --generate mode creates one query file per state tier, with one search
query per line: "estate planning attorney in {city} {state}"
 
The --import mode reads the gmaps scraper CSV output and converts it to
the pipeline's standard CSV format in data/raw/gmaps_batch{N}.csv
"""

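A possible argument-parser skeleton for the two modes in the spec above. One detail worth noting: import is a Python keyword, so the --import flag needs an explicit dest. The name import_file is an arbitrary choice for this sketch.

```python
import argparse


def build_parser():
    """Parser for discover_gmaps_scraper.py: exactly one of --generate / --import."""
    parser = argparse.ArgumentParser(
        description="Generate gmaps scraper query files or import scraper output."
    )
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument(
        "--generate",
        action="store_true",
        help="create query files in queries/",
    )
    mode.add_argument(
        "--import",
        dest="import_file",  # args.import would be a syntax error
        metavar="FILE",
        help="CSV produced by the gmaps scraper",
    )
    return parser
```

Calling build_parser().parse_args() then exposes args.generate and args.import_file to the rest of the script.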
server/SERVER_SETUP.md Content

Step-by-step guide:

  1. docker pull gosom/google-maps-scraper
  2. Copy queries/ directory to server
  3. Run: docker run -v $(pwd):/data gosom/google-maps-scraper -input /data/queries/tier1.txt -results /data/tier1_results.csv -lang en -depth 1
  4. Copy results CSV back to local machine: data/raw/gmaps_tier1.csv
  5. Import: python import_prospects.py --gmaps data/raw/gmaps_tier1.csv

Website Keyword Signals (for enrich_websites.py)

Practice Area Keywords (must-have to confirm estate focus)

PRACTICE_KEYWORDS = [
    r"\bestate\s+planning\b",
    r"\btrusts?\s+(?:and|&)\s+estates?\b",
    r"\belder\s+law\b",
    r"\bprobate\b",
    r"\bwealth\s+transfer\b",
    r"\bspecial\s+needs\s+trust\b",
    r"\bguardianship\b",
    r"\bconservator\b",
    r"\bwills?\s+(?:and|&)\s+trusts?\b",
    r"\bfamily\s+(?:wealth|office)\b",
]

Tech Covered Keywords (negative; they already have security)

TECH_COVERED_KEYWORDS = [
    r"\bcybersecurity\b",
    r"\bsecurity\s+assessment\b",
    r"\bSOC\s+2\b",
    r"\bincident\s+response\b",
    r"\bCISO\b",
    r"\binformation\s+security\b",
    r"\bdata\s+protection\s+officer\b",
]

Buying Signal Keywords (positive; they conceptually care)

BUYING_SIGNAL_KEYWORDS = [
    r"\bclient\s+confidentiality\b",
    r"\bprotect(?:ing)?\s+(?:your|client)\s+(?:legacy|interest|data|information)\b",
    r"\bprivacy\b",
    r"\bsecure\s+(?:portal|file|document|client)\b",
    r"\bABA\b",
    r"\bethical\s+obligation\b",
    r"\bmalpractice\b",
]
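A small sketch of how these lists might be applied to scraped page text. The function name find_signals is an assumption, and only a few patterns are repeated here so the example stands alone. Matching is case-insensitive, which also lets acronyms such as CISO match in lowercase; tighten that if it produces false positives.

```python
import re

# Small subsets of the lists above, repeated so the example runs on its own.
BUYING_SIGNAL_KEYWORDS = [
    r"\bclient\s+confidentiality\b",
    r"\bprivacy\b",
    r"\bmalpractice\b",
]
TECH_COVERED_KEYWORDS = [
    r"\bcybersecurity\b",
    r"\bSOC\s+2\b",
    r"\bCISO\b",
]


def find_signals(text, patterns):
    """Return the matched phrases (lowercased, whitespace-normalized), one per pattern."""
    found = []
    for pattern in patterns:
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            found.append(" ".join(match.group(0).lower().split()))
    return found
```

The returned phrases can be joined with commas for the buying_signals_found column, and a non-empty result for TECH_COVERED_KEYWORDS can set has_security_vendor.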

Firm Size Detection

FIRM_SIZE_KEYWORDS = {
    "solo": [r"\bsolo\s+practi(?:ce|tioner)\b", r"\blaw\s+office\s+of\b"],
    "small": [r"\b[2-9]\s+attorney", r"\boutique\s+(?:firm|practice)\b"],
    "mid": [r"\b1[0-9]\s+attorney"],
    # 20+ attorneys counts as large, matching the large_firm signal in SCORING
    "large": [r"\b[2-9]\d\s+attorney", r"\b\d{3,}\s+attorney", r"\bnational\s+firm\b"],
}

CRM Output Columns

CRM_COLUMNS = [
    "prospect_id",
    "attorney_name",
    "first_name",
    "firm_name",
    "title",                    # Managing Partner, Partner, Associate, etc.
    "email",
    "email_source",             # website_scrape, pattern_guess, theHarvester, directory
    "email_quality",            # named_person, role_based, generic
    "phone",
    "website",
    "linkedin_url",
    "city",
    "state",
    "practice_areas",           # Comma-separated from website/directory
    "source",                   # ddg, gmaps, justia, actec, manual
    "google_rating",
    "google_reviews_count",
    "firm_size_estimate",       # solo, small, mid, large
    "practice_confirmed",       # True if estate planning confirmed from website
    "has_security_vendor",      # True if TECH_COVERED_KEYWORDS found
    "has_it_team",              # True if CISO/IT director found on site
    "buying_signals_found",     # Comma-separated matched keywords
    "prospect_score",
    "prospect_tier",            # A, B, C, D
    "score_breakdown",          # Human-readable scoring explanation
    "template_variant",         # 1-5
    "personalized_first_line",
    "mailto_link",
    "outreach_status",          # Not Sent, Sent, Replied, Meeting Booked, Not Interested, Bounced
    "date_sent",
    "follow_up_date",
    "notes",
    "last_updated",
]
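The mailto_link column needs the subject and body percent-encoded. A minimal sketch using only the standard library follows. The function name build_mailto is an assumption, and line breaks are encoded as CRLF, which is what the mailto URI scheme expects.

```python
from urllib.parse import quote


def build_mailto(email, subject, body):
    """Return a mailto: link with percent-encoded subject and body."""
    # Normalize line endings to CRLF before encoding.
    body = body.replace("\r\n", "\n").replace("\n", "\r\n")
    return (
        f"mailto:{quote(email, safe='@')}"
        f"?subject={quote(subject, safe='')}"
        f"&body={quote(body, safe='')}"
    )
```

Some mail clients truncate very long mailto links, so the under-125-words rule for email bodies helps here as well.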

Daily Outreach Configuration

  • Default count: 3-5 prospects/day (attorneys require more personalization than foundations)
  • Output file: solanasis-docs/daily-outreach/YYYY-MM-DD-attorney.md
  • Follow-up cadence: Day 1 (initial) / Day 4 (email) / Day 8 (phone) / Day 14 (breakup)
  • Tracker: data/output/outreach_tracker.json
  • Template rotation: Never send same template two consecutive days
  • Personalization: ALWAYS customize first line with firm name, city, or specific detail from their website
  • Anti-spam rules: Plain text only, space sends 2+ hours apart, max 5-10/day total across all pipelines, no HTML/images/tracking/attachments/UTMs/link shorteners

Execution Order

  1. pip install -r requirements.txt
  2. Create directory structure: data/raw/, data/intermediate/, data/intermediate/website_cache/, data/output/, server/queries/
  3. Build config.py
  4. Build templates.py (read Cold Outreach Kit v1 for exact email copy; adapt to voice rules)
  5. Build discover_ddg.py (clone fCTO, swap queries)
  6. Run discover_ddg.py for Tier 1 states → CSV in data/raw/
  7. Build server/discover_gmaps_scraper.py (generate query files)
  8. Run gmaps scraper via Docker: docker run ... → CSV in data/raw/
  9. Build import_prospects.py (consolidate all CSVs, deduplicate by domain)
  10. Run import_prospects.py → data/intermediate/prospects_imported.csv
  11. Build enrich_websites.py (clone fCTO + foundation email patterns)
  12. Run enrich_websites.py → data/intermediate/prospects_enriched.csv
  13. Build guess_emails.py
  14. Run guess_emails.py on prospects without email → updates enriched CSV
  15. Build score_prospects.py
  16. Run score_prospects.py → data/intermediate/prospects_scored.csv
  17. Build generate_outreach_csv.py
  18. Run generate_outreach_csv.py → data/output/attorney_outreach_queue.csv
  19. Build generate_daily_outreach.py
  20. Run generate_daily_outreach.py → solanasis-docs/daily-outreach/YYYY-MM-DD-attorney.md
  21. Review generated emails for voice compliance, then start sending
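For step 9, deduplicating by domain could be sketched as below. The row shape (dicts with website and email keys) and the tie-break rule (prefer the record that already has an email) are assumptions. Note that collapsing on domain keeps one record per firm, so multiple attorneys at the same firm would need a different key.

```python
from urllib.parse import urlparse


def normalize_domain(website):
    """Reduce a website URL to a bare lowercase domain without www. or port."""
    if not website:
        return ""
    url = website.strip().lower()
    if "://" not in url:
        url = "https://" + url
    host = urlparse(url).netloc.split(":")[0]
    return host[4:] if host.startswith("www.") else host


def dedupe_by_domain(rows):
    """Keep one row per domain, preferring a row that has an email.

    Rows without a website cannot be matched on domain and are all kept.
    """
    kept = {}
    no_site = []
    for row in rows:
        domain = normalize_domain(row.get("website", ""))
        if not domain:
            no_site.append(row)
        elif domain not in kept or (row.get("email") and not kept[domain].get("email")):
            kept[domain] = row
    return list(kept.values()) + no_site
```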

Verification Checklist

  • discover_ddg.py --dry-run shows correct query list and time estimate
  • discover_ddg.py produces CSV with 50+ results per Tier 1 state
  • import_prospects.py consolidates sources and deduplicates
  • enrich_websites.py --limit 10 discovers emails and keyword signals correctly
  • guess_emails.py --test "John Smith" "smithlawfirm.com" generates patterns and verifies MX
  • score_prospects.py produces bell-curve tier distribution (few A, many B/C, some D)
  • generate_daily_outreach.py outputs markdown with pre-written emails and mailto links
  • Voice check: 3 generated emails have NO em dashes, use correct language shifts, under 125 words
  • CAN-SPAM check: emails include physical address + opt-out text
  • gosom/google-maps-scraper Docker runs and produces CSV (if Docker available)

What NOT to Do

  • Do NOT use Apollo, ZoomInfo, or any paid data tool. Free layers are sufficient.
  • Do NOT scrape Justia or FindLaw directly (both return 403). Use Apify if needed.
  • Do NOT scrape NAELA (reCAPTCHA protected). Small enough to browse manually.
  • Do NOT say “cybersecurity assessment” in any template. Always “data protection review.”
  • Do NOT claim Colorado mandates cybersecurity CLE. It does NOT as of 2026.
  • Do NOT send more than 10 emails/day total across all pipelines from solanasis.com.
  • Do NOT use HTML, images, tracking pixels, UTMs, or link shorteners in cold emails.
  • Do NOT commit .env files or API keys to git.

Reference: Solanasis Signature

Dmitri Sunshine
Founder, Solanasis
solanasis.com | 303-900-8969

Reference: Key Stats for Templates

  • 66% of law firms lack incident response plans (ABA TechReport 2023)
  • $200K average breach cost for professional services firms
  • ABA Rule 1.6(c): “reasonable efforts to prevent the inadvertent or unauthorized disclosure”
  • ABA Rule 1.1 Comment 8: keep abreast of “benefits and risks associated with relevant technology”
  • 294K identity theft tax returns intercepted in 2023
  • Estate files contain SSNs of clients AND all beneficiaries (often including children and grandchildren)