What I Learned Mining 18 Years of LinkedIn Data with AI
Author: Dmitri Sunshine, CEO of Solanasis LLC
Date: March 2026
Category: Responsible AI | Data Analysis | Professional Strategy
A few weeks ago I did something unusual: I downloaded everything LinkedIn has ever recorded about me and fed it to an AI pipeline.
Not just my profile. Everything. The full data export — 37 CSV files, 18 years of professional history, every connection I’ve ever made, every message I’ve sent, every ad I’ve clicked, every company I’ve followed. Then I wrote a suite of Python scripts to parse it, cross-reference it, and score it across 20 dimensions. Then I had an LLM read 670 samples of my own writing to build a communication profile.
What came back surprised me. Not because the data was wrong, but because of how clearly it revealed the gap between who I think I am professionally and what my digital footprint actually shows.
Here’s what I found — and what it means for anyone thinking seriously about AI-assisted business intelligence.
The Dataset
Before the findings, the scope: LinkedIn’s data export is bigger than most people realize.
- 2,498 connections spanning 67 months of tracked activity
- 562 conversation threads — every DM I’ve sent or received
- 670 writing samples across comments, posts, and invitation messages
- 5,852 ad clicks over 21 months
- 2,030 member follows — people I’ve followed but haven’t connected with
- 1,549 invitations sent and received with outcome tracking
- 13,000+ total data rows across 20+ active dimensions
The 18-year history isn’t just a number. The oldest connection in my export goes back to 2008. The platform has been quietly accumulating behavioral data about me longer than some of my current clients have been in business.
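For anyone who wants to take a similar inventory of their own export, here is a minimal sketch of a row counter. It assumes the export has been unzipped into a folder and that the first non-empty row of each CSV is its header; some files may open with a notes preamble, which would throw the count off slightly. The function name `count_rows` is illustrative, not taken from the scripts described in this post.

```python
import csv
from pathlib import Path


def count_rows(export_dir):
    """Return {relative_path: data_row_count} for every CSV under export_dir."""
    root = Path(export_dir)
    counts = {}
    for path in sorted(root.rglob("*.csv")):
        # utf-8-sig drops a byte-order mark if the file starts with one.
        with path.open(newline="", encoding="utf-8-sig") as handle:
            # Skip rows that are entirely blank.
            rows = [row for row in csv.reader(handle) if any(cell.strip() for cell in row)]
        # Assumption: the first non-empty row is the header, so it is not counted.
        counts[path.relative_to(root).as_posix()] = max(len(rows) - 1, 0)
    return counts
```

Calling `count_rows("linkedin_export")` on the unzipped folder (the folder name is a placeholder) gives a per-file row count, which is a quick way to see which files are worth opening first.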
Finding 1: LinkedIn’s AI Had Me Wrong
LinkedIn’s inference engine — the system that categorizes you for advertisers — had tagged me in seven categories. Among them: “Interested in technology media,” “HR professional,” and “influences public opinion.”
Not on the list: cybersecurity, fractional executive, CIO, COO, or anything close to what Solanasis actually does.
This matters because LinkedIn’s inferences drive who sees your content, who your ads reach, and how the platform positions you in search. I’ve been operating under a miscategorized identity. The platform’s AI was optimizing my reach toward the wrong audience.
The fix isn’t complicated — but you have to know the problem exists before you can address it. Most people never look at their inference data. It’s buried in the export, labeled Ad_Targeting.csv, and contains exactly seven rows.
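The comparison itself is simple enough to sketch. The snippet below assumes the inferred labels have already been read out of the file into a list of strings (the column layout of the file is not shown here, so the loading step is left out), and `missing_topics` is a hypothetical helper name.

```python
import re


def missing_topics(inferred_labels, expected_keywords):
    """Return the expected keywords that match none of the inferred labels."""
    missing = []
    for keyword in expected_keywords:
        # Whole-word, case-insensitive match so "CIO" does not hit "sociology".
        pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
        if not any(pattern.search(label) for label in inferred_labels):
            missing.append(keyword)
    return missing


# Three of the seven inferred labels quoted above.
inferred = ["Interested in technology media", "HR professional", "Influences public opinion"]
# The roles and topics noted above as absent from the list.
expected = ["cybersecurity", "fractional executive", "CIO", "COO"]
gaps = missing_topics(inferred, expected)
```

Anything left in `gaps` is a topic the platform has not associated with the profile, at least not under that wording.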
Finding 2: I Have 2,024 People Who Follow Me But Haven’t Connected
This one stopped me.
LinkedIn distinguishes between connections and followers. People who follow you see your content but aren’t in your first-degree network. I had 2,030 total follows. Of those, 2,024 had followed me but never accepted or sent a connection request.
That’s 2,024 people who raised their hand and said “I’m interested in what you’re doing” — and I had never reached back.
That’s not a missed opportunity. That’s a warm outreach pool that most LinkedIn users don’t even know exists. The data is in Member_Follows.csv. Five minutes to export. Most people never open it.
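Finding the rows in the follows file that have no match among connections is a set difference. The sketch below assumes both files have been loaded as lists of dictionaries (for example with `csv.DictReader`) and that they share a profile-link column called `URL`. That column name is an assumption; check the header row of your own export before relying on it.

```python
def normalize(identifier):
    """Lower-case and trim an identifier so trivial differences do not block a match."""
    return identifier.strip().lower().rstrip("/")


def follows_without_connection(follow_rows, connection_rows, key="URL"):
    """Return follow rows whose identifier does not appear among connections."""
    connected = {normalize(row.get(key, "")) for row in connection_rows}
    connected.discard("")  # ignore connections with no identifier
    seen = set()
    pool = []
    for row in follow_rows:
        ident = normalize(row.get(key, ""))
        # Skip blanks, existing connections, and duplicate follow rows.
        if ident and ident not in connected and ident not in seen:
            seen.add(ident)
            pool.append(row)
    return pool
```

The length of the returned list is the size of the outreach pool, and the rows themselves are the starting point for a contact list.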
Finding 3: My Best Month Is Always November
Across three independent dimensions — ad engagement, connection growth, and reaction activity — November is my peak month every year.
- November 2025: 1,805 ad clicks (peak month, more than double any other)
- November 2025: 268 new connections (highest single month in the dataset)
- November 2025: 88 reactions given (highest reaction month on record)
This correlates with an event I attended — the gBETA Greeley Showcase — but the pattern predates it. Something about Q4 puts me into high gear professionally.
Knowing this let me do something practical: schedule the heavy outreach pushes for October/November rather than trying to force activity in February when my engagement data shows I’m historically quieter.
Data-driven calendar optimization sounds like consulting jargon. It’s just reading what you already know about yourself and acting on it.
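A rough sketch of the monthly tally behind that kind of decision follows. It assumes each dimension (ad clicks, connections, reactions) has been reduced to a list of date strings in `YYYY-MM-DD` form; the real export may use other date formats, so `fmt` is a parameter. The function names are illustrative.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(date_strings, fmt="%Y-%m-%d"):
    """Count events per calendar month, keyed as 'YYYY-MM'."""
    return Counter(datetime.strptime(s, fmt).strftime("%Y-%m") for s in date_strings)


def peak_month_by_year(date_strings, fmt="%Y-%m-%d"):
    """Return {year: month} for the busiest month in each year.

    Ties go to whichever month was encountered first.
    """
    best = {}
    for year_month, count in monthly_counts(date_strings, fmt).items():
        year, month = year_month.split("-")
        if year not in best or count > best[year][1]:
            best[year] = (month, count)
    return {year: month for year, (month, _) in best.items()}
```

Running `peak_month_by_year` separately on each dimension shows whether the same month really comes out on top in every year, or only in the most recent one.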
Finding 4: My Communication Style Has a Fingerprint
The LLM voice analysis of 670 writing samples identified a consistent pattern I’d never articulated:
I affirm before I challenge. Almost every critical or skeptical comment I’ve ever written starts with warmth. “I love that you guys are doing this for free, and yet my skeptical side thinks, ‘What’s the catch?’” The contrast lands because the affirmation is genuine. The skepticism registers because it’s unexpected.
The analysis also found that I write very differently in public versus private. In comments and shares — my public voice — I perform warmth. In DMs, I move fast from warmth to logistics. The average DM conversation arc: personal context → current project → meeting request → calendar link. Four beats, consistent across 562 conversations.
Knowing this explicitly lets me use it intentionally. The public warmth is a feature, not a liability. The DM speed-to-logistics is efficient but sometimes skips relationship-building steps that would improve close rates.
The AI found patterns in my writing I had never named. That’s the capability.
Finding 5: The Gap Between Network Size and Network Activation
My invitation acceptance rate is 49.2% for outgoing invites. That’s exceptionally high for LinkedIn cold outreach — the platform average is closer to 20-30%.
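The rate is accepted outgoing invites divided by all outgoing invites. Here is a small sketch of that arithmetic, assuming each invitation has been reduced to a dictionary with a `direction` field and an `accepted` flag. Both field names are hypothetical; the export's own column names will differ.

```python
def acceptance_rate(invitations):
    """Return the percentage of outgoing invitations that were accepted, or None."""
    outgoing = [inv for inv in invitations if inv["direction"] == "OUTGOING"]
    if not outgoing:
        return None  # no outgoing invites, so the rate is undefined
    accepted = sum(1 for inv in outgoing if inv["accepted"])
    return 100 * accepted / len(outgoing)
```

Incoming invitations are left out on purpose, since they measure a different thing: how often I say yes, not how often others do.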
But cross-referencing connections against Solanasis service lines told a harder story. When I mapped my 2,478 contacts against the seven services Solanasis offers — security assessments, disaster recovery, data migrations, CRM setup, systems integration, responsible AI, and fractional leadership — only a small fraction of contacts had been approached about any of them.
I had a warm network with strong relationship capital and almost no revenue generated from it. The data made this undeniable in a way gut feeling never could.
The A-tier contacts (warmth scores 40-50) include founders, executive directors, and CEOs of exactly the SMBs and nonprofits Solanasis serves. They’re not strangers. They’re people I’ve had conversations with in the last 90 days.
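To make the idea of a warmth score concrete, here is one way such a score could be put together. This is an illustration only: the inputs, weights, and the 50-point ceiling are all assumptions chosen so that the A-tier band of 40-50 mentioned above makes sense. It is not the scoring model actually used in this project.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    name: str
    messages_exchanged: int
    days_since_last_contact: int
    shared_threads: int


def warmth_score(contact):
    """Hypothetical 0-50 score built from volume, recency, and depth."""
    # Volume: one point per message, capped at 20.
    volume = min(contact.messages_exchanged, 20)
    # Recency: full marks inside 90 days, half inside a year, else nothing.
    if contact.days_since_last_contact <= 90:
        recency = 20
    elif contact.days_since_last_contact <= 365:
        recency = 10
    else:
        recency = 0
    # Depth: two points per distinct thread, capped at 10.
    depth = min(contact.shared_threads * 2, 10)
    return volume + recency + depth


def is_a_tier(contact):
    """A-tier means a score of 40 or more, matching the band described above."""
    return warmth_score(contact) >= 40
```

Whatever the real weights are, the useful part is the sort order: once every contact has a number, the top of the list is where outreach starts.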
What This Has to Do with AI
I run Solanasis as a fractional CIO/CSIO/COO firm. We help SMBs and nonprofits get the C-suite technology leadership they need without the full-time cost. Responsible AI implementation is one of our core service lines.
This project was me doing on myself what I’d charge a client to do for their business: take their existing data, build a structured analysis pipeline, extract insights they couldn’t see from inside the data, and translate findings into actionable decisions.
The tools used were not exotic. Python scripts for parsing. Claude (Anthropic’s API) for language analysis. Standard CSV and JSON for data transport. Nothing here requires a data science team or a six-figure software investment. It requires knowing what questions to ask and building the pipeline to answer them.
That’s what responsible AI implementation actually looks like in an SMB context — not replacing people, not deploying LLMs into critical systems without guardrails, but using AI to surface what’s already in your data and make better decisions with it.
The Uncomfortable Part
The thing that stays with me from this project isn’t any specific number. It’s how clearly the data showed the distance between my self-perception and my actual professional footprint.
LinkedIn thought I was an HR professional interested in technology media. My own DMs showed I was rushing to scheduling links before I’d fully built the relationship. My warmest contacts — people with 40+ warmth scores built from years of interaction — had never been asked if they needed what I offer.
The data wasn’t wrong. It was accurate. The uncomfortable insight was mine to sit with.
If you have a LinkedIn account more than a few years old, the data is sitting there. Request the export and you’ll have a folder of files (37 in my case) that you’ve never opened, and somewhere in there is an accurate picture of your professional behavior that you’ve never looked at directly.
Worth looking at.
What’s Next
This project produced structured, database-ready exports that feed directly into Solanasis’s outreach and operational systems — 37 CSV/JSON files mapped to specific business decisions, not a slide deck that lives in a folder.
If your organization has data it’s sitting on but not acting on, that’s the conversation I’m most interested in having.
Dmitri Sunshine is the CEO of Solanasis LLC, a fractional CIO/CSIO/COO firm helping SMBs and nonprofits with security, systems, and responsible AI. Based in Colorado.
Tags: AI, Data Analysis, LinkedIn, Responsible AI, SMB, Fractional Leadership, Business Intelligence, Professional Development