Powerful AI Starts with High-Quality Data: Lessons from Edwin Chen and Surge AI

In a world where AI headlines are dominated by billion-dollar fundraises, massive model sizes, and compute arms races, Edwin Chen offers a refreshing counter-narrative. As the founder of Surge AI, Chen built a $1 billion-per-year data labeling business with just 150 employees in 5 years, with no external investors, no sales team, and no PR machine. His story is a powerful reminder that in AI, quality often trumps quantity.


Independent Thinking as a Superpower

Chen’s approach is rooted in independent thinking. He avoids the noise of social media and Silicon Valley groupthink, choosing instead to focus on substance. His insights come from trusted colleagues and customers—not viral threads. This mindset allows him to build durable products that solve real problems, not just chase trends.

“I’m glad I’m not surrounded by the default ways of Silicon Valley thinking.”

A Business Built on Product, Not Pitch

Surge AI was profitable from day one. Chen didn’t raise venture capital—not because he couldn’t, but because he didn’t need to. He believes in letting the product speak for itself, and in shaping it through genuine customer feedback rather than sales-driven hype.

“We didn’t need the money. And I didn’t want a sales team convincing people to buy a product they didn’t deeply understand.”

Small Teams, Big Impact

Chen is blunt about inefficiencies in Big Tech:

“Ninety percent of employees at tech giants are working on useless problems.”

Surge AI operates with lean teams, no standing 1:1s, and asynchronous communication. This fosters speed, autonomy, and transparency—qualities often lost in larger organizations.

Build First, Fund Later

Chen urges startups to stop making excuses and start building. With today’s tools, most teams can launch a minimum viable product (MVP) without significant capital. Fundraising, he argues, should follow validation—not precede it.

“For 90–95% of startups, there’s no excuse. Just build the MVP. See if anyone cares.”

The Real Bottleneck in AI: Data Quality

Chen’s journey began at Twitter, where poor data labeling hindered even basic sentiment analysis. That experience led to a core realization: high-quality data is the foundation of powerful AI.

“Without clean, contextual, high-quality training data, even the best models underperform.”

While compute and algorithms get the spotlight, Chen ranks data quality as the #1 constraint in AI today. Without it, more compute simply accelerates failure.

Synthetic Data vs. Human Judgment

Synthetic data has its place, but Chen warns against overreliance. Models trained on synthetic data often struggle in real-world scenarios, lacking nuance and diversity. In many cases, a few thousand well-labeled human examples outperform millions of synthetic ones.

Specialized Models Still Matter

Despite the dominance of general-purpose models, Chen sees enduring value in domain-specific approaches. Smaller teams can move faster, encode expert knowledge, and align more closely with user needs.

“Some products simply can’t be built within the constraints of Big Tech companies.”

AI Safety Is a Now Problem

Chen challenges the notion that AI safety is a future concern. Misaligned objectives—like optimizing for engagement over truth—are already causing harm. As AI systems become more embedded in critical domains, the stakes will only rise.

“The real risk isn’t that AI becomes evil. It’s that we train it toward the wrong objectives—and don’t realize it until it’s too late.”

Final Thoughts

Few areas have more potential for AI-driven transformation than healthcare. Yet the data in this field remains fragmented and inconsistent. Chen's lesson that data quality comes first applies here too: raising the standard of healthcare data takes a collective effort, and it is a moral imperative as much as a technical challenge. If you're working on improving healthcare data, or want to, reach out. Let’s build something meaningful together.
