Some of the most interesting ideas in tech don't come from large panels or white papers. They appear in a thread, a blog post, or a talk given by someone who's spent years knee-deep in data. These voices shape how we think about machine learning, large-scale analytics, ethical modeling, and applied AI.
If you work in data science, or want to, you probably follow a few of these leaders already. But in 2025, a new wave of thinkers and doers is changing the field in small, meaningful ways. This list isn't ranked and includes both established names and new ones to watch.
Cassie's work at Google centers on the relationship between decision-making and data. She focuses less on code and more on how to ask the right questions. Her posts are readable and rarely packed with buzzwords, which makes her voice easy to return to when the field feels too technical. She often discusses applied statistics, business impact, and the role of intuition in data work.
Hilary has been active in the data space for over a decade. Her early work at Bitly and Fast Forward Labs gave her a reputation for making machine learning practical. Now, with Hidden Door, she's exploring the intersection of AI and interactive storytelling. She avoids trend-chasing and focuses on real use cases, making her insights valuable for those who want to understand what problems data science should be solving.

Andrew Ng has become a fixture in machine learning education. His courses brought deep learning to a broader audience. In 2025, he's still active in publishing, sharing updates on LLM efficiency, fine-tuning strategies, and MLOps. His calm, academic style contrasts with the loud corners of AI Twitter, making him a grounding voice in the field.
Kate's approach is community-driven. She focuses on data storytelling, visualization, and making data concepts less intimidating for newcomers. Her LinkedIn and YouTube content often includes interviews, mini-tutorials, and career tips. Her practical advice and emphasis on soft skills stand out in a field that can feel overly technical.
Rumman combines technical depth with strong ethical framing. Formerly on Twitter's META team and now working independently, she continues to focus on algorithmic fairness and bias audits. She doesn't just point out problems; she builds tools and frameworks for solving them. Her recent work on red-teaming generative models has drawn attention for its clear structure and collaborative nature.
You've probably used something Hadley built if you've touched the R language. His work on the tidyverse ecosystem helped standardize how people clean, transform, and visualize data in R. He continues to share ideas on reproducibility, human-centered tooling, and how to design better programming interfaces for data work. His influence stretches beyond R to how we think about data tooling in general.
Monica’s background includes roles at LinkedIn and Jawbone, but she’s better known now for her writing and strategic insights. She coined the “AI Hierarchy of Needs,” which became a reference point for organizations looking to build practical AI stacks. She’s not posting every day, but when she does, her thoughts tend to clarify messy industry trends.
Ben combines research and market awareness. His podcast, The Data Exchange, often features guests working at the edge of applied machine learning, AI infrastructure, and data engineering. He's less about hot takes and more about long-term trends, like the future of feature stores, open-source ML tools, and synthetic data.
Emily is known for making data science more accessible. Her writing, which is often shared on Medium or LinkedIn, focuses on job transitions, communication skills, and what it means to be useful as a data scientist. She doesn't shy away from the messier parts of data work—bad metrics, unclear stakeholders, and imposter syndrome.

Wendy focuses on enterprise data strategies and how organizations can structure themselves to make better data decisions. Her work blends governance, ethics, and culture. She often discusses building internal data products, empowering cross-functional teams, and how leadership should interact with data professionals. She's especially relevant for those in mid-to-senior roles.
Jay is best known for his clear visual explanations of complex topics like transformers, BERT, and vector embeddings. His blog posts and notebooks are widely cited across technical communities. In 2025, he's been focusing more on the explainability of LLMs and vector search. His work is worth bookmarking if you learn best through diagrams and clear language.
Margaret’s work combines academic depth with policy engagement. As one of the co-founders of the BigScience project (behind the BLOOM language model), she’s pushed for more open, transparent AI development. Her posts often challenge assumptions about scalability, fairness, and the human cost of training large models. She doesn’t water things down, but she does explain them clearly.
Following the right voices in data science can sharpen how you think, build, and solve problems. The 12 leaders on this list bring hands-on experience, thoughtful writing, and a habit of asking better questions. They don't just chase trends—they shape the conversations that matter. Whether improving model fairness, simplifying tools, or helping others grow in the field, their work is grounded in clarity and impact. To stay relevant in 2025, keep learning from those who do the work well and discuss it honestly. That's where real insight—and real progress—comes from.