Some of the most interesting ideas in tech don't come from large panels or white papers. They appear in a thread, a blog post, or a talk given by someone who's spent years knee-deep in data. These voices shape how we think about machine learning, large-scale analytics, ethical modeling, and applied AI.
If you work in data science, or want to, you probably follow a few of these leaders already. But in 2025, a new wave of thinkers and doers is changing the field in small, meaningful ways. This list isn't ranked, and it includes both established names and new ones to watch.
Cassie Kozyrkov, formerly Google's Chief Decision Scientist, focuses on the relationship between decision-making and data. She cares less about code and more about how to ask the right questions. Her posts are readable and rarely packed with buzzwords, which makes her voice easy to return to when the field feels too technical. She often discusses applied statistics, business impact, and the role of intuition in data work.
Hilary Mason has been active in the data space for over a decade. Her early work at Bitly and Fast Forward Labs gave her a reputation for making machine learning practical. Now, with Hidden Door, she's exploring the intersection of AI and interactive storytelling. She avoids trend-chasing and focuses on real use cases, which makes her insights valuable for anyone who wants to understand what problems data science should be solving.
Andrew Ng has become a fixture in machine learning education. His courses brought deep learning to a broader audience. In 2025, he's still actively publishing, sharing updates on LLM efficiency, fine-tuning strategies, and MLOps. His calm, academic style contrasts with the loud corners of AI Twitter, making him a grounding voice in the field.
Kate Strachnyi's approach is community-driven. She focuses on data storytelling, visualization, and making data concepts less intimidating for newcomers. Her LinkedIn and YouTube content often includes interviews, mini-tutorials, and career tips. Her practical advice and emphasis on soft skills stand out in a field that can feel overly technical.
Rumman Chowdhury combines technical depth with strong ethical framing. Formerly on Twitter's META team and now working independently, she continues to focus on algorithmic fairness and bias audits. She doesn't just point out problems; she builds tools and frameworks for solving them. Her recent work on red-teaming generative models has drawn attention for its clear structure and collaborative nature.
If you've touched the R language, you've probably used something Hadley Wickham built. His work on the tidyverse ecosystem helped standardize how people clean, transform, and visualize data in R. He continues to share ideas on reproducibility, human-centered tooling, and how to design better programming interfaces for data work. His influence stretches beyond R to how we think about data tooling in general.
Monica Rogati’s background includes roles at LinkedIn and Jawbone, but she’s better known now for her writing and strategic insights. She coined the “AI Hierarchy of Needs,” which became a reference point for organizations looking to build practical AI stacks. She doesn’t post every day, but when she does, her thoughts tend to clarify messy industry trends.
Ben Lorica combines research and market awareness. His podcast, The Data Exchange, often features guests working at the edge of applied machine learning, AI infrastructure, and data engineering. He's less about hot takes and more about long-term trends, like the future of feature stores, open-source ML tools, and synthetic data.
Emily is known for making data science more accessible. Her writing, which is often shared on Medium or LinkedIn, focuses on job transitions, communication skills, and what it means to be useful as a data scientist. She doesn't shy away from the messier parts of data work—bad metrics, unclear stakeholders, and imposter syndrome.
Wendy focuses on enterprise data strategies and how organizations can structure themselves to make better data decisions. Her work blends governance, ethics, and culture. She often discusses building internal data products, empowering cross-functional teams, and how leadership should interact with data professionals. She's especially relevant for those in mid-to-senior roles.
Jay Alammar is best known for his clear visual explanations of complex topics like transformers, BERT, and vector embeddings. His blog posts and notebooks are widely cited across technical communities. In 2025, he's been focusing more on the explainability of LLMs and vector search. His work is worth bookmarking if you learn best through diagrams and clear language.
Margaret’s work combines academic depth with policy engagement. As one of the co-founders of the BigScience project (behind the BLOOM language model), she’s pushed for more open, transparent AI development. Her posts often challenge assumptions about scalability, fairness, and the human cost of training large models. She doesn’t water things down, but she does explain them clearly.
Following the right voices in data science can sharpen how you think, build, and solve problems. The 12 leaders on this list bring hands-on experience, thoughtful writing, and a habit of asking better questions. They don't just chase trends; they shape the conversations that matter. Whether they are improving model fairness, simplifying tools, or helping others grow in the field, their work is grounded in clarity and impact. To stay relevant in 2025, keep learning from those who do the work well and discuss it honestly. That's where real insight and real progress come from.