Coursera Blog

Solving the “Data Explosion” Problem with University of Illinois Data Mining Pioneer Jiawei Han

October 4, 2019


Jiawei Han, a professor of computer science at the University of Illinois at Urbana-Champaign, was recently named a Michael Aiken Chair, one of the University’s highest awards. The endowed chair is the latest honor in Han’s distinguished and pioneering career, with notable accomplishments including creating core data mining algorithms and co-authoring the textbook that is considered by many to have defined the field. Professor Han is also a busy and successful teacher with a love for “train[ing] the younger generation, whether at UIUC or all over the world on Coursera.” Professor Han had three PhD students graduate in May, with one becoming a professor at Georgia Tech, one joining Google, and one joining Facebook. Students taking his classes as part of the Online Master of Computer Science in Data Science degree have an opportunity to learn from him through videos and can ask him questions directly during live office hours. 

In this conversation, Professor Han shares his perspective on the history and the future of data mining, the challenge of the “data explosion” problem, and why he thinks the University of Illinois sets students up for long-term success.

Can you explain what you mean when you talk about a “Data Explosion” problem?

Originally, people would say they are ‘data poor’ and that they couldn’t get enough data. Now there is lots of data – the new problem is actually extracting knowledge from it.

Whether you’re a journalist, a biologist, an engineer, or in almost any other discipline, there is this ‘data explosion’ problem: you need to turn unstructured data into structured knowledge. That means spend[ing] a lot of time figuring out how to structure your unstructured data into networks, and then how to mine that data. 

For example, I have a group of students who work on how to handle biomedical literature. With biomedical literature, we can easily get 36 million papers – but to effectively use this huge corpus, you would have to ask experts to label which terms are genes, which are proteins, which are diseases.

It’s not realistic to ask humans to go through 1,000 papers and go over every sentence and label them. So, we take existing dictionaries with lists of genes, diseases, or chemicals as our starting place. Then, we take the massive unlabeled corpus and try to build a network that can find patterns and linkages automatically with a machine. Data mining can replace humans’ boring work. 
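The "starting place" described above, using existing dictionaries to label terms in unlabeled text, can be sketched roughly as follows. This is a minimal illustration only, not the group's actual method: the seed dictionaries, the `label_terms` function, and the example sentence are all hypothetical, and the later step of building a network that finds patterns automatically is not shown.

```python
import re

# Hypothetical seed dictionaries; a real system would load curated term lists.
SEED_DICTIONARIES = {
    "GENE": {"BRCA1", "TP53"},
    "DISEASE": {"breast cancer", "leukemia"},
    "CHEMICAL": {"aspirin"},
}


def label_terms(sentence, dictionaries=SEED_DICTIONARIES):
    """Return sorted (start, end, text, label) tuples for dictionary terms in a sentence."""
    matches = []
    for label, terms in dictionaries.items():
        for term in terms:
            # Whole-word, case-insensitive match of the dictionary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, sentence, flags=re.IGNORECASE):
                matches.append((m.start(), m.end(), m.group(0), label))
    return sorted(matches)


labels = label_terms("Mutations in BRCA1 are linked to breast cancer.")
for start, end, text, label in labels:
    print(f"{text!r} [{start}:{end}] -> {label}")
```

Exact string matching like this misses spelling variants and ambiguous terms, which is one reason such dictionary labels are only a starting point for learning from the larger corpus.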

You founded the data mining group several decades ago. What led you to get into this field to start with, and what has your research group accomplished over the years?

It’s a long journey! I started with databases. In the 1980s, when I did my PhD, lots of people built database systems that let us index, sort, and search data in powerful ways. I talked to my advisor and said I want[ed] the database to have intelligence, so my PhD thesis was essentially feeding the database logic and defining rules to make the database more intelligent.

Later, I found that if you ask humans to build the rules for the database, it is still a prohibitive burden. You have limited experts, and you have unlimited data and unlimited problems – you cannot scale up. The best way is to let data show the pattern by itself: data mining. 

I clearly remember the first international Knowledge Discovery in Databases (KDD) workshop in 1989; it was just 20 to 30 people. But I got together with some of my collaborators to write and present a paper on a method to dig rules out of data. After a few years, lots of people found this direction promising, and by 1995 they held the first international conference on KDD. To everybody’s surprise, 500 to 600 people attended!

For the second conference, they elected me to be co-chair, and I shifted the majority of my research from deductive databases to inductive databases – from “you give me rules, I will get more data” to “give me more data and I will develop rules.” I had many students joining me to work on this, and we wrote several very impactful papers and algorithms. Two of these algorithms are so influential that they are introduced in many textbooks on pattern discovery. In the Spark Machine Learning Library, they have only collected two algorithms for pattern discovery – FPGrowth and PrefixSpan – and both are from my group.
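To give a rough sense of what "pattern discovery" means here, the sketch below finds frequent itemsets, meaning sets of items that appear together in at least a minimum number of transactions. It is deliberately a brute-force counter and not FP-growth or PrefixSpan: FP-growth reports the same frequent itemsets but avoids enumerating every subset by compressing the transactions into a prefix tree. The function name `frequent_itemsets` and the shopping-basket data are assumptions made up for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Count every itemset in every transaction and keep those seen at least min_support times.

    Brute force: exponential in transaction size, so suitable for tiny examples only.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order so equal itemsets share one key
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical toy transactions.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

patterns = frequent_itemsets(baskets, min_support=2)
for itemset, support in sorted(patterns.items()):
    print(itemset, support)
```

With a minimum support of 2, each single item and each pair qualifies, while the three-item set appears in only one basket and is dropped.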

And you’re teaching this influential research in your Coursera course on pattern discovery.

Yes, in 1999, I finished the first data mining textbook (Data Mining: Concepts and Techniques), and it basically defined what data mining is. The major contribution of this book is that it defined the key issues of data mining and the key things a student needs to learn. Data mining has its own dedicated algorithms, like pattern discovery, and we also use a lot of statistics and machine learning techniques like classification and cluster analysis. 

What sets the U of I data science track apart from other universities?

Because the field and applications are so broad, we need lots of different types of experts. At UIUC, we have professors from very different backgrounds; we have people from computer science, but we also have people from library and information science, and we have people from statistics. So, I think UIUC has a unique advantage just because the university has so many great departments that students wouldn’t typically have access to.

The field of data mining has changed a lot over your career – where do you see it going?

Data mining basically serves as a bridge between core techniques like machine learning, statistics, and optimization and their application to real world problems – and we are not confined to any approach, as we can use and develop different technologies for different problems. That’s the reason data mining has life, because you are facing the real world, which is so diverse.
