Big Data: How Data Analytics Is Transforming the World
Dr. Tim Chartier is an Associate Professor of Mathematics and Computer Science at Davidson College. He holds a B.S. in Applied Mathematics and an M.S. in Computational Mathematics, both from Western Michigan University. He received his Ph.D. in Applied Mathematics from the University of Colorado Boulder. Professor Chartier is a recipient of a national teaching award from the Mathematical Association of America (MAA). He is the author of Math Bytes: Google Bombs, Chocolate-Covered Pi, and Other Cool Bits in Computing and coauthor (with Anne Greenbaum) of Numerical Methods: Design, Analysis, and Computer Implementation of Algorithms. As a researcher, he has worked with both Lawrence Livermore National Laboratory and Los Alamos National Laboratory, and his research was recognized with an Alfred P. Sloan Research Fellowship. Dr. Chartier is a member and past chairperson of the Advisory Council for the National Museum of Mathematics, and was named the first Math Ambassador of the Mathematical Association of America. He fields mathematical questions for ESPN's Sport Science program and has served as a resource for the CBS Evening News, National Public Radio, The New York Times, and other major news outlets.
01: Data Analytics - What's the "Big" Idea?
Sample the tremendous scope and power of data analytics, which is transforming science, business, medicine, public policy, and many other spheres of modern life. Investigate why this revolution is happening now, and look at some common misconceptions about data analysis.
02: Got Data? What Are You Wondering About?
Data analysis is not just for large organizations and large datasets; it's also for the average person. Learn how to put data to work in your own life, from charting your cell phone usage to personalizing your medical care or improving your exercise routine.
03: A Mindset for Mastering the Data Deluge
Today's data users often feel like they're drinking from a fire hose of information. Investigate strategies that help manage the data deluge, and learn efficient ways to think about data that separate what's genuinely useful from what can be strategically ignored.
04: Looking for Patterns - and Causes
Humans are experts at pattern recognition, which is a key skill in data analysis. But when are patterns real and when are they imagined? Study some surprising correlations between apparently unrelated phenomena, asking whether a cause-and-effect relation or mere coincidence is involved.
05: Algorithms - Managing Complexity
Algorithms (rules to follow for solving problems) are the secret of managing huge datasets. Start by looking at simple algorithms, including an amazingly effective sorting procedure that you can perform by hand. Then see how these concepts apply to more complex problems, such as web search engines.
06: The Cycle of Data Management
Study what happens after you gather data. It must first be stored, then organized, integrated with data from other sources, and analyzed. Now you are ready to act on the information that the data provides. Determine how this cycle works in practice, and uncover some hidden pitfalls.
07: Getting Graphic and Seeing the Data
Graphics have long been a compelling way to present and understand data. Survey some unusually effective graphics from the pre-computer era. Then explore the wealth of graphical tools available today. Graphics can reveal new information, but they can also obscure it when used poorly.
08: Preparing Data Is Training for Success
"Garbage in, garbage out" is a famous expression in computer science, underscoring the importance of starting with reliable data. Learn how data is prepared to remove errors and ambiguities. As an example, see how the US Postal Service perfected machines that can read hastily scribbled addresses.
09: How New Statistics Transform Sports
Follow the saga of the 2002 Oakland A's, famously depicted in the book and film Moneyball. Thanks to data analytics, the A's made it to the major league playoffs with a roster of undervalued players. Survey the increasing role of data at all levels of sports competition.
10: Political Polls-How Weighted Averaging Wins
Study the role of big data in predicting election results. Contrast the disastrous 1936 presidential poll by the Literary Digest with today's impressively accurate aggregators of polls, such as statistician Nate Silver. Analyze what makes aggregation more effective than any single poll.
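As a rough illustration of the weighted averaging named in the lecture title, the sketch below combines several polls into one estimate, weighting each by its sample size. The function name aggregate_polls and the poll figures are hypothetical, and real aggregators may also weight by recency, pollster track record, and other factors not shown here.

```python
def aggregate_polls(polls):
    """Return the sample-size-weighted average of poll results.

    polls: list of (share_percent, sample_size) pairs (hypothetical format).
    """
    total_weight = sum(size for _, size in polls)
    if total_weight == 0:
        raise ValueError("at least one poll must have a nonzero sample size")
    # Larger polls pull the combined estimate toward their own result.
    return sum(share * size for share, size in polls) / total_weight


# Made-up polls for one candidate: (percent support, respondents).
example_polls = [(52.0, 1000), (48.0, 500), (50.0, 1500)]
combined = aggregate_polls(example_polls)
print(round(combined, 2))
```

Weighting by sample size is only one reasonable choice; it reflects the idea that a poll of 1,500 people should count for more than a poll of 500.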
11: When Life Is (Almost) Linear - Regression
Explore the power of regression analysis for modeling the past and future, focusing on a technique called the linear least squares method. As an example, use data from Olympic gold medal times for the 100-meter dash. Calculate a theoretical fastest possible time for the event.
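A minimal sketch of the linear least squares method for a single predictor follows. The function name fit_line and the sample points are assumptions made for illustration; they are not the Olympic data used in the lecture.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points lying exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)  # 2.0 1.0
```

With real, noisy data the points will not lie on the line; the fitted line is the one that minimizes the sum of squared vertical distances to the points.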
12: Training Computers to Think like Humans
Delve into the field of artificial intelligence, discovering how computers are programmed to think and make decisions like humans. An automated version of the 20 questions game illustrates how neural networks are the key to machine learning, a technology that is now in widespread use.
13: Anomalies and Breaking Trends
Sometimes it is the odd bit of data (the outlier in a sea of statistics) that is crucial to solving a mystery. See how sophisticated anomaly detection has led to a significant drop in credit card fraud. The same approach helps understand cultural trends that go viral.
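To make the idea of an outlier concrete, here is a deliberately simple sketch that flags values lying far from the median, measured in units of the median absolute deviation. This is an illustrative assumption, not the method used by card issuers; the function name flag_outliers, the threshold of 3.5, and the purchase amounts are all hypothetical.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return the values whose distance from the median exceeds
    `threshold` times the median absolute deviation (MAD)."""
    center = median(amounts)
    mad = median([abs(a - center) for a in amounts])
    if mad == 0:
        # At least half the values equal the median; no usable spread.
        return []
    return [a for a in amounts if abs(a - center) / mad > threshold]


# Made-up purchase amounts with one suspicious charge.
purchases = [12, 15, 11, 14, 13, 500]
print(flag_outliers(purchases))  # [500]
```

The median is used instead of the mean because a single extreme value shifts the mean substantially but barely moves the median.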
14: Simulation - Beyond Data, Beyond Equations
Enter the world of simulation, which allows researchers to model behavior that would otherwise be too dangerous or expensive to study. Investigate the history of the subject and its multiplying applications, from science and engineering to entertainment.
15: Overfitting - Too Good to Be Truly Useful
Learn how to avoid the perils of overfitting, which is when an overly complex model or noisy data leads to flawed conclusions. Explore object lessons in this common pitfall, including an earthquake forecast that was disastrously wrong.
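The following sketch shows overfitting on a tiny, made-up dataset. Five noisy points that roughly follow y = x are fit two ways: with a straight line, and with a degree-4 polynomial that passes through every point exactly. All names and numbers are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpolate(xs, ys, x):
    """Evaluate the unique polynomial through all (xs, ys) points at x
    (Lagrange form). With n points its degree is at most n - 1."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Hypothetical training data: y = x plus alternating noise of +/- 0.3.
train_x = [0, 1, 2, 3, 4]
train_y = [0.3, 0.7, 2.3, 2.7, 4.3]

slope, intercept = fit_line(train_x, train_y)
new_x, true_y = 5, 5.0  # a point the models have not seen

line_error = abs(slope * new_x + intercept - true_y)
poly_error = abs(interpolate(train_x, train_y, new_x) - true_y)
print(line_error < poly_error)  # True
```

The polynomial has zero error on the training points because it has absorbed the noise, and that is exactly why it predicts the unseen point far worse than the simpler line.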
16: Bracketology - The Math of March Madness
Every year, millions of people engage in a hugely popular data exercise called March Madness. See how a mathematical approach called bracketology helps you excel at picking winners in the playoff games of the NCAA basketball tournament.
17: Quantifying Quality on the World Wide Web
Internet searches used to be frustratingly hit-or-miss. See how Google changed that by creating a realistic model of the way web surfers use the Internet. Then look at attempts to hijack search results to improve page rankings and how programmers thwart these tactics.
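One common way to model a web surfer is as a random walk: with some probability the surfer follows a link on the current page, and otherwise jumps to a random page. The sketch below is a simplified version of that idea in the style of PageRank, not Google's production algorithm. The damping value of 0.85, the iteration count, and the four-page web are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate page importance by repeatedly redistributing rank.

    links: dict mapping each page to the list of pages it links to.
    Every linked page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no links spreads its rank over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical web: A, B, and D all link to C; C links back to A.
web = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))  # C
```

Page C comes out on top because three pages link to it, which matches the intuition that links act as votes for quality.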
18: Watching Words - Sentiment and Text Analysis
We are nearing the point where every book ever written is accessible and searchable in digital form, as is already the case for the even more voluminous texts from Twitter, Facebook, and other media. Learn how data analysts mine this limitless storehouse of words for new cultural and business insights.
19: Data Compression and Recommendation Systems
Data compression is crucial for storing and transmitting digital images at a fraction of their original size. See how compression also improves online recommendations, as shown by the Netflix million-dollar competition, which led to a new algorithm for personalized recommendations.
20: Decision Trees - Jump-Start an Analysis
Probe the power of decision trees by breaking down the demographics of survivors of the Titanic disaster, an analysis that tells the tragic story of events aboard the sinking ship. Then test decision trees in other applications, marveling at their ability to carve quickly through data.
21: Clustering - The Many Ways to Create Groups
Clustering is a powerful way to discover new relationships in data by sorting it into groups, called clusters. Explore this family of techniques by searching for clusters in the Million Song Dataset. Then try other examples that show the exceptional flexibility of clustering.
22: Degrees of Separation and Social Networks
Test the popular theory that six steps, at most, connect you to any person on the planet. Social networks like Facebook provide a wealth of data for quantifying our relative connectedness. See how graph theory helps you to visualize any linked phenomena.
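In graph terms, the number of steps connecting two people is the length of the shortest path between them, which a breadth-first search can find. The sketch below assumes a small, made-up friendship network; the function name degrees_of_separation and the people in it are hypothetical.

```python
from collections import deque


def degrees_of_separation(friends, start, goal):
    """Return the fewest friendship steps from start to goal,
    or None if the two people are not connected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, steps = queue.popleft()
        for other in friends.get(person, ()):
            if other == goal:
                return steps + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, steps + 1))
    return None


# Hypothetical network: a chain ann - bob - cat - dan, with eve isolated.
network = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat"],
    "eve": [],
}
print(degrees_of_separation(network, "ann", "dan"))  # 3
print(degrees_of_separation(network, "ann", "eve"))  # None
```

Breadth-first search explores everyone one step away before anyone two steps away, so the first time it reaches the goal it has found a shortest connection.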
23: Challenges of Privacy and Security
Big data can be a big threat to privacy. Learn how surveillance cameras, smartphones, and Internet use provide a wealth of opportunities for tracking specific individuals. Examine privacy issues raised by corporate and government activity, and review what you can do to lead a more secure life.
24: Getting Analytical about the Future
Focus on a branch of data analytics called predictive analytics, which is concerned with predicting the future. Imagine attending a predictive analytics conference years from now. What can you expect? Answer the question with the tools you have learned in the course, and come up with some surprising forecasts!