My Submission to the University of Illinois at Urbana-Champaign’s Data Visualization Class

I’m a huge fan of MOOCs (Massive Open Online Courses). I am always on the hunt for something new to learn to increase my knowledge and productivity, and because I run a blog, MOOCs provide fodder for me to share what I learn.

I recently took the Data Visualization class offered by the University of Illinois at Urbana-Champaign on Coursera. The class is part of the six-course Data Mining specialization, which, when taken together, can lead to graduate credit in the university's online Master of Computer Science in Data Science degree.

OK, enough with the brochure items. For the first assignment I constructed a visualization based upon temperature information from NASA’s Goddard Institute for Space Studies (GISS).

Data Definition:

To understand the data, you first have to understand why temperature anomalies are used instead of raw absolute temperature measurements. The temperatures shown in my visualization are anomalies, not absolute temperatures.

Basic Terminology

Here’s an explanation from NOAA:

“In climate change studies, temperature anomalies are more important than absolute temperature. A temperature anomaly is the difference from an average, or baseline, temperature. The baseline temperature is typically computed by averaging 30 or more years of temperature data. A positive anomaly indicates the observed temperature was warmer than the baseline, while a negative anomaly indicates the observed temperature was cooler than the baseline.”
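To make the definition concrete, here is a minimal Python sketch of the calculation. The function name and the sample temperatures are made up for illustration, and the baseline here covers only three years to keep the example short (a real baseline would average 30 or more years, as the quote says).

```python
from statistics import mean


def temperature_anomalies(temps_by_year, baseline_start, baseline_end):
    """Return {year: observed temperature minus the baseline average}."""
    baseline_values = [
        temp for year, temp in temps_by_year.items()
        if baseline_start <= year <= baseline_end
    ]
    if not baseline_values:
        raise ValueError("no observations fall inside the baseline period")
    baseline = mean(baseline_values)
    return {year: temp - baseline for year, temp in temps_by_year.items()}


# Hypothetical absolute temperatures in degrees Celsius (not real readings).
sample_temps = {1951: 14.0, 1952: 14.5, 1953: 13.5, 2014: 14.75}

# Assumption: a shortened 1951-1953 baseline, purely for illustration.
anomalies = temperature_anomalies(sample_temps, 1951, 1953)
for year, anomaly in sorted(anomalies.items()):
    print(year, anomaly)
```

A positive result means the year was warmer than the baseline average, and a negative result means it was cooler.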

Interpreting the Visualization

The course leaves it up to the learner to decide which visualization tool to use to display the temperature change information. Although I have experience with multiple visualization programs like QlikView and Power BI, Tableau is my tool of choice. Rather than a static visualization, I created an interactive dashboard that you can reference by clicking below.

From a data perspective, I believe the numbers in the file that the course provides are a bit different from those in the file I link to here, but you can see the format of the data, which needs to be pivoted in order to make an appropriate line graph.
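For readers who have not pivoted data before, the idea is to turn one wide row per year (one column per region) into one narrow row per year and region. The Python sketch below shows this under some assumptions: the column names and the values in the sample are placeholders I made up, not the contents of the actual course file.

```python
import csv
import io

# Assumed layout: one row per year, one column per region (placeholder values).
WIDE_CSV = """Year,Glob,NHem,SHem
1880,-19,-33,-5
1881,-10,-18,-2
"""


def wide_to_long(wide_text, id_column="Year"):
    """Turn each (year, region) cell of a wide table into its own row."""
    reader = csv.DictReader(io.StringIO(wide_text))
    long_rows = []
    for record in reader:
        for column, value in record.items():
            if column == id_column:
                continue  # the year identifies the row, it is not a measure
            long_rows.append({
                "Year": int(record[id_column]),
                "Region": column,
                "Anomaly": int(value),
            })
    return long_rows


long_rows = wide_to_long(WIDE_CSV)
for row in long_rows:
    print(row)
```

With the data in this long shape, a line chart only needs Year on the x axis, Anomaly on the y axis, and Region to split the lines.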

All of the data in this set illustrates that temperature anomalies are increasing from the corresponding 1951-1980 mean temperatures as years progress. Every line graph of readings from meteorological stations shows an upward trend in temperature deviation readings. The distribution bins illustrate that the higher temperature deviations occur in more recent years. The recency of years is indicated by the intensity of the color red.

Let’s break down the visualization:

UIUC Top Portion

Top Section Distribution Charts:

  • There are three sub-sections representing global, northern hemisphere and southern hemisphere temperature deviations
  • The x axis represents temperature deviations in bins of 10 degrees
  • The y axis is a count of the number of years that fall between the binned temperature ranges
    • For example, if 10 years have a recorded temperature anomaly between 60 and 69 degrees, then the x axis would be 60 and the y axis would be 10

UIUC Distribution Focus.png

  • Each 10 degree bin is made up of the years whose temperature anomalies fall within the corresponding range
    • For example in the picture above, the year 1880 (as designated by the tooltip) had a temperature anomaly that was 19 degrees lower than the 30 year average. This is why the corresponding box for the year 1880 is not intensely colored.
    • Additionally, the -19 degree anomaly is located in the -10 degree bin (which contains anomalies from -10 to -19 degrees)
    • These aspects are more clearly illustrated when interacting with the Tableau Public dashboard
  • The intensity of the color of red indicates the recency of the year; for example year 1880 would be represented as white while year 2014 would be indicated by a deep red color
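The binning and counting described in the bullets above can be sketched in a few lines of Python. This is only an illustration: the anomaly values are made up, and the rounding rule is my reading of the examples in this post (60 to 69 falls in the 60 bin, -10 to -19 falls in the -10 bin), not necessarily how Tableau computes its bins internally.

```python
from collections import Counter


def bin_label(anomaly, bin_size=10):
    """Label a value by its bin, truncating toward zero as in the examples above."""
    # Note: under this rule the 0 bin covers -9 through 9.
    return int(anomaly / bin_size) * bin_size


# Hypothetical anomalies by year (not real GISS values).
anomaly_by_year = {1880: -19, 1881: -12, 1950: -3, 2010: 66, 2014: 68}

# x axis: the bin label; y axis: how many years landed in that bin.
years_per_bin = Counter(bin_label(value) for value in anomaly_by_year.values())
for label in sorted(years_per_bin):
    print(label, years_per_bin[label])
```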

Bottom Section Line Graph Chart:

UIUC Bottom Portion

  • The y axis represents the temperature deviation from the corresponding 1951-1980 mean temperatures
  • Each line represents the temperature deviation at a specific geographic location during the 1880-2014 period
  • The x axis represents the year of the temperature reading

UIUC Global Average

In the above picture I strip out the majority of lines leaving only the global deviation line. Climate science deniers may want to look away as the data clearly shows that global temperatures are rising.

Bottom Line:

All in all, I thought it was a decent class covering very theoretical issues regarding data visualization. Hands-on practice is confined to the exercises, as the class does not provide any instruction on how to use the tools required to complete them. I understand the reason, as this is not a “How to Use a Software Tool” class.

I’d define the exercises as “BYOE” (i.e., bring your own expertise). The class forces you to do your own research when it comes to learning a visualization tool. This is especially true of the second exercise, which requires you to learn how to visualize graphs (networks of nodes and edges). I had to learn how to use a program called Gephi in order to produce a network map of the cities in my favorite board game, Pandemic. The lines between the city nodes are the paths that one can travel within the game.

UIUC Data Viz Week 3
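A network map like this boils down to a list of edges, where each edge is a pair of connected cities. The Python sketch below builds a tiny edge list and writes it as CSV with "Source" and "Target" columns, a layout Gephi can import as an edge table. The handful of connections shown is an illustrative subset that I am assuming for the example, so check it against the game board before relying on it.

```python
import csv
import io

# Assumed, illustrative subset of city connections (not the full board).
EDGES = [
    ("Atlanta", "Chicago"),
    ("Atlanta", "Washington"),
    ("Atlanta", "Miami"),
    ("Washington", "Miami"),
]


def edges_to_csv(edges):
    """Serialize the edges as CSV text with Source and Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


def degree_by_city(edges):
    """Count how many paths touch each city in an undirected network."""
    degrees = {}
    for city_a, city_b in edges:
        degrees[city_a] = degrees.get(city_a, 0) + 1
        degrees[city_b] = degrees.get(city_b, 0) + 1
    return degrees


csv_text = edges_to_csv(EDGES)
degrees = degree_by_city(EDGES)
print(csv_text)
print(degrees)
```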

If you’re looking for more practicality and data visualization best practices as opposed to hardcore computer science topics, take a look at the Coursera specialization from UC Davis called “Visualization with Tableau”.

In case you were wondering, I received a 96% grade in the UIUC course.

My final rating for the class is 3 stars out of 5; worth a look.

Calculate Bar Chart Percent of Total in Power BI

The humble bar chart is the heart and soul of any visualization tool and is the most effective way to compare individual categorical values. We as humans are very adept at detecting small differences in length from a common baseline [1].

To quote the Harvard Business Review [2], “The ability to create smart data visualizations was once a nice-to-have skill. But in today’s complex business world, where the amount of data is overwhelming, being able to create and communicate through compelling data visualizations is a must-have skill for managers.”

If you’re going to start learning a new visualization tool, there is no better place to start than with bar chart basics. In this video I will share how to place a “percent of total” measure (i.e. value) on a Power BI bar chart. We’ll also briefly touch upon customizing the chart’s diverging color scheme.
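The arithmetic behind a “percent of total” value is simple: each category's value divided by the sum across all categories. Here is a minimal Python sketch of that calculation, separate from how Power BI itself computes it. The category names and sales figures are hypothetical.

```python
def percent_of_total(values_by_category):
    """Return each category's share of the grand total, as a percentage."""
    total = sum(values_by_category.values())
    if total == 0:
        raise ValueError("cannot compute a share of a zero total")
    return {
        category: round(100 * value / total, 1)
        for category, value in values_by_category.items()
    }


# Hypothetical sales by region, purely for illustration.
sales = {"East": 50, "West": 30, "Central": 20}

shares = percent_of_total(sales)
for region, share in shares.items():
    print(f"{region}: {share}%")
```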

Since Microsoft is basically giving away Power BI Desktop for free, it may become as ubiquitous as Excel. Don’t be left out!

References:

[1] Cotgreave, A., Shaffer, J., Wexler, S. (2017). The Big Book of Dashboards: Visualizing Your Data Using Real-World Business Scenarios. Hoboken, NJ: John Wiley & Sons, Inc.

[2] https://hbr.org/webinar/2018/02/the-right-stuff-chart-types-and-visualization-best-and-worst-practices

Add a “Filters in Use” Alert to Your Tableau Dashboard

In this video we will learn to add a “Filters in Use” alert to a Tableau dashboard. If you have a dashboard with multiple filters, apply this quick and easy tip to inform your users that filters are in play. This tip builds upon the dashboard that I showcased in a recent post: Add a Reset All Filters Button to Your Tableau Dashboard.

I learned this current tip from a presentation given by Tableau Zen Master Ryan Sleeper, so I have to give credit where credit is due.

If you’re interested in Business Intelligence & Tableau, subscribe and check out my videos either here on this site or on my YouTube channel.

Add Totals to Stacked Bar Charts in Tableau

In this video I demonstrate a couple of methods that will display the total values of your stacked bar charts in Tableau. The first method deals with a dual axis approach while the second method involves individual cell reference lines. Both approaches accomplish the same objective. Hope you enjoy this tip!

Tableau K-Means Clustering Analysis w/ NBA Data

Interact with this visualization on Tableau Public.

In this video we will explore the Tableau K-Means Clustering algorithm. K-Means Clustering is an effective way to segment your data points into groups when those data points have not explicitly been assigned to groups within your population. Analysts can use clustering to assign customers to different groups for marketing campaigns, or to group transaction items together in order to predict credit card fraud.

In this analysis, we’ll take a look at the NBA point guard and center positions. Our aim is to determine if Tableau’s clustering algorithm is smart enough to categorize these two distinct positions based upon a player’s number of assists and blocks per game.
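To show the general idea behind k-means, here is a small plain-Python sketch that groups players by assists and blocks per game. It is not Tableau's implementation, which may scale the data and pick starting points differently. The stat lines are made up and the starting centroids are chosen by hand so the result is repeatable.

```python
from math import dist
from statistics import mean


def nearest_index(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and re-averaging the centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest_index(point, centroids)].append(point)
        centroids = [
            tuple(mean(axis) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    labels = [nearest_index(point, centroids) for point in points]
    return labels, centroids


# Hypothetical (assists, blocks) per game; not real player statistics.
players = [
    (9.0, 0.25), (7.5, 0.5), (8.0, 0.25),   # guard-like stat lines
    (1.5, 2.0), (2.0, 1.5), (1.0, 2.5),     # center-like stat lines
    (6.0, 0.75),                            # a center who passes like a guard
]

# Assumption: start from one guard-like and one center-like point.
labels, centroids = k_means(players, [(9.0, 0.25), (1.5, 2.0)])
print(labels)
print(centroids)
```

The last player in the list lands in the guard-like cluster because the algorithm only sees the two numbers, not the position on the roster.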

Nikola Jokic is a Statistical Unicorn

If you also watch the following video, you’ll understand why 6 ft. 11 in. center Nikola Jokic is mistakenly categorized as a point guard by the algorithm. This big man can drop some dimes!

Ranking Banks by Number of Complaints

I recently downloaded a dataset from the Consumer Financial Protection Bureau (CFPB) in order to construct a handy visualization. The CFPB maintains a database that houses a collection of complaints on a range of consumer financial products and services that are sent to companies for a response.

Per the CFPB, “the database also includes information about the actions taken by the company in response to the complaint, such as, whether the company’s response was timely and how the company responded.”

Although the database is updated daily, I chose to visualize information from the complete year of 2017. In fairness to the financial institutions, company level information should be considered in context of company size and/or market share.
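The ranking itself is just a count of complaint records per company, sorted from most to fewest. Below is a minimal Python sketch of that step. The company names and records are placeholders, and the real download has many more fields per complaint than the single company name assumed here.

```python
from collections import Counter

# Hypothetical complaint records, one company name per complaint.
complaint_companies = [
    "Bank A", "Bank B", "Bank A", "Bank C", "Bank A", "Bank B",
]


def rank_by_complaints(companies):
    """Return (rank, company, complaint count) tuples, most complaints first."""
    counts = Counter(companies)
    return [
        (rank, company, count)
        for rank, (company, count) in enumerate(counts.most_common(), start=1)
    ]


ranking = rank_by_complaints(complaint_companies)
for rank, company, count in ranking:
    print(rank, company, count)
```

Raw counts like these favor smaller institutions, which is why the caveat about company size and market share matters when reading the ranking.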

Financial institutions analyze this information frequently as a way of understanding and continuously improving their customer service.

I highly recommend “The Big Book of Dashboards” by Jeffrey Shaffer, Andy Cotgreave and Steve Wexler. The book contains a number of visualization examples that provide guidance on dashboard creation for any number of business use cases. In this Tableau Public dashboard I relied heavily on the visual guidance for their Complaints Dashboard as you can observe.

Complaints Dashboard from “The Big Book of Dashboards”

Click on the picture link to view the dashboard on Tableau Public (not optimized for mobile).

Dashboard 1

Create Multiple KPI Donut Charts in Tableau

In honor of National Doughnut Day (June 1st), let’s devour this sweet Tableau tip without worrying about the calories. In this video we will create a multiple donut chart visualization that displays the sum of profits by region. Then we’ll use the donuts as a filter for a simple dashboard. Once you finish watching this video you’ll know how to create and use donut charts as a filter to other information on your dashboard.

I know that donuts are not considered best practice (especially when negative numbers are involved), but they have their uses. Assuming you know that bar charts are a best practice, it never hurts to learn other techniques that add a little “flair” to the boring world of bar charts.

Have you ever looked at a Picasso painting? Obviously Picasso was well versed in painting best practices (understatement), but in some of his art, the people are not rendered according to best practice. Always learn the best practices, but know when to leave them behind and add a little flair! (In no way am I comparing myself to Picasso.)

Three Musicians – Pablo Picasso

Three Musicians by Picasso is not best practice but it is a work of art!
