Information Visualization

Create a Well Designed Pareto Chart in Tableau

In this video I will show you how to build Vilfredo Pareto’s namesake chart in Tableau. The Pareto principle, also known as the 80/20 rule, states that roughly 80% of the effects come from 20% of the causes.

I will use sample Tableau Superstore data to determine which states are responsible for 80% of sales. I’ll start with a basic Pareto chart and then move on to a visualization with a little more flair. This video should serve you well in your future data analyses.
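Outside of Tableau, the arithmetic behind a Pareto chart can be sketched in a few lines of Python. This is only an illustration of the idea, not the method used in the video: the function names and the sales figures are made up, and real Superstore values would come from the data file.

```python
from itertools import accumulate


def pareto_table(sales_by_state):
    """Return (state, sales, cumulative_share) rows, largest sales first."""
    ranked = sorted(sales_by_state.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(sales for _, sales in ranked)
    running = accumulate(sales for _, sales in ranked)
    return [(state, sales, cum / total)
            for (state, sales), cum in zip(ranked, running)]


def states_reaching(rows, threshold=0.80):
    """Return the top-ranked states needed to reach the cumulative threshold."""
    selected = []
    for state, _, cum_share in rows:
        selected.append(state)
        if cum_share >= threshold:
            break
    return selected


# Hypothetical figures for illustration only; not actual Superstore values.
sample_sales = {"A": 450, "B": 370, "C": 100, "D": 50, "E": 30}
rows = pareto_table(sample_sales)
top_states = states_reaching(rows)
```

The cumulative share column is what the line in a Pareto chart plots, and the states returned by `states_reaching` are the ones to the left of the 80% mark.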

 

B.I. Basics Part 4: Learn the QlikView ApplyMap Function

There will come a time in your QlikView load scripting endeavors when you need to map a single key value to a lookup table and return the lookup value. If you’ve ever wanted a QlikView function that is somewhat analogous to a CASE statement for simple lookups and transformations, then look no further than the ApplyMap function.

My video breaks down the hard-to-interpret user manual definition and provides a simple example that will have you performing QlikView lookups in no time.
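For readers who think in general-purpose code, the lookup behavior can be approximated with a Python dictionary. This is an analogy rather than QlikView script syntax: `apply_map` and the sample mapping are hypothetical names, and the sketch assumes that a key missing from the map comes back unchanged unless a default is supplied.

```python
_MISSING = object()  # sentinel so that None can still be used as a default


def apply_map(mapping, key, default=_MISSING):
    """Look up key in mapping; fall back to the default, else to the key itself."""
    if key in mapping:
        return mapping[key]
    return key if default is _MISSING else default


# Hypothetical lookup table for illustration.
country_names = {"US": "United States", "CA": "Canada"}

found = apply_map(country_names, "US")
unmapped = apply_map(country_names, "MX")
with_default = apply_map(country_names, "MX", "Unknown")
```

Here `found` holds the mapped value, `unmapped` falls back to the original key, and `with_default` falls back to the supplied default.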

B.I. Basics Part 2: A Simple Tableau Nested Sort Solution

For those of you who are familiar with Tableau, you know that sorting can be an exercise in frustration and futility. Fortunately, once you understand how Tableau intends its sort functionality to work, you’ll discover that there is a method to the madness. My video presents a simple solution that will alleviate your sorting frustration and should find a place in your Tableau toolbox.

Anthony Smoak Final Project: Data Visualization and Communication with Tableau

 

 

I recently earned a verified course certificate from Coursera in the “Data Visualization and Communication with Tableau” class. This class is the third offered in the “Excel to MySQL: Analytic Techniques for Business” Coursera Specialization. I’m looking forward to taking a couple more MOOCs dealing with Tableau and visualization to supplement and reinforce existing knowledge. I would recommend the class to anyone looking to frame an analysis and learn a good bit about using Tableau.

Consumer Financial Protection Bureau Infographic: Complaints Analysis

Background

As a data and visualization endeavor, I put together an infographic that highlights some product complaints analyses I performed using publicly available Consumer Financial Protection Bureau data.

In case you are unfamiliar with the CFPB, it is an organization that was created in 2010 as a result of the financial calamity that gripped the nation during the Great Recession. The CFPB’s mission is to write and enforce rules for financial institutions, examine both bank and non-bank financial institutions, monitor and report on markets, and collect and track consumer complaints.

The bureau’s website hosts a consumer complaint database that houses complaints that consumers file against financial institutions. The CFPB describes the database as follows:

“Each week we send thousands of consumers’ complaints about financial products and services to companies for response. Those complaints are published here after the company responds or after 15 days, whichever comes first. By adding their voice, consumers help improve the financial marketplace.”

Process

I downloaded the complaint database from the CFPB’s website and then decided to concentrate on selected bank complaints from the many financial institutions that are present in the database. I settled on a self-defined “National” category and a “Regional” category and then analyzed the percentage of complaints across three product spaces (Mortgages, Bank Accounts and Credit Cards).

I felt a percentage approach would be more useful than merely listing a total count of complaints. The national banks category consists of four nationally known firms: JPMorgan Chase, Wells Fargo, Bank of America and Citibank. The regional banks category consists of ten fairly large regional banks that have product offerings similar to the national banks.

It’s fairly obvious that the behemoth national banks are going to have more mortgage complaints than the much smaller regional banks on a total count basis. The more interesting analysis is to compare the rate of mortgage complaints for the national banks with the rate for the regional banks. That is, divide the complaints total for a specific product, such as mortgages, by the total complaints for all products, and calculate this percentage for the national and regional categories across all three products.

I carried out this analysis using the ggplot package in R to generate the base graphics for the infographic. Adobe Illustrator was then used to further refine the graphics into what you see below:

[Infographic: IST 719 Final Project, blog version]

I have an additional unrefined chart that is a straight output from the ggplot package in R. I didn’t have enough space to include it on the infographic. The analysis is the same as the one represented in the bottom quadrant of the infographic, except that it applies solely to regional banks.

The analysis consists of totaling all of the specific PRODUCT complaints filed against a particular bank and then dividing that number by the total number of ALL complaints filed against the individual bank (e.g. Total mortgage complaints filed against a bank/total complaints filed against a bank). I call the resulting number the Complaint Ratio.
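The Complaint Ratio arithmetic can be sketched in Python as follows. This is not the R code behind the charts; the function name, the bank names and the complaint records are all hypothetical, and the sketch assumes the data has been reduced to one (bank, product) pair per complaint.

```python
from collections import Counter, defaultdict


def complaint_ratios(complaints):
    """Map each bank to {product: share of that bank's total complaints}."""
    per_bank = defaultdict(Counter)
    for bank, product in complaints:
        per_bank[bank][product] += 1

    ratios = {}
    for bank, counts in per_bank.items():
        total = sum(counts.values())  # all complaints filed against this bank
        ratios[bank] = {product: n / total for product, n in counts.items()}
    return ratios


# Hypothetical records for illustration; not actual CFPB data.
sample = (
    [("Bank X", "Mortgage")] * 3
    + [("Bank X", "Credit card")] * 1
    + [("Bank Y", "Bank account or service")] * 2
    + [("Bank Y", "Mortgage")] * 2
)
ratios = complaint_ratios(sample)
```

Because each bank’s ratios are divided by that bank’s own total, they sum to 1 per bank, which is what makes banks of very different sizes comparable.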

In the ggplot graph output below, we can see that the “Bank Account or service” product represents about 67% of all complaints filed against Regions. If I were to break out the numbers on a total count basis, we’d see that Regions’s overall complaints total is relatively small compared to other banks. However, the bulk of its complaints are concentrated in the “Bank Account or service” product area.

[Chart: Regional Data by Product]

May your next bank be your best bank.

Additional Reading:

An Interesting Comparison of Bank of America to JPMorgan Chase

L.A. Lakers Visualization: R Code Plus Illustrator for the Win

I have been a huge Los Angeles Lakers fan since growing up on the West Coast; I lived in Los Angeles for a year and Las Vegas for many years as a kid. Magic Johnson and the “Showtime” squad of the 1980s will always be the best team dynasty in NBA history, in my rather biased opinion. I wanted to make a visualization using base R code to plot a bar chart of Lakers wins by season and then use Adobe Illustrator to complete the effort. Using a .csv data file from Basketball-Reference.com, I was able to tell the story of the franchise in an easy-to-comprehend visualization. I love bringing data to life and making it tell a story!

Laker Wins By Season


Visualizations with R and Adobe Illustrator

I’ve been reading Visualize This by Nathan Yau to better understand visualization concepts. The book provides some direction on how to begin graphing data in R and then touch up the graphics in Adobe Illustrator. Here are a few visualizations I was able to create with some basic knowledge of R code and Adobe Illustrator. Nathan’s book provides most of the R code, but the Illustrator portion took some work to get just right.


Billboard Hip Hop Chart Visualized 1989-2015

[Image: Polygraph Billboard hip hop chart]

I enjoy a great work of visualization, and this interactive data graph by Polygraph charting the top 10 Billboard hip hop hits from 1989-2015 is phenomenal. The number one song in the list plays until it is supplanted by the next chart-topping song.

For someone like me who grew up in the 1990s and listened to the golden age of rap music, this graph is a very enjoyable walk down memory lane, conjuring up mental images of high school, college and good times thereafter. I can pinpoint where I surrendered my knowledge of mainstream hip hop as it ceded to the tastes of a younger generation (around 2011). Enjoy the link:

http://poly-graph.co/billboard/