Tableau Table Calculations Made Simple

In this Tableau tutorial I will discuss a basic quick table calculation and try to demystify what is happening behind the scenes. All of the following will hopefully be made more clear in the video but I’m sharing the text below for reference after you watch the video.

I can’t take credit for the content in this post as the Tableau online help site has some quality information on table calculations which I will reference below. There are two important concepts in understanding table calculations: partitioning and addressing.

Key Concepts with Table Calculations: Addressing vs. Partitioning

The dimensions that define how to group the calculation (the scope of data it is performed on) are called partitioning fields. The table calculation is performed separately within each partition.

Partitioning fields break the view up into multiple sub-views (or sub-tables), and then the table calculation is applied to the marks within each such partition.

The remaining dimensions, upon which the table calculation is performed, are called addressing fields, and determine the direction of the calculation.

The direction in which the calculation moves (for example, in calculating a running sum, or computing the difference between values) is determined by the addressing fields.

So when you order the fields in the Specific Dimensions section of the Table Calculation dialog box from top to bottom, you are specifying the direction in which the calculation moves through the various marks in the partition.

When you add a table calculation using the Compute Using options, Tableau identifies some dimensions as addressing and others as partitioning automatically, as a result of your selections.

But when you use Specific Dimensions, then it’s up to you to determine which dimensions are for addressing and which for partitioning.

At the level (Partitioning)

The At the level option is only available when you select Specific Dimensions in the Table Calculations dialog box, and when more than one dimension is selected in the field immediately below the Compute Using options —that is, when more than one dimension is defined as an addressing field.

This option is not available when you’re defining a table calculation with Compute Using, because those values establish partitions by position. But with Specific Dimensions, because the visual structure and the table calculation are not necessarily aligned, the At the level option is available to let you fine-tune your calculation.

Use this setting to set a break (that is, restart of the calculation) in the view, based on a particular dimension. How is this different from just using that dimension for partitioning? In fact, it is partitioning, but it’s partitioning by position rather than by value, which is how partitioning is defined with the Compute Using options.

Filtering on Table Calculations in Tableau

Filtering on Table Calculations in Tableau can be a bit hacky. More often than not, table calculations are dependent upon the data in the view. That means in order to filter on a table calculation, we need a way to preserve underlying data and only hide data from the view.

Filters based on table calculations do not filter out underlying data in the data set, because table calculation filters are applied last in the order of operations. Instead, the data is hidden from the view, allowing dimension members to be hidden from the view without impacting the data in the view.

Notice in the order of operations diagram below how Dimension and Measure filters are applied before Table calculation filters. When trying to filter on a table calculation (which is dependent upon the data in the view) the results may be unexpected. If we turn our dimension or measure into a table calculation, we can then filter the Table calculation at the appropriate level, which preserves underlying data and only hides the table calculation values we wish to filter from the view.

Tableau Order of Operations

The content in this post was quoted from the sources below.

As always, If you find this type of instruction valuable make sure to subscribe to my Youtube channel.

References:

All views and opinions are solely my own and do NOT necessarily reflect those my employer.

Create Rounded Bar Charts in Tableau

Part 1: How to Make Rounded Bar Charts in Tableau

In this post you’re getting two videos for the price of one (considering they’re all free for now, that’s a good thing). I put together a relatively simple dashboard to help illustrate a few intermediate level concepts. In this first video I take a look at the number of total assists by NBA players during the 2017-2018 season. In case you were wondering, Russell Westbrook led the league in assists during that season. If you don’t know who Russell Westbrook is, then skip this Tableau stuff and watch the last video immediately (and then come back to the Tableau stuff).

In the first Tableau dashboard video, you’ll learn two concepts:

  • How to make rounded bar charts;
  • How to filter the number of bar chart marks via use of a parameter;

Part 2: Apply Custom Sorting in Tableau

In the second video I build upon the dashboard built in the first video by showing you how to add a custom sort. The custom sort relies upon the creation of a parameter and a calculated field. The parameter and calculated field enable the user to select either a dimension (e.g., Player Name) or a measure (e.g., sum of assists) from a drop down box and the visualization will sort ascending or descending as requested.

The calculated field relies upon the RANK_UNIQUE function.

In this context, RANK_UNIQUE returns the unique rank of each player’s assist total. The key with RANK_UNIQUE is that identical values are assigned different ranks. As an example, the set of values (6, 9, 9, 14) would be ranked (4, 2, 3, 1), as no tied rankings are allowed.

Part 3: Interact with the Dashboard

Bonus: Russell Westbrook on the Attack

For those of you who do not know who Russell Westbrook is, I’ve got you covered. These aren’t assists but in these situations, he didn’t need to pass!

References:

Thanks to both the Tableau Magic blog for outlining the concept of rounded bar charts and the VizJockey blog for the custom sort methodology. Check out and support these  blogs!

As always, do great things with your data!

If you find this type of instruction valuable make sure to subscribe to my Youtube channel.

The Ultimate Tableau Slope Graph Video

In this video I tackle the subject of slope graphs also known as slope charts. I had some fun putting together this dashboard that illustrates the changes in wins for NBA teams during the 2016-2017 and 2017-2018 seasons. From the video you’ll discover that Chicago, Atlanta and Memphis are on a Hindenburg-like trajectory, while trusting the process in Philadelphia led to huge season gains in overall wins.

Here’s what you will learn from this video:

  • How to create a parameter that enables a user to select which win statistic measure to visualize;
  • How to use Table calculations like, LOOKUP(), FIRST() and LAST() to calculate period over period change;
  • How the impact of Mike Conley’s injury affected the Memphis Grizzlies last season;

Click the pic to interact with the Tableau Public visualization, also download the workbook and data to dissect as needed.

For your convenience the calculated fields that I used to create the measures are listed here. Note that [Selected Measure] is a parameter that you need to create that lists all of the measures.

Calc Select Measure
CASE [Selected Measure]
WHEN “Home Losses” Then [Home Losses]
WHEN “Home Wins” Then [Home Wins]
WHEN “Overall Losses” Then [Overall Losses]
WHEN “Overall Wins” Then [Overall Wins]
WHEN “Road Losses” Then [Road Losses]
WHEN “Road Wins” Then [Road Wins]
WHEN “vs East Conf Losses” Then [vs East Conf Losses]
WHEN “vs East Conf Wins” Then [vs East Conf Wins]
WHEN “vs West Conf Losses” Then [vs West Conf Losses]
WHEN “vs West Conf Wins” Then [vs West Conf Wins]

END
Better or Worse
IF [Selected Measure] = “Home Wins” OR
[Selected Measure] = “Overall Wins” OR
[Selected Measure] = “Road Wins” OR
[Selected Measure] = “vs East Conf Wins” OR
[Selected Measure] = “vs West Conf Wins”
THEN
//WIN MEASURES: Negative delta treated as “WORSE”, Positive delta treated as “BETTER”
(IF [Delta] < 0 THEN “WORSE” ELSEIF [Delta] = 0 THEN “SAME” ELSE “BETTER” END)
ELSE
//LOSS MEASURES: Positive delta treated as “WORSE” (more losses are worse), Negative delta treated as “BETTER”
(IF [Delta] > 0 THEN “WORSE” ELSEIF [Delta] = 0 THEN “SAME” ELSE “BETTER” END)
END
Delta
LOOKUP(SUM([Calc Select Measure]),LAST()) – LOOKUP(SUM([Calc Select Measure]),FIRST())
Delta ABS Value
ABS(LOOKUP(SUM([Calc Select Measure]),LAST()) – LOOKUP(SUM([Calc Select Measure]),FIRST()))
ToolTip
<Team> Trend: <AGG(Better or Worse)> by <AGG(Delta ABS Value)>
During the <Season> Season, the <Team> had <SUM(Calc Select Measure)> <Parameters.Selected Measure>.

I have to give thanks to Ben Jones at the Data Remixed blog for the inspiration!

As always, If you find this type of instruction valuable make sure to subscribe to my Youtube channel.

How to Compare Actuals vs. Forecast in Tableau

Forecasting in Tableau uses a technique known as exponential smoothing. This is when an algorithm tries to find a regular pattern in your data that can be continued into the future.

In this video I’ll share some helpful tips to help you determine which options you should select that will enable Tableau to make the most predictive forecast for your data. By the end of the video you will be able to differentiate between an additive and multiplicative data pattern and to evaluate MASE to measure the accuracy of the forecast.

I’m not talking about this Mase:

Harlem World

Rather, you’ll learn about the mean absolute scaled error (i.e., MASE) and how it helps you judge the quality of the model.

In addition, you’ll also also learn how to compare your actual data to the Tableau forecast in order to judge if the model is doing its job.

If you’ve used the forecasting capabilities in Tableau without knowing about these concepts, you might have generated an inaccurate error riddled forecast. Don’t just set a forecast and forget it. Watch this video and generate better forecasts in Tableau!

Here is additional reading from Tableau on the forecast descriptions (including MASE).

As always, If you find this type of instruction valuable make sure to subscribe to my Youtube channel.

Row and Column Highlighting in Tableau

In this post you’ll learn how to highlight values in your Tableau table using set actions. The dashboard in this video displays the number of total points scored by NBA teams by position in the 2017-2018 season. I will give you step by step instructions on how to implement row and column highlighting on this dataset downloaded from basketballreference.com.

I’ve only made a few minor tweaks but this technique was developed by Tableau Zen Master Matt Chambers. You can check out his blog at sirvizalot.com and follow him at Big shout out to Matt for sharing this technique with the Tableau community!

You can interact with my visualization on Tableau Public:

If you find this type of instruction valuable make sure to subscribe to my Youtube channel!

Connecting Tableau with R: Combining Beauty and Power

The following is a guest post from Sinna Muthiah Meiyappan.

Tableau, as we all know, is the go-to tool for visualization. R is an open-source statistical language used by academicians, statisticians, data analysts and data scientists to develop machine learning models. R’s popularity is owed to the power of its packages.

Why Connect Tableau with R?

An analytics tool can generate enormous value if it has the following aspects:

1) a user-friendly interface for a business user
2) the scope and power to run different analyses and deliver the results to the interface in an easily interpretable way for the business user to understand and make decisions

Tableau can be used to develop beautiful dashboards which can serve as a very good user interface. R can be used to develop and run complex machine learning models whose results can be displayed in Tableau dashboards in an easily interpretable way.

In short, by linking Tableau and R, you are giving business users the opportunity to create enormous value by running and analyzing the results of complex machine learning models without having knowledge or experience in R!

How to Connect Tableau with R

1) Install Rserve package and run Rserve command by typing the following lines in your R console:

install.packages("Rserve")
Rserve()

Now, R should print ‘Starting Rserve…’. If you observe this output, then R is reaching out to Tableau for a connection.

2) Open Tableau & click on Help > Settings & Performance > Manage External Service Connections.
3) In the dialog box that opens, choose ‘localhost’ for Server and type ‘6311’ for Port.
4) Then, click on Test Connection

Now, you should get a dialog box stating, ‘Successfully connected to the R serve service’. If you observe this message, then you are all set to use the power of R from within Tableau.

Objective of this article

All companies maintain a database containing details about their customers. Customer Segmentation using clustering algorithms is a proven technique used by companies to understand its customers. This helps you offer targeted deals that convert better! The objective of this article is to create a dashboard where an executive from a wholesale business can segment customers using R within Tableau, without having to know R Programming, just by clicking in the dashboard. Then, visualize the different customer segments to find patterns that can be used to make business decisions like providing incentives, discounts, etc.

Dataset Used

The dataset referenced in this post can be downloaded in CSV format from UCI Machine Learning repository by clicking on ‘Data Folder’ link at the top left of the page. In this dataset, each row contains details about a customer of the Wholesaler regarding his channel, region and the dollar amount spent by him on different product categories like Fresh, Frozen, Grocery, Milk, Detergent Paper & Delicatessen. This dataset can be mined to understand the buying patterns/characteristics of customers based on their spending.

Designing a Customer Segmentation dashboard in Tableau

Loading data into Tableau

Open the CSV file, add a column named ‘Customer ID’ and fill it from 1 to 440.
Connect to the CSV file using the ‘Text File’ option in Tableau and you should see the below screen in your Tableau worksheet.

Picture1

Creating Parameters

Let’s cluster users based on their spending in different categories using k-means clustering algorithm. K-Means is an unsupervised learning algorithm that identifies clusters in data based on similarity of features used to form the clusters. In our case, these features would be the customers spending in different categories.

K-means algorithm requires the following two inputs for it to form clusters: (1) the number of clusters to be formed & (2) How many times to run the algorithm before picking the best clusters. Basically, the K-Means algorithm takes as many random center points as specified by the user in input 1 and starts assigning the data points to these center points based on the criteria defined to calculate distance. Once all points are assigned, it measures the Sum of Squared Errors (SSE). This process is repeated as many times as specified in input 2. Then, the model with the least SSE is picked and shown as the result.

Let’s give the option of specifying the number of clusters formed to the user who is going to use the dashboard by creating a Tableau parameter. To do that, right click on the Measures pane and click on ‘Create Parameter’. Fill the Create Parameter dialog box with details as shown below and click OK.

Picture2

Right click on the newly created ‘Centers’ parameter and select ‘Show Parameter Control’ and set Centers value to 5.
For our results to be reproducible, we need to ‘set seed’ in R. The seed defines the random numbers generated that are used in the algorithms. So, whenever we set the same seed and run a model, we would get the same results. Let us give this option of defining the seed also to our user by creating a parameter as follows:

Picture3

Right click on the newly created ‘Seed’ parameter and select ‘Show Parameter Control’ and set seed value to 500. Now, your screen should look like this:

Picture4

SCRIPT Functions to Receive Data From R

Now, to segment customers into different clusters, we are going to pass data to R, cluster it and get the results back in a Calculated Field that can be used in Tableau Visualizations. Tableau has four scripting functions which are related to the type of data they would receive from the external service Tableau is connected to. They are SCRIPT_REAL, SCRIPT_INT, SCRIPT_BOOL & SCRIPT_STR.

The syntax of a Script function is as follows:
SCRIPT_XXX (‘code to be executed as a string’, argument 1, argument2, etc.)

The arguments are referred as ‘.arg1’, ‘.arg2’, etc. respectively inside the code. All the arguments whether quantitative or categorical should be passed as aggregated versions to R since this is a table calculation. Since all our variables are quantitative, we aggregate them using the SUM function. The last line of the code in string would be returned as an output of the function by R to this calculated field.

So, let’s create a Calculated Field named ‘Cluster_kmeans’ with the following code in it and click OK:

Picture5

Code Recap:

1) Line 1: Sets seed, so that the results can be regenerated again if needed. As already said, K-means choses the center points at random and starts assigning data points to these cluster centers. The clustering results will change if these center points change. So, seed is used to define the random numbers generated so that if you set the same seed and run, the same center points will be taken up for clustering.

2) Line 2: Combines all features (i.e. the customer spends in each category) that must be used for clustering into a data frame.

3) Line 3: Runs the k Means model 20 times, in the above generated data frame, with as many cluster centers as specified by the user in the dashboard.

4) Line 4: The best model out of the above 20 is picked and its results are returned to the Cluster_kmeans calculated field in Tableau, which can be used in visualizations.

We pass 8 arguments to the Script function. First is Seed, second is Centers (as in the number of clusters that we want k-means algorithm to form for this data) and the remaining 6 are the spend in each category. The six spend arguments are vectors of 440 dollar amounts each (as there are 40 customers in our dataset).

What is interesting is, the Seed and Centers parameters are also passed as vectors of length 440, but of the same value. i.e. Seed has ‘500’ repeated 440 times. In R, we pass a single number as seed, not a vector of length more than 1. So, we extract the first element of the vector alone and pass it to ‘set.seed’ function as follows:

Set.seed(.arg1[1])

Similarly, to the centers argument of k-means algorithm also we pass only a single value. If given a single value, the algorithm generates that many number of random centers and starts assigning points to clusters. If you give a vector to this argument, then it takes the values in the vector as the spatial location of center points and starts assigning points to clusters. Now, if you pass a vector of length 440 where all values in the vector being the same, then you would get an error as follows:

Error in kmeans(ds, centers = .arg2, nstart = 20) : initial centers are not distinct

To avoid this, we extract the first element of the vector alone and pass it to ‘kmeans’ function as follows:

km_res <- kmeans(ds, centers = .arg2[1], nstart = 20)

Visualizing the clusters created by R

Now that the calculated field for clusters is created, we can start creating the visualization shown in the figure below. Before that, go to Analysis > Aggregate Measures and turn it off. Because, if you don’t, Tableau will aggregate all the spend for all customers and send a single row of input to R, resulting in the following error. This error says there are more than one center points to start with and only one data point that can be assigned. Hence, the clustering cannot be completed.

Error in kmeans(ds, centers = .arg2[1], nstart = 20) : more cluster centers than distinct data points

Picture6

This sheet describes the spend of each cluster in Fresh and Frozen categories. To create the above visualization in Tableau, do the following:

1) Drag and drop ‘Cluster_kmeans’ into ‘Columns’ shelf and into Color
2) Drag and drop ‘Fresh’ and ‘Frozen’ into ‘Rows’ shelf
3) Set the chart type to ‘Gantt Bar’
4) From ‘Analytics’ tab at the top left, drag and drop ‘Box Plot’ into ‘Cells’ of the chart
5) From ‘Analytics’ tab at the top left, drag and drop ‘Average Line’ into ‘Panes’ of the chart
6) Adjust Opacity in the color palette by setting it to 50%
7) Fix the Y-axis of the graphs to read from 0 to 115,000 so that it would be easier to compare across all the six charts we are going to develop for the dashboard. ( I chose 115,000 as the upper limit, because the max spend by a single customer in a single category is of the order of 112k)
8) You can adjust the tooltip to reflect the information that you deem necessary/relevant at this point.

Now, similarly create two other worksheets which would look as follows:

Picture7

This sheet describes the spend of each cluster in Milk and Groceries categories.

Picture8

This sheet describes the spend of each cluster in Delicatessen and Detergent Paper categories.

Creating Customer Segmentation Dashboard

Create a new dashboard and drag all three worksheets you created into the dashboard. Arrange the parameter controls and chart legends appropriately. Use blank containers to fill space if needed. Now your dashboard may look like this.

Picture9

So now, we have put together a dashboard where a business user (in this case, the Wholesaler) can choose how many clusters he wants to form from his customer data and understand about the different clusters formed. The cluster results generated are always reproducible if the user inputs the same seed again (in this case 500).

Recalculating Clusters at the Snap of Your Fingers!

To change the number of clusters from 5 to any other value, all the user must do is to change the number in the Centers parameter list box. Tableau will pass the updated parameter to R, clusters will be re-calculated through Table calculation in R, and the dashboard will be updated to reflect the new cluster details! The below image shows the dashboard when the Centers is changed from 5 to 3.

Picture10

Features of this Customer Segmentation Dashboard

Since, we fixed all Y axes to the same range, these charts are comparable across too.
The Gantt bar chart shows each customer as a bar in the chart. This will also help us identify if the clusters formed have too few customers in them (they will have very few bars) so that, we can be taking decisions keeping that in mind.
Hovering over the box plots in each pane will give details about the quantiles of spending by the respective cluster in that respective category.

Clicking on a specific cluster in the chart legend at the right top will highlight only that cluster in the dashboard by dimming others and all the average lines in each category pane will be recalculated to show the average spend of this specific cluster in that category.

Interpreting Results and Taking Actions to Generate Business Value

We, at Perceptive Analytics, feel that the main aim of any business intelligence dashboard is to deliver insight and inspire action of the business users. Because, business value will not be created until any one of them acts on the insights! It is our (dashboard designer’s) job to make sure that the dashboard is designed in a way such that the business users does not have to spend more time searching for insights or trying to understand the dashboard.

They should spend very less time in organizing data & more time in interpreting information / insights and creating / strategizing action plans.
Let’s now see what insights we can generate out of this dashboard with three clusters, shown above:

  • The highest spend in registered by a customer in Fresh category (at the left top of the dashboard).
  • Categories Fresh, Grocery, Milk and Detergents Paper have registered a decent number of spends above 20k dollars. So, these are the categories where the Wholesaler gets more money from his customers.
  • Looking at the spending pattern of the three clusters formed in the above graph, I can gather the following:
    • Cluster1 (Blue) seems to spend in all categories indifferently (In all categories, their spend is spread above and below the respective category average spend)
    • Cluster2 (Orange) spends way above category averages in grocery, milk and detergents paper categories
    • Cluster3 (Green) spends extra ordinarily in Fresh category and like Cluster1 in all other categories.

Within the same cluster, some customers purchase many items, others less. Recommendations can be provided to those who purchase less based on the missing items, compared to the customer who purchased more in the same category. Category related offer mails and discount mails can be targeted to those group of customers who actively purchase in that respective category.

Now, the wholesaler can create test and control groups within clusters and start sending targeted marketing campaigns to the customers in relevant segments. The test and control split would be useful to know if the campaigns are working or not. By continuously repeating this experiment of targeted marketing campaigns and analyzing the results after, the wholesaler can fine-tune himself to look for patterns in his customer data and know what kind of a marketing campaign will motivate them to purchase more.

Conclusion

In this article, we created a dashboard that enables a business user to segment customers and visualize the spending patterns of different clusters to understand customers better. We leveraged the power of R to segment the customers from within Tableau itself. This knowledge can be used to make business decisions like offering trade discounts or targeted promotional offers.

Author Bio

This article was contributed by Sinna Muthiah Meiyappan.

Image courtesy of ARPORN SEEMAROJ / 123rf.com

How to Drill into Data Using Set Actions in Tableau

Drilling with Set Actions

If you’ve ever tried to use the default drill functionality within Tableau, you know that it could be a more user friendly experience. The default table drill functionality opens all of the options at the next drill level which can force a user to lose sight of the data upon which they’re focusing. A more user-friendly option enables the user to only drill into a specific selected value where focus and attention can be maintained. This is otherwise known as asymmetric drill down.

Fortunately as of version 2018.3, Tableau has added Set Actions as a new functionality. At a high level, developers can take an existing set and update its values based upon a user’s actions in the visualization. The set can be employed via a calculated field within the visualization, via direct placement in the visualization or on the marks card property.

In lay terms this means empowering a user with more interactivity to impact their analyses.

In this first video, I’ll demonstrate a use of set actions on an NBA data set. We’ll drill from Conference to Division to Team to Player. This tip will be easily applicable to your Tableau data. And with the bonus tree-map tip you’ll release your inner Piet Mondrian.

Feel free to interact with the set action example on Tableau Public and then download and dissect the workbook.


Drilling with Level of Detail (LOD) Calculations
If you want to stay with a classic approach, a nice Level of Detail (LOD) workaround can be employed to drill into the next level. Here is a tip that accomplishes a similar outcome where I demonstrate a technique originally presented by Marc Rueter at Tableau Conference 2017.

Now that I’ve equipped you with the knowledge to incorporate customized drilling functionality into your analyses, go forth and do some great things with your data!

References:

https://onlinehelp.tableau.com/current/pro/desktop/en-us/actions_sets.htm
https://www.tableau.com/learn/tutorials/on-demand/set-actions
https://www.basketball-reference.com/leagues/NBA_2018.html
https://www.youtube.com/watch?v=d22A4XVoUEs

Image Copyright dzxy on 123rf.com