Connecting Tableau with R: Combining Beauty and Power

The following is a guest post from Perceptive Analytics.

Tableau, as we all know, is the go-to tool for visualization. R is an open-source statistical language used by academics, statisticians, data analysts and data scientists to develop machine learning models. R owes much of its popularity to the power of its packages.

Why Connect Tableau with R?

An analytics tool can generate enormous value if it has the following aspects:

1) a user-friendly interface for a business user
2) the scope and power to run different analyses and deliver the results to that interface in an easily interpretable way, so the business user can understand them and make decisions

Tableau can be used to develop beautiful dashboards which can serve as a very good user interface. R can be used to develop and run complex machine learning models whose results can be displayed in Tableau dashboards in an easily interpretable way.

In short, by linking Tableau and R, you are giving business users the opportunity to create enormous value by running and analyzing the results of complex machine learning models without having knowledge or experience in R!

How to Connect Tableau with R

1) Install the Rserve package, load it, and start the server by typing the following lines in your R console:

install.packages("Rserve")
Rserve()

Now, R should print ‘Starting Rserve…’. If you see this output, Rserve is running and listening for incoming connections (by default on port 6311), ready for Tableau to connect.

2) Open Tableau & click on Help > Settings & Performance > Manage External Service Connections.
3) In the dialog box that opens, choose ‘localhost’ for Server and type ‘6311’ for Port.
4) Then, click on Test Connection

Now, you should get a dialog box stating, ‘Successfully connected to the R serve service’. If you observe this message, then you are all set to use the power of R from within Tableau.

Objective of this article

All companies maintain a database containing details about their customers. Customer segmentation using clustering algorithms is a proven technique companies use to understand their customers, and it helps you offer targeted deals that convert better! The objective of this article is to create a dashboard where an executive from a wholesale business can segment customers using R within Tableau, without having to know R programming, just by clicking in the dashboard. The executive can then visualize the different customer segments to find patterns that inform business decisions like providing incentives, discounts, etc.

Dataset Used

The dataset referenced in this post can be downloaded in CSV format from the UCI Machine Learning Repository by clicking the ‘Data Folder’ link at the top left of the page. Each row contains details about one of the wholesaler’s customers: the channel, the region and the amount spent on each of six product categories (Fresh, Frozen, Grocery, Milk, Detergent Paper and Delicatessen). This dataset can be mined to understand the buying patterns and characteristics of customers based on their spending.

Designing a Customer Segmentation dashboard in Tableau

Loading data into Tableau

Open the CSV file, add a column named ‘Customer ID’ and fill it from 1 to 440.
Connect to the CSV file using the ‘Text File’ option in Tableau and you should see the below screen in your Tableau worksheet.

[Image: Tableau worksheet after connecting to the CSV file]

Creating Parameters

Let’s cluster customers based on their spending in different categories using the k-means clustering algorithm. K-means is an unsupervised learning algorithm that identifies clusters in data based on the similarity of the features used to form the clusters. In our case, these features are the customers’ spending in the different categories.

The k-means algorithm requires two inputs to form clusters: (1) the number of clusters to be formed and (2) how many times to run the algorithm before picking the best clustering. The algorithm takes as many random center points as specified in input 1 and assigns each data point to its nearest center. After all points are assigned, each center is recomputed as the mean of its points and the assignment repeats until the clusters stabilize; the quality of the result is then measured by the Sum of Squared Errors (SSE) within the clusters. This whole process is repeated as many times as specified in input 2, and the run with the least SSE is picked and shown as the result.
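Before wiring this into Tableau, here is a minimal sketch of the same step in plain R, using the built-in kmeans function; the data frame name spend is a stand-in for our six spend columns, not something defined in this post:

# Minimal sketch, assuming `spend` is a data frame holding the six
# spend columns (Fresh, Frozen, Grocery, Milk, etc.).
set.seed(500)             # fix the random number generator so runs are reproducible
km <- kmeans(spend,
             centers = 5, # input 1: number of clusters to form
             nstart = 20) # input 2: number of random starts; the best run is kept
km$cluster                # cluster assignment (1 to 5) for each customer
km$tot.withinss           # total within-cluster SSE of the chosen run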

Let’s give the user of the dashboard the option of specifying the number of clusters to form by creating a Tableau parameter. To do that, right-click in the Measures pane and click ‘Create Parameter’. Fill the Create Parameter dialog box with the details shown below and click OK.

[Image: Create Parameter dialog box for the ‘Centers’ parameter]

Right-click on the newly created ‘Centers’ parameter, select ‘Show Parameter Control’ and set the Centers value to 5.
For our results to be reproducible, we need to set a seed in R. The seed initializes R’s random number generator, so running the same model with the same seed always generates the same random numbers and hence the same results. Let us give the option of defining the seed to our user as well by creating a parameter as follows:

[Image: Create Parameter dialog box for the ‘Seed’ parameter]

Right-click on the newly created ‘Seed’ parameter, select ‘Show Parameter Control’ and set the Seed value to 500. Now, your screen should look like this:

[Image: Worksheet showing the Centers and Seed parameter controls]

SCRIPT Functions to Receive Data From R

Now, to segment customers into different clusters, we are going to pass data to R, cluster it there and get the results back in a calculated field that can be used in Tableau visualizations. Tableau has four scripting functions, named for the type of data they receive back from the external service Tableau is connected to: SCRIPT_REAL, SCRIPT_INT, SCRIPT_BOOL and SCRIPT_STR.

The syntax of a SCRIPT function is as follows:
SCRIPT_XXX('code to be executed, as a string', argument 1, argument 2, ...)

The arguments are referred to as .arg1, .arg2, etc. inside the code. All arguments, whether quantitative or categorical, must be passed to R in aggregated form, since a SCRIPT function is a table calculation. As all our variables are quantitative, we aggregate them using the SUM function. The value of the last line of the R code is what gets returned by R to this calculated field.

So, let’s create a Calculated Field named ‘Cluster_kmeans’ with the following code in it and click OK:

[Image: The Cluster_kmeans calculated field]
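For reference, the code in that screenshot runs along the following lines. This is a reconstruction from the recap and the snippets later in this section, so treat the exact field names and the order of the spend arguments as assumptions:

SCRIPT_INT("
set.seed(.arg1[1]);
ds <- data.frame(.arg3, .arg4, .arg5, .arg6, .arg7, .arg8);
km_res <- kmeans(ds, centers = .arg2[1], nstart = 20);
km_res$cluster
",
SUM([Seed]), SUM([Centers]), SUM([Fresh]), SUM([Frozen]),
SUM([Grocery]), SUM([Milk]), SUM([Detergent Paper]), SUM([Delicatessen]))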

Code Recap:

1) Line 1: Sets the seed, so that the results can be regenerated if needed. As already said, k-means chooses its starting center points at random and then assigns data points to these cluster centers, and the clustering results change if those starting points change. So, the seed pins down the random numbers generated: if you set the same seed and run again, the same center points will be used for clustering.

2) Line 2: Combines all the features (i.e. the customer spend in each category) that are to be used for clustering into a data frame.

3) Line 3: Runs the k-means model 20 times on the data frame built above, with as many cluster centers as the user specified in the dashboard.

4) Line 4: The best model out of the 20 runs is picked, and its cluster assignments are returned to the Cluster_kmeans calculated field in Tableau, where they can be used in visualizations.

We pass 8 arguments to the SCRIPT function. The first is Seed, the second is Centers (the number of clusters we want the k-means algorithm to form for this data) and the remaining 6 are the spends in each category. The six spend arguments are vectors of 440 dollar amounts each (as there are 440 customers in our dataset).

What is interesting is that the Seed and Centers parameters are also passed as vectors of length 440, with the same value repeated throughout; i.e. Seed is ‘500’ repeated 440 times. In R, set.seed expects a single number, not a vector of length greater than 1. So, we extract the first element of the vector alone and pass it to the set.seed function as follows:

set.seed(.arg1[1])

Similarly, we pass only a single value to the centers argument of the kmeans function. Given a single value, the algorithm generates that many random centers and starts assigning points to clusters. Given a vector, it takes the values in the vector as the spatial locations of the center points, which must be distinct. So, if you pass a vector of length 440 in which all the values are the same, you get an error as follows:

Error in kmeans(ds, centers = .arg2, nstart = 20) : initial centers are not distinct

To avoid this, we extract the first element of the vector alone and pass it to ‘kmeans’ function as follows:

km_res <- kmeans(ds, centers = .arg2[1], nstart = 20)

Visualizing the clusters created by R

Now that the calculated field for clusters is created, we can start creating the visualization shown in the figure below. Before that, go to Analysis > Aggregate Measures and turn it off. If you don’t, Tableau will aggregate the spend across all customers and send a single row of input to R, resulting in the following error: there are more cluster centers requested than there are data points to assign, so the clustering cannot be completed.

Error in kmeans(ds, centers = .arg2[1], nstart = 20) : more cluster centers than distinct data points

[Image: Gantt bar chart of Fresh and Frozen spend by cluster]

This sheet describes the spend of each cluster in Fresh and Frozen categories. To create the above visualization in Tableau, do the following:

1) Drag and drop ‘Cluster_kmeans’ into ‘Columns’ shelf and into Color
2) Drag and drop ‘Fresh’ and ‘Frozen’ into ‘Rows’ shelf
3) Set the chart type to ‘Gantt Bar’
4) From ‘Analytics’ tab at the top left, drag and drop ‘Box Plot’ into ‘Cells’ of the chart
5) From ‘Analytics’ tab at the top left, drag and drop ‘Average Line’ into ‘Panes’ of the chart
6) Adjust Opacity in the color palette by setting it to 50%
7) Fix the Y-axis of the graphs to read from 0 to 115,000 so that it is easier to compare across all six charts we are going to develop for the dashboard. (I chose 115,000 as the upper limit because the maximum spend by a single customer in a single category is around 112k.)
8) You can adjust the tooltip to reflect the information that you deem necessary/relevant at this point.

Now, similarly create two other worksheets which would look as follows:

[Image: Gantt bar chart of Milk and Grocery spend by cluster]

This sheet describes the spend of each cluster in Milk and Groceries categories.

[Image: Gantt bar chart of Delicatessen and Detergent Paper spend by cluster]

This sheet describes the spend of each cluster in Delicatessen and Detergent Paper categories.

Creating Customer Segmentation Dashboard

Create a new dashboard and drag all three worksheets you created into the dashboard. Arrange the parameter controls and chart legends appropriately. Use blank containers to fill space if needed. Now your dashboard may look like this.

[Image: Assembled customer segmentation dashboard]

So now, we have put together a dashboard where a business user (in this case, the wholesaler) can choose how many clusters to form from the customer data and learn about the different clusters formed. The cluster results are reproducible whenever the user inputs the same seed (in this case, 500).

Recalculating Clusters at the Snap of Your Fingers!

To change the number of clusters from 5 to any other value, all the user has to do is change the number in the Centers parameter control. Tableau passes the updated parameter to R, the clusters are recalculated through the table calculation in R, and the dashboard updates to reflect the new cluster details! The image below shows the dashboard when Centers is changed from 5 to 3.

[Image: The dashboard with Centers changed from 5 to 3]

Features of this Customer Segmentation Dashboard

  • Since we fixed all Y-axes to the same range, the charts are comparable across categories.
  • The Gantt bar chart shows each customer as a bar. This also helps us identify clusters that have too few customers in them (they will have very few bars), so that we can keep that in mind when taking decisions.
  • Hovering over the box plot in each pane gives details about the quantiles of spending by the respective cluster in that respective category.
  • Clicking on a specific cluster in the chart legend at the top right highlights only that cluster in the dashboard by dimming the others, and all the average lines in each category pane are recalculated to show the average spend of that specific cluster in that category.

Interpreting Results and Taking Actions to Generate Business Value

We at Perceptive Analytics feel that the main aim of any business intelligence dashboard is to deliver insight and inspire action in its business users, because business value is not created until someone acts on the insights! It is our job as dashboard designers to make sure the dashboard is designed in such a way that business users do not have to spend time searching for insights or trying to understand the dashboard.

They should spend as little time as possible organizing data and as much time as possible interpreting information and insights and strategizing action plans.
Let’s now see what insights we can generate out of this dashboard with three clusters, shown above:

  • The highest spend registered by a single customer is in the Fresh category (at the top left of the dashboard).
  • The Fresh, Grocery, Milk and Detergents Paper categories have registered a decent number of spends above 20k dollars, so these are the categories where the wholesaler earns the most from his customers.
  • Looking at the spending patterns of the three clusters formed in the graph above, I can gather the following:
    • Cluster 1 (Blue) spends near the average in all categories (in every category, its spend is spread above and below the respective category average)
    • Cluster 2 (Orange) spends way above the category averages in the Grocery, Milk and Detergents Paper categories
    • Cluster 3 (Green) spends extraordinarily in the Fresh category and like Cluster 1 in all other categories.

Within the same cluster, some customers purchase many items and others fewer. Recommendations can be provided to those who purchase less, based on the items they are missing compared to customers who purchase more in the same category. Category-specific offer and discount mails can be targeted at the groups of customers who actively purchase in that category.

Now, the wholesaler can create test and control groups within clusters and start sending targeted marketing campaigns to the customers in the relevant segments. The test and control split is useful for knowing whether the campaigns are working. By continuously repeating this experiment of targeted marketing campaigns and analyzing the results afterwards, the wholesaler can train himself to look for patterns in his customer data and learn what kind of marketing campaign will motivate customers to purchase more.

Conclusion

In this article, we created a dashboard that enables a business user to segment customers and visualize the spending patterns of different clusters to understand customers better. We leveraged the power of R to segment the customers from within Tableau itself. This knowledge can be used to make business decisions like offering trade discounts or targeted promotional offers.

Author Bio

This article was contributed by Perceptive Analytics. Sinna Muthiah Meiyappan contributed to this article.
Perceptive Analytics provides Tableau Consulting, data analytics, business intelligence and reporting services to e-commerce, retail, healthcare and pharmaceutical industries. Our client roster includes Fortune 500 and NYSE listed companies in the USA and India.

Image courtesy of ARPORN SEEMAROJ / 123rf.com


How to Drill into Data Using Set Actions in Tableau

Drilling with Set Actions

If you’ve ever tried to use the default drill functionality within Tableau, you know that it could be a more user-friendly experience. The default table drill functionality opens all of the options at the next drill level, which can force a user to lose sight of the data on which they’re focusing. A more user-friendly option enables the user to drill into only a specific selected value, where focus and attention can be maintained. This is otherwise known as asymmetric drill down.

Fortunately, as of version 2018.3, Tableau has added Set Actions as new functionality. At a high level, developers can take an existing set and update its values based upon a user’s actions in the visualization. The set can be employed via a calculated field within the visualization, via direct placement in the visualization, or on the marks card.

In lay terms this means empowering a user with more interactivity to impact their analyses.

In this first video, I’ll demonstrate a use of set actions on an NBA data set. We’ll drill from Conference to Division to Team to Player. This tip will be easily applicable to your Tableau data. And with the bonus tree-map tip you’ll release your inner Piet Mondrian.
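Before the video, here is a rough sketch of the calculated-field approach; the set and field names below are hypothetical, not necessarily the ones used in the workbook:

// Hypothetical drill field: a conference expands into its divisions only
// when the user's selection (via the set action) has put it in the set.
IF [Selected Conferences] THEN [Division] ELSE [Conference] END

Placing a field like this in the view and pointing a select-type set action at [Selected Conferences] produces the asymmetric drill: the conferences you click expand into divisions while the rest stay collapsed.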

Feel free to interact with the set action example on Tableau Public and then download and dissect the workbook.


Drilling with Level of Detail (LOD) Calculations
If you want to stay with a classic approach, a nice Level of Detail (LOD) workaround can be employed to drill into the next level. Here is a tip that accomplishes a similar outcome where I demonstrate a technique originally presented by Marc Rueter at Tableau Conference 2017.

Now that I’ve equipped you with the knowledge to incorporate customized drilling functionality into your analyses, go forth and do some great things with your data!

References:

https://onlinehelp.tableau.com/current/pro/desktop/en-us/actions_sets.htm
https://www.tableau.com/learn/tutorials/on-demand/set-actions
https://www.basketball-reference.com/leagues/NBA_2018.html
https://www.youtube.com/watch?v=d22A4XVoUEs

Image Copyright dzxy on 123rf.com

T-SQL Tips and Quick Reference

Whenever I have to fire up SQL Server to perform some analyses, there are a few functions, keywords and capabilities that I always find myself referring to in order to analyze my data. As is the case with most T-SQL users, even those of us who have been using T-SQL for over a decade always know in our heads what we want to do, but will refer to our favorite syntax reference sources in order to progress. I decided to make a handy reference sheet for myself and then decided to post it here for anyone else.

How to Create a Temporary Table in T-SQL / SQL Server

Temporary (i.e., temp) tables enable the storage of result sets from SQL scripts, yet require less record-locking overhead and thus increase performance. They remain in effect until they are explicitly dropped, or until the connection that created them is closed.

As I see it, their main benefit is that they keep me from writing difficult-to-comprehend nested queries, since I can place a result set inside a temp table and then join it back to a normal table at will.

In this example, the results of permanent table ‘TABLE1’ will be placed into global temporary table ##TEMPTABLE:

SELECT 
     FIELDNAME1,
     FIELDNAME2,
     FIELDNAME3 
INTO ##TEMPTABLE 
FROM TABLE1

Temp tables are stored in the tempdb system database.

TempDB

“The tempdb system database is a global resource that is available to all users connected to the instance of SQL Server or connected to SQL Database.”

What do the Hashtags Mean in T-SQL Temp Table Creation?

The number of hash signs (“#”) preceding the name of the temp table determines whether the scope of the table is local or global.

  • If you precede the temp table name with “#”, then the table will be treated as a local temp table.
  • If you precede the temp table with “##”, then the table will be treated as a global temp table.

“You can create local and global temporary tables. Local temporary tables are visible only in the current session, and global temporary tables are visible to all sessions. Temporary tables cannot be partitioned. Prefix local temporary table names with single number sign (#table_name), and prefix global temporary table names with a double number sign (##table_name).”
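As a quick illustration using the same SELECT ... INTO pattern as above (the table and field names are placeholders):

-- Local temp table: visible only to the session that created it
SELECT FIELDNAME1, FIELDNAME2 INTO #TEMPTABLE FROM TABLE1

-- Global temp table: visible to all sessions until dropped
SELECT FIELDNAME1, FIELDNAME2 INTO ##TEMPTABLE FROM TABLE1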


How to Drop a Temp Table in T-SQL / SQL Server

There are times when you will need to rerun code that creates a temp table. If the temp table has already been created, you will encounter an error:

“There is already an object named ‘##TEMP_TABLE_NAME’ in the database.”

Place the following code above the creation of your temp tables to force SQL Server to drop the temp table if it already exists. Change ##TEMP_TABLE_NAME to your table name and use the correct number of hashtags as applicable to a local (#) or global (##) temp table.

IF OBJECT_ID('tempdb..##TEMP_TABLE_NAME') IS NOT NULL
DROP TABLE ##TEMP_TABLE_NAME

How to Add a New Field to a Temp Table in T-SQL / SQL Server (ALTER TABLE)

Here is example T-SQL that illustrates how to add a new field to a global temp table. The code below adds a simple bit field (holds either 1 or 0) named FIELD1 to the temp table, declares it as NOT NULL (i.e., it must have a value) and then defaults the value to 0.

ALTER TABLE ##TEMP_TABLE
ADD FIELD1 Bit NOT NULL DEFAULT (0)

The following code changes the data type of an existing field in a global temp table. FIELD1 has its data type changed to NVARCHAR(20) and is declared as NOT NULL.

ALTER TABLE ##TEMP_TABLE
ALTER COLUMN FIELD1 NVARCHAR(20) NOT NULL;


How to Use a CASE Statement in T-SQL / SQL Server

The following information on the CASE statement is direct from Microsoft:

The CASE expression evaluates a list of conditions and returns one of multiple possible result expressions. The CASE expression has two formats:

  • The simple CASE expression compares an expression to a set of simple expressions to determine the result.
  • The searched CASE expression evaluates a set of Boolean expressions to determine the result.

Both formats support an optional ELSE argument.

CASE can be used in any statement or clause that allows a valid expression. For example, you can use CASE in statements such as SELECT, UPDATE, DELETE and SET, and in clauses such as select_list, IN, WHERE, ORDER BY, and HAVING.

Examples from Microsoft:

SELECT
    ProductNumber,
    Category =
        CASE ProductLine
            WHEN 'R' THEN 'Road'
            WHEN 'M' THEN 'Mountain'
            WHEN 'T' THEN 'Touring'
            WHEN 'S' THEN 'Other sale items'
            ELSE 'Not for sale'
        END,
    Name
FROM Production.Product
ORDER BY ProductNumber;


SELECT
    ProductNumber,
    Name,
    "Price Range" =
        CASE
            WHEN ListPrice = 0 THEN 'Mfg item - not for resale'
            WHEN ListPrice < 50 THEN 'Under $50'
            WHEN ListPrice >= 50 and ListPrice < 250 THEN 'Under $250'
            WHEN ListPrice >= 250 and ListPrice < 1000 THEN 'Under $1000'
            ELSE 'Over $1000'
        END
FROM Production.Product
ORDER BY ProductNumber;

Here is a link to a great post that highlights some of the unexpected results when using the CASE statement.


How to Use the Cast Function in T-SQL / SQL Server

When you need to convert a data field or expression to another data type, the CAST function can be helpful. I typically need to take imported text fields and evaluate them as datetime values. The cast statement below helps me resolve this issue.

Select cast(txtOrder_Date as datetime) as Order_Date

This statement can also be used in a WHERE clause to filter the text as if it were a true datetime field.

Where cast(txtOrder_Date as datetime) between '20170101' and '20181231'

Furthermore, you can cast a literal string to an integer or decimal as needed.

Select cast('12345' as int) as Integer_Field
Select cast('12345.12' as decimal(9,2)) as Decimal_Field

When your FIELDNAME is a text value, you can use the cast function to change its data type to an integer or decimal, and then sum the results. Here are a few examples I have had to use in the past with the sum function.

sum(cast(FIELDNAME as int)) as Sum_Overall_Qty

sum(cast(ltrim(rtrim(FIELDNAME2)) as decimal(38,2))) as Sum_Sales_Price


Using the REPLACE Function in T-SQL / SQL Server

The Replace function is useful when you need to replace all occurrences of one character or substring with another character or substring. The following select will replace the string ‘Anthony’ with ‘Tony’.

Select REPLACE('My name is Anthony', 'Anthony', 'Tony');

[Image: REPLACE query result]


How to Convert a Negative Text Number in Parenthesis Format to a Numeric Data Type (T-SQL / SQL Server)

I’ve found this particular expression useful when trying to convert a negative number in text format to a decimal value when the text is enclosed in parentheses; i.e., changing (123.45) to -123.45.

It makes use of the REPLACE function to find the leading parenthesis and replace it with a negative sign. This first REPLACE is nested inside another REPLACE function that finds the trailing parenthesis and replaces it with an empty string.

Select cast(replace(replace('(123.45)','(','-'),')','') as money);

[Image: nested REPLACE and CAST query result]

You can also use the convert function to accomplish the same result. Below I used this line of code to sum the negative formatted text (represented by FIELD_NAME) by converting it to the money data type after replacing the parentheses.

sum(convert(money,replace(replace(FIELD_NAME,'(','-'),')',''))) as Sum_Domestic_Price


COALESCE Function in T-SQL / SQL Server

The COALESCE function is very useful for replacing NULL field values with a substitute value. Per Microsoft, the COALESCE function evaluates a comma-delimited list of expressions in order and returns the value of the first expression that does not evaluate to NULL.

For example,

SELECT COALESCE(NULL, NULL, 'third_value', 'fourth_value');

returns the third value because the third value is the first value that is not null. I will use the COALESCE function at times to replace NULL values with 0 for use in calculations.

Select COALESCE(NULL_FIELD, 0)

https://docs.microsoft.com/en-us/sql/t-sql/language-elements/coalesce-transact-sql?view=sql-server-2017

The Dos and Don’ts of Designing Efficient Tableau Dashboards

The following is a guest post contributed by Perceptive Analytics.

A dashboard is to a user what an assistant is to a boss. While an assistant helps manage multiple tasks for a boss, a dashboard helps manage multiple data sources for a user. Insights are only as good as the underlying data and dashboards are an excellent medium to provide those insights.

Dashboards provide “at-a-glance” views of key metrics which are relevant for business users to perform their tasks effectively. In other words, dashboards are an interactive form of reporting which provides users with consolidated views of different metrics to make impactful, data-driven decisions. A dashboard should speak on the creator’s behalf, acting as an expert providing actionable insights to its users. The dashboard should be self-sufficient when it comes to answering the question, “what can my data tell me?”

There are a plethora of tools available in the market for creating dashboards. However, a badly designed dashboard or an incompatible (or wrong) tool can lead to hundreds of thousands of dollars in losses when accounting for the inefficient time and effort spent by development and analysis teams. It is therefore imperative for an organization to choose the right tool and follow a step-by-step approach to dashboard development.

Currently, one of the top business intelligence tools available in the market is Tableau. It is used to create interactive dashboards for users. Tableau has been named a ‘Leader’ in the Gartner Magic Quadrant for six years in a row (Source: Tableau.com).

In this post, we will highlight a few best practices that you should follow when developing your Tableau dashboard. We will also talk about some of the pitfalls you should avoid while creating a Tableau dashboard.

We’ll divide the best practices into three different stages of dashboard development.

  1. Pre-Development: Ideation and Conceptualization
  2. Development
  3. Post Development: Maintenance

Ideation and Conceptualization

During the conceptualization and ideation stage, there are a few aspects that one should consider before starting to develop a dashboard.

1. Goal

Understand clearly why you are creating the dashboard in the first place. What is the end objective that you want to achieve via this dashboard? Is it automating a reporting process at month-end? Is it providing a better visualization to a complex calculation created in another platform?

Having a clear understanding of your dashboarding goal or objective keeps you focused and on the right track.

2. Audience

Keep in mind that your audience is a key part of creating a purposeful, impactful dashboard. The dashboard used by the CEO or other members of the C-suite will be very different from the dashboard used by business unit heads, which in turn will be very different from the dashboards used by branch managers. Thus, you need to consider who will use your dashboard and how it will be used.

For instance, a CEO is interested in key metrics at an overall organizational level, like the overall financial and operational health of the company. On the other hand, a procurement manager would be interested in the amount of material being procured from different vendors and their respective procurement costs. Having a goal in mind before development is essential because it helps identify the end users of the dashboard.

3. Key Performance Indicators (KPIs)

After thoroughly understanding the various stakeholder requirements, it is important to develop a list of KPIs for each user and/or department. Having the stakeholders sign off on dashboard KPIs substantially reduces development and rework time.

4. Data Sources

After achieving sign-off on KPIs, inventory the various data sources that are required for development. This step is important because each data source can potentially increase complexity and computing costs required to calculate the KPIs. It’s always better to only connect those data sources which contain relevant data.

5. Infrastructure

Storage and computation requirements should be taken into consideration commensurate with the dashboard’s degree of data volume and complexity. Having a right-sized backend infrastructure will improve dashboard performance considerably. Also, it is essential to understand the dashboard’s update frequency. Will the data be refreshed once a day? Is it going to be real-time? Having the answer to these questions will help generate infrastructure requirements that will prevent performance issues down the road.

Development

Once you have identified what needs to be presented on the dashboard and set up the infrastructure, it’s time to move to the second phase of dashboard development.

The following items should be considered during the development phase.

6. Design

Design is an important part of overall dashboard development. You should be very selective with the colors, fonts and font sizes that you employ. There is no rule book that establishes the right color or the right font for dashboard design; in our opinion, one should design with the company’s color scheme in mind.

This is a safe bet as it keeps the company’s brand identity intact, especially if the dashboard is accessible to external parties. Fonts should not be very light in color and the charts should not be very bright. Having a subtle color scheme that incorporates the brand’s identity resonates well with internal and external parties.

7. Visualization Impact

Identify the right type of visualization to create an impactful first glance for the users. Certain types of data points are better represented by certain types of visualizations. For instance, a time trend analysis is usually represented on a line graph, while a comparison of the same metric across different business lines is presented well via a heat map. Consider a sales dashboard: revenue and cost numbers for the current year should be presented as standalone numbers with a larger font size, while the historical trend analysis is placed below.

8. Captions and Comments

Tableau provides users with the functionality to add captions and comments to visualizations. Bear in mind that you won’t be around all the time to explain what the different charts in the dashboard represent. Therefore, add relevant descriptions, comments and/or captions wherever they can be useful for the viewer.

Post Development: Maintenance

Once you have created the dashboard, there are additional aspects you should consider for effective and smooth dashboard operation.

9. Robust Testing

After creating the dashboard, conduct robust testing of the entire platform. Testing helps identify any bugs and deployment errors which if not rectified can lead to system failure or erratic results at a later stage.

10. Maintenance

This is the most ignored phase in the dashboard development lifecycle but it is a crucial phase. Once you have created a dashboard, proper maintenance should be conducted in terms of software updates, connections to databases and infrastructure requirements. If the volume of data increases at a fast pace, you will need to upgrade the storage and computing infrastructure accordingly so that the system doesn’t crash or become prohibitively slow.

Avoid the Following

Up to this point we have highlighted some of the best practices to consider while creating a dashboard. Now, let’s broach the aspects you should avoid while creating a dashboard.

1. Starting with a Complex Dashboard

Remember that creating a dashboard is a phased approach. Trying to develop an overly complicated dashboard in one phase may complicate things and lead to project failure. The ideal approach is to inventory and prioritize all requirements and proceed in phases. Start development with the highest-priority requirements or KPIs and gradually move to the lower-priority KPIs in subsequent phases.

2. Placing Too Many KPIs on a Single Chart

Although Tableau has the capability to handle multiple measures and dimensions in a single chart, you should be judicious while choosing the dimensions and measures you want to present in a single graph. For instance, placing revenue, expenses and profit margins in a single chart may be of value; while placing revenue and vendor details in the same chart may not be as valuable.

3. Allocating Too Little Time to Deployment and Maintenance

The appropriate amount of time, budget and resources should be allocated to each constituent phase of the deployment cycle (i.e., KPI identification, dashboard development, testing and maintenance).

We are sure that after reading this post, you have a better idea of what practices to consider while developing a Tableau dashboard. The principles offered here are from a high-level perspective; there may be other project nuances to consider in your specific endeavors. We would be happy to hear your thoughts and the best practices that you follow while creating a Tableau dashboard.

Author Bio

This article was contributed by Perceptive Analytics. Prudhvi Sai Ram, Saneesh Veetil and Chaitanya Sagar contributed to this article.

Perceptive Analytics provides Tableau Consulting, data analytics, business intelligence and reporting services to e-commerce, retail, healthcare and pharmaceutical industries. Our client roster includes Fortune 500 and NYSE listed companies in the USA and India.

Yet Another Market Basket Analysis in Tableau

This video represents part two in my Market Basket Analysis series.

The steps in this post were inspired by the book Tableau Unlimited, written by a former co-worker of mine, Chandraish Sinha. I wasn’t planning to construct another market basket analysis video, but when I saw the approach outlined in his book, I felt it warranted sharing with my readers and followers.

In this version we’ll use default Tableau Superstore data to show the relationship between sub-categories on an Order; all without using a self table join. The visualization and analysis is driven by a user selection parameter.

Once the user selects a sub-category, the bar chart visualization updates to reflect the number of associated sub-category items on the same order.

[Image: Sample Superstore market basket visualization]
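Here is a sketch of the kind of calculation that can drive this without a self-join; the parameter and field names are my own stand-ins, not necessarily the ones used in the video:

// Order-level flag: TRUE for every row of an order that contains
// the sub-category chosen in the [Select Sub-Category] parameter
{ FIXED [Order ID] : MAX([Sub-Category] = [Select Sub-Category]) }

Filtering this field to True and counting distinct orders by [Sub-Category] shows, for each sub-category, how often it appears on the same order as the selected one.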

Watch the video and as always get out there and do some great things with your data!

Feel free to also check out Part 1 here where we create a simpler correlation matrix version that shows all the sub-category relationships in one visual.


Market Basket Analysis in Tableau

A favored analysis technique employed by retailers to help them understand the purchase behavior of their customers is the market basket analysis. When you log on to Amazon, you’ve most likely noticed the “Frequently Bought Together” section, where Jeff Bezos and company would like to cross-sell you additional products based upon the purchase history of other people who have bought the same item.

Market Basket Analysis influences how retailers institute sales promotions, loyalty programs, cross-selling/up-selling and even store layouts.

If a retailer observes that most people who purchase Coca-Cola also purchase a package of Doritos (I know they’re competing companies), then it may not make sense to discount both items at once as the consumer might have purchased the associated item at full price anyhow. Understanding the correlation between products is powerful information.

In this video, we’ll use Tableau Superstore data to perform a simple market basket analysis.

[Image: Sample Superstore market basket visualization]

Feel free to interact with this market basket analysis on Tableau Public and then download and dissect the workbook.

Watch the video and as always get out there and do some great things with your data.

Feel free to also check out Part 2 here where we’ll create an analysis driven by a user selection parameter.

When Corporate Layoffs Don’t Work

“When downsizing is a knee-jerk reaction, it has long-term costs. Employees and labor costs are rarely the true source of the problems facing an organization. Workers are more likely to be the source of innovation and renewal.” [1]

Case in Point: Circuit City Laid Off Employees for Over-performance

A combination of factors led to the demise of former electronics retailer Circuit City, and a number of them were self-inflicted wounds. The company located its stores in subprime locations, stopped selling appliances to cut warehouse storage and distribution costs, and underinvested in its web presence at a time when consumer preferences were beginning to shift online.

However, the company’s biggest blunder was its decision to lay off its most experienced and knowledgeable salespeople while trying to compete in the cutthroat electronics retail marketplace. In March of 2007, Circuit City announced a scheme to lay off 3,400 hourly workers (roughly 8% of its workforce), while offering a severance package with the ability to reapply for their former jobs at a reduced salary. Any reapplication had to occur after a mandatory 10-week cooling-off period. Circuit City practiced genteelism by branding its cost-cutting and de-skilling scheme a “wage management initiative”.

Management decided to staff its stores with fewer people, with fewer skills, making less money, and expected this combination to yield long-term positive results. As a result of the layoffs, Circuit City placed its knowledgeable, experienced sales staff on a platter and served them to its main competitor, Best Buy. Additionally, where did Circuit City expect to find quality people who would work for a company that did not value loyalty, experience and wage increases?

“From a strategy perspective, customer-facing sales personnel would appear to be a core resource and potential differentiator for a consumer products retailer,” he [Kevin Clark, an assistant professor of management at Villanova School of Business] says. “Especially in an era of rapidly changing and more complex consumer electronics, knowledgeable sales personnel who are perceived by customers as ‘experts’ can be a source of competitive advantage.” [2]

Reportedly, “employees who were paid more than 51 cents above a set pay range for their departments were fired.” [3] However, solidifying the trope of senior executives reaping the gains without the pains, the CEO and Chairman of Circuit City received almost $10 million in various kinds of compensation for steering the company to its imperiled state. [4]

In under two years (i.e., by November 2008), Circuit City announced it was going out of business. After the company laid off its highest-paid hourly workers and replaced them with cheaper, less skilled workers, in-store customer service levels plummeted, which negatively impacted customer perception and sales.

Southwest Airlines Gets it Right

[Image: Waving flag of Southwest Airlines (editorial 3D rendering)]

Treating employees as mere cogs and judging employees by costs and not by the overall value they create is self-defeating.

Some companies don’t understand that making workers happy leads to elevated productivity and higher retention levels. High employee morale should be table stakes; instead, it is a strategic differentiator. Southwest Airlines has never had a layoff in its 47-plus years of existence. That’s laudable when you consider that airlines endured the fallout from 9/11 and the Great Recession (when oil prices spiked over $100 a barrel). As a well-deserved consequence, Southwest Airlines routinely leads domestic airlines in customer satisfaction.

Consider this example of how Southwest Airlines treated its recruiting team during the global financial crisis:

“At one point, however, Southwest Airlines was staring at a tough time financially and it did ‘corporate redeployment’. It had 82 employees in the recruiting team. When the company put [in] a hiring freeze, it also wondered what to do with 82 of its employees in this particular team. The company utilised them for customer service. The result: Customer satisfaction went up as a result of this team’s enhanced skill set. When the economy recovered, the team went back to its original job; only this time, they had an additional skill set, which helped the company and the customers alike.” [1]

If you were in the airline industry would you rather work for Southwest Airlines or another domestic competitor (that I mercifully will not name) which embodies layoffs, labor strife and toxic mismanagement of employees?

The Negative Impact of Layoffs

There is a time and place for layoffs. However, more often than not, companies lay off employees during down times in the business cycle simply to lessen the impact on profits, not to avoid a collapse of the business. Against their own best interests, companies also announce layoffs during times of rising profits, which causes their best people to head for greener pastures. Any expected cost savings are negated by lower productivity (when the best performers leave), lower innovation and a remaining demoralized workforce subjected to the negative effects of survivor syndrome (i.e., the feeling of guilt after seeing longtime co-workers discarded).

Additionally, companies are impacted by “Brand equity costs—damage to the company’s brand as an employer of choice.” [1] Sites like Glassdoor offer unfairly laid-off employees the opportunity to share their sense of betrayal online, which can significantly impact a company’s reputation.

Shortsighted management typically operates under the assumption that layoffs will positively impact shareholders. While financial analysts may cheer downsizing efforts, research indicates that layoffs have negative effects on share prices.

“A recent analysis of 41 studies covering 15,000 layoff announcements in more than a dozen countries over 31 years concluded that layoff announcements have an overall negative effect on stock-market prices. This remains true whatever the country, period of time or type of firm considered.”[1]

It should come as no surprise that Circuit City’s stock price fell 4% the day after the company pulled the plug on its most experienced employees. [5]

References:

[1] Employment Downsizing and its Alternatives. Retrieved from https://www.shrm.org/foundation/ourwork/initiatives/resources-from-past-initiatives/Documents/Employment%20Downsizing.pdf

[2] Circuit City plan: Bold strategy or black eye? NBC News. April 2, 2007. Retrieved from http://www.nbcnews.com/id/17857697/ns/business-careers/t/circuit-city-plan-bold-strategy-or-black-eye/

[3] Circuit City Cuts 3,400 ‘Overpaid’ Workers. Washington Post. March 29, 2007. Retrieved from http://www.washingtonpost.com/wp-dyn/content/article/2007/03/28/AR2007032802185.html

[4] Thousands Are Laid Off at Circuit City. What’s New? New York Times. April 2, 2007. Retrieved from https://www.nytimes.com/2007/04/02/business/media/02carr.html

[5] It’s the Workforce, Stupid! The New Yorker. April 30, 2007. Retrieved from https://www.newyorker.com/magazine/2007/04/30/its-the-workforce-stupid

Circuit City Image Copyright: nazdravie