Address Matching in Excel Using Levenshtein Distance

For my Data Analysts, in this video I will demonstrate how to perform a column comparison between two address fields so you don’t have to manually review every row. We’ll use a VBA function from Stack Overflow to provide the comparison results.

I should point out that Excel is NOT the preferred method for address matching, but sometimes it is your only option due to lack of time or better tools. Ideally, you should use address correction software that “fixes spelling errors, corrects abbreviations, and standardizes capitalization so each address in your list complies with the USPS official format” – (per the USPS). Once your addresses are standardized, THEN you should perform a comparison, but this rarely happens.

What typically happens is that some poor analyst like you is conscripted into performing address matching manually using some combination of SQL Server and manual Excel processes. That’s why a Google search led you to this page!

If you ever have to perform address matching in Excel, this could be you!

Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.

For example:

  • The string “HAT” as compared to “hat” would have a Levenshtein Distance of 3
    • Since the function is case sensitive all three characters are different
  • The string “HAT” as compared to “BAT” would have a Levenshtein Distance of 1
    • To turn the first string into the second string it would take 1 substitution of characters (H changed to B or vice-versa)
  • The lower the number, the more the strings are similar
  • The higher the number, the more the strings are dissimilar.

Activate the Developer Tab in Excel

The Developer tab is the place to go when you want to do or use the following:

  • Write macros
  • Run macros that you previously recorded
  • Create VBA Modules and User Defined Functions <– This is our sweet spot
  1. On the File tab, go to Options > Customize Ribbon
  2. Under Customize the Ribbon and under Main Tabs, select the Developer check box

Create a Module in Excel

  1. On the Developer tab select Visual Basic
  2. In the VBA interface select Insert > Module

Insert Levenshtein Distance Function VBA Code

  1. Go to this link at Stack Overflow to view the code as originally referenced
  2. Or, simply copy the code below as developed by user “smirkingman” which is the first answer.
    • Big shoutout to “smirkingman” for this great resource!
Option Explicit
Public Function Levenshtein(s1 As String, s2 As String)

Dim i As Integer
Dim j As Integer
Dim l1 As Integer
Dim l2 As Integer
Dim d() As Integer
Dim min1 As Integer
Dim min2 As Integer

l1 = Len(s1)
l2 = Len(s2)
ReDim d(l1, l2)
For i = 0 To l1
    d(i, 0) = i
Next
For j = 0 To l2
    d(0, j) = j
Next
For i = 1 To l1
    For j = 1 To l2
        If Mid(s1, i, 1) = Mid(s2, j, 1) Then
            d(i, j) = d(i - 1, j - 1)
        Else
            min1 = d(i - 1, j) + 1
            min2 = d(i, j - 1) + 1
            If min2 < min1 Then
                min1 = min2
            End If
            min2 = d(i - 1, j - 1) + 1
            If min2 < min1 Then
                min1 = min2
            End If
            d(i, j) = min1
        End If
    Next
Next
Levenshtein = d(l1, l2)
End Function
  1. Paste the code into your newly created Excel module
  2. Debug > Compile VBAProject

You should not experience any errors after compiling the code.

Watch the Video to Use the Function

Using this function in a judicious manner can help you cut down on the mental energy required to manually review the address columns on each row. It is much better to mentally focus on 25% of the rows than 100%. The fewer rows you have to manually review in Excel, the less the chance of you making an error.


Please like and subscribe on the Anthony B. Smoak YouTube channel.

All views and opinions are solely my own and do not necessarily reflect those my employer

Do Great Things with Your Data!



★☆★ Support this Channel: ★☆★

Merch ► shop.spreadshirt.com/AnthonySmoak

★☆★ FOLLOW ME BELOW: ★☆★

This image has an empty alt attribute; its file name is anthony-smoak-twitter.jpg

Twitter ► https://twitter.com/AnthonySmoak

Facebook ► https://www.facebook.com/AnthonyBSmoak/

Tableau Public ►Search for “Anthony B. Smoak”

Photo by Oladimeji Ajegbile from Pexels

Stacked Bar Chart with Dynamic Totals in Tableau

Are you looking for the next viz to showcase on your Tableau Public page? In this video I will teach you a technique that spices up the humble stacked bar chart with dynamic totals (using Tableau Set Actions). We will build out the viz step by step with Superstore data.

Stacked Bar Chart with Dynamic Totals

This chart is powered by Tableau Superstore data which is a data set that is readily available on the internet and is packaged as the default data set with Tableau. As you select the three legend categories at the top of the visual, the stacked bar chart sections will appear or disappear. The totals will also automatically update based upon your selection.

How cool is that!!??

I have to give a shoutout to Dorian Banutoiu for originating this technique. A few years ago, Dorian used this technique in a Makeover Monday exercise and it recently caught my attention when I was checking out his Tableau Public page. I immediately attempted to reverse engineer the technique (which admittedly took some effort).

Because I wanted to enable everyone with Tableau and/or Tableau Public to duplicate the chart, I used Tableau Superstore data as my foundation. Make sure to give Dorian a Twitter follow at @canonicalizedco.

What’s In it for You?

By following along with the video, you will utilize multiple Tableau elements such as:

Practice makes perfect so this will be a good opportunity for you to practice multiple Tableau elements with the creation of one visualization. You can click the links on the list above to see additional videos that cover respective areas.

Give Credit

If you do reproduce this visualization step by step or leverage the technique for your Tableau Public page or Linkedin Page, please link to this post or the YouTube video and place “Inspired by Dorian Banutoiu & Anthony Smoak” somewhere on the viz and post text.

Interact with the Finished Visualization

You can interact with the finished visualization on my Tableau Public page here:

https://tabsoft.co/3oNxq5Z


Please like and subscribe on the Anthony B. Smoak YouTube channel.

All views and opinions are solely my own and do not necessarily reflect those my employer

Do Great Things with Your Data!

★☆★ Support this Channel: ★☆★

Merch ► shop.spreadshirt.com/AnthonySmoak

★☆★ FOLLOW ME BELOW: ★☆★

Twitter ► https://twitter.com/AnthonySmoak

Facebook ► https://www.facebook.com/AnthonyBSmoak/

Tableau Public ►Search for “Anthony B. Smoak”

Amazing Tableau Dashboard for Inspiration

I worked hard to create a Tableau dashboard packed with multiple features that any beginner or intermediate user should know how to complete. Use this dashboard as an inspiration regarding techniques to learn for your next Tableau dashboard.

Here are a few of the features included in this dashboard:

  • Parameters
  • Dynamic Titles
  • KPIs
  • Filters
  • Context Filters
  • Top 5 by Dimension
  • Highlight Actions
  • Filter Actions
  • Ranking
  • Show/Hide Containers
  • Image Buttons
  • Parameter Driven Chart Swap
  • Maps
  • Shape Files
  • Reset All Filters
  • Combo Chart / Dual Axis Chart (Bar in Bar)
  • Quick Table Calculations
  • Bullet Chart
  • Animations
  • Containerized Dashboard Layout

Because I love to teach in my relatively spare time, I am considering offering 1 on 1 training to learn how to put together this sample dashboard. As I mention in the video, leave a comment with your thoughts on how much of an investment you think someone would make for 3 hours of 1 on 1 training to build this together. Someone would definitely impress their manager or future hiring manager if they had the knowledge to build this type of front end reporting.

Let’s learn together, contact me here for more information: https://anthonysmoak.com/contact/

Interact with dashboard here ► https://tabsoft.co/3nU4c4Y


Please like and subscribe on the Anthony B. Smoak YouTube channel.

Definitely pick up some merch if you’ve enjoyed this blog and YouTube channel over the years:

https://shop.spreadshirt.com/AnthonySmoak/

All views and opinions are solely my own and do not necessarily reflect those my employer

Do Great Things with Your Data!

Anthony B. Smoak, CBIP

How to Change the Text Delimiter in Excel

No more COMMA DRAMA! In this video I will show you how to change the default comma delimiter in Excel when making .csv (comma separated value) files. Sometimes text fields in your file have comments that you need to preserve in a flat file. This means you have to change the delimiter. Watch the video for a demonstration and keep reading for a quick reference.

I should note that as of the time of this post I am using Windows 10 Home.

Step 1

Search for “Set Regional Format”

Step 2

Select Additional date, time & regional settings

Step 3

Select Region

Step 4

Select Additional Settings

Then in the List Separator box, change the comma to a pipe.

When you save your .csv file in Excel, it will now be pipe delimited instead of comma delimited.

Your flat file will transition from this:

to this

Do Great Things with Your Data

Anthony B. Smoak

Please like and subscribe on the Anthony B. Smoak YouTube channel!
Definitely pick up some merch if you’ve enjoyed this blog and YouTube channel over the years.

All views and opinions are solely my own and do not necessarily reflect those my employer.

Tableau Dashboard Project #VIZNESSFIRST

In this video I kickstart the #VizznessFirst​ initiative where I describe a Tableau Dashboard I constructed using multiple resources from the Tableau #datafam​ community. Invest your valuable time in watching the videos as it will be worth the investment if you’re looking to pick up some new skills.

I describe the dashboard, show you the resources and then you attempt to build it. This is a perfect intermediate/advanced dashboard idea starter for students looking to improve their Tableau skills.

Background

Let’s start with a little background on how this initiative came together. I am a big fan of the Real World Fake Data (i.e., RWFD) project run by Mark Bradbourne at Tableau. I wanted to finally get involved and put together a dashboard and hopefully learn something new.

Because the particular data set used (Week 5: Help Desk) had very few measures to sum or aggregate; this left counting rows as the most informative means to squeeze some meaning from the data.

I stumbled upon an excellent blog post from Tableau Zen Master Lindsay Betzendahl where she explained a technique to highlight when a “filter” has been activated with a small indicator. I reversed engineered her dashboard to try and tease out how to pull this off with the RWFD data set.

In addition to this technique, I mashed up some other techniques from various members of the Tableau community. One of the great things that I enjoy about the Tableau community is that there are many creators who put out great content for others to learn from; whether it be videos, blog posts, PowerPoint slides, etc.

Once I had a dashboard I was proud of sharing. I figured I would release it as a series to try and teach others some of these techniques. The 3 videos in this series is the culmination of that effort.

VIDEO 1: Overview of the Project and Resources Required

Here are the references I used to put together the dashboard.

VIDEO 2: STEP BY STEP INSTRUCTIONS

Make sure to watch Video 2 above because this is where I explain the main technique required to complete the dashboard.

VIDEO 3: STEP BY STEP INSTRUCTIONS (TABLE TRICKS)

Video three rounds out the series by demonstrating a cool trick that enables you to build a filter button that opens a detailed table dashboard populated with only the records of interest from the first page. Yes, it uses a filter action but there is twist. Make sure to check it out.

SHARE WHAT YOU CREATE

  • POST YOUR FINISHED DASHBOARD TO TABLEAU PUBLIC:
    • DASHBOARD BUILT USING TABLEAU DESKTOP 2020.4.1
  • SHARE YOUR LINK IN THE COMMENTS ON THE YOUTUBE CHANNEL:
    • 10K+ SUBSCRIBERS CAN VIEW YOUR ACCOMPLISHMENT
  • YOU CAN ALSO POST TO:
    • TWITTER
    • LINKEDIN
  • MAKE SURE TO INCLUDE:


If you enjoyed this tip, please like and subscribe on the Anthony B. Smoak YouTube channel.

Definitely pick up some merch if you’ve enjoyed this blog and YouTube channel over the years:

https://shop.spreadshirt.com/AnthonySmoak/

All views and opinions are solely my own and do not necessarily reflect those my employer

Do Great Things with Your Data!

Anthony B. Smoak, CBIP

Show and Hide Containers in Tableau

In this video I’ll show you how to show and hide containers in Tableau at the push of a button. This makes for a convenient way to increase space for your dashboard while hiding your filters or switching to an additional hidden chart until needed.

If you’re not using at least Tableau 2019.2.0, then you need to run over to your I.T. department and have them set you up. In previous versions of Tableau you could achieve this effect, but you would have to implement a hacky methodology in order to pull it off. Although I love a good hack, we should all strive to work smarter not harder.

The key to pulling off the show/hide container is to add a floating horizontal or floating vertical container to your dashboard. Only once you’ve taken this step can you see the option to “Add Show/Hide Button”.

Once you’ve selected this option, any new sheets, filters or other objects you wish to place in your container are enabled to appear or disappear at the press of a button.

An “X” marks the spot as this default customizable icon will appear. You can replace this image with text or use your own customizable image in its place.

As a reminder, (from the Tableau Knowledge Base) these options “will not be available if the sheet is not on a horizontal or vertical container and that container is not floating.”

In lieu of the default show/hide icons, in the video we will use buttons from a template provided by Kevin Flerlage. Do yourself a favor and head on over to the Flerlage Twins blog and download this handy resource.

Make sure to give your filters and charts the “Personal Space” they need! Rick and Morty aficionados know exactly what I’m talking about.

For the Power BI curious, here is how a similar process is conducted, where the filters (ahem) slicers are hidden at the touch of a button.


If you enjoyed this tip, please like and subscribe on the Anthony B. Smoak YouTube channel.

All views and opinions are solely my own and do not necessarily reflect those my employer

Do Great Things with Your Data!

Anthony B. Smoak, CBIP

In all seriousness, the world lost an icon of baseball and civil rights as of the recording of this video; Mr. Hank Aaron. I live in Atlanta so I have to pay my respects with a shout out to Hammering Hank. Rest in Peace #44.

Drill from Region to State Using Parameter Actions in Tableau

When the data goes high, you can go low; to misquote a common saying. In this video I’ll show you how to start at a region level on your Tableau map and then drill into the State.

If you’re using the Tableau Superstore data set, make sure the Region and State fields are assigned to a geographic role. Most likely you will need to change the Region to a geographic role, which is created from the State field.

At a high level we’ll have a dual axis based upon the latitude, with the top latitude displaying the regions and the bottom latitude displaying the state. When we layer them on top of each other, we begin to create the illusion of the drill.

We’ll use a parameter creatively named [Region Parameter] which contains all of the regions. From there we’ll create a calculated field named [_States to show] as follows:

If [Region]=[Region Parameter]
Then [State]
END

In order to institute the drill, we’ll create a worksheet parameter action that will change the value of the region parameter on user selection. This causes the clause (If [Region]=[Region Parameter]) to evaluate to TRUE which then causes the display to show the states for the selected region.

It sounds more complicated than it is, so just make sure to watch the video for understanding and clarity.

As a bonus, I’ll show you how to achieve this effect where the selected region does not cause the other regions to gray out. Notice on the second map how all the non selected regions do not lose emphasis; this is not the default effect. It’s the little “show-off” details like this that can up your Tableau game. You’re welcome!

You can thank me by watching, liking and subscribing:

All views and opinions are solely my own and do NOT necessarily reflect those my employer.

Do Great Things with Your Data

Anthony B. Smoak

How to Filter your Tableau Viz in Tooltip (Top 10 Values)

Visualizations in tooltips, affectionately know as “Viz in tooltip” is a handy feature available in Tableau that enables “details on demand” functionality. As the user hovers over a specific mark or data point, additional details are revealed that are filtered specifically for that mark from another worksheet.

In the example above, as the user hovers over a bar, they obtain additional details about the three most profitable products associated with the respective bar.

As I learned in a very informative Tableau presentation for tooltip wonks (myself included), the underlying architecture is built upon action commands and shares many commonalities with action filters. For viz in tooltip performance considerations, use smaller and fewer visualizations. Also try to avoid maps and other complex visualizations that have significant mark density.

If your tooltip responsiveness is greater than 2 seconds or the height and or width is greater than 600 pixels, then consider rethinking your approach. According to Tableau best practice, users are not willing to wait more than 2 seconds hovering over a mark for a reveal.

Since the viz in tool tip passes filters between worksheets, this means we can make use of context filters (click this link for a fantastic overview) to limit the number of marks returned and help improve performance.

This is the Section You are Here for

Context filters also help solve the problem of returning the Top N records associated with a mark. When you assign a viz in tooltip on your source sheet, a set filter is applied on the target (i.e., viz in tooltip) worksheet. If you’re a frequent watcher of my videos you know that the Tableau order of operations prevents the default set filter from returning a proper Top N.

By adding the set filter to the context on the Order of Operations skyscraper, the data is pre-filtered by your dimension first (e.g., State) and then the Top N filter is applied. When the set filter turns gray, you know it’s working.

Notice that the Context Filter box is above the Sets entry; which means that the Context filter is evaluated BEFORE the set. Make sure to watch the video to learn how to limit to the Top 10 cities based upon a hovered state.

Check out the video for details and may all your viz in tooltips be context appropriate!

All views and opinions are solely my own and do NOT necessarily reflect those my employer.

Do Great Things With Your Data

-Anthony B. Smoak

Create Tableau KPIs Quick and Easy (Profit vs Budget)

In this video you will learn a quick and easy approach to putting together KPIs that illustrate how profits perform in relation to a budget (i.e., targets vs actuals). Of course the KPIs will be interactive thanks to help from parameters.

Now the dashboard that I built around the KPIs will definitely form the basis of additional videos. There are several techniques here (filtering viz in tooltips, show/hide container, bar in bar chart, reset all filters button, etc.) that form the basis of a good intermediate level dashboard.

Inspiration comes in many forms. I have to give a shoutout to Keith Dykstra for offering his original dashboard for reverse engineering. The idea for the KPIs and bar in bar chart are inspired by Keith. I added additional elements such as the reset all filters button, filtering by states via treemap, parameters in lieu of hover over images.

Shoutout to the Kevin Flerlage for his great PowerPoint button workbook. I modified the on/off toggle button based upon a template from Kevin’s workbook.

Finally, I was watching an Oregon vs UCLA football game one Saturday and was impressed by Oregon’s latest uniform combination. I had to throw that grey and green combination together for use on the dashboard. Inspiration can come from many places! Here’s a little Oregon football and my “Saturday Night Lab” tweet.

Make sure you watch this video to learn and hopefully get inspired yourself!

If you’re interested in KPIs you can check out these other videos:

All views and opinions are solely my own and do NOT necessarily reflect those my employer.

Understanding Tableau Context Filters

Context filters in Tableau are a big mystery right? In this video I will demonstrate two examples that will help shed some light on how context filters in Tableau work.

When I was starting to learn Tableau, I had no idea why I would ever need to add a filter to the context. It just didn’t make sense for me, because most of the time, the filter still operated in a manner I expected. Little did I know, I was moving a filter on up, to a deluxe apartment in the sky!

That is because I did not understand Tableau’s order of operations. This understanding is key. The higher the filter is on the official order of operations skyscraper, the more influence it has on all the other filters below it.

When two dimension filters are applied to a visual, they are working independently with access to all rows in the data source without regard to other filters. Eventually the filters decide what values they have in common (i.e., the intersection), and those values are shown on a visual.

But if you change your dimension filter to a context filter, you’ve ensured that any other filters that you set are defined as dependent filters because they process only the data that passes through the context filter.

This concept will be very handy when trying to compute the Top N values. Notice on the order of operations that Top N is located below context filters. That means that the Top N filter will only receive values that have been “pre-filtered” from the context filter.

What’s in it for You?

Per Tableau, use context filters to:

  • Improve performance – If you set a lot of filters or have a large data source, the queries can be slow. You can set one or more context filters to improve performance.
  • Create a dependent numerical or Top N filter – You can set a context filter to include only the data of interest, and then set a numerical or a top N filter.

To improve performance of context filters, especially on large data sources, follow these general rules.

  • Using a single context filter that significantly reduces the size of the data set is much better than applying many context filters.
  • In fact, if a filter does not reduce the size of the data set by one-tenth or more, it is actually worse to add it to the context because of the performance cost of computing the context.

Still scratching your head? It will all make sense after the video examples. Give it a watch!

Of course, check out the official documentation from Tableau where I sourced these tips.

All views and opinions are solely my own and do NOT necessarily reflect those my employer.

Do Great Things With Your Data

-Anthony B. Smoak

Photo by Andrea Piacquadio from Pexels