Address Matching in Excel Using Levenshtein Distance

For my Data Analysts, in this video I will demonstrate how to perform a column comparison between two address fields so you don’t have to manually review every row. We’ll use a VBA function from Stack Overflow to provide the comparison results.

I should point out that Excel is NOT the preferred method for address matching, but sometimes it is your only option due to lack of time or better tools. Ideally, you should use address correction software that “fixes spelling errors, corrects abbreviations, and standardizes capitalization so each address in your list complies with the USPS official format” – (per the USPS). Once your addresses are standardized, THEN you should perform a comparison, but this rarely happens.

What typically happens is that some poor analyst like you is conscripted into performing address matching manually using some combination of SQL Server and manual Excel processes. That’s why a Google search led you to this page!

If you ever have to perform address matching in Excel, this could be you!

Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.

For example:

  • The string “HAT” as compared to “hat” would have a Levenshtein Distance of 3
    • Since the function is case sensitive all three characters are different
  • The string “HAT” as compared to “BAT” would have a Levenshtein Distance of 1
    • To turn the first string into the second string it would take 1 substitution of characters (H changed to B or vice-versa)
  • The lower the number, the more the strings are similar
  • The higher the number, the more the strings are dissimilar.

Activate the Developer Tab in Excel

The Developer tab is the place to go when you want to do or use the following:

  • Write macros
  • Run macros that you previously recorded
  • Create VBA Modules and User Defined Functions <– This is our sweet spot
  1. On the File tab, go to Options > Customize Ribbon
  2. Under Customize the Ribbon and under Main Tabs, select the Developer check box

Create a Module in Excel

  1. On the Developer tab select Visual Basic
  2. In the VBA interface select Insert > Module

Insert Levenshtein Distance Function VBA Code

  1. Go to this link at Stack Overflow to view the code as originally referenced
  2. Or, simply copy the code below as developed by user “smirkingman” which is the first answer.
    • Big shoutout to “smirkingman” for this great resource!
Option Explicit
Public Function Levenshtein(s1 As String, s2 As String)

Dim i As Integer
Dim j As Integer
Dim l1 As Integer
Dim l2 As Integer
Dim d() As Integer
Dim min1 As Integer
Dim min2 As Integer

l1 = Len(s1)
l2 = Len(s2)
ReDim d(l1, l2)
For i = 0 To l1
    d(i, 0) = i
For j = 0 To l2
    d(0, j) = j
For i = 1 To l1
    For j = 1 To l2
        If Mid(s1, i, 1) = Mid(s2, j, 1) Then
            d(i, j) = d(i - 1, j - 1)
            min1 = d(i - 1, j) + 1
            min2 = d(i, j - 1) + 1
            If min2 < min1 Then
                min1 = min2
            End If
            min2 = d(i - 1, j - 1) + 1
            If min2 < min1 Then
                min1 = min2
            End If
            d(i, j) = min1
        End If
Levenshtein = d(l1, l2)
End Function
  1. Paste the code into your newly created Excel module
  2. Debug > Compile VBAProject

You should not experience any errors after compiling the code.

Watch the Video to Use the Function

Using this function in a judicious manner can help you cut down on the mental energy required to manually review the address columns on each row. It is much better to mentally focus on 25% of the rows than 100%. The fewer rows you have to manually review in Excel, the less the chance of you making an error.

Please like and subscribe on the Anthony B. Smoak YouTube channel.

All views and opinions are solely my own and do not necessarily reflect those of my employer

Do Great Things with Your Data!

★☆★ Support this Channel: ★☆★

Merch ►


This image has an empty alt attribute; its file name is anthony-smoak-twitter.jpg

Twitter ►

Facebook ►

Tableau Public ►Search for “Anthony B. Smoak”

Photo by Oladimeji Ajegbile from Pexels


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s