5 Ways To Remove Duplicates With VBA

Intro

Removing duplicates in a dataset is a crucial task for data analysts, and VBA (Visual Basic for Applications) provides a powerful toolset to accomplish this task efficiently. In this article, we will explore five different methods to remove duplicates using VBA, highlighting the strengths and weaknesses of each approach.

Why Remove Duplicates?

Duplicates can lead to inaccurate analysis, skewed results, and a host of other issues. By removing duplicates, you can ensure that your data is clean, consistent, and reliable. This is particularly important in applications such as data visualization, reporting, and data-driven decision-making.

Method 1: Using the Remove Duplicates Feature

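Excel's built-in Remove Duplicates feature is exposed to VBA as the Range.RemoveDuplicates method. This method is quick and easy to implement: the Columns argument lists which columns of the range must match for a row to count as a duplicate, and the first occurrence of each row is kept.

Sub RemoveDuplicatesMethod1()
    ' Compare columns 1 and 2 of the range on the active sheet.
    ' Header:=xlNo (the default) treats row 1 as data, not as a header row.
    Range("A1:B100").RemoveDuplicates Columns:=Array(1, 2), Header:=xlNo
End Sub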
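
The range in the macro above is hard-coded. As a minimal sketch, the variant below sizes the range from the data and treats the first row as headers. The procedure name is illustrative, and the code assumes that column A is filled for every record so that its last used cell marks the end of the data.

```vba
Sub RemoveDuplicatesDynamicRange()
    Dim ws As Worksheet
    Dim lastRow As Long

    Set ws = ActiveSheet

    ' Last used row in column A (assumption: column A has a value in every record)
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    ' Nothing to do if there is only a header row
    If lastRow < 2 Then Exit Sub

    ' Header:=xlYes keeps row 1 out of the comparison; a row is removed
    ' only when both column 1 and column 2 match an earlier row
    ws.Range("A1:B" & lastRow).RemoveDuplicates Columns:=Array(1, 2), Header:=xlYes
End Sub
```

Run it with the sheet that holds the data active; the remaining rows are shifted up to close the gaps.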

Method 2: Using a Loop to Delete Duplicates

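This method loops through the dataset from the bottom up and deletes every row that matches a row above it. Because each row is compared with all rows above it, the run time grows quadratically with the number of rows, so it is not the most efficient approach, but the logic is easy to follow.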

Sub RemoveDuplicatesMethod2()
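    Dim lastRow As Long
    Dim i As Long
    Dim j As Long
    Dim duplicate As Boolean
    
    lastRow = Cells(Rows.Count, "A").End(xlUp).Row
    
    ' Work from the bottom up so deleting a row does not shift rows still to be checked
    For i = lastRow To 2 Step -1
        duplicate = False
        ' Look for an identical row above the current one
        For j = i - 1 To 1 Step -1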
            If Cells(i, "A") = Cells(j, "A") And Cells(i, "B") = Cells(j, "B") Then
                duplicate = True
                Exit For
            End If
        Next j
        
        If duplicate Then
            Rows(i).Delete
        End If
    Next i
End Sub
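
Deleting rows one at a time inside the loop is usually the slowest part of this approach. One possible refinement, sketched below under the same assumptions (data in columns A and B of the active sheet), collects the duplicate rows with Union and deletes them in a single call. The procedure name and the toDelete variable are illustrative.

```vba
Sub RemoveDuplicatesBatchDelete()
    Dim lastRow As Long
    Dim i As Long
    Dim j As Long
    Dim toDelete As Range

    lastRow = Cells(Rows.Count, "A").End(xlUp).Row

    ' Nothing is deleted inside the loop, so iterating top-down is safe here
    For i = 2 To lastRow
        For j = 1 To i - 1
            If Cells(i, "A").Value = Cells(j, "A").Value And _
               Cells(i, "B").Value = Cells(j, "B").Value Then
                ' Remember the duplicate row instead of deleting it right away
                If toDelete Is Nothing Then
                    Set toDelete = Rows(i)
                Else
                    Set toDelete = Union(toDelete, Rows(i))
                End If
                Exit For
            End If
        Next j
    Next i

    ' One delete call for all collected rows
    If Not toDelete Is Nothing Then toDelete.Delete
End Sub
```

The comparison is still quadratic; only the number of delete operations is reduced.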

Method 3: Using an Array to Store Unique Values

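This method reads the data into an array, copies each row that has not been seen before into a second array, and then writes that array of unique rows back to the worksheet. Comparing the two columns separately, rather than as one concatenated string, avoids false matches such as "ab" + "c" versus "a" + "bc", and writing the rows back contiguously leaves no blank rows behind.

Sub RemoveDuplicatesMethod3()
    Dim data As Variant
    Dim uniqueValues() As Variant
    Dim i As Long
    Dim j As Long
    Dim lastRow As Long
    Dim uniqueCount As Long
    Dim isDuplicate As Boolean
    
    lastRow = Cells(Rows.Count, "A").End(xlUp).Row
    If lastRow < 2 Then Exit Sub    ' a single row cannot contain duplicates
    
    ' Read both columns into a 2-D array in one operation
    data = Range("A1:B" & lastRow).Value
    ReDim uniqueValues(1 To lastRow, 1 To 2)
    
    For i = 1 To lastRow
        isDuplicate = False
        ' Compare the current row with the rows already kept
        For j = 1 To uniqueCount
            If data(i, 1) = uniqueValues(j, 1) And data(i, 2) = uniqueValues(j, 2) Then
                isDuplicate = True
                Exit For
            End If
        Next j
        
        If Not isDuplicate Then
            uniqueCount = uniqueCount + 1
            uniqueValues(uniqueCount, 1) = data(i, 1)
            uniqueValues(uniqueCount, 2) = data(i, 2)
        End If
    Next i
    
    ' Replace the original data with the unique rows, packed from row 1 down
    Range("A1:B" & lastRow).ClearContents
    Range("A1").Resize(uniqueCount, 2).Value = uniqueValues
End Sub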

Method 4: Using a Dictionary to Store Unique Values

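This method uses a Scripting.Dictionary (available on Windows) to store one entry per unique row and then writes the entries back to the worksheet. The key joins the two cell values with a tab character so that different rows cannot produce the same key, and the original values are stored as the item so they can be written back unchanged.

Sub RemoveDuplicatesMethod4()
    Dim dict As Object
    Dim key As Variant
    Dim pair As Variant
    Dim rowKey As String
    Dim i As Long
    Dim lastRow As Long
    
    lastRow = Cells(Rows.Count, "A").End(xlUp).Row
    
    Set dict = CreateObject("Scripting.Dictionary")
    
    For i = 1 To lastRow
        ' Tab-delimited key; assumes the cell values themselves contain no tab
        rowKey = Cells(i, "A").Value & vbTab & Cells(i, "B").Value
        ' Keep the values of the first occurrence only
        If Not dict.Exists(rowKey) Then
            dict.Add rowKey, Array(Cells(i, "A").Value, Cells(i, "B").Value)
        End If
    Next i
    
    Range("A1:B" & lastRow).ClearContents
    
    ' Write the unique rows back, packed from row 1 down
    i = 1
    For Each key In dict.Keys
        pair = dict(key)
        Cells(i, "A").Value = pair(0)
        Cells(i, "B").Value = pair(1)
        i = i + 1
    Next key
End Sub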
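
By default a Scripting.Dictionary compares string keys case-sensitively, so "Ann" and "ANN" count as different values. Where that is not wanted, the dictionary's CompareMode property can be set to vbTextCompare before any key is added. The following is a sketch under stated assumptions rather than a drop-in replacement: the procedure name is illustrative, Chr(31) is an assumed delimiter that does not occur in the data, and the cells are assumed to contain no error values such as #N/A.

```vba
Sub RemoveDuplicatesIgnoreCase()
    Dim dict As Object
    Dim data As Variant
    Dim result() As Variant
    Dim rowKey As String
    Dim i As Long
    Dim outCount As Long
    Dim lastRow As Long

    lastRow = Cells(Rows.Count, "A").End(xlUp).Row
    If lastRow < 2 Then Exit Sub

    Set dict = CreateObject("Scripting.Dictionary")
    ' CompareMode can only be changed while the dictionary is still empty
    dict.CompareMode = vbTextCompare

    ' Read the data once, then work in memory
    data = Range("A1:B" & lastRow).Value
    ReDim result(1 To lastRow, 1 To 2)

    For i = 1 To lastRow
        ' Chr(31) is an assumed delimiter that should not appear in cell text
        rowKey = CStr(data(i, 1)) & Chr(31) & CStr(data(i, 2))
        If Not dict.Exists(rowKey) Then
            dict.Add rowKey, True
            outCount = outCount + 1
            result(outCount, 1) = data(i, 1)
            result(outCount, 2) = data(i, 2)
        End If
    Next i

    ' Write the first occurrence of each row back in one operation
    Range("A1:B" & lastRow).ClearContents
    Range("A1").Resize(outCount, 2).Value = result
End Sub
```

The first spelling encountered is the one that is kept, so a sheet containing "Ann" followed by "ANN" retains "Ann".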

Method 5: Using Power Query

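Power Query is a powerful tool in Excel that allows you to manipulate and transform data. A query is written in the M language, which is case-sensitive (let and in are lowercase), and VBA can register one through the Workbook.Queries collection. The macro below assumes Excel 2016 or later, an existing table named Table1 with columns headed Column1 and Column2, and no existing query named DedupedTable1. It loads the de-duplicated result onto a new worksheet and leaves the source table untouched.

Sub RemoveDuplicatesMethod5()
    Dim mCode As String
    Dim ws As Worksheet
    
    ' M code: drop rows with a null in either column, then keep distinct rows
    mCode = "let Source = Excel.CurrentWorkbook(){[Name=""Table1""]}[Content], " & _
            "Filtered = Table.SelectRows(Source, each [Column1] <> null and [Column2] <> null), " & _
            "Deduped = Table.Distinct(Filtered, {""Column1"", ""Column2""}) in Deduped"
    ActiveWorkbook.Queries.Add Name:="DedupedTable1", Formula:=mCode
    
    ' Load the query result into a table on a new sheet via the Mashup OLE DB provider
    Set ws = ActiveWorkbook.Worksheets.Add
    With ws.ListObjects.Add(SourceType:=xlSrcExternal, _
            Source:="OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=DedupedTable1;Extended Properties=""""", _
            Destination:=ws.Range("A1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [DedupedTable1]")
        .Refresh BackgroundQuery:=False
    End With
End Sub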

Conclusion

Removing duplicates is an essential task in data analysis, and VBA provides a range of methods to accomplish this task. Each method has its strengths and weaknesses, and the choice of method depends on the specific requirements of the project. By understanding the different methods available, you can choose the most efficient and effective approach to remove duplicates and ensure that your data is clean and reliable.

Jonny Richards
