Simple Guide to Eliminate Duplicates in Google Sheets
Learn how to use the Google Sheets Query formula to remove duplicates from the dataset.
Let’s say you are working on a dataset where there is a possibility of multiple duplicate entries. You want an output where all the duplicate entries are removed. You can perform this operation by simply using the Google Sheets Query function. Use queries to filter out selected data. The output will automatically remove duplicate entries.
As you can see in the image above, the duplicate entries are removed in the output. Duplicates can be removed also by using the UNIQUE command. Using the GROUP BY clause in the Google Sheets Queries function, you can remove duplicates. This article covers a detailed explanation for both these methods to remove duplicates.
Eliminate Duplicates in Google Sheets Using Queries
Setting up the data range to find and remove duplicate values
1. Selecting the data range
- We will use a dataset for querying. For ease of querying, we will name the range so we do not need to select the data range every time.
- Press Ctrl + A (Windows) or Cmd+A (Mac) to select the entire data range.
- Go to Data
- Select Named Ranges
2. Naming the data range
- Give a suitable name to your data range. Here I am naming it “Dataset”.
- Click on Done to complete the process of naming.
Eliminate Duplicates in Google Sheets using Query function
- You can remove duplicate entries of a dataset using the GROUP BY clause of the Query function, as shown below :
=QUERY(A1:D30,”Select A,B,SUM(C) Group By A,B label SUM(C) ‘total sales’ “)
Figure 3: Query to remove duplicates using GROUP BY clause
- The query will remove duplicate entries by grouping the dataset on columns A and B.
- The output will be columns A, B, and the sum of column C values, grouped by columns A and B, as shown below :
Using the UNIQUE command to Eliminate Duplicates in Google Sheets from Data Range
- You can remove duplicates using the UNIQUE command, without grouping the dataset. Use :
Figure 5: Command to remove duplicates from a dataset
- This will keep UNIQUE rows of the dataset, without grouping them.
- The duplicate column values will be added, similar to what we did in the GROUP BY clause by using the SUM function in the query.
- Also, this will NOT change the order of the rows, unlike GROUP BY, where the unique rows were sorted in ascending order, as shown below :
The query function can be used to eliminate duplicates in Google Sheets. You can also remove duplicates using the UNIQUE function. The query function automatically removes duplicate entries in the output. If you want to know more about the Google Sheets Query function, you can go through the documentation here.
You can check out other equally good articles on Google Sheets here.
Delete Empty rows in Google Sheets: In this article we would see how to delete empty rows in Google Sheets using two different ways.
How to group rows in Google Sheets: We will learn how to group rows in Google Sheets. We will also have a look at nested grouping.