How to Eliminate Duplicates Using Queries in Google Sheets

How to Eliminate Duplicates Using Queries in Google Sheets
Reading Time: 3 minutes

Simple Guide to Eliminate Duplicates in Google Sheets

Learn how to use the Google Sheets Query formula to remove duplicates from the dataset.

Background

Let’s say you are working on a dataset where there is a possibility of multiple duplicate entries. You want an output where all the duplicate entries are removed. You can perform this operation by simply using the Google Sheets Query function. Use queries to filter out selected data. The output will automatically remove duplicate entries.

Image Courtesy: Trendblog.net

As you can see in the image above, the duplicate entries are removed in the output. Duplicates can be removed also by using the UNIQUE command. Using the GROUP BY clause in the Google Sheets Queries function, you can remove duplicates. This article covers a detailed explanation for both these methods to remove duplicates.

Eliminate Duplicates in Google Sheets Using Queries

Setting up the data range to find and remove duplicate values

1. Selecting the data range

  • We will use a dataset for querying. For ease of querying, we will name the range so we do not need to select the data range every time.
  • Press Ctrl + A (Windows) or Cmd+A (Mac) to select the entire data range.
  • Go to Data
  • Select Named Ranges
Figure 1: Selecting dataset for creating a named range
Figure 1: Selecting dataset for creating a named range | Eliminate Duplicates in Google Sheets

2. Naming the data range

  • Give a suitable name to your data range. Here I am naming it “Dataset”.
  • Click on Done to complete the process of naming.
Figure 2: Going to Named Ranges to name our data range
Figure 2: Going to Named Ranges to name our data range

Eliminate Duplicates in Google Sheets using Query function

  • You can remove duplicate entries of a dataset using the GROUP BY clause of the Query function, as shown below :

=QUERY(A1:D30,”Select A,B,SUM(C) Group By A,B label SUM(C) ‘total sales’ “)

Figure 3: Query to remove duplicates using GROUP BY clause

  • The query will remove duplicate entries by grouping the dataset on columns A and B.
  • The output will be columns A, B, and the sum of column C values, grouped by columns A and B, as shown below :
 Duplicate entries from the dataset removed in the output, with only one row for all the duplicate rows
Figure 4: Duplicate entries from the dataset removed in the output, with only one row for all the duplicate rows

Using the UNIQUE command to Eliminate Duplicates in Google Sheets from Data Range

  • You can remove duplicates using the UNIQUE command, without grouping the dataset. Use :

=UNIQUE(Dataset)

Figure 5: Command to remove duplicates from a dataset

  • This will keep UNIQUE rows of the dataset, without grouping them.
  • The duplicate column values will be added, similar to what we did in the GROUP BY clause by using the SUM function in the query.
  • Also, this will NOT change the order of the rows, unlike GROUP BY, where the unique rows were sorted in ascending order, as shown below :
Example of how to eliminate duplicates in Google Sheets
Figure 6: UNIQUE command works similar to ORDER BY clause of QUERY function, without changing the order of occurrence

Conclusion

The query function can be used to eliminate duplicates in Google Sheets. You can also remove duplicates using the UNIQUE function. The query function automatically removes duplicate entries in the output. If you want to know more about the Google Sheets Query function, you can go through the documentation here.

See Also

You can check out other equally good articles on Google Sheets here.

Delete Empty rows in Google Sheets: In this article we would see how to delete empty rows in Google Sheets using two different ways.

How to group rows in Google Sheets: We will learn how to group rows in Google Sheets. We will also have a look at nested grouping.