IMPORTHTML in Google Sheets| Pull data from any Website

IMPORTHTML in Google Sheets| Pull data from any Website
Reading Time: 3 minutes
IMPORTHTML in Google Sheets
IMPORTHTML in Google Sheets

We will have a look at IMPORTHTML in Google Sheets using which, we can pull data for tables and lists from a given webpage.

What is IMPORTHTML in Google Sheets?

IMPORTHTML in Google Sheets is a function that allows you to fetch data from a table or a list in a web page. Let’s say you need to fetch data from a table on Wikipedia. This table contains the population of countries. You can try to copy and paste the data, but this will have two disadvantages. Firstly, the format for the data can be inconsistent and might not format correctly if you try to directly copy and paste it. Secondly, you’ll have to manually update the data for all the entries. This can be very repetitive and tiring.

That is where you can use IMPORTHTML, it allows you to fetch data from a webpage on a real-time basis. In the next section, we will discuss the syntax and parameters for IMPORTHTML in Google Sheets.

Syntax and Parameters for IMPORTHTML in Google Sheets

The syntax for IMPORTHTML is as follows:-

=IMPORTHTML(url, query, index)

  • url: The URL of the given website. Please note, you also have to include the protocol (eg, “http://”) inside the URL. The value must be enclosed inside quotation marks or reference to a cell
  • query: Can have two values, “table” or “list”, depending on the usage.
  • index: The index that decides which table or list should be returned from the page. The index value starts at 1. Also note that indexing is separate for tables and lists.

Basic Usage and Examples

Sample usage for the function might look like this.

=IMPORTHTML("https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)","table",3)

Let’s have a look at the targeted table. We have chosen a sample Wikipedia page. The table on the webpage looks something like this.

Sample data from Wikipedia
Figure 1: Sample Table from A Wikipedia Page

If we were to import this table into excel, it would look something like this.

Imported table in Google Sheets
Figure 2: Imported Table in Sheets

As you can see, it has been imported to your google sheets. You can also do the same thing with lists.

Get an index number for a given table or a list

If you are having trouble finding the index number for the table number you want, you can do the following steps.

  • Open the browser’s developer console. You can open it by clicking F12 on windows. If you are on Mac you can open it by Cmd+Opt+J(on Chrome) or Cmd+Opt+C (on Safari).
Console window in browser
Figure 3: Console in Browser
  • You can then use the following code in the console to get the list for all tables on a page.

var index = 1; [].forEach.call(document.getElementsByTagName(“table”), function(elements) { console.log(“Index: ” + index++, elements); });

  • It would then return a list of tables, you can hover over each one of them and see the corresponding table in the webpage getting highlighted. For example, in the given figure, the table we wanted got highlighted when we hovered on the 3rd index.
Finding index for a given table
Figure 4: Finding Index for a Given Table
  • If you want to search for indices for a list, you can enter this code instead.

var index = 1; [].forEach.call(document.querySelectorAll(“ul,ol”), function(elements) { console.log(“Index: ” + index++, elements); });

This way you can find out the index for the desired table or list.

Conclusion

We saw how to import a table or a list from a Wikipedia page using IMPORTHTML in Google Sheets. We also learned how to find out the index number for tables and lists. You can check the official documentation for IMPORTHTML here.

See also

Want to know more formulas and functions in Google Sheets? Look at our definitive guide on Google Sheets which covers hundreds of such topics here. Enjoy reading!

Google Sheets: Using Vlookup and Hlookup Together: Learn how to use Vlookup and Hlookup together

Google Sheets: Mean, Median and Mode: Calculating Mean, Median and Mode for a set of numbers