In this tutorial, we will learn how to create a box plot in Google Sheets. But before we get into the nuts and bolts of creating a box plot in Google Sheets, it’d be helpful to learn a few essential things about box plot.
What is a box plot?
A box plot is a method to graphically represent the distribution of data based on a five-number dataset: the minimum value, the lower quartile (first quartile), the median, the upper quartile (thir d quartile), and the maximum value. It gives information about outliers in the data and what their values are. It also gives information about symmetry in the dataset, their spread and skewness and how tightly they are grouped.
- The minimum and the maximum are the lowest and the highest data points, respectively, excluding the outliers.
- The median is the middle value in the data set.
- And the first and the third quartiles represent the median of the lower half and the upper half of the data set, respectively.
- Outliers are values that lie beyond the minimum and the maximum values.
Box plots are an excellent tool for conveying location and variation information in data sets, particularly for detecting and illustrating location and variation changes between different groups of data. The box plot can also be used to analyse stock market performance and visualise a summary of returns during a specific period.
How to create a box plot in Google Sheets
Creating a box plot in Google Sheets is not very different from how you’d create a box plot in other spreadsheets, except that in Google Sheets we use a Candlestick chart instead, since Google Sheets does not support box plot natively.
Here’s a step-by-step guide to create a box plot in Google Sheets:
In this tutorial, let us box plot the monthly mean temperature of Delhi for a typical year and analyse the data.
Make a copy of this Google spreadsheet if you want to follow along and practice.
Step 1: Compute the five-number summary for the box plot
In order to create a box plot, we need to compute the following values and place them in the order as given below:
- Minimum value
- Lower quartile
- Upper quartile
- Maximum value
For this, we make headers titled Minimum Value, Lower Quartile, Median, Upper Quartile, and Maximum Value.
Finding the Minimum value
To find the minimum value, enter the following formula under the Minimum Value column to find the minimum value:
Finding the Lower Quartile value
To find the value of the lower quartile, we enter into the cell underneath the Lower Quartile header the following formula:
Here, 1 indicates the quartile number.
Finding the median value
We use the MEDIAN function to find the median value. We enter the formula given below into the cell D18 to compute the median:
Finding the Upper Quartile value
In order to compute the upper quartile value, we enter the formula given below in the cell E18.
Where 3 signifies third quartile
Finding the Maximum value
Entering the formula below in the cell F18 returns the maximum value.
Now that we have found the values required for the box plot, we can proceed with creating a chart to represent the values.
Step 2: Creating a suitable chart for the box plot
As Google Sheets has no direct way to create a box plot, we use the nearest available option, which is the candlestick chart.
Before we go about creating a candlestick chart, we need to give a name to the summary values we have computed above to the left of the values, since Google Sheets requires that the first column for the chart data be text. Let us simply name it “Summary Data”.
Inserting the chart
Select the summary values including the cell with text to the left of the values (“Summary Data”, in this case). Then go to the “Insert” tab and select “Chart”.
Selecting the chart
The chart that is inserted by default may not be what we desire. So, we go to
Chart editor > Chart type > Candlestick chart
This will bring up the box plot on the screen.
Customising the box plot
We can now customise the box plot to give it more refinement. To do so, we click on the “Customise” option on the Chart editor menu. You can here give the box plot a name, change the background colour, font size and colour, insert gridlines, and so on.
Adding more than one box plot in Google Sheets
We can also add more than one box plot to a single chart. Inserting more box plots is helpful if we want to make a comparison between two or more data sets. Let us add the weather data for Mumbai and compare it with Delhi’s by inserting it in the same chart.
For this, we first add the weather data for Mumbai and find the five-number summary values for Mumbai as we did for Delhi. Then we select the summary data for both Delhi and Mumbai and insert the Candlestick chart. This will show the box plot for both Delhi and Mumbai in a single chart.
Box plots help us to easily visualise differences and outliers in a data set, and also compare data for different similar entities. Although box plots are less commonly used than other types of data representations, they offer few advantages such as occupying less space and easy interpretation of the data. Do not, therefore, hesitate to use them when appropriate.
We hope this article has been helpful to you. You can find more articles on Google Sheets here.
Here are some related articles you might find useful: