SQL Zoo Guide: SUM and COUNT

5 minute read

More SQL using SUM and COUNT!

Introduction

We continue where we left off from the previous post where we explored how to use SELECT within SELECT and correlated subqueries using the exercises found in SQL Zoo.

Below is a snippet of the table we’ll be working with for the following exercises.

world Table

name continent area population gdp
Afghanistan Asia 652230 25500100 20343000000
Albania Europe 28748 2831741 12960000000
Algeria Africa 2381741 37100000 188681000000
Andorra Europe 468 78115 3712000000
Angola Africa 1246700 20609294 100990000000

SUM, COUNT, MAX DISTINCT & ORDER BY functions

Today we will be learning how to use various aggregate functions in addition to DISTINCT and ORDER BY. Before we begin working on the following exercises, it is important to know when and how to use each of these functions.

The COUNT function counts the number of non-empty values within a column. We simply provide the name of the column within the function and the query should return the total number of non-empty rows. It’s important to note COUNT does not take into account the actual values stored within each row and only cares if the row has a non-empty value.

The SUM function on the other hand takes a column name and returns the sum of all values within the column. As opposed to the COUNT function, the SUM function does take into account the actual numerical value stored within each row.

The MAX function returns the max numeric value within the specified column.

The DISTINCT function removes duplicate observations and returns each unique observation within a column.

The ORDER BY function sorts the result-set in ascending or descending order. By default, ORDER BY sorts by ascending order. To sort by descending order, simply add the DESC at the end after the column(s) by which you are sorting by.

#1. Total world population

1) Show the total population of the world.

The first exercise begins with the basic application of the SUM function. Simply apply the SUM function to the population column which contains the population of all the countries in the world.

SELECT SUM(population)
FROM world

The query should return 7,118,632,738.

#2. List of continents

2) List all the continents - just once each.

The next exercise involves listing out all of the distinctive continents within the world table. Note, similar to the previous exercise, the DISTINCT function is applied at the beginning on the same line as the SELECT function.

SELECT DISTINCT(continent)
FROM world

The query should return all 8 distinct continents within the world table

#3. GDP of Africa

3) Give the total GDP of Africa.

Similar to our previous post on SQL Zoo, as the queries become more complicated, it helps to break the problem down into easy to understand fragments. Since we are taking into consideration the values stored within each observation (GDP), we will be using the SUM function.

We require the following:

Pull GDP from world table
Condition: Only countries from Africa
Require the total GDP
SELECT SUM(gdp)
FROM world
WHERE continent = 'Africa'

#4. Count the big countries

4) How many countries have an area of at least 1000000

This exercises requires us to count the total number of results that are returned after meeting the specified condition. Therefore, we require the COUNT function.

Pull countries from world table
Condition: Only countries with area > 1000000
Specifically, require the total number of observations returned
SELECT COUNT(name)
FROM world
WHERE area > 1000000

The returned result should be 28 countries.

#5. Baltic states population

5) What is the total population of (‘Estonia’, ‘Latvia’, ‘Lithuania’)?

Similar to exercise #3, we are interested in the total sum but with an added condition.

Pull population from world table
Condition: Only from the specified countries
Require the total population of the specified countries
SELECT SUM(population)
FROM world
WHERE name IN ('Estonia', 'Latvia', 'Lithuania')

Note: Remember to use IN instead of = when referencing multiple conditions.

Using GROUP BY & HAVING

The following examples explore combining the GROUP BY & HAVING functions with the aggregate functions we have practiced with.

The GROUP BY function is often used in conjuction with aggregate functions by applying the functions to groups of items that share the same values.

The HAVING function is used alongside the GROUP BY function in place of the WHERE function as the WHERE function cannot be used with aggregate functions.

#6. Counting the countries of each continent

6) For each continent, show the continent and number of countries.

For this exercise, we want to know how many total countries each continent has. Since the table doesn’t provide this data explicitly, we have to manipulate the table for the desired results.

Breaking this problem down, we have:

Pull continents and countries from world table
Require total # of countries
Require each continent
Seperate the total # of countries by continent
SELECT DISTINCT(continent),
    COUNT(name) AS 'Total # of countries'
FROM world
GROUP BY continent

#7. Counting big countries in each continent

7) For each continent, show the continent and number of countries with populations of at least 10 million.

Similar to the prior problem, we have to manipulate the data to obtain the desired information. The first clue we have is “For each continent”. This immediately lets us know that we will group the data by continent. Splitting the problem into simpler parts we have the following:

Pull continents and countries from world table
Require total # of countries
Condition: Only countries with population > 10,000,000
Seperate the total # of countries by continent
SELECT continent,
    COUNT(name) AS 'total # of countries'
FROM world
WHERE population > 10000000
GROUP BY continent

#8. Counting big continents

8) List the continents that have a total population of at least 100 million.

The final problem may be tricky to those that haven’t familiared themselves with the HAVING function. Normally, when we specify a condition, we use the WHERE function. However, when we are applying conditions to a group, it’s important to distinguish between the two similar functions.

In this example, we are interested in the total population of each continent. However, we do not have that specific information. We have the population of each country but with some data manipulation, we can easily figure the population of each continent by grouping the countries and adding their total population together. Once this is done, we can specify the condition each continent must meet. In simpler terms we have:

Pull continents from world table
Require the entire population of EACH continent
Condition: Only return continents with a TOTAL population > 100,000,000
SELECT continent
FROM world
GROUP BY continent
HAVING SUM(population) > 100000000

This concludes our guide on learning how to use aggregate functions alongside with the GROUP BY and HAVING functions. Stay tuned for a future post on learning to use JOIN functions!

Tags:

Updated: