The Ultimate Guide to the SQL COUNT Aggregate Function
The SQL COUNT function is an aggregate function that you can use in your queries to return the number of items found in a group. In this article, we will look at how you can use the COUNT function for more powerful, analytical SQL queries. First, we will look at how functions work in SQL, and then we will look at what aggregate functions are. Finally, we will examine some real SQL queries that use COUNT to look at the function in more detail.
SQL operators versus functions
In SQL, there are operators and functions. We will look at COUNT in this article, which is a function. So before we get into how to use the COUNT function, let’s look at how operators and functions differ.
Typically, operators perform or manipulate data on single pieces of data known as operands, expressions, or arguments. In this article, we will use the term argument. These operators manipulate the arguments within a table or view and return a result. In their simplest form, operators comprise basic arithmetical operations, such as add, subtract, divide and multiply. Syntactically an operator appears before or after an argument, or between arguments, for example, 1 + 2 + 1.
Functions, however, differ from operators in the format of their arguments, for example, function (argument 1, argument 2, argument 3 …). This equates the example to SUM (1,2,1). It is this difference in format that enables functions to perform operations on zero, one, two, or more arguments, or even sets of arguments.
What are SQL aggregate functions?
The COUNT function is an aggregate function in SQL. Aggregate functions in SQL are the built-in techniques, which enable SQL to perform mathematical operations on multiple rows or groups of data in a table or view. Aggregate functions, unlike single-row functions, do not just concern themselves with one row of data but work on groups of rows.
All aggregate functions except for COUNT ignore null values. Programmers commonly use aggregates in SELECT statements by using the GROUP BY clause to identify the target rows, or groups of rows, in a table. If you omit the GROUP BY clause, the aggregate function will apply to all rows and data in the table.
Aggregate functions in SQL are powerful features that allow programmers to write fast, clean, and efficient database queries. By learning how to identify, implement and correctly use the aggregate functions, a programmer has a powerful and flexible array of mathematical tools at hand. Now let’s look at the syntax of aggregate functions, which apply to the COUNT function.
SQL COUNT function syntax
The aggregate function syntax takes the form of placing the function before the argument(s) or set of arguments, also referred to as expressions. As an example, the syntax for the COUNT function would be:
COUNT(argument1, argument2 ….)
That is COUNT in its simplest and most common form.
However, there are other clauses.
In its fullest form, the syntax for the COUNT aggregate function is
COUNT ([DISTINCT | ALL] argument) [ OVER (analytic_clause)]
Many (but not all) aggregate functions that take a single argument also accept the DISTINCT and the ALL clauses. The DISTINCT clause only considers distinct or unique occurrences of items of data. With the ALL clause, every item of data is considered. An example of the difference would be the DISTINCT average of 1, 1, 1, and 3 is 2. However, the ALL average is 1.5. In SQL, ALL is the default mode.
Some aggregate functions can use analytic clauses, as is the case here with AVG. Aggregate functions may also be nested within other aggregates. We will look at multiple ways to use the COUNT functions when we look at some example queries in the next section.
SQL COUNT example queries
Knowing the syntax of the query may help us write them, but looking at example queries retrieved from a data set we can see will help us figure out what we can really do with them.
For the example queries in this article, we will use data from the following example table, called Products:
|1||100||Ice Cream||United States||5|
|2||null||Brussel Sprouts||United States||4|
|3||null||Iceberg Lettuce||United States||3|
It’s a simple table of products for a grocery store with the product names, ids, brand_ids, prices, and country of origin. Some values are null in the table, which means they have no value. This is on purpose. Some queries we write will account for these null values. Let’s get started with a simple SELECT query.
A standard SELECT query example
If you are new to the COUNT function but know a little SQL, then you know queries like this one:
SELECT name, country FROM Products;
Which will return these results:
|Ice Cream||United States|
|Brussel Sprouts||United States|
|Iceberg Lettuce||United States|
We can see there are ten results visually, but we can use the SQL COUNT function to get this value, as we will see in the next example.
An SQL COUNT query example
There is not much to a simple COUNT query. It will basically give you the count of records in the table. Here is an example of that type of query:
SELECT COUNT(*) AS product_count FROM Products;
This query will return this result set:
This tells us that our visual count was right, but now we can process the result with software that is connected to our database because we have the result in data. It just does a simple count of the records in the table or the row count of our first select query. Most database engines are configured to make a query like this really fast by doing just that, returning the row count.
This query works if all we want is a row count of the Products table. However, by adding a WHERE clause, we can create a more useful query.
Notice that we used COUNT(*) in our query. Adding the asterisk simply counts the rows in the results query. Another option is using COUNT(1). What using one instead of an asterisk does is append a 1 to every result in the query’s result set and then count that. There is even another option. Since the last option appended a value to count, we can replace that with counting an existing value, using COUNT(id) or COUNT(name). The result will be the same. It is just three different ways of doing the same thing.
You should use the COUNT(*) version. Most relational databases that see this version of COUNT will count the rows by accessing an index in the table, which will be optimized for speed. If COUNT(name) is used, the database engine may actually inspect each value in the column, resulting in a slower query. The same may happen if we use the COUNT(1) format, so in the remaining examples, we will be using the asterisk in the COUNT function where needed.
SQL COUNT with WHERE examples
The last query gave us a simple count of the Products table. We can also add a WHERE clause to filter the products and then count the resulting records. Here is a query where we count all the records where the brand_id is 101:
SELECT COUNT(*) AS product_count FROM Products WHERE brand_id = 101;
This query will return this result:
We can continue to run queries for each brand_id to get a count of all the products in each. We can also do the same to find the count of products from a specific country. Here is the query to count all the products that come from Canada:
SELECT COUNT(*) AS product_count FROM Products WHERE country = 'Canada';
This query will return this result set:
Dealing with null values using SQL COUNT
Remember when we said that some columns in the table had null values, and when we ran our first COUNT query, it just ignored all these null values and counted all the rows in the query? We can handle these with a query to filter them out. For example, if we only wanted a count of the products that had a brand_id, we could use the following query:
SELECT COUNT(*) AS product_count FROM Products WHERE brand_id IS NOT NULL;
The result of that query is the following:
Notice that when we are checking for nulls in a column, we use a different syntax. We don’t use the equals sign. Instead, check if the value IS NULL or IS NOT NULL.
Getting multiple counts by using SQL COUNT with GROUP BY
COUNT is a SQL aggregate function, but so far, none of the queries we have looked at have returned more than a single count. It’s time to use GROUP BY and show that COUNT can do much more. What GROUP BY does is group similar rows together into one row based on what columns are listed in the GROUP BY function. Once records are grouped like this, aggregate functions can be run separately on each group.
So if we wanted to count all the products we have in each brand_id, we could do that with one query. This is that query:
SELECT brand_id, COUNT(*) AS product_count FROM Products GROUP BY brand_id;
That query will return these results:
Notice that the null brand_id showed up again. To return just the counts of those products that actually have a brand_id, you could change the query to:
SELECT brand_id, COUNT(*) AS product_count FROM Products WHERE brand_id IS NOT NULL GROUP BY brand_id;
We can also run a similar query to get the count of products coming from each country. This time, we will filter out the null country values. Here is that query:
SELECT country, COUNT(*) AS product_count FROM Products WHERE country IS NOT NULL GROUP BY country;
The result of this query will look like this:
Be careful using the GROUP BY clause. You must make sure that all the columns you are querying can be grouped. In the count queries above, notice that we only returned the column we grouped the query by and the count of each group. Queries like the following won’t work or return weird results:
SELECT country, COUNT(*) AS product_count FROM Products WHERE country IS NOT NULL GROUP BY brand_id;
This query will return this result:
The query worked in an SQL sense. It returned results, but they are useless to us. The problem with the last query is that we grouped the results by the brand_id, but we are trying to return the country in the result set.
Using COUNT with OVER and PARTITION BY
We have already used GROUP BY to group the database records, so we could get a count of the products in each brand. We can get the same result by using OVER and PARTITION BY along with DISTINCT. Here is that query:
SELECT DISTINCT brand_id, COUNT(name) OVER(PARTITION BY brand_id) AS product_count FROM Products;
This query returns this set of results:
These are the same results we got from GROUP BY. You should remember to use DISTINCT, or else you will get these results instead:
Getting distinct counts with SQL COUNT
Another useful SQL keyword you can use with the COUNT function is DISTINCT. Look at the price column in our example table. There are 10 values there, just like there is in every column, but a lot of these repeat. If we wanted to count the distinct price values, we could do it with this query:
SELECT COUNT(DISTINCT price) AS distinct_price_count FROM Products;
The result of that query is this:
So we only have five distinct price values.
GROUPING and including a grand total with SQL COUNT
We can also write a query for the Products table that gives us not only a count of each product in a brand but also the last row that includes a total count of all the products. GROUPING indicates whether a specific column expression in a GROUP BY list is aggregated or not. It will return 1 if it is and 0 if it is not. Here is that query:
SELECT CASE WHEN GROUPING(brand_id) = 1 THEN 'Total:' ELSE brand_id END AS brand_id, COUNT(*) product_count FROM Products WHERE brand_id IS NOT NULL GROUP BY brand_id ORDER BY GROUPING(brand_id), brand_id;
We also use the CASE statement in the query to determine whether we should return a brand_id row or the row with the total. We also remove records when the brand_id is null from the results. We also use the ORDER BY clause to make sure the total row is the final record.
Top courses in Development
Using SQL COUNT with other aggregate functions
You can also use other aggregate functions along with COUNT in your SQL queries. If we wanted to return the count of products per country, along with the average price of the products those countries offer, we could write a query like the one below to get those results.
SELECT country, COUNT(*) AS product_count, AVG(price) AS price_average FROM Products WHERE country IS NOT NULL GROUP BY country;
The results of this query are below:
Both the AVG and COUNT functions will execute on the same set of grouped results.
You can do much more with SQL queries than just return the data stored in the database. Relational database engines also have built-in functions that you can execute on the results returned in the query. COUNT is one of those functions. COUNT is an aggregate function that returns the count of the items passed to it. These items can be columns from a database table or the results from an SQL expression. Using COUNT with the other functions available to you in SQL, you can write queries that add data analysis to the simple data retrieval process you are used to.
The count command is one of the most useful SQL commands, but it’s only a fraction of what you can do with the database language. Let’s continue our path to SQL by learning how to use the SQL AVG command.
Featured courses in Data Science
Empower your team. Lead the industry.
Get a subscription to a library of online courses and digital learning tools for your organization with Udemy Business.