Quicker, Faster SQL Queries – Using Aggregate Functions

sql aggregate functionsAggregate functions in SQL are powerful features that allow programmers to write fast, clean and efficient database queries. By learning how to identify, implement and correctly use the aggregate functions, a programmer has too hand a powerful and flexible array of mathematical tools. However, what are these Aggregate Functions and how are they used in a database query?

For those new to SQL, learn more at Udemy.com

In SQL, there are operators and functions. Typically, operators perform or manipulate data on single pieces of data known as operands, expressions or arguments. In this article, we will use the term argument. These operators manipulate the arguments within a table or view and return a result. In their simplest form, operators consist of basic arithmetical operations, such as add, subtract, divide and multiply. Syntactically an operator appears before or after an argument, or between arguments, for example 1 + 2 + 1.  Functions however differ from operators in the format of their arguments, for example, function (argument 1, argument 2, argument 3 …). This equates the example to SUM (1,2,1) It is this difference in format that enables functions to perform operations on zero, one, two or more arguments, or even sets of arguments.

Want to learn more about SQL’s fundamentals?

What are Aggregate Functions?

Aggregate functions in SQL are the inbuilt techniques, which enable SQL to perform mathematical operations on multiple rows or groups of data in a table or view. Aggregate functions unlike single row function, do not just concern themselves with one single row of data, but work on groups of rows. All aggregate functions with the exception of COUNT ignore null values. Programmers commonly use aggregates in SELECT statements by using the GROUP BY clause to identify the target rows, or groups of rows, in a table. If the GROUP BY clause is omitted then the aggregate function applies to all rows and data in the table.

Aggregate Functions – Syntax

The aggregate function syntax takes the form of placing the function before the argument(s) or set of arguments also referred to as expressions. As an example, the aggregate function (AVG) would be

AVG(argument1, argument2 ….) in its simplest and most common form.

However, there are other clauses.

In its fullest form, the syntax for the AVG aggregate function is

AVG ([DISTINCT | ALL] argument) [ OVER (analytic_clause)]

Many (but not all) aggregate functions that take a single argument also accept the DISTINCT and the ALL clauses. The DISTINCT clause only considers distinct or unique occurrences of items of data. With the ALL clause every item of data is considered, an example of the difference would be; the DISTINCT average of 1, 1, 1, and 3 is 2. However the ALL average is 1.5, in SQL, ALL is the default mode.

Some aggregate functions can use analytic_clauses as is the case here with AVG. Aggregate function may also be nested within other aggregates.

A practical example of the nested aggregate function using AVG is shown below

SELECT AVG(MAX(salary))

FROM employees

GROUP BY department_id;

 

AVG(MAX(SALARY))

—————-

15500

 

In the example above, the aggregate function MAX is nested within the AVG aggregate statement. The calculation first determines the value of maximum salary, the inner aggregate. It calculates this per group, defined by the GROUP BY clause, which in this example is department_id. Once the max salaries per department have been determined, the AVG then calculates the average maximum salary of all the maximum salaries for each department.

Commonly Used Aggregate Functions

There are many aggregate functions built into SQL the most common though include sum, count, min, max, avg and stdev.

In order to show practical examples of the aggregate functions, refer to data from the table below.

Table: STORE

store_idstore_nameregion_idemployee_count

107

New York

200

11

108

Boston

200

5

109

Chicago

300

4

110

San Diego

500

8

111

San Jose

500

2

 

SUM

Syntax - SUM([ DISTINCT | ALL ] arguments)

 

The SUM aggregate function produces a result by adding all the arguments or expressions. It may as is the case with all our examples, be used as either an aggregate or an analytical function.

Example of use

SELECT SUM(employee_count) "Total"
     FROM store;
     Total
----------
        30

 

Explanation – the sum aggregate took the arguments from employee_count and added them up returning the sum of all employees in each store.

 

COUNT

Syntax - COUNT({ * | [ DISTINCT | ALL ] expr })

 

Purpose – COUNT returns the number of rows returned as a result of the query.

Example of use

SELECT COUNT(DISTINCT region_id) "Regions"
  FROM stores;
  Regions
----------
         3

 

Explanation – In this example, the DISTINCT clause has been added so that the aggregate will only include unique instances of arguments. In this case, it ignores the duplicate occurrences of the regional_ids’, 200 & 500, only counting them once. Therefore, it returns a count of regions as being 3.

 

MIN

Syntax - MIN([ DISTINCT | ALL ] expr)

 

Purpose – returns the minimum value of the expression.

Example of use

SELECT MIN(employee_count) "Lowest No Employees"
  FROM store;
Lowest No Employees
---------
        2

 

Explanation – In this example, the aggregate looks for the lowest argument in the set employee_count.

 

MAX

Syntax - MAX([ DISTINCT | ALL ] expr)
Purpose – Returns the maximum value of an expression.
Example of use
SELECT MAX(employee_count) "Highest No Employees"
  FROM stores;
   Hishest No of Employees
                ----------
                        11
Explanation - In this example the aggregate function MAX returns the highest value of argument in the data set employee_count.

 

STDEV

Syntax - STDDEV([ DISTINCT | ALL ] expr)

 

Purpose – returns as the result the standard deviation of a set of expressions, a set of numbers.

Example of use

SELECT STDDEV(employee_count) "Deviation"
   FROM employees;
 Deviation
----------
      5.478

 

Explanation – The aggregate STDEV returns the standard deviation of the data set employee count

 

The SQL examples above are simply a subset of all the built in aggregate functions available in SQL. If you would like to learn more about other functions in SQL, there are also one row and analytic functions, which provide other in built resources and operations that make writing efficient queries faster and easier.