Using Aggregate Functions (SUM, AVG, MAX, MIN, COUNT, DISTINCT) in MySQL

Using Aggregate Functions (SUM, AVG, MAX, MIN, COUNT, DISTINCT) in MySQL

Learn how to use the aggregate functions including SUM, AVG, MAX, MIN, COUNT, DISTINCT in MySQL.

September 23, 2019

We often required to perform calculations on multiple rows to get the data for reporting, statistical and analytical purpose. In such cases, the aggregation functions available in MySQL are handy and we must be aware of using these functions to fetch the desired data. These functions include SUM, AVG, MAX, MIN, COUNT, and DISTINCT.

The rest of the section of this tutorial explains the usage of these aggregation functions.

Test Data

You can use the below-mentioned queries to prepare the test data to follow the subsequent sections. Create the school database and tables (user and score) to understand the aggregation functions.

# Create the database
CREATE SCHEMA `school` DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;

# Create the user table
CREATE TABLE `school`.`user` (
`user_id` BIGINT NOT NULL AUTO_INCREMENT,
`first_name` VARCHAR(45) NOT NULL,
`last_name` VARCHAR(45) NULL,
`section` SMALLINT(6) NULL,
`active` TINYINT(1) NOT NULL DEFAULT 0,
PRIMARY KEY (`user_id`));

# Create the score table
CREATE TABLE `school`.`score` (
`score_id` BIGINT NOT NULL AUTO_INCREMENT,
`user_id` BIGINT NOT NULL,
`section` SMALLINT(6) NOT NULL DEFAULT 1,
`subject` VARCHAR(45) NOT NULL,
`score` SMALLINT(6) NOT NULL DEFAULT 0,
PRIMARY KEY (`score_id`));

Now add the test data to the user table as shown below.

# User data
INSERT INTO `user` ( `user_id`, `first_name`, `last_name`, `section`, `active` ) VALUES
( 1, 'John', 'Smith', 1, 1 ),
( 2, 'Rick', 'Jones', 1, 1 ),
( 3, 'Catherine', 'Ponting', 2, 1 ),
( 4, 'Harsh', 'Upadhyay', 2, 1 ),
( 5, 'Tajwinder', 'Singh', 2, 0 );

The above query inserts 5 users for sections 1 and 2. Also, insert the score data using the query as shown below.

# Score data
INSERT INTO `school`.`score` (`score_id`, `user_id`, `section`, `subject`, `score`) VALUES
('1', '1', '1', 'Maths', '86'),
('2', '1', '1', 'Physics', '75'),
('3', '1', '1', 'Chemistry', '90'),
('4', '2', '1', 'Maths', '64'),
('5', '2', '1', 'Physics', '98'),
('6', '2', '1', 'Chemistry', '56'),
('7', '3', '1', 'Maths', '48'),
('8', '3', '1', 'Physics', '69'),
('9', '3', '1', 'Chemistry', '54'),
('10', '3', '2', 'Maths', '82'),
('11', '3', '2', 'Physics', '85'),
('12', '3', '2', 'Chemistry', '91'),
('13', '4', '2', 'Maths', '65'),
('14', '4', '2', 'Physics', '74'),
('15', '4', '2', 'Chemistry', '84');

We have inserted the score of the active students. Also, note that the score for the student with id 1 and 2 is added for section 1 for all the 3 subjects. The above query also inserts the score data of student with id 3 for sections 1 and 2. The student having id 4 got the score data for section 2. Now we have good test data to start learning the aggregate functions.

SUM

This section explains the usage of the aggregate function sum to get the section-wise score of the users to get the total score of all the subjects for all the sections as shown below.

# SUM - Total Score of the Users for each section
SELECT `user`.`first_name`, `user`.`last_name`, `score`.`section`, SUM(`score`) AS total_score FROM `user`, `score`
WHERE `user`.`user_id` = `score`.`user_id`
GROUP BY `score`.`user_id`, `score`.`section`;

# Result

first_name last_name section total_score
----------------------------------------------------
John Smith 1 251
Rick Jones 1 218
Catherine Ponting 1 171
Catherine Ponting 2 258
Harsh Upadhyay 2 223

The above query results can be used to present the report showing the total score of all the users for each section they have studied in the school.

AVG

The aggregate function AVG can be used to get the average value of the columns qualified for the aggregation based on the WHERE conditions and the grouping applied by us.

We can obtain the section-wise average marks in each subject as shown below.

# AVG - Section wise average score in each subject
SELECT `score`.`section`, `score`.`subject`, AVG(`score`) AS avg_score FROM `score`
GROUP BY `score`.`section`, `score`.`subject`;

# Result

section subject avg_score
--------------------------------------------
1 Maths 66.0000
1 Physics 80.6667
1 Chemistry 66.6667
2 Maths 73.5000
2 Physics 79.5000
2 Chemistry 87.5000

Now we can use the above data to generate the report showing the average score obtained by the users for each section.

MAX

The aggregate function MAX can be used to find out the maximum value of the columns qualified for the aggregation based on the WHERE conditions and the grouping applied by us.

We can obtain the section-wise maximum marks in each subject as shown below.

# MAX - Section wise maximum score in each subject
SELECT `score`.`section`, `score`.`subject`, MAX(`score`) AS max_score FROM `score`
GROUP BY `score`.`section`, `score`.`subject`;

# Result

section subject max_score
--------------------------------------------
1 Maths 86
1 Physics 98
1 Chemistry 90
2 Maths 82
2 Physics 85
2 Chemistry 91

MIN

The aggregate function MIN can be used to find out the minimum value of the columns qualified for the aggregation based on the WHERE conditions and the grouping applied by us.

We can obtain the section-wise minimum marks in each subject as shown below.

# MIN - Section wise minimum score in each subject
SELECT `score`.`section`, `score`.`subject`, MIN(`score`) AS min_score FROM `score`
GROUP BY `score`.`section`, `score`.`subject`;

# Result

section subject max_score
--------------------------------------------
1 Maths 48
1 Physics 69
1 Chemistry 54
2 Maths 65
2 Physics 74
2 Chemistry 84

COUNT

The aggregate function COUNT can be used to get the total number of values in the specified columns.

Using the test data, we can get the section-wise total active users as shown below.

# COUNT - Total active users in each section
SELECT `user`.`section`, COUNT(`user_id`) AS active_users FROM `user`
WHERE `user`.`active` = 1
GROUP BY `user`.`section`;

# Result
section active_user
--------------------
1 2
2 2

DISTINCT

We can use the keyword DISTINCT to omit duplicate records. In our scenario, we can get the user data who obtained score for at least one subject using the query as shown below.

# DISTINCT - Get list of users who got score
SELECT DISTINCT `user`.`user_id`, `user`.`first_name`, `user`.`last_name` FROM `user`, `score`
WHERE `user`.`user_id` = `score`.`user_id`;

# Result

user_id first_name last_name
--------------------------------------------
1 John Smith
2 Rick Jones
3 Catherine Ponting
4 Harsh Upadhyay

We can see that the inactive user who did not score even once is omitted from the query results.

This is how we can use the aggregate functions to obtain the data for reporting and analysis purpose. These functions are also important for managerial activities to take organization decisions using the data accumulated over time.

Write a Comment
Click the captcha image to get new code.
Discussion Forum by DISQUS