PostgreSQL SELECT DISTINCT
Summary: in this tutorial, you will learn how to use the PostgreSQL SELECT DISTINCT
clause to remove duplicate rows from a result set returned by a query.
Introduction to PostgreSQL SELECT DISTINCT clause
The DISTINCT clause is used in the SELECT statement to remove duplicate rows from a result set. It keeps one row for each group of duplicates and can be applied to one or more columns in the select list of the SELECT statement.
The following illustrates the syntax of the DISTINCT
clause:
SELECT
DISTINCT column1
FROM
table_name;
In this statement, the values in the column1 column are used to evaluate duplicates.
If you specify multiple columns, the DISTINCT clause evaluates duplicates based on the combination of values in these columns.
SELECT
DISTINCT column1, column2
FROM
table_name;
In this case, the combination of values in both the column1 and column2 columns is used to evaluate duplicates.
PostgreSQL also provides the DISTINCT ON (expression) clause to keep the “first” row of each group of duplicates, using the following syntax:
SELECT
DISTINCT ON (column1) column1,
column2
FROM
table_name
ORDER BY
column1,
column2;
The order of rows returned from the SELECT statement is unspecified; therefore, the “first” row of each group of duplicates is also unspecified.
It is a good practice to always use the ORDER BY clause with DISTINCT ON (expression) to make the result set predictable.
Notice that the DISTINCT ON expression must match the leftmost expression in the ORDER BY clause.
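As a minimal sketch of why the ORDER BY clause matters, the query below keeps the most recent order for each customer. The orders data, its column names, and its values are hypothetical; they are supplied inline through a VALUES list so that the statement runs on its own.

```sql
-- Hypothetical inline data: (customer, order_date, amount)
SELECT DISTINCT ON (customer)
    customer,
    order_date,
    amount
FROM (
    VALUES
        ('alice', DATE '2024-01-05', 10),
        ('alice', DATE '2024-03-12', 25),
        ('bob',   DATE '2024-02-01', 40),
        ('bob',   DATE '2024-01-20', 15)
) AS orders (customer, order_date, amount)
ORDER BY
    customer,          -- leftmost sort key matches the DISTINCT ON expression
    order_date DESC;   -- decides which row is "first" within each customer
```

Sorting each customer's rows by order_date in descending order makes the latest order the first row of its group, so the query keeps the 2024-03-12 row for alice and the 2024-02-01 row for bob.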
PostgreSQL SELECT DISTINCT examples
Let’s create a new table called distinct_demo
and insert data into it for practicing the DISTINCT
clause.
Note that you will learn how to create a table and insert data into it in a subsequent tutorial. For now, just execute the statements in psql or pgAdmin.
First, use the following CREATE TABLE statement to create the distinct_demo table, which consists of three columns: id, bcolor, and fcolor.
CREATE TABLE distinct_demo (
id serial NOT NULL PRIMARY KEY,
bcolor VARCHAR,
fcolor VARCHAR
);
Second, insert some rows into the distinct_demo
table using the following INSERT
statement:
INSERT INTO distinct_demo (bcolor, fcolor)
VALUES
('red', 'red'),
('red', 'red'),
('red', NULL),
(NULL, 'red'),
('red', 'green'),
('red', 'blue'),
('green', 'red'),
('green', 'blue'),
('green', 'green'),
('blue', 'red'),
('blue', 'green'),
('blue', 'blue');
Third, query the data from the distinct_demo
table using the SELECT
statement:
SELECT
id,
bcolor,
fcolor
FROM
distinct_demo;
PostgreSQL DISTINCT one column example
The following statement selects unique values in the bcolor column from the distinct_demo table and sorts the result set in alphabetical order by using the ORDER BY clause.
SELECT
DISTINCT bcolor
FROM
distinct_demo
ORDER BY
bcolor;
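To see how many rows collapse into each distinct value, a GROUP BY query with a count can serve as a rough cross-check. This sketch assumes the distinct_demo table was created and populated exactly once with the statements above; the row_count alias is illustrative.

```sql
-- Count how many rows share each bcolor value.
-- Each group here corresponds to one row returned by SELECT DISTINCT bcolor.
SELECT
    bcolor,
    COUNT(*) AS row_count
FROM
    distinct_demo
GROUP BY
    bcolor
ORDER BY
    bcolor;
```

Under that assumption, the groups are blue (3 rows), green (3 rows), red (5 rows), and NULL (1 row). DISTINCT treats NULL values as duplicates of each other, so NULL forms a single group, and it appears last because PostgreSQL sorts NULLs last in ascending order by default.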
PostgreSQL DISTINCT multiple columns
The following statement demonstrates how to use the DISTINCT clause on multiple columns:
SELECT
DISTINCT bcolor,
fcolor
FROM
distinct_demo
ORDER BY
bcolor,
fcolor;
Because we specified both bcolor
and fcolor
columns in the SELECT DISTINCT
clause, PostgreSQL combined the values in both bcolor
and fcolor
columns to evaluate the uniqueness of the rows.
The query returns the unique combinations of bcolor and fcolor from the distinct_demo table. Notice that the distinct_demo table has two rows with the value red in both the bcolor and fcolor columns. When we applied DISTINCT to both columns, one of these rows was removed from the result set because it is a duplicate.
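One way to confirm which combination is duplicated is to group by both columns and keep only the groups with more than one row. This is a sketch that assumes the distinct_demo table holds exactly the rows inserted above; the occurrences alias is illustrative.

```sql
-- List the (bcolor, fcolor) combinations that occur more than once.
SELECT
    bcolor,
    fcolor,
    COUNT(*) AS occurrences
FROM
    distinct_demo
GROUP BY
    bcolor,
    fcolor
HAVING
    COUNT(*) > 1;
```

With the sample data, only the (red, red) combination qualifies, with two occurrences. Every other combination, including ('red', NULL) and (NULL, 'red'), appears once and is therefore unaffected by DISTINCT.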
PostgreSQL DISTINCT ON example
The following statement sorts the result set by bcolor and fcolor, and then, for each group of rows with the same bcolor value, keeps the first row in the returned result set.
SELECT
DISTINCT ON (bcolor) bcolor,
fcolor
FROM
distinct_demo
ORDER BY
bcolor,
fcolor;
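The output contains one row per distinct bcolor value: (blue, blue), (green, blue), (red, blue), and (NULL, red). In each group, the row with the alphabetically first fcolor is kept, and the NULL group comes last because PostgreSQL sorts NULLs last in ascending order by default.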
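For comparison, the same first-row-per-group selection can be sketched with the ROW_NUMBER() window function, which may be useful in databases that lack DISTINCT ON. The ranked and rn names are illustrative, and the query assumes the distinct_demo table defined above.

```sql
-- Number the rows within each bcolor group by fcolor, then keep row 1.
SELECT
    bcolor,
    fcolor
FROM (
    SELECT
        bcolor,
        fcolor,
        ROW_NUMBER() OVER (
            PARTITION BY bcolor   -- plays the role of DISTINCT ON (bcolor)
            ORDER BY fcolor       -- same tie-breaker as the ORDER BY above
        ) AS rn
    FROM
        distinct_demo
) AS ranked
WHERE
    rn = 1
ORDER BY
    bcolor;
```

The window-function form is more verbose, but it makes the "sort within each group, keep the first row" logic explicit.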
In this tutorial, you have learned how to use the PostgreSQL SELECT DISTINCT clause to remove duplicate rows returned by a query.