SQL Select Distinct – Way to Get Rid of Duplicates

Welcome to the World of SQL

sql select distinctSQL is part of the vast growing database technologies in present business scenario. As data in businesses are growing at an unbelievably fast rate, understanding SQL, relational databases, and data management is vital for survival in today’s world. The benefit of SQL is that programmers and administrators need to learn one single language, and with minor adjustments, can apply it to a wide variety of database platforms, applications, and products. SQL is the standard language used to communicate with a relational database. It is important to learn and understand SQL and how data structures are used to store information in an organization. You use SQL to convey your needs to the database. You can request specific information from within a database by writing a query in SQLSELECT DISTINCT is one such statement or query used to suppress the display of duplicate records. Sometimes you need to display all the records in a database, but sometimes for better analysis you want to display only the distinct records and this when you use the SELECT DISTINCT option.

SELECT Statement and its Significance

SELECT statement represents Data Query Language (DQL) in SQL. A table is populated by using the INSERT statement. After this the SELECT statement is used in conjunction with the FROM clause to extract data from the database in an organized, readable manner. The SELECT keyword in a query is followed by a list of columns which you want to display as the result of a query output.

Apart from the keyword FROM, a SELECT statement can be followed by other keywords such as WHERE, ORDER BY to impose conditions with the SELECT statement.

Syntax for a SELECT Statement:

SELECT [column 1, column 2, …]

FROM source_table;

If you want to display all the columns of a table in the output, then use the symbol ‘*’ after SELECT.

SQL SELECT INTO Statement

The SELECT INTO statement selects data from one table and inserts it into a new table.

If we want to copy all columns of a table into a new table, then we use ‘*’ after the SELECT statement.

Syntax:

SELECT *

 INTO newtable [IN externaldb]

 FROM table1;

Example of SQL SELECT INTO Statement:

Here is an SQL SELECT INTO example to create a copy of the entire table of Vendors:

Syntax:

SELECT *

INTO Vendorcopy2014

FROM Vendors;

If we want to copy specific columns names into the new table then we enter:

Syntax:

SELECT [column 1, column 3, ….]

INTO newtable

FROM table1;

Here a new table will be created with the column-names and types as defined in the SELECT statement.

Example: To copy only the columns bearing Vendor Name and Country Name in the new table.

Syntax:

SELECT VendorName, CountryName

 INTO Vendorcopy2014

 FROM Vendors;

Example: Conditional copying of all vendor names only from the country Germany into the new table:

Syntax:

SELECT *

 INTO Vendorcopy2014

 FROM Vendors

 WHERE Country=’Germany’;

SELECT DISTINCT Statement

The DISTINCT clause is used to remove duplicates from the result set of a SELECT statement.

Syntax for the SQL DISTINCT clause is:

SELECT DISTINCT expressions

FROM tables

WHERE conditions;

 Explanation of Parameters or Arguments

  • Expressions are the columns or calculations that you wish to retrieve.
  • Tables are the tables that you wish to retrieve records from. At least one table must be listed with the FROM clause.
  • Conditions are conditions that must be met for the records to be selected.

If you do not want to remove duplicate data, or are sure that there will be no duplicates in the result set, then use ALL keyword instead of DISTINCT. But as ALL is the default keyword, so we usually do not include ALL in our queries.

Examples of SELECT DISTINCT Statement

Let us take this table called Vendors

Vdr_NameCityCountryOrd_Amt $
ABC SupplierBerlinGermany1000
XYZ DrillerMéxico D.F.Mexico2000
Antonio TraderMéxico D.F.Mexico1500
Henry Machining ExpertLondonUK4500
Prime Hardware SupplierNew YorkUSA3000

Example – With Single Column

When we use only one expression in the DISTINCT clause, the query will return the unique values for that expression.

Syntax:

SELECT DISTINCT City

FROM Vendors;

The above SQL DISTINCT statement will return this output of unique records from the vendors table.

City
Berlin
México D.F.
London
New York

Example – With Multiple Columns

When more than one expression is provided in the DISTINCT clause, the query will retrieve unique combinations for the expressions listed.

Syntax:

SELECT DISTINCT Vdr_Name, City, Ord_Amt $

FROM Vendors

WHERE City= México D.F.;

The above example will yield this output:

Vdr_NameCityOrd_Amt $
XYZ DrillerMéxico D.F.2000
Antonio TraderMéxico D.F.1500

COUNT Function with DISTINCT

The COUNT function may be used with the DISTINCT command to count the roes with distinct values in a database.

In the above example, write the syntax

SELECT COUNT (DISTINCT City)

From Vendors;

The above will yield the result ‘4’ as the column City has two duplicate entries.

If the Duplicate statement after the SELECT was replaced with ‘*’ then the result would have been ‘5’.

Conclusion

If you want to explore SELECT and SELECT DISTINCT statements further and want to decipher the effects of different clauses,  then then SQL Queries 101 will help you to understand how to write basic SQL queries and other code statements. As you hone your knowledge and skills to become a successful database administrator and developer, you need to start writing customized SQL code to ensure you meet the user requirements in the most efficient manner. If you want to explore Oracle SQL and Oracle PL/SQL Programming then taking a peek into the course such as Introduction to Oracle SQL, or Oracle PL/SQL Tutorial – A Comprehensive Training Course may be a good idea.