SQL SELF JOIN – A Smart Way to Handle Data

sql self joinSmith, a trainee DBA at ABC Database Services, has a challenge before him. He has to refer to data inside a database table he has generated with several columns and rows. He has to compare values in a column with other values in the same column of the same table. He doesn’t know how to handle this issue. Why don’t we help him to solve this problem?

SQL SELF JOIN query will be the most efficient solution to his problem. SQL SELF JOIN is a normal join and a query used to join a table to itself. You can visualize it this way – A SELF JOIN for a given table is a join that is performed between two identical copies of that table. The same table is being joined to another copy of itself rather than being joined with a different table. Oracle 11g Tutorial – A Comprehensive Training Course, Oracle PL/SQL from scratch (Part I Oracle SQL) are great courses which will can help Smith master SQL JOIN.

Smith can also consider browsing through the refresher courses like SQL Database for Beginners and Learn SQL in 6 days to recollect SQL coding and syntax, and how to perform different database operations on tables. As a trainee if he feels he needs further guidance on writing SQL queries and learning how to build applications or generate business reports, he can refer to Introduction to SQL Training.

 SQL SELF JOIN

 Why Do We Use SELF JOIN?

Some practical uses of SELF JOIN are to obtain running counts and totals, identify which data under a particular column satisfy a certain set of conditions, and generate another table by extracting data from the original table.  Some applications of the Self-join relationship are as follows, but of course, there can be many other possibilities:

  • Obtaining statistical information of a particular item. For example, a count or total of all records in the table having a certain field value.
  • Listing all records that have the same Match field value as the current record.
  • Identifying the first record having each match field value, or identifying duplicate records.
  • Displaying the Match field value only once for each group of records having that value.

Examples and Format of SELF JOIN

A self-join can be an inner join or an outer join. A table is joined to itself based upon a field or combination of fields that have duplicate data in different records. The data-type of the inter-related columns must be of the same type or needs to cast them in same type.

To write the query, select from the same table listed twice with different aliases, set up the comparison, and eliminate cases where a particular value would be equal to itself. You can use a self-join to simplify nested SQL queries where the inner and outer queries reference the same table. These joins allow you to retrieve related records from the same table.

Example 1:

Let us take an example where we use SELF JOIN to match an employee’s name against the Supervisor or Manager’s name. The employees table is shown below:

 emp_id                                 emp_name                               emp_supv

          4081                                      Nancy Brown                           NULL

          4083                                      Peter Parker                            4081

4055                                      Jacob Miller                               4083

4058                                      Mary Ray                                   4083

4060                                      Jane Smith                                 4081

4061                                      Bob Hunter                                4081

In this table, the manager attribute simply references the employee ID of another employee in the same table. For example, Nancy Brown is the CEO of the company and Peter Parker reports to Nancy Brown, while Jacob Miller and Mary Ray report to Peter Parker.

Here’s the SQL statement that will retrieve the desired results:

Syntax:

 SELECT a.emp_id AS “Emp_ID”,a.emp_name AS “Employee Name”,

b.emp_id AS “Supervisor ID”,b.emp_name AS “Supervisor Name”

FROM employees AS a LEFT OUTER JOIN employees AS b

ON a.emp_supv = b.emp_id

 The corresponding output table will look like:

Emp_ID               Employee Name               Supervisor ID                  Supervisor Name

4081                    Nancy Brown                     NULL                                   NULL

4083                    Peter Parker                       4081                                   Nancy Brown

4055                     Jacob Miller                       4083                                  Peter Parker

4058                     Mary Ray                            4083                                  Peter Parker

4060                   Jane Smith                           4081                                  Nancy Brown

4061                    Bob Hunter                         4081                                 Nancy Brown

It’s important to select the correct join type while writing a SELF JOIN statement. In this case, we used a LEFT OUTER JOIN to ensure we had output records corresponding to each employee. If we used an INNER JOIN, then Nancy Brown, the CEO would have been omitted as she does not have a supervisor.

Example 2:

The following example performs a self-join of the Sales.SalesPerson table to produce a list of the sales people with their matching territories.

Syntax:

SELECT st.Name AS TerritoryName, sp.BusinessEntityID,

sp.SalesQuota, sp.SalesYTD

FROM Sales.SalesPerson AS sp

JOIN Sales.SalesTerritory AS st

ON sp.TerritoryID = st.TerritoryID

ORDER BY st.Name, sp.BusinessEntityID

 Example 3:

Using SELF JOIN to find the products supplied by more than one vendor.

Because this query involves a join of the ProductVendor table with itself, two different aliases (pv1 and pv2) are used in the FROM clause. These aliases are used to qualify the column names in the rest of the query.

Syntax:

SELECT DISTINCT pv1.ProductID, pv1.VendorID

FROM Purchasing.ProductVendor pv1

INNER JOIN Purchasing.ProductVendor pv2

ON pv1.ProductID = pv2.ProductID

AND pv1.VendorID <> pv2.VendorID

ORDER BY pv1.ProductID

This discussion can be extrapolated to include other types of Joins in SQL such as:

  • INNER JOIN
  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN
  • FULL OUTER JOIN
  • CROSS JOIN

Summary

As Smith climbs his career ladder he can get more experienced and creative with JOIN statements which will provide better results with a less lag time. To become a pro in database programming Smith can check out courses like SQL Server Essentials: What you should know! and Practical SQL Skills from Beginner to Intermediate. Hope we helped Smith solve his problem. Wishing him all the best to achieve, accomplish, and prosper in this career as a DBA.