Types of Joins in SQL
Nowadays, the majority of organizations use Relational Database Management Systems (RDBMS) to organize and store their data. Oracle, MS SQL Server, Postgree or MySQL – it doesn’t really matter when it comes to the main principles and rules used to express what we want and how we want it from a relational point of view. The Structured Query Language (SQL) is the common standard domain specific language used to establish this communication, and the SQL joins are the primary means to combine the data.
Most of us know the basic types of joins – Inner and Outer joins – since they’re the usual suspects in our day-to-day tasks. But there are a few more joins that are worth mentioning, at least from a theoretical point of view. Do you know what's a Theta join ? The difference between Inner, Equi and Non-Equi join, what is a Cartesian product, what is a Natural join, what are the different types of Outer joins, what’s the connection between all of them ?
Keep reading to find out.
The Bigger Picture
One picture is worth a thousand words. So let’s not waste time.
In this diagram I’ve made you can see the whole hierarchy of joins, starting with the most generic
Cross join and finishing with the Natural join. The types of joins we usually use in our day to day work are the Equi joins. The Theta joins are more of a theoretical importance and are seldom used in practice. Let’s go through all of them.
The example schema
For my examples, I’ll use the following two simple tables with a few records inside for easy understanding.
Also note that the examples I provide are generic and can vary from implementation to implementation (I've used Oracle and SQL Developer). The gray rectangles are abstract types of joins (see the difference between Inner and Equi join below).
Cross Join (Cartesian product)
The Cross join is the most generic type of join, it generally means connect anything with anything. Its result is the number of records in the first table multiplied by the number of records in the second table, showing the columns altogether.
You can perform Cross join using one of the following queries:
SELECT * FROM Departments, Employees
SELECT * FROM Departments CROSS JOIN Employees
Which will result in
The use of the Cartesian join is strongly discouraged. If you feel you need it, consider rethinking your design.
The Inner Join refers to the intersection of two (or more) tables. For example, to get all the employees who have departments and all the departments who have employees, we’ll use the one of the following queries
SELECT * FROM Departments d, Employees e WHERE d.ID = e.DepartmentID
SELECT * FROM Departments d INNER JOIN Employees e ON d.ID = e.DepartmentID
If you want to get results only from the Development department, you’ll use something like this
SELECT * FROM Departments d, Employees e WHERE d.ID = e.DepartmentID AND d.ID = 1
The Outer Joins consist of the intersection and the complements of the tables. There are three general types of outer joins:
• Full Outer Join
• Left Outer Join
• Right Outer Join
Full Outer Join
The Full Outer join consists of the intersection of the tables along with the two relative complements. Simply said, this type of join will act like an inner join but will also return all records that have NULL for their foreign key relations:
SELECT * FROM Departments d FULL OUTER JOIN Employees e ON d.ID = e.DepartmentID
As you can see, the Employees that don’t have any departments associated along with the departments which don’t have any employees associated are also returned.
Left Outer Join
The Left Outer join will return the records that intersect, along with the records from the left table which don’t have any assigned records from the right table. In our case:
SELECT * FROM Departments d LEFT OUTER JOIN Employees e ON d.ID = e.DepartmentID;
Right Outer Join
The Right Outer join is essentially the same like the Left Outer Join, but with the right table’s complement.
SELECT * FROM Departments d RIGHT OUTER JOIN Employees e ON d.ID = e.DepartmentID;
As you can see, the employees without a department are also returned.
In fact, the examples I just gave were not just Inner Joins and Outer Joins. They ware also Equi Joins. And that’s the exact type of join you use when you write a SELECT statement with the equals binary operator. When someone talk about an Inner join, they usually mean an Inner Equi Join.
The Theta join (also known as the Non-equi join) is the kind of join you'll produce if you replace or combine the equals operator with something else. For example:
SELECT * FROM Departments d, Employees e WHERE d.ID < e.DepartmentID
SELECT * FROM Departments d, Employees e WHERE d.ID <= e.DepartmentID
The Natural join is more or less a convenience. It compares the columns of the tables and performs an Equi Join on them, which can either be Inner or Outer. In order to use this join, you’ll have to follow a specific naming convention. For example, in my case I should change the ID column of Departments to Department_ID and the foreign key column of Employees to Department_ID.
Having done that, the Natural join will be able to match these two and perform. For example:
SELECT * FROM Departments NATURAL JOIN Employees
This will perform an Inner Join by default. If we want to make it an Outer Natural Join, we can do it like this:
SELECT * FROM Departments NATURAL LEFT OUTER JOIN Employees
A Few Tips on Using SQL Joins
- By INNER JOIN, we usually mean INNER EQUI JOIN
- By LEFT JOIN, we usually mean LEFT OUTER EQUI JOIN (RIGHT JOIN respectively)
- The INTERSECT operation is not the same as the INNER JOIN
- The UNION operation is not the same as the FULL OUTER JOIN
- Don’t use the Cross Join. If you think you need it, better reconsider your design.
- Although you can use the Natural join, I would advise you against that for a few reasons
1. Stating the column names explicitly makes your intent clearer.
2. In order to utilize it, you’ll need to use a specific naming convention that may conflict with the one you currently follow.
3. If you use an Object Relation Mapper, you should think twice before choosing a proper table name (I would not like to access my department id like dept.Department_ID. I would prefer dept.ID).
For more information on the topic, check Kyle Hailey's article on SQL relations.
About the author: