Learn through the super-clean Baeldung Pro experience:
>> Membership and Baeldung Pro.
No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.
Last updated: July 22, 2024
We use joins and subqueries to retrieve data from multiple tables in SQL. Nevertheless, they’re used for different purposes and in various scenarios to optimize and simplify queries. Understanding their differences and knowing when to use each can significantly enhance database performance and query efficiency.
In this article, we’ll explore the definitions and types of joins and subqueries and explore the differences between the two.
We’ll provide practical examples using the Baeldung University database scheme, specifically its Department and Faculty tables.
Note that we perform all operations in the PostgreSQL database management system (DBMS).
A join operation combines rows from two or more tables based on a related column between them. We use a join operation to retrieve data spread across multiple tables.
There are several types of joins, each serving a different purpose and use case:
Let’s use the Faculty and Department tables from the Baeldung University database schema to illustrate a full outer join:
SELECT f.id AS faculty_id, f.name AS faculty_name, d.name AS department_name
FROM Faculty f
FULL OUTER JOIN Department d ON f.department_id = d.id;
+------------+-----------------+--------------------------------+
| faculty_id | faculty_name | department_name |
|------------+-----------------+--------------------------------|
| 1 | Anubha Gupta | Electronics and Communications |
| 2 | Anubha Gupta | Electronics and Communications |
| 3 | Anubha Gupta | Electronics and Communications |
| 21 | Peter Pan | Electronics and Communications |
| 22 | Peter Pan | Electronics and Communications |
| 23 | Peter Pan | Electronics and Communications |
| 33 | Nando de Fretas | Electronics and Communications |
| 34 | Nando de Fretas | Electronics and Communications |
...
In this query, we performed a full outer join to display all faculty members and their associated departments. Additionally, faculty members without assigned departments and departments without faculty members are included.
A subquery, also known as an inner query or nested query, is a query within another SQL query. We can use subqueries in SELECT, INSERT, UPDATE, or DELETE statements, and we can also nest them inside other subqueries.
We use subqueries in the SELECT statement to calculate values for the result set. In particular, it’s helpful when we perform calculations or retrieve values that depend on other tables or calculations.
For example, let’s calculate the total number of faculty members in each department using a subquery in the SELECT statement:
SELECT
d.name AS department_name,
(SELECT COUNT(*) FROM Faculty f WHERE f.department_id = d.id) AS faculty_count
FROM
Department d;
+--------------------------------+---------------+
| department_name | faculty_count |
|--------------------------------+---------------|
| Computer Science | 20 |
| Electronics and Communications | 17 |
| Mechanical Engineering | 10 |
| Civil Engineering | 13 |
| Mathematics | 12 |
+--------------------------------+---------------+
SELECT 5
Time: 0.028s
In this example, the subquery (SELECT COUNT(*) FROM Faculty f WHERE f.department_id = d.id) calculates the total number of faculty members in each department.
We use subqueries in the WHERE clause to filter results based on values retrieved by another query.
For example, let’s find the names of faculty members who belong to departments that have more than 17 faculty members:
SELECT
f.name
FROM
Faculty f
WHERE
f.department_id IN (SELECT d.id FROM Department d WHERE (SELECT COUNT(*) FROM Faculty f WHERE f.department_id =
d.id) > 17);
+----------------+
| name |
|----------------|
| AV Subramanium |
| Risa Sodi |
| Risa Sodi |
| Wlliam Liu |
| Wlliam Liu |
| Wlliam Liu |
| Wlliam Liu |
| Cormen Qiu |
| Ajit Singh |
...
We can also use subqueries in the JOIN clause to create complex joins based on criteria that involve multiple tables.
For example, let’s retrieve faculty members along with the total number of faculty members in their department:
SELECT
f.id AS faculty_id,
f.name AS faculty_name,
d.name AS department_name,
dept_counts.faculty_count
FROM
Faculty f
JOIN
Department d ON f.department_id = d.id
JOIN
(SELECT department_id, COUNT(*) AS faculty_count FROM Faculty GROUP BY department_id) dept_counts
ON f.department_id = dept_counts.department_id;
+------------+-----------------+--------------------------------+---------------+
| faculty_id | faculty_name | department_name | faculty_count |
|------------+-----------------+--------------------------------+---------------|
| 1 | Anubha Gupta | Electronics and Communications | 17 |
| 2 | Anubha Gupta | Electronics and Communications | 17 |
| 3 | Anubha Gupta | Electronics and Communications | 17 |
| 21 | Peter Pan | Electronics and Communications | 17 |
| 22 | Peter Pan | Electronics and Communications | 17 |
| 23 | Peter Pan | Electronics and Communications | 17 |
| 33 | Nando de Fretas | Electronics and Communications | 17 |
| 34 | Nando de Fretas | Electronics and Communications | 17 |
| 41 | Robert Ludloo | Electronics and Communications | 17 |
| 42 | Robert Ludloo | Electronics and Communications | 17 |
...
In this example, the subquery (SELECT department_id, COUNT(*) AS faculty_count FROM Faculty GROUP BY department_id) calculates the number of faculty members in each department. Furthermore, the main query then joins this result to the Faculty and Department tables.
Nested subqueries are subqueries contained within other subqueries. Furthermore, they’re used for complex operations where multiple levels of queries are required.
For example, let’s find the names of departments that have more faculty members than the departments with the fewest faculty members:
SELECT
d.name
FROM
Department d
WHERE
(SELECT COUNT(*) FROM Faculty f WHERE f.department_id = d.id) >
(SELECT MIN(faculty_count)
FROM (SELECT COUNT(*) AS faculty_count
FROM Faculty
GROUP BY department_id) AS subquery);
+--------------------------------+
| name |
|--------------------------------|
| Computer Science |
| Electronics and Communications |
| Civil Engineering |
| Mathematics |
+--------------------------------+
SELECT 4
Time: 0.011s
In this query, the nested subquery calculates the number of faculty members in each department. Also, we find the minimum number of faculty members across the departments. The output of this query lists all departments that have more faculty members than the department with the fewest faculty members.
Let’s look at the key differences between a join and a subquery:
| Feature | Join | Subquery |
|---|---|---|
| Purpose | A join combines rows from two or more tables based on a related column | A subquery retrieves data based on the results of another query |
| Usage | It’s mostly used in the FROM clause | It’s used in SELECT, FROM, WHERE, or HAVING clauses |
| Complexity | It can be more complex for multiple table joins | It can simplify complex queries into manageable parts |
In this article, we’ve looked at joins and subqueries for retrieving data from multiple tables. Joins generally combine rows based on related columns and are efficient for large datasets. Subqueries, on the other hand, perform operations within another query and can be very useful for filtering and aggregation.
Understanding the differences and knowing when to use each can greatly enhance the efficiency and readability of queries.