Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Overview

We use joins and subqueries to retrieve data from multiple tables in SQL. Nevertheless, they’re used for different purposes and in various scenarios to optimize and simplify queries. Understanding their differences and knowing when to use each can significantly enhance database performance and query efficiency.

In this article, we’ll explore the definitions and types of joins and subqueries and explore the differences between the two.

We’ll provide practical examples using the Baeldung University database scheme, specifically its Department and Faculty tables.

Note that we perform all operations in the PostgreSQL database management system (DBMS).

2. Joins

A join operation combines rows from two or more tables based on a related column between them. We use a join operation to retrieve data spread across multiple tables.

There are several types of joins, each serving a different purpose and use case:

  • INNER JOIN: this join returns only the rows with matching values in both tables.
  • LEFT JOIN (or LEFT OUTER JOIN): it returns all rows from the left table and the matched rows from the right table. Furthermore, if no match is found, NULL values are returned for columns from the right table.
  • RIGHT JOIN (or RIGHT OUTER JOIN): this operation is the opposite of a left join. It returns all rows from the right table and the matched rows from the left table. Additionally, if no match is found, NULL values are returned for columns from the left table.
  • FULL JOIN  (or FULL OUTER JOIN): it returns all rows from both tables. When there’s a match between the tables, the joined row contains values from both tables. Additionally, if there’s no match, the result includes NULL values for the columns from the table without a match.

Let’s use the Faculty and Department tables from the Baeldung University database schema to illustrate a full outer join:

SELECT f.id AS faculty_id, f.name AS faculty_name, d.name AS department_name
FROM Faculty f
FULL OUTER JOIN Department d ON f.department_id = d.id;
+------------+-----------------+--------------------------------+
| faculty_id | faculty_name    | department_name                |
|------------+-----------------+--------------------------------|
| 1          | Anubha Gupta    | Electronics and Communications |
| 2          | Anubha Gupta    | Electronics and Communications |
| 3          | Anubha Gupta    | Electronics and Communications |
| 21         | Peter Pan       | Electronics and Communications |
| 22         | Peter Pan       | Electronics and Communications |
| 23         | Peter Pan       | Electronics and Communications |
| 33         | Nando de Fretas | Electronics and Communications |
| 34         | Nando de Fretas | Electronics and Communications |
...

In this query, we performed a full outer join to display all faculty members and their associated departments. Additionally, faculty members without assigned departments and departments without faculty members are included.

3. Subqueries

A subquery, also known as an inner query or nested query, is a query within another SQL query. We can use subqueries in SELECT, INSERT, UPDATE, or DELETE statements, and we can also nest them inside other subqueries.

3.1. Subqueries in SELECT Fields

We use subqueries in the SELECT statement to calculate values for the result set. In particular, it’s helpful when we perform calculations or retrieve values that depend on other tables or calculations.

For example, let’s calculate the total number of faculty members in each department using a subquery in the SELECT statement:

SELECT 
     d.name AS department_name,
     (SELECT COUNT(*) FROM Faculty f WHERE f.department_id = d.id) AS faculty_count
FROM 
     Department d;
+--------------------------------+---------------+
| department_name                | faculty_count |
|--------------------------------+---------------|
| Computer Science               | 20            |
| Electronics and Communications | 17            |
| Mechanical Engineering         | 10            |
| Civil Engineering              | 13            |
| Mathematics                    | 12            |
+--------------------------------+---------------+
SELECT 5
Time: 0.028s

In this example, the subquery (SELECT COUNT(*) FROM Faculty f WHERE f.department_id = d.id) calculates the total number of faculty members in each department.

3.2. Subqueries in the WHERE Statement

We use subqueries in the WHERE clause to filter results based on values retrieved by another query.

For example, let’s find the names of faculty members who belong to departments that have more than 17 faculty members:

SELECT 
     f.name 
FROM 
     Faculty f 
WHERE 
     f.department_id IN (SELECT d.id FROM Department d WHERE (SELECT COUNT(*) FROM Faculty f WHERE f.department_id =
  d.id) > 17);
+----------------+
| name           |
|----------------|
| AV Subramanium |
| Risa Sodi      |
| Risa Sodi      |
| Wlliam Liu     |
| Wlliam Liu     |
| Wlliam Liu     |
| Wlliam Liu     |
| Cormen Qiu     |
| Ajit Singh     |
...

3.3. Joining With Subqueries

We can also use subqueries in the JOIN clause to create complex joins based on criteria that involve multiple tables.

For example, let’s retrieve faculty members along with the total number of faculty members in their department:

SELECT 
     f.id AS faculty_id,
     f.name AS faculty_name,
     d.name AS department_name,
     dept_counts.faculty_count
FROM 
     Faculty f 
JOIN 
     Department d ON f.department_id = d.id
JOIN 
     (SELECT department_id, COUNT(*) AS faculty_count FROM Faculty GROUP BY department_id) dept_counts 
     ON f.department_id = dept_counts.department_id;
+------------+-----------------+--------------------------------+---------------+
| faculty_id | faculty_name    | department_name                | faculty_count |
|------------+-----------------+--------------------------------+---------------|
| 1          | Anubha Gupta    | Electronics and Communications | 17            |
| 2          | Anubha Gupta    | Electronics and Communications | 17            |
| 3          | Anubha Gupta    | Electronics and Communications | 17            |
| 21         | Peter Pan       | Electronics and Communications | 17            |
| 22         | Peter Pan       | Electronics and Communications | 17            |
| 23         | Peter Pan       | Electronics and Communications | 17            |
| 33         | Nando de Fretas | Electronics and Communications | 17            |
| 34         | Nando de Fretas | Electronics and Communications | 17            |
| 41         | Robert Ludloo   | Electronics and Communications | 17            |
| 42         | Robert Ludloo   | Electronics and Communications | 17            |
...

In this example, the subquery (SELECT department_id, COUNT(*) AS faculty_count FROM Faculty GROUP BY department_id) calculates the number of faculty members in each department. Furthermore, the main query then joins this result to the Faculty and Department tables.

3.4. Nested Subqueries

Nested subqueries are subqueries contained within other subqueries. Furthermore, they’re used for complex operations where multiple levels of queries are required.

For example, let’s find the names of departments that have more faculty members than the departments with the fewest faculty members:

SELECT 
     d.name 
FROM 
     Department d 
WHERE 
     (SELECT COUNT(*) FROM Faculty f WHERE f.department_id = d.id) > 
     (SELECT MIN(faculty_count) 
      FROM (SELECT COUNT(*) AS faculty_count 
            FROM Faculty 
            GROUP BY department_id) AS subquery);
+--------------------------------+
| name                           |
|--------------------------------|
| Computer Science               |
| Electronics and Communications |
| Civil Engineering              |
| Mathematics                    |
+--------------------------------+
SELECT 4
Time: 0.011s

In this query, the nested subquery calculates the number of faculty members in each department. Also, we find the minimum number of faculty members across the departments. The output of this query lists all departments that have more faculty members than the department with the fewest faculty members.

4. Differences

Let’s look at the key differences between a join and a subquery:

Feature Join Subquery
Purpose A join combines rows from two or more tables based on a related column A subquery retrieves data based on the results of another query
Usage It’s mostly used in the FROM clause It’s used in SELECT, FROM, WHERE, or HAVING clauses
Complexity It can be more complex for multiple table joins It can simplify complex queries into manageable parts

5. Conclusion

In this article, we’ve looked at joins and subqueries for retrieving data from multiple tables. Joins generally combine rows based on related columns and are efficient for large datasets. Subqueries, on the other hand, perform operations within another query and can be very useful for filtering and aggregation.

Understanding the differences and knowing when to use each can greatly enhance the efficiency and readability of queries.