Authors – All

If you have a few years of experience in the SQL ecosystem, and you're interested in sharing that experience with the community, have a look at our Contribution Guidelines.

1. Introduction

When working with large databases, it’s often necessary to interact with multiple tables to extract relevant information. Furthermore, we may need to identify which tables contain a specific column name.

In this tutorial, we’ll explore how to find the tables that include a particular column name.

2. Problem Statement

In database management, especially with SQL, we often need to locate specific information within vast amounts of data. One important task is identifying all tables that contain a particular column name. This is crucial for several reasons, such as understanding data relationships, optimizing queries, ensuring data integrity, aiding in refactoring application code, and facilitating column name changes or data migrations.

In the following sections, we’ll learn how to achieve this using SQL queries. We tested our examples on MS SQL Server 2022, PostgreSQL 14, and MySQL 8 databases. However, most methods we discuss should be available in other versions of these SQL implementations and other SQL dialects.

3. Model

To discuss the queries in this tutorial, we’ll refer to the provided database schema, which includes the necessary scripts for the tables in each database.

In this tutorial, we’ll write SQL queries to identify the tables that contain the column department_id.

4. ANSI SQL Query Implementation

In this section, we’ll utilize ANSI SQL-based queries supported by most databases. ANSI SQL provides the information_schema, a standardized set of tables that store metadata about all tables and other database objects. As a result, using information_schema, we can efficiently identify tables that contain specific columns.

Let’s write a query to find tables that contain the column department_id:

SELECT table_name
FROM information_schema.columns
WHERE column_name = 'department_id';

Running this query produces the following output:

Tables containing department_id

This query is compatible with many databases, including PostgreSQL, MySQL, MS SQL, and others.

5. Database-Specific Query Implementation

In addition to ANSI SQL, many databases offer specific methods for storing and accessing metadata. Let’s explore some of these approaches for retrieving tables with a specific column name.

5.1. MS SQL

In MS SQL, we can use the catalog tables to retrieve information about tables, columns, and other database objects. Moreover, by using the views sys.columns and sys.tables, we can search for a specific column name:

SELECT t.name AS table_name
FROM sys.tables t
JOIN sys.columns c ON t.object_id = c.object_id
WHERE c.name = 'department_id';

This query joins the system views sys.columns and sys.tables to obtain the desired result.

We can further simplify the above query:

SELECT OBJECT_NAME(object_id) AS table_name
FROM sys.columns
WHERE name = 'department_id';

In this case, we used the OBJECT_NAME() function to retrieve the table name directly. The OBJECT_NAME() function looks up the table name from the sys.objects view using the object_id value. This query eliminates the need for explicit joins, unlike the previous query.

5.2. PostgreSQL

Similar to MS SQL, PostgreSQL also provides system catalog tables to retrieve the metadata apart from using the information schema:

SELECT c.relname AS table_name
FROM pg_catalog.pg_attribute a
JOIN pg_catalog.pg_class c ON a.attrelid = c.oid
WHERE a.attname = 'department_id';

In this query, we joined the tables pg_attribute and pg_class to find the tables that contain the column department_id. We can further refine the query by adding more conditions to the WHERE clause for additional filtering:

SELECT c.relname AS table_name
FROM pg_catalog.pg_attribute a
JOIN pg_catalog.pg_class c ON a.attrelid = c.oid
WHERE a.attname = 'department_id'
AND c.relkind = 'r';

Here, we added a condition for relkind, where r refers to regular tables, excluding views and system tables.

6. Conclusion

In this article, we looked into various methods for retrieving the list of tables containing a specific column name. Initially, we utilized the information_schema, a standard approach supported by many databases following the ANSI SQL standard, to obtain the table names. Additionally, we explored database-specific methods offered by PostgreSQL and MS SQL to achieve the same objective.

While the information_schema approach is widely applicable across databases, the database-specific methods provide additional insights into various database objects that may not be accessible via the information_schema.

As always, the sample queries we discussed here are available over on GitHub.

Authors – All

If you have a few years of experience in the SQL ecosystem, and you're interested in sharing that experience with the community, have a look at our Contribution Guidelines.