SQL Injection and How to Prevent It?

Azure Spring Apps is a fully managed service from Microsoft (built in collaboration with VMware), focused on building and deploying Spring Boot applications on Azure Cloud without worrying about Kubernetes.

The Enterprise plan comes with some interesting features, such as commercial Spring runtime support, a 99.95% SLA and some deep discounts (up to 47%) when you are ready for production.

>> Learn more and deploy your first Spring Boot app to Azure.

And, you can participate in a very quick (1 minute) paid user research from the Java on Azure product team.

Slow MySQL query performance is all too common. Of course it is. A good way to go is, naturally, a dedicated profiler that actually understands the ins and outs of MySQL.

The Jet Profiler was built for MySQL only, so it can do things like real-time query performance, focus on most used tables or most frequent queries, quickly identify performance issues and basically help you optimize your queries.

Critically, it has very minimal impact on your server's performance, with most of the profiling work done separately - so it needs no server changes, agents or separate services.

Basically, you install the desktop application, connect to your MySQL server, hit the record button, and you'll have results within minutes:

>> Try out the Profiler

Accelerate Your Jakarta EE Development with Payara Server!

With best-in-class guides and documentation, Payara essentially simplifies deployment to diverse infrastructures.

Beyond that, it provides intelligent insights and actions to optimize Jakarta EE applications.

The goal is to apply an opinionated approach to get to what's essential for mission-critical applications - really solid scalability, availability, security, and long-term support:

>> Download and Explore the Guide (to learn more)

The AI Assistant to boost Boost your productivity writing unit tests - Machinet AI.

AI is all the rage these days, but for very good reason. The highly practical coding companion, you'll get the power of AI-assisted coding and automated unit test generation.
Machinet's Unit Test AI Agent utilizes your own project context to create meaningful unit tests that intelligently aligns with the behavior of the code.
And, the AI Chat crafts code and fixes errors with ease, like a helpful sidekick.

Simplify Your Coding Journey with Machinet AI:

>> Install Machinet AI in your IntelliJ

Looking for the ideal Linux distro for running modern Spring apps in the cloud?

Meet Alpaquita Linux: lightweight, secure, and powerful enough to handle heavy workloads.

This distro is specifically designed for running Java apps. It builds upon Alpine and features significant enhancements to excel in high-density container environments while meeting enterprise-grade security standards.

Specifically, the container image size is ~30% smaller than standard options, and it consumes up to 30% less RAM:

>> Try Alpaquita Containers now.

DbSchema is a super-flexible database designer, which can take you from designing the DB with your team all the way to safely deploying the schema.

The way it does all of that is by using a design model, a database-independent image of the schema, which can be shared in a team using GIT and compared or deployed on to any database.

And, of course, it can be heavily visual, allowing you to interact with the database using diagrams, visually compose queries, explore the data, generate random data, import data or build HTML5 database reports.

>> Take a look at DBSchema

Slow MySQL query performance is all too common. Of course it is. A good way to go is, naturally, a dedicated profiler that actually understands the ins and outs of MySQL.

Critically, it has very minimal impact on your server's performance, with most of the profiling work done separately - so it needs no server changes, agents or separate services.

Basically, you install the desktop application, connect to your MySQL server, hit the record button, and you'll have results within minutes:

>> Try out the Profiler

DbSchema is a super-flexible database designer, which can take you from designing the DB with your team all the way to safely deploying the schema.

The way it does all of that is by using a design model, a database-independent image of the schema, which can be shared in a team using GIT and compared or deployed on to any database.

>> Take a look at DBSchema

1. Introduction

Despite being one of the best-known vulnerabilities, SQL Injection continues to rank on the top spot of the infamous OWASP Top 10’s list – now part of the more general Injection class.

In this tutorial, we’ll explore common coding mistakes in Java that lead to a vulnerable application and how to avoid them using the APIs available in the JVM’s standard runtime library. We’ll also cover what protections we can get out of ORMs like JPA, Hibernate and others and which blind spots we’ll still have to worry about.

2. How Applications Become Vulnerable to SQL Injection?

Injection attacks work because, for many applications, the only way to execute a given computation is to dynamically generate code that is in turn run by another system or component. If in the process of generating this code we use untrusted data without proper sanitization, we leave an open door for hackers to exploit.

This statement may sound a bit abstract, so let’s take look at how this happens in practice with a textbook example:

public List<AccountDTO>
  unsafeFindAccountsByCustomerId(String customerId)
  throws SQLException {
    // UNSAFE !!! DON'T DO THIS !!!
    String sql = "select "
      + "customer_id,acc_number,branch_id,balance "
      + "from Accounts where customer_id = '"
      + customerId 
      + "'";
    Connection c = dataSource.getConnection();
    ResultSet rs = c.createStatement().executeQuery(sql);
    // ...
}

The problem with this code is obvious: we’ve put the customerId‘s value into the query with no validation at all. Nothing bad will happen if we’re sure that this value will only come from trusted sources, but can we?

Let’s imagine that this function is used in a REST API implementation for an account resource. Exploiting this code is trivial: all we have to do is to send a value that, when concatenated with the fixed part of the query, change its intended behavior:

curl -X GET \
  'http://localhost:8080/accounts?customerId=abc%27%20or%20%271%27=%271' \

Assuming the customerId parameter value goes unchecked until it reaches our function, here’s what we’d receive:

abc' or '1' = '1

When we join this value with the fixed part, we get the final SQL statement that will be executed:

select customer_id, acc_number,branch_id, balance
  from Accounts where customerId = 'abc' or '1' = '1'

Probably not what we’ve wanted…

A smart developer (aren’t we all?) would now be thinking: “That’s silly! I’d never use string concatenation to build a query like this”.

Not so fast… This canonical example is silly indeed but there are situations where we might still need to do it:

Complex queries with dynamic search criteria: adding UNION clauses depending on user-supplied criteria
Dynamic grouping or ordering: REST APIs used as a backend to a GUI data table

2.1. I’m Using JPA. I’m Safe, Right?

This is a common misconception. JPA and other ORMs relieves us from creating hand-coded SQL statements, but they won’t prevent us from writing vulnerable code.

Let’s see how the JPA version of the previous example looks:

public List<AccountDTO> unsafeJpaFindAccountsByCustomerId(String customerId) {    
    String jql = "from Account where customerId = '" + customerId + "'";        
    TypedQuery<Account> q = em.createQuery(jql, Account.class);        
    return q.getResultList()
      .stream()
      .map(this::toAccountDTO)
      .collect(Collectors.toList());        
}

The same issue we’ve pointed before is also present here: we’re using unvalidated input to create a JPA query, so we’re exposed to the same kind of exploit here.

3. Prevention Techniques

Now that we know what a SQL injection is, let’s see how we can protect our code from this kind of attack. Here we’re focusing on a couple of very effective techniques available in Java and other JVM languages, but similar concepts are available to other environments, such as PHP, .Net, Ruby and so forth.

For those looking for a complete list of available techniques, including database-specific ones, the OWASP Project maintains a SQL Injection Prevention Cheat Sheet, which is a good place to learn more about the subject.

3.1. Parameterized Queries

This technique consists of using prepared statements with the question mark placeholder (“?”) in our queries whenever we need to insert a user-supplied value. This is very effective and, unless there’s a bug in the JDBC driver’s implementation, immune to exploits.

Let’s rewrite our example function to use this technique:

public List<AccountDTO> safeFindAccountsByCustomerId(String customerId)
  throws Exception {
    
    String sql = "select "
      + "customer_id, acc_number, branch_id, balance from Accounts"
      + "where customer_id = ?";
    
    Connection c = dataSource.getConnection();
    PreparedStatement p = c.prepareStatement(sql);
    p.setString(1, customerId);
    ResultSet rs = p.executeQuery(sql)); 
    // omitted - process rows and return an account list
}

Here we’ve used the prepareStatement() method available in the Connection instance to get a PreparedStatement. This interface extends the regular Statement interface with several methods that allow us to safely insert user-supplied values in a query before executing it.

For JPA, we have a similar feature:

String jql = "from Account where customerId = :customerId";
TypedQuery<Account> q = em.createQuery(jql, Account.class)
  .setParameter("customerId", customerId);
// Execute query and return mapped results (omitted)

When running this code under Spring Boot, we can set the property logging.level.sql to DEBUG and see what query is actually built in order to execute this operation:

// Note: Output formatted to fit screen
[DEBUG][SQL] select
  account0_.id as id1_0_,
  account0_.acc_number as acc_numb2_0_,
  account0_.balance as balance3_0_,
  account0_.branch_id as branch_i4_0_,
  account0_.customer_id as customer5_0_ 
from accounts account0_ 
where account0_.customer_id=?

As expected, the ORM layer creates a prepared statement using a placeholder for the customerId parameter. This is the same we’ve done in the plain JDBC case – but with a few statements less, which is nice.

As a bonus, this approach usually results in a better performing query, since most databases can cache the query plan associated with a prepared statement.

Please note that this approach only works for placeholders used as values. For instance, we can’t use placeholders to dynamically change the name of a table:

// This WILL NOT WORK !!!
PreparedStatement p = c.prepareStatement("select count(*) from ?");
p.setString(1, tableName);

Here, JPA won’t help either:

// This WILL NOT WORK EITHER !!!
String jql = "select count(*) from :tableName";
TypedQuery q = em.createQuery(jql,Long.class)
  .setParameter("tableName", tableName);
return q.getSingleResult();

In both cases, we’ll get a runtime error.

The main reason behind this is the very nature of a prepared statement: database servers use them to cache the query plan required to pull the result set, which usually is the same for any possible value. This is not true for table names and other constructs available in the SQL language such as columns used in an order by clause.

3.2. JPA Criteria API

Since explicit JQL query building is the main source of SQL Injections, we should favor the use of the JPA’s Query API, when possible.

For a quick primer on this API, please refer to the article on Hibernate Criteria queries. Also worth reading is our article about JPA Metamodel, which shows how to generate metamodel classes that will help us to get rid of string constants used for column names – and the runtime bugs that arise when they change.

Let’s rewrite our JPA query method to use the Criteria API:

CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Account> cq = cb.createQuery(Account.class);
Root<Account> root = cq.from(Account.class);
cq.select(root).where(cb.equal(root.get(Account_.customerId), customerId));

TypedQuery<Account> q = em.createQuery(cq);
// Execute query and return mapped results (omitted)

Here, we’ve used more code lines to get the same result, but the upside is that now we don’t have to worry about JQL syntax.

Another important point: despite its verbosity, the Criteria API makes creating complex query services more straightforward and safer. For a complete example that shows how to do it in practice, please take a look at the approach used by JHipster-generated applications.

3.3. User Data Sanitization

Data Sanitization is a technique of applying a filter to user supplied-data so it can be safely used by other parts of our application. A filter’s implementation may vary a lot, but we can generally classify them in two types: whitelists and blacklists.

Blacklists, which consist of filters that try to identify an invalid pattern, are usually of little value in the context of SQL Injection prevention – but not for the detection! More on this later.

Whitelists, on the other hand, work particularly well when we can define exactly what is a valid input.

Let’s enhance our safeFindAccountsByCustomerId method so now the caller can also specify the column used to sort the result set. Since we know the set of possible columns, we can implement a whitelist using a simple set and use it to sanitize the received parameter:

private static final Set<String> VALID_COLUMNS_FOR_ORDER_BY
  = Collections.unmodifiableSet(Stream
      .of("acc_number","branch_id","balance")
      .collect(Collectors.toCollection(HashSet::new)));

public List<AccountDTO> safeFindAccountsByCustomerId(
  String customerId,
  String orderBy) throws Exception { 
    String sql = "select "
      + "customer_id,acc_number,branch_id,balance from Accounts"
      + "where customer_id = ? ";
    if (VALID_COLUMNS_FOR_ORDER_BY.contains(orderBy)) {
        sql = sql + " order by " + orderBy;
    } else {
        throw new IllegalArgumentException("Nice try!");
    }
    Connection c = dataSource.getConnection();
    PreparedStatement p = c.prepareStatement(sql);
    p.setString(1,customerId);
    // ... result set processing omitted
}

Here, we’re combining the prepared statement approach and a whitelist used to sanitize the orderBy argument. The final result is a safe string with the final SQL statement. In this simple example, we’re using a static set, but we could also have used database metadata functions to create it.

We can use the same approach for JPA, also taking advantage of the Criteria API and Metadata to avoid using String constants in our code:

// Map of valid JPA columns for sorting
final Map<String,SingularAttribute<Account,?>> VALID_JPA_COLUMNS_FOR_ORDER_BY = Stream.of(
  new AbstractMap.SimpleEntry<>(Account_.ACC_NUMBER, Account_.accNumber),
  new AbstractMap.SimpleEntry<>(Account_.BRANCH_ID, Account_.branchId),
  new AbstractMap.SimpleEntry<>(Account_.BALANCE, Account_.balance))
  .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

SingularAttribute<Account,?> orderByAttribute = VALID_JPA_COLUMNS_FOR_ORDER_BY.get(orderBy);
if (orderByAttribute == null) {
    throw new IllegalArgumentException("Nice try!");
}

CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Account> cq = cb.createQuery(Account.class);
Root<Account> root = cq.from(Account.class);
cq.select(root)
  .where(cb.equal(root.get(Account_.customerId), customerId))
  .orderBy(cb.asc(root.get(orderByAttribute)));

TypedQuery<Account> q = em.createQuery(cq);
// Execute query and return mapped results (omitted)

This code has the same basic structure as in the plain JDBC. First, we use a whitelist to sanitize the column name, then we proceed to create a CriteriaQuery to fetch the records from the database.

3.4. Are We Safe Now?

Let’s assume that we’ve used parameterized queries and/or whitelists everywhere. Can we now go to our manager and guarantee we’re safe?

Well… not so fast. Without even considering Turing’s halting problem, there are other aspects we must consider:

Stored Procedures: These are also prone to SQL Injection issues; whenever possible please apply sanitation even to values that will be sent to the database via prepared statements
Triggers: Same issue as with procedure calls, but even more insidious because sometimes we have no idea they’re there…
Insecure Direct Object References: Even if our application is SQL-Injection free, there’s still a risk that associated with this vulnerability category – the main point here is related to different ways an attacker can trick the application, so it returns records he or she was not supposed to have access to – there’s a good cheat sheet on this topic available at OWASP’s GitHub repository

In short, our best option here is caution. Many organizations nowadays use a “red team” exactly for this. Let them do their job, which is exactly to find any remaining vulnerabilities.

4. Damage Control Techniques

As a good security practice, we should always implement multiple defense layers – a concept known as defense in depth. The main idea is that even if we’re unable to find all possible vulnerabilities in our code – a common scenario when dealing with legacy systems – we should at least try to limit the damage an attack would inflict.

Of course, this would be a topic for a whole article or even a book but let’s name a few measures:

Apply the principle of least privilege: Restrict as much as possible the privileges of the account used to access the database
Use database-specific methods available in order to add an additional protection layer; for example, the H2 Database has a session-level option that disables all literal values on SQL Queries
Use short-lived credentials: Make the application rotate database credentials often; a good way to implement this is by using Spring Cloud Vault
Log everything: If the application stores customer data, this is a must; there are many solutions available that integrate directly to the database or work as a proxy, so in case of an attack we can at least assess the damage
Use WAFs or similar intrusion detection solutions: those are the typical blacklist examples – usually, they come with a sizeable database of known attack signatures and will trigger a programmable action upon detection. Some also include in-JVM agents that can detect intrusions by applying some instrumentation – the main advantage of this approach is that an eventual vulnerability becomes much easier to fix since we’ll have a full stack trace available.

5. Conclusion

In this article, we’ve covered SQL Injection vulnerabilities in Java applications – a very serious threat to any organization that depends on data for their business – and how to prevent them using simple techniques.

As usual, full code for this article is available on Github.

SQL Injection and How to Prevent It?

1. Introduction

2. How Applications Become Vulnerable to SQL Injection?

2.1. I’m Using JPA. I’m Safe, Right?

3. Prevention Techniques

3.1. Parameterized Queries

3.2. JPA Criteria API

3.3. User Data Sanitization

3.4. Are We Safe Now?

4. Damage Control Techniques

5. Conclusion

I just announced the new Learn Spring Security course, including the full material focused on the new OAuth2 stack in Spring Security:

Get started with Spring Data JPA through the reference Learn Spring Data JPA course:

REST with Spring

Learn Spring Security ▼▲

Learn Spring Security Core

Learn Spring Security OAuth

Learn Spring

Learn Spring Data JPA

Persistence

REST

Security

Full Archive

Baeldung Ebooks

About Baeldung

Write for Baeldung

1. Introduction

2. How Applications Become Vulnerable to SQL Injection?

2.1. I’m Using JPA. I’m Safe, Right?

3. Prevention Techniques

3.1. Parameterized Queries

3.2. JPA Criteria API

3.3. User Data Sanitization

3.4. Are We Safe Now?

4. Damage Control Techniques

5. Conclusion

I just announced the new Learn Spring Security course, including the full material focused on the new OAuth2 stack in Spring Security:

Get started with Spring Data JPA through the reference Learn Spring Data JPA course: