What Is a TLAB or Thread-Local Allocation Buffer in Java?

1. Introduction

In this tutorial, we’ll look at Thread-Local Allocation Buffers (TLABs). We’ll see what they are, how the JVM uses them, and how we can manage them.

2. Memory Allocation in Java

Certain operations in Java allocate memory. The most obvious is the new keyword, but there are others – for example, using reflection.

Whenever we do this, the JVM must set aside some memory for the new objects on the heap. In particular, the JVM normally allocates new objects in the Eden space, which is part of the young generation.

In a single-threaded application, this is easy. Since only a single memory allocation request can happen at any time, the thread can simply grab the next block of a suitable size, and we’re done:

Example of allocation of heap memory in a single thread.
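To make "grab the next block" concrete, here’s a minimal sketch of a bump-pointer allocator. It’s only a model: the BumpAllocator class is hypothetical, and plain offsets stand in for real heap addresses, so this isn’t how the JVM is actually implemented:

```java
// Hypothetical model of a heap region; offsets stand in for real addresses.
class BumpAllocator {
    private final long capacity;
    private long top = 0; // offset of the next free byte

    BumpAllocator(long capacity) {
        this.capacity = capacity;
    }

    // Returns the start offset of the new block, or -1 if it doesn't fit.
    long allocate(long size) {
        if (size <= 0 || size > capacity - top) {
            return -1;
        }
        long start = top;
        top += size; // "bump" the pointer past the block we just handed out
        return start;
    }
}
```

With a single thread, nothing can interleave between reading top and updating it, so every call gets its own block.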

However, in a multi-threaded application, we can’t do things quite so simply. If we do, then there’s the risk that two threads will request memory at the exact same instant and will both be given the exact same block:

Example of two threads attempting to allocate memory at the same instant

To avoid this, we synchronize memory allocations so that two threads cannot request the same memory block simultaneously. However, synchronizing all memory allocations will make them essentially single-threaded, which can be a huge bottleneck in our application.
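As a rough illustration, we can model a shared heap whose allocate method is synchronized. The SharedAllocator and SharedAllocatorDemo classes are hypothetical names for this sketch, and offsets stand in for real addresses:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shared heap: every allocation takes the same lock.
class SharedAllocator {
    private long top = 0;

    synchronized long allocate(long size) {
        long start = top;
        top += size;
        return start;
    }
}

class SharedAllocatorDemo {
    // Runs several threads against one allocator and counts the distinct
    // block offsets they received.
    static int distinctBlocks(int threads, int perThread) {
        SharedAllocator heap = new SharedAllocator();
        Set<Long> starts = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    starts.add(heap.allocate(16));
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return starts.size();
    }
}
```

Because of the lock, no two threads ever receive the same offset, so the number of distinct blocks equals the number of allocations. The cost is that every allocation from every thread queues on that one lock.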

3. Thread-Local Allocation Buffers

The JVM addresses this concern using Thread-Local Allocation Buffers, or TLABs. These are areas of heap memory that are reserved for a given thread and are used only by that thread to allocate memory:

Allocation of memory from multiple threads using TLAB.

By working in this way, no synchronization is necessary since only a single thread can pull from this buffer. The buffer itself is allocated in a synchronized manner, but this is a much less frequent operation.
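The idea can be sketched in a few lines. This is a simplified model rather than the JVM’s implementation: the TlabHeap class is hypothetical, offsets stand in for addresses, the buffer size is fixed, and requests larger than a buffer are simply rejected:

```java
// Hypothetical model: each thread bumps a pointer inside its own buffer and
// only takes the shared lock when that buffer needs to be replaced.
class TlabHeap {
    private final long tlabSize;
    private long top = 0;    // shared pointer, guarded by the lock
    private int refills = 0; // how often the lock was taken

    // Per-thread state: {next free offset, end offset} of the current buffer.
    private final ThreadLocal<long[]> tlab =
        ThreadLocal.withInitial(() -> new long[] {0, 0});

    TlabHeap(long tlabSize) {
        this.tlabSize = tlabSize;
    }

    // Slow path: reserve a fresh buffer from the shared heap.
    private synchronized long refill() {
        long start = top;
        top += tlabSize;
        refills++;
        return start;
    }

    long allocate(long size) {
        if (size <= 0 || size > tlabSize) {
            throw new IllegalArgumentException("unsupported size in this sketch");
        }
        long[] buf = tlab.get();
        if (buf[1] - buf[0] < size) {
            // Not enough room left: discard the remainder and get a new buffer.
            long start = refill();
            buf[0] = start;
            buf[1] = start + tlabSize;
        }
        // Fast path: no lock, only this thread touches buf.
        long result = buf[0];
        buf[0] += size;
        return result;
    }

    synchronized int refills() {
        return refills;
    }
}
```

With a 64-byte buffer and 16-byte objects, a thread takes the lock once for every four allocations instead of once per allocation.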

Since allocating memory for objects is a relatively common occurrence, this can be a huge performance improvement. But how much, exactly? We can determine this easily enough with a simple test:

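import java.util.ArrayList;
import java.util.List;

import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

// The class name is arbitrary; any JUnit 5 test class will do.
class TlabUnitTest {

    @Test
    public void testAllocations() {
        long start = System.currentTimeMillis();

        // Keep references so the allocations can't be optimized away
        List<Object> objects = new ArrayList<>();

        for (int i = 0; i < 1_000_000; ++i) {
            objects.add(new Object());
        }

        Assertions.assertEquals(1_000_000, objects.size());

        long end = System.currentTimeMillis();
        System.out.println((end - start) + "ms");
    }
}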

This is a relatively simple test, but it does the job. We’re going to allocate memory for 1,000,000 new Object instances and record how long it takes. We can then run this a number of times, both with and without TLAB, and see what the average time is (we’ll see in section 5 how to turn TLAB off):

Graph showing the test times with and without TLAB.

We can clearly see the difference. The average time with TLAB is 33 ms, and the average without goes up to 110 ms. That’s an increase of roughly 230%, or about 3.3 times as long, just by changing this one setting.

3.1. Running out of TLAB Space

Obviously, our TLAB space is finite. So, what happens when we run out?

If our application tries to allocate space for a new object and the TLAB doesn’t have enough available, the JVM has four possible options:

1. It can allocate a new amount of TLAB space for this thread, effectively increasing the amount available.
2. It can allocate the memory for this object from outside of TLAB space.
3. It can attempt to free up some memory using the garbage collector.
4. It can fail to allocate the memory and, instead, throw an error.

Option #4 is our catastrophic case, so we want to avoid it wherever possible, but it’s an option if the other cases can’t happen.

The JVM uses a number of complicated heuristics to determine which of the other options to use, and these heuristics may change between different JVMs and different versions. However, the most important details that feed into this decision include:

The number of allocations that are likely in a period of time. If we’re likely to be allocating a lot of objects, then increasing TLAB space will be the more efficient choice. If we’re likely to be allocating very few objects, then increasing TLAB space might actually be less efficient.
The amount of memory being requested. The more memory requested, the more expensive it’ll be to allocate this outside of the TLAB space.
The amount of available memory. If the JVM has a lot of memory available, then increasing TLAB space is much easier than if the memory usage is very high.
The amount of memory contention. If the JVM has a lot of threads that each need memory, then increasing TLAB space might be much more expensive than if there are very few threads.
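To show how factors like these might combine, here’s a deliberately simplified decision sketch. Everything in it is an assumption made for illustration: the SlowPath enum, the choose method, its parameters and the wasteLimit threshold are hypothetical, and real JVM heuristics are considerably more involved:

```java
enum SlowPath { REFILL_TLAB, ALLOCATE_OUTSIDE_TLAB, RUN_GC, OUT_OF_MEMORY }

class SlowPathSketch {

    // requested:  bytes needed for the new object
    // tlabFree:   bytes still unused in the thread's current TLAB
    // wasteLimit: hypothetical threshold; discarding more than this is too wasteful
    // tlabSize:   size of a fresh TLAB
    // heapFree:   bytes still available in the shared allocation area
    static SlowPath choose(long requested, long tlabFree, long wasteLimit,
                           long tlabSize, long heapFree, boolean gcAlreadyTried) {
        // Keep the current TLAB if throwing it away would waste too much space.
        boolean keepCurrentTlab = tlabFree > wasteLimit;

        if (!keepCurrentTlab && requested <= tlabSize && heapFree >= tlabSize) {
            return SlowPath.REFILL_TLAB;
        }
        if (heapFree >= requested) {
            return SlowPath.ALLOCATE_OUTSIDE_TLAB;
        }
        // Nothing fits: try to reclaim memory once, then give up.
        return gcAlreadyTried ? SlowPath.OUT_OF_MEMORY : SlowPath.RUN_GC;
    }
}
```

For example, a 64-byte request with only 16 bytes left in the TLAB and plenty of free heap leads to a refill, while the same request with 512 bytes still unused in the TLAB is served from outside it.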

3.2. TLAB Capacity

Using TLAB seems like a fantastic way to improve performance, but there are always costs. The synchronization needed to prevent multiple threads from allocating the same memory area makes the TLAB itself relatively expensive to allocate. We might also need to wait for sufficient memory to be available to allocate from in the first place if the JVM memory usage is especially high. As such, we ideally want to do this as infrequently as possible.

However, if a thread is allocated a larger amount of memory for its TLAB space than it needs, then this memory will just sit there unused and is essentially wasted. Worse, wasting this space makes it more difficult for other threads to obtain memory for TLAB space and can make the entire application slower overall.

As such, there’s a trade-off in exactly how much space to allocate. Allocate too much, and we’re wasting space. But allocate too little, and we’ll spend more time than is desirable allocating TLAB space.

Thankfully the JVM will handle all of this for us, though we’ll soon see how we can tune it to our needs if necessary.
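Some back-of-the-envelope arithmetic makes the trade-off visible. The TlabSizing helper below is hypothetical, and it assumes objects pack perfectly into each buffer, which real workloads won’t quite achieve:

```java
class TlabSizing {

    // Number of synchronized refills needed to allocate the given number of
    // bytes, assuming no space is lost at the end of each buffer.
    static long refillsNeeded(long bytesToAllocate, long tlabSize) {
        return (bytesToAllocate + tlabSize - 1) / tlabSize;
    }

    // Upper bound on memory tied up by threads that were given a TLAB
    // but then allocate almost nothing.
    static long worstCaseIdleBytes(int idleThreads, long tlabSize) {
        return idleThreads * tlabSize;
    }
}
```

Under these assumptions, a thread allocating 16 MB needs 256 refills with 64 KB buffers but only 16 with 1 MB buffers. On the other hand, 200 mostly idle threads could tie up as much as 200 MB with 1 MB buffers, compared to 12.5 MB with 64 KB buffers.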

4. Seeing TLAB Usage

Now that we know what TLAB is and the impact it can have on our application, how can we see it in action?

Unfortunately, the jconsole tool doesn’t give any visibility into it as it does with the standard memory pools.

However, the JVM itself can output some diagnostic information. This uses the unified logging mechanism introduced in Java 9, so we must launch the JVM with the -Xlog:gc+tlab=trace flag to see this information. This will then periodically print out information about the current TLAB usage by the JVM. For example, during a GC run, we might see something like:

[0.343s][trace][gc,tlab] GC(0) TLAB: gc thread: 0x000000014000a600 [id: 10499] desired_size: 450KB slow allocs: 4  refill waste: 7208B alloc: 0.99999    22528KB refills: 42 waste  1.4% gc: 161384B slow: 59152B

This tells us that, for this particular thread:

The current TLAB size is 450 KB (desired_size).
There have been four allocations outside of TLAB since the last GC (slow allocs).

Note that the exact logging will vary between JVMs and versions.
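If we want to track these two values over time, we could pull them out of the log with a regular expression. This is only a sketch: the TlabLogLine class is hypothetical, and the pattern assumes the exact layout of the sample line above, so it would need adjusting for other JVMs or versions:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that extracts two fields from a gc+tlab trace line.
class TlabLogLine {
    private static final Pattern FIELDS = Pattern.compile(
        "desired_size:\\s*(\\d+)KB\\s+slow allocs:\\s*(\\d+)");

    final long desiredSizeKb;
    final int slowAllocs;

    private TlabLogLine(long desiredSizeKb, int slowAllocs) {
        this.desiredSizeKb = desiredSizeKb;
        this.slowAllocs = slowAllocs;
    }

    static TlabLogLine parse(String line) {
        Matcher m = FIELDS.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a recognized TLAB log line");
        }
        return new TlabLogLine(Long.parseLong(m.group(1)), Integer.parseInt(m.group(2)));
    }
}
```

Parsing the sample line gives a desiredSizeKb of 450 and a slowAllocs of 4.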

5. Tuning TLAB Settings

We’ve already seen what the impact can be from turning TLAB on and off, but what else can we do with it? There are a number of settings that we can adjust by providing JVM parameters when starting our application.

First, let’s actually see how to turn it off. This is done by passing the JVM parameter -XX:-UseTLAB. Setting this will stop the JVM from using TLAB and force it to use synchronization on every memory allocation.

We can also leave TLAB enabled but stop it from being resized by setting the JVM parameter -XX:-ResizeTLAB. In HotSpot, this keeps each thread’s TLAB at a fixed size rather than letting the JVM adapt it to how quickly the thread allocates. When a TLAB fills up, the thread is still given a new one of that same size.

We also have the ability to configure the size of TLAB. We can provide the JVM parameter -XX:TLABSize with a value to use. This defines the suggested initial size that the JVM should use for each TLAB, so it’s the size per thread to allocate. If this is set to 0 – which is the default – then the JVM will dynamically determine how much to allocate per thread based on the current state of the JVM.

We can also specify -XX:MinTLABSize to give a lower limit on what the TLAB size for each thread should be for cases where we’re allowing the JVM to dynamically determine the size. We also have -XX:MaxTLABSize as the upper limit on what the TLAB can grow to for each thread.

All of these settings have sensible defaults already, and it’s usually best to just use these, but if we find there are problems, we do have a level of control.
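To check which values are actually in effect, we can ask the running JVM. The sketch below assumes a HotSpot-based JVM, since it relies on the com.sun.management.HotSpotDiagnosticMXBean interface; the TlabFlags class itself is a hypothetical name:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class TlabFlags {

    // Returns the current value of a HotSpot flag as a string.
    // Assumes a HotSpot JVM; getVMOption throws IllegalArgumentException
    // for a flag name the JVM doesn't know.
    static String flag(String name) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UseTLAB", "ResizeTLAB", "TLABSize", "MinTLABSize"}) {
            System.out.println(name + " = " + flag(name));
        }
    }
}
```

Running this with and without a parameter such as -XX:-UseTLAB is a quick way to confirm that the setting was picked up.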

6. Summary

In this article, we’ve seen what Thread-Local Allocation Buffers are, how they’re used, and how we can manage them. Next time you have any performance issues with your application, consider if this could be something to investigate.
