1. Overview

With traditional databases, we typically rely on exact keyword or basic pattern matching to implement our search functionality. While sufficient for simple applications, this approach fails to fully understand the meaning and context behind natural language queries.

Vector stores address this limitation by storing data as numeric vectors that capture its meaning. Texts with similar meanings end up close to each other in the vector space, which allows for semantic search, where relevant results are returned even if they don’t contain the exact keywords used in the query.
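
To make the idea of "closeness" concrete, here is a minimal sketch of cosine similarity, one common way to compare two vectors. The three-dimensional vectors are hand-made for illustration and are an assumption of this sketch; a real embedding model produces vectors with hundreds of dimensions, and the distance function a given vector store uses may differ:

```java
// Toy illustration only: the vectors below are made up by hand and do not
// come from a real embedding model.
class CosineSimilaritySketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // Values near 1.0 mean the vectors point in nearly the same direction.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0;
        double normA = 0;
        double normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] love = {0.9, 0.1, 0.0};
        double[] romance = {0.8, 0.2, 0.1};
        double[] invoice = {0.0, 0.1, 0.9};

        // Related concepts score higher than unrelated ones.
        System.out.println("love vs romance: " + cosine(love, romance));
        System.out.println("love vs invoice: " + cosine(love, invoice));
    }
}
```

With these toy values, the first pair scores much closer to 1 than the second, which is the property semantic search relies on.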

In this tutorial, we’ll explore how to integrate ChromaDB, an open-source vector store, with Spring AI.

To convert our text data into vectors that ChromaDB can store and search, we’ll need an embedding model. We’ll use Ollama to run an embedding model locally.

2. Dependencies

Let’s start by adding the necessary dependencies to our project’s pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-chroma-store-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>

The ChromaDB starter dependency enables us to establish a connection with our ChromaDB vector store and interact with it.

Additionally, we import the Ollama starter dependency, which we’ll use to run our embedding model.

Since the current version, 1.0.0-M6, is a milestone release, we’ll also need to add the Spring Milestones repository to our pom.xml:

<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
</repositories>

Milestone versions are published to this repository rather than to the standard Maven Central repository.

Since we’re using multiple Spring AI starters in our project, let’s also include the Spring AI Bill of Materials (BOM) in our pom.xml:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0-M6</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

With this addition, we can now remove the version tag from both of our starter dependencies.

The BOM reduces the risk of version conflicts and ensures our Spring AI dependencies are compatible with each other.

3. Setting up Local Test Environment With Testcontainers

To facilitate local development and testing, we’ll use Testcontainers to set up our ChromaDB vector store and Ollama service.

The prerequisite for running the required services via Testcontainers is an active Docker instance.

3.1. Test Dependencies

First, let’s add the necessary test dependencies to our pom.xml:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-spring-boot-testcontainers</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>chromadb</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>ollama</artifactId>
    <scope>test</scope>
</dependency>

These dependencies provide us with the necessary classes to spin up ephemeral Docker instances for both of our external services.

3.2. Defining Testcontainers Beans

Next, let’s create a @TestConfiguration class that defines our Testcontainers beans:

@TestConfiguration(proxyBeanMethods = false)
class TestcontainersConfiguration {

    @Bean
    @ServiceConnection
    public ChromaDBContainer chromaDB() {
        return new ChromaDBContainer("chromadb/chroma:0.5.20");
    }

    @Bean
    @ServiceConnection
    public OllamaContainer ollama() {
        return new OllamaContainer("ollama/ollama:0.4.5");
    }
}

We pin explicit image versions for our containers, so our test environment stays reproducible.

We also annotate our bean methods with @ServiceConnection. This dynamically registers all the properties required to set up a connection with both of our external services.

Even when not using the Testcontainers support, Spring AI automatically connects to ChromaDB and Ollama when running locally on their default ports of 8000 and 11434, respectively.

However, in production, we can override the connection details using the corresponding Spring AI properties:

spring:
  ai:
    vectorstore:
      chroma:
        client:
          host: ${CHROMADB_HOST}
          port: ${CHROMADB_PORT}
    ollama:
      base-url: ${OLLAMA_BASE_URL}

Once the connection details are configured correctly, Spring AI automatically creates beans of type VectorStore and EmbeddingModel for us, allowing us to interact with our vector store and embedding model, respectively. We’ll look at how to use these beans later in the tutorial.

Although @ServiceConnection automatically defines the necessary connection details, we’ll still need to configure a few additional properties in our application.yml file:

spring:
  ai:
    vectorstore:
      chroma:
        initialize-schema: true
    ollama:
      embedding:
        options:
          model: nomic-embed-text
      init:
        chat:
          include: false
        pull-model-strategy: WHEN_MISSING

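Here, we enable schema initialization for ChromaDB, so the collection is created if it doesn’t already exist. Then, we configure nomic-embed-text as our embedding model and instruct Spring AI to pull it through Ollama if it’s not present in our system. Since we only need an embedding model, we exclude chat models from this initialization.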

Alternatively, we can use a different embedding model from Ollama or Hugging Face, depending on our requirements.

3.3. Using Testcontainers During Development

While Testcontainers is primarily used for integration testing, we can also use it during our local development.

To achieve this, we’ll create a separate main class in our src/test/java directory:

class TestApplication {

    public static void main(String[] args) {
        SpringApplication.from(Application::main)
          .with(TestcontainersConfiguration.class)
          .run(args);
    }
}

We create a TestApplication class and, inside its main method, start our main Application class with our TestcontainersConfiguration class.

This approach lets us manage our external services locally. We can run our Spring Boot application and have it connect to the ChromaDB and Ollama containers started via Testcontainers.

4. Populating ChromaDB at Application Startup

Now that we have our local environment set up, let’s populate our ChromaDB vector store with some sample data during application startup.

4.1. Fetching Poetry Records From PoetryDB

For our demonstration, we’ll use the PoetryDB API to fetch poems.

Let’s create a PoetryFetcher utility class for this:

class PoetryFetcher {

    private static final String BASE_URL = "https://poetrydb.org/author/";
    private static final String DEFAULT_AUTHOR_NAME = "Shakespeare";

    public static List<Poem> fetch() {
        return fetch(DEFAULT_AUTHOR_NAME);
    }

    public static List<Poem> fetch(String authorName) {
        return RestClient
          .create()
          .get()
          .uri(URI.create(BASE_URL + authorName))
          .retrieve()
          .body(new ParameterizedTypeReference<>() {});
    }

}

record Poem(String title, List<String> lines) {}

We use RestClient to invoke the PoetryDB API with the specified authorName. To deserialize the API response to a list of Poem records, we use ParameterizedTypeReference without explicitly specifying the generic response type, and Java will infer the type for us.
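
The trailing {} matters here: it creates an anonymous subclass whose generic superclass is recorded in the class file, which is what lets the full type survive erasure. The following is a minimal sketch of that "type token" pattern using only the JDK. TypeRef is a hypothetical stand-in written for this illustration, not Spring's actual ParameterizedTypeReference implementation:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical stand-in for a type token class; not Spring's implementation.
abstract class TypeRef<T> {

    private final Type type;

    protected TypeRef() {
        // The anonymous subclass keeps its generic superclass,
        // for example TypeRef<List<String>>, available via reflection.
        ParameterizedType superType = (ParameterizedType) getClass().getGenericSuperclass();
        this.type = superType.getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }
}

class TypeRefSketch {

    // The diamond is inferred from the target type (here, the return type),
    // so we never spell out List<String> at the creation site.
    static TypeRef<List<String>> inferred() {
        return new TypeRef<>() {};
    }

    public static void main(String[] args) {
        // Prints the captured generic type, including its type argument.
        System.out.println(inferred().getType().getTypeName());
    }
}
```

Note that using the diamond operator with an anonymous class requires Java 9 or later.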

We also overload our fetch() method without any parameter to retrieve poems by the author Shakespeare. We’ll be using this method in our next section.
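
One caveat when calling fetch(String) with other authors: URI.create() rejects raw spaces, so a name such as "Emily Dickinson" (used here purely as an illustrative input) would throw an IllegalArgumentException. A possible workaround, sketched below with JDK classes only, is to percent-encode the name first. The authorUri() helper is hypothetical and not part of the original PoetryFetcher, and how the API treats encoded names is an assumption we'd need to verify:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class AuthorUriSketch {

    private static final String BASE_URL = "https://poetrydb.org/author/";

    // Hypothetical helper: percent-encodes the author name before building the URI.
    static URI authorUri(String authorName) {
        // URLEncoder targets form encoding and emits '+' for spaces,
        // so we replace it with "%20", the form expected in a URI path.
        String encoded = URLEncoder
          .encode(authorName, StandardCharsets.UTF_8)
          .replace("+", "%20");
        return URI.create(BASE_URL + encoded);
    }

    public static void main(String[] args) {
        System.out.println(authorUri("Shakespeare"));
        System.out.println(authorUri("Emily Dickinson"));
    }
}
```

Alternatively, RestClient accepts a URI template with variables, which handles the encoding for us.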

4.2. Storing Documents in ChromaDB Vector Store

Now, to populate our ChromaDB vector store with poems during application startup, we’ll create a VectorStoreInitializer class that implements the ApplicationRunner interface:

@Component
class VectorStoreInitializer implements ApplicationRunner {

    private final VectorStore vectorStore;

    // standard constructor

    @Override
    public void run(ApplicationArguments args) {
        List<Document> documents = PoetryFetcher
          .fetch()
          .stream()
          .map(poem -> {
              Map<String, Object> metadata = Map.of("title", poem.title());
              String content = String.join("\n", poem.lines());
              return new Document(content, metadata);
          })
          .toList();
        vectorStore.add(documents);
    }

}

In our VectorStoreInitializer, we autowire an instance of VectorStore.

Inside the run() method, we use our PoetryFetcher utility class to retrieve a list of Poem records. Then, we map each poem into a Document with the lines as content and the title as metadata.

Finally, we store all the documents in our vector store. When we invoke the add() method, Spring AI automatically converts our plaintext content into its vector representation before storing it. We don’t need to explicitly convert it using the EmbeddingModel bean.

By default, Spring AI uses SpringAiCollection as the collection name to store data in our vector store, but we can override it using the spring.ai.vectorstore.chroma.collection-name property.

5. Performing Semantic Search

With our ChromaDB vector store populated, let’s validate our semantic search functionality:

private static final int MAX_RESULTS = 3;

@ParameterizedTest
@ValueSource(strings = {"Love and Romance", "Time and Mortality", "Jealousy and Betrayal"})
void whenSearchingShakespeareTheme_thenRelevantPoemsReturned(String theme) {
    SearchRequest searchRequest = SearchRequest
      .builder()
      .query(theme)
      .topK(MAX_RESULTS)
      .build();
    List<Document> documents = vectorStore.similaritySearch(searchRequest);

    assertThat(documents)
      .hasSizeLessThanOrEqualTo(MAX_RESULTS)
      .allSatisfy(document -> {
          String title = String.valueOf(document.getMetadata().get("title"));
          assertThat(title)
            .isNotBlank();
        });
}

Here, we pass some common Shakespearean themes to our test method using @ValueSource. We then create a SearchRequest object with the theme as the query and MAX_RESULTS as the number of desired results.

Next, we call the similaritySearch() method of our vectorStore bean, with our searchRequest. Similar to the add() method of the VectorStore, Spring AI converts our query to its vector representation before querying our vector store.

The returned documents will contain poems that are semantically related to the given theme, even if they don’t contain the exact keyword.
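
Conceptually, a top-K similarity search ranks the stored vectors by their similarity to the query vector and keeps the best K. The following is a rough in-memory sketch of that idea, assuming cosine similarity as the metric. The titles and vectors are invented for illustration, and a real vector store such as ChromaDB uses indexes rather than scanning and sorting every entry:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Toy in-memory ranking; titles and vectors are hypothetical.
class TopKSearchSketch {

    static double cosine(double[] a, double[] b) {
        double dot = 0;
        double normA = 0;
        double normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the titles of the topK stored vectors most similar to the query.
    static List<String> topK(Map<String, double[]> store, double[] query, int topK) {
        return store
          .entrySet()
          .stream()
          .sorted(Comparator
            .comparingDouble((Map.Entry<String, double[]> entry) -> cosine(entry.getValue(), query))
            .reversed())
          .limit(topK)
          .map(Map.Entry::getKey)
          .toList();
    }

    public static void main(String[] args) {
        Map<String, double[]> store = Map.of(
          "love-poem", new double[] {0.9, 0.1, 0.0},
          "time-poem", new double[] {0.1, 0.9, 0.1},
          "jealousy-poem", new double[] {0.0, 0.2, 0.9});

        // Stand-in for the embedded query "Love and Romance".
        double[] query = {0.8, 0.2, 0.1};

        System.out.println(topK(store, query, 2)); // prints [love-poem, time-poem]
    }
}
```

As in our test, the result can contain fewer than K entries when the store holds fewer documents.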

6. Conclusion

In this article, we explored how to integrate the ChromaDB vector store with Spring AI.

Using Testcontainers, we started Docker containers for our ChromaDB and Ollama services, creating a local test environment.

We looked at how to populate our vector store with poems from the PoetryDB API during application startup. Then, we used common poetry themes to validate our semantic search functionality.

The code backing this article is available on GitHub.