Course – LS (cat=JSON/Jackson)

Get started with Spring and Spring Boot, through the Learn Spring course:

>> CHECK OUT THE COURSE

1. Overview

Apache Kafka is an open-source, fault-tolerant, and highly scalable streaming platform. It follows a publish-subscribe architecture to stream data in real time. By writing data to Kafka topics, we can process high volumes of data with very low latency. Sometimes, we need to send JSON data to a Kafka topic for processing and analysis.

In this tutorial, we’ll learn how to stream JSON data into Kafka topics. We’ll also look at how to set up a Kafka producer and consumer for JSON data.

2. Importance of JSON Data in Kafka

Kafka treats messages as opaque byte streams, so we can send JSON data to the Kafka server just like any other format. Since many modern applications exchange data as JSON, it’s often convenient to keep messages in that format. For example, sending events as JSON is useful for tracking user activity and behavior on websites and applications in real time.

Streaming JSON data into a Kafka server helps with real-time data analysis. It also facilitates an event-driven architecture where each microservice subscribes to its relevant topics and reacts to changes in real time. With Kafka topics and JSON messages, it’s easy to deliver IoT data, communicate between microservices, and aggregate metrics.

3. Kafka Setup

To stream JSON into the Kafka server, we need to first set up the Kafka broker and Zookeeper. We can follow this tutorial to set up a full-fledged Kafka server. Now, let’s check the command to create a Kafka topic baeldung on which we’ll be producing and consuming the JSON data:

$ docker-compose exec kafka kafka-topics.sh --create --topic baeldung \
  --partitions 1 --replication-factor 1 --bootstrap-server kafka:9092

The above command creates a Kafka topic baeldung with a single partition and a replication factor of 1. We use a replication factor of 1 only because this is a demo. In real-world deployments, we’d typically use a higher replication factor, since replicating partitions across brokers supports failover and improves the availability and reliability of the data.

4. Produce Data

The Kafka producer is a core component of the Kafka ecosystem that publishes data to the Kafka server. To demonstrate, let’s look at the command to start a console producer using docker-compose:

$ docker-compose exec kafka kafka-console-producer.sh --topic baeldung \
  --broker-list kafka:9092

In the above command, we started a Kafka console producer that sends messages to the Kafka broker. To send JSON data, we need to tweak the command. Before proceeding, let’s first create a sample JSON file sampledata.json:

{
    "name": "test",
    "age": 26,
    "email": "[email protected]",
    "city": "Bucharest",
    "occupation": "Software Engineer",
    "company": "Baeldung Inc.",
    "interests": ["programming", "hiking", "reading"]
}

The above sampledata.json file contains the basic information of a user in JSON format. To send this JSON data into a Kafka topic, we’ll use jq, a powerful command-line tool for processing JSON. Let’s install jq so that we can pass this JSON data to the Kafka producer:

$ sudo apt-get install jq

The above command installs jq on a Debian-based Linux machine. Next, let’s look at the command to send the JSON data:

$ jq -rc . sampledata.json | docker-compose exec -T kafka kafka-console-producer.sh --topic baeldung --broker-list kafka:9092

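The above command is a one-liner that processes the JSON data and streams it into the Kafka topic in a Docker environment. First, jq reads sampledata.json and applies the identity filter (.), which outputs the document unchanged. The -c option prints the result in compact form on a single line, which matters because the console producer sends each input line as a separate message. The -r option writes string results as raw text without JSON quotes; since our document is an object rather than a string, it has no effect here. Finally, the -T option of docker-compose exec disables pseudo-TTY allocation so that the piped input reaches the producer.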
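
To make the effect of the -c option concrete, here’s a minimal Python sketch that approximates what jq -c . does: it parses a JSON document and re-serializes it on a single line. The function name to_single_line is our own, and the sample record is a shortened version of the one above. It only illustrates the transformation and isn’t a replacement for jq.

```python
import json


def to_single_line(json_text: str) -> str:
    """Re-serialize a JSON document on one line, roughly like `jq -c .`."""
    parsed = json.loads(json_text)
    # Compact separators drop the spaces after "," and ":";
    # ensure_ascii=False keeps non-ASCII characters unescaped.
    return json.dumps(parsed, separators=(",", ":"), ensure_ascii=False)


# Shortened, pretty-printed version of the sample record (illustrative only)
pretty = """
{
    "name": "test",
    "age": 26,
    "city": "Bucharest",
    "interests": ["programming", "hiking", "reading"]
}
"""

message = to_single_line(pretty)
print(message)
```

If we piped the pretty-printed text to the console producer as is, each of its lines would become a separate message, and none of them would be valid JSON on its own. Compacting the document first gives us exactly one message per record.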

5. Consume Data

So far, we have successfully sent the JSON data to the baeldung Kafka topic. Now, let’s look at the command to consume that data:

$ docker-compose exec kafka kafka-console-consumer.sh --topic baeldung --from-beginning --bootstrap-server kafka:9092
{"name":"test","age":26,"email":"[email protected]","city":"Bucharest","occupation":"Software Engineer","company":"Baeldung Inc.","interests":["programming","hiking","reading"]}

The above command consumes all the messages sent to the baeldung topic from the beginning, including the JSON data we sent in the previous section. The console consumer keeps running, so we can use it to monitor messages as they arrive on the topic in real time.
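
Since each consumed message arrives as one line of JSON, downstream code can parse the output line by line. The following is a minimal Python sketch, assuming the consumer output has been captured as text (for example, piped from the console consumer). The helper name parse_messages is hypothetical, and the sample line is a shortened version of the record above.

```python
import json
from typing import Iterable, Iterator


def parse_messages(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per valid line of consumer output."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            value = json.loads(line)
        except json.JSONDecodeError:
            # Not valid JSON (e.g. a log line mixed into the output): skip it
            continue
        if isinstance(value, dict):
            yield value


# Simulated consumer output: one JSON message, a blank line, and a stray line
consumed = [
    '{"name":"test","age":26,"city":"Bucharest","interests":["programming","hiking","reading"]}',
    "",
    "not json",
]

records = list(parse_messages(consumed))
for record in records:
    print(record["name"], record["city"], record["interests"])
```

To try this against live output, we could pass sys.stdin to parse_messages instead of the hard-coded list and pipe the console consumer into the script.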

6. Conclusion

In this article, we explored how to stream JSON data into a Kafka topic. First, we created a sample JSON file, and then we streamed it into the Kafka topic using the console producer. After that, we consumed that data using the console consumer.

In short, we covered the steps needed to send JSON data to a topic and read it back using a Kafka producer and consumer. Moreover, JSON makes simple schema evolution easier: because it’s schemaless, producers can add new fields without breaking consumers that ignore fields they don’t recognize.
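
The schema-evolution point can be illustrated with a small Python sketch: a consumer that reads only the fields it knows, with defaults for missing ones, keeps working when producers add fields later. The field names mirror the sample record, while the country field and the summarize helper are hypothetical additions for illustration.

```python
import json


def summarize(message: str) -> str:
    """Build a summary from a JSON message, tolerating missing or extra fields."""
    record = json.loads(message)
    name = record.get("name", "unknown")
    city = record.get("city", "unknown")
    # Fields this consumer doesn't know about are simply ignored
    return f"{name} ({city})"


old_message = '{"name":"test","city":"Bucharest"}'
# A later producer version adds a hypothetical "country" field
new_message = '{"name":"test","city":"Bucharest","country":"Romania"}'

print(summarize(old_message))
print(summarize(new_message))
```

This only covers additive changes. Renaming a field or changing its type would still require coordination between producers and consumers.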
