Simplifying Data Pipelines with Apache Kafka


When you hear the terms producer, consumer, topic, broker, and cluster used together to describe a messaging system, something is brewing in the Kafka pipelines. Get connected and learn what that is and what it means!

Many Big Data use cases have one thing in common: the use of Apache Kafka somewhere in the mix. Whether this distributed, partitioned, replicated commit log service is being used for messaging, website activity tracking, stream processing, or something else, there’s no denying it is a hot technology. In this course, you will learn how Kafka is used in the real world, along with its architecture and components. You will quickly get up and running, producing and consuming messages using both the command line tools and the Java APIs. You will also get hands-on experience connecting Kafka to Spark and working with Kafka Connect.

Course Syllabus

Lesson 1 – Introduction to Apache Kafka

  • What Kafka is and why it was created
  • The Kafka Architecture
  • The main components of Kafka
  • Some of the use cases for Kafka

Lesson 2 – Kafka Command Line

  • The contents of Kafka’s /bin directory
  • How to start and stop Kafka
  • How to create new topics
  • How to use Kafka command line tools to produce and consume messages

Lesson 3 – Kafka Producer Java API

  • The Kafka producer client
  • Some of the KafkaProducer configuration settings and what they do
  • How to create a Kafka producer using the Java API and send messages both synchronously and asynchronously

Lesson 4 – Kafka Consumer Java API

  • The Kafka consumer client
  • Some of the KafkaConsumer configuration settings and what they do
  • How to create a Kafka consumer using the Java API

Lesson 5 – Kafka Connect and Spark Streaming

  • Kafka Connect and how to use a pre-built connector
  • Some of the components of Kafka Connect
  • How to use Kafka and Spark Streaming together


Prerequisites

  • Have taken the Hadoop 101 course.

Recommended skills prior to taking this course

  • Basic understanding of Apache Hadoop and Big Data.
  • Basic Linux operating system knowledge.
  • Basic understanding of the Scala, Python, R, or Java programming languages.