Apache Kafka is the de facto standard for event streaming: millions of messages per second, durable delivery, and horizontal scalability.
Architecture and Concepts
Kafka is a distributed commit log — it persistently stores messages and allows repeated reads.
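The commit-log idea can be sketched in a few lines of plain Python (a toy stand-in, not Kafka's actual storage engine): appends get monotonically increasing offsets, and reads never consume, so any consumer can re-read from any offset.

```python
class CommitLog:
    """Toy append-only log: each append is assigned the next offset."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the new message

    def read(self, from_offset=0):
        # Reads do not remove anything: the same range can be read again.
        return self._messages[from_offset:]


log = CommitLog()
log.append('order-1')
offset = log.append('order-2')
print(log.read(0))       # ['order-1', 'order-2']
print(log.read(offset))  # ['order-2'] -- re-read from a chosen offset
```

This is exactly what a Kafka consumer does when it seeks back to an earlier offset: the data is still there, so it is simply read again.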
Concepts
- Topic — a logical channel for messages
- Partition — physical subdivision of a topic, the unit of parallelism
- Consumer Group — a set of consumers with automatic partition assignment
- Broker — a server in the cluster that stores and serves partitions
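A topic's partition count and replication factor are fixed at creation time. With the stock Kafka CLI that looks roughly like this (broker address, topic name, and counts are placeholder assumptions):

```shell
# Create 'orders' with 6 partitions, each replicated to 3 brokers
kafka-topics.sh --bootstrap-server kafka:9092 \
  --create --topic orders \
  --partitions 6 \
  --replication-factor 3
```

More partitions allow more consumers in a group to work in parallel; partitions can be added later, but never removed.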
```python
from confluent_kafka import Producer, Consumer
import json

# Produce one order event, keyed by order id (the key determines the partition)
producer = Producer({'bootstrap.servers': 'kafka:9092'})
order = {'id': 123, 'item': 'book'}  # example payload
producer.produce('orders', key=b'123', value=json.dumps(order).encode())
producer.flush()  # block until queued messages are delivered

# Consume as part of the 'processor' group, starting from the earliest offset
consumer = Consumer({
    'bootstrap.servers': 'kafka:9092',
    'group.id': 'processor',
    'auto.offset.reset': 'earliest',
})
consumer.subscribe(['orders'])
while True:
    msg = consumer.poll(1.0)  # wait up to 1 s for a message
    if msg is None or msg.error():
        continue
    process(json.loads(msg.value()))  # process() is application code
```
Best Practices
- Replication factor of at least 3, so data survives broker failures
- Idempotent producer, so retries cannot create duplicate records
- Schema Registry — schema versioning and compatibility checks
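The idempotence recommendation maps directly to producer configuration. A minimal sketch with confluent_kafka (broker address is an assumption):

```python
from confluent_kafka import Producer

# With enable.idempotence, the broker de-duplicates retried sends,
# so a network retry cannot produce the same record twice.
producer = Producer({
    'bootstrap.servers': 'kafka:9092',
    'enable.idempotence': True,
    'acks': 'all',  # wait for all in-sync replicas; implied by idempotence
})
```

This is a configuration fragment only; the produce/consume flow is unchanged from the example above.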
Summary
Kafka is the foundation of event-driven architecture: topics, partitions, and consumer groups combine into scalable real-time pipelines.