Aiven for Apache Kafka is a fully managed Apache Kafka service that provides high-throughput, fault-tolerant event streaming. Deploy production-ready Kafka clusters in minutes with automated operations, monitoring, and scaling.

Overview

Apache Kafka is the industry-standard platform for building real-time data pipelines and streaming applications. Aiven for Apache Kafka takes care of all operational aspects so you can focus on building applications.

Cluster Types

Aiven offers two Kafka cluster types to match different workload requirements: Inkless and Classic.

Inkless Kafka stores topic data in cloud object storage through diskless topics, enabling elastic scaling and long-term retention without managing disk capacity. Classic Kafka stores topic data on local broker disks, with tiered storage available as an option.

Key Features:
  • Diskless topics that store data in object storage
  • Classic topics with managed remote storage
  • Elastic storage scaling
  • Cost-efficient for high-throughput workloads
  • Ideal for BYOC deployments
When to Use:
  • High-throughput workloads
  • Long-term data retention requirements
  • Workloads where storage elasticity is important
  • Deployments where cost optimization is a priority

Key Features

Tiered Storage

Store data indefinitely by moving older segments to cost-effective cloud object storage (S3, GCS, Azure Blob) while keeping recent data on fast local disks.

Kafka Connect

Managed connectors for integrating with databases, storage systems, and data platforms. Available on the Professional tier.

MirrorMaker 2

Cross-cluster replication for disaster recovery, multi-region architectures, and migration between Kafka clusters.

Schema Registry

Centralized schema management with support for Avro, JSON Schema, and Protobuf. Ensure data compatibility across producers and consumers.

Kafka REST API

HTTP interface for producing and consuming messages without native Kafka clients.
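As a sketch of the request format (assuming the standard Kafka REST Proxy embedded JSON format; the exact endpoint path and content type for your service may differ):

```python
import json

def build_produce_body(records):
    """Build a produce request body in the REST Proxy's embedded JSON format."""
    return json.dumps({"records": [{"value": r} for r in records]})

# POST this body to https://<service-host>/topics/my-topic with header
# Content-Type: application/vnd.kafka.json.v2+json
body = build_produce_body([{"event": "user_login", "user_id": 123}])
```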

Kafka Quotas

Control resource usage with quotas on throughput, request rates, and client connections.

Getting Started

Step 1: Create a Kafka Service

Choose between Inkless and Classic Kafka based on your requirements. For example, to create an Inkless service:
  1. Select Inkless as the service type
  2. Choose Aiven Cloud or BYOC deployment
  3. Provide expected ingress, egress, and retention
  4. Deploy the service
Step 2: Create a Topic

Create your first Kafka topic via the Aiven Console, CLI, or API:
avn service topic-create my-kafka-service my-topic \
  --partitions 3 \
  --replication 2 \
  --retention-ms 86400000
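The retention-ms value above is simply 24 hours expressed in milliseconds:

```python
# 24 hours -> milliseconds, matching the --retention-ms value above
retention_ms = 24 * 60 * 60 * 1000  # 86400000
```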
Step 3: Generate Sample Data

Test your Kafka service with the built-in sample data generator from the Aiven Console to verify connectivity.
Step 4: Connect Your Application

Get connection details from the service overview and configure your Kafka client.

Connection Examples

from kafka import KafkaProducer, KafkaConsumer
import json

# Producer
producer = KafkaProducer(
    bootstrap_servers='kafka-service.aivencloud.com:12345',
    security_protocol='SSL',
    ssl_cafile='ca.pem',
    ssl_certfile='service.cert',
    ssl_keyfile='service.key',
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)

# Send a message
producer.send('my-topic', {'event': 'user_login', 'user_id': 123})
producer.flush()

# Consumer
consumer = KafkaConsumer(
    'my-topic',
    bootstrap_servers='kafka-service.aivencloud.com:12345',
    security_protocol='SSL',
    ssl_cafile='ca.pem',
    ssl_certfile='service.cert',
    ssl_keyfile='service.key',
    value_deserializer=lambda m: json.loads(m.decode('utf-8')),
    auto_offset_reset='earliest',
    group_id='my-consumer-group'
)

for message in consumer:
    print(f"Received: {message.value}")

Advanced Features

Tiered Storage

Tiered storage decouples storage and compute, allowing indefinite data retention:
  • Recent data stays on fast local disks for low-latency access
  • Older segments automatically move to cloud object storage (S3, GCS, Azure)
  • Configure per-topic retention policies
  • Significant cost savings for long retention periods
  • Available on Classic Kafka (optional) and built into Inkless Kafka
Enable Tiered Storage:
avn service update my-kafka-service \
  -c kafka.tiered_storage.enabled=true

Kafka Connect

Kafka Connect provides managed source and sink connectors.

Popular Connectors:
  • JDBC (PostgreSQL, MySQL, SQL Server)
  • S3, GCS, Azure Blob Storage
  • Elasticsearch, OpenSearch
  • MongoDB, Cassandra
  • Debezium CDC connectors
Create a Connector:
avn service connector create my-kafka-service @connector-config.json

MirrorMaker 2

Replicate data between Kafka clusters for:
  • Disaster recovery
  • Multi-region architectures
  • Migration between clusters
  • Active-active deployments
Setup Replication:
avn service integration-create \
  --integration-type kafka_mirrormaker \
  --source-service source-kafka \
  --dest-service target-kafka \
  -c kafka_mirrormaker.topics=".*"
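The topics setting is a regular expression; ".*" replicates every topic. A Python sketch of the matching semantics (MirrorMaker itself evaluates Java regex; the narrower filter and topic names below are hypothetical):

```python
import re

# Hypothetical filter: replicate only order-related topics instead of ".*"
pattern = re.compile(r"orders\..*")
topics = ["orders.created", "orders.paid", "payments.settled"]
replicated = [t for t in topics if pattern.fullmatch(t)]
# replicated -> ["orders.created", "orders.paid"]
```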

Schema Registry

Schema Registry ensures data compatibility:
  • Support for Avro, JSON Schema, Protobuf
  • Schema evolution with compatibility checking
  • Schema versioning and history
  • Integration with producers and consumers
Register a Schema:
curl -X POST https://kafka-service.aivencloud.com/subjects/my-topic-value/versions \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schema": "{\"type\": \"record\", \"name\": \"User\", \"fields\": [{\"name\": \"id\", \"type\": \"int\"}]}"}'
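The -d payload above double-encodes the schema: the Avro schema is itself JSON, serialized as a string inside the request body. A sketch of building that body in Python:

```python
import json

avro_schema = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "id", "type": "int"}],
}
# The registry expects the schema as a JSON *string* inside the body,
# hence the nested json.dumps.
request_body = json.dumps({"schema": json.dumps(avro_schema)})
```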

Performance and Scaling

Horizontal Scaling:
  • Add more brokers to increase throughput
  • Distribute partitions across brokers
  • Handle more concurrent connections
Vertical Scaling:
  • Upgrade to larger instance types
  • Increase CPU and memory per broker
  • Improve single-partition throughput
Partition Strategy:
  • More partitions = higher parallelism
  • Balance partition count with consumer group size
  • Consider replication factor for durability
  • Typical: 3-10 partitions per broker
Consumer Scaling:
  • Each consumer in a group processes different partitions
  • Scale consumers up to the number of partitions
  • Monitor consumer lag to identify bottlenecks
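A common way to size partition count is to divide target throughput by measured per-partition throughput (the figures below are hypothetical):

```python
import math

target_mb_per_s = 120        # hypothetical target ingress rate
per_partition_mb_per_s = 15  # hypothetical measured per-partition throughput
partitions = math.ceil(target_mb_per_s / per_partition_mb_per_s)
# partitions -> 8; spread across 3 brokers, that is roughly 3 per broker
```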

Monitoring and Operations

Key Metrics to Monitor

  • Throughput: Messages per second in/out
  • Latency: End-to-end message delivery time
  • Consumer Lag: How far behind consumers are
  • Disk Usage: Local and remote storage consumption
  • Replication: Under-replicated partitions
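Consumer lag is the difference between the log end offset and the group's committed offset, per partition. A minimal sketch of the calculation (the offset values are hypothetical):

```python
def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag: log end offset minus the group's committed offset."""
    return {
        tp: end_offsets[tp] - committed_offsets.get(tp, 0)
        for tp in end_offsets
    }

lag = consumer_lag(
    {("my-topic", 0): 1500, ("my-topic", 1): 900},
    {("my-topic", 0): 1400, ("my-topic", 1): 900},
)
# lag -> {("my-topic", 0): 100, ("my-topic", 1): 0}
```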

Integration with Observability Tools

avn service integration-create \
  --integration-type metrics \
  --source-service my-kafka-service \
  --dest-service my-grafana-service

Security

Authentication

  • SASL/SSL authentication
  • Certificate-based auth
  • ACL-based authorization
  • User and permission management

Encryption

  • TLS encryption in transit
  • Encryption at rest
  • Separate encryption keys per service

Network Security

  • VPC peering
  • AWS PrivateLink
  • IP allowlisting
  • Private connectivity options

Compliance

  • ISO 27001:2013
  • SOC 2 Type II
  • GDPR compliant
  • HIPAA available

Use Cases

Event-Driven Microservices

Build microservices that communicate through events:
  • Decouple services with event streams
  • Enable real-time processing
  • Maintain event history
  • Support event sourcing patterns
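Event sourcing rebuilds current state by replaying the event history that Kafka retains. A minimal sketch with hypothetical account events:

```python
from functools import reduce

def apply_event(state, event):
    """Fold one event into the current account state."""
    if event["type"] == "deposit":
        return {**state, "balance": state["balance"] + event["amount"]}
    if event["type"] == "withdrawal":
        return {**state, "balance": state["balance"] - event["amount"]}
    return state  # ignore unknown event types

events = [
    {"type": "deposit", "amount": 100},
    {"type": "withdrawal", "amount": 30},
]
state = reduce(apply_event, events, {"balance": 0})
# state -> {"balance": 70}
```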

Best Practices

  • Use meaningful topic names
  • Plan partition count for expected throughput
  • Set appropriate retention based on use case
  • Enable compression (snappy or lz4)
  • Consider topic naming conventions
  • Set acks=all for durability
  • Enable idempotence for exactly-once semantics
  • Batch messages for throughput
  • Implement proper error handling
  • Use async sends with callbacks
  • Choose appropriate consumer group IDs
  • Set proper auto.offset.reset
  • Monitor consumer lag
  • Implement graceful shutdown
  • Handle rebalancing properly
  • Monitor key metrics continuously
  • Set up alerts for critical issues
  • Plan capacity for peak loads
  • Test disaster recovery procedures
  • Document runbooks for common operations
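The producer-side practices above map to standard Kafka client properties (shown with Java-client property names; check your client library for its equivalents):

```python
# Standard Kafka producer properties reflecting the practices above.
producer_props = {
    "acks": "all",                 # wait for all in-sync replicas (durability)
    "enable.idempotence": "true",  # avoid duplicates on retry (exactly-once)
    "compression.type": "lz4",     # or "snappy"
    "linger.ms": "20",             # small delay so messages batch up
    "batch.size": str(64 * 1024),  # per-partition batch buffer, in bytes
}
```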

Related Services

Apache Flink

Stream processing on Kafka data

PostgreSQL

Sink Kafka data to PostgreSQL

OpenSearch

Search and analyze Kafka logs

ClickHouse

Real-time analytics on streaming data

Next Steps

Free Tier Available: Try Aiven for Apache Kafka with no payment method required. Perfect for development and testing.
