Skip to content

Deployment

Karafka is currently being used in production with the following deployment methods:

Since the only thing that is long-running is the Karafka server, it shouldn't be hard to make it work with other deployment and CD tools.

systemd (+ Capistrano)

You can easily manage Karafka applications with systemd. Here's an example .service file that you can use.

# Move to /lib/systemd/system/karafka.service
# Run: systemctl enable karafka

[Unit]
Description=karafka
After=syslog.target network.target

[Service]
Type=simple

WorkingDirectory=/opt/current
ExecStart=/bin/bash -lc 'bundle exec karafka server'
User=deploy
Group=deploy
UMask=0002

RestartSec=1
Restart=on-failure

# output goes to /var/log/syslog
StandardOutput=syslog
StandardError=syslog

# This will default to "bundler" if we don't specify it
SyslogIdentifier=karafka

[Install]
WantedBy=multi-user.target

If you want to use systemd based solution together with Capistrano, you don't need the capistrano-karafka gem. Instead, you can use this simple Capistrano .cap file:

# frozen_string_literal: true

after 'deploy:starting', 'karafka:stop'
after 'deploy:published', 'karafka:start'
after 'deploy:failed', 'karafka:restart'

namespace :karafka do
  task :start do
    on roles(:app) do
      execute :sudo, :systemctl, :start, 'karafka'
    end
  end

  task :stop do
    on roles(:app) do
      execute :sudo, :systemctl, :stop, 'karafka'
    end
  end

  task :restart do
    on roles(:app) do
      execute :sudo, :systemctl, :restart, 'karafka'
    end
  end

  task :status do
    on roles(:app) do
      execute :sudo, :systemctl, :status, 'karafka'
    end
  end
end

If you need to run several processes of a given type, please refer to template unit files.

Docker

Karafka can be dockerized as any other Ruby/Rails app. To execute karafka server command in your Docker container, just put this into your Dockerfile:

ENV KARAFKA_ENV production
CMD bundle exec karafka server

AWS + MSK (Fully Managed Apache Kafka)

First of all, it is worth pointing out that Karafka, similar to librdkafka does not support SASL mechanism for AWS MSK IAM that allows Kafka clients to handle authentication and authorization with MSK clusters through AWS IAM. This mechanism is a proprietary idea that is not part of Kafka.

Karafka does, however, support standard SASL + SSL mechanisms. Please follow the below instructions for both cluster initialization and Karafka configuration.

AWS MSK cluster setup

  1. Navigate to the AWS MSK page and press the Create cluster button.
  2. Select Custom create and Provisioned settings.

  1. Use custom config and set auto.create.topics.enable to true unless you want to create topics using Kafka API. You can change it later, and in general, it is recommended to disallow auto-topic creation (typos, etc.), but this can be useful for debugging.

  1. Setup your VPC and networking details.
  2. Make sure that you disable the Unauthenticated access option. With it enabled, there won't be any authentication beyond those imposed by your security groups and VPC.
  3. Disable IAM role-based authentication.
  4. Enable SASL/SCRAM authentication

  1. Provision your cluster.
  2. Make sure your cluster is accessible from your machines. You can test it by using the AWS VPC Reachability Analyzer.

  1. Visit your cluster Properties page and copy the Endpoints addresses.

  1. Log in to any of your machines and run a telnet session to any of the brokers:
telnet your-broker.kafka.us-east-1.amazonaws.com 9096

Trying 172.31.22.230...
Connected to your-broker.kafka.us-east-1.amazonaws.com.
Escape character is '^]'.
^Connection closed by foreign host.

If you can connect, your settings are correct, and your cluster is visible from your instance.

  1. Go to the AWS Secret Manager and create a key starting with AmazonMSK_ prefix. Select Other type of secret and Plaintext and provide the following value inside of the text field:

  1. In the Encryption key section, press the Add new key.

  1. Create a Symmetric key with Encrypt and decrypt as a usage pattern.

  1. Select your key in the Encryption key section and press Next.
  2. Provide a secret name and description and press Next until you reach the Store button.
  3. Store your secret.
  4. Go back to the AWS MSK and select your cluster.
  5. Navigate to the Associated secrets from AWS Secrets Manager section and press Associate secrets

  1. Press the Choose secrets and select the previously created secret.

  1. Press Associate secrets. It will take AWS a while to do it.
  2. Congratulations, you just configured everything needed to make it work with Karafka.

Karafka configuration for AWS MSK SASL + SSL

Provide the following details to the kafka section:

config.kafka = {
  'bootstrap.servers': 'yourcluster-broker1.amazonaws.com:9096,yourcluster-broker2.amazonaws.com:9096',
  'security.protocol': 'SASL_SSL',
  'sasl.username': 'username',
  'sasl.password': 'password',
  'sasl.mechanisms': 'SCRAM-SHA-512'
}

After that, you should be good to go.

Troubleshooting AWS MSK

Local: Authentication failure

ERROR -- : rdkafka: [thrd:sasl_ssl://broker1.kafka.us-east-1.amazonaws.]:
sasl_ssl://broker1.us-east-1.amazonaws.com:9096/bootstrap: SASL authentication error:
Authentication failed during authentication due to invalid credentials with SASL mechanism SCRAM-SHA-512
(after 312ms in state AUTH_REQ, 1 identical error(s) suppressed)

ERROR -- : librdkafka internal error occurred: Local: Authentication failure (authentication)

It may mean two things:

  • Your credentials are wrong
  • AWS MSK did not yet refresh its allowed keys, and you need to wait. Despite AWS reporting cluster as Active with no pending changes, it may take a few minutes for the credentials to start working.

Connection setup timed out in state CONNECT

rdkafka: [thrd:sasl_ssl://broker1.kafka.us-east-1.amazonaws.]:
sasl_ssl://broker1.us-east-1.amazonaws.com:9092/bootstrap:
Connection setup timed out in state CONNECT (after 30037ms in state CONNECT)

This means Kafka is unreachable. Check your brokers' addresses and ensure you use a proper port: 9096 with SSL or 9092 when plaintext. Also, make sure your instance can access AWS MSK at all.

Connection failures and timeouts

Please make sure that your instances can reach Kafka. Keep in mind that security group updates can have a certain lag in propagation.

Rdkafka::RdkafkaError (Broker: Invalid replication factor (invalid_replication_factor))

Please make sure your custom setting default.replication.factor value matches what you have declared as Number of zones in the Brokers section:

Rdkafka::RdkafkaError: Broker: Topic authorization failed (topic_authorization_failed)

This error occurs in case you enabled Kafka ACL but did not grant proper ACL permissions to your users. It often happens when you make your AWS MSK public.

Please note that allow.everyone.if.no.acl.found false superseeds auto.create.topics.enable. This means that despite auto.create.topics.enable being set to true, you will not be able to auto-create topics as the ACL will block this.

We recommend creating all the needed topics before making the cluster public and assigning proper permissions via Kafka ACL.

If you want to verify that this is indeed an ACL issue, try running ::Karafka::Admin.cluster_info. If you get cluster info and no errors, you can connect to the cluster, but ACL blocks any usage.

::Karafka::Admin.cluster_info =>
#<Rdkafka::Metadata:0x00007fea8e3a43c0                                           
 @brokers=[{:broker_id=>1001, :broker_name=>"your-kafka-host", :broker_port=>9092}],   
 @topics=[]
>

You can also use this ACL command to give all operations access for the brokers on all the topics to a given user:

./bin/kafka-acls.sh \
  --authorizer-properties zookeeper.connect=<ZOOKEEPER_CONNECTION_STRING> \
  --add \
  --allow-principal User:<USER_NAME> \
  --allow-host=* \
  --operation All \
  --topic=* \
  --group=*

Note: The above command must be run from a client machine with Java + Kafka installation, and the machine should also be able to communicate with the zookeeper nodes.

Heroku

Karafka works with the Heroku Kafka add-on, but it requires some extra configuration and understanding of how the Heroku Kafka add-on works.

Details about how Kafka for Heroku works can also be found here:

Heroku Kafka prefix convention

Note: This section only applies to the Multi-Tenant add-on mode.

All Kafka Basic topics and consumer groups begin with a unique prefix associated with your add-on. This prefix is accessible via the KAFKA_PREFIX environment variable.

That means that in the multi-tenant mode, you must remember always to prefix all the topic names and all the consumer group names with the KAFKA_PREFIX environment variable value.

To make it work you need to follow few steps:

  1. Use a consumer mapper that will inject the Heroku Kafka prefix into each consumer group id automatically.
class KarafkaApp < Karafka::App
  setup do |config|
    # other config options...

    # Inject the prefix automatically to every consumer group
    config.consumer_mapper = ->(raw_group_name) { "#{ENV['KAFKA_PREFIX']}#{raw_group_name}" }
  end
end
  1. Create all the consumer groups before using them via the Heroku CLI.
heroku kafka:consumer-groups:create CONSUMER_GROUP_NAME

Note: The value of KAFKA_PREFIX typically is like smoothboulder-1234. which would make the default consumer group in Karafka smoothboulder-1234.app when used with the mapper defined above. Kafka itself does not need to know the prefix when creating the consumer group.

This means that the Heroku CLI command needs to look as follows:

heroku kafka:consumer-groups:create app

This allows Heroku's multi-tenant setup to route smoothboulder-1234.app to your cluster correctly.

  1. When consuming, you always need to use the prefixed topic name:
class KarafkaApp < Karafka::App
  # ...
  routes.draw do
    topic "#{ENV['KAFKA_PREFIX']}users_events" do
      consumer UsersEventsConsumer
    end
  end
end
  1. When producing, you always need to use the prefixed topic name:
Karafka.producer.produce_async(
  topic: "#{ENV['KAFKA_PREFIX']}users_events",
  payload: {
    user_id: user.id,
    event: 'user.deleted'
  }.to_json
)

Note: You will need to configure your topics in Kafka before they can be used. This can be done in the Heroku UI or via the CLI provided by Heroku. Be sure to name your topics without the KAFKA_PREFIX, e.g. heroku kafka:topics:create users_events --partitions 3.

Configuring Karafka to work with Heroku SSL

When you turn on the add-on, Heroku exposes a few environment variables within which important details are stored. You need to use them to configure Karafka as follows:

class KarafkaApp < Karafka::App
  setup do |config|
    config.kafka = {
      # ...
      'security.protocol': 'ssl',
      # KAFKA_URL has the protocol that we do not need as we define the protocol separately
      'bootstrap.servers': ENV['KAFKA_URL'].gsub('kafka+ssl://', ''),
      'ssl.certificate.pem': ENV['KAFKA_CLIENT_CERT'],
      'ssl.key.pem': ENV['KAFKA_CLIENT_CERT_KEY'],
      'ssl.ca.pem': ENV['KAFKA_TRUSTED_CERT']
    }

    # ... other config options
  end
end

Troubleshooting

There are few problems you may encounter when configuring things for Heroku:

Unsupported protocol "KAFKA+SSL"

parse error: unsupported protocol "KAFKA+SSL"

Solution: Make sure you strip off the kafka+ssl:// component from the KAFKA_URL env variable content.

Disconnected while requesting ApiVersion

Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration
(connecting to a SSL listener?)

Solution: Make sure all the settings are configured exactly as presented in the configuration section.

Topic authorization failed

Broker: Topic authorization failed (topic_authorization_failed) (Rdkafka::RdkafkaError)

Solution: Make sure to namespace all the topics and consumer groups with the KAFKA_PREFIX environment value.

Messages are not being consumed

DEBUG -- : [3732873c8a74] Polled 0 messages in 1000ms
DEBUG -- : [3732873c8a74] Polling messages...
DEBUG -- : [3732873c8a74] Polled 0 messages in 1000ms
DEBUG -- : [3732873c8a74] Polling messages...
DEBUG -- : [3732873c8a74] Polled 0 messages in 1000ms

Solution 1: Basic multi-tenant Kafka plans require a prefix on topics and consumer groups. Make sure that both your topics and consumer groups are prefixed.

Solution 2: Make sure you've created appropriate consumer groups prior to them being used via the Heroku CLI.