Class: Karafka::Connection::Client

Inherits:

Object

Object
Karafka::Connection::Client

show all

Includes:: Karafka::Core::Helpers::Time

Defined in:: lib/karafka/connection/client.rb

Overview

An abstraction layer on top of the rdkafka consumer.

It is threadsafe and provides some security measures so we won’t end up operating on a closed consumer instance as it causes Ruby VM process to crash.

Instance Attribute Summary collapse

#id ⇒ String readonly

Id of the client.
#name ⇒ String readonly

Underlying consumer name.
#rebalance_manager ⇒ Object readonly

Returns the value of attribute rebalance_manager.
#subscription_group ⇒ Karafka::Routing::SubscriptionGroup readonly

Subscription group to which this client belongs to.

Instance Method Summary collapse

#assignment ⇒ Rdkafka::Consumer::TopicPartitionList

Current active assignment.
#assignment_lost? ⇒ Boolean

True if our current assignment has been lost involuntarily.
#batch_poll ⇒ Karafka::Connection::MessagesBuffer

Fetches messages within boundaries defined by the settings (time, size, topics, etc).
#commit_offsets(async: true) ⇒ Boolean

Commits the offset on a current consumer in a non-blocking or blocking way.
#commit_offsets! ⇒ Object

Commits offset in a synchronous way.
#committed(tpl = nil) ⇒ Rdkafka::Consumer::TopicPartitionList

Return the current committed offset per partition for this consumer group.
#consumer_group_metadata_pointer ⇒ FFI::Pointer

Returns pointer to the consumer group metadata.
#events_poll(timeout = 0) ⇒ Object

Triggers the rdkafka main queue events by consuming this queue.
#initialize(subscription_group, batch_poll_breaker) ⇒ Karafka::Connection::Client constructor

Creates a new consumer instance.
#mark_as_consumed(message, metadata = nil) ⇒ Boolean

Marks given message as consumed.
#mark_as_consumed!(message, metadata = nil) ⇒ Boolean

Marks a given message as consumed and commits the offsets in a blocking way.
#pause(topic, partition, offset = nil, timeout = 0) ⇒ Object

Pauses given partition and moves back to last successful offset processed.
#ping ⇒ Object

Runs a single poll on the main queue and consumer queue ignoring all the potential errors This is used as a keep-alive in the shutdown stage and any errors that happen here are irrelevant from the shutdown process perspective.
#query_watermark_offsets(topic, partition) ⇒ Array<Integer, Integer>

Reads watermark offsets for given topic.
#reset ⇒ Object

Closes and resets the client completely.
#resume(topic, partition) ⇒ Object

Resumes processing of a give topic partition after it was paused.
#seek(message) ⇒ Object

Seek to a particular message.
#stop ⇒ Object

Gracefully stops topic consumption.
#store_offset(message, offset_metadata = nil) ⇒ Object

Stores offset for a given partition of a given topic based on the provided message.

Constructor Details

#initialize(subscription_group, batch_poll_breaker) ⇒ `Karafka::Connection::Client`

Creates a new consumer instance.

Parameters:

subscription_group (Karafka::Routing::SubscriptionGroup) —

subscription group with all the configuration details needed for us to create a client
batch_poll_breaker (Proc) —

proc that when evaluated to false will cause the batch poll loop to finish early. This improves the shutdown and dynamic multiplication as it allows us to early break on long polls.

# File 'lib/karafka/connection/client.rb', line 66

def initialize(subscription_group, batch_poll_breaker)
  @id = SecureRandom.hex(6)
  # Name is set when we build consumer
  @name = ''
  @closed = false
  @subscription_group = subscription_group
  @buffer = RawMessagesBuffer.new
  @rebalance_manager = RebalanceManager.new(@subscription_group.id, @buffer)
  @rebalance_callback = Instrumentation::Callbacks::Rebalance.new(@subscription_group, id)

  @interval_runner = Helpers::IntervalRunner.new do
    events_poll
    # events poller returns nil when not running often enough, hence we don't use the
    # boolean to be explicit
    batch_poll_breaker.call ? :run : :stop
  end

  # There are few operations that can happen in parallel from the listener threads as well
  # as from the workers. They are not fully thread-safe because they may be composed out of
  # few calls to Kafka or out of few internal state changes. That is why we mutex them.
  # It mostly revolves around pausing and resuming.
  @mutex = Mutex.new
  # We need to keep track of what we have paused for resuming
  # In case we loose partition, we still need to resume it, otherwise it won't be fetched
  # again if we get reassigned to it later on. We need to keep them as after revocation we
  # no longer may be able to fetch them from Kafka. We could build them but it is easier
  # to just keep them here and use if needed when cannot be obtained
  @paused_tpls = Hash.new { |h, k| h[k] = {} }
end

Instance Attribute Details

#id ⇒ `String` (readonly)

Returns id of the client.

Returns:

(String) —

id of the client



29
30
31

# File 'lib/karafka/connection/client.rb', line 29

def id
  @id
end

#name ⇒ `String` (readonly)

Note:

Consumer name may change in case we regenerate it

Returns underlying consumer name.

Returns:

(String) —

underlying consumer name



26
27
28

# File 'lib/karafka/connection/client.rb', line 26

def name
  @name
end

#rebalance_manager ⇒ `Object` (readonly)

Returns the value of attribute rebalance_manager.



18
19
20

# File 'lib/karafka/connection/client.rb', line 18

def rebalance_manager
  @rebalance_manager
end

#subscription_group ⇒ `Karafka::Routing::SubscriptionGroup` (readonly)

Returns subscription group to which this client belongs to.

Returns:

(Karafka::Routing::SubscriptionGroup) —

subscription group to which this client belongs to



22
23
24

# File 'lib/karafka/connection/client.rb', line 22

def subscription_group
  @subscription_group
end

Instance Method Details

#assignment ⇒ `Rdkafka::Consumer::TopicPartitionList`

Returns current active assignment.

Returns:

(Rdkafka::Consumer::TopicPartitionList) —

current active assignment



183
184
185

# File 'lib/karafka/connection/client.rb', line 183

def assignment
  kafka.assignment
end

#assignment_lost? ⇒ `Boolean`

Returns true if our current assignment has been lost involuntarily.

Returns:

(Boolean) —

true if our current assignment has been lost involuntarily.



178
179
180

# File 'lib/karafka/connection/client.rb', line 178

def assignment_lost?
  kafka.assignment_lost?
end

#batch_poll ⇒ `Karafka::Connection::MessagesBuffer`

Note:

This method should not be executed from many threads at the same time

Fetches messages within boundaries defined by the settings (time, size, topics, etc).

Also periodically runs the events polling to trigger events callbacks.

Returns:

(Karafka::Connection::MessagesBuffer) —

messages buffer that holds messages per topic partition

# File 'lib/karafka/connection/client.rb', line 103

def batch_poll
  time_poll = TimeTrackers::Poll.new(@subscription_group.max_wait_time)

  @buffer.clear
  @rebalance_manager.clear

  events_poll

  loop do
    time_poll.start

    # Don't fetch more messages if we do not have any time left
    break if time_poll.exceeded?
    # Don't fetch more messages if we've fetched max that we've wanted
    break if @buffer.size >= @subscription_group.max_messages

    # Fetch message within our time boundaries
    response = poll(time_poll.remaining)

    # We track when last polling happened so we can provide means to detect upcoming
    # `max.poll.interval.ms` limit
    @buffer.polled

    case response
    when :tick_time
      nil
    # We get a hash only in case of eof error
    when Hash
      @buffer.eof(response[:topic], response[:partition])
    when nil
      nil
    else
      @buffer << response
    end

    # Upon polling rebalance manager might have been updated.
    # If partition revocation happens, we need to remove messages from revoked partitions
    # as well as ensure we do not have duplicated due to the offset reset for partitions
    # that we got assigned
    #
    # We also do early break, so the information about rebalance is used as soon as possible
    if @rebalance_manager.changed?
      # Since rebalances do not occur often, we can run events polling as well without
      # any throttling
      events_poll

      break
    end

    # If we were signaled from the outside to break the loop, we should
    break if @interval_runner.call == :stop

    # Track time spent on all of the processing and polling
    time_poll.checkpoint

    # Finally once we've (potentially) removed revoked, etc, if no messages were returned
    # and it was not an early poll exist, we can break. We also break if we got the eof
    # signaling to propagate it asap
    # Worth keeping in mind, that the rebalance manager might have been updated despite no
    # messages being returned during a poll
    break if response.nil? || response.is_a?(Hash)
  end

  @buffer
end

#commit_offsets(async: true) ⇒ `Boolean`

Note:

This will commit all the offsets for the whole consumer. In order to achieve granular control over where the offset should be for particular topic partitions, the store_offset should be used to only store new offset when we want them to be flushed

Note:

This method for async may return true despite involuntary partition revocation as it does not resolve to lost_assignment?. It returns only the commit state operation result.

Commits the offset on a current consumer in a non-blocking or blocking way.

Parameters:

async (Boolean) (defaults to: true) —

should the commit happen async or sync (async by default)

Returns:

(Boolean) —

did committing was successful. It may be not, when we no longer own given partition.



200
201
202

# File 'lib/karafka/connection/client.rb', line 200

def commit_offsets(async: true)
  internal_commit_offsets(async: async)
end

#commit_offsets! ⇒ `Object`

Commits offset in a synchronous way.

#committed(tpl = nil) ⇒ `Rdkafka::Consumer::TopicPartitionList`

Note:

It is recommended to use this only on rebalances to get positions with metadata when working with metadata as this is synchronous

Return the current committed offset per partition for this consumer group. The offset field of each requested partition will either be set to stored offset or to -1001 in case there was no stored offset for that partition.

Parameters:

tpl (Rdkafka::Consumer::TopicPartitionList) (defaults to: nil) —

for which we want to get committed

Returns:

(Rdkafka::Consumer::TopicPartitionList)

Raises:

(Rdkafka::RdkafkaError) —

When getting the committed positions fails.



426
427
428

# File 'lib/karafka/connection/client.rb', line 426

def committed(tpl = nil)
  @wrapped_kafka.committed(tpl)
end

#consumer_group_metadata_pointer ⇒ `FFI::Pointer`

Returns pointer to the consumer group metadata. It is used only in the context of exactly-once-semantics in transactions, this is why it is never remapped to Ruby

Returns:

(FFI::Pointer)



413
414
415

# File 'lib/karafka/connection/client.rb', line 413

def consumer_group_metadata_pointer
  kafka.consumer_group_metadata_pointer
end

#events_poll(timeout = 0) ⇒ `Object`

Note:

It is non-blocking when timeout 0 and will not wait if queue empty. It costs up to 2ms when no callbacks are triggered.

Triggers the rdkafka main queue events by consuming this queue. This is not the consumer consumption queue but the one with: - error callbacks - stats callbacks - OAUTHBEARER token refresh callbacks

Parameters:

timeout (Integer) (defaults to: 0) —

number of milliseconds to wait on events or 0 not to wait.



406
407
408

# File 'lib/karafka/connection/client.rb', line 406

def events_poll(timeout = 0)
  kafka.events_poll(timeout)
end

#mark_as_consumed(message, metadata = nil) ⇒ `Boolean`

Note:

This method won’t trigger automatic offsets commits, rather relying on the offset check-pointing trigger that happens with each batch processed. It will however check the librdkafka assignment ownership to increase accuracy for involuntary revocations.

Marks given message as consumed.

Parameters:

message (Karafka::Messages::Message) —

message that we want to mark as processed
metadata (String, nil) (defaults to: nil) —

offset storage metadata or nil if none

Returns:

(Boolean) —

true if successful. False if we no longer own given partition



354
355
356

# File 'lib/karafka/connection/client.rb', line 354

def mark_as_consumed(message, metadata = nil)
  store_offset(message, metadata) && !assignment_lost?
end

#mark_as_consumed!(message, metadata = nil) ⇒ `Boolean`

Marks a given message as consumed and commits the offsets in a blocking way.

Parameters:

message (Karafka::Messages::Message) —

message that we want to mark as processed
metadata (String, nil) (defaults to: nil) —

offset storage metadata or nil if none

Returns:

(Boolean) —

true if successful. False if we no longer own given partition

# File 'lib/karafka/connection/client.rb', line 363

def mark_as_consumed!(message, metadata = nil)
  return false unless mark_as_consumed(message, metadata)

  commit_offsets!
end

#pause(topic, partition, offset = nil, timeout = 0) ⇒ `Object`

Note:

This will pause indefinitely and requires manual #resume

Note:

When #internal_seek is not involved (when offset is nil) we will not purge the librdkafka buffers and continue from the last cursor offset

Note:

We accept the timeout value on this layer to have a cohesive pause/resume instrumentation, where all the details are available. It is especially needed, when

Pauses given partition and moves back to last successful offset processed.

Parameters:

topic (String) —

topic name
partition (Integer) —

partition
offset (Integer, nil) (defaults to: nil) —

offset of the message on which we want to pause (this message will be reprocessed after getting back to processing) or nil if we want to pause and resume from the consecutive offset (+1 from the last message passed to us by librdkafka)
timeout (Integer) (defaults to: 0) —

number of ms timeout of pause. It is used only for instrumentation and not in the pause itself as pausing on this level is infinite always.

# File 'lib/karafka/connection/client.rb', line 235

def pause(topic, partition, offset = nil, timeout = 0)
  @mutex.synchronize do
    # Do not pause if the client got closed, would not change anything
    return if @closed

    internal_commit_offsets(async: true)

    # Here we do not use our cached tpls because we should not try to pause something we do
    # not own anymore.
    tpl = topic_partition_list(topic, partition)

    return unless tpl

    Karafka.monitor.instrument(
      'client.pause',
      caller: self,
      subscription_group: @subscription_group,
      topic: topic,
      partition: partition,
      offset: offset,
      timeout: timeout
    )

    @paused_tpls[topic][partition] = tpl

    kafka.pause(tpl)

    # If offset is not provided, will pause where it finished.
    # This makes librdkafka not purge buffers and can provide significant network savings
    # when we just want to pause before further processing without changing the offsets
    return unless offset

    pause_msg = Messages::Seek.new(topic, partition, offset)

    internal_seek(pause_msg)
  end
end

#ping ⇒ `Object`

Runs a single poll on the main queue and consumer queue ignoring all the potential errors This is used as a keep-alive in the shutdown stage and any errors that happen here are irrelevant from the shutdown process perspective

This is used only to trigger rebalance callbacks and other callbacks

# File 'lib/karafka/connection/client.rb', line 389

def ping
  events_poll(100)
  poll(100)
rescue Rdkafka::RdkafkaError
  nil
end

#query_watermark_offsets(topic, partition) ⇒ `Array<Integer, Integer>`

Reads watermark offsets for given topic

Parameters:

topic (String) —

topic name
partition (Integer) —

partition number

Returns:

(Array<Integer, Integer>) —

watermark offsets (low, high)



435
436
437

# File 'lib/karafka/connection/client.rb', line 435

def query_watermark_offsets(topic, partition)
  @wrapped_kafka.query_watermark_offsets(topic, partition)
end

#reset ⇒ `Object`

Closes and resets the client completely.

# File 'lib/karafka/connection/client.rb', line 370

def reset
  Karafka.monitor.instrument(
    'client.reset',
    caller: self,
    subscription_group: @subscription_group
  ) do
    close

    @interval_runner.reset
    @closed = false
    @paused_tpls.clear
  end
end

#resume(topic, partition) ⇒ `Object`

Resumes processing of a give topic partition after it was paused.

Parameters:

topic (String) —

topic name
partition (Integer) —

partition

# File 'lib/karafka/connection/client.rb', line 277

def resume(topic, partition)
  @mutex.synchronize do
    return if @closed

    # We now commit offsets on rebalances, thus we can do it async just to make sure
    internal_commit_offsets(async: true)

    # If we were not able, let's try to reuse the one we have (if we have)
    tpl = topic_partition_list(topic, partition) || @paused_tpls[topic][partition]

    return unless tpl

    # If we did not have it, it means we never paused this partition, thus no resume should
    # happen in the first place
    return unless @paused_tpls[topic].delete(partition)

    Karafka.monitor.instrument(
      'client.resume',
      caller: self,
      subscription_group: @subscription_group,
      topic: topic,
      partition: partition
    )

    kafka.resume(tpl)
  end
end

#seek(message) ⇒ `Object`

Note:

Please note, that if you are seeking to a time offset, getting the offset is blocking

Seek to a particular message. The next poll on the topic/partition will return the message at the given offset.

Parameters:

message (Messages::Message, Messages::Seek) —

message to which we want to seek to. It can have the time based offset.



217
218
219

# File 'lib/karafka/connection/client.rb', line 217

def seek(message)
  @mutex.synchronize { internal_seek(message) }
end

#stop ⇒ `Object`

Gracefully stops topic consumption.

# File 'lib/karafka/connection/client.rb', line 306

def stop
  # librdkafka has several constant issues when shutting down during rebalance. This is
  # an issue that gets back every few versions of librdkafka in a limited scope, for example
  # for cooperative-sticky or in a general scope. This is why we unsubscribe and wait until
  # we no longer have any assignments. That way librdkafka consumer shutdown should never
  # happen with rebalance associated with the given consumer instance. Since we do not want
  # to wait forever, we also impose a limit on how long should we wait. This prioritizes
  # shutdown stability over endless wait.
  #
  # The `@unsubscribing` ensures that when there would be a direct close attempt, it
  # won't get into this loop again. This can happen when supervision decides it should close
  # things faster
  #
  # @see https://github.com/confluentinc/librdkafka/issues/4792
  # @see https://github.com/confluentinc/librdkafka/issues/4527
  if unsubscribe?
    @unsubscribing = true

    # Give 50% of time for the final close before we reach the forceful
    max_wait = shutdown_timeout * COOP_UNSUBSCRIBE_FACTOR
    used = 0
    stopped_at = monotonic_now

    unsubscribe

    until assignment.empty?
      used += monotonic_now - stopped_at
      stopped_at = monotonic_now

      break if used >= max_wait

      sleep(0.1)

      ping
    end
  end

  close
end

#store_offset(message, offset_metadata = nil) ⇒ `Object`

Stores offset for a given partition of a given topic based on the provided message.

Parameters:

message (Karafka::Messages::Message)
offset_metadata (String, nil) (defaults to: nil) —

offset storage metadata or nil if none



173
174
175

# File 'lib/karafka/connection/client.rb', line 173

def store_offset(message, offset_metadata = nil)
  internal_store_offset(message, offset_metadata)
end

Class: Karafka::Connection::Client

Overview

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(subscription_group, batch_poll_breaker) ⇒ Karafka::Connection::Client

Instance Attribute Details

#id ⇒ String (readonly)

#name ⇒ String (readonly)

#rebalance_manager ⇒ Object (readonly)

#subscription_group ⇒ Karafka::Routing::SubscriptionGroup (readonly)

Instance Method Details

#assignment ⇒ Rdkafka::Consumer::TopicPartitionList

#assignment_lost? ⇒ Boolean

#batch_poll ⇒ Karafka::Connection::MessagesBuffer

#commit_offsets(async: true) ⇒ Boolean

#commit_offsets! ⇒ Object

#committed(tpl = nil) ⇒ Rdkafka::Consumer::TopicPartitionList

#consumer_group_metadata_pointer ⇒ FFI::Pointer

#events_poll(timeout = 0) ⇒ Object

#mark_as_consumed(message, metadata = nil) ⇒ Boolean

#mark_as_consumed!(message, metadata = nil) ⇒ Boolean

#pause(topic, partition, offset = nil, timeout = 0) ⇒ Object

#ping ⇒ Object

#query_watermark_offsets(topic, partition) ⇒ Array<Integer, Integer>

#reset ⇒ Object

#resume(topic, partition) ⇒ Object

#seek(message) ⇒ Object

#stop ⇒ Object

#store_offset(message, offset_metadata = nil) ⇒ Object

#initialize(subscription_group, batch_poll_breaker) ⇒ `Karafka::Connection::Client`

#id ⇒ `String` (readonly)

#name ⇒ `String` (readonly)

#rebalance_manager ⇒ `Object` (readonly)

#subscription_group ⇒ `Karafka::Routing::SubscriptionGroup` (readonly)

#assignment ⇒ `Rdkafka::Consumer::TopicPartitionList`

#assignment_lost? ⇒ `Boolean`

#batch_poll ⇒ `Karafka::Connection::MessagesBuffer`

#commit_offsets(async: true) ⇒ `Boolean`

#commit_offsets! ⇒ `Object`

#committed(tpl = nil) ⇒ `Rdkafka::Consumer::TopicPartitionList`

#consumer_group_metadata_pointer ⇒ `FFI::Pointer`

#events_poll(timeout = 0) ⇒ `Object`

#mark_as_consumed(message, metadata = nil) ⇒ `Boolean`

#mark_as_consumed!(message, metadata = nil) ⇒ `Boolean`

#pause(topic, partition, offset = nil, timeout = 0) ⇒ `Object`

#ping ⇒ `Object`

#query_watermark_offsets(topic, partition) ⇒ `Array<Integer, Integer>`

#reset ⇒ `Object`

#resume(topic, partition) ⇒ `Object`

#seek(message) ⇒ `Object`

#stop ⇒ `Object`

#store_offset(message, offset_metadata = nil) ⇒ `Object`