-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-14499: [4/N] Implement OffsetFetch API #14120
Conversation
return FutureUtils.failedFuture(Errors.UNSUPPORTED_VERSION.exception( | ||
"This API is not implemented yet." | ||
)); | ||
// For backwards compatibility, we support fetch commits for the empty groupId. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we say we support fetch commits for the empty group ID, does this mean that we support it by always returning an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. It means that you can actually commit offsets to a group names with an empty group id. I was not aware of this until I rewrote this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok -- I guess I just wasn't sure if this comment and the error need to be changed or not. Is the idea you can commit them but can't fetch them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked to David offline -- concluded that the comment was a little confusing -- null groupIds are not allowed but empty ones are.
@@ -256,6 +259,40 @@ public HeartbeatResponseData genericGroupHeartbeat( | |||
); | |||
} | |||
|
|||
/** | |||
* Fetch offsets for a given Group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should the comment mention this method fetches offsets for a given set of topics? I realize it is in the parameters, but could be useful to more easily distinguish this and fetchAllOffsets
} | ||
|
||
/** | ||
* @return The group corresponding to the group id or throw GroupIdNotFoundException. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should we also include something about the committed offset here?
*/ | ||
@Override | ||
public void validateOffsetFetch() { | ||
// Nothing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I assume this is a todo for a future pr? Or do we not need any validation here?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will have some validation when we add the member id/epoch in the request. This is coming in the next PR.
|
||
@ParameterizedTest | ||
@ValueSource(booleans = {true, false}) | ||
public void testFetchOffsetsWithoutWithoutRequireStable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Were these tests named before we parameterized them? This one and the one below say Without (Without) Require stable, but looks like we test both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. You're right.
return FutureUtils.failedFuture(Errors.INVALID_GROUP_ID.exception()); | ||
} | ||
|
||
// The require stable flag when set tells the broker to hold on returning unstable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my understanding -- this works because we only allow one write operation at a time and do not start the next operation until the previous write is completed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The next write operation could actually be ran before the previous one in completed (or committed) but the write operations guarantees that the second one is completed only after the first one in order to respect the timeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. So the ordering is guaranteed by when each completes not when it starts.
} | ||
|
||
@Test | ||
public void testFetchOffsetsWithCommittedAndUncommittedData() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we call this test and the equivalent fetch all offsets "withCommittedAndUncommittedData"? Isn't the data all committed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we actually don't have this information here. The name is poorly chosen here. I meant that we test fetching offsets with different epoch to actually simulate uncommitted records. Let me rename it.
@jolshan Thanks for your review. I have addressed your comments. |
@dajac can you answer this real quick? Otherwise lg #14120 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
commit 62f876f2b81769598e25d8e9da6bb18f59f31008 Author: Ritika Reddy <rreddy@confluent.io> Date: Wed Sep 6 11:27:34 2023 -0700 Revert "Squashed commit of the following:" This reverts commit 02fc35ea2c4564d9da5f42d5989deaf283d54122. commit 158f7932e7ba152639e5643f84e055065eee3269 Author: Ritika Reddy <rreddy@confluent.io> Date: Wed Sep 6 11:26:40 2023 -0700 Change list to collection commit 02fc35ea2c4564d9da5f42d5989deaf283d54122 Author: Ritika Reddy <rreddy@confluent.io> Date: Wed Sep 6 11:25:29 2023 -0700 Squashed commit of the following: commit eb39c95080b994398c40bcf5d54181e713ed6faa Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Wed Sep 6 14:49:48 2023 +0200 MINOR: StoreChangelogReaderTest fails with log-level DEBUG (#14300) A mocked method is executed unexpectedly when we enable DEBUG log level, leading to confusing test failures during debugging. Since the log message itself seems useful, we adapt the test to take the additional mocked method call into account). Reviewer: Bruno Cadonna <cadonna@apache.org> commit cc289d04c701a59f571683b908f778e0b236d72f Author: Ritika Reddy <98577846+rreddy-22@users.noreply.github.com> Date: Wed Sep 6 00:28:23 2023 -0700 MINOR: Fix trailing white spaces on reviewers.py (#14343) Fixing trailing white spaces on reviewers.py. Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io> commit 7054625c45dc6edb3c07271fe4a6c24b4638424f Author: David Jacot <djacot@confluent.io> Date: Tue Sep 5 23:36:38 2023 -0700 KAFKA-14499: [6/N] Add MemberId and MemberEpoch to OffsetFetchRequest (#14321) This patch adds the MemberId and the MemberEpoch fields to the OffsetFetchRequest. Those fields will be populated when the new consumer group protocol is used to ensure that the member fetching the offset has the correct member id and epoch. If it does not, UNKNOWN_MEMBER_ID or STALE_MEMBER_EPOCH are returned to the client. Our initial idea was to implement the same for the old protocol. The field is called GenerationIdOrMemberEpoch in KIP-848 to materialize this. As a second though, I think that we should only do it for the new protocol. The effort to implement it in the old protocol is not worth it in my opinion. Reviewers: Ritika Reddy <rreddy@confluent.io>, Calvin Liu <caliu@confluent.io>, Justine Olshan <jolshan@confluent.io> commit 80982c4ae3fe6be127b48ec09caff11ab5f87c69 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Wed Sep 6 05:50:12 2023 +0530 KAFKA-15410: Delete topic integration test with LocalTieredStorage and TBRLMM (3/4) (#14329) Added delete topic integration tests for tiered storage enabled topics with LocalTieredStorage and TBRLMM Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com> commit b49013b73efa25466652d8d8122974e60c927ec4 Author: Andrew Schofield <aschofield@confluent.io> Date: Tue Sep 5 19:57:51 2023 +0100 KAFKA-9800: Exponential backoff for Kafka clients - KIP-580 (#14111) Implementation of KIP-580 to add exponential back-off to situations in which retry.backoff.ms is used to delay backoff attempts. This KIP adds exponential backoff behavior with a maximum controlled by a new config retry.backoff.max.ms, together with a +/- 20% of jitter to spread the retry attempts of the client fleet. Reviewers: Mayank Shekhar Narula <mayanks.narula@gmail.com>, Milind Luthra <i.milind.luthra@gmail.com>, Kirk True <kirk@mustardgrain.com>, Jun Rao<junrao@gmail.com> commit 1f473ebb5ea9ad4ebdfdc99051864cce6d80db87 Author: Yash Mayya <yash.mayya@gmail.com> Date: Tue Sep 5 19:39:49 2023 +0100 KAFKA-14876: Add stopped state to Kafka Connect Administration docs section (#14336) Reviewers: Chris Egerton <chrise@aiven.io> commit 79598b49d6fff9bef686500f46a288b61a9013fd Author: Yash Mayya <yash.mayya@gmail.com> Date: Tue Sep 5 18:58:44 2023 +0100 MINOR: Update the documentation's table of contents to add missing headings for Kafka Connect (#14337) Reviewers: Chris Egerton <chrise@aiven.io> commit 37a51e286d5aaa890439e074e9f781ec26aaef2e Author: Abhijeet Kumar <abhijeet.cse.kgp@gmail.com> Date: Tue Sep 5 22:19:10 2023 +0530 KAFKA-15293 Added documentation for tiered storage metrics (#14331) Reviewers: Divij Vaidya <diviv@amazon.com>, Satish Duggana <satishd@apache.org> commit be0de2124a1bb6363355aa792be618ddbb87e460 Author: Luke Chen <showuon@gmail.com> Date: Wed Sep 6 00:06:28 2023 +0800 MINOR: Update comment in consumeAction (#14335) Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com> commit 9f2ac375c282e1471a2d385704e1f7c128f34bb6 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Tue Sep 5 19:28:17 2023 +0530 KAFKA-15410: Reassign replica expand, move and shrink integration tests (2/4) (#14328) - Updated the log-start-offset to the correct value while building the replica state in ReplicaFetcherTierStateMachine#buildRemoteLogAuxState Integration tests added: 1. ReassignReplicaExpandTest 2. ReassignReplicaMoveTest and 3. ReassignReplicaShrinkTest Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com> commit b892acae5e026e1affd51ef9756772807674b964 Author: Justine Olshan <jolshan@confluent.io> Date: Mon Sep 4 20:40:50 2023 -0700 KAFKA-15424: Make the transaction verification a dynamic configuration (#14324) This will allow enabling and disabling transaction verification (KIP-890 part 1) without having to roll the cluster. Tested that restarting the cluster persists the configuration. If a verification is disabled/enabled while we have an inflight request, depending on the step of the process, the change may or may not be seen in the inflight request (enabling will typically fail unverified requests, but we may still verify and reject when we first disable) Subsequent requests/retries will behave as expected for verification. Sequence checks will continue to take place after disabling until the first message is written to the partition (thus clearing the verification entry with the tentative sequence) or the broker restarts/partition is reassigned which will clear the memory. On enabling, we will only track sequences that for requests received after the verification is enabled. Reviewers: Jason Gustafson <jason@confluent.io>, Satish Duggana <satishd@apache.org> commit caaa4c55fee68c5893d54ffe84287f3b5205fff1 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Tue Sep 5 05:13:16 2023 +0530 KAFKA-15410: Expand partitions, segment deletion by retention and enable remote log on topic integration tests (1/4) (#14307) Added the below integration tests with tiered storage - PartitionsExpandTest - DeleteSegmentsByRetentionSizeTest - DeleteSegmentsByRetentionTimeTest and - EnableRemoteLogOnTopicTest - Enabled the test for both ZK and Kraft modes. These are enabled for both ZK and Kraft modes. Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com>, Divij Vaidya <diviv@amazon.com> commit d34d84dbef20559e68c899315a0915e9dd740cb0 Author: Yash Mayya <yash.mayya@gmail.com> Date: Mon Sep 4 12:54:18 2023 +0100 KAFKA-7438: Migrate WindowStoreBuilderTest from EasyMock to Mockito (#14152) Reviewers: Divij Vaidya <diviv@amazon.com> commit 7a516b03863a00aa0cebfb376171a31a92d26f8c Author: Christo Lolov <lolovc@amazon.com> Date: Mon Sep 4 11:58:50 2023 +0100 KAFKA-14133: Move AbstractStreamTest and RocksDBMetricsRecordingTriggerTest to Mockito (#14223) Reviewers: Divij Vaidya <diviv@amazon.com> commit 78c59cd2b0b5f21c2028021d9dfb72d21065bb00 Author: Dimitar Dimitrov <30328539+dimitarndimitrov@users.noreply.github.com> Date: Mon Sep 4 11:02:32 2023 +0200 KAFKA-15052 Fix the flaky testBalancePartitionLeaders - part II (#13908) A follow-up to https://github.com/apache/kafka/pull/13804. This follow-up adds the alternative fix approach mentioned in the PR above - bumping the session timeout used in the test with 1 second. Reproducing the flake-out locally has been much harder than on the CI runs, as neither Gradle with Java 11 or Java 14 nor IntelliJ with Java 14 could show it, but IntelliJ with Java 11 could occasionally reproduce the failure the first time immediately after a rebuild. While I was unable to see the failure with the bumped session timeout, the testing procedure definitely didn't provide sufficient reassurance for the fix as even without it often I'd see hundreds of consecutive successful test runs when the first run didn't fail. Reviewers: Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com> commit a6409e8e61fec1be74a9b404d2750db15d1de40a Author: Proven Provenzano <93720617+pprovenzano@users.noreply.github.com> Date: Mon Sep 4 04:46:12 2023 -0400 KAFKA-15422: Update documenttion for delegation tokens when working with Kafka with KRaft (#14318) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com> commit 5074c8038e44620b48d7700226810b983febd864 Author: Abhijeet Kumar <abhijeet.cse.kgp@gmail.com> Date: Mon Sep 4 09:13:04 2023 +0530 KAFKA-15260: RLM Task should handle uninitialized RLMM for the associated topic-parititon (#14113) This change is about RLM task handling retriable exception when it tries to copy segments to remote but the RLMM is not yet initialized. On encountering the exception, we log the error and throw the exception back to the caller. We also make sure that the failure metrics are updated since this is a temporary error because RLMM is not yet initialized. Added unit tests to verify RLM task does not attempt to copy segments to remote on encountering the retriable exception and that failure metrics remain unchanged. Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com> commit da99879df7bd96675e4abbd50d4e504dc07df574 Author: Luke Chen <showuon@gmail.com> Date: Sun Sep 3 16:16:54 2023 +0800 KAFKA-15421: fix network thread leak in testThreadPoolResize (#14320) In SocketServerTest, we create SocketServer and enableRequestProcessing on each test class initialization. That's fine since we shutdown it in @AfterEach. The issue we have is we disabled 2 tests in this test suite. And when running these disabled tests, we will go through class initialization, but without @AfterEach. That causes 2 network thread leaked. Compared the error message in DynamicBrokerReconfigurationTest#testThreadPoolResize test here: org.opentest4j.AssertionFailedError: Invalid threads: expected 6, got 8: List(data-plane-kafka-socket-acceptor-ListenerName(INTERNAL)-SSL-0, data-plane-kafka-socket-acceptor-ListenerName(PLAINTEXT)-PLAINTEXT-0, data-plane-kafka-socket-acceptor-ListenerName(INTERNAL)-SSL-0, data-plane-kafka-socket-acceptor-ListenerName(EXTERNAL)-SASL_SSL-0, data-plane-kafka-socket-acceptor-ListenerName(INTERNAL)-SSL-0, data-plane-kafka-socket-acceptor-ListenerName(EXTERNAL)-SASL_SSL-0, data-plane-kafka-socket-acceptor-ListenerName(PLAINTEXT)-PLAINTEXT-0, data-plane-kafka-socket-acceptor-ListenerName(EXTERNAL)-SASL_SSL-0) ==> expected: <true> but was: <false> The 2 unexpected network threads are leaked from SocketServerTest. Reviewers: Satish Duggana <satishd@apache.org>, Christo Lolov <lolovc@amazon.com>, Divij Vaidya <diviv@amazon.com>, Kamal Chandraprakash <kchandraprakash@uber.com>, Chris Egerton <chrise@aiven.io> commit cc53889aaae8371964f9734a30da570afd0b7916 Author: Rohan <desai.p.rohan@gmail.com> Date: Sat Sep 2 18:14:14 2023 -0700 KAFKA-15429: reset transactionInFlight on StreamsProducer close (#14326) Resets the value of transactionInFlight to false when closing the StreamsProducer. This ensures we don't try to commit against a closed producer Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org> commit d293cd0735e050325eb7d2eafec435ec8aec92eb Author: Rohan <desai.p.rohan@gmail.com> Date: Sat Sep 2 18:13:16 2023 -0700 KAFKA-15429: catch+log errors from unsubscribe in streamthread shutdown (#14325) Preliminary fix for KAFKA-15429 which updates StreamThread.completeShutdown to catch-and-log errors from consumer.unsubscribe. Though this does not prevent the exception, it does preserve the original exception that caused the stream thread to exit. Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org> commit 1bb8c11f5aa07709ce1b1b6ef684a6750242d4b0 Author: Lianet Magrans <98415067+lianetm@users.noreply.github.com> Date: Fri Sep 1 16:57:17 2023 -0400 KAFKA-14965 - OffsetsRequestsManager implementation & API integration (#14308) Implementation of the OffsetRequestsManager, responsible for building requests and processing responses for requests related to partition offsets. In this PR, the manager includes support for ListOffset requests, generated when the user makes any of the following consumer API calls: beginningOffsets endOffsets offsetsForTimes All previous consumer API calls interact with the OffsetsRequestsManager by generating a ListOffsetsApplicationEvent. Includes tests to cover the new functionality and to ensure no API level changes are introduced. This covers KAFKA-14965 and KAFKA-15081. Reviewers: Philip Nee <pnee@confluent.io>, Kirk True <kirk@mustardgrain.com>, Jun Rao<junrao@gmail.com> commit 134f6c07a48219d2b54a6fed38ecb576af2f7cf3 Author: Christo Lolov <lolovc@amazon.com> Date: Fri Sep 1 18:42:57 2023 +0100 KAFKA-15427: Fix resource leak in integration tests for tiered storage (#14319) Co-authored-by: Nikhil Ramakrishnan <nikrmk@amazon.com> Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com> commit 6391c6603582a4e4b5bb670233c345e53f82a77b Author: Jeff Kim <kimkb2011@gmail.com> Date: Fri Sep 1 09:36:33 2023 -0400 KAFKA-14500; [7/7] Refactor GroupMetadataManagerTest (#14122) This patch makes the styling consistent inside GroupMetadataManagerTest. Also, it adds JoinResult to simplify the JoinGroup API responses in the tests. Reviewers: David Arthur <mumrah@gmail.com>, David Jacot <djacot@confluent.io> commit dcff0878c48803e2d68f7e43c1e73735b643ace0 Author: David Jacot <djacot@confluent.io> Date: Fri Sep 1 03:45:24 2023 -0700 KAFKA-14499: [5/N] Refactor GroupCoordinator.fetchOffsets and GroupCoordinator.fetchAllOffsets (#14310) This patch refactors the GroupCoordinator.fetchOffsets and GroupCoordinator.fetchAllOffsets methods to take an OffsetFetchRequestGroup and to return an OffsetFetchResponseGroup. It prepares the ground for adding the member id and the member epoch to the OffsetFetchRequest. This change also makes those two methods more aligned with the others in the interface. Reviewers: Calvin Liu <caliu@confluent.io>, Justine Olshan <jolshan@confluent.io> commit d0f3cf1f9fa53b7d8663ca23bf42b5c6847e07ab Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Fri Sep 1 06:33:33 2023 +0530 KAFKA-15351: Ensure log-start-offset not updated to local-log-start-offset when remote storage enabled (#14301) When tiered storage is enabled on the topic, and the last-standing-replica is restarted, then the log-start-offset should not reset its offset to first-local-log-segment-base-offset. Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com> commit 16dc983ad67767ee8debd125a3f8b150a91c7acf Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Thu Aug 31 22:21:01 2023 +0200 Kafka Streams Threading: Timeout behavior (#14171) Implement setting and clearing task timeouts, as well as changing the output on exceptions to make it similar to the existing code path. Reviewer: Walker Carlson <wcarlson@apache.org> commit 43fe13350f7a4c74cd101cbb69a01d062f5c9329 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Fri Sep 1 00:09:26 2023 +0530 KAFKA-15404: Disable the flaky integration tests. (#14296) Disabled the below tests to fix the thread leak: 1. kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize() and 2. org.apache.kafka.tiered.storage.integration.OffloadAndConsumeFromLeaderTest Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Justine Olshan <jolshan@confluent.io> commit c2bb8eb875d94568d1ad19bf207ec69c182405d4 Author: Luke Chen <showuon@gmail.com> Date: Thu Aug 31 16:44:32 2023 +0800 MINOR: Close topic based RLMM correctly in integration tests (#14315) Reviewers: Divij Vaidya <diviv@amazon.com> commit 95e1cdc4efbc720687cefad5bacd053565d03614 Author: A. Sophie Blee-Goldman <ableegoldman@gmail.com> Date: Wed Aug 30 13:29:38 2023 -0700 HOTFIX: avoid placement of unnecessary transient standby tasks & improve assignor logging (#14149) Minor fix to avoid creating unnecessary standby tasks, especially when these may be surprising or unexpected as in the case of an application with num.standby.replicas = 0 and warmup replicas disabled. The "bug" here was introduced during the fix for an issue with cooperative rebalancing and in-memory stores. The fundamental problem is that in-memory stores cannot be unassigned from a consumer for any period, however temporary, without being closed and losing all the accumulated state. This caused some grief when the new HA task assignor would assign an active task to a node based on the readiness of the standby version of that task, but would have to remove the active task from the initial assignment so it could first be revoked from its previous owner, as per the cooperative rebalancing protocol. This temporary gap in any version of that task among the consumer's assignment for that one intermediate rebalance would end up causing the consumer to lose all state for it, in the case of in-memory stores. To fix this, we simply began to place standby tasks on the intended recipient of an active task awaiting revocation by another consumer. However, the fix was a bit of an overreach, as we assigned these temporary standby tasks in all cases, regardless of whether there had previously been a standby version of that task. We can narrow this down without sacrificing any of the intended functionality by only assigning this kind of standby task where the consumer had previously owned some version of it that would otherwise potentially be lost. Also breaks up some of the long log lines in the StreamsPartitionAssignor and expands the summary info while moving it all to the front of the line (following reports of missing info due to truncation of long log lines in larger applications) commit 703e1d9faafbf07795261b3233ab985583f17fcb Author: Vincent Jiang <84371940+vincent81jiang@users.noreply.github.com> Date: Wed Aug 30 09:19:24 2023 -0700 KAFKA-15375: fix broken clean shutdown detection logic in LogManager When running in kraft mode, LogManager.startup is called in a different thread than the main broker (#14239) startup thread (by BrokerMetadataPublisher when the first metadata update is received.) If a fatal error happens during broker startup, before LogManager.startup is completed, LogManager.shutdown may mark log dirs as clean shutdown improperly. This PR includes following change: 1. During LogManager startup time: - track hadCleanShutdwon info for each log dir - track loadLogsCompleted status for each log dir 2. During LogManager shutdown time: - do not write clean shutdown marker file for log dirs which have hadCleanShutdown==false and loadLogsCompleted==false Reviewers: Colin P. McCabe <cmccabe@apache.org> commit dbda60c60da8f5a7eabe113615196b729b40a0e8 Author: Christo Lolov <lolovc@amazon.com> Date: Wed Aug 30 10:19:22 2023 +0100 KAFKA-14133: Move RocksDBRangeIteratorTest, TimestampedKeyValueStoreBuilderTest and TimestampedSegmentTest to Mockito (#14222) Reviewers: Divij Vaidya <diviv@amazon.com> commit 2e3ff21c2e3674ece50c2a8a4053b93024e12b4a Author: mannoopj <139923522+mannoopj@users.noreply.github.com> Date: Wed Aug 30 03:03:41 2023 -0400 KAFKA-15412: Reading an unknown version of quorum-state-file should trigger an error (#14302) Reading an unknown version of quorum-state-file should trigger an error. Currently the only known version is 0. Reading any other version should cause an error. Reviewers: Justine Olshan <jolshan@confluent.io>, Luke Chen <showuon@gmail.com> commit efec0f5756510bb02ee578b1a01dd8388237c14b Author: Christo Lolov <lolovc@amazon.com> Date: Wed Aug 30 01:04:20 2023 +0100 KAFKA-15267: Do not allow Tiered Storage to be disabled while topics have remote.storage.enable property (#14161) The purpose of this change is to not allow a broker to start up with Tiered Storage disabled (remote.log.storage.system.enable=false) while there are still topics that have 'remote.storage.enable' set. Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Divij Vaidya <diviv@amazon.com>, Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com> commit 1c5020e1429d1dadcb59955395afa87ada99f670 Author: Chris Egerton <chrise@aiven.io> Date: Tue Aug 29 15:44:22 2023 -0400 KAFKA-13327: Gracefully report connector validation errors instead of returning 500 responses (#14303) Reviewers: Yash Mayya <yash.mayya@gmail.com>, Greg Harris <greg.harris@aiven.io> commit 8611d28b2e2050b913837ade4d01231413991dc3 Author: Greg Harris <greg.harris@aiven.io> Date: Tue Aug 29 12:25:30 2023 -0700 KAFKA-15392: Prevent shadowing RestServer shutdown exceptions (#14277) Reviewers: Chris Egerton <chrise@aiven.io> commit 945d21953e93409290a68dced0d366380866fb26 Author: Philip Nee <pnee@confluent.io> Date: Tue Aug 29 12:03:15 2023 -0700 KAFKA-14875: Implement wakeup (#14118) Summary Implemented wakeup() mechanism using a WakeupTrigger class to store the pending wakeup item, and when wakeup() is invoked, it checks whether there's an active task or a wakeup task. If there's an active task: the task will be completed exceptionally and the atomic reference will be freed up. If there an wakedup task, which means wakeup() was invoked before a blocking call was issued. Therefore, the current task will be completed exceptionally immediately. This PR also addressed minor issues such as: Throwing WakeupException at the right place: As wakeups are thrown by completing an active future exceptionally. The WakeupException is wrapped inside of the ExecutionException. mockConstruction is a thread-lock mock; therefore, we need to free up the reference before completing the test. Otherwise, other tests will continue using the thread-lock mock. Reviewers: Lianet Magrans <lianetmr@gmail.com>, Jun Rao <junrao@gmail.com> commit 3b02e97b65e7636fea84da834be235f93df41aea Author: Taher Ghaleb <taher.a.ghaleb@gmail.com> Date: Tue Aug 29 12:27:20 2023 -0400 KAFKA-15403: Refactor @Test(expected) annotation with assertThrows (#14264) assertThrows makes the verification of exceptions clearer and more intuitive, thus improving code readability compared to the annotation approach. It is considered a test smell in the research literature. One possible research is due to developers not keeping up to date with recent versions of testing frameworks. All such patterns in streams have been refactored. Reviewers: vamossagar12 <sagarmeansocean@gmail.com>, Justine Olshan <jolshan@confluent.io> commit 0912ca27e2a229d2ebe02f4d1dabc40ed5fab0bb Author: Alyssa Huang <ahuang@confluent.io> Date: Mon Aug 28 23:47:22 2023 -0700 KRaft support for DescribeClusterRequestTest and DeleteConsumerGroupsTest (#14294) Reviewers: dengziming <dengziming1993@gmail.com>, mannoopj <mannoopj@users.noreply.github.com> commit 68b140cb56a208b951be84191d1e3f1eb0169882 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Mon Aug 28 20:23:17 2023 +0530 MINOR: Fix the TBRLMMRestart test. (#14297) Reviewers: Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org> commit 68b7031dc443b6f6b5dfac81316ab22fe250ec54 Author: David Jacot <djacot@confluent.io> Date: Mon Aug 28 07:02:56 2023 -0700 KAFKA-14499: [4/N] Implement OffsetFetch API (#14120) This patch implements the OffsetFetch API in the new group coordinator. I found out that implementing the `RequireStable` flag is hard (to not say impossible) in the current model. For the context, the flag is here to ensure that an OffsetRequest request does not return stale offsets if there are pending offsets to be committed. In the scala code, we basically check the pending offsets data structure and if they are any pending offsets, we return the `UNSTABLE_OFFSET_COMMIT` error. This tells the consumer to retry. In our new model, we don't have the pending offsets data structure. Instead, we use a timeline data structure to handle all the pending/uncommitted changes. Because of this we don't know whether offsets are pending for a particular group. Instead of doing this, I propose to not return the `UNSTABLE_OFFSET_COMMIT` error anymore. Instead, when `RequireStable` is set, we use a write operation to ensure that we read the latest offsets. If they are uncommitted offsets, the write operation ensures that the response is only return when they are committed. This gives a similar behaviour in the end. Reviewers: Justine Olshan <jolshan@confluent.io> commit 4590d565ef08e48411123aa1ecbb46bd8130a3de Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Mon Aug 28 15:59:50 2023 +0530 KAFKA-15399: Enable OffloadAndConsumeFromLeader test (#14285) Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org> commit 664f71b20712f7e4d1768d993c2b1ab109b6061c Author: Christo Lolov <lolovc@amazon.com> Date: Mon Aug 28 11:06:37 2023 +0100 KAFKA-14133: Move RecordCollectorTest, StateRestoreCallbackAdapterTest and StoreToProcessorContextAdapterTest to Mockito (#14210) Reviewers: Divij Vaidya <diviv@amazon.com> commit b41b2dfcf2f0f9e458374fb9b0842bcc8739f130 Author: Calvin Liu <83986057+CalvinConfluent@users.noreply.github.com> Date: Mon Aug 28 02:59:48 2023 -0700 KAFKA-15353: make sure AlterPartitionRequest.build() is idempotent (#14236) As described in https://issues.apache.org/jira/browse/KAFKA-15353 When the AlterPartitionRequest version is < 3 and its builder.build is called multiple times, both newIsrWithEpochs and newIsr will all be empty. This can happen if the sender retires on errors. Reviewers: Luke Chen <showuon@gmail.com> commit d869bf5b221a50b16e1ca42f1043adc79c85eae8 Author: iit2009060 <59436466+iit2009060@users.noreply.github.com> Date: Mon Aug 28 14:55:38 2023 +0530 KAFKA-15256: Adding reviewer as part of release announcement email template (#14288) Reviewers: Divij Vaidya <diviv@amazon.com>, Philip Nee <pnee@confluent.io> commit 180dcd396932a4033df2104aaf0038b852c4fc1b Author: Gantigmaa Selenge <39860586+tinaselenge@users.noreply.github.com> Date: Mon Aug 28 09:35:11 2023 +0100 KAFKA-15294: Publish remote storage configs (#14266) This change does the following: 1. Make RemoteLogManagerConfigs that are implemented public 2. Add tasks to generate html docs for the configs 3. Include config docs in the main site Reviewers: Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org> commit 3643039a9a43cbd7c2dfc4927004be7b32264fb4 Author: Tang Yunzi <34680857+shinyruoqaq@users.noreply.github.com> Date: Mon Aug 28 11:13:47 2023 +0800 MINOR: Fix incorrect comment in TopicDeletionManager.scala. (#14292) Fix incorrect comment in TopicDeletionManager.scala Reviewers: Luke Chen <showuon@gmail.com> commit ff3e6842ff99a600fc02e69ebefb09eef93decb3 Author: Abhijeet Kumar <abhijeet.cse.kgp@gmail.com> Date: Sat Aug 26 05:52:26 2023 +0530 KAFKA-15181: Wait for RemoteLogMetadataCache to initialize after assigning partitions (#14127) This PR adds the following changes to the `TopicBasedRemoteLogMetadataManager` 1. Added a guard in RemoteLogMetadataCache so that the incoming request can be served from the cache iff the corresponding user-topic-partition is initalized 2. Improve error handling in ConsumerTask thread so that is not killed when there are errors in reading the internal topic 3. ConsumerTask initialization should handle the case when there are no records to read and some other minor changes Added Unit Tests for the changes Co-authored-by: Kamal Chandraprakash <kamal.chandraprakash@gmail.com> Reviewers: Luke Chen <showuon@gmail.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org> commit f2d499e25a1ab8674dda7c6e5a05a12fb8687dbd Author: David Arthur <mumrah@gmail.com> Date: Fri Aug 25 13:41:43 2023 -0400 KAFKA-15389: Don't publish until we have replayed at least one record (#14282) When starting up a controller for the first time (i.e., with an empty log), it is possible for MetadataLoader to publish an empty MetadataImage before the activation records of the controller have been written. While this is not a bug, it could be confusing. This patch closes that gap by waiting for at least one controller record to be committed before the MetadataLoader starts publishing images. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit 5785796f985aa294c12e670da221d086a7fa887c Author: Maros Orsak <maros.orsak159@gmail.com> Date: Fri Aug 25 11:31:21 2023 +0200 MINOR: Add a few test cases to clients (#14211) Reviewers: Mickael Maison <mickael.maison@gmail.com> commit 30de2bb5efe7d163115dd358067c98b17c3a6730 Author: Mickael Maison <mimaison@users.noreply.github.com> Date: Fri Aug 25 10:34:17 2023 +0200 MINOR: Missing space in ProducerStateManager LogContext (#14275) Reviewers: Divij Vaidya <diviv@amazon.com> commit c9715a3485f2dbad1accbbf88c66af67febfe489 Author: Luke Chen <showuon@gmail.com> Date: Fri Aug 25 11:52:37 2023 +0800 MINOR: Use "add-exports" only when jdk >= 16 in minikdc (#14232) Use "add-exports" only when jdk >= 16 in minikdc Reviewers: Greg Harris <greg.harris@aiven.io> commit d4ab3ae85a35483c15f0ceb0002eb2d81ad79ad2 Author: Satish Duggana <satishd@apache.org> Date: Fri Aug 25 05:27:59 2023 +0530 KAFKA-14888: Added remote log segments retention mechanism based on time and size. (#13561) This change introduces a remote log segment segment retention cleanup mechanism. RemoteLogManager runs retention cleanup activity tasks on each leader replica. It assesses factors such as overall size and retention duration, subsequently removing qualified segments from remote storage. This process also involves adjusting the log-start-offset within the UnifiedLog accordingly. It also cleans up the segments which have epochs earlier than the earliest leader epoch in the current leader. Co-authored-by: Satish Duggana <satishd@apache.org> Co-authored-by: Kamal Chandraprakash <kamal.chandraprakash@gmail.com> Reviewers: Jun Rao <junrao@gmail.com>, Divij Vaidya <diviv@amazon.com, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Christo Lolov <lolovc@amazon.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>, Alexandre Dupriez <alexandre.dupriez@gmail.com>, Nikhil Ramakrishnan <ramakrishnan.nikhil@gmail.com> commit 9e3b1f9b9bf48603acde7f71c704af812a6dab4b Author: Satish Duggana <satishd@apache.org> Date: Fri Aug 25 05:03:38 2023 +0530 MINOR Bump trunk to 3.7.0-SNAPSHOT (#14286) Reviewers: Divij Vaidya <diviv@amazon.com> commit 880288879e8200cd22951e67a50df166edcc1b33 Author: Yash Mayya <yash.mayya@gmail.com> Date: Thu Aug 24 23:44:53 2023 +0100 KAFKA-15377: Don't expose externalized secret values in tasks-config API endpoint (#14244) Reviewers: Greg Harris <greg.harris@aiven.io> commit a3303b6112f658fd4fcd3b8ae1cdfd7c1e43db5b Author: Greg Harris <greg.harris@aiven.io> Date: Thu Aug 24 12:24:34 2023 -0700 KAFKA-15393: Improve shutdown behavior in MM2 integration tests (#14278) Reviewers: Yash Mayya <yash.mayya@gmail.com>, Chris Egerton <chrise@aiven.io> commit 8d12c1175cca5af27f79aa746c24bd998ff62345 Author: Phuc-Hong-Tran <44060007+Phuc-Hong-Tran@users.noreply.github.com> Date: Fri Aug 25 03:38:45 2023 +1000 KAFKA-15152: Fix incorrect format specifiers when formatting string (#14026) Reviewers: Divij Vaidya <diviv@amazon.com> Co-authored-by: phuchong.tran <phuchong.tran@servicenow.com> commit 45aae641a5a6632f7d1261b1880dbb8e73be1ff6 Author: Said Boudjelda <bmscomp@gmail.com> Date: Thu Aug 24 19:29:33 2023 +0200 MINOR: Upgrade version of Gradle entreprise to 3.14.1 & custom user data Gradle plugin to 1.11.1 (#14131) Reviewers: Divij Vaidya <diviv@amazon.com> commit 89aaecafae30fc2adc51d2fade89af4e9745598e Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Thu Aug 24 21:05:48 2023 +0530 KAFKA-15290: Handle topic-level dynamic remote storage enable configuration (#14238) * KAFKA-15290: Handle topic-level dynamic remote log storage enable configuration. To onboard existing topics to tiered storage, bootstrap the remote-log-components when updating the dynamic `remote.storage.enable` config on the topic. Reviewers: Christo Lolov <lolovc@amazon.com>, Divij Vaidya <diviv@amazon.com, Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org> commit 88d2c4460a1c8c8cf5dbcc9edb43f42fe898ca00 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Thu Aug 24 17:12:13 2023 +0530 KAFKA-15400: Use readLock when removing an item from the RemoteIndexCache (#14283) - Caffeine cache is thread safe, we want to hold the writeLock only during the close. - Fix the flaky tests Reviewers: Divij Vaidya <diviv@amazon.com> commit 25b128de81f826d0e0fe415acecf8b6d4cf837f4 Author: Mehari Beyene <132488287+mehbey@users.noreply.github.com> Date: Thu Aug 24 03:04:55 2023 -0700 KAFKA-14991: KIP-937-Improve message timestamp validation (#14135) This implementation introduces two new configurations `log.message.timestamp.before.max.ms` and `log.message.timestamp.after.max.ms` and deprecates `log.message.timestamp.difference.max.ms`. The default value for all these three configs is maintained to be Long.MAX_VALUE for backward compatibility but with the newly added configurations we can have a finer control when validating message timestamps that are in the past and the future compared to the broker's timestamp. To maintain backward compatibility if the default value of `log.message.timestamp.before.max.ms` is not changed, we are assuming users are still using the deprecated config `log.message.timestamp.difference.max.ms` and validation is done using its value. This ensures that existing customers who have customized the value of `log.message.timestamp.difference.max.ms` will continue to see no change in behavior. Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com> commit 87a30b73b561b205eeabae66aeba529a9307dfa0 Author: Okada Haruki <ocadaruma@gmail.com> Date: Thu Aug 24 17:24:01 2023 +0900 KAFKA-15391: Handle concurrent dir rename which makes log-dir to be offline unexpectedly (#14280) A race condition between async flush and segment rename (for deletion purpose) might cause the entire log directory to be marked offline when we delete a topic. This PR fixes the bug by ignoring NoSuchFileException when we flush a directory. Reviewers: Divij Vaidya <diviv@amazon.com> commit 9972297e510d74bd5dedbffe5dfb7a9f1c0a123f Author: olalamichelle <32348847+olalamichelle@users.noreply.github.com> Date: Wed Aug 23 21:59:16 2023 -0500 KAFKA-14780: Fix flaky test 'testSecondaryRefreshAfterElapsedDelay' (#14078) "The test RefreshingHttpsJwksTest#testSecondaryRefreshAfterElapsedDelay relies on the actual system clock, which makes it frequently fail. The fix adds a second constructor that allows for passing a ScheduledExecutorService to manually execute the scheduled tasks before refreshing. The fixed task is much more robust and stable. Co-authored-by: Fei Xie <feixie@MacBook-Pro.attlocal.net> Reviewers: Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com> commit 0017bb3db5476479a9cfde2ec667949b6b001451 Author: Yash Mayya <yash.mayya@gmail.com> Date: Wed Aug 23 08:54:20 2023 +0100 MINOR: Fix typo in stale PR message (#14274) commit 368bec85e874090b0032cb95f2f465309fbc60d5 Author: Divij Vaidya <diviv@amazon.com> Date: Wed Aug 23 09:31:58 2023 +0200 KAFKA-9926: Fix flaky PlaintextAdminIntegrationTest.testCreatePartitions (#14273) Reviewers: Luke Chen <showuon@gmail.com> commit e9f358eef6a5f48530ecfe0ca6fca689410b499c Author: Lianet Magrans <98415067+lianetm@users.noreply.github.com> Date: Wed Aug 23 00:43:06 2023 -0400 KAFKA-14937; [2/N]: Refactoring for client code to reduce boilerplate (#14218) This PR main refactoring relates to : 1. serializers/deserializers used in clients - unified in a Deserializers class 2. logic for configuring ClusterResourceListeners moved to ClientUtils 3. misc refactoring of the new async consumer in preparation for upcoming Request Managers Reviewers: Jun Rao <junrao@gmail.com> commit 8394ddc0d26399dc20ddff802886fa0b1f41f420 Author: Ron Dagostino <rndgstn@gmail.com> Date: Tue Aug 22 19:04:53 2023 -0400 MINOR: Move delegation token support to Metadata Version 3.6-IV2 (#14270) #14083 added support for delegation tokens in KRaft and attached that support to the existing MetadataVersion 3.6-IV1. This patch moves that support into a separate MetadataVersion 3.6-IV2. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit 064a14795ccaa4d50798252e7d9ef081311b5e59 Author: Yash Mayya <yash.mayya@gmail.com> Date: Tue Aug 22 17:26:29 2023 +0100 MINOR: Update OpenAPI summary documentation for Connect's /tasks endpoint (#14267) Reviewers: Mickael Maison <mickael.maison@gmail.com> commit 86afa416d203637f148c4cfe2c5ec38ebc31247e Author: Christo Lolov <lolovc@amazon.com> Date: Tue Aug 22 09:55:54 2023 +0100 KAFKA-14133: Move mocks from KStreamTransformValuesTest, KTableImplTest and KTableTransformValuesTest to Mockito (#14204) Reviewers: Divij Vaidya <diviv@amazon.com> commit 418b8a6e5939903c4b9561a2be7dd2fa8e29c271 Author: David Arthur <mumrah@gmail.com> Date: Mon Aug 21 19:02:14 2023 -0400 KAFKA-14538 Metadata transactions in MetadataLoader (#14208) This PR contains three main changes: - Support for transactions in MetadataLoader - Abort in-progress transaction during controller failover - Utilize transactions for ZK to KRaft migration A new MetadataBatchLoader class is added to decouple the loading of record batches from the publishing of metadata in MetadataLoader. Since a transaction can span across multiple batches (or multiple transactions could exist within one batch), some buffering of metadata updates was needed before publishing out to the MetadataPublishers. MetadataBatchLoader accumulates changes into a MetadataDelta, and uses a callback to publish to the publishers when needed. One small oddity with this approach is that since we can "splitting" batches in some cases, the number of bytes returned in the LogDeltaManifest has new semantics. The number of bytes included in a batch is now only included in the last metadata update that is published as a result of a batch. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit ad76497b12b0d81a98dc6e230e5516adffbfa0ee Author: Walker Carlson <18128741+wcarlson5@users.noreply.github.com> Date: Mon Aug 21 16:08:38 2023 -0500 KAFKA-14936: fix grace period partition issue (#14269) Move the store creation to builder pattern and recover mintimestamp Reviewers: John Roesler<vvcephei@apache.org>, Bill Bejeck <bbejeck@gmail.com> commit 6008af7468108869f74c63e591c45d27c6dfd17e Author: Ron Dagostino <rndgstn@gmail.com> Date: Mon Aug 21 13:02:32 2023 -0400 MINOR: Enable delegation token system test for KRaft (#14268) Reviewers: Colin P. McCabe <cmccabe@apache.org> commit 4b383378a0fd19d6d3c9ae7c2175fa3459661a04 Author: Justine Olshan <jolshan@confluent.io> Date: Mon Aug 21 08:44:00 2023 -0700 KAFKA-15380: Execute action queue after callback request (#14197) KIP-890 part 1 introduced the callback request type. It is used to execute a callback after KafkaApis.handle has returned. We did not account for tryCompleteActions at the end of handle when making this change. In tests, we saw produce p99 increase dramatically (likely because we have to wait for another request before we can complete DelayedProduce). As a result, we should add the tryCompleteActions after the callback as well. In testing, this improved the produce performance. Reviewers: Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io> commit eefa812453d5d5074506f5fc7642f585ac0c3545 Author: David Mao <47232755+splett2@users.noreply.github.com> Date: Sun Aug 20 15:16:27 2023 -0700 MINOR: Delete unused class, LogOffsetMetadata toString formatting (#14246) Noticed that there was a dangling unused class (LongRef, replaced by PrimitiveRef.LongRef), and the LogOffsetMetadata toString was a little oddly formatted. Reviewers: Justine Olshan <jolshan@confluent.io> commit 6492164d9c099ae3091cd508df98453c954b7e13 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Sun Aug 20 17:15:52 2023 +0530 KAFKA-15167: Tiered Storage Test Harness Framework (#14116) `TieredStorageTestHarness` is a base class for integration tests exercising the tiered storage functionality. This uses `LocalTieredStorage` instance as the second-tier storage system and `TopicBasedRemoteLogMetadataManager` as the remote log metadata manager. Co-authored-by: Alexandre Dupriez <alexandre.dupriez@gmail.com> Co-authored-by: Kamal Chandraprakash <kamal.chandraprakash@gmail.com> commit c2759df0676cef252596239baf8f1f361e76c49f Author: Proven Provenzano <93720617+pprovenzano@users.noreply.github.com> Date: Sat Aug 19 14:01:08 2023 -0400 KAFKA-15219: KRaft support for DelegationTokens (#14083) Reviewers: David Arthur <mumrah@gmail.com>, Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Viktor Somogyi <viktor.somogyi@cloudera.com> commit 05c329e61fbfddc1f972403fbec0a24c45d4e173 Author: Bruno Cadonna <cadonna@apache.org> Date: Sat Aug 19 19:00:23 2023 +0200 KAFKA-10199: Change to RUNNING if no pending task to init exist (#14249) A stream thread should only change to RUNNING if there are no active tasks in restoration in the state updater and if there are no pending tasks to recycle and to init. Usually all pending tasks to init are added to the state updater in the same poll iteration that handles the assignment. However, if during an initialization of a task a LockException the task is re-added to the tasks to init and initialization is retried in the next poll iteration. A LockException might occur when a state directory is still locked by another thread, when the rebalance just happened. Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Walker Carlson <wcarlson@confluent.io> commit 4c7e0a9fa66fd1becc590d6060228b0305dd400b Author: Bruno Cadonna <cadonna@apache.org> Date: Sat Aug 19 12:13:30 2023 +0200 MINOR: Decouple purging committed records from committing (#14227) Currently, Kafka Streams only tries to purge records whose offset are committed from a repartition topic when at least one offset was committed in the current commit. The coupling between committing some offsets and purging records is not needed and might delay purging of records. For example, if a in-flight call for purging records has not completed yet when a commit happens, a new call is not issued. If then the earlier in-flight call for purging records finally completes but the next commit does not commit any offsets, Streams does not issue the call for purging records whose offset were committed in the previous commit because the purging call was still in-flight. This change issues calls for purging records during any commit if the purge interval passed, even if no offsets were committed in the current commit. Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Walker Carlson <wcarlson@confluent.io> commit d0b7677c2c9a8265643a90175f86987521615849 Author: Walker Carlson <18128741+wcarlson5@users.noreply.github.com> Date: Fri Aug 18 22:00:04 2023 -0500 KAFKA-14936: Add restore logic (3/N) (#14027) Added restore logic for the buffer in grace period joins. Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bbejeck@gmail.com> commit 82ae77f94566881d957929acbdc3197e60f7f5f8 Author: Greg Harris <greg.harris@aiven.io> Date: Fri Aug 18 15:28:43 2023 -0700 KAFKA-15226: Add connect-plugin-path and plugin.discovery system test (#14230) Reviewers: Chris Egerton <chrise@aiven.io> commit b36cf4ef977fb14bc57683630a9f3f3680705550 Author: Matthias J. Sax <matthias@confluent.io> Date: Fri Aug 18 11:06:08 2023 -0700 HOTIFX: fix Kafka Streams upgrade path from 3.4 to 3.5 (#14103) KIP-904 introduced a backward incompatible change that requires a 2-bounce rolling upgrade. The new "3.4" upgrade config value is not recognized by `AssignorConfiguration` though and thus crashed Kafka Streams if use. Reviewers: Farooq Qaiser <fqaiser94@gmail.com>, Bruno Cadonna <bruno@confluent.io> commit 3ad5f42f595e4eef3a5787ce34e2676ed58bee3d Author: David Arthur <mumrah@gmail.com> Date: Fri Aug 18 12:44:01 2023 -0400 Handle case of default broker in config migration (#14237) When collecting the set of broker IDs during the migration, don't try to parse the default broker resource `""` as a broker ID. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit ee036ed9effe6d38ee737e9360d3dd256a74fa9f Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Fri Aug 18 18:31:27 2023 +0200 KAFKA-15319: Upgrade rocksdb to fix CVE-2022-37434 (#14216) Rocksdbjni<7.9.2 is vulnerable to CVE-2022-37434 due to zlib 1.2.12 Reviewers: Divij Vaidya <diviv@amazon.com>, Bruno Cadonna <cadonna@apache.org> commit 4f88fb28f38f3e461377bf688520b28a3f209b5d Author: DL1231 <53332773+DL1231@users.noreply.github.com> Date: Fri Aug 18 20:51:09 2023 +0800 KAFKA-15130: Delete remote segments when deleting a topic (#13947) * Delete remote segments when deleting a topic Co-authored-by: Kamal Chandraprakash <kchandraprakash@uber.com> Co-authored-by: d00791190 <dinglan6@huawei.com> commit 3f4816dd3eafaf1a0636d3ee689069f897c99e28 Author: José Armando GarcÃa Sancio <jsancio@users.noreply.github.com> Date: Thu Aug 17 18:40:17 2023 -0700 KAFKA-15345; KRaft leader notifies leadership when listener reaches epoch start (#14213) In a non-empty log the KRaft leader only notifies the listener of leadership when it has read to the leader's epoch start offset. This guarantees that the leader epoch has been committed and that the listener has read all committed offsets/records. Unfortunately, the KRaft leader doesn't do this when the log is empty. When the log is empty the listener is notified immediately when it has become leader. This makes the API inconsistent and harder to program against. This change fixes that by having the KRaft leader wait for the listener's nextOffset to be greater than the leader's epochStartOffset before calling handleLeaderChange. The RecordsBatchReader implementation is also changed to include control records. This makes it possible for the state machine learn about committed control records. This additional information can be used to compute the committed offset or for counting those bytes when determining when to snapshot the partition. Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jason Gustafson <jason@confluent.io> commit 7802c264c96ae27167cf38c263b86398aa0ea3fe Author: Yash Mayya <yash.mayya@gmail.com> Date: Thu Aug 17 19:13:53 2023 +0100 MINOR: Allow writing tombstone offsets for arbitrary partitions in the FileStreamSourceConnector (#14234) Reviewers: Chris Egerton <chrise@aiven.io> commit a253dc6643bb18b123a3832f88b0d97965f38068 Author: Chris Egerton <chrise@aiven.io> Date: Thu Aug 17 14:11:01 2023 -0400 KAFKA-15102: Add release notes about the replication.policy.internal.topic.separator.enabled property for MirrorMaker 2 (#14220) Reviewers: Greg Harris <greg.harris@aiven.io> commit d85a70081333a2ab9dd6593e99abf213a469ba2d Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Thu Aug 17 19:53:58 2023 +0200 MINOR: Do not reuse admin client across tests (#14225) Reusing an admin client across tests can cause false positives in leak checkers, so don't do it. Reviewers: Divij Vaidya <diviv@amazon.com>, Matthias J. Sax <matthias@confluent.io> commit de409b389d26f7681fba8583db2b96584258aa48 Author: Chris Egerton <chrise@aiven.io> Date: Thu Aug 17 09:33:59 2023 -0400 KAFKA-15177: Implement KIP-875 SourceConnector::alterOffset API in MirrorMaker 2 connectors (#14005) Reviewers: Yash Mayya <yash.mayya@gmail.com>, Greg Harris <greg.harris@aiven.io> commit 6bd17419b76f8cf8d7e4a11c071494dfaa72cd50 Author: Greg Harris <greg.harris@aiven.io> Date: Wed Aug 16 11:37:33 2023 -0700 KAFKA-15228: Add sync-manifests command to connect-plugin-path (KIP-898) (#14195) Reviewers: Chris Egerton <chrise@aiven.io> commit a9efca0bf63110d68f84fc9841d8a31f245e10e0 Author: Greg Harris <greg.harris@aiven.io> Date: Wed Aug 16 10:30:24 2023 -0700 KAFKA-14759: Move Mock, Schema, and Verifiable connectors to new test-plugins module (#13302) Reviewers: Hector Geraldino <hgeraldino@gmail.com>, Chris Egerton <chrise@aiven.io> commit f970ddff10114ae7f9ea1b9966cd25ac8bb5f581 Author: Gantigmaa Selenge <39860586+tinaselenge@users.noreply.github.com> Date: Wed Aug 16 12:00:39 2023 +0100 KAFKA-15210: Mention vote should be open for at atleast 72 hours in the release script (#14183) Reviewers: Divij Vaidya <diviv@amazon.com> commit d0e9e94629c1347b19a2992b67d8590e61a59d04 Author: Christo Lolov <lolovc@amazon.com> Date: Wed Aug 16 09:19:35 2023 +0100 KAFKA-14133: Migrate ActiveTaskCreatorTest, ChangelogTopicsTest and GlobalProcessorContextImplTest to Mockito (#14209) Reviewers: Divij Vaidya <diviv@amazon.com> commit cfe49d1b7782e4adde409638c4a4152b22bd1ce5 Author: bachmanity1 <81428651+bachmanity1@users.noreply.github.com> Date: Wed Aug 16 17:01:49 2023 +0900 KAFKA-7438: Replace EasyMock with Mockito in SessionStoreBuilderTest (#14142) Reviewers: Divij Vaidya <diviv@amazon.com>, Yash Mayya <yash.mayya@gmail.com> commit 1a15cd708ab252e7c75dea06f36db36599f18322 Author: Christo Lolov <lolovc@amazon.com> Date: Wed Aug 16 08:46:40 2023 +0100 KAFKA-14133: Migrato SessionCacheFlushListenerTest, TimestampedCacheFlushListenerTest and TimestampedTupleForwarderTest to Mockito (#14205) Reviewers: Divij Vaidya <diviv@amazon.com> commit ee27773549646602b955343dba680efaa5227286 Author: vamossagar12 <sagarmeansocean@gmail.com> Date: Wed Aug 16 07:16:17 2023 +0530 KAFKA-15329: Make default remote.log.metadata.manager.class.name as topic based RLMM (#14202) As described in the KIP here the default value of remote.log.metadata.manager.class.name should be TopicBasedRemoteLogMetadataManager Reviewers: Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kchandraprakash@uber.com>, Divij Vaidya <diviv@amazon.com> commit 35e925f3535e7774520317310505fcde946228d5 Author: Omnia G.H Ibrahim <o.g.h.ibrahim@gmail.com> Date: Wed Aug 16 00:58:52 2023 +0100 KAFKA-15102: Add replication.policy.internal.topic.separator.enabled property to MirrorMaker 2 (KIP-949) (#14082) Reviewers: Chris Egerton <chrise@aiven.io> commit b97e8203eb1fb016cd7cccd3dbf5fecc716969be Author: Philip Nee <pnee@confluent.io> Date: Tue Aug 15 15:01:28 2023 -0700 MINOR: CommitRequestManager should only poll when the coordinator node is known (#14179) As title, we discovered a flaky bug during testing that the commit request manager would seldomly throw a NOT_COORDINATOR exception, which means the request was routed to a non-coordinator node. We discovered that if we don't check the coordinator node in the commitRequestManager, the request manager will pass on an empty node to the NetworkClientDelegate, which implies the request can be sent to any node in the cluster. This behavior is incorrect as the commit requests need to be routed to a coordinator node. Because the timing coordinator's discovery during integration testing isn't entirely deterministic; therefore, the test became extremely flaky. After fixing this: The coordinator node is mandatory before attempt to enqueue these commit request to the NetworkClient. Reviewers: Jun Rao <junrao@gmail.com> commit 28858f3a3e2e6a8499944344b270bf33ca116503 Author: Greg Harris <greg.harris@aiven.io> Date: Tue Aug 15 14:24:48 2023 -0700 MINOR: Fix SynchronizationTest classloaders sometimes not being parallel capable (#14177) Reviewers: Chris Egerton <chrise@aiven.io> commit 1a001c1e88e3e32b34a5c52f257de722da95a736 Author: Greg Harris <greg.harris@aiven.io> Date: Tue Aug 15 13:21:45 2023 -0700 KAFKA-15336: Add ServiceLoader Javadocs for Connect plugins (#14194) Reviewers: Chris Egerton <chrise@aiven.io> commit 0a531b7e7cc614b273092a6d28b1b8c72bb93ea9 Author: David Arthur <mumrah@gmail.com> Date: Tue Aug 15 16:01:31 2023 -0400 MINOR: Install ControllerServer metadata publishers sooner (#14215) This patch is a follow up of #14169 that installs the metadata publishers before blocking on the authorizer future. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit c199840f0add1c191c4333dc58baae49594734a8 Author: David Arthur <mumrah@gmail.com> Date: Sat Aug 12 09:14:17 2023 -0400 MINOR: Fix the ZkMigrationState metric in KafkaController This patch fixes an issue for ZK controllers where we were emitting the ZkMigrationState enum rather than a value. This can lead to downstream failures with JMX metrics since the RMI protocol will marshal the ZkMigrationState object returned by the gauge. Any downstream consumer of this metric (like jconsole or a metrics exporter) will not be able to unmarshal the value since the ZkMigrationState class will not be present. The fix is simply to emit the byte value of this enum. Reviewers: Colin P. McCabe <cmccabe@apache.org>, Alok Thatikunta <athatikunta@confluent.io> commit 696a56dd2ba54beed41748ef8cce958dadd6b375 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Wed Aug 16 00:13:11 2023 +0530 KAFKA-15295: Add config validation when remote storage is enabled on a topic (#14176) Add config validation which verifies that system level remote storage is enabled when enabling remote storage for a topic. In case verification fails, it throws InvalidConfigurationException. Reviewers: Christo Lolov <lolovc@amazon.com>, Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com> commit fd6c9f16bacf26261fee2755c1d3679b9703a8a0 Author: bachmanity1 <81428651+bachmanity1@users.noreply.github.com> Date: Tue Aug 15 18:48:13 2023 +0900 KAFKA-7438: Replace Easymock & Powermock with Mockito in RocksDBMetricsRecorderGaugesTest (#14190) Reviewers: Christo Lolov <christololov@gmail.com>, Divij Vaidya <diviv@amazon.com> commit adc16d0f310dc1350bca66d5da013d599b990cfa Author: Colin Patrick McCabe <cmccabe@apache.org> Date: Mon Aug 14 16:58:56 2023 -0700 KAFKA-14538: Implement KRaft metadata transactions in QuorumController Implement the QuorumController side of KRaft metadata transactions. As specified in KIP-868, this PR creates a new metadata version, IBP_3_6_IV1, which contains the three new records: AbortTransactionRecord, BeginTransactionRecord, EndTransactionRecord. In order to make offset management unit-testable, this PR moves it out of QuorumController.java and into OffsetControlManager.java. The general approach here is to track the "last stable offset," which is calculated by looking at the latest committed offset and the in-progress transaction (if any). When a transaction is aborted, we revert back to this last stable offset. We also revert back to it when the controller is transitioning from active to inactive. In a follow-up PR, we will add support for the transaction records in MetadataLoader. We will also add support for automatically aborting pending transactions after a controller failover. Reviewers: David Arthur <mumrah@gmail.com> commit 5a67b080c7d7814c38a83a25ab951e65743fae81 Author: Calvin Liu <83986057+CalvinConfluent@users.noreply.github.com> Date: Mon Aug 14 16:14:29 2023 -0700 MINOR: Fix a race when query isUnderMinIsr (#14138) When the leader becomes the follower, we first remove the ISR and then reset the leader. If we call isUnderMinIsr in between, we will get an answer with true which is a race bug. Reviewers: Justine Olshan <jolshan@confluent.io> commit 67b527460e155f1f5e850bb7fd65c7c373367b48 Author: Kirk True <kirk@kirktrue.pro> Date: Mon Aug 14 10:08:20 2023 -0700 KAFKA-14937: Refactoring for client code to reduce boilerplate (#13990) Move common code from the client implementations to the ClientUtils class or (consumer) Utils class, where passible. There are a number of places in the client code where the same basic calls are made by more than one client implementation. Minor refactoring will reduce the amount of boilerplate code necessary for the client to construct its internal state. Reviewers: Lianet Magrans <lianetmr@gmail.com>, Jun Rao <junrao@gmail.com> commit 5234ddff5025501be2b4fca3ecba4e4eb584bbc5 Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Mon Aug 14 17:17:28 2023 +0200 KAFKA-15326: [5/N] Processing thread punctuation (#14001) Implements punctuation inside processing threads. The scheduler algorithm checks if a task that is not assigned currently can be punctuated, and returns it when a worker thread asks for the next task to be processed. Then, the processing thread runs all punctuations in the punctionation queue. Piggy-backed: take TaskExecutionMetadata into account when processing records. Reviewer: Bruno Cadonna <cadonna@apache.org> commit 43751d8d0521b1440a823a9430fdb0659ce7c436 Author: vveicc <vveicc@163.com> Date: Mon Aug 14 17:04:15 2023 +0800 KAFKA-15289: Support KRaft mode in RequestQuotaTest (#14201) Enable kraft mode for RequestQuotaTest, there are 2 works left to be done. Reviewers: dengziming <dengziming1993@gmail.com> commit d91c9bd2b594719cee629b6c057204fb0de0d1a2 Author: Chris Egerton <chrise@aiven.io> Date: Sat Aug 12 16:52:49 2023 -0400 KAFKA-14682: Report Mockito unused stubbings during Jenkins build (#14186) * KAFKA-14682: Report Mockito unused stubbings during Jenkins build * DO NOT MERGE: Add test case that should fail during Jenkins build * Revert "DO NOT MERGE: Add test case that should fail during Jenkins build" This reverts commit 8418b835ecb49fa10da04c7a997c7e982a8c4a47. commit ae46c0a34c90e3954b47ead2dc057e0bbf186a1f Author: bachmanity1 <81428651+bachmanity1@users.noreply.github.com> Date: Sat Aug 12 17:41:07 2023 +0900 KAFKA-7438: Replace Easymock & Powermock with Mockito in TableSourceNodeTest (#14189) Reviewers: Divij Vaidya <diviv@amazon.com> commit 3a94670a013caa57b991caa47037d5c9bfdbf20a Author: Hao Li <1127478+lihaosky@users.noreply.github.com> Date: Sat Aug 12 01:38:07 2023 -0700 MINOR: Fix streams task assignor tests (#14196) Reviewers: Divij Vaidya <diviv@amazon.com> commit cfb45b000140729aabc34953a80bc8dc4c9a08d3 Author: Rittika Adhikari <rittika.adhikari@gmail.com> Date: Fri Aug 11 14:15:17 2023 -0700 MINOR: Refactor TierStateMachine related tests into a separate test file (#13503) This PR builds off of KAFKA-14685 and refactors any tests explicitly related to ReplicaFetcherTierStateMachine into a separate testing file ReplicaFetcherTierStateMachineTest. Reviewers: Jun Rao <junrao@gmail.com> commit f6b8b39747112bd8f3418893af760d42a1635648 Author: Philip Nee <pnee@confluent.io> Date: Fri Aug 11 13:15:30 2023 -0700 MINOR: Fix committed API in the PrototypeAsyncConsumer timeout (#14123) Discovered the committed() API timeout during the integration test. After investigation, this is because the future was not completed in the ApplicationEventProcessor. Also added toString methods to the event class for debug purposes. Reviewers: Jun Rao <junrao@gmail.com> commit f5655d31d3d527dae057240570162827c6a79fb2 Author: Greg Harris <greg.harris@aiven.io> Date: Fri Aug 11 12:05:51 2023 -0700 KAFKA-15030: Add connect-plugin-path command-line tool (#14064) Reviewers: Chris Egerton <chrise@aiven.io> commit 1e747a24a361eb4ca258eab3acd07fa0260e19d4 Author: Florin Akermann <florin.akermann@gmail.com> Date: Fri Aug 11 20:45:18 2023 +0200 KAFKA-13197: fix GlobalKTable join/left-join semantics documentation. (#14…
commit cfa850957061b919bcb078d50a7e2e36ceab43c6 Author: Ritika Reddy <rreddy@confluent.io> Date: Thu Sep 7 15:49:01 2023 -0700 Removed compute unassigned partitions and minors, updated javadocs to make them more readable. commit 9cfe4fa67dea5754e3bc19e46a1a4df22e1a0801 Author: Ritika Reddy <rreddy@confluent.io> Date: Thu Sep 7 00:07:36 2023 -0700 Removed compute unassigned partitions and minors commit 62f876f2b81769598e25d8e9da6bb18f59f31008 Author: Ritika Reddy <rreddy@confluent.io> Date: Wed Sep 6 11:27:34 2023 -0700 Revert "Squashed commit of the following:" This reverts commit 02fc35ea2c4564d9da5f42d5989deaf283d54122. commit 158f7932e7ba152639e5643f84e055065eee3269 Author: Ritika Reddy <rreddy@confluent.io> Date: Wed Sep 6 11:26:40 2023 -0700 Change list to collection commit 02fc35ea2c4564d9da5f42d5989deaf283d54122 Author: Ritika Reddy <rreddy@confluent.io> Date: Wed Sep 6 11:25:29 2023 -0700 Squashed commit of the following: commit eb39c95080b994398c40bcf5d54181e713ed6faa Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Wed Sep 6 14:49:48 2023 +0200 MINOR: StoreChangelogReaderTest fails with log-level DEBUG (#14300) A mocked method is executed unexpectedly when we enable DEBUG log level, leading to confusing test failures during debugging. Since the log message itself seems useful, we adapt the test to take the additional mocked method call into account). Reviewer: Bruno Cadonna <cadonna@apache.org> commit cc289d04c701a59f571683b908f778e0b236d72f Author: Ritika Reddy <98577846+rreddy-22@users.noreply.github.com> Date: Wed Sep 6 00:28:23 2023 -0700 MINOR: Fix trailing white spaces on reviewers.py (#14343) Fixing trailing white spaces on reviewers.py. Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io> commit 7054625c45dc6edb3c07271fe4a6c24b4638424f Author: David Jacot <djacot@confluent.io> Date: Tue Sep 5 23:36:38 2023 -0700 KAFKA-14499: [6/N] Add MemberId and MemberEpoch to OffsetFetchRequest (#14321) This patch adds the MemberId and the MemberEpoch fields to the OffsetFetchRequest. Those fields will be populated when the new consumer group protocol is used to ensure that the member fetching the offset has the correct member id and epoch. If it does not, UNKNOWN_MEMBER_ID or STALE_MEMBER_EPOCH are returned to the client. Our initial idea was to implement the same for the old protocol. The field is called GenerationIdOrMemberEpoch in KIP-848 to materialize this. As a second though, I think that we should only do it for the new protocol. The effort to implement it in the old protocol is not worth it in my opinion. Reviewers: Ritika Reddy <rreddy@confluent.io>, Calvin Liu <caliu@confluent.io>, Justine Olshan <jolshan@confluent.io> commit 80982c4ae3fe6be127b48ec09caff11ab5f87c69 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Wed Sep 6 05:50:12 2023 +0530 KAFKA-15410: Delete topic integration test with LocalTieredStorage and TBRLMM (3/4) (#14329) Added delete topic integration tests for tiered storage enabled topics with LocalTieredStorage and TBRLMM Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com> commit b49013b73efa25466652d8d8122974e60c927ec4 Author: Andrew Schofield <aschofield@confluent.io> Date: Tue Sep 5 19:57:51 2023 +0100 KAFKA-9800: Exponential backoff for Kafka clients - KIP-580 (#14111) Implementation of KIP-580 to add exponential back-off to situations in which retry.backoff.ms is used to delay backoff attempts. This KIP adds exponential backoff behavior with a maximum controlled by a new config retry.backoff.max.ms, together with a +/- 20% of jitter to spread the retry attempts of the client fleet. Reviewers: Mayank Shekhar Narula <mayanks.narula@gmail.com>, Milind Luthra <i.milind.luthra@gmail.com>, Kirk True <kirk@mustardgrain.com>, Jun Rao<junrao@gmail.com> commit 1f473ebb5ea9ad4ebdfdc99051864cce6d80db87 Author: Yash Mayya <yash.mayya@gmail.com> Date: Tue Sep 5 19:39:49 2023 +0100 KAFKA-14876: Add stopped state to Kafka Connect Administration docs section (#14336) Reviewers: Chris Egerton <chrise@aiven.io> commit 79598b49d6fff9bef686500f46a288b61a9013fd Author: Yash Mayya <yash.mayya@gmail.com> Date: Tue Sep 5 18:58:44 2023 +0100 MINOR: Update the documentation's table of contents to add missing headings for Kafka Connect (#14337) Reviewers: Chris Egerton <chrise@aiven.io> commit 37a51e286d5aaa890439e074e9f781ec26aaef2e Author: Abhijeet Kumar <abhijeet.cse.kgp@gmail.com> Date: Tue Sep 5 22:19:10 2023 +0530 KAFKA-15293 Added documentation for tiered storage metrics (#14331) Reviewers: Divij Vaidya <diviv@amazon.com>, Satish Duggana <satishd@apache.org> commit be0de2124a1bb6363355aa792be618ddbb87e460 Author: Luke Chen <showuon@gmail.com> Date: Wed Sep 6 00:06:28 2023 +0800 MINOR: Update comment in consumeAction (#14335) Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com> commit 9f2ac375c282e1471a2d385704e1f7c128f34bb6 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Tue Sep 5 19:28:17 2023 +0530 KAFKA-15410: Reassign replica expand, move and shrink integration tests (2/4) (#14328) - Updated the log-start-offset to the correct value while building the replica state in ReplicaFetcherTierStateMachine#buildRemoteLogAuxState Integration tests added: 1. ReassignReplicaExpandTest 2. ReassignReplicaMoveTest and 3. ReassignReplicaShrinkTest Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com> commit b892acae5e026e1affd51ef9756772807674b964 Author: Justine Olshan <jolshan@confluent.io> Date: Mon Sep 4 20:40:50 2023 -0700 KAFKA-15424: Make the transaction verification a dynamic configuration (#14324) This will allow enabling and disabling transaction verification (KIP-890 part 1) without having to roll the cluster. Tested that restarting the cluster persists the configuration. If a verification is disabled/enabled while we have an inflight request, depending on the step of the process, the change may or may not be seen in the inflight request (enabling will typically fail unverified requests, but we may still verify and reject when we first disable) Subsequent requests/retries will behave as expected for verification. Sequence checks will continue to take place after disabling until the first message is written to the partition (thus clearing the verification entry with the tentative sequence) or the broker restarts/partition is reassigned which will clear the memory. On enabling, we will only track sequences that for requests received after the verification is enabled. Reviewers: Jason Gustafson <jason@confluent.io>, Satish Duggana <satishd@apache.org> commit caaa4c55fee68c5893d54ffe84287f3b5205fff1 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Tue Sep 5 05:13:16 2023 +0530 KAFKA-15410: Expand partitions, segment deletion by retention and enable remote log on topic integration tests (1/4) (#14307) Added the below integration tests with tiered storage - PartitionsExpandTest - DeleteSegmentsByRetentionSizeTest - DeleteSegmentsByRetentionTimeTest and - EnableRemoteLogOnTopicTest - Enabled the test for both ZK and Kraft modes. These are enabled for both ZK and Kraft modes. Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com>, Divij Vaidya <diviv@amazon.com> commit d34d84dbef20559e68c899315a0915e9dd740cb0 Author: Yash Mayya <yash.mayya@gmail.com> Date: Mon Sep 4 12:54:18 2023 +0100 KAFKA-7438: Migrate WindowStoreBuilderTest from EasyMock to Mockito (#14152) Reviewers: Divij Vaidya <diviv@amazon.com> commit 7a516b03863a00aa0cebfb376171a31a92d26f8c Author: Christo Lolov <lolovc@amazon.com> Date: Mon Sep 4 11:58:50 2023 +0100 KAFKA-14133: Move AbstractStreamTest and RocksDBMetricsRecordingTriggerTest to Mockito (#14223) Reviewers: Divij Vaidya <diviv@amazon.com> commit 78c59cd2b0b5f21c2028021d9dfb72d21065bb00 Author: Dimitar Dimitrov <30328539+dimitarndimitrov@users.noreply.github.com> Date: Mon Sep 4 11:02:32 2023 +0200 KAFKA-15052 Fix the flaky testBalancePartitionLeaders - part II (#13908) A follow-up to https://github.com/apache/kafka/pull/13804. This follow-up adds the alternative fix approach mentioned in the PR above - bumping the session timeout used in the test with 1 second. Reproducing the flake-out locally has been much harder than on the CI runs, as neither Gradle with Java 11 or Java 14 nor IntelliJ with Java 14 could show it, but IntelliJ with Java 11 could occasionally reproduce the failure the first time immediately after a rebuild. While I was unable to see the failure with the bumped session timeout, the testing procedure definitely didn't provide sufficient reassurance for the fix as even without it often I'd see hundreds of consecutive successful test runs when the first run didn't fail. Reviewers: Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com> commit a6409e8e61fec1be74a9b404d2750db15d1de40a Author: Proven Provenzano <93720617+pprovenzano@users.noreply.github.com> Date: Mon Sep 4 04:46:12 2023 -0400 KAFKA-15422: Update documenttion for delegation tokens when working with Kafka with KRaft (#14318) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com> commit 5074c8038e44620b48d7700226810b983febd864 Author: Abhijeet Kumar <abhijeet.cse.kgp@gmail.com> Date: Mon Sep 4 09:13:04 2023 +0530 KAFKA-15260: RLM Task should handle uninitialized RLMM for the associated topic-parititon (#14113) This change is about RLM task handling retriable exception when it tries to copy segments to remote but the RLMM is not yet initialized. On encountering the exception, we log the error and throw the exception back to the caller. We also make sure that the failure metrics are updated since this is a temporary error because RLMM is not yet initialized. Added unit tests to verify RLM task does not attempt to copy segments to remote on encountering the retriable exception and that failure metrics remain unchanged. Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com> commit da99879df7bd96675e4abbd50d4e504dc07df574 Author: Luke Chen <showuon@gmail.com> Date: Sun Sep 3 16:16:54 2023 +0800 KAFKA-15421: fix network thread leak in testThreadPoolResize (#14320) In SocketServerTest, we create SocketServer and enableRequestProcessing on each test class initialization. That's fine since we shutdown it in @AfterEach. The issue we have is we disabled 2 tests in this test suite. And when running these disabled tests, we will go through class initialization, but without @AfterEach. That causes 2 network thread leaked. Compared the error message in DynamicBrokerReconfigurationTest#testThreadPoolResize test here: org.opentest4j.AssertionFailedError: Invalid threads: expected 6, got 8: List(data-plane-kafka-socket-acceptor-ListenerName(INTERNAL)-SSL-0, data-plane-kafka-socket-acceptor-ListenerName(PLAINTEXT)-PLAINTEXT-0, data-plane-kafka-socket-acceptor-ListenerName(INTERNAL)-SSL-0, data-plane-kafka-socket-acceptor-ListenerName(EXTERNAL)-SASL_SSL-0, data-plane-kafka-socket-acceptor-ListenerName(INTERNAL)-SSL-0, data-plane-kafka-socket-acceptor-ListenerName(EXTERNAL)-SASL_SSL-0, data-plane-kafka-socket-acceptor-ListenerName(PLAINTEXT)-PLAINTEXT-0, data-plane-kafka-socket-acceptor-ListenerName(EXTERNAL)-SASL_SSL-0) ==> expected: <true> but was: <false> The 2 unexpected network threads are leaked from SocketServerTest. Reviewers: Satish Duggana <satishd@apache.org>, Christo Lolov <lolovc@amazon.com>, Divij Vaidya <diviv@amazon.com>, Kamal Chandraprakash <kchandraprakash@uber.com>, Chris Egerton <chrise@aiven.io> commit cc53889aaae8371964f9734a30da570afd0b7916 Author: Rohan <desai.p.rohan@gmail.com> Date: Sat Sep 2 18:14:14 2023 -0700 KAFKA-15429: reset transactionInFlight on StreamsProducer close (#14326) Resets the value of transactionInFlight to false when closing the StreamsProducer. This ensures we don't try to commit against a closed producer Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org> commit d293cd0735e050325eb7d2eafec435ec8aec92eb Author: Rohan <desai.p.rohan@gmail.com> Date: Sat Sep 2 18:13:16 2023 -0700 KAFKA-15429: catch+log errors from unsubscribe in streamthread shutdown (#14325) Preliminary fix for KAFKA-15429 which updates StreamThread.completeShutdown to catch-and-log errors from consumer.unsubscribe. Though this does not prevent the exception, it does preserve the original exception that caused the stream thread to exit. Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org> commit 1bb8c11f5aa07709ce1b1b6ef684a6750242d4b0 Author: Lianet Magrans <98415067+lianetm@users.noreply.github.com> Date: Fri Sep 1 16:57:17 2023 -0400 KAFKA-14965 - OffsetsRequestsManager implementation & API integration (#14308) Implementation of the OffsetRequestsManager, responsible for building requests and processing responses for requests related to partition offsets. In this PR, the manager includes support for ListOffset requests, generated when the user makes any of the following consumer API calls: beginningOffsets endOffsets offsetsForTimes All previous consumer API calls interact with the OffsetsRequestsManager by generating a ListOffsetsApplicationEvent. Includes tests to cover the new functionality and to ensure no API level changes are introduced. This covers KAFKA-14965 and KAFKA-15081. Reviewers: Philip Nee <pnee@confluent.io>, Kirk True <kirk@mustardgrain.com>, Jun Rao<junrao@gmail.com> commit 134f6c07a48219d2b54a6fed38ecb576af2f7cf3 Author: Christo Lolov <lolovc@amazon.com> Date: Fri Sep 1 18:42:57 2023 +0100 KAFKA-15427: Fix resource leak in integration tests for tiered storage (#14319) Co-authored-by: Nikhil Ramakrishnan <nikrmk@amazon.com> Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com> commit 6391c6603582a4e4b5bb670233c345e53f82a77b Author: Jeff Kim <kimkb2011@gmail.com> Date: Fri Sep 1 09:36:33 2023 -0400 KAFKA-14500; [7/7] Refactor GroupMetadataManagerTest (#14122) This patch makes the styling consistent inside GroupMetadataManagerTest. Also, it adds JoinResult to simplify the JoinGroup API responses in the tests. Reviewers: David Arthur <mumrah@gmail.com>, David Jacot <djacot@confluent.io> commit dcff0878c48803e2d68f7e43c1e73735b643ace0 Author: David Jacot <djacot@confluent.io> Date: Fri Sep 1 03:45:24 2023 -0700 KAFKA-14499: [5/N] Refactor GroupCoordinator.fetchOffsets and GroupCoordinator.fetchAllOffsets (#14310) This patch refactors the GroupCoordinator.fetchOffsets and GroupCoordinator.fetchAllOffsets methods to take an OffsetFetchRequestGroup and to return an OffsetFetchResponseGroup. It prepares the ground for adding the member id and the member epoch to the OffsetFetchRequest. This change also makes those two methods more aligned with the others in the interface. Reviewers: Calvin Liu <caliu@confluent.io>, Justine Olshan <jolshan@confluent.io> commit d0f3cf1f9fa53b7d8663ca23bf42b5c6847e07ab Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Fri Sep 1 06:33:33 2023 +0530 KAFKA-15351: Ensure log-start-offset not updated to local-log-start-offset when remote storage enabled (#14301) When tiered storage is enabled on the topic, and the last-standing-replica is restarted, then the log-start-offset should not reset its offset to first-local-log-segment-base-offset. Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com> commit 16dc983ad67767ee8debd125a3f8b150a91c7acf Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Thu Aug 31 22:21:01 2023 +0200 Kafka Streams Threading: Timeout behavior (#14171) Implement setting and clearing task timeouts, as well as changing the output on exceptions to make it similar to the existing code path. Reviewer: Walker Carlson <wcarlson@apache.org> commit 43fe13350f7a4c74cd101cbb69a01d062f5c9329 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Fri Sep 1 00:09:26 2023 +0530 KAFKA-15404: Disable the flaky integration tests. (#14296) Disabled the below tests to fix the thread leak: 1. kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize() and 2. org.apache.kafka.tiered.storage.integration.OffloadAndConsumeFromLeaderTest Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Justine Olshan <jolshan@confluent.io> commit c2bb8eb875d94568d1ad19bf207ec69c182405d4 Author: Luke Chen <showuon@gmail.com> Date: Thu Aug 31 16:44:32 2023 +0800 MINOR: Close topic based RLMM correctly in integration tests (#14315) Reviewers: Divij Vaidya <diviv@amazon.com> commit 95e1cdc4efbc720687cefad5bacd053565d03614 Author: A. Sophie Blee-Goldman <ableegoldman@gmail.com> Date: Wed Aug 30 13:29:38 2023 -0700 HOTFIX: avoid placement of unnecessary transient standby tasks & improve assignor logging (#14149) Minor fix to avoid creating unnecessary standby tasks, especially when these may be surprising or unexpected as in the case of an application with num.standby.replicas = 0 and warmup replicas disabled. The "bug" here was introduced during the fix for an issue with cooperative rebalancing and in-memory stores. The fundamental problem is that in-memory stores cannot be unassigned from a consumer for any period, however temporary, without being closed and losing all the accumulated state. This caused some grief when the new HA task assignor would assign an active task to a node based on the readiness of the standby version of that task, but would have to remove the active task from the initial assignment so it could first be revoked from its previous owner, as per the cooperative rebalancing protocol. This temporary gap in any version of that task among the consumer's assignment for that one intermediate rebalance would end up causing the consumer to lose all state for it, in the case of in-memory stores. To fix this, we simply began to place standby tasks on the intended recipient of an active task awaiting revocation by another consumer. However, the fix was a bit of an overreach, as we assigned these temporary standby tasks in all cases, regardless of whether there had previously been a standby version of that task. We can narrow this down without sacrificing any of the intended functionality by only assigning this kind of standby task where the consumer had previously owned some version of it that would otherwise potentially be lost. Also breaks up some of the long log lines in the StreamsPartitionAssignor and expands the summary info while moving it all to the front of the line (following reports of missing info due to truncation of long log lines in larger applications) commit 703e1d9faafbf07795261b3233ab985583f17fcb Author: Vincent Jiang <84371940+vincent81jiang@users.noreply.github.com> Date: Wed Aug 30 09:19:24 2023 -0700 KAFKA-15375: fix broken clean shutdown detection logic in LogManager When running in kraft mode, LogManager.startup is called in a different thread than the main broker (#14239) startup thread (by BrokerMetadataPublisher when the first metadata update is received.) If a fatal error happens during broker startup, before LogManager.startup is completed, LogManager.shutdown may mark log dirs as clean shutdown improperly. This PR includes following change: 1. During LogManager startup time: - track hadCleanShutdwon info for each log dir - track loadLogsCompleted status for each log dir 2. During LogManager shutdown time: - do not write clean shutdown marker file for log dirs which have hadCleanShutdown==false and loadLogsCompleted==false Reviewers: Colin P. McCabe <cmccabe@apache.org> commit dbda60c60da8f5a7eabe113615196b729b40a0e8 Author: Christo Lolov <lolovc@amazon.com> Date: Wed Aug 30 10:19:22 2023 +0100 KAFKA-14133: Move RocksDBRangeIteratorTest, TimestampedKeyValueStoreBuilderTest and TimestampedSegmentTest to Mockito (#14222) Reviewers: Divij Vaidya <diviv@amazon.com> commit 2e3ff21c2e3674ece50c2a8a4053b93024e12b4a Author: mannoopj <139923522+mannoopj@users.noreply.github.com> Date: Wed Aug 30 03:03:41 2023 -0400 KAFKA-15412: Reading an unknown version of quorum-state-file should trigger an error (#14302) Reading an unknown version of quorum-state-file should trigger an error. Currently the only known version is 0. Reading any other version should cause an error. Reviewers: Justine Olshan <jolshan@confluent.io>, Luke Chen <showuon@gmail.com> commit efec0f5756510bb02ee578b1a01dd8388237c14b Author: Christo Lolov <lolovc@amazon.com> Date: Wed Aug 30 01:04:20 2023 +0100 KAFKA-15267: Do not allow Tiered Storage to be disabled while topics have remote.storage.enable property (#14161) The purpose of this change is to not allow a broker to start up with Tiered Storage disabled (remote.log.storage.system.enable=false) while there are still topics that have 'remote.storage.enable' set. Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Divij Vaidya <diviv@amazon.com>, Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com> commit 1c5020e1429d1dadcb59955395afa87ada99f670 Author: Chris Egerton <chrise@aiven.io> Date: Tue Aug 29 15:44:22 2023 -0400 KAFKA-13327: Gracefully report connector validation errors instead of returning 500 responses (#14303) Reviewers: Yash Mayya <yash.mayya@gmail.com>, Greg Harris <greg.harris@aiven.io> commit 8611d28b2e2050b913837ade4d01231413991dc3 Author: Greg Harris <greg.harris@aiven.io> Date: Tue Aug 29 12:25:30 2023 -0700 KAFKA-15392: Prevent shadowing RestServer shutdown exceptions (#14277) Reviewers: Chris Egerton <chrise@aiven.io> commit 945d21953e93409290a68dced0d366380866fb26 Author: Philip Nee <pnee@confluent.io> Date: Tue Aug 29 12:03:15 2023 -0700 KAFKA-14875: Implement wakeup (#14118) Summary Implemented wakeup() mechanism using a WakeupTrigger class to store the pending wakeup item, and when wakeup() is invoked, it checks whether there's an active task or a wakeup task. If there's an active task: the task will be completed exceptionally and the atomic reference will be freed up. If there an wakedup task, which means wakeup() was invoked before a blocking call was issued. Therefore, the current task will be completed exceptionally immediately. This PR also addressed minor issues such as: Throwing WakeupException at the right place: As wakeups are thrown by completing an active future exceptionally. The WakeupException is wrapped inside of the ExecutionException. mockConstruction is a thread-lock mock; therefore, we need to free up the reference before completing the test. Otherwise, other tests will continue using the thread-lock mock. Reviewers: Lianet Magrans <lianetmr@gmail.com>, Jun Rao <junrao@gmail.com> commit 3b02e97b65e7636fea84da834be235f93df41aea Author: Taher Ghaleb <taher.a.ghaleb@gmail.com> Date: Tue Aug 29 12:27:20 2023 -0400 KAFKA-15403: Refactor @Test(expected) annotation with assertThrows (#14264) assertThrows makes the verification of exceptions clearer and more intuitive, thus improving code readability compared to the annotation approach. It is considered a test smell in the research literature. One possible research is due to developers not keeping up to date with recent versions of testing frameworks. All such patterns in streams have been refactored. Reviewers: vamossagar12 <sagarmeansocean@gmail.com>, Justine Olshan <jolshan@confluent.io> commit 0912ca27e2a229d2ebe02f4d1dabc40ed5fab0bb Author: Alyssa Huang <ahuang@confluent.io> Date: Mon Aug 28 23:47:22 2023 -0700 KRaft support for DescribeClusterRequestTest and DeleteConsumerGroupsTest (#14294) Reviewers: dengziming <dengziming1993@gmail.com>, mannoopj <mannoopj@users.noreply.github.com> commit 68b140cb56a208b951be84191d1e3f1eb0169882 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Mon Aug 28 20:23:17 2023 +0530 MINOR: Fix the TBRLMMRestart test. (#14297) Reviewers: Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org> commit 68b7031dc443b6f6b5dfac81316ab22fe250ec54 Author: David Jacot <djacot@confluent.io> Date: Mon Aug 28 07:02:56 2023 -0700 KAFKA-14499: [4/N] Implement OffsetFetch API (#14120) This patch implements the OffsetFetch API in the new group coordinator. I found out that implementing the `RequireStable` flag is hard (to not say impossible) in the current model. For the context, the flag is here to ensure that an OffsetRequest request does not return stale offsets if there are pending offsets to be committed. In the scala code, we basically check the pending offsets data structure and if they are any pending offsets, we return the `UNSTABLE_OFFSET_COMMIT` error. This tells the consumer to retry. In our new model, we don't have the pending offsets data structure. Instead, we use a timeline data structure to handle all the pending/uncommitted changes. Because of this we don't know whether offsets are pending for a particular group. Instead of doing this, I propose to not return the `UNSTABLE_OFFSET_COMMIT` error anymore. Instead, when `RequireStable` is set, we use a write operation to ensure that we read the latest offsets. If they are uncommitted offsets, the write operation ensures that the response is only return when they are committed. This gives a similar behaviour in the end. Reviewers: Justine Olshan <jolshan@confluent.io> commit 4590d565ef08e48411123aa1ecbb46bd8130a3de Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Mon Aug 28 15:59:50 2023 +0530 KAFKA-15399: Enable OffloadAndConsumeFromLeader test (#14285) Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org> commit 664f71b20712f7e4d1768d993c2b1ab109b6061c Author: Christo Lolov <lolovc@amazon.com> Date: Mon Aug 28 11:06:37 2023 +0100 KAFKA-14133: Move RecordCollectorTest, StateRestoreCallbackAdapterTest and StoreToProcessorContextAdapterTest to Mockito (#14210) Reviewers: Divij Vaidya <diviv@amazon.com> commit b41b2dfcf2f0f9e458374fb9b0842bcc8739f130 Author: Calvin Liu <83986057+CalvinConfluent@users.noreply.github.com> Date: Mon Aug 28 02:59:48 2023 -0700 KAFKA-15353: make sure AlterPartitionRequest.build() is idempotent (#14236) As described in https://issues.apache.org/jira/browse/KAFKA-15353 When the AlterPartitionRequest version is < 3 and its builder.build is called multiple times, both newIsrWithEpochs and newIsr will all be empty. This can happen if the sender retires on errors. Reviewers: Luke Chen <showuon@gmail.com> commit d869bf5b221a50b16e1ca42f1043adc79c85eae8 Author: iit2009060 <59436466+iit2009060@users.noreply.github.com> Date: Mon Aug 28 14:55:38 2023 +0530 KAFKA-15256: Adding reviewer as part of release announcement email template (#14288) Reviewers: Divij Vaidya <diviv@amazon.com>, Philip Nee <pnee@confluent.io> commit 180dcd396932a4033df2104aaf0038b852c4fc1b Author: Gantigmaa Selenge <39860586+tinaselenge@users.noreply.github.com> Date: Mon Aug 28 09:35:11 2023 +0100 KAFKA-15294: Publish remote storage configs (#14266) This change does the following: 1. Make RemoteLogManagerConfigs that are implemented public 2. Add tasks to generate html docs for the configs 3. Include config docs in the main site Reviewers: Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org> commit 3643039a9a43cbd7c2dfc4927004be7b32264fb4 Author: Tang Yunzi <34680857+shinyruoqaq@users.noreply.github.com> Date: Mon Aug 28 11:13:47 2023 +0800 MINOR: Fix incorrect comment in TopicDeletionManager.scala. (#14292) Fix incorrect comment in TopicDeletionManager.scala Reviewers: Luke Chen <showuon@gmail.com> commit ff3e6842ff99a600fc02e69ebefb09eef93decb3 Author: Abhijeet Kumar <abhijeet.cse.kgp@gmail.com> Date: Sat Aug 26 05:52:26 2023 +0530 KAFKA-15181: Wait for RemoteLogMetadataCache to initialize after assigning partitions (#14127) This PR adds the following changes to the `TopicBasedRemoteLogMetadataManager` 1. Added a guard in RemoteLogMetadataCache so that the incoming request can be served from the cache iff the corresponding user-topic-partition is initalized 2. Improve error handling in ConsumerTask thread so that is not killed when there are errors in reading the internal topic 3. ConsumerTask initialization should handle the case when there are no records to read and some other minor changes Added Unit Tests for the changes Co-authored-by: Kamal Chandraprakash <kamal.chandraprakash@gmail.com> Reviewers: Luke Chen <showuon@gmail.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org> commit f2d499e25a1ab8674dda7c6e5a05a12fb8687dbd Author: David Arthur <mumrah@gmail.com> Date: Fri Aug 25 13:41:43 2023 -0400 KAFKA-15389: Don't publish until we have replayed at least one record (#14282) When starting up a controller for the first time (i.e., with an empty log), it is possible for MetadataLoader to publish an empty MetadataImage before the activation records of the controller have been written. While this is not a bug, it could be confusing. This patch closes that gap by waiting for at least one controller record to be committed before the MetadataLoader starts publishing images. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit 5785796f985aa294c12e670da221d086a7fa887c Author: Maros Orsak <maros.orsak159@gmail.com> Date: Fri Aug 25 11:31:21 2023 +0200 MINOR: Add a few test cases to clients (#14211) Reviewers: Mickael Maison <mickael.maison@gmail.com> commit 30de2bb5efe7d163115dd358067c98b17c3a6730 Author: Mickael Maison <mimaison@users.noreply.github.com> Date: Fri Aug 25 10:34:17 2023 +0200 MINOR: Missing space in ProducerStateManager LogContext (#14275) Reviewers: Divij Vaidya <diviv@amazon.com> commit c9715a3485f2dbad1accbbf88c66af67febfe489 Author: Luke Chen <showuon@gmail.com> Date: Fri Aug 25 11:52:37 2023 +0800 MINOR: Use "add-exports" only when jdk >= 16 in minikdc (#14232) Use "add-exports" only when jdk >= 16 in minikdc Reviewers: Greg Harris <greg.harris@aiven.io> commit d4ab3ae85a35483c15f0ceb0002eb2d81ad79ad2 Author: Satish Duggana <satishd@apache.org> Date: Fri Aug 25 05:27:59 2023 +0530 KAFKA-14888: Added remote log segments retention mechanism based on time and size. (#13561) This change introduces a remote log segment segment retention cleanup mechanism. RemoteLogManager runs retention cleanup activity tasks on each leader replica. It assesses factors such as overall size and retention duration, subsequently removing qualified segments from remote storage. This process also involves adjusting the log-start-offset within the UnifiedLog accordingly. It also cleans up the segments which have epochs earlier than the earliest leader epoch in the current leader. Co-authored-by: Satish Duggana <satishd@apache.org> Co-authored-by: Kamal Chandraprakash <kamal.chandraprakash@gmail.com> Reviewers: Jun Rao <junrao@gmail.com>, Divij Vaidya <diviv@amazon.com, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Christo Lolov <lolovc@amazon.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>, Alexandre Dupriez <alexandre.dupriez@gmail.com>, Nikhil Ramakrishnan <ramakrishnan.nikhil@gmail.com> commit 9e3b1f9b9bf48603acde7f71c704af812a6dab4b Author: Satish Duggana <satishd@apache.org> Date: Fri Aug 25 05:03:38 2023 +0530 MINOR Bump trunk to 3.7.0-SNAPSHOT (#14286) Reviewers: Divij Vaidya <diviv@amazon.com> commit 880288879e8200cd22951e67a50df166edcc1b33 Author: Yash Mayya <yash.mayya@gmail.com> Date: Thu Aug 24 23:44:53 2023 +0100 KAFKA-15377: Don't expose externalized secret values in tasks-config API endpoint (#14244) Reviewers: Greg Harris <greg.harris@aiven.io> commit a3303b6112f658fd4fcd3b8ae1cdfd7c1e43db5b Author: Greg Harris <greg.harris@aiven.io> Date: Thu Aug 24 12:24:34 2023 -0700 KAFKA-15393: Improve shutdown behavior in MM2 integration tests (#14278) Reviewers: Yash Mayya <yash.mayya@gmail.com>, Chris Egerton <chrise@aiven.io> commit 8d12c1175cca5af27f79aa746c24bd998ff62345 Author: Phuc-Hong-Tran <44060007+Phuc-Hong-Tran@users.noreply.github.com> Date: Fri Aug 25 03:38:45 2023 +1000 KAFKA-15152: Fix incorrect format specifiers when formatting string (#14026) Reviewers: Divij Vaidya <diviv@amazon.com> Co-authored-by: phuchong.tran <phuchong.tran@servicenow.com> commit 45aae641a5a6632f7d1261b1880dbb8e73be1ff6 Author: Said Boudjelda <bmscomp@gmail.com> Date: Thu Aug 24 19:29:33 2023 +0200 MINOR: Upgrade version of Gradle entreprise to 3.14.1 & custom user data Gradle plugin to 1.11.1 (#14131) Reviewers: Divij Vaidya <diviv@amazon.com> commit 89aaecafae30fc2adc51d2fade89af4e9745598e Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Thu Aug 24 21:05:48 2023 +0530 KAFKA-15290: Handle topic-level dynamic remote storage enable configuration (#14238) * KAFKA-15290: Handle topic-level dynamic remote log storage enable configuration. To onboard existing topics to tiered storage, bootstrap the remote-log-components when updating the dynamic `remote.storage.enable` config on the topic. Reviewers: Christo Lolov <lolovc@amazon.com>, Divij Vaidya <diviv@amazon.com, Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org> commit 88d2c4460a1c8c8cf5dbcc9edb43f42fe898ca00 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Thu Aug 24 17:12:13 2023 +0530 KAFKA-15400: Use readLock when removing an item from the RemoteIndexCache (#14283) - Caffeine cache is thread safe, we want to hold the writeLock only during the close. - Fix the flaky tests Reviewers: Divij Vaidya <diviv@amazon.com> commit 25b128de81f826d0e0fe415acecf8b6d4cf837f4 Author: Mehari Beyene <132488287+mehbey@users.noreply.github.com> Date: Thu Aug 24 03:04:55 2023 -0700 KAFKA-14991: KIP-937-Improve message timestamp validation (#14135) This implementation introduces two new configurations `log.message.timestamp.before.max.ms` and `log.message.timestamp.after.max.ms` and deprecates `log.message.timestamp.difference.max.ms`. The default value for all these three configs is maintained to be Long.MAX_VALUE for backward compatibility but with the newly added configurations we can have a finer control when validating message timestamps that are in the past and the future compared to the broker's timestamp. To maintain backward compatibility if the default value of `log.message.timestamp.before.max.ms` is not changed, we are assuming users are still using the deprecated config `log.message.timestamp.difference.max.ms` and validation is done using its value. This ensures that existing customers who have customized the value of `log.message.timestamp.difference.max.ms` will continue to see no change in behavior. Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com> commit 87a30b73b561b205eeabae66aeba529a9307dfa0 Author: Okada Haruki <ocadaruma@gmail.com> Date: Thu Aug 24 17:24:01 2023 +0900 KAFKA-15391: Handle concurrent dir rename which makes log-dir to be offline unexpectedly (#14280) A race condition between async flush and segment rename (for deletion purpose) might cause the entire log directory to be marked offline when we delete a topic. This PR fixes the bug by ignoring NoSuchFileException when we flush a directory. Reviewers: Divij Vaidya <diviv@amazon.com> commit 9972297e510d74bd5dedbffe5dfb7a9f1c0a123f Author: olalamichelle <32348847+olalamichelle@users.noreply.github.com> Date: Wed Aug 23 21:59:16 2023 -0500 KAFKA-14780: Fix flaky test 'testSecondaryRefreshAfterElapsedDelay' (#14078) "The test RefreshingHttpsJwksTest#testSecondaryRefreshAfterElapsedDelay relies on the actual system clock, which makes it frequently fail. The fix adds a second constructor that allows for passing a ScheduledExecutorService to manually execute the scheduled tasks before refreshing. The fixed task is much more robust and stable. Co-authored-by: Fei Xie <feixie@MacBook-Pro.attlocal.net> Reviewers: Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com> commit 0017bb3db5476479a9cfde2ec667949b6b001451 Author: Yash Mayya <yash.mayya@gmail.com> Date: Wed Aug 23 08:54:20 2023 +0100 MINOR: Fix typo in stale PR message (#14274) commit 368bec85e874090b0032cb95f2f465309fbc60d5 Author: Divij Vaidya <diviv@amazon.com> Date: Wed Aug 23 09:31:58 2023 +0200 KAFKA-9926: Fix flaky PlaintextAdminIntegrationTest.testCreatePartitions (#14273) Reviewers: Luke Chen <showuon@gmail.com> commit e9f358eef6a5f48530ecfe0ca6fca689410b499c Author: Lianet Magrans <98415067+lianetm@users.noreply.github.com> Date: Wed Aug 23 00:43:06 2023 -0400 KAFKA-14937; [2/N]: Refactoring for client code to reduce boilerplate (#14218) This PR main refactoring relates to : 1. serializers/deserializers used in clients - unified in a Deserializers class 2. logic for configuring ClusterResourceListeners moved to ClientUtils 3. misc refactoring of the new async consumer in preparation for upcoming Request Managers Reviewers: Jun Rao <junrao@gmail.com> commit 8394ddc0d26399dc20ddff802886fa0b1f41f420 Author: Ron Dagostino <rndgstn@gmail.com> Date: Tue Aug 22 19:04:53 2023 -0400 MINOR: Move delegation token support to Metadata Version 3.6-IV2 (#14270) #14083 added support for delegation tokens in KRaft and attached that support to the existing MetadataVersion 3.6-IV1. This patch moves that support into a separate MetadataVersion 3.6-IV2. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit 064a14795ccaa4d50798252e7d9ef081311b5e59 Author: Yash Mayya <yash.mayya@gmail.com> Date: Tue Aug 22 17:26:29 2023 +0100 MINOR: Update OpenAPI summary documentation for Connect's /tasks endpoint (#14267) Reviewers: Mickael Maison <mickael.maison@gmail.com> commit 86afa416d203637f148c4cfe2c5ec38ebc31247e Author: Christo Lolov <lolovc@amazon.com> Date: Tue Aug 22 09:55:54 2023 +0100 KAFKA-14133: Move mocks from KStreamTransformValuesTest, KTableImplTest and KTableTransformValuesTest to Mockito (#14204) Reviewers: Divij Vaidya <diviv@amazon.com> commit 418b8a6e5939903c4b9561a2be7dd2fa8e29c271 Author: David Arthur <mumrah@gmail.com> Date: Mon Aug 21 19:02:14 2023 -0400 KAFKA-14538 Metadata transactions in MetadataLoader (#14208) This PR contains three main changes: - Support for transactions in MetadataLoader - Abort in-progress transaction during controller failover - Utilize transactions for ZK to KRaft migration A new MetadataBatchLoader class is added to decouple the loading of record batches from the publishing of metadata in MetadataLoader. Since a transaction can span across multiple batches (or multiple transactions could exist within one batch), some buffering of metadata updates was needed before publishing out to the MetadataPublishers. MetadataBatchLoader accumulates changes into a MetadataDelta, and uses a callback to publish to the publishers when needed. One small oddity with this approach is that since we can "splitting" batches in some cases, the number of bytes returned in the LogDeltaManifest has new semantics. The number of bytes included in a batch is now only included in the last metadata update that is published as a result of a batch. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit ad76497b12b0d81a98dc6e230e5516adffbfa0ee Author: Walker Carlson <18128741+wcarlson5@users.noreply.github.com> Date: Mon Aug 21 16:08:38 2023 -0500 KAFKA-14936: fix grace period partition issue (#14269) Move the store creation to builder pattern and recover mintimestamp Reviewers: John Roesler<vvcephei@apache.org>, Bill Bejeck <bbejeck@gmail.com> commit 6008af7468108869f74c63e591c45d27c6dfd17e Author: Ron Dagostino <rndgstn@gmail.com> Date: Mon Aug 21 13:02:32 2023 -0400 MINOR: Enable delegation token system test for KRaft (#14268) Reviewers: Colin P. McCabe <cmccabe@apache.org> commit 4b383378a0fd19d6d3c9ae7c2175fa3459661a04 Author: Justine Olshan <jolshan@confluent.io> Date: Mon Aug 21 08:44:00 2023 -0700 KAFKA-15380: Execute action queue after callback request (#14197) KIP-890 part 1 introduced the callback request type. It is used to execute a callback after KafkaApis.handle has returned. We did not account for tryCompleteActions at the end of handle when making this change. In tests, we saw produce p99 increase dramatically (likely because we have to wait for another request before we can complete DelayedProduce). As a result, we should add the tryCompleteActions after the callback as well. In testing, this improved the produce performance. Reviewers: Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io> commit eefa812453d5d5074506f5fc7642f585ac0c3545 Author: David Mao <47232755+splett2@users.noreply.github.com> Date: Sun Aug 20 15:16:27 2023 -0700 MINOR: Delete unused class, LogOffsetMetadata toString formatting (#14246) Noticed that there was a dangling unused class (LongRef, replaced by PrimitiveRef.LongRef), and the LogOffsetMetadata toString was a little oddly formatted. Reviewers: Justine Olshan <jolshan@confluent.io> commit 6492164d9c099ae3091cd508df98453c954b7e13 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Sun Aug 20 17:15:52 2023 +0530 KAFKA-15167: Tiered Storage Test Harness Framework (#14116) `TieredStorageTestHarness` is a base class for integration tests exercising the tiered storage functionality. This uses `LocalTieredStorage` instance as the second-tier storage system and `TopicBasedRemoteLogMetadataManager` as the remote log metadata manager. Co-authored-by: Alexandre Dupriez <alexandre.dupriez@gmail.com> Co-authored-by: Kamal Chandraprakash <kamal.chandraprakash@gmail.com> commit c2759df0676cef252596239baf8f1f361e76c49f Author: Proven Provenzano <93720617+pprovenzano@users.noreply.github.com> Date: Sat Aug 19 14:01:08 2023 -0400 KAFKA-15219: KRaft support for DelegationTokens (#14083) Reviewers: David Arthur <mumrah@gmail.com>, Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Viktor Somogyi <viktor.somogyi@cloudera.com> commit 05c329e61fbfddc1f972403fbec0a24c45d4e173 Author: Bruno Cadonna <cadonna@apache.org> Date: Sat Aug 19 19:00:23 2023 +0200 KAFKA-10199: Change to RUNNING if no pending task to init exist (#14249) A stream thread should only change to RUNNING if there are no active tasks in restoration in the state updater and if there are no pending tasks to recycle and to init. Usually all pending tasks to init are added to the state updater in the same poll iteration that handles the assignment. However, if during an initialization of a task a LockException the task is re-added to the tasks to init and initialization is retried in the next poll iteration. A LockException might occur when a state directory is still locked by another thread, when the rebalance just happened. Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Walker Carlson <wcarlson@confluent.io> commit 4c7e0a9fa66fd1becc590d6060228b0305dd400b Author: Bruno Cadonna <cadonna@apache.org> Date: Sat Aug 19 12:13:30 2023 +0200 MINOR: Decouple purging committed records from committing (#14227) Currently, Kafka Streams only tries to purge records whose offset are committed from a repartition topic when at least one offset was committed in the current commit. The coupling between committing some offsets and purging records is not needed and might delay purging of records. For example, if a in-flight call for purging records has not completed yet when a commit happens, a new call is not issued. If then the earlier in-flight call for purging records finally completes but the next commit does not commit any offsets, Streams does not issue the call for purging records whose offset were committed in the previous commit because the purging call was still in-flight. This change issues calls for purging records during any commit if the purge interval passed, even if no offsets were committed in the current commit. Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Walker Carlson <wcarlson@confluent.io> commit d0b7677c2c9a8265643a90175f86987521615849 Author: Walker Carlson <18128741+wcarlson5@users.noreply.github.com> Date: Fri Aug 18 22:00:04 2023 -0500 KAFKA-14936: Add restore logic (3/N) (#14027) Added restore logic for the buffer in grace period joins. Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bbejeck@gmail.com> commit 82ae77f94566881d957929acbdc3197e60f7f5f8 Author: Greg Harris <greg.harris@aiven.io> Date: Fri Aug 18 15:28:43 2023 -0700 KAFKA-15226: Add connect-plugin-path and plugin.discovery system test (#14230) Reviewers: Chris Egerton <chrise@aiven.io> commit b36cf4ef977fb14bc57683630a9f3f3680705550 Author: Matthias J. Sax <matthias@confluent.io> Date: Fri Aug 18 11:06:08 2023 -0700 HOTIFX: fix Kafka Streams upgrade path from 3.4 to 3.5 (#14103) KIP-904 introduced a backward incompatible change that requires a 2-bounce rolling upgrade. The new "3.4" upgrade config value is not recognized by `AssignorConfiguration` though and thus crashed Kafka Streams if use. Reviewers: Farooq Qaiser <fqaiser94@gmail.com>, Bruno Cadonna <bruno@confluent.io> commit 3ad5f42f595e4eef3a5787ce34e2676ed58bee3d Author: David Arthur <mumrah@gmail.com> Date: Fri Aug 18 12:44:01 2023 -0400 Handle case of default broker in config migration (#14237) When collecting the set of broker IDs during the migration, don't try to parse the default broker resource `""` as a broker ID. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit ee036ed9effe6d38ee737e9360d3dd256a74fa9f Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Fri Aug 18 18:31:27 2023 +0200 KAFKA-15319: Upgrade rocksdb to fix CVE-2022-37434 (#14216) Rocksdbjni<7.9.2 is vulnerable to CVE-2022-37434 due to zlib 1.2.12 Reviewers: Divij Vaidya <diviv@amazon.com>, Bruno Cadonna <cadonna@apache.org> commit 4f88fb28f38f3e461377bf688520b28a3f209b5d Author: DL1231 <53332773+DL1231@users.noreply.github.com> Date: Fri Aug 18 20:51:09 2023 +0800 KAFKA-15130: Delete remote segments when deleting a topic (#13947) * Delete remote segments when deleting a topic Co-authored-by: Kamal Chandraprakash <kchandraprakash@uber.com> Co-authored-by: d00791190 <dinglan6@huawei.com> commit 3f4816dd3eafaf1a0636d3ee689069f897c99e28 Author: José Armando GarcÃa Sancio <jsancio@users.noreply.github.com> Date: Thu Aug 17 18:40:17 2023 -0700 KAFKA-15345; KRaft leader notifies leadership when listener reaches epoch start (#14213) In a non-empty log the KRaft leader only notifies the listener of leadership when it has read to the leader's epoch start offset. This guarantees that the leader epoch has been committed and that the listener has read all committed offsets/records. Unfortunately, the KRaft leader doesn't do this when the log is empty. When the log is empty the listener is notified immediately when it has become leader. This makes the API inconsistent and harder to program against. This change fixes that by having the KRaft leader wait for the listener's nextOffset to be greater than the leader's epochStartOffset before calling handleLeaderChange. The RecordsBatchReader implementation is also changed to include control records. This makes it possible for the state machine learn about committed control records. This additional information can be used to compute the committed offset or for counting those bytes when determining when to snapshot the partition. Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jason Gustafson <jason@confluent.io> commit 7802c264c96ae27167cf38c263b86398aa0ea3fe Author: Yash Mayya <yash.mayya@gmail.com> Date: Thu Aug 17 19:13:53 2023 +0100 MINOR: Allow writing tombstone offsets for arbitrary partitions in the FileStreamSourceConnector (#14234) Reviewers: Chris Egerton <chrise@aiven.io> commit a253dc6643bb18b123a3832f88b0d97965f38068 Author: Chris Egerton <chrise@aiven.io> Date: Thu Aug 17 14:11:01 2023 -0400 KAFKA-15102: Add release notes about the replication.policy.internal.topic.separator.enabled property for MirrorMaker 2 (#14220) Reviewers: Greg Harris <greg.harris@aiven.io> commit d85a70081333a2ab9dd6593e99abf213a469ba2d Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Thu Aug 17 19:53:58 2023 +0200 MINOR: Do not reuse admin client across tests (#14225) Reusing an admin client across tests can cause false positives in leak checkers, so don't do it. Reviewers: Divij Vaidya <diviv@amazon.com>, Matthias J. Sax <matthias@confluent.io> commit de409b389d26f7681fba8583db2b96584258aa48 Author: Chris Egerton <chrise@aiven.io> Date: Thu Aug 17 09:33:59 2023 -0400 KAFKA-15177: Implement KIP-875 SourceConnector::alterOffset API in MirrorMaker 2 connectors (#14005) Reviewers: Yash Mayya <yash.mayya@gmail.com>, Greg Harris <greg.harris@aiven.io> commit 6bd17419b76f8cf8d7e4a11c071494dfaa72cd50 Author: Greg Harris <greg.harris@aiven.io> Date: Wed Aug 16 11:37:33 2023 -0700 KAFKA-15228: Add sync-manifests command to connect-plugin-path (KIP-898) (#14195) Reviewers: Chris Egerton <chrise@aiven.io> commit a9efca0bf63110d68f84fc9841d8a31f245e10e0 Author: Greg Harris <greg.harris@aiven.io> Date: Wed Aug 16 10:30:24 2023 -0700 KAFKA-14759: Move Mock, Schema, and Verifiable connectors to new test-plugins module (#13302) Reviewers: Hector Geraldino <hgeraldino@gmail.com>, Chris Egerton <chrise@aiven.io> commit f970ddff10114ae7f9ea1b9966cd25ac8bb5f581 Author: Gantigmaa Selenge <39860586+tinaselenge@users.noreply.github.com> Date: Wed Aug 16 12:00:39 2023 +0100 KAFKA-15210: Mention vote should be open for at atleast 72 hours in the release script (#14183) Reviewers: Divij Vaidya <diviv@amazon.com> commit d0e9e94629c1347b19a2992b67d8590e61a59d04 Author: Christo Lolov <lolovc@amazon.com> Date: Wed Aug 16 09:19:35 2023 +0100 KAFKA-14133: Migrate ActiveTaskCreatorTest, ChangelogTopicsTest and GlobalProcessorContextImplTest to Mockito (#14209) Reviewers: Divij Vaidya <diviv@amazon.com> commit cfe49d1b7782e4adde409638c4a4152b22bd1ce5 Author: bachmanity1 <81428651+bachmanity1@users.noreply.github.com> Date: Wed Aug 16 17:01:49 2023 +0900 KAFKA-7438: Replace EasyMock with Mockito in SessionStoreBuilderTest (#14142) Reviewers: Divij Vaidya <diviv@amazon.com>, Yash Mayya <yash.mayya@gmail.com> commit 1a15cd708ab252e7c75dea06f36db36599f18322 Author: Christo Lolov <lolovc@amazon.com> Date: Wed Aug 16 08:46:40 2023 +0100 KAFKA-14133: Migrato SessionCacheFlushListenerTest, TimestampedCacheFlushListenerTest and TimestampedTupleForwarderTest to Mockito (#14205) Reviewers: Divij Vaidya <diviv@amazon.com> commit ee27773549646602b955343dba680efaa5227286 Author: vamossagar12 <sagarmeansocean@gmail.com> Date: Wed Aug 16 07:16:17 2023 +0530 KAFKA-15329: Make default remote.log.metadata.manager.class.name as topic based RLMM (#14202) As described in the KIP here the default value of remote.log.metadata.manager.class.name should be TopicBasedRemoteLogMetadataManager Reviewers: Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kchandraprakash@uber.com>, Divij Vaidya <diviv@amazon.com> commit 35e925f3535e7774520317310505fcde946228d5 Author: Omnia G.H Ibrahim <o.g.h.ibrahim@gmail.com> Date: Wed Aug 16 00:58:52 2023 +0100 KAFKA-15102: Add replication.policy.internal.topic.separator.enabled property to MirrorMaker 2 (KIP-949) (#14082) Reviewers: Chris Egerton <chrise@aiven.io> commit b97e8203eb1fb016cd7cccd3dbf5fecc716969be Author: Philip Nee <pnee@confluent.io> Date: Tue Aug 15 15:01:28 2023 -0700 MINOR: CommitRequestManager should only poll when the coordinator node is known (#14179) As title, we discovered a flaky bug during testing that the commit request manager would seldomly throw a NOT_COORDINATOR exception, which means the request was routed to a non-coordinator node. We discovered that if we don't check the coordinator node in the commitRequestManager, the request manager will pass on an empty node to the NetworkClientDelegate, which implies the request can be sent to any node in the cluster. This behavior is incorrect as the commit requests need to be routed to a coordinator node. Because the timing coordinator's discovery during integration testing isn't entirely deterministic; therefore, the test became extremely flaky. After fixing this: The coordinator node is mandatory before attempt to enqueue these commit request to the NetworkClient. Reviewers: Jun Rao <junrao@gmail.com> commit 28858f3a3e2e6a8499944344b270bf33ca116503 Author: Greg Harris <greg.harris@aiven.io> Date: Tue Aug 15 14:24:48 2023 -0700 MINOR: Fix SynchronizationTest classloaders sometimes not being parallel capable (#14177) Reviewers: Chris Egerton <chrise@aiven.io> commit 1a001c1e88e3e32b34a5c52f257de722da95a736 Author: Greg Harris <greg.harris@aiven.io> Date: Tue Aug 15 13:21:45 2023 -0700 KAFKA-15336: Add ServiceLoader Javadocs for Connect plugins (#14194) Reviewers: Chris Egerton <chrise@aiven.io> commit 0a531b7e7cc614b273092a6d28b1b8c72bb93ea9 Author: David Arthur <mumrah@gmail.com> Date: Tue Aug 15 16:01:31 2023 -0400 MINOR: Install ControllerServer metadata publishers sooner (#14215) This patch is a follow up of #14169 that installs the metadata publishers before blocking on the authorizer future. Reviewers: Colin P. McCabe <cmccabe@apache.org> commit c199840f0add1c191c4333dc58baae49594734a8 Author: David Arthur <mumrah@gmail.com> Date: Sat Aug 12 09:14:17 2023 -0400 MINOR: Fix the ZkMigrationState metric in KafkaController This patch fixes an issue for ZK controllers where we were emitting the ZkMigrationState enum rather than a value. This can lead to downstream failures with JMX metrics since the RMI protocol will marshal the ZkMigrationState object returned by the gauge. Any downstream consumer of this metric (like jconsole or a metrics exporter) will not be able to unmarshal the value since the ZkMigrationState class will not be present. The fix is simply to emit the byte value of this enum. Reviewers: Colin P. McCabe <cmccabe@apache.org>, Alok Thatikunta <athatikunta@confluent.io> commit 696a56dd2ba54beed41748ef8cce958dadd6b375 Author: Kamal Chandraprakash <kchandraprakash@uber.com> Date: Wed Aug 16 00:13:11 2023 +0530 KAFKA-15295: Add config validation when remote storage is enabled on a topic (#14176) Add config validation which verifies that system level remote storage is enabled when enabling remote storage for a topic. In case verification fails, it throws InvalidConfigurationException. Reviewers: Christo Lolov <lolovc@amazon.com>, Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com> commit fd6c9f16bacf26261fee2755c1d3679b9703a8a0 Author: bachmanity1 <81428651+bachmanity1@users.noreply.github.com> Date: Tue Aug 15 18:48:13 2023 +0900 KAFKA-7438: Replace Easymock & Powermock with Mockito in RocksDBMetricsRecorderGaugesTest (#14190) Reviewers: Christo Lolov <christololov@gmail.com>, Divij Vaidya <diviv@amazon.com> commit adc16d0f310dc1350bca66d5da013d599b990cfa Author: Colin Patrick McCabe <cmccabe@apache.org> Date: Mon Aug 14 16:58:56 2023 -0700 KAFKA-14538: Implement KRaft metadata transactions in QuorumController Implement the QuorumController side of KRaft metadata transactions. As specified in KIP-868, this PR creates a new metadata version, IBP_3_6_IV1, which contains the three new records: AbortTransactionRecord, BeginTransactionRecord, EndTransactionRecord. In order to make offset management unit-testable, this PR moves it out of QuorumController.java and into OffsetControlManager.java. The general approach here is to track the "last stable offset," which is calculated by looking at the latest committed offset and the in-progress transaction (if any). When a transaction is aborted, we revert back to this last stable offset. We also revert back to it when the controller is transitioning from active to inactive. In a follow-up PR, we will add support for the transaction records in MetadataLoader. We will also add support for automatically aborting pending transactions after a controller failover. Reviewers: David Arthur <mumrah@gmail.com> commit 5a67b080c7d7814c38a83a25ab951e65743fae81 Author: Calvin Liu <83986057+CalvinConfluent@users.noreply.github.com> Date: Mon Aug 14 16:14:29 2023 -0700 MINOR: Fix a race when query isUnderMinIsr (#14138) When the leader becomes the follower, we first remove the ISR and then reset the leader. If we call isUnderMinIsr in between, we will get an answer with true which is a race bug. Reviewers: Justine Olshan <jolshan@confluent.io> commit 67b527460e155f1f5e850bb7fd65c7c373367b48 Author: Kirk True <kirk@kirktrue.pro> Date: Mon Aug 14 10:08:20 2023 -0700 KAFKA-14937: Refactoring for client code to reduce boilerplate (#13990) Move common code from the client implementations to the ClientUtils class or (consumer) Utils class, where passible. There are a number of places in the client code where the same basic calls are made by more than one client implementation. Minor refactoring will reduce the amount of boilerplate code necessary for the client to construct its internal state. Reviewers: Lianet Magrans <lianetmr@gmail.com>, Jun Rao <junrao@gmail.com> commit 5234ddff5025501be2b4fca3ecba4e4eb584bbc5 Author: Lucas Brutschy <lucasbru@users.noreply.github.com> Date: Mon Aug 14 17:17:28 2023 +0200 KAFKA-15326: [5/N] Processing thread punctuation (#14001) Implements punctuation inside processing threads. The scheduler algorithm checks if a task that is not assigned currently can be punctuated, and returns it when a worker thread asks for the next task to be processed. Then, the processing thread runs all punctuations in the punctionation queue. Piggy-backed: take TaskExecutionMetadata into account when processing records. Reviewer: Bruno Cadonna <cadonna@apache.org> commit 43751d8d0521b1440a823a9430fdb0659ce7c436 Author: vveicc <vveicc@163.com> Date: Mon Aug 14 17:04:15 2023 +0800 KAFKA-15289: Support KRaft mode in RequestQuotaTest (#14201) Enable kraft mode for RequestQuotaTest, there are 2 works left to be done. Reviewers: dengziming <dengziming1993@gmail.com> commit d91c9bd2b594719cee629b6c057204fb0de0d1a2 Author: Chris Egerton <chrise@aiven.io> Date: Sat Aug 12 16:52:49 2023 -0400 KAFKA-14682: Report Mockito unused stubbings during Jenkins build (#14186) * KAFKA-14682: Report Mockito unused stubbings during Jenkins build * DO NOT MERGE: Add test case that should fail during Jenkins build * Revert "DO NOT MERGE: Add test case that should fail during Jenkins build" This reverts commit 8418b835ecb49fa10da04c7a997c7e982a8c4a47. commit ae46c0a34c90e3954b47ead2dc057e0bbf186a1f Author: bachmanity1 <81428651+bachmanity1@users.noreply.github.com> Date: Sat Aug 12 17:41:07 2023 +0900 KAFKA-7438: Replace Easymock & Powermock with Mockito in TableSourceNodeTest (#14189) Reviewers: Divij Vaidya <diviv@amazon.com> commit 3a94670a013caa57b991caa47037d5c9bfdbf20a Author: Hao Li <1127478+lihaosky@users.noreply.github.com> Date: Sat Aug 12 01:38:07 2023 -0700 MINOR: Fix streams task assignor tests (#14196) Reviewers: Divij Vaidya <diviv@amazon.com> commit cfb45b000140729aabc34953a80bc8dc4c9a08d3 Author: Rittika Adhikari <rittika.adhikari@gmail.com> Date: Fri Aug 11 14:15:17 2023 -0700 MINOR: Refactor TierStateMachine related tests into a separate test file (#13503) This PR builds off of KAFKA-14685 and refactors any tests explicitly related to ReplicaFetcherTierStateMachine into a separate testing file ReplicaFetcherTierStateMachineTest. Reviewers: Jun Rao <junrao@gmail.com> commit f6b8b39747112bd8f3418893af760d42a1635648 Author: Philip Nee <pnee@confluent.io> Date: Fri Aug 11 13:15:30 2023 -0700 MINOR: Fix committed API in the PrototypeAsyncConsumer timeout (#14123) Discovered the committed() API timeout during the integration test. After investigation, this is because the future was not completed in the ApplicationEventProcessor. Also added toString methods to the event class for debug purposes. Reviewers: Jun Rao <junrao@gmail.com> commit f5655d31d3d527dae057240570162827c6a79fb2 Author: Greg Harris <greg.harris…
This patch implements the OffsetFetch API in the new group coordinator. I found out that implementing the `RequireStable` flag is hard (to not say impossible) in the current model. For the context, the flag is here to ensure that an OffsetRequest request does not return stale offsets if there are pending offsets to be committed. In the scala code, we basically check the pending offsets data structure and if they are any pending offsets, we return the `UNSTABLE_OFFSET_COMMIT` error. This tells the consumer to retry. In our new model, we don't have the pending offsets data structure. Instead, we use a timeline data structure to handle all the pending/uncommitted changes. Because of this we don't know whether offsets are pending for a particular group. Instead of doing this, I propose to not return the `UNSTABLE_OFFSET_COMMIT` error anymore. Instead, when `RequireStable` is set, we use a write operation to ensure that we read the latest offsets. If they are uncommitted offsets, the write operation ensures that the response is only return when they are committed. This gives a similar behaviour in the end. Reviewers: Justine Olshan <jolshan@confluent.io>
This patch implements the OffsetFetch API in the new group coordinator.
I found out that implementing the
RequireStable
flag is hard (to not say impossible) in the current model. For the context, the flag is here to ensure that an OffsetRequest request does not return stale offsets if there are pending offsets to be committed. In the scala code, we basically check the pending offsets data structure and if they are any pending offsets, we return theUNSTABLE_OFFSET_COMMIT
error. This tells the consumer to retry.In our new model, we don't have the pending offsets data structure. Instead, we use a timeline data structure to handle all the pending/uncommitted changes. Because of this we don't know whether offsets are pending for a particular group. Instead of doing this, I propose to not return the
UNSTABLE_OFFSET_COMMIT
error anymore. Instead, whenRequireStable
is set, we use a write operation to ensure that we read the latest offsets. If they are uncommitted offsets, the write operation ensures that the response is only return when they are committed. This gives a similar behaviour in the end.Note that this PR does not adds the MemberId and MemberEpoch fields to the request. This will be done in the next PR.
Committer Checklist (excluded from commit message)