Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-15800: Prevent DataExceptions from corrupting KafkaOffsetBackingStore #14718

Merged

Conversation

gharris1727
Copy link
Contributor

https://issues.apache.org/jira/browse/KAFKA-15800

The JsonConverter may throw DataException when encountering malformed data. This causes part or all of the batch of ConsumerRecords from the KafkaBasedLog consumer to be dropped, corrupting the state of the KafkaOffsetBackingStore.

This is a narrow-scope fix for just the KafkaOffsetBackingStore, and i'd like to follow up with a fix to the KafkaBasedLog to make these sorts of mistakes less impactful in the future.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

…gStore

Signed-off-by: Greg Harris <greg.harris@aiven.io>
Copy link
Contributor

@yashmayya yashmayya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix, Greg! The change itself makes sense to me, but I was just wondering about the scenarios that would lead to something like this occurring? The offsets that the source connector generates would be serialized in the runtime by the JSON converter itself before being written to the offsets topic. Is this fix for cases where connectors or operators directly produce bad data to the offsets topic for a Connect cluster?

Signed-off-by: Greg Harris <greg.harris@aiven.io>
@gharris1727
Copy link
Contributor Author

Is this fix for cases where connectors or operators directly produce bad data to the offsets topic for a Connect cluster?

I do not know how we got malformed data in the topic exactly, but the invalid records did appear to be almost valid source offsets, they were just missing a double quote in the middle of the key. It may have been from a bad converter version, a bit flip, or something else, I can't be sure.

It may have been an operator mis-typing during a manual remediation, I just don't have direct evidence for that. If that was the case, KIP-875 (which introduced this bug) should also make it less likely in the future.

@gharris1727 gharris1727 requested a review from yashmayya November 9, 2023 19:18
Copy link
Contributor

@yashmayya yashmayya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Greg!

@gharris1727
Copy link
Contributor Author

Test failures appear unrelated, and the connect and mirror tests pass locally.

@gharris1727 gharris1727 merged commit 989618a into apache:trunk Nov 10, 2023
gharris1727 added a commit that referenced this pull request Nov 10, 2023
…gStore (#14718)

Signed-off-by: Greg Harris <greg.harris@aiven.io>

Reviewers: Yash Mayya <yash.mayya@gmail.com>
gharris1727 added a commit that referenced this pull request Nov 10, 2023
…gStore (#14718)

Signed-off-by: Greg Harris <greg.harris@aiven.io>

Reviewers: Yash Mayya <yash.mayya@gmail.com>
mjsax pushed a commit to confluentinc/kafka that referenced this pull request Nov 22, 2023
…gStore (apache#14718)

Signed-off-by: Greg Harris <greg.harris@aiven.io>

Reviewers: Yash Mayya <yash.mayya@gmail.com>
rreddy-22 pushed a commit to rreddy-22/kafka-rreddy that referenced this pull request Jan 2, 2024
…gStore (apache#14718)

Signed-off-by: Greg Harris <greg.harris@aiven.io>

Reviewers: Yash Mayya <yash.mayya@gmail.com>
yyu1993 pushed a commit to yyu1993/kafka that referenced this pull request Feb 15, 2024
…gStore (apache#14718)

Signed-off-by: Greg Harris <greg.harris@aiven.io>

Reviewers: Yash Mayya <yash.mayya@gmail.com>
clolov pushed a commit to clolov/kafka that referenced this pull request Apr 5, 2024
…gStore (apache#14718)

Signed-off-by: Greg Harris <greg.harris@aiven.io>

Reviewers: Yash Mayya <yash.mayya@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants