[FLINK-39035][format][avro] Support Avro fast-read and column pruning in AvroDeserializationSchema via configuration #27536
Open
low-bee wants to merge 3 commits into apache:master
davidradl reviewed on Feb 5, 2026
+ "Avro Fastread improves Avro read speeds by constructing a resolution chain. "
+ "get more information about this feature, please visit https://issues.apache.org/jira/browse/AVRO-3230");

public static final ConfigOption<String> AVRO_WRITER_SCHEMA_STRING =
Contributor
I can't see any tests for this option.
The Confluent Avro format inherits the Avro options, so a schema can be specified there as well, and the Confluent schema registry can also supply the actual schema. I think we should understand and document which of these sources takes precedence.
I also suggest we state that this writer schema needs to be compatible with the table definition, and explain what that means. I am thinking about:
- compatibility between nullable and non-nullable fields
- what it means for pruning nested columns
- changing to castable types
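To make the compatibility questions above concrete, here is a minimal sketch of the kind of check the documentation could describe. It is a toy model, not Flink or Avro API: `WriterSchemaCheck`, `Field`, and `check` are hypothetical names, fields are flattened to a name, a primitive type name, and a nullability flag, and nested column pruning is not modeled. The promotion table follows the writer-to-reader promotions listed in the Avro specification's schema resolution rules. How the PR itself validates the writer schema is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only (hypothetical names): compares a writer schema with a table definition. */
final class WriterSchemaCheck {

    /** Hypothetical flattened field: name, primitive type name, nullability. */
    static final class Field {
        final String name;
        final String type;
        final boolean nullable;

        Field(String name, String type, boolean nullable) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
        }
    }

    /** Writer type -> reader types it may be promoted to (Avro schema resolution). */
    private static final Map<String, List<String>> PROMOTIONS = new LinkedHashMap<>();

    static {
        PROMOTIONS.put("int", Arrays.asList("long", "float", "double"));
        PROMOTIONS.put("long", Arrays.asList("float", "double"));
        PROMOTIONS.put("float", Arrays.asList("double"));
        PROMOTIONS.put("string", Arrays.asList("bytes"));
        PROMOTIONS.put("bytes", Arrays.asList("string"));
    }

    /** Returns one message per table column that the writer schema cannot satisfy. */
    static List<String> check(List<Field> writer, List<Field> table) {
        Map<String, Field> writerByName = new LinkedHashMap<>();
        for (Field f : writer) {
            writerByName.put(f.name, f);
        }

        List<String> problems = new ArrayList<>();
        for (Field col : table) {
            Field w = writerByName.get(col.name);
            if (w == null) {
                // Avro accepts this only if the reader field declares a default;
                // defaults are not tracked in this toy model, so it is flagged.
                problems.add(col.name + ": missing from writer schema");
                continue;
            }
            boolean sameType = w.type.equals(col.type);
            boolean promotable =
                    PROMOTIONS.getOrDefault(w.type, Collections.<String>emptyList())
                            .contains(col.type);
            if (!sameType && !promotable) {
                problems.add(col.name + ": cannot read " + w.type + " as " + col.type);
            }
            if (w.nullable && !col.nullable) {
                // The writer may emit null, which a NOT NULL column cannot hold.
                problems.add(col.name + ": writer field is nullable but table column is NOT NULL");
            }
        }
        // Writer fields that the table does not declare are skipped: that is the
        // top-level column pruning case, and it needs no compatibility check.
        return problems;
    }

    public static void main(String[] args) {
        List<Field> writer = Arrays.asList(
                new Field("id", "int", false),
                new Field("name", "string", true),
                new Field("payload", "bytes", false));
        List<Field> table = Arrays.asList(
                new Field("id", "long", true),
                new Field("name", "string", false));
        for (String problem : check(writer, table)) {
            System.out.println(problem);
        }
    }
}
```

In this sketch a non-nullable writer field read into a nullable column is accepted, while the reverse is reported, and a writer field the table omits (`payload`) is silently pruned. Whether the real option should reject, warn, or defer such cases to read time is exactly what the documentation would need to spell out.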
Author
I'm glad to see your reply, and I will add more unit tests and documentation as suggested.
What is the purpose of the change
Optimize the read performance of the Avro format. See https://issues.apache.org/jira/browse/FLINK-39035 for details.
Brief change log
Add Avro format options that enable fast-read and column pruning in AvroDeserializationSchema.
Verifying this change
Unit tests have been added.
Does this pull request potentially affect one of the following parts:
Dependencies (does it add or upgrade a dependency): (no)
The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
The serializers: (yes)
The runtime per-record code paths (performance sensitive): (no)
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
The S3 file system connector: (no)
Documentation
Does this pull request introduce a new feature? (no)
If yes, how is the feature documented? (not applicable)