
feat: support read to arrow #116

Open
luoyuxia wants to merge 2 commits into apache:main from luoyuxia:support-read-to-arrow-final
Conversation

@luoyuxia
Contributor

Purpose

Linked issue: close #115

Brief change log

Tests

API and Format

Documentation

@luoyuxia force-pushed the support-read-to-arrow-final branch from 54e9acf to ecee807 on March 10, 2026 at 11:47
@luoyuxia force-pushed the support-read-to-arrow-final branch 3 times, most recently from b969a6c to 16f7ab1 on March 10, 2026 at 13:17
@luoyuxia force-pushed the support-read-to-arrow-final branch from 16f7ab1 to 42b2732 on March 10, 2026 at 13:30
.iter()
.flat_map(|ds| ds.data_file_entries().map(|(p, _)| p))
.map(|p| {
if !p.to_ascii_lowercase().ends_with(".parquet") {


only parquet?

Contributor Author


Yes, let's only support Parquet for now. We can add support for more formats in a follow-up PR.


👌

.iter()
.flat_map(|ds| ds.data_file_entries().map(|(p, _)| p))
.map(|p| {
if !p.to_ascii_lowercase().ends_with(".parquet") {

@XiaoHongbo-Hope Mar 10, 2026


I get your point that only Parquet is supported. But I am afraid this check is not enough; see DataFilePathFactory.formatIdentifier. Or can we find a better way to get the format?

Contributor Author

@luoyuxia Mar 10, 2026


I understand the concern. However, for parquet files we normally do not expect extra compression suffixes like .gz, since compression is handled within the parquet format itself.

So for the current parquet-only read path, I think checking for .parquet should be sufficient. If we later need to support other file naming conventions, we can revisit this and align with DataFilePathFactory.formatIdentifier.
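For illustration, a suffix-aware check along those lines might look like the sketch below. This is a hypothetical helper, not the PR's code or the actual DataFilePathFactory.formatIdentifier logic, and the compression-suffix list is made up for the example:

```python
# Hypothetical sketch: identify a data file's storage format while
# skipping known compression suffixes (suffix list is illustrative).
COMPRESSION_SUFFIXES = {"gz", "zstd", "lz4", "snappy"}

def format_identifier(path: str) -> str:
    # Take the file name and split it into dot-separated extensions.
    parts = path.lower().rsplit("/", 1)[-1].split(".")
    # Walk the extensions right-to-left, skipping compression suffixes,
    # so "data.json.gz" resolves to "json" rather than "gz".
    for ext in reversed(parts[1:]):
        if ext not in COMPRESSION_SUFFIXES:
            return ext
    return ""

print(format_identifier("bucket-0/data.parquet"))  # parquet
print(format_identifier("bucket-0/data.json.gz"))  # json
```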

By the way, from the Python side, it seems we only inspect the last extension via os.path.splitext(). That means a path like *.json.gz would be identified as gz, not json.

So the current behavior there does not appear to handle compressed suffixes in a generalized way either.

For parquet, this is less of a concern since we normally do not expect an extra compression suffix such as .parquet.gz.
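The os.path.splitext() behavior described above is easy to verify with a standalone snippet (the file names here are made up for illustration):

```python
import os.path

# splitext only strips the final extension, so a doubly suffixed file
# is reported by its compression suffix, not its underlying format.
print(os.path.splitext("data.json.gz"))   # ('data.json', '.gz')
print(os.path.splitext("data.parquet"))   # ('data', '.parquet')
```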


Thanks for your very kind explanation. And thanks again for pointing out the issue on the Python side; I will check it.

@XiaoHongbo-Hope

+1



Development

Successfully merging this pull request may close these issues.

support read parquet to arrow

2 participants