[VL] Spark 4.1: Velox split function returns incorrect results with limit parameter (SPARK-49968) #11913

@baibaichen

Backend

VL (Velox)

Bug description

split('hello', '', 1) returns ["h"] with the Velox backend, but the expected result is ["hello"]. The bug is in Velox's C++ re2-based split implementation and shows up when the limit parameter is specified. The test that covers the limit parameter was added in Spark 4.1 via SPARK-49968.

Affects Spark 4.1 only.
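
The expected behavior described above can be sketched as a small reference model. This is not Spark or Velox code: the function name `split_on_empty_delimiter` is hypothetical, and the semantics are an assumption modeled on the usual limit convention, where at most `limit` elements are returned, the last element keeps the unsplit remainder, and a non-positive limit means no cap. The issue itself only confirms the `limit = 1` case.

```python
from typing import List


def split_on_empty_delimiter(text: str, limit: int) -> List[str]:
    """Hypothetical reference model for split(text, '', limit).

    Assumed semantics (not taken from Spark or Velox source):
    - the string is split into single characters;
    - at most `limit` elements are returned when limit > 0;
    - the last element holds whatever remains of the input.
    """
    if not text:
        # Assumption: an empty input yields a single empty string.
        return [""]

    # A non-positive or oversized limit means one element per character.
    parts = len(text) if limit <= 0 or limit > len(text) else limit

    # The first parts - 1 elements are single characters.
    head = [text[i] for i in range(parts - 1)]

    # The final element keeps the rest of the string instead of being truncated.
    return head + [text[parts - 1:]]


print(split_on_empty_delimiter("hello", 1))  # ['hello']
```

Under this model, the ["h"] result reported for Velox corresponds to dropping the remainder rather than keeping it in the last element. A check of this shape could serve as a regression case once the backend is fixed.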

Parent issue: #11550

Impact

| Suite / Item | Status | spark40 | spark41 |
| --- | --- | --- | --- |
| GlutenCollationRegexpExpressionsSuite | Entire suite TODO (1 failure) | 🟢 | 🔴 |
| GlutenRegexpExpressionsSuite | .exclude("SPLIT") | 🟢 | 🔴 |
| VeloxStringFunctionsSuite | testWithSpecifiedSparkVersion skips 4.x | 🟢 | 🔴 |

Labels: bug
