If you want to contribute, please leave a comment on this issue and mark the TODO tests you plan to work on.
Backend
VL(Velox)
Bug Description
These suites were commented out in #11512 with TODO markers (and updated based on #11800), indicating that they need to be fixed and re-enabled.
Context
The test suites below have been disabled due to various failures. Each table shows the status for both Spark 4.0 and 4.1:
- 🔴 = Suite is commented out (disabled)
- 🟢 = Suite is enabled
The failure count (when available) is shown in the status column.
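For anyone tallying progress from the tables, the status cells can be read mechanically. The following is a minimal sketch, not part of the Gluten tooling: `Status`, `parse_status`, and `count_disabled` are hypothetical helper names, and the sample rows are copied from the tables in this issue.

```python
import re
from typing import NamedTuple, Optional, Sequence, Tuple


class Status(NamedTuple):
    enabled: bool            # True for 🟢, False for 🔴
    failures: Optional[int]  # None when the cell gives no failure count


# Matches cells such as "🟢", "🔴", or "🔴 (4 failures)".
_STATUS_RE = re.compile(r"^(🔴|🟢)(?:\s*\((\d+) failures?\))?$")


def parse_status(cell: str) -> Status:
    """Parse one status cell into an enabled flag and optional failure count."""
    match = _STATUS_RE.match(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized status cell: {cell!r}")
    marker, count = match.groups()
    return Status(enabled=(marker == "🟢"),
                  failures=int(count) if count else None)


def count_disabled(rows: Sequence[Tuple[str, str, str]]) -> Tuple[int, int]:
    """Count disabled suites per version from (suite, spark40, spark41) rows."""
    disabled_40 = sum(not parse_status(r[1]).enabled for r in rows)
    disabled_41 = sum(not parse_status(r[2]).enabled for r in rows)
    return disabled_40, disabled_41


# Sample rows taken from the org.apache.spark.sql table.
rows = [
    ("GlutenExplainSuite", "🔴 (4 failures)", "🔴 (4 failures)"),
    ("GlutenRandomDataGeneratorSuite", "🟢", "🔴 (232 failures)"),
    ("GlutenSparkSessionJobTaggingAndCancellationSuite", "🔴", "🔴"),
]
```

Calling `count_disabled(rows)` on a full table gives the number of disabled suites in that table only. The totals under Summary Statistics also cover suites tracked elsewhere (for example the streaming suites in #11911), so they will not match a per-table count.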
org.apache.spark.sql
| Suite | Spark 4.0 | Spark 4.1 | Owner | Comments |
|---|---|---|---|---|
| GlutenDataFrameSubquerySuite | 🟢 | 🟢 | @baibaichen | Fixed in #11727 |
| GlutenExplainSuite | 🔴 (4 failures) | 🔴 (4 failures) | | Gluten doesn't use codegen (2 codegen + 2 DPP/V2 scan) |
| GlutenJoinHintSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11816 |
| GlutenLogQuerySuite | 🟢 | 🟢 | @kapilks | Fixed |
| GlutenRandomDataGeneratorSuite | 🟢 | 🔴 (232 failures) | #11919 | TimeType |
| GlutenSetCommandSuite | 🔴 (1 failure) | 🔴 (1 failure) | @Surbhi-Vijay | |
| GlutenSingleLevelAggregateHashMapSuite | 🟢 | 🟢 | @kapilks | Fixed |
| GlutenSparkSessionJobTaggingAndCancellationSuite | 🔴 | 🔴 | @baibaichen | Flaky, disabled in #11908 |
| GlutenTwoLevelAggregateHashMapSuite | 🟢 | 🟢 | @kapilks | Fixed |
| GlutenTwoLevelAggregateHashMapWithVectorizedMapSuite | 🟢 | 🟢 | @kapilks | Fixed |
| GlutenVariantEndToEndSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11726 |
| GlutenVariantShreddingSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11726 |
| GlutenXmlFunctionsSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11725 |
| GlutenSparkSessionExtensionSuite | 🔴 (1 failure) | 🔴 (1 failure) | | Discovered in #11800 |
| GlutenTPCDSV1_4_PlanStabilitySuite | 🟢 | 🟢 | @baibaichen | Fixed in #11799 |
| GlutenTPCDSV1_4_PlanStabilityWithStatsSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11799 |
| GlutenTPCDSV2_7_PlanStabilitySuite | 🟢 | 🟢 | @baibaichen | Fixed in #11799 |
| GlutenTPCDSV2_7_PlanStabilityWithStatsSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11799 |
| GlutenTPCDSModifiedPlanStabilitySuite | 🟢 | 🟢 | @baibaichen | Fixed in #11799 |
| GlutenTPCDSModifiedPlanStabilityWithStatsSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11799 |
| GlutenTPCHPlanStabilitySuite | 🟢 | 🟢 | @baibaichen | Fixed in #11799 |
catalyst.expressions
| Suite | Spark 4.0 | Spark 4.1 | Owner | Comments |
|---|---|---|---|---|
| GlutenCastWithAnsiOnSuite | 🔴 (4 failures) | 🔴 (10 failures) | #10134 | ANSI + TimeType |
| GlutenCollationRegexpExpressionsSuite | 🟢 | 🔴 (1 failure) | #11913 | Velox split limit unsupported |
| GlutenCsvExpressionsSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11816 |
| GlutenExpressionEvalHelperSuite | 🔴 (2 failures) | 🔴 (2 failures) | | Discovered in #11800 |
| GlutenObjectExpressionsSuite | 🔴 (7 failures) | 🔴 (7 failures) | | Discovered in #11800 |
| GlutenOrderingSuite | 🟢 | 🔴 (2 failures) | #11919 | TimeType |
| GlutenScalaUDFSuite | 🔴 (1 failure) | 🔴 (1 failure) | | Discovered in #11800 |
| GlutenToPrettyStringSuite | 🔴 (1 failure) | 🔴 (1 failure) | #11918 | Velox timestamp timezone unsupported |
| GlutenXmlExpressionsSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11580 |
connector
| Suite | Spark 4.0 | Spark 4.1 | Owner | Comments |
|---|---|---|---|---|
| GlutenGroupBasedUpdateTableSuite | 🔴 (1 failure) | 🔴 (1 failure) | #11912 | JNI exception mapping |
| GlutenMergeIntoDataFrameSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11812 |
execution
| Suite | Spark 4.0 | Spark 4.1 | Owner | Comments |
|---|---|---|---|---|
| GlutenColumnarRulesSuite | 🟢 | 🔴 (1 failure) | #11920 | dual-mode ColumnarToRow |
| GlutenDataSourceScanExecRedactionSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11816 |
| GlutenDataSourceV2ScanExecRedactionSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11816 |
| GlutenExternalAppendOnlyUnsafeRowArraySuite | 🟢 | 🟢 | @baibaichen | Fixed in #11847 |
| GlutenHiveResultSuite | 🟢 | 🔴 (1 failure) | #11919 | TimeType |
| GlutenInsertSortForLimitAndOffsetSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11816 |
| GlutenLogicalPlanTagInSparkPlanSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11833. Core fix: propagate LOGICAL_PLAN_TAG during offload |
| GlutenMultiStatefulOperatorsSuite | 🔴 (2 failures) | 🔴 (2 failures) | #11911 | Streaming |
| GlutenPlannerSuite | 🔴 (15 failures) | 🔴 (15 failures) | | Optimization behavior difference (9 testGluten + 6 exclude) |
| GlutenProjectedOrderingAndPartitioningSuite | 🔴 (6 failures) | 🔴 (6 failures) | | SinglePartition vs HashPartitioning difference |
| GlutenRemoveRedundantProjectsSuite | 🔴 (14 failures) | 🔴 (14 failures) | | Optimization behavior difference |
| GlutenRemoveRedundantSortsSuite | 🔴 (5 failures) | 🔴 (5 failures) | | Optimization behavior difference |
| GlutenSQLExecutionSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11847 |
| GlutenSQLJsonProtocolSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11847 |
| GlutenShufflePartitionsUtilSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11847 |
| GlutenSimpleSQLViewSuite | 🔴 (1 failure) | 🔴 (2 failures) | #11912 | JNI exception mapping + precision loss |
| GlutenSparkPlanSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11816 |
| GlutenUnsafeRowSerializerSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11847 |
| GlutenWholeStageCodegenSparkSubmitSuite | 🔴 (1 failure) | 🔴 (1 failure) | | |
| GlutenWholeStageCodegenSuite | 🔴 (24 failures) | 🔴 (24 failures) | @Surbhi-Vijay | |
execution.joins
| Suite | Spark 4.0 | Spark 4.1 | Owner | Comments |
|---|---|---|---|---|
| GlutenSingleJoinSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11577 |
execution.datasources.parquet
| Suite | Spark 4.0 | Spark 4.1 | Owner | Comments |
|---|---|---|---|---|
| GlutenParquetTypeWideningSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11683 |
| GlutenParquetVariantShreddingSuite | 🟢 | 🟢 | @baibaichen | Fixed in #11726 |
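execution.datasources.text
- GlutenWholeTextFileV1Suite
- GlutenWholeTextFileV2Suite
execution.python
- GlutenPythonDataSourceSuite
- GlutenPythonUDFSuite
- GlutenPythonUDTFSuite
- GlutenRowQueueSuite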
sources
| Suite | Spark 4.0 | Spark 4.1 | Owner | Comments |
|---|---|---|---|---|
| GlutenBucketedReadWithHiveSupportSuite | 🟢 | 🟢 | @baibaichen | Deleted in #11847 (TestHiveSingleton conflict) |
| GlutenBucketedWriteWithHiveSupportSuite | 🟢 | 🟢 | @baibaichen | Deleted in #11847 (TestHiveSingleton conflict) |
| GlutenCommitFailureTestRelationSuite | 🟢 | 🟢 | @baibaichen | Deleted in #11847 (TestHiveSingleton conflict) |
| GlutenDisableUnnecessaryBucketedScanWithHiveSupportSuite | 🟢 | 🟢 | @baibaichen | Deleted in #11847 (TestHiveSingleton conflict) |
| GlutenJsonHadoopFsRelationSuite | 🟢 | 🟢 | @baibaichen | Deleted in #11847 (TestHiveSingleton conflict) |
| GlutenParquetHadoopFsRelationSuite | 🟢 | 🟢 | @baibaichen | Deleted in #11847 (TestHiveSingleton conflict) |
| GlutenSimpleTextHadoopFsRelationSuite | 🟢 | 🟢 | @baibaichen | Deleted in #11847 (TestHiveSingleton conflict) |
streaming
Structured Streaming suites (20 disabled) are tracked in #11911.
Summary Statistics
- Total suites disabled in Spark 4.0: 69
- Total suites disabled in Spark 4.1: 79
- Unique suites across both versions: 79
PRs
| PR | Status | Suites fixed |
|---|---|---|
| #11577 | ✅ Merged | 1: SingleJoinSuite |
| #11580 | ✅ Merged | 1: XmlExpressionsSuite |
| #11683 | ✅ Merged | 1: ParquetTypeWideningSuite |
| #11725 | ✅ Merged | 1: XmlFunctionsSuite |
| #11726 | ✅ Merged | 2: VariantEndToEnd, VariantShreddingSuite |
| #11727 | ✅ Merged | 1: DataFrameSubquerySuite |
| #11799 | ✅ Merged | 7: TPCDS PlanStability suites |
| #11812 | ✅ Merged | 1: MergeIntoDataFrame (JobTagging reverted in #11908) |
| #11833 | ✅ Merged | 1: LogicalPlanTagInSparkPlan (150 tests) |
| #11847 | ✅ Merged | 6 enabled + 7 deleted: SystemPropertyTrait suites |
| #11816 | ✅ Merged | 8 suites + 2 trait fix + 1 disable |
| #11908 | 🔵 Draft | 4: PythonDataSource, PythonUDF, PythonUDTF, RowQueue |