Releases: StarRocks/starrocks
3.4.8
Release Date: September 30, 2025
Behavior Change
- Lake internal tablet parallel scan (enable_lake_tablet_internal_parallel) is now enabled by default, increasing per‑query internal parallelism (may raise peak resource usage) #62360
Bug Fixes
The following issues have been fixed:
Data Lake Analytics
- Delta Lake partition column names were forcibly converted to lowercase, causing mismatch with actual column names #62970
- Iceberg manifest cache eviction race could trigger a NullPointerException #63052
- Uncaught generic exceptions during Iceberg scan phase interrupted scan range submission and produced no metrics #63019
Materialized Views (MV)
- Complex multi-layer projected views used in MV rewrite produced invalid plans or missing column statistics #63014 #62230
- Case mismatch of Hive external table MV partition columns was incorrectly rejected #62623
- MV refresh used only the creator’s default role, causing insufficient privilege in “no default role” or LDAP setups (role activation strategy & config introduced) #62461
- Case-insensitive conflicts in list-partitioned MV partition names led to duplicate name errors #62443
- Residual version mapping after failed MV restore caused subsequent incremental refresh to be skipped, returning empty results #62643
- Abnormal partitions after MV recovery caused FE restart NullPointerException #62563
- Non-global aggregation queries incorrectly applied aggregation pushdown rewrite producing invalid plans #63105
Storage / Metadata
- Tablet deletion state was only updated in memory (shutdown) and not persisted, so GC still treated it as running and skipped reclamation #63623 #63620
- Concurrent query plus drop tablet led to early delvec cleanup and “no delete vector found” errors #63307
- Base and cumulative sstable sharing the same max_rss_rowid in PK index compaction were misordered, risking lost delete semantics #63362
- Possible BE crash when LakePersistentIndex destructor ran after a failed initialization #62297
- Graceful shutdown of publish thread pool silently discarded queued tasks without marking failures, creating version holes and a false “all succeeded” impression #62683
- Newly cloned replica on a newly added BE during rebalance was immediately judged redundant and removed, preventing data migration to the new node #62894
- Missing lock when reading tablet max version caused inconsistent replication transaction decisions #62280
Query & Optimization
- Combination of date_trunc equality and raw column range predicate was reduced to a point interval, returning empty result sets (e.g. date_trunc('month', dt)='2025-09-01' AND dt>'2025-09-23') #63570
- Pushdown of non-deterministic predicates (random/time functions) produced inconsistent results #63533
- Missing consumer node after CTE reuse decision produced incomplete execution plans #63188
- Type mismatch crashes when table functions and low-cardinality (dictionary) encoding coexisted #62500 #62384
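The `date_trunc` item above can be checked by hand. The following Python sketch is illustrative only and is not StarRocks code; the helper names `month_range` and `satisfies` are hypothetical. It rewrites the month-truncation equality as the half-open date range it implies and combines that range with the raw column predicate, showing that the two predicates together still admit rows.

```python
from datetime import date


def month_range(first_day: date) -> tuple[date, date]:
    """Half-open range [start, end) of dates whose month truncation equals first_day."""
    if first_day.month == 12:
        end = date(first_day.year + 1, 1, 1)
    else:
        end = date(first_day.year, first_day.month + 1, 1)
    return first_day, end


def satisfies(dt: date) -> bool:
    """date_trunc('month', dt) = '2025-09-01' AND dt > '2025-09-23'."""
    start, end = month_range(date(2025, 9, 1))
    return start <= dt < end and dt > date(2025, 9, 23)


# Dates from 2025-09-24 through 2025-09-30 satisfy both predicates,
# so collapsing the pair to a single point would wrongly drop them.
print([d for d in range(20, 31) if satisfies(date(2025, 9, d))])
```

A correct simplification keeps the range `(2025-09-23, 2025-10-01)` rather than reducing it to a point.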
Ingestion & Export
- Oversized CSV split into parallel fragments caused every fragment to skip header rows, leading to data loss (only the first fragment should skip) #62789
- SHOW CREATE ROUTINE LOAD without explicit DB returned job from another database with the same name #62792
- NullPointerException when sameLabelJobs became null during concurrent load job cleanup #63181
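The CSV header item above comes down to one rule: only the first fragment of a split file should skip header rows. The Python sketch below illustrates that rule under simplifying assumptions. It splits by line count rather than by byte range and assumes the header fits in the first fragment. The function names are hypothetical and do not come from the StarRocks code base.

```python
def split_lines(lines, fragment_size):
    """Split a list of lines into consecutive fragments of at most fragment_size lines."""
    return [lines[i:i + fragment_size] for i in range(0, len(lines), fragment_size)]


def read_fragments(fragments, skip_header):
    """Concatenate fragments, dropping header rows from the first fragment only."""
    rows = []
    for index, fragment in enumerate(fragments):
        skip = skip_header if index == 0 else 0
        rows.extend(fragment[skip:])
    return rows


def read_fragments_buggy(fragments, skip_header):
    """Faulty variant: every fragment skips header rows, so data rows are lost."""
    rows = []
    for fragment in fragments:
        rows.extend(fragment[skip_header:])
    return rows


lines = ["id,name", "1,a", "2,b", "3,c", "4,d"]
fragments = split_lines(lines, 2)
print(read_fragments(fragments, skip_header=1))
print(read_fragments_buggy(fragments, skip_header=1))
```

The faulty variant drops the first row of every later fragment, which matches the data loss described in the fix.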
Cluster Operations & Management
- CN normal restart or crash path incorrectly executed scale-in deregistration, harming topology consistency #63002 #63010
- Backend decommission blocked even when all tablets were already in recycle bin (no force completion) #63267
- OPTIMIZE TABLE task stuck in PENDING after thread pool rejection #62556
- Dirty tablet metadata cleanup used GTID arguments in the wrong order #62285
3.5.6
Release date: September 22, 2025
Improvements
- A decommissioned BE will be forcibly dropped when all its tablets are in the recycle bin, to avoid the decommission being blocked by those tablets. #62781
- Vacuum metrics will be updated when Vacuum succeeds. #62540
- Added thread pool metrics to the fragment instance execution state report, including active threads, queue count, and running threads. #63067
- Supports S3 path-style access in shared-data clusters to improve compatibility with MinIO and other S3-compatible storage systems. You can enable this feature by setting `aws.s3.enable_path_style_access` to `true` when creating a storage volume. #62591
- Supports resetting the starting point of the AUTO_INCREMENT value via `ALTER TABLE <table_name> AUTO_INCREMENT = 10000;`. #62767
- Supports using Distinguished Name (DN) in Group Provider for group matching, improving the user group solution for LDAP/Microsoft Active Directory environments. #62711
- Supports Azure Workload Identity authentication for Azure Data Lake Storage Gen2. #62754
- Added transaction error messages to the `information_schema.loads` view to aid failure diagnosis. #61364
- Supports reusing common expressions for complex CASE WHEN expressions in Scan predicates to reduce repetitive computation. #62779
- Uses the REFRESH (instead of ALTER) privilege on the materialized view to execute REFRESH statements. #62636
- Disabled low-cardinality optimization on Lake tables by default to avoid potential issues. #62586
- Enabled tablet balancing between workers by default in shared-data clusters. #62661
- Supports reusing expressions in outer-join WHERE predicates to reduce repetitive computation. #62139
- Added Clone metrics in FE. #62421
- Added Clone metrics in BE. #62479
- Added an FE configuration item `enable_statistic_cache_refresh_after_write` to disable statistics-cache lazy refresh by default. #62518
- Masked credential information in SUBMIT TASK for better security. #62311
- `json_extract` in the Trino dialect returns a JSON type. #59718
- Supports ARRAY type in `null_or_empty`. #62207
- Adjusted the size limit for the Iceberg manifest cache. #61966
- Added a remote file-cache limit for Hive. #62288
Bug Fixes
The following issues have been fixed:
- Secondary replicas hang indefinitely due to negative timeout values, which cause incorrect timestamp comparisons. #62805
- PublishTask may be blocked when TransactionState is REPLICATION. #61664
- Incorrect repair mechanism for Hive tables that have been dropped and recreated during materialized view refresh. #63072
- Incorrect execution plans were generated after the materialized view aggregation push‑down rewrite. #63060
- ANALYZE PROFILE failures caused by PlanTuningGuide producing unrecognized strings (null explainString) in the query profiles. #63024
- Inappropriate return type of `hour_from_unixtime` and incorrect rewrite rule of `CAST`. #63006
- NPE in Iceberg manifest cache under data races. #63043
- Shared-data clusters lack support for colocation in materialized views. #62941
- Iceberg table Scan Exception during Scan Range deployment. #62994
- Incorrect execution plans were generated for view-based rewrite. #62918
- Errors and disrupted tasks because Compute Nodes are not gracefully shut down on exit. #62916
- NPE when Stream Load execution status updates. #62921
- An issue with statistics when the column name and the name in the PARTITION BY clause differ in case. #62953
- Wrong results are returned when the `LEAST` function is used as a predicate. #62826
- Invalid ProjectOperator above the table-pruning frontier CTEConsumer. #62914
- Redundant replica handling after Clone. #62542
- Failed to collect Stream Load profiles. #62802
- Ineffective disk rebalancing caused by improper BE selection. #62776
- A potential NPE crash in LocalTabletsChannel when a missing `tablet_id` leads to a null delta writer. #62861
- KILL ANALYZE does not take effect. #62842
- SQL syntax errors in histogram stats when MCV values contain single quotes. #62853
- Incorrect output format of metrics for Prometheus. #62742
- NPE when querying `information_schema.analyze_status` after the database is dropped. #62796
- CVE-2025-58056. #62801
- When SHOW CREATE ROUTINE LOAD is executed, wrong results are returned because the database is considered null if not specified. #62745
- Data loss caused by incorrectly skipping CSV headers in `files()`. #62719
- NPE when replaying batch-transaction upserts. #62715
- Publish being incorrectly reported as successful during graceful shutdown in shared-nothing clusters. #62417
- Crash in asynchronous delta writer due to a null pointer. #62626
- Materialized view refresh is skipped because the materialized view version map is not cleared after a failed restore job. #62634
- Issues caused by case-sensitive partition column validation in the materialized view analyzer. #62598
- Duplicate IDs for statements with syntax errors. #62258
- StatisticsExecutor status is overridden due to redundant state assignment in CancelableAnalyzeTask. #62538
- Incorrect error messages produced by statistics collection. #62533
- Premature throttling caused by insufficient default maximum connections for external users. #62523
- A potential NPE in materialized view backup and restore operations. #62514
- Incorrect `http_workers_num` metric. #62457
- The runtime filter fails to locate the corresponding execution group during construction. #62465
- Tedious results on Scan Node caused by simplifying CASE WHEN with complex functions. #62505
- `gmtime` is not thread-safe. #60483
- An issue with getting Hive partitions with escaped strings. #59032
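The histogram item above (MCV values containing single quotes) is an instance of a general string-literal quoting problem. The Python sketch below shows the problem and one common remedy, doubling embedded single quotes as in standard SQL. It is an assumption for illustration and may differ from the actual change in #62853. The helper `sql_string_literal` is hypothetical, and the sketch does not handle backslash escapes, which some MySQL-compatible dialects also treat specially.

```python
def sql_string_literal(value: str) -> str:
    """Wrap value in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


mcv = "O'Brien"

# Naive concatenation leaves an odd number of quotes, so the statement cannot parse.
naive = "'" + mcv + "'"

# Doubling the embedded quote keeps the literal balanced.
escaped = sql_string_literal(mcv)

print(naive)
print(escaped)
```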
4.0.0-RC
Release date: September 9, 2025
Data Lake Analytics
- Unified Page Cache and Data Cache for BE metadata, and adopted an adaptive strategy for scaling. #61640
- Optimized metadata file parsing for Iceberg statistics to avoid repetitive parsing. #59955
- Optimized COUNT/MIN/MAX queries against Iceberg metadata by efficiently skipping over data file scans, significantly improving aggregation query performance on large partitioned tables and reducing resource consumption. #60385
- Supports compaction for Iceberg tables via procedure `rewrite_data_files`.
- Supports Iceberg tables with hidden partitions, including creating, writing, and reading the tables. #58914
- Supports the TIME data type in the Paimon catalog. #58292
Security and Authentication
- In scenarios where JWT authentication and the Iceberg REST Catalog are used, StarRocks supports the passthrough of user login information to Iceberg via the REST Session Catalog for subsequent data access authentication. #59611 #58850
- Supports vended credentials for the Iceberg catalog.
Storage Optimization and Cluster Management
- Introduced the File Bundling optimization for the cloud-native table in shared-data clusters to automatically bundle the data files generated by loading, Compaction, or Publish operations, thereby reducing the API cost caused by high-frequency access to the external storage system. #58316
- Supports Kafka 4.0 for Routine Load.
- Supports full-text inverted indexes on Primary Key tables in shared-nothing clusters.
- Supports enabling case-insensitive processing on names of catalogs, databases, tables, views, and materialized views. #61136
- Supports blacklisting Compute Nodes in shared-data clusters. #60830
- Supports global connection ID. #57256
Query and Performance Improvement
- Supports the DECIMAL256 data type, expanding the upper limit of precision from 38 to 76 digits. Its 256-bit storage provides better adaptability to high-precision financial and scientific computing scenarios, effectively mitigating DECIMAL128's precision overflow problem in very large aggregations and high-order operations. #59645
- Optimized the performance of the JOIN and AGG operators. #61691
- [Preview] Introduced SQL Plan Manager to allow users to bind a query plan to a query, thereby preventing the query plan from changing due to system state changes (mainly data updates and statistics updates), thus stabilizing query performance. #56310
- Introduced Partition-wise Spillable Aggregate/Distinct operators to replace the original Spill implementation based on sorted aggregation, significantly improving aggregation performance and reducing read/write overhead in complex and high-cardinality GROUP BY scenarios. #60216
- Flat JSON V2:
  - Supports configuring Flat JSON on the table level. #57379
  - Enhances JSON columnar storage by retaining the V1 mechanism while adding page- and segment-level indexes (ZoneMaps, Bloom filters), predicate pushdown with late materialization, dictionary encoding, and integration of a low-cardinality global dictionary to significantly boost execution efficiency. #60953
- Supports an adaptive ZoneMap index creation strategy for the STRING data type. #61960
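The precision limits of 38 and 76 quoted for DECIMAL128 and DECIMAL256 are counts of decimal digits. They can be reproduced with a short arithmetic check, assuming the unscaled decimal value is held in a signed two's-complement integer of the stated width. That is an assumption made for illustration, not a statement about the StarRocks storage format, and the helper `max_decimal_digits` is hypothetical.

```python
def max_decimal_digits(bits: int) -> int:
    """Largest p such that every p-digit decimal integer fits in a signed integer of `bits` bits."""
    max_value = 2 ** (bits - 1) - 1
    # max_value is not of the form 99...9, so only numbers with one fewer
    # digit than max_value are guaranteed to fit.
    return len(str(max_value)) - 1


print(max_decimal_digits(128))
print(max_decimal_digits(256))
```

Under that assumption, doubling the width from 128 to 256 bits doubles the guaranteed decimal precision.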
Functions and SQL Syntax
- Added the following functions:
- Provides the following syntactic extensions:
3.5.5
Release date: September 5, 2025
Improvements
- Added a new system variable `enable_drop_table_check_mv_dependency` (default: `false`). When set to `true`, if the object to be dropped is referenced by a downstream materialized view, the system prevents the execution of `DROP TABLE`/`DROP VIEW`/`DROP MATERIALIZED VIEW`. The error message lists the dependent materialized views and suggests checking the `sys.object_dependencies` view for details. #61584
- Logs now include the Linux distribution and CPU architecture of the build, to facilitate issue reproduction and troubleshooting. Log format: `... build <hash> distro <id> arch <arch>`. #62017
- Persisted per-Tablet index and incremental column group file sizes are now cached, replacing on-demand directory scans. This accelerates Tablet status reporting in BE and reduces latency under high I/O scenarios. #61901
- Downgraded several high-frequency INFO logs in FE and BE to VLOG, and aggregated task submission logs, significantly reducing redundant storage-related logs and log volume under heavy load. #62121
- Improved query performance for External Catalog metadata through `information_schema` by pushing table filters before calling `getTable`, avoiding per-table RPCs. #62404
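The drop-time dependency check described in the first item of this list can be pictured with a small Python sketch. It is a conceptual illustration only. The dependency map, the object names, and the function `check_drop` are hypothetical and do not reflect how StarRocks stores or queries dependencies.

```python
def check_drop(target, dependencies, check_enabled):
    """Refuse the drop if the check is on and materialized views reference `target`."""
    dependents = sorted(dependencies.get(target, []))
    if check_enabled and dependents:
        raise RuntimeError(
            f"Cannot drop {target}: referenced by materialized views {dependents}"
        )
    return True


# Hypothetical dependency map: base object -> downstream materialized views.
deps = {"db.orders": ["db.mv_daily_orders"]}

# With the check disabled (the default), the drop proceeds.
print(check_drop("db.orders", deps, check_enabled=False))

# With the check enabled, the drop is rejected and the dependents are listed.
try:
    check_drop("db.orders", deps, check_enabled=True)
except RuntimeError as exc:
    print(exc)
```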
Bug Fixes
The following issues have been fixed:
- NullPointerException when fetching partition-level column statistics during the Plan stage due to missing data. #61935
- Fixed Parquet write issues with non-empty NULL arrays, and corrected `SPLIT(NULL, …)` behavior to consistently return NULL, preventing data corruption and runtime errors. #61999
- Failure when creating materialized views using `CASE WHEN` expressions due to incompatible VARCHAR type returns (fixed by ensuring consistency before and after refresh, and introducing a new FE configuration `transform_type_prefer_string_for_varchar` to prefer STRING and avoid length mismatch). #61996
- Statistics for nested CTEs could not be computed outside of memo when `enable_rbo_table_prune` was `false`. #62070
- In Audit Logs, inaccurate Scan Rows results for INSERT INTO SELECT statements. #61381
- ExceptionInInitializerError/NullPointerException during initialization caused FE startup failure when Query Queue v2 was enabled. #62161
- BE crash when `LakePersistentIndex` initialization failed and `_memtable` cleanup was triggered. #62279
- Permission issues during materialized view refresh due to creator roles not being activated (fixed by adding FE configuration `mv_use_creator_based_authorization`. When set to `false`, materialized views are refreshed as root, for compatibility with LDAP-authenticated clusters). #62396
- Materialized view refresh failures caused by case-sensitive List partition table names (fixed by enforcing case-insensitive uniqueness checks on partition names, aligning with OLAP table semantics). #62389
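The last item mentions a case-insensitive uniqueness check on partition names. A minimal Python sketch of such a check follows. The function `find_case_conflicts` and the sample partition names are hypothetical, and the sketch is not taken from the StarRocks implementation.

```python
def find_case_conflicts(names):
    """Return pairs of partition names that differ only by letter case."""
    seen = {}
    conflicts = []
    for name in names:
        key = name.casefold()
        if key in seen and seen[key] != name:
            conflicts.append((seen[key], name))
        else:
            seen.setdefault(key, name)
    return conflicts


print(find_case_conflicts(["pBeijing", "pbeijing", "pShanghai"]))
```

Names that collide under this comparison would be treated as the same partition name.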
3.3.18
Release Date: August 28, 2025
Bug Fixes
The following issues have been fixed:
- BE crashes when `LakePersistentIndex` initialization failed due to cleanup of `_memtable`. #62279
- A concurrency issue caused by missing locks when retrieving the maximum Tablet version in the replication transaction manager. #62238
- A hang issue in the phased scheduler, which waited indefinitely during synchronous Profile collection (after the fix, the system correctly terminates Profile collection when scheduling errors occur). #62140
- Exception handling issues in low-cardinality optimization under the `ALLOW_THROW_EXCEPTION` mode (after the fix, exceptions in expression evaluation are properly caught and returned). #62098
- The system failed to compute nested CTE statistics outside of the memo during table pruning when `enable_rbo_table_prune` was set to `false`. #62070
- CVE-2025-55163 issue. #62041
- An issue where `split_morsel_queue` nested inside `partition_morsel_queue` failed to correctly receive the Tablet Schema. #62034
- Incorrect handling of `NULL` arrays during Parquet writes, which could cause data inconsistency or crashes (after the fix, the system ensures the `split` function can correctly handle `NULL` input strings). #61999
- Failure when creating materialized views using `CASE WHEN` expressions due to incompatible return types of VARCHAR (after the fix, the system ensures consistency before and after refresh). #61996
- A concurrency safety issue caused by long operations holding shard-level locks while calculating compression scores. #61899
- An incomplete table pruning issue in CBO caused by pruning logic not considering all relevant predicates. #61881
3.4.7
Release Date: September 1, 2025
Bug Fixes
The following issues have been fixed:
- `max_filter_ratio` is not persisted for Routine Load jobs. #61755
- In Stream Load, the `now(precision)` function lost the precision parameter. #61721
- In Audit Log, the Scan Rows result for `INSERT INTO SELECT` statements was inaccurate. #61381
- After upgrading the cluster to v3.4.5, the `fslib read iops` metric increased compared to before the upgrade. #61724
- Queries against SQLServer using JDBC Catalog often got stuck. #61719
3.5.4
Release Date: August 22, 2025
Improvements
- Added logs to clarify the reason that tablets cannot be repaired. #61959
- Optimized DROP PARTITION information in logs. #61787
- Assigned a large but configurable row count to tables with unknown stats for statistical estimation. #61332
- Added balance statistic according to label location. #61905
- Added colocate group balance statistics to improve cluster monitoring. #61736
- Skipped the Publish waiting phase when the number of healthy replicas exceeds the default replica count. #61820
- Included the tablet information collection time in the tablet report. #61643
- Supports writing Starlet files with tags. #61605
- Supports viewing cluster balance statistics via SHOW PROC. #61578
- Bumped librdkafka to 2.11.0 to support Kafka 4.0 and removed deprecated configurations. #61698
- Added `prepared_timeout` configuration to Stream Load Transaction Interface. #61539
- Upgraded StarOS to v3.5‑rc3. #61685
Bug Fixes
The following issues have been fixed:
- Incorrect Dict version of random distribution tables. #61933
- Incorrect query context in context conditions. #61929
- Publish failures caused by synchronous Publish for shadow tablets during ALTER operations. #61887
- CVE‑2025‑55163 issue. #62041
- Memory leak in real-time data ingestion from Apache Kafka. #61698
- Incorrect count of rebuild files in the lake persistent index. #61859
- Statistics collection on generated expression columns causes cross-database query errors. #61829
- Query Cache misaligns in shared-nothing clusters, causing inconsistent results. #61783
- High memory usage in CatalogRecycleBin due to retaining deleted partition information. #61582
- SQL Server JDBC connections fail when the timeout exceeds 65,535 milliseconds. #61719
- Security Integration fails to encrypt passwords, exposing sensitive information. #60666
- `MIN()` and `MAX()` functions on Iceberg partition columns return NULL unexpectedly. #61858
- Other predicates of Join containing non‑push‑down subfields were incorrectly rewritten. #61868
- QueryContext cancellation can lead to a use‑after‑free situation. #61897
- CBO’s table pruning overlooks other predicates. #61881
- Partial Updates in `COLUMN_UPSERT_MODE` may overwrite auto-increment columns with zero. #61341
- JDBC TIME type conversion uses an incorrect timezone offset that leads to wrong time values. #61783
- `max_filter_ratio` was not being serialized in Routine Load jobs. #61755
- Precision loss in the `now(precision)` function in Stream Load. #61721
- Cancelling a query may result in a “query id not found” error. #61667
- LDAP authentication may miss PartialResultException, causing incomplete query results. #60667
- Paimon Timestamp timezone conversion issue when the query condition contains DATETIME. #60473
3.5.3
Release Date: August 11, 2025
Feature Enhancements
- Lake Compaction adds Segment write time statistics. #60891
- Disable inline mode for Data Cache writes to avoid performance degradation. #60530
- Iceberg metadata scan supports shared file I/O. #61012
- Support termination of all PENDING ANALYZE tasks. #61118
- Force reuse when there are too many CTE nodes to avoid excessive optimization time. #60983
- Added `BALANCE` type to cluster balance results. #61081
- Optimized materialized view rewrite for external tables. #61037
- Default value of system variable `enable_materialized_view_agg_pushdown_rewrite` is changed to `true`, enabling aggregation pushdown for materialized view queries by default. #60976
- Optimized partition statistics lock competition. #61041
Bug Fixes
The following issues have been fixed:
- Inconsistent Chunk column size after column pruning. #61271
- Synchronous execution of partition statistics loading may cause deadlocks. #61300
- Crash when `array_map` processes constant array columns. #61309
- Setting an auto-increment column to NULL results in the system mistakenly rejecting valid data within the same Chunk. #61255
- The actual number of JDBC connections may exceed the `jdbc_connection_pool_size` limit. #61038
- FQDN mode did not use IP addresses as cache map keys. #61203
- Array column cloning error during array comparison. #61036
- Blocking in the deploy serialization thread pool led to query performance degradation. #61150
- OK hbResponse not synchronized after heartbeat retry counter reset. #61249
- Incorrect result for the `hour_from_unixtime` function. #61206
- Conflicts between ALTER TABLE jobs and partition creation. #60890
- Cache does not take effect after upgrading from v3.3 to v3.4 or later. #60973
- Vector index metric `hit_count` is not set. #61102
- Stream Load transactions fail to find the coordinator node. #60154
- BE crashes when loading OOM partitions. #60778
- INSERT OVERWRITE failed on manually created partitions. #60750
- Partition creation failed when partition names matched case-insensitively but had different values. #60909
- The system does not support PostgreSQL UUID type. #61021
- Case sensitivity issue with column names when loading Parquet data via `FILES()`. #61059
3.3.17
Release Date: July 30, 2025
Bug Fixes
The following issues have been fixed:
- Upgraded HttpClient5 to 5.4.3. #61298
- Incorrect `cpu_core_used_permille` limit in resource groups. #61177
- Conflict between ALTER jobs and partition creation tasks. #61167
- NPE caused by missing `globalStateMgr` in `ConnectContext`. #60880
- Partition creation failed when partition names matched case-insensitively but had different values. #60909
- Lock competition caused by synchronous access to partition statistics. #61041
- ANALYZE tasks stuck in `pending` state after FE restart. #61113
- Issue with JIT (Just-In-Time) compilation in BE. #61060
- Leader address issue in Starmgr. #61016
- CVE vulnerabilities in Broker. #60908
- Actual number of JDBC connections exceeded `jdbc_connection_pool_size` limit. #61004
- CVE-2022-41404 vulnerability. #59689
- CVEs related to Parquet and HttpClient5. #58750
- Partition not removed from `_partition_map` when physical partition ID was empty. #60842
- Missing version check in shared-data clusters. #59422
- Transaction log missing when publishing logs in batches in shared-data clusters. #60949
- Concurrent publishing of the same transaction when Batch Publish is enabled in shared-data clusters. #57574
- Statistics overwrite issue caused by lack of semi-synchronous mode. #60897
- Inaccurate `maxInstantTime` used for filtering Hudi files when retrieving latest merged file slices. #60927
- TaskRun state incompatible with earlier versions. #60438
- CVE-2025-52999 vulnerability. #60795
- Vulnerability caused by `log4j-1.2.17-cloudera6` in Broker. #59579
- BE crash when loading OOM partitions. #60778
- Base Compaction tasks blocking other compaction tasks. #60711
- Inefficient handling of error string truncation. #60878
- Materialized view rewrite failed in multi-FE environments. #60841
- INSERT OVERWRITE failed on manually created partitions. #60750
- Issue caused by using random distribution in aggregate keys. #60702
- Crash caused by low cardinality rewrite in `multi_distinct_count`. #60664
- Issue with Pivot resolving fields. #60748
- Upgraded `hudi-common` to 1.0.2. #59501
- BE crash when CLONE and DROP TABLE run concurrently. #61359
3.4.6
Release Date: August 7, 2025
Improvements
- When exporting data to Parquet files using `INSERT INTO FILES`, you can now specify the Parquet version via the `parquet.version` property to improve compatibility with other tools when reading the exported files. #60843
Bug Fixes
The following issues have been fixed:
- Loading jobs failed due to overly coarse lock granularity in `TableMetricsManager`. #58911
- Case sensitivity issue in column names when loading Parquet data via `FILES()`. #61059
- Cache did not take effect after upgrading a shared-data cluster from v3.3 to v3.4 or later. #60973
- A division-by-zero error occurred when the partition ID was null, causing a BE crash. #60842
- Broker Load jobs failed during BE scaling. #60224
Behavior Changes
- The `keyword` column in the `information_schema.keywords` view has been renamed to `word` to align with the MySQL definition. #60863