Compute deterministic CRC for segment data to avoid Lucene segment CRC mismatch issue #17262

@anuragrai16

Problem:

We have seen issues with a recently introduced configuration that skips the segment CRC check when a segment is loaded during server start or pre-download. The issues with this flag and its edge cases are covered in depth here.

While that fix addresses the immediate issue, we need a reliable long-term way to generate a deterministic CRC from the segment, so we don't need to selectively enable table-level flags to skip CRC checks.

As mentioned in the last ticket, we started with the following problem statement:
As Pinot segment today include many variety of indexes and data, it becomes increasingly hard or infeasible to use CRC to verify segment consistency across replicas. Examples include Lucene text index which has by design non-deterministic behavior during indexing. There should be a revamp on how to use and create CRC when loading segments. This PR just creates option for users to opt out. It has been tested in our production env for a while for Pinot table with text index without issues.

The goal is to remove the non-determinism behind CRC mismatches, including for index types added to Pinot in the future, without needing to skip the CRC check as a workaround.

Proposed Solution:

  • We will compute a data-only CRC and persist it to ZK for every new segment. The data-only CRC is computed from the forward index files, dictionary files, inverted index files, and the metadata.properties file. While forward indexes are enough to capture column-level data deterministically, Pinot also supports disabling the forward index for some columns. To cover that case, we include both the dictionary files and the inverted indexes in the data-only CRC calculation.

  • This data-only CRC will be persisted in ZK in addition to the existing full segment CRC, so we do not need to undertake a migration from the old CRC (computed over all segment files) to the new CRC (computed over data-only files).

  • During the segment CRC checks performed at various places (such as pre-download and deep-store download), we will compare segments using the data-crc when it is available. If the data-crc is not available, we will fall back to the full CRC.

  • For v3 segments, where all segment index files are combined into one, the data-only CRC needs to be computed before the segment is converted from the v1/v2 format to v3.
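
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual Pinot implementation: the class name `DataOnlyCrcSketch`, the method names, and the file suffixes used to recognise forward index, dictionary, and inverted index files are all assumptions made for the example. The sketch also sorts files by name and mixes each file name into the checksum so that the result does not depend on directory listing order; that is a design choice of this sketch rather than something the proposal specifies.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.OptionalLong;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.CRC32;

final class DataOnlyCrcSketch {
  // Assumed suffixes for forward index, dictionary, and inverted index files (hypothetical).
  private static final List<String> DATA_FILE_SUFFIXES = List.of(".fwd", ".dict", ".inv");
  private static final String METADATA_FILE = "metadata.properties";

  // True for files that contribute to the data-only CRC; text index files and others are skipped.
  static boolean isDataFile(Path file) {
    String name = file.getFileName().toString();
    return name.equals(METADATA_FILE) || DATA_FILE_SUFFIXES.stream().anyMatch(name::endsWith);
  }

  // Computes a CRC32 over the data files of an unconverted (v1/v2 style) segment directory.
  static long computeDataCrc(Path segmentDir) throws IOException {
    List<Path> dataFiles;
    try (Stream<Path> files = Files.list(segmentDir)) {
      dataFiles = files
          .filter(Files::isRegularFile)
          .filter(DataOnlyCrcSketch::isDataFile)
          // Sort by name so every replica feeds the files in the same order.
          .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
          .collect(Collectors.toList());
    }
    CRC32 crc = new CRC32();
    byte[] buffer = new byte[8192];
    for (Path file : dataFiles) {
      crc.update(file.getFileName().toString().getBytes(StandardCharsets.UTF_8));
      try (InputStream in = Files.newInputStream(file)) {
        int read;
        while ((read = in.read(buffer)) != -1) {
          crc.update(buffer, 0, read);
        }
      }
    }
    return crc.getValue();
  }

  // Compares on the data CRC when both sides have one, otherwise falls back to the full CRC.
  static boolean crcMatches(long localFullCrc, OptionalLong localDataCrc,
                            long zkFullCrc, OptionalLong zkDataCrc) {
    if (localDataCrc.isPresent() && zkDataCrc.isPresent()) {
      return localDataCrc.getAsLong() == zkDataCrc.getAsLong();
    }
    return localFullCrc == zkFullCrc;
  }
}
```

With this shape, two replicas whose segments differ only in a non-deterministic index file (for example a Lucene text index) produce the same data CRC, while segments created before the data CRC existed still go through the full CRC comparison.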

Edge Cases:

  • Even with a data-only CRC, the data itself can differ when ingestion transformation functions such as NOW() are used: each replica evaluates the function locally, so the replicas end up with different data. Re-downloading the segment on the replicas is valid in this case, since the underlying data should be the same for all replicas of the segment.
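
To make the edge case concrete, here is a small sketch of why a locally evaluated timestamp changes the data CRC. The class `NowTransformCrcSketch` and its row encoding are hypothetical and only stand in for a column populated by a transform like NOW(); they are not Pinot code.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class NowTransformCrcSketch {
  // CRC over a single row whose second column is an ingestion-time value
  // (a stand-in for a column populated by a locally evaluated NOW()).
  static long crcOfRow(String userId, long ingestionTimeMs) {
    CRC32 crc = new CRC32();
    crc.update((userId + "," + ingestionTimeMs).getBytes(StandardCharsets.UTF_8));
    return crc.getValue();
  }
}
```

If one replica evaluates the transform at `1000` and another at `1001`, `crcOfRow("u1", 1000L)` and `crcOfRow("u1", 1001L)` differ, so the data CRC check flags the mismatch and a re-download brings the replicas back to the same data.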
