This repository was archived by the owner on Feb 6, 2024. It is now read-only.

Refactor shard version related logic #263

@ZuLiangWang

Description

The current shard version verification has the following problems:

  • CeresMeta and CeresDB maintain their shard versions independently. Once the two diverge, the only way to recover is to restart the CeresDB node.
  • Shard version synchronization is ad hoc and prone to unexpected version inconsistencies.
  • The shard version check limits concurrent DDL: only one DDL can succeed on a shard at a time.
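
To make the third problem concrete, here is a minimal sketch of a strict version check. It is not the CeresDB implementation: `Shard`, `ApplyDDL`, and `ErrVersionMismatch` are hypothetical names, and the sketch assumes every successful DDL bumps the shard version by one.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrVersionMismatch is a hypothetical error for a failed version check.
var ErrVersionMismatch = errors.New("shard version mismatch")

// Shard is a hypothetical model of one node's view of a shard.
type Shard struct {
	mu      sync.Mutex
	version uint64
}

// ApplyDDL applies a DDL that was planned against the version `expected`.
// It succeeds only if `expected` equals the current version, and then
// bumps the version by one (an assumption of this sketch).
func (s *Shard) ApplyDDL(expected uint64) (uint64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if expected != s.version {
		return s.version, ErrVersionMismatch
	}
	s.version++
	return s.version, nil
}

func main() {
	shard := &Shard{version: 7}
	// Two DDLs were planned concurrently against the same version, 7.
	_, err1 := shard.ApplyDDL(7)
	_, err2 := shard.ApplyDDL(7)
	// The first succeeds; the second fails because the version is now 8.
	fmt.Println(err1, err2)
}
```

Under this kind of check, any DDL planned against a version that another DDL has already consumed is rejected, which is why only one DDL per shard can succeed at a time.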

Proposal
Redesign and implement shard version related logic.

Additional context
Some current thoughts:

  1. How should the shard version be synchronized between CeresMeta and CeresDB?
    1. Return the latest version in the response to table creation and deletion requests (I prefer this solution)
    2. Synchronize the latest version through the heartbeat
    3. CeresMeta pulls the latest version through an interface provided by CeresDB
  2. Who persists the shard version information?
    1. Keep it as is: CeresMeta persists the version, and CeresDB synchronizes it from CeresMeta when opening a shard (I prefer this solution)
    2. CeresDB persists the version and reports it to CeresMeta in the response when opening a shard.
  3. How should the version be handled when a shard is operated on concurrently?
    1. Leave it as is: only one operation succeeds and the others fail.
    2. Batch the operations. When table creations and deletions are combined into one batch, we must decide how to increment the version:
      1. The whole batch increments the version by 1
      2. Each operation in the batch increments the version by 1
  4. Are version inconsistencies allowed within a certain range?
    1. Not allowed; versions must be completely consistent (current method)
    2. Allowed within limits: record the operations on the shard, and skip the version check for operations that tolerate changes or that tolerate a bounded range of inconsistency.
  5. How to recover when versions are inconsistent?
    1. Manually restart the node (current method, not acceptable)
    2. Automatic error correction and recovery
      1. CeresMeta periodically inspects all shard versions and, for inconsistent ones, initiates a repair operation on CeresDB.
      2. CeresDB is responsible for error correction: on receiving a request with an inconsistent version, it initiates a repair operation on CeresMeta.
      3. What exactly does the repair involve, and what must be done before the versions are synchronized?
        1. Try to re-create or delete the table so that the failed procedure can be executed successfully.
        2. Ignore the failure and force version synchronization.
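
A few of the options above can be sketched together to compare them. This is only an illustration, not a proposed design. Every type and method name (`DDLResponse`, `Node`, `Meta`, `ApplyBatch`, `Observe`, `Reconcile`) is hypothetical. The sketch assumes option 1.1 (the response carries the latest version), option 2.1 (CeresMeta holds the persisted copy), both batch increment policies from 3.2, and a forced synchronization as in 5.2.3.2. Reconciling to the larger of the two versions is an assumption of the sketch; the issue leaves the direction open.

```go
package main

import "fmt"

// DDLResponse is a hypothetical response that carries the shard's latest
// version back to CeresMeta (option 1.1).
type DDLResponse struct {
	ShardID       uint32
	LatestVersion uint64
}

// Node is a hypothetical model of the CeresDB side.
type Node struct {
	versions map[uint32]uint64
}

// ApplyBatch applies `ops` operations to a shard as one batch.
// perOp=false: the whole batch bumps the version by 1 (option 3.2.1).
// perOp=true: each operation bumps the version by 1 (option 3.2.2).
func (n *Node) ApplyBatch(shardID uint32, ops int, perOp bool) DDLResponse {
	switch {
	case ops <= 0:
		// An empty batch leaves the version unchanged.
	case perOp:
		n.versions[shardID] += uint64(ops)
	default:
		n.versions[shardID]++
	}
	return DDLResponse{ShardID: shardID, LatestVersion: n.versions[shardID]}
}

// Meta is a hypothetical model of the CeresMeta side, which holds the
// persisted copy of each shard version (option 2.1).
type Meta struct {
	versions map[uint32]uint64
}

// Observe adopts the version reported in a response if it is newer.
func (m *Meta) Observe(resp DDLResponse) {
	if resp.LatestVersion > m.versions[resp.ShardID] {
		m.versions[resp.ShardID] = resp.LatestVersion
	}
}

// Reconcile inspects every shard known to Meta and forces both sides to
// the larger version (an assumption of this sketch). It returns the
// number of shards that were repaired.
func (m *Meta) Reconcile(n *Node) int {
	repaired := 0
	for id, metaVersion := range m.versions {
		nodeVersion := n.versions[id]
		if metaVersion == nodeVersion {
			continue
		}
		latest := metaVersion
		if nodeVersion > latest {
			latest = nodeVersion
		}
		m.versions[id] = latest
		n.versions[id] = latest
		repaired++
	}
	return repaired
}

func main() {
	node := &Node{versions: map[uint32]uint64{1: 5}}
	meta := &Meta{versions: map[uint32]uint64{1: 5}}

	// A batch of three operations, counted as one version bump.
	meta.Observe(node.ApplyBatch(1, 3, false))

	// A second batch counted per operation; its response is "lost",
	// so Meta falls behind until the next inspection.
	node.ApplyBatch(1, 3, true)
	fmt.Println("before repair:", meta.versions[1], node.versions[1])

	repaired := meta.Reconcile(node)
	fmt.Println("repaired:", repaired, "after:", meta.versions[1], node.versions[1])
}
```

The per-batch policy keeps version numbers dense, while the per-operation policy lets the version double as an operation count. A forced synchronization like `Reconcile` restores agreement but does not undo or replay the operation that caused the drift, which is the trade-off between options 5.2.3.1 and 5.2.3.2.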

Metadata

    Labels

    enhancement: New feature or request
