Skip to content

cli: --data-dir#247

Closed
turuslan wants to merge 1 commit intolambdaclass:mainfrom
turuslan:cli/data-dir
Closed

cli: --data-dir#247
turuslan wants to merge 1 commit intolambdaclass:mainfrom
turuslan:cli/data-dir

Conversation

@turuslan
Copy link

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 25, 2026

Greptile Summary

This PR implements issue #79 by replacing the hardcoded "./data" RocksDB path with a new --data-dir CLI argument, giving operators explicit control over where the node stores its data.

Changes:

  • Adds data_dir: PathBuf field to CliOptions (clap #[arg(long)], no default)
  • Passes &options.data_dir to RocksDBBackend::open instead of the hardcoded "./data" string

Potential concerns:

  • The argument has no default value, making it a required argument. Existing start scripts that do not already pass --data-dir will fail at startup with a clap missing-argument error. Consider providing a default (e.g., "./data") for backward compatibility, or documenting this as an intentional breaking change.
  • The resolved data_dir path is not logged at startup, unlike node_key. Adding a log line would aid debugging.

Confidence Score: 4/5

  • PR is safe to merge; the logic change is correct, though the absence of a default value is a breaking change worth deciding on before merging.
  • The change is minimal, focused, and functionally correct. The only concern is whether the required (no-default) --data-dir argument is intentional — existing deployments will break until they add the flag. Once that design decision is confirmed (or a default is added), the PR is clean.
  • bin/ethlambda/src/main.rs — verify default-value intent for --data-dir

Important Files Changed

Filename Overview
bin/ethlambda/src/main.rs Adds --data-dir CLI argument (PathBuf) and threads it through to RocksDBBackend::open, replacing the hardcoded "./data" path. The change is functionally correct; the argument lacks a default value which is a breaking change for existing deployments.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CLI startup] --> B[CliOptions::parse]
    B --> C{--data-dir provided?}
    C -- No --> D[clap error: required argument missing]
    C -- Yes --> E[options.data_dir: PathBuf]
    E --> F[RocksDBBackend::open and options.data_dir]
    F --> G[fetch_initial_state]
    G --> H[Node running with user-specified data dir]
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: bin/ethlambda/src/main.rs
Line: 57-58

Comment:
**No default value for `--data-dir` is a breaking change**

The previous behaviour hardcoded `"./data"` as the RocksDB path, so existing start scripts and deployments will now fail at startup with a missing-required-argument error from clap because `--data-dir` is required but has no default.

If backward compatibility is desired, a default of `"./data"` can be added; alternatively, a more conventional XDG-style default (e.g., `~/.ethlambda/data`) could be documented or provided:

```suggestion
    #[arg(long, default_value = "./data")]
    data_dir: PathBuf,
```

If the intent is to _force_ explicit configuration (which is also reasonable), consider documenting this as a breaking change in the PR / release notes.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: bin/ethlambda/src/main.rs
Line: 135

Comment:
**`data_dir` not logged at startup**

`node_key` is logged at startup (line 107), but `data_dir` is not. Logging the resolved data directory path is useful for debugging and confirming the correct directory is in use, especially since it is now user-supplied rather than hardcoded. Consider adding a similar `info!` log line near the existing `node_key` log.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "data dir cli option" | Re-trigger Greptile

Comment on lines 57 to +58
#[arg(long)]
data_dir: PathBuf,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No default value for --data-dir is a breaking change

The previous behaviour hardcoded "./data" as the RocksDB path, so existing start scripts and deployments will now fail at startup with a missing-required-argument error from clap because --data-dir is required but has no default.

If backward compatibility is desired, a default of "./data" can be added; alternatively, a more conventional XDG-style default (e.g., ~/.ethlambda/data) could be documented or provided:

Suggested change
#[arg(long)]
data_dir: PathBuf,
#[arg(long, default_value = "./data")]
data_dir: PathBuf,

If the intent is to force explicit configuration (which is also reasonable), consider documenting this as a breaking change in the PR / release notes.

Prompt To Fix With AI
This is a comment left during a code review.
Path: bin/ethlambda/src/main.rs
Line: 57-58

Comment:
**No default value for `--data-dir` is a breaking change**

The previous behaviour hardcoded `"./data"` as the RocksDB path, so existing start scripts and deployments will now fail at startup with a missing-required-argument error from clap because `--data-dir` is required but has no default.

If backward compatibility is desired, a default of `"./data"` can be added; alternatively, a more conventional XDG-style default (e.g., `~/.ethlambda/data`) could be documented or provided:

```suggestion
    #[arg(long, default_value = "./data")]
    data_dir: PathBuf,
```

If the intent is to _force_ explicit configuration (which is also reasonable), consider documenting this as a breaking change in the PR / release notes.

How can I resolve this? If you propose a fix, please make it concise.

read_validator_keys(&validators_path, &validator_keys_dir, &options.node_id);

let backend = Arc::new(RocksDBBackend::open("./data").expect("Failed to open RocksDB"));
let backend = Arc::new(RocksDBBackend::open(&options.data_dir).expect("Failed to open RocksDB"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 data_dir not logged at startup

node_key is logged at startup (line 107), but data_dir is not. Logging the resolved data directory path is useful for debugging and confirming the correct directory is in use, especially since it is now user-supplied rather than hardcoded. Consider adding a similar info! log line near the existing node_key log.

Prompt To Fix With AI
This is a comment left during a code review.
Path: bin/ethlambda/src/main.rs
Line: 135

Comment:
**`data_dir` not logged at startup**

`node_key` is logged at startup (line 107), but `data_dir` is not. Logging the resolved data directory path is useful for debugging and confirming the correct directory is in use, especially since it is now user-supplied rather than hardcoded. Consider adding a similar `info!` log line near the existing `node_key` log.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@pablodeymo
Copy link
Collaborator

Thanks @turuslan
Closed in favor of #248

@pablodeymo pablodeymo closed this Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants