Skip to content

Add --data-dir CLI parameter for configuring storage path#248

Merged
pablodeymo merged 4 commits intomainfrom
add-data-dir-cli
Mar 25, 2026
Merged

Add --data-dir CLI parameter for configuring storage path#248
pablodeymo merged 4 commits intomainfrom
add-data-dir-cli

Conversation

@pablodeymo
Copy link
Collaborator

@pablodeymo pablodeymo commented Mar 25, 2026

Motivation

The RocksDB storage path was hardcoded to ./data, making it impossible to configure where the node stores its data on disk. Other lean clients (e.g., Zeam) already expose a --data-dir parameter for this purpose.

This is especially important for:

  • Server deployments where operators want to store data on a specific disk/partition
  • Running multiple nodes on the same machine with separate data directories
  • Docker setups where the data directory needs to match volume mounts

Description

  • Add --data-dir CLI argument to CliOptions (type: PathBuf, default: ./data)
  • Replace the hardcoded "./data" string in RocksDBBackend::open() with the CLI-provided value
  • Fully backward compatible — existing deployments that don't pass --data-dir get the same ./data default

Usage

# Default (same as before)
ethlambda --custom-network-config-dir /config --node-key /config/node.key --node-id ethlambda_0 --gossipsub-port 9001

# Custom data directory
ethlambda --data-dir /mnt/ssd/ethlambda-data --custom-network-config-dir /config --node-key /config/node.key --node-id ethlambda_0 --gossipsub-port 9001

CLI help output

Flag Default Description
--data-dir ./data Directory for RocksDB storage

Files changed

  • bin/ethlambda/src/main.rs — new CLI arg + use it when opening RocksDB

How to Test

  1. Run without --data-dir → data is stored in ./data (no behavior change)
  2. Run with --data-dir /tmp/test-data → data is stored in /tmp/test-data
  3. Verify --help shows the new flag

Related Issues

Closes #79

Defaults to ./data for backward compatibility. Matches the same
parameter available in other lean clients like Zeam.
@github-actions
Copy link

🤖 Kimi Code Review

The changes look correct and follow Rust best practices. This is a straightforward configuration improvement with no security or correctness issues.

Summary:

  • File: bin/ethlambda/src/main.rs
  • Change: Makes the RocksDB data directory configurable via CLI (--data-dir) instead of hardcoded "./data"

Review:

  1. Line 72-74: Good use of PathBuf for cross-platform path handling. The doc comment will automatically serve as CLI help text via clap's derive macro.

  2. Line 136-138: Correct passing of &options.data_dir to RocksDBBackend::open. The reference is valid for the duration of the call, and PathBuf implements AsRef<Path> which is likely the expected argument type.

  3. Backward compatibility: The default value "./data" matches the previous hardcoded path, ensuring existing deployments continue to work without configuration changes.

Minor suggestions (non-blocking):

  • Consider validating directory permissions or existence early, though RocksDBBackend::open likely handles this internally.
  • If this path is used elsewhere for auxiliary files (logs, temp data), ensure all components respect this same --data-dir argument to avoid data fragmentation.

Security note: As with all database path configurations, operators should ensure proper filesystem permissions on the target directory to prevent unauthorized access to validator state or chain data.

LGTM.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link

🤖 Claude Code Review

The change is minimal and clean. Here's the review:


PR #248: Add --data-dir CLI parameter

Overall: This is a small, well-scoped change. The implementation is correct and backward compatible. A few minor observations:

Correctness

The RocksDBBackend::open signature accepts impl AsRef<Path>, so passing &options.data_dir (a &PathBuf) works correctly — PathBuf implements AsRef<Path>. No issue here.

The PathBuf import is already present in main.rs (line 16: path::{Path, PathBuf}), so no new import is needed. Good.

Minor Observations

  • No directory creation: RocksDB with create_if_missing(true) will create the database file but may fail if intermediate parent directories don't exist. For example, --data-dir /mnt/ssd/ethlambda/data will fail if /mnt/ssd/ethlambda/ doesn't exist yet. A std::fs::create_dir_all(&options.data_dir) before opening would make the UX more robust, though this is a minor operational concern rather than a correctness bug.

  • Startup log: The chosen data directory isn't logged at startup. Given that misconfigured paths are a common operational issue, an info!(data_dir = %options.data_dir.display(), "Opening RocksDB") line would help operators quickly verify the correct path is in use. Other startup configuration (e.g., ports, node ID) is presumably already logged.

  • Default value string: The default_value = "./data" is a relative path resolved against the working directory at runtime. This matches the prior hardcoded behavior and is explicitly documented in the PR — no issue, just worth noting in operator docs that the path is relative to CWD when not absolute.

What's Good

  • Clean, minimal diff — exactly the right scope for this change.
  • Backward compatible by design (same default).
  • PathBuf is the right type for filesystem paths in CLI args (vs. String), and clap handles the conversion correctly.
  • Formatting of the multi-line Arc::new(...) call follows the project's style.

Automated review by Claude (Anthropic) · sonnet · custom prompt

@github-actions
Copy link

🤖 Codex Code Review

No findings.

This PR is a straightforward CLI plumbing change: it adds --data-dir with the existing ./data default at bin/ethlambda/src/main.rs and threads that path into RocksDBBackend::open at bin/ethlambda/src/main.rs. I don’t see any correctness, consensus, security, or performance regression from the patch as written.

I could not run cargo check in this environment because the sandbox blocks rustup from writing under /home/runner/.rustup/tmp, so there is still an unverified compile step.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 25, 2026

Greptile Summary

This PR adds a --data-dir CLI parameter to ethlambda, replacing the hardcoded "./data" string passed to RocksDBBackend::open(). The default value preserves backward compatibility. Both concerns raised in the previous review round — pre-creating parent directories with create_dir_all and logging the resolved path at startup — are fully addressed in this revision.

Key changes:

  • New data_dir: PathBuf field on CliOptions with default_value = "./data", consistent with the rest of the struct
  • std::fs::create_dir_all called before RocksDBBackend::open to handle missing intermediate directories gracefully
  • info! log emitted with the resolved data_dir value, matching the startup logging patterns already present in main
  • RocksDBBackend::open correctly accepts impl AsRef<Path>, so passing &options.data_dir compiles cleanly with no adapter needed

Confidence Score: 5/5

  • This PR is safe to merge — it is a small, backward-compatible feature addition with no logic regressions.
  • The change is minimal and self-contained to a single file. The default value preserves the existing ./data behaviour exactly. Both previously raised concerns (parent directory creation and startup logging) are addressed. The RocksDBBackend::open signature accepts impl AsRef<Path>, so the type change from &str to &PathBuf is fully compatible. No new failure modes are introduced beyond what was already present.
  • No files require special attention.

Important Files Changed

Filename Overview
bin/ethlambda/src/main.rs Adds --data-dir CLI argument (defaulting to ./data) to CliOptions, calls create_dir_all before opening RocksDB to handle missing parent directories, and logs the resolved path at startup. All previous review concerns are fully addressed.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A([ethlambda start]) --> B[Parse CLI options\n--data-dir or default ./data]
    B --> C{data_dir provided?}
    C -- yes --> D[Use provided path]
    C -- no --> E[Use default ./data]
    D --> F[std::fs::create_dir_all\ndata_dir]
    E --> F
    F --> G{Directory created\nor already exists?}
    G -- failure --> H([panic: Failed to create\ndata directory])
    G -- success --> I[RocksDBBackend::open\ndata_dir]
    I --> J{Open succeeded?}
    J -- failure --> K([panic: Failed to open RocksDB])
    J -- success --> L[info! Initialized DB\ndata_dir = ...]
    L --> M[Continue node startup...]
Loading

Reviews (2): Last reviewed commit: "Rename storage log message to "Initializ..." | Re-trigger Greptile

@pablodeymo pablodeymo mentioned this pull request Mar 25, 2026
Copy link
Collaborator

@MegaRedHand MegaRedHand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. greptile comments are on-point

Addresses review feedback: create_dir_all prevents cryptic RocksDB
errors when parent directories don't exist, and logging the resolved
path helps diagnose misconfiguration in server deployments.
@pablodeymo
Copy link
Collaborator Author

@greptile-apps review it again

@pablodeymo pablodeymo merged commit cd52a08 into main Mar 25, 2026
3 checks passed
@pablodeymo pablodeymo deleted the add-data-dir-cli branch March 25, 2026 19:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add --data-dir flag to specify database directory

2 participants