feat: use per-devnet GPG-encrypted SSH keys in CI workflows #739

Open
vivekgsharma wants to merge 5 commits into v1.0-dev from vivek/devent-gpg
Conversation

@vivekgsharma
Collaborator

@vivekgsharma vivekgsharma commented Apr 2, 2026

Summary

This PR introduces per-devnet SSH access in CI, replacing the previous shared server key approach with devnet-specific credentials.

Key Changes

Unique SSH keypair per devnet

  • Create Devnet now generates a dedicated SSH keypair for each devnet.
  • The private key is encrypted using GPG and stored alongside the devnet config files in:

Secure key reuse across workflows

  • Subsequent workflows (Create Resume, Manual Deploy, Platform Deploy, Destroy) decrypt the stored key in CI using the configured GPG key.
  • This ensures consistent, devnet-specific SSH access instead of relying on a shared key.

Manual access for developers

  • Developers can download devnet-<name>.ssh.asc from dash-network-configs.
  • Using their GPG private key, they can decrypt it locally to access and inspect devnet servers.
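
The manual-access flow above could look roughly like the following sketch. It assumes `gpg` is installed and that the developer's private key is one of the recipients of the encrypted file. The helper name `decrypt_devnet_key` and the output path are hypothetical and not part of this PR.

```shell
# Hypothetical helper (not from the PR): decrypt a downloaded devnet SSH key.
# $1 = path to devnet-<name>.ssh.asc, $2 = where to write the private key.
decrypt_devnet_key() {
  local enc_file="$1" out_file="$2"

  if [ ! -f "$enc_file" ]; then
    echo "Error: encrypted key not found: $enc_file" >&2
    return 1
  fi

  # umask 077 in a subshell so the decrypted key is never group/world-readable.
  ( umask 077 && gpg --yes --output "$out_file" --decrypt "$enc_file" ) || return 1
  chmod 600 "$out_file"
}
```

A developer would then pass the decrypted file to `ssh -i`; the remote user and host depend on the devnet inventory and are not specified here.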

Fixes

Dashmate (fresh-node) issues in Ansible

  • Ensures the requested Dashmate version is used for config rendering.
  • Always adds the dashmate user to the docker group.

Impact

  • Improves security by isolating SSH access per devnet
  • Enables reproducible and consistent CI workflows
  • Provides controlled manual access for debugging and inspection

@coderabbitai

coderabbitai bot commented Apr 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Workflows now generate, encrypt, store, and decrypt devnet RSA SSH keys as networks/$NETWORK.ssh.asc using GPG; resume validation requires .yml, .tfvars, .inventory, and .ssh.asc. Runners pinned to Ubuntu 22.04, gnupg added, and Ansible installs pinned to ansible-core==2.16.3.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Create / Destroy devnet workflows (.github/workflows/create-devnet.yml, .github/workflows/destroy-devnet.yml) | Add GPG-based RSA key generation/encryption into networks/$NETWORK.ssh.asc; decrypt on resume to ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub; include gnupg; config validation now requires .yml, .tfvars, .inventory, and .ssh.asc; persistence/push/remove updated to include *.ssh.asc. |
| Deploy / Platform workflows (.github/workflows/deploy.yml, .github/workflows/platform-deploy.yml) | Pin runner to ubuntu-22.04; add npm ci and system deps (python3-pip, python3-netaddr, sshpass, jq, gnupg); install ansible-core==2.16.3 + jmespath; split GitHub deploy key and server key setup; devnet flows decrypt networks/$NETWORK.ssh.asc into ~/.ssh/id_rsa; regenerate pubkey and rewrite ~/.ssh/config; create networks/.env with PRIVATE_KEY_PATH/PUBLIC_KEY_PATH; verification passes --private-key="$HOME/.ssh/id_rsa". |
| Config validation & push logic (.github/workflows/create-devnet.yml, .github/workflows/destroy-devnet.yml, .github/workflows/deploy.yml) | Validation updated to require ssh.asc for devnet flows; push/commit steps refuse to publish partial artifact sets and exit unless all four artifacts are present; cloning/persistence includes *.ssh.asc. |
| Ansible role and playbook changes (ansible/roles/dashmate/tasks/main.yml, ansible/deploy.yml) | dashmate_config_version selection now prefers the provided dashmate_version when defined; one `when` condition removed from the docker-group addition task. ansible/deploy.yml had large sections commented out; core masternode setup/playbook runs are reduced to a smaller active subset. |
| Misc: file refs & cleanup (.github/workflows/platform-deploy.yml, .github/workflows/deploy.yml) | Use networks/$NETWORK.* refs, escape VERSION into SAFE_VERSION for sed, conditional image/tag substitutions, small regex/listing cleanup, removed extraneous debug listings. |
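
The summary mentions escaping VERSION into SAFE_VERSION before it reaches sed. The PR's actual escaping is not shown in this conversation, so the following is only a sketch of one way to do it. It assumes the value lands on the replacement side of an `s/.../.../` command that uses `/` as the delimiter. The function name `escape_sed_replacement` is hypothetical.

```shell
# Hypothetical sketch: make a value safe for the replacement side of sed's
# s/// command (assuming "/" is the delimiter). Backslash, "/" and "&" are the
# characters with special meaning there. Backslashes are escaped first so the
# backslashes added by the later expressions are not doubled.
escape_sed_replacement() {
  printf '%s' "$1" | sed -e 's,\\,\\\\,g' -e 's,/,\\/,g' -e 's,&,\\\&,g'
}
```

With `SAFE_VERSION="$(escape_sed_replacement "$VERSION")"`, a substitution such as `sed "s/PLACEHOLDER/$SAFE_VERSION/"` inserts the version text literally even when it contains `/` or `&`. Values containing newlines are not handled by this sketch.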

Sequence Diagram(s)

sequenceDiagram
    participant GH as GitHub Actions runner
    participant Repo as Configs repo (networks/)
    participant GPG as GnuPG
    participant SSH as SSH (~/.ssh)
    participant Ansible as Ansible playbooks

    GH->>Repo: clone/list `networks/$NETWORK.*`
    GH->>GPG: import GPG private key from secrets (if devnet)
    alt devnet and `networks/$NETWORK.ssh.asc` present
        GPG->>Repo: decrypt `networks/$NETWORK.ssh.asc`
        Repo-->>SSH: write `~/.ssh/id_rsa` and `~/.ssh/id_rsa.pub`
    else non-devnet or direct key
        GH->>SSH: write `DEPLOY_SERVER_KEY` to `~/.ssh/id_rsa`
    end
    GH->>SSH: write GitHub deploy key `id_ed25519` and update `~/.ssh/config`
    GH->>Ansible: invoke playbooks with `--private-key="$HOME/.ssh/id_rsa"`
    Ansible-->>Repo: read `networks/.env` (PRIVATE_KEY_PATH, PUBLIC_KEY_PATH)
    GH->>Repo: push/update `networks/*.ssh.asc` and other artifacts (fail if partial)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Poem

🐇 GPG seeds tucked in .asc so neat,
Keys hop home to ~/.ssh with tiny feet,
Ubuntu twenty-two hums a steady beat,
Ansible-core hops in, deployment fleet,
CI carrots glint — devnet's complete! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: introducing per-devnet GPG-encrypted SSH keys in CI workflows across multiple workflow files and related Ansible tasks. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (5)
.github/workflows/create-devnet.yml (2)

231-240: GPG recipient extraction is clever but fragile.

The awk parsing relies on specific GPG output format with pub followed by fpr records. This works but is tightly coupled to GPG's --with-colons output format.

Consider adding a comment explaining the expected GPG output structure for future maintainers.
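
To make the coupling concrete, here is a small sketch of the quoted awk rule running against made-up `--with-colons` output. The `-F:` field separator is an assumption implied by "10th colon field". The sample key IDs and fingerprints are invented, and the function names are hypothetical.

```shell
# Print the primary-key fingerprint of each key in `gpg --with-colons` output.
# Assumed record order per key:
#   pub:...              primary key record
#   fpr:::::::::<FPR>:   fingerprint of the preceding record (field 10)
#   sub:... / fpr:...    subkey and its fingerprint (must be skipped)
primary_fingerprints() {
  awk -F: '$1 == "pub" {want=1; next} want && $1 == "fpr" {print $10; want=0}'
}

# Invented sample listing with two keys; only the record layout matters here.
sample_listing() {
  cat <<'EOF'
pub:u:255:22:0123456789ABCDEF:1700000000:::u:::scESC:::::ed25519:::0:
fpr:::::::::0123456789ABCDEF0123456789ABCDEF01234567:
uid:u::::1700000000::::Example One <one@example.invalid>::::::::::0:
sub:u:255:18:FEDCBA9876543210:1700000000::::::e:::::cv25519::
fpr:::::::::FEDCBA9876543210FEDCBA9876543210FEDCBA98:
pub:u:255:22:89ABCDEF01234567:1700000000:::u:::scESC:::::ed25519:::0:
fpr:::::::::89ABCDEF0123456789ABCDEF0123456789ABCDEF:
EOF
}
```

Running `sample_listing | primary_fingerprints` prints the two primary fingerprints and skips the subkey's `fpr` record, because `want` is only set by a `pub` line and is cleared after the first `fpr` that follows it.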

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/create-devnet.yml around lines 231 - 240, Add a short
clarifying comment above the mapfile/gpg/awk block explaining the assumed
--with-colons output structure (that a "pub" record is followed by an "fpr"
record whose 10th colon field is the fingerprint) and why the awk rule ('$1 ==
"pub" {want=1; next} want && $1 == "fpr" {print $10; want=0}') extracts
recipients; reference the variables/commands involved (mapfile, recipients
array, gpg --batch --with-colons, and the awk expression) so future maintainers
understand the fragile coupling to GPG's output format and can update it safely
if the format changes.

94-101: Redundant validation: the final regex check is always satisfied if the earlier checks pass.

If $NAME matches ^[a-z0-9][a-z0-9-]*$ (line 90), then devnet-$NAME will always match ^devnet-[a-z0-9][a-z0-9-]*$. The check at lines 98-101 is unreachable as a failure path.

Consider removing redundant check
          if [[ "$NAME" == "testnet" || "$NAME" == "mainnet" || "$NAME" == mainnet-* ]]; then
            echo "Error: reserved network names are not allowed in this workflow"
            exit 1
          fi
-          if [[ ! "devnet-$NAME" =~ ^devnet-[a-z0-9][a-z0-9-]*$ ]]; then
-            echo "Error: resulting network name is not a valid devnet name"
-            exit 1
-          fi
          echo "Will create: devnet-$NAME"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/create-devnet.yml around lines 94 - 101, The final regex
check ([[ ! "devnet-$NAME" =~ ^devnet-[a-z0-9][a-z0-9-]*$ ]]) is redundant
because the prior validation of NAME (^[a-z0-9][a-z0-9-]*$) plus the
reserved-name guard already guarantees devnet-$NAME will match; remove the
unreachable block (the entire if that echoes "Error: resulting network name is
not a valid devnet name" and exits) to avoid duplicated validation, leaving the
earlier NAME regex and reserved-name checks intact.
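
The redundancy can be spot-checked with a short sketch. The second pattern is the first one with the literal prefix `devnet-` added, so any NAME accepted by the first check yields a `devnet-$NAME` accepted by the second. The function names are hypothetical, and `grep -E` stands in for the workflow's `[[ =~ ]]` so the sketch also runs under plain `sh`.

```shell
# The two patterns quoted in the review comment, wrapped as predicates.
# LC_ALL=C keeps [a-z] from matching uppercase letters in some locales.
valid_name()   { printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9-]*$'; }
valid_devnet() { printf '%s\n' "$1" | LC_ALL=C grep -Eq '^devnet-[a-z0-9][a-z0-9-]*$'; }

# Succeeds when "valid_name NAME implies valid_devnet devnet-NAME" holds for $1.
implication_holds() {
  if valid_name "$1"; then
    valid_devnet "devnet-$1"
  else
    return 0  # the first check already rejected NAME, so nothing to verify
  fi
}
```

A spot check is not a proof, but it shows that the second check never rejects a name the first one accepted.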
.github/workflows/destroy-devnet.yml (1)

54-65: Same redundant validation pattern as in create-devnet.yml.

The check at lines 62-65 is unreachable since lines 54-57 already ensure $NAME matches ^[a-z0-9][a-z0-9-]*$.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/destroy-devnet.yml around lines 54 - 65, The final regex
check that validates "devnet-$NAME" is redundant and unreachable because the
earlier validation on NAME already enforces ^[a-z0-9][a-z0-9-]*$; remove the
unreachable if block that tests [[ ! "devnet-$NAME" =~
^devnet-[a-z0-9][a-z0-9-]*$ ]] (the block that echoes "Error: resulting network
name is not a valid devnet name" and exits) to avoid duplicate validation; keep
the initial NAME validation and the reserved-name check (the if checks
referencing NAME and the reserved string comparisons) intact.
.github/workflows/platform-deploy.yml (1)

58-72: SSH config created here is overwritten later, same pattern as deploy.yml.

The config at lines 66-70 is replaced in the "Prepare server SSH key" step. Consider consolidating.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/platform-deploy.yml around lines 58 - 72, The SSH config
created in the "Set up GitHub SSH key" step is later overwritten by the "Prepare
server SSH key" step; consolidate by creating a single SSH config that contains
both Host entries (github.com and the server host) or append the server block
instead of clobbering the file. Update the "Set up GitHub SSH key" and "Prepare
server SSH key" steps so they either write distinct Host blocks to ~/.ssh/config
using >> (append) or a shared helper step that ensures both IdentityFile entries
and StrictHostKeyChecking settings coexist, and ensure correct permissions with
chmod 600 on the final config.
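
The append-based option could be sketched as follows. The helper name `append_host_block` is hypothetical, and the server host pattern used in the usage note is an assumption, since the workflow's actual server block is not shown in this conversation.

```shell
# Hypothetical sketch: each step appends its own Host block with >> instead of
# rewriting the file with >, so neither step clobbers the other's entry.
# $1 = config file, $2 = Host pattern, $3 = IdentityFile path.
append_host_block() {
  local config="$1" host="$2" identity="$3"
  {
    printf 'Host %s\n' "$host"
    printf '  IdentityFile %s\n' "$identity"
    printf '\n'
  } >> "$config"
  chmod 600 "$config"
}
```

For example, the first step might call `append_host_block ~/.ssh/config github.com '~/.ssh/id_ed25519'` and the later step `append_host_block ~/.ssh/config '*' '~/.ssh/id_rsa'`. Order matters with a wildcard block: ssh uses the first value it obtains for each option, so the specific `github.com` block should come before any `Host *` block. Other options such as StrictHostKeyChecking would be added to each block in the same way.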
.github/workflows/deploy.yml (1)

55-69: SSH config created here is immediately overwritten in the next SSH step.

The ~/.ssh/config written here (lines 63-67) is replaced in the "Prepare server SSH key" step (lines 117-127). This duplication is harmless but redundant.

Consider deferring config creation to the later step
      - name: Set up GitHub SSH key
        env:
          EVO_APP_DEPLOY_KEY: ${{ secrets.EVO_APP_DEPLOY_KEY }}
        run: |
          mkdir -p ~/.ssh
          printf '%s\n' "$EVO_APP_DEPLOY_KEY" > ~/.ssh/id_ed25519
          chmod 600 ~/.ssh/id_ed25519
-
-          cat > ~/.ssh/config << 'EOL'
-          Host github.com
-            IdentityFile ~/.ssh/id_ed25519
-            StrictHostKeyChecking no
-          EOL
-
-          chmod 600 ~/.ssh/config
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/deploy.yml around lines 55 - 69, The SSH config written in
the "Set up GitHub SSH key" step is redundant because the later "Prepare server
SSH key" step overwrites ~/.ssh/config; remove or defer creation of
~/.ssh/config from the "Set up GitHub SSH key" step (keep creating
~/.ssh/id_ed25519 and setting permissions) and ensure the actual SSH config is
only written in the "Prepare server SSH key" step, or consolidate the two steps
so only one writes ~/.ssh/config to avoid duplication; refer to the step names
"Set up GitHub SSH key" and "Prepare server SSH key" and the file ~/.ssh/config
when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5e404721-f3c2-4c17-8d2f-19751a94944b

📥 Commits

Reviewing files that changed from the base of the PR and between 555e290 and c6d503d.

📒 Files selected for processing (4)
  • .github/workflows/create-devnet.yml
  • .github/workflows/deploy.yml
  • .github/workflows/destroy-devnet.yml
  • .github/workflows/platform-deploy.yml

Comment on lines 373 to +374
- name: Push configs to dash-network-configs
if: always()

⚠️ Potential issue | 🟡 Minor

Potential issue: if: always() may push configs after failed deployments.

This step runs even when the deploy step fails. If Terraform succeeded but Ansible failed, you might push config files for a partially deployed devnet.

Consider changing to if: success() or adding logic to skip the push if the deploy step failed.

Suggested fix
      - name: Push configs to dash-network-configs
-        if: always()
+        if: success()
        env:
          EVO_APP_DEPLOY_WRITE_KEY: ${{ secrets.EVO_APP_DEPLOY_WRITE_KEY }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/create-devnet.yml around lines 373 - 374, The workflow
step named "Push configs to dash-network-configs" uses if: always(), causing it
to run even after failed deployments; change its condition to only run on
successful deploys (e.g., replace if: always() with if: success() or use the
deploy job outcome like if: needs.deploy.outcome == 'success') so configs aren't
pushed after a partial/failed deploy; update the step condition in the "Push
configs to dash-network-configs" step accordingly.


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
.github/workflows/create-devnet.yml (1)

372-416: ⚠️ Potential issue | 🟠 Major

Make config persistence transactional.

deploy.yml and destroy-devnet.yml now treat yml/tfvars/inventory/ssh.asc as the control-plane state. Because this step still runs with if: always() and downgrades git push failures to warnings, a failed deploy can publish a false resume set, and a successful create can finish without persisting the only recoverable SSH key. Please gate persistence on a known-good infra state and fail the run when a fresh create cannot publish all four artifacts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/create-devnet.yml around lines 372 - 416, The "Push
configs to dash-network-configs" step currently runs unconditionally (if:
always()) and downgrades missing artifacts or push failures to warnings (COPIED
!= 4 exits 0 and push failures echo warnings), which allows inconsistent
control-plane state to be published; change this step to only run on success
(replace if: always() with if: success()), require all four artifacts by making
the COPIED != 4 check fail the job (exit 1) instead of warning, and make git
push failures return non-zero (remove the warning+exit 0 branches and instead
exit 1 on push failure). Also ensure the write-key branch (checking
EVO_APP_DEPLOY_WRITE_KEY and ~/.ssh/id_ed25519_write) treats missing write key
as a hard failure when COPIED == 4 so a fresh create cannot finish without
persisting the SSH key.
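
A minimal sketch of the all-or-nothing behaviour requested here is shown below. It assumes the four artifacts are named `$NETWORK.yml`, `$NETWORK.tfvars`, `$NETWORK.inventory`, and `$NETWORK.ssh.asc`, as listed in the walkthrough. The function name and its arguments are hypothetical, and the workflow's real COPIED logic may differ.

```shell
# Hypothetical sketch: stage the four devnet artifacts for publishing, or fail.
# $1 = source dir, $2 = destination dir (e.g. a configs checkout), $3 = network.
stage_devnet_artifacts() {
  local src="$1" dest="$2" network="$3" missing="" ext

  # Check everything first so a partial set is never copied.
  for ext in yml tfvars inventory ssh.asc; do
    [ -f "$src/$network.$ext" ] || missing="$missing $network.$ext"
  done

  if [ -n "$missing" ]; then
    echo "Error: refusing to publish partial artifact set; missing:$missing" >&2
    return 1  # hard failure instead of a warning followed by exit 0
  fi

  for ext in yml tfvars inventory ssh.asc; do
    cp "$src/$network.$ext" "$dest/" || return 1
  done
}
```

In a workflow step, a non-zero return here (and from the subsequent `git push`) would fail the job, so a fresh create cannot finish without the encrypted SSH key being persisted.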

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1c2d104b-6dc0-47ea-a58f-29e943717c6a

📥 Commits

Reviewing files that changed from the base of the PR and between c6d503d and 1cb683a.

📒 Files selected for processing (4)
  • .github/workflows/create-devnet.yml
  • .github/workflows/deploy.yml
  • .github/workflows/destroy-devnet.yml
  • .github/workflows/platform-deploy.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/platform-deploy.yml


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
ansible/deploy.yml (1)

404-419: Consider adding a filter to preserve recently-used images during cleanup.

The Docker prune removes all unused images (dangling: false), which could delete images that may be needed for quick rollbacks. If this is intentional for CI environments where disk space is constrained, this is fine. Otherwise, consider adding an images_filters with a time-based threshold.

💡 Optional: Add time-based filter to preserve recent images
     - name: Prune unused Docker images
       community.docker.docker_prune:
         containers: true
         images: true
         images_filters:
           dangling: false
+          until: "24h"
         networks: true
         volumes: false
         builder_cache: true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/deploy.yml` around lines 404 - 419, The Docker prune task "Prune
unused Docker images" currently sets images_filters: dangling: false and will
remove unused but potentially recently-used images; update that task
(community.docker.docker_prune) to add a time-based filter (e.g.,
images_filters: { dangling: false, until: "{{ docker_prune_image_age |
default('72h') }}" }) so images newer than the threshold are preserved, and make
docker_prune_image_age a configurable variable (defaults to 72h) so CI vs prod
behavior can be tuned.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 814b8c42-dcc1-4c94-87e2-f0961305ad78

📥 Commits

Reviewing files that changed from the base of the PR and between 318ceef and 7e18fde.

📒 Files selected for processing (1)
  • ansible/deploy.yml

@vivekgsharma vivekgsharma requested a review from ktechmidas April 2, 2026 19:47