ci(phoenix): workflow_dispatch reinstall for phoenix-deploy-api on CT 5700 #16

Open
nsatoshi wants to merge 0 commits from devin/1777403009-phoenix-bootstrap into master
Owner

Summary

Closes the third item of the post-merge optional follow-ups: ship the missing bootstrap path for phoenix-deploy-api on CT 5700.

The current state on master:

  • phoenix-deploy-api/server.js is the real implementation (executeDeploy via execFile, prepareDeployWorkspace, healthcheck, Gitea status, outbound webhook). Verified by grep — the string Deploy request queued (stub) does NOT exist on master.
  • The running service on CT 5700 is still the older stub that returns {"status":"accepted",...,"message":"Deploy request queued (stub)..."} for every target. So deploy-to-phoenix.yml's deploy / deploy-atomic-swap-dapp jobs return 202 but never execute scripts/deployment/deploy-atomic-swap-dapp-5801.sh.
  • scripts/deployment/deploy-phoenix-deploy-api-to-dev-vm.sh already implements the full ship path (tar → scp to PVE → pct push to CT → run install-systemd.sh → curl /health) for a LAN operator. The gap was: no Gitea-Actions surface for it, so a normal "push to master" cannot heal a stale CT.

This PR adds .gitea/workflows/bootstrap-phoenix-deploy-api.yml, a workflow_dispatch-only job that mirrors the LAN script as a CI runner step:

  1. Validate repo layout: refuses to bootstrap if phoenix-deploy-api/server.js still contains the stub string (defence-in-depth against a regression).
  2. Install SSH key for the PVE node from PHOENIX_PVE_SSH_KEY (use a dedicated deploy key, not your personal key); pin host with PHOENIX_PVE_KNOWN_HOSTS if set, else accept-new.
  3. Build deploy bundle: tar czf of phoenix-deploy-api/ + config/public-sector-program-manifest.json (the manifest is optional; a missing manifest produces a warning, not a failure).
  4. scp + ssh + pct push + install-systemd.sh: the same chain the LAN script uses.
  5. Health check: GET /health with retry.
  6. Non-stub probe: POST /api/deploy with target: "__bootstrap_probe__" and the deploy bearer token. Fails the workflow if the response body still contains Deploy request queued (stub) or any auth-rejection signal. This is the unambiguous post-bootstrap health signal in CI logs.
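
For reviewers who want to see the shape of step 5, here is a minimal sketch of a "health check with retry" in shell. It is not copied from the workflow; the function name `retry` and the attempt count and delay in the usage comment are assumptions made for illustration. Port 4001 is the one named in the commit message.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
# (Hypothetical helper; the real workflow step may be written differently.)
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      echo "retry: giving up after ${n} attempt(s): $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Possible use in the health-check step (10 attempts, 3 s apart are guesses):
#   retry 10 3 curl -fsS "http://${PHOENIX_DEPLOY_DEV_VM_IP}:4001/health"
```

A bounded retry matters here because the service is restarted by install-systemd.sh moments before the check, so the first request may arrive before the listener is up.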

workflow_dispatch-only by design — reinstalling the deploy service is sensitive enough that we want it gated behind a deliberate manual click, not fired on every master push.

Files

  • .gitea/workflows/bootstrap-phoenix-deploy-api.yml — new workflow.
  • docs/04-configuration/DEVIN_GITEA_PROXMOX_CICD.md — new §3a "Bootstrap workflow secrets (one-time per CT)" listing the new secrets and explaining the non-stub probe.

Validation

python3 -c "import yaml; yaml.safe_load(open('.gitea/workflows/bootstrap-phoenix-deploy-api.yml'))"   # YAML valid
grep -c "Deploy request queued (stub)" phoenix-deploy-api/server.js                                  # 0 (master is post-stub)
git log --oneline -5                                                                                  # unchanged on master

End-to-end CI verification of the new workflow itself was not run from this VM: the workflow needs the new secrets (PHOENIX_PVE_HOST, PHOENIX_PVE_SSH_KEY, PHOENIX_PVE_KNOWN_HOSTS, PHOENIX_DEV_VM_VMID, PHOENIX_DEPLOY_DEV_VM_IP) populated in the Gitea repo before it can ssh into the PVE node. Setting those is intentionally a maintainer step; the new docs section (§3a) walks through it.
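
As a sketch of how the stub guard from step 1 could be scripted (the function name `assert_not_stub` is hypothetical and not taken from the workflow file): note that `grep -c` prints `0` but exits with status 1 when nothing matches, so a guard that runs under `set -e` is safer written around `grep -q`.

```shell
# Marker string that only the old stub build emits.
STUB_MARKER='Deploy request queued (stub)'

# assert_not_stub FILE
# Returns 0 when FILE exists and does not contain the stub marker, 1 otherwise.
# (Hypothetical helper for illustration.)
assert_not_stub() {
  local file="$1"
  if [ ! -f "$file" ]; then
    echo "missing file: $file" >&2
    return 1
  fi
  # -F matches the marker as a fixed string, so the parentheses are literal.
  if grep -qF -- "$STUB_MARKER" "$file"; then
    echo "refusing to bootstrap: $file still contains the stub marker" >&2
    return 1
  fi
  return 0
}

# Possible use at the top of the workflow step:
#   assert_not_stub phoenix-deploy-api/server.js || exit 1
```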

Review & Testing Checklist for Human

  • Add the new repo secrets in Gitea Settings → Secrets and Variables → Actions:
    • PHOENIX_PVE_HOST (PVE node IP that hosts CT 5700, e.g. 192.168.11.12)
    • PHOENIX_PVE_USER (default root, set if different)
    • PHOENIX_PVE_SSH_KEY (PEM-format private key, dedicated deploy key recommended)
    • PHOENIX_PVE_KNOWN_HOSTS (one-line known_hosts entry; optional but recommended)
    • PHOENIX_DEV_VM_VMID (default 5700)
    • PHOENIX_DEPLOY_DEV_VM_IP (default 192.168.11.59)
  • Run the workflow once from the Gitea Actions UI (Actions tab → "Bootstrap Phoenix Deploy API" → "Run workflow"). Watch the bundle build, scp, pct push, install-systemd.sh, /health, and non-stub probe.
  • After the workflow succeeds, re-run an existing deploy-to-phoenix.yml (push or workflow_dispatch). Check the deploy / deploy-atomic-swap-dapp jobs' Phoenix response body — should NOT contain Deploy request queued (stub) anymore.
  • Verify the actual deploy ran: SSH into CT 5801 / atomic-swap host and confirm the bundle SHA matches the latest atomic-swap-dapp commit (or use the existing post-deploy verification: curl -sS https://atomic-swap.defi-oracle.io/ | grep -oE 'assets/index-[^"]+\.js' | head -1).
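
Before the first manual run, the secrets from the first checklist item could be sanity-checked with something like the sketch below. It assumes the secrets are exposed to the step as environment variables of the same names, and it treats only PHOENIX_PVE_HOST and PHOENIX_PVE_SSH_KEY as mandatory because the checklist gives no default for them; `check_required` is a hypothetical helper, not part of the workflow.

```shell
# check_required NAME...
# Prints each variable name that is unset or empty; returns 1 if any are missing.
# (Hypothetical helper for illustration.)
check_required() {
  local missing=0 name value
  for name in "$@"; do
    # Indirect lookup of the variable called $name; names are fixed by the caller.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Optional settings fall back to the defaults listed in the checklist.
PHOENIX_PVE_USER="${PHOENIX_PVE_USER:-root}"
PHOENIX_DEV_VM_VMID="${PHOENIX_DEV_VM_VMID:-5700}"
PHOENIX_DEPLOY_DEV_VM_IP="${PHOENIX_DEPLOY_DEV_VM_IP:-192.168.11.59}"

# Possible use early in the job:
#   check_required PHOENIX_PVE_HOST PHOENIX_PVE_SSH_KEY || exit 1
```

Failing early with the list of missing names is easier to read in CI logs than an SSH error several steps later.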

Notes

  • The workflow does not modify any host state until the manual trigger fires.
  • If the SSH key in PHOENIX_PVE_SSH_KEY is incorrect or not authorised on the PVE node, the scp step will fail with a clean SSH error message.
  • The non-stub probe in step 6 is the canary I want surfaced in CI logs. If a future change accidentally regresses the running service to a stub (e.g. someone manually copies an old build), the probe will fail and you'll see it in the Bootstrap workflow logs.
  • This is a separate PR from the EIP-5792 PR (atomic-swap-dapp #3) — they touch different repos and different concerns.
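
To make the non-stub probe from the notes above concrete, here is a rough sketch of how a response body might be classified. The stub marker is the string quoted in this PR; the auth-rejection keywords (`unauthorized`, `forbidden`) are assumptions, since the exact signals the workflow looks for are not listed here, and `classify_probe` is a hypothetical name.

```shell
# classify_probe BODY
# Prints "stub" if the body carries the stub marker, "auth" if it looks like
# an auth rejection (keywords are assumed), and "ok" otherwise.
classify_probe() {
  local body
  # Lower-case the body so the keyword match is case-insensitive.
  body="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$body" in
    *"deploy request queued (stub)"*) echo "stub" ;;
    *unauthorized*|*forbidden*)       echo "auth" ;;
    *)                                echo "ok" ;;
  esac
}

# Possible use after the POST /api/deploy probe, with the body in $response:
#   [ "$(classify_probe "$response")" = "ok" ] || exit 1
```

Checking for the stub marker first means a stale service is reported as stale even if its body also happens to contain one of the auth keywords.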
nsatoshi added 1 commit 2026-04-28 19:06:26 +00:00
ci(phoenix): workflow_dispatch reinstall for phoenix-deploy-api on CT 5700
Some checks failed
AI Code Review / claude-review (pull_request) Failing after 1m11s
Validate (PR) / run-all-validation (pull_request) Successful in 24s
9e0795dbc4
Closes the gap where phoenix-deploy-api/server.js on master is the real
implementation, but the running service on CT 5700 is the older stub
that returns 'Deploy request queued (stub)' for every target.

The new workflow .gitea/workflows/bootstrap-phoenix-deploy-api.yml is
manual-only (workflow_dispatch). When triggered it:

  1. Validates the repo layout (phoenix-deploy-api/server.js MUST NOT
     contain the stub string).
  2. Tars phoenix-deploy-api/ + config/public-sector-program-manifest.json
     into a deploy bundle.
  3. scp's the bundle to the PVE node that hosts CT 5700 using a
     dedicated deploy SSH key (PHOENIX_PVE_SSH_KEY repo secret).
  4. pct push / pct exec the bundle into the CT and runs the existing
     phoenix-deploy-api/scripts/install-systemd.sh which already drops
     /opt/phoenix-deploy-api/, writes the systemd unit, and restarts
     the service.
  5. Health-checks GET http://<dev-vm>:4001/health (with retry).
  6. Posts a non-stub probe: POST /api/deploy with target __bootstrap_probe__
     + the deploy bearer token. Fails the workflow if the response body
     still contains 'Deploy request queued (stub)' or any auth-rejection
     signal. That gives an unambiguous post-bootstrap health signal in
     CI logs without depending on a successful real deploy.

Required new secrets (documented in docs/04-configuration/DEVIN_GITEA_PROXMOX_CICD.md
section 3a):
  PHOENIX_PVE_HOST, PHOENIX_PVE_USER (default root), PHOENIX_PVE_SSH_KEY,
  PHOENIX_PVE_KNOWN_HOSTS (optional), PHOENIX_DEV_VM_VMID (default 5700),
  PHOENIX_DEPLOY_DEV_VM_IP (default 192.168.11.59).

Triggered manually only — bootstrap is sensitive enough that we do NOT
fire on every master push. Once the running service on CT 5700 is
post-stub, the existing deploy job in deploy-to-phoenix.yml will
actually execute scripts/deployment/deploy-atomic-swap-dapp-5801.sh on
each push instead of returning a 202 stub.

Co-Authored-By: Nakamoto, S <defi@defi-oracle.io>

Claude encountered an error —— View job


I'll analyze this and get back to you.

This branch is already included in the target branch. There is nothing to merge.
Reference: d-bis/proxmox#16