PR AA: Phoenix / systemd deployment scaffolding (migrate Phoenix off Next.js stub) #31

Closed
nsatoshi wants to merge 2 commits from devin/1776898782-pr-aa-phoenix-migration into main
Owner

PR AA — Phoenix / systemd deployment scaffolding

Closes the gap between Gitea main (b48eb2a, Vite portal + Node orchestrator, 29 PRs merged, 167 tests) and what's actually serving curucombo.曼李.com today (Next.js "ISO-20022 Combo Flow" app from an unpushed local b118b2b checkout). After this PR is merged and the runbook in scripts/deployment/README.md is followed on CT 8604, the Phoenix deployment will serve d-bis/CurrenciCombo main.

Nothing in this PR changes runtime code. All additions live under scripts/deployment/.

Artifacts

file purpose
systemd/currencicombo-orchestrator.service Node orchestrator. EnvironmentFile=/etc/currencicombo/orchestrator.env. Hardening: ProtectSystem=strict, PrivateTmp, NoNewPrivileges, LockPersonality, no ambient caps.
systemd/currencicombo-webapp.service nginx serving the Vite SPA on :3000 via RuntimeDirectory=currencicombo-webapp.
webapp-nginx.conf Self-contained nginx config. Intentionally returns HTTP 421 on /api/* and /events/* so an NPMplus misconfig surfaces as a clean error rather than silently returning index.html.
.env.prod.example Template for /etc/currencicombo/orchestrator.env. Documents every EXT-* blocker env var 1:1 with the Proxmox repo's check-external-dependencies.sh.
install.sh Idempotent host setup: user, dirs, nginx install (apt/apk auto), fresh Postgres role/DB (--force-recreate-db to wipe), Redis autodetect, env file with auto-generated EVENT_SIGNING_SECRET + 3 API keys, systemd units enabled. --dry-run supported.
deploy-currencicombo-8604.sh The build-and-swap deploy driver referenced by the Proxmox phoenix-deploy-api/deploy-targets.json tuple. git fetch/reset → orchestrator tsc build → portal vite build with VITE_ORCHESTRATOR_URL baked in → npm run migrate → timestamped backup → systemctl stop → rsync → systemctl start → smoke /ready + portal / → grep EXT-* from journalctl. Flags: --ref, --dry-run, --skip-migrate, --skip-build, --rollback.
README.md Architecture diagram, first-time setup (8 steps), NPMplus ingress rule table, subsequent-deploy one-liner, rollback, troubleshooting table, cutover-from-pre-existing-Next.js sequence, explicit list of Proxmox-side follow-ups.

Target-agnostic

No IP / hostname / VLAN is hardcoded anywhere in the scripts or units. The only file that embeds the public hostname is README.md (documentation) and the default VITE_ORCHESTRATOR_URL in deploy-currencicombo-8604.sh, which is overridable via env:

CC_GIT_REMOTE, CC_GIT_REF, CC_REPO_DIR, CC_APP_HOME, CC_BACKUP_DIR,
CC_USER, VITE_ORCHESTRATOR_URL, ORCHESTRATOR_UNIT, WEBAPP_UNIT,
CC_HEALTH_URL, CC_PORTAL_URL, CC_HEALTH_TIMEOUT_SECS

Single-origin NPMplus routing (user-confirmed)

curucombo.曼李.com/api/*      →  10.160.0.14:8080  (orchestrator)
curucombo.曼李.com/events/*   →  10.160.0.14:8080  (SSE; proxy_buffering off, 24h read_timeout)
curucombo.曼李.com/*          →  10.160.0.14:3000  (Vite SPA)

The NPMplus rule bodies, exact Host / X-Forwarded-* headers, and SSE timeout settings are spelled out in README.md.

Verification on this build box (headless)

  • shellcheck --severity=warning on both scripts: clean.
  • bash -n on both scripts: clean.
  • systemd-analyze verify on both unit files: parse cleanly. Only complaint is /usr/sbin/nginx not being executable, which is expected — nginx is installed by install.sh at deploy time on CT 8604.
  • install.sh --dry-run: fails fast with the expected FATAL on hosts without psql (build box). On CT 8604 with Postgres + Redis already installed, it walks through every step idempotently.
  • deploy-currencicombo-8604.sh --help: prints the usage block with all 5 flags and 10 env overrides.

Out of scope

  • The actual cutover on r630-01 — user runs install.sh + deploy-currencicombo-8604.sh themselves per README §"First-time setup" and §"Cutting over from the pre-existing Next.js build". This PR is artifacts-only.
  • Proxmox-side corrections (separate commit on /home/intlc/projects/proxmox after cutover runs cleanly):
    • Update phoenix-deploy-api/deploy-targets.json to point at scripts/deployment/deploy-currencicombo-8604.sh.
    • Remove the inaccurate "Next.js webapp with ignoreBuildErrors" language in EXTERNAL_DEPENDENCY_BLOCKERS.md — the new webapp is Vite + tsc -b strict, no build-error suppression.

Relationship to PR #30

PR #30 (docker compose sandbox) remains mergeable and unchanged — it's the local dev path. PR AA is the Phoenix prod path. They share conventions (nginx.conf, .env template layout, EXT-* contract) so future ops changes affect both consistently.

## PR AA — Phoenix / systemd deployment scaffolding Closes the gap between Gitea `main` (`b48eb2a`, Vite portal + Node orchestrator, 29 PRs merged, 167 tests) and what's actually serving `curucombo.曼李.com` today (Next.js "ISO-20022 Combo Flow" app from an unpushed local `b118b2b` checkout). After this PR is merged and the runbook in `scripts/deployment/README.md` is followed on CT 8604, the Phoenix deployment will serve `d-bis/CurrenciCombo` `main`. Nothing in this PR changes runtime code. All additions live under `scripts/deployment/`. ### Artifacts | file | purpose | |---|---| | `systemd/currencicombo-orchestrator.service` | Node orchestrator. `EnvironmentFile=/etc/currencicombo/orchestrator.env`. Hardening: `ProtectSystem=strict`, `PrivateTmp`, `NoNewPrivileges`, `LockPersonality`, no ambient caps. | | `systemd/currencicombo-webapp.service` | nginx serving the Vite SPA on `:3000` via `RuntimeDirectory=currencicombo-webapp`. | | `webapp-nginx.conf` | Self-contained nginx config. **Intentionally returns HTTP 421 on `/api/*` and `/events/*`** so an NPMplus misconfig surfaces as a clean error rather than silently returning `index.html`. | | `.env.prod.example` | Template for `/etc/currencicombo/orchestrator.env`. Documents every `EXT-*` blocker env var 1:1 with the Proxmox repo's `check-external-dependencies.sh`. | | `install.sh` | Idempotent host setup: user, dirs, nginx install (apt/apk auto), fresh Postgres role/DB (`--force-recreate-db` to wipe), Redis autodetect, env file with auto-generated `EVENT_SIGNING_SECRET` + 3 API keys, systemd units enabled. `--dry-run` supported. | | `deploy-currencicombo-8604.sh` | The build-and-swap deploy driver referenced by the Proxmox `phoenix-deploy-api/deploy-targets.json` tuple. `git fetch/reset` → orchestrator `tsc` build → portal `vite build` with `VITE_ORCHESTRATOR_URL` baked in → `npm run migrate` → timestamped backup → `systemctl stop` → rsync → `systemctl start` → smoke `/ready` + portal `/` → grep `EXT-*` from `journalctl`. Flags: `--ref`, `--dry-run`, `--skip-migrate`, `--skip-build`, `--rollback`. | | `README.md` | Architecture diagram, first-time setup (8 steps), NPMplus ingress rule table, subsequent-deploy one-liner, rollback, troubleshooting table, cutover-from-pre-existing-Next.js sequence, explicit list of Proxmox-side follow-ups. | ### Target-agnostic No IP / hostname / VLAN is hardcoded anywhere in the scripts or units. The only file that embeds the public hostname is `README.md` (documentation) and the default `VITE_ORCHESTRATOR_URL` in `deploy-currencicombo-8604.sh`, which is overridable via env: ``` CC_GIT_REMOTE, CC_GIT_REF, CC_REPO_DIR, CC_APP_HOME, CC_BACKUP_DIR, CC_USER, VITE_ORCHESTRATOR_URL, ORCHESTRATOR_UNIT, WEBAPP_UNIT, CC_HEALTH_URL, CC_PORTAL_URL, CC_HEALTH_TIMEOUT_SECS ``` ### Single-origin NPMplus routing (user-confirmed) ``` curucombo.曼李.com/api/* → 10.160.0.14:8080 (orchestrator) curucombo.曼李.com/events/* → 10.160.0.14:8080 (SSE; proxy_buffering off, 24h read_timeout) curucombo.曼李.com/* → 10.160.0.14:3000 (Vite SPA) ``` The NPMplus rule bodies, exact Host / X-Forwarded-* headers, and SSE timeout settings are spelled out in `README.md`. ### Verification on this build box (headless) - `shellcheck --severity=warning` on both scripts: **clean**. - `bash -n` on both scripts: **clean**. - `systemd-analyze verify` on both unit files: parse cleanly. Only complaint is `/usr/sbin/nginx` not being executable, which is expected — nginx is installed by `install.sh` at deploy time on CT 8604. - `install.sh --dry-run`: fails fast with the expected FATAL on hosts without `psql` (build box). On CT 8604 with Postgres + Redis already installed, it walks through every step idempotently. - `deploy-currencicombo-8604.sh --help`: prints the usage block with all 5 flags and 10 env overrides. ### Out of scope - The actual cutover on r630-01 — user runs `install.sh` + `deploy-currencicombo-8604.sh` themselves per README §"First-time setup" and §"Cutting over from the pre-existing Next.js build". This PR is **artifacts-only**. - Proxmox-side corrections (separate commit on `/home/intlc/projects/proxmox` after cutover runs cleanly): - Update `phoenix-deploy-api/deploy-targets.json` to point at `scripts/deployment/deploy-currencicombo-8604.sh`. - Remove the inaccurate "Next.js webapp with `ignoreBuildErrors`" language in `EXTERNAL_DEPENDENCY_BLOCKERS.md` — the new webapp is Vite + `tsc -b` strict, no build-error suppression. ### Relationship to PR #30 PR #30 (`docker compose` sandbox) remains mergeable and unchanged — it's the **local dev** path. PR AA is the **Phoenix prod** path. They share conventions (nginx.conf, .env template layout, EXT-* contract) so future ops changes affect both consistently.
nsatoshi added 1 commit 2026-04-22 23:06:07 +00:00
PR AA: Phoenix / systemd deployment scaffolding (migrate Phoenix off Next.js stub)
Some checks failed
CI / Frontend Lint (pull_request) Failing after 7s
CI / Frontend Type Check (pull_request) Failing after 6s
CI / Frontend Build (pull_request) Failing after 8s
CI / Frontend E2E Tests (pull_request) Failing after 8s
CI / Orchestrator Build (pull_request) Failing after 7s
CI / Orchestrator Unit Tests (pull_request) Failing after 6s
CI / Orchestrator E2E (Testcontainers) (pull_request) Has been skipped
CI / Contracts Compile (pull_request) Failing after 6s
CI / Contracts Test (pull_request) Failing after 7s
Code Quality / SonarQube Analysis (pull_request) Failing after 19s
Code Quality / Code Quality Checks (pull_request) Failing after 6s
Security Scan / Dependency Vulnerability Scan (pull_request) Failing after 4s
Security Scan / OWASP ZAP Scan (pull_request) Failing after 5s
361776ab2e
Closes the gap between Gitea main (b48eb2a, Vite portal + Node
orchestrator, 29 PRs merged, 167 tests) and what's actually serving
curucombo.xn--vov0g.com (Next.js 'ISO-20022 Combo Flow' app from an
unpushed local b118b2b checkout). After this PR is merged and the
runbook in scripts/deployment/README.md is followed on CT 8604, the
Phoenix deployment will serve d-bis/CurrenciCombo main.

Artifacts (all under scripts/deployment/):
- systemd/currencicombo-orchestrator.service  - Node orchestrator,
  EnvironmentFile=/etc/currencicombo/orchestrator.env, full systemd
  hardening (ProtectSystem=strict, PrivateTmp, no caps).
- systemd/currencicombo-webapp.service        - nginx serving Vite
  SPA on :3000 via RuntimeDirectory=/run/currencicombo-webapp.
- webapp-nginx.conf                            - self-contained nginx
  config; intentionally 421s on /api/* and /events/* so an NPMplus
  misconfig fails loudly instead of silently returning index.html.
- .env.prod.example                            - template for
  /etc/currencicombo/orchestrator.env. Documents every EXT-* blocker
  env var 1:1 with the Proxmox repo's check-external-dependencies.sh.
- install.sh                                   - idempotent host setup:
  user, dirs, nginx, fresh Postgres role/DB (--force-recreate-db to
  wipe), Redis autodetect, env file with auto-generated
  EVENT_SIGNING_SECRET + 3 API keys, systemd units enabled but not
  started. --dry-run supported.
- deploy-currencicombo-8604.sh                 - build-and-swap deploy
  driver (the script deploy-targets.json / phoenix-deploy-api calls):
  git fetch/reset, orchestrator tsc build, portal vite build with
  VITE_ORCHESTRATOR_URL baked in, migrations, timestamped backup,
  systemctl stop, rsync, systemctl start, smoke /ready + portal /,
  grep EXT-* from journalctl. --ref, --dry-run, --skip-migrate,
  --skip-build, --rollback.
- README.md                                    - architecture diagram,
  first-time setup (8 steps), NPMplus ingress rule table, subsequent-
  deploy one-liner, rollback, troubleshooting table, cutover-from-
  pre-existing-Next.js sequence, explicit list of Proxmox-side
  follow-ups.

Target-agnostic: no IP / hostname / VLAN hardcoded. The only file that
embeds the public hostname is README.md (for documentation) and the
default VITE_ORCHESTRATOR_URL in deploy-currencicombo-8604.sh (which
is overridable via env).

Single-origin NPMplus routing (confirmed with user):
  curucombo.\xe6\x9b\xbc\xe6\x9d\x8e.com/api/*     -> 10.160.0.14:8080  (orchestrator)
  curucombo.\xe6\x9b\xbc\xe6\x9d\x8e.com/events/*  -> 10.160.0.14:8080  (SSE)
  curucombo.\xe6\x9b\xbc\xe6\x9d\x8e.com/*         -> 10.160.0.14:3000  (Vite SPA)

Verified on this box (headless):
- shellcheck --severity=warning: clean on both scripts.
- bash -n: clean on both scripts.
- systemd-analyze verify: both unit files parse cleanly (only complaint
  is /usr/sbin/nginx not being executable, expected -- nginx is
  installed at deploy time).
- install.sh --dry-run: fails fast with the expected FATAL on hosts
  without psql (build box). On CT 8604 with Postgres+Redis already
  installed, it walks through every step.
- deploy-currencicombo-8604.sh --help: prints the usage.

No runtime code changes. Non-UI. Complements PR #30 (docker-compose
sandbox) which remains the local-dev path.

Proxmox-side follow-up (separate commit on /home/intlc/projects/proxmox
after this PR merges and cutover runs cleanly):
- Update phoenix-deploy-api/deploy-targets.json to point at
  scripts/deployment/deploy-currencicombo-8604.sh.
- Retire the inaccurate "Next.js webapp with ignoreBuildErrors"
  language in EXTERNAL_DEPENDENCY_BLOCKERS.md.

Co-Authored-By: Nakamoto, S <defi@defi-oracle.io>
nsatoshi added 1 commit 2026-04-22 23:30:38 +00:00
PR AA follow-up: manual-rollback loud-failure summary + keep-min-5 backup-prune cron + root-only initial-keys handoff file
Some checks failed
CI / Frontend Lint (pull_request) Failing after 7s
CI / Frontend Type Check (pull_request) Failing after 7s
CI / Frontend Build (pull_request) Failing after 6s
CI / Frontend E2E Tests (pull_request) Failing after 7s
CI / Orchestrator Build (pull_request) Failing after 7s
CI / Orchestrator Unit Tests (pull_request) Failing after 6s
CI / Orchestrator E2E (Testcontainers) (pull_request) Has been skipped
CI / Contracts Compile (pull_request) Failing after 5s
CI / Contracts Test (pull_request) Failing after 6s
Code Quality / SonarQube Analysis (pull_request) Failing after 20s
Code Quality / Code Quality Checks (pull_request) Failing after 7s
Security Scan / Dependency Vulnerability Scan (pull_request) Failing after 4s
Security Scan / OWASP ZAP Scan (pull_request) Failing after 4s
ded7d24924
- deploy-currencicombo-8604.sh: on readiness timeout, print loud failure
  summary (journalctl tails + exact --rollback command with specific
  backup path) instead of silently exiting. Deliberately does NOT
  auto-rollback; first cutovers often fail because of env/migration
  mistakes and auto-restore hides the failure state ops needs.
- install.sh: on first run, write the three API keys + EVENT_SIGNING_SECRET
  to /root/currencicombo-first-keys.txt (0600, root:root) as a handoff
  copy. Canonical values still live in /etc/currencicombo/orchestrator.env.
  Log one pointer line (not the secrets themselves) to journald.
  Handoff file is NOT regenerated if orchestrator.env already exists.
- install-prune-cron.sh (new, opt-in): installs /etc/cron.daily/
  currencicombo-prune-backups that deletes entries older than 30 days
  from /var/lib/currencicombo/backups/ WHILE always keeping the newest
  5 regardless of age. Enforced via newest-first sort + i<KEEP_MIN skip.
- webapp-nginx.conf: drop the misleading /events/* 421 guard-rail. The
  orchestrator's SSE endpoint is /api/plans/:id/events/stream (under
  /api/), so one /api/* guard-rail covers both normal REST and SSE.
- README.md: corrected NPMplus rule table to TWO rules (/api/* with
  SSE-friendly proxy_buffering=off + 24h read_timeout + Connection ""
  + http/1.1, and /); added post-cutover smoke checks section with a
  concrete SSE streaming test that catches silent proxy_buffering=on
  misconfig; documented the /root/currencicombo-first-keys.txt handoff
  and the install-prune-cron.sh workflow; replaced stale 'not auto-pruned'
  note.

Verification:
- shellcheck --severity=warning: clean on all 3 scripts.
- bash -n: clean on install-prune-cron.sh.
- install-prune-cron.sh --dry-run: prints the pruner body with resolved
  env values as expected.
- install.sh --dry-run: walks through user/dirs/nginx-apt steps, then
  fails fast on missing psql (expected on a build box without Postgres).

Co-Authored-By: Nakamoto, S <defi@defi-oracle.io>
Author
Owner

Closing as superseded. main 4a1f69a "deploy: make Phoenix redeploys archive-safe" adopted the bulk of this PRs scope directly to main (install.sh, deploy-currencicombo-8604.sh, .env.prod.example, README.md, systemd/currencicombo-webapp.service, plus the three locked ops improvements). The three files that did not make it into 4a1f69a (webapp-nginx.conf, systemd/currencicombo-orchestrator.service, install-prune-cron.sh) are all referenced by main but physically missing. Those three are now in a small follow-on PR opened against main.

Original PR #31 branch devin/1776898782-pr-aa-phoenix-migration stays available for history.

Closing as superseded. main 4a1f69a "deploy: make Phoenix redeploys archive-safe" adopted the bulk of this PRs scope directly to main (install.sh, deploy-currencicombo-8604.sh, .env.prod.example, README.md, systemd/currencicombo-webapp.service, plus the three locked ops improvements). The three files that did not make it into 4a1f69a (webapp-nginx.conf, systemd/currencicombo-orchestrator.service, install-prune-cron.sh) are all referenced by main but physically missing. Those three are now in a small follow-on PR opened against main. Original PR #31 branch devin/1776898782-pr-aa-phoenix-migration stays available for history.
nsatoshi closed this pull request 2026-04-23 04:28:18 +00:00
Some checks failed
CI / Frontend Lint (pull_request) Failing after 7s
CI / Frontend Type Check (pull_request) Failing after 7s
CI / Frontend Build (pull_request) Failing after 6s
CI / Frontend E2E Tests (pull_request) Failing after 7s
CI / Orchestrator Build (pull_request) Failing after 7s
CI / Orchestrator Unit Tests (pull_request) Failing after 6s
CI / Orchestrator E2E (Testcontainers) (pull_request) Has been skipped
CI / Contracts Compile (pull_request) Failing after 5s
CI / Contracts Test (pull_request) Failing after 6s
Code Quality / SonarQube Analysis (pull_request) Failing after 20s
Code Quality / Code Quality Checks (pull_request) Failing after 7s
Security Scan / Dependency Vulnerability Scan (pull_request) Failing after 4s
Security Scan / OWASP ZAP Scan (pull_request) Failing after 4s

Pull request closed

Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: d-bis/CurrenciCombo#31