# CCIP Relay Service

Off-chain relay for forwarding Chain 138 `MessageSent` events to destination relay routers/bridges.

## Current Topology

Source (Chain 138; must match `.env.bsc` / operator deploy):
- Router: `0x42DAb7b888Dd382bD5Adcf9E038dBF1fD03b4817`
- WETH9 bridge: `0xcacfd227A040002e49e2e01626363071324f820a`

Destinations:
- BSC relay router: `0x4d9Bc6c74ba65E37c4139F0aEC9fc5Ddff28Dcc4`
- BSC relay bridge: `0x886C6A4ABC064dbf74E7caEc460b7eeC31F1b78C`
- AVAX relay router: `0x2a0023Ad5ce1Ac6072B454575996DfFb1BB11b16`
- AVAX relay bridge: `0x3f8C409C6072a2B6a4Ff17071927bA70F80c725F`

Direct first-hop support from Chain 138 is intentionally narrow today:
- Mainnet: supported with the default `.env` profile
- BSC: supported with `.env.bsc`
- Avalanche: supported with `.env.avax`
- Gnosis / Cronos / Celo / Polygon / Arbitrum / Optimism / Base: treat as `via Mainnet hub` unless a dedicated relay router + relay profile are added and proven live

Important: on 2026-04-04, a direct `138 -> Arbitrum` WETH send produced a real source `MessageSent` event but no destination delivery because the live relay worker was running a Mainnet-only destination profile. There is currently no tracked `.env.arbitrum` profile in this folder.

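
The support matrix above can be read as a small lookup. The sketch below is illustrative only: the function name `relay_profile_for`, the lowercase destination names, and the `via-mainnet-hub` marker are hypothetical and are not part of the relay scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a destination name to the env profile that serves it
# directly from Chain 138, or report that it should be routed via the Mainnet hub.
relay_profile_for() {
  case "$1" in
    mainnet)   echo ".env" ;;
    bsc)       echo ".env.bsc" ;;
    avalanche) echo ".env.avax" ;;
    gnosis|cronos|celo|polygon|arbitrum|optimism|base)
      # No dedicated relay router + profile proven live for these chains.
      echo "via-mainnet-hub" ;;
    *)
      echo "unknown destination: $1" >&2
      return 1 ;;
  esac
}

relay_profile_for arbitrum   # prints: via-mainnet-hub
```

A check like this, run before sending from Chain 138, would have flagged the `138 -> Arbitrum` case described above as unsupported for direct delivery.
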
## Env Profiles

Use the prebuilt env files in this folder:
- `.env.mainnet-cw`: Chain 138 cW → **Ethereum mainnet** (`CW_BRIDGE_MAINNET`)
- `.env.mainnet-weth`: WETH lane to mainnet
- `.env.bsc` (template: `.env.bsc.example`)
- `.env.avax-cw`: cW → Avalanche
- `.env.avax`: WETH → Avalanche
- `.env` (default/fallback)
- `.env.local`: loaded **only** when running without a named profile, or when `RELAY_ALLOW_ENV_LOCAL=1` is set

**Pre-flight (required before restart):**

```bash
./scripts/verify/validate-relay-profiles.sh
./scripts/verify/diagnose-cw-mesh-ccip-relay.sh   # mainnet cW lane + balances
```

Named profiles **do not** load `.env.local`. This prevents mismatches such as a mainnet router combined with an Avalanche RPC.

Each profile sets destination RPC, selector, relay router/bridge, and destination WETH token.

### `START_BLOCK` after catch-up

When historical `MessageSent` logs are fully relayed, set **`START_BLOCK=latest`** in `.env.bsc` (or your profile) so a cold start only scans from **~current head − 1** instead of re-queuing the whole backfill range. To replay from an old height again, set an explicit decimal block (e.g. `3012930`) and restart.

**BSC RPC:** Prefer a node that accepts short `eth_getLogs` windows (e.g. `https://bsc.publicnode.com`). Some Binance seeds return `-32005` for log queries the relay uses for destination checks.

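
As a rough illustration of the `START_BLOCK` rule, the sketch below resolves the setting to a concrete scan height. The function name `resolve_start_block` is hypothetical, and treating `latest` as exactly "head minus one" is an assumption; the relay's own resolution may differ.

```bash
# Hypothetical helper: turn a START_BLOCK setting into a concrete scan height.
# $1 = START_BLOCK value ("latest" or an explicit decimal block)
# $2 = current head block of the source chain
resolve_start_block() {
  local setting="$1" head="$2"
  if [ "$setting" = "latest" ]; then
    # Assumption: "latest" means one block behind the current head.
    if [ "$head" -gt 0 ]; then echo $((head - 1)); else echo 0; fi
  else
    # Explicit decimal block: replay from exactly that height.
    echo "$setting"
  fi
}
```

With a head of `3012930`, `resolve_start_block latest 3012930` yields `3012929`, while an explicit `START_BLOCK=3012930` is returned unchanged regardless of the head.
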
### Fund BSC relay bridge (WETH)

From repo root (loads `smom-dbis-138/.env` and relay `.env.bsc` for addresses):

```bash
./scripts/bridge/fund-bsc-relay-bridge.sh --dry-run
./scripts/bridge/fund-bsc-relay-bridge.sh   # full deployer WETH → bridge
# ./scripts/bridge/fund-bsc-relay-bridge.sh 1000000000000000   # 0.001 WETH, expressed in wei
```

Wrap BNB to WETH on the deployer first (`cast send <WETH> "deposit()" --value ...` on BSC) if needed.

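
The optional amount argument is given in wei (18 decimals), so `0.001` WETH becomes `1000000000000000`. The pure-bash sketch below shows that conversion; `weth_to_wei` is a hypothetical helper, not part of the funding scripts, and it does no input validation. If Foundry is installed, `cast to-wei` should give the same result.

```bash
# Hypothetical helper: convert a decimal WETH amount to wei without relying on
# shell arithmetic, which would overflow for large values.
weth_to_wei() {
  local amount="$1" whole frac digits
  whole="${amount%%.*}"
  if [[ "$amount" == *.* ]]; then frac="${amount#*.}"; else frac=""; fi
  # Right-pad the fractional part to 18 digits, then cut any excess precision.
  while [ "${#frac}" -lt 18 ]; do frac="${frac}0"; done
  frac="${frac:0:18}"
  digits="${whole}${frac}"
  # Strip leading zeros by removing the all-zero prefix of the string.
  digits="${digits#"${digits%%[!0]*}"}"
  echo "${digits:-0}"
}

weth_to_wei 0.001   # prints: 1000000000000000
```
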
### Fund Mainnet relay bridge (WETH)

From repo root:

```bash
./scripts/bridge/fund-mainnet-relay-bridge.sh --dry-run
./scripts/bridge/fund-mainnet-relay-bridge.sh   # full deployer WETH → bridge
# ./scripts/bridge/fund-mainnet-relay-bridge.sh 1000000000000000   # 0.001 WETH, expressed in wei
```

## Destination tx confirmation timeout

| Env | Default | Purpose |
|-----|---------|---------|
| `RELAY_TX_CONFIRM_TIMEOUT_MS` | `180000` (3 min) | Max wait for `tx.wait()` on mainnet relay txs. On timeout the message is retried instead of blocking the queue processor indefinitely. |

## Relay shedding (save destination gas)

When **no** 138→Mainnet (or configured destination) relay deliveries are needed, pause **destination-chain** transactions so the relayer does not spend native gas on `relayMessage` / direct `ccipReceive`:

| Variable | Meaning |
|----------|---------|
| `RELAY_SHEDDING=1` | **On**: shedding active (`true` / `yes` / `on` also work). |
| `RELAY_DELIVERY_ENABLED=0` | Same as shedding on (`false` / `no` / `off`). |
| `RELAY_SHEDDING_SOURCE_POLL_INTERVAL_MS` | Source router log poll interval while shedding (default **60000** ms, min 5000). Reduces Chain 138 RPC usage. |
| `RELAY_SHEDDING_QUEUE_POLL_MS` | Idle interval for the queue loop while shedding (default **5000** ms, min 1000). |

**Behavior:** Source `MessageSent` logs are still ingested and messages queue locally. Pending queue state is now persisted to `services/relay/data/queue-state.json` by default (override with `RELAY_QUEUE_STATE_PATH`), so a restart no longer drops queued work. For production, still plan shedding around low bridge traffic so the persisted backlog stays small and intentional.

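
To make the flag semantics concrete, here is a minimal sketch of how the two switches and the interval floors could be evaluated. The helper names are hypothetical, and both the exact-match parsing and the clamp-to-minimum behavior are assumptions drawn from the table, not taken from the relay source.

```bash
# Hypothetical helpers mirroring the table above; the relay's own parsing may differ.
is_on()  { case "$1" in 1|true|yes|on)  return 0 ;; *) return 1 ;; esac; }
is_off() { case "$1" in 0|false|no|off) return 0 ;; *) return 1 ;; esac; }

# Shedding is active when RELAY_SHEDDING is on OR RELAY_DELIVERY_ENABLED is off.
shedding_active() {
  is_on "${RELAY_SHEDDING:-}" || is_off "${RELAY_DELIVERY_ENABLED:-}"
}

# $1 = configured value (may be empty), $2 = default, $3 = minimum, all in ms.
# Assumption: values below the documented minimum are raised to that minimum.
effective_interval_ms() {
  local value="${1:-$2}" min="$3"
  if [ "$value" -lt "$min" ]; then echo "$min"; else echo "$value"; fi
}

# Example: source poll interval while shedding (default 60000 ms, min 5000 ms).
effective_interval_ms "${RELAY_SHEDDING_SOURCE_POLL_INTERVAL_MS:-}" 60000 5000
```
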
## Skip specific message IDs

Use `RELAY_SKIP_MESSAGE_IDS` as a comma-separated list of source `MessageSent.messageId` values that the relay should intentionally ignore.

This is the safest operational way to park an already-confirmed source message when:
- destination relay inventory is below the requested release amount
- you do not want the relay to keep retrying it after service restarts
- there is no on-chain cancel / refund path on the source bridge

Example:

```bash
RELAY_SKIP_MESSAGE_IDS=0xf718c9895c0a5442349996383184d017d2fa041af7aaeb9f0c0675d3ceed756b
```

The relay checks this list during live event ingestion, historical replay, and queue processing.

For the current Mainnet WETH backlog policy, see:

- [`docs/03-deployment/MAINNET_WETH_RELAY_BACKLOG_POLICY.md`](../../../docs/03-deployment/MAINNET_WETH_RELAY_BACKLOG_POLICY.md)

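
The skip-list check amounts to a membership test over the comma-separated value. The sketch below shows one way to express it; `should_skip_message` is a hypothetical name, and the case-insensitive comparison and tolerance for spaces around commas are assumptions rather than documented relay behavior.

```bash
# Hypothetical helper: succeed when the given messageId appears in
# RELAY_SKIP_MESSAGE_IDS (comma-separated list of 0x-prefixed ids).
should_skip_message() {
  local wanted list id
  local -a ids=()
  # Assumption: ids are compared case-insensitively.
  wanted="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  list="$(printf '%s' "${RELAY_SKIP_MESSAGE_IDS:-}" | tr '[:upper:]' '[:lower:]')"
  IFS=',' read -r -a ids <<< "$list"
  for id in "${ids[@]}"; do
    id="${id//[[:space:]]/}"   # tolerate spaces around commas
    if [ -n "$id" ] && [ "$id" = "$wanted" ]; then return 0; fi
  done
  return 1
}
```

The same predicate would be applied at each of the three points named above: live ingestion, historical replay, and queue processing.
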
### On-chain pause (`CCIPRelayRouter`)

The destination **CCIPRelayRouter** inherits OpenZeppelin **`Pausable`**: admins with `DEFAULT_ADMIN_ROLE` may call **`pause()`** / **`unpause()`**. While paused, **`relayMessage` reverts** (no delivery through the router).

**Relay service:** Before sending `relayMessage`, the worker calls **`paused()`** on the destination router (router mode only). If paused, it **re-queues** the message and waits 15s instead of broadcasting a reverting tx. Older routers without `paused()` skip this check (call errors are logged at debug).

**Important:** If you `pause()` the router but leave the relay **process** running **without** `RELAY_SHEDDING=1`, failed txs are much less likely thanks to the check above, but off-chain activity (source polling, queue growth) still runs. Prefer **`RELAY_SHEDDING=1`** (or stop the service) whenever the router is paused for an extended period.

**Direct-delivery** mode (`DEST_DELIVERY_MODE=direct`) calls the bridge's `ccipReceive` directly and **does not** go through the router, so pausing the router alone does not stop that path; use shedding or revoke `ROUTER_ROLE` on the bridge as appropriate.

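
The pre-send check described above reduces to a small decision on delivery mode and the router's `paused()` result. The following is a sketch under stated assumptions: `router_paused` and `next_action` are hypothetical names, and the router address and RPC URL passed to `router_paused` are supplied by the caller.

```bash
# Query the router's paused flag with Foundry's cast; prints "true" or "false".
# On older routers without paused() the call fails and nothing is printed.
# $1 = destination router address, $2 = destination RPC URL
router_paused() {
  cast call "$1" "paused()(bool)" --rpc-url "$2" 2>/dev/null
}

# Decide what to do with one queued message.
# $1 = delivery mode ("router" or "direct")
# $2 = output of router_paused ("true", "false", or "" when the call errored)
next_action() {
  local mode="$1" paused="$2"
  if [ "$mode" = "router" ] && [ "$paused" = "true" ]; then
    echo "requeue"   # the relay then waits 15s before trying again
  else
    echo "send"      # direct mode, unpaused router, or router without paused()
  fi
}
```

Note that `next_action direct true` still yields `send`, which is why pausing the router does not protect the direct-delivery path.
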
## Start Relay

```bash
cd /home/intlc/projects/proxmox/smom-dbis-138/services/relay
npm install

# BSC relay profile
./start-relay.sh bsc

# AVAX relay profile
./start-relay.sh avax

# Default profile
./start-relay.sh
```

`start-relay.sh` loads env in this order:
1. `.env.<profile>` (if profile argument provided)
2. `.env.local` (skipped for named profiles unless `RELAY_ALLOW_ENV_LOCAL=1`)
3. `.env`

If parent project `.env` defines `PRIVATE_KEY`, `${PRIVATE_KEY}` references in relay env files are expanded.

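
Combining the load order with the `.env.local` rule from the Env Profiles section gives the following file sequence. This is only a sketch of how the two rules appear to fit together; `relay_env_files` is a hypothetical helper and does not reproduce the actual contents of `start-relay.sh`.

```bash
# Hypothetical helper: print, in load order, the env files that would be read
# for an optional profile name (for example "bsc" or "avax").
relay_env_files() {
  local profile="${1:-}"
  if [ -n "$profile" ]; then
    echo ".env.${profile}"
    # Named profiles skip .env.local unless explicitly allowed.
    if [ "${RELAY_ALLOW_ENV_LOCAL:-0}" = "1" ]; then echo ".env.local"; fi
  else
    echo ".env.local"
  fi
  echo ".env"
}

relay_env_files bsc   # without RELAY_ALLOW_ENV_LOCAL set: .env.bsc, then .env
```
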
## Relay Health Endpoint

The relay now exposes a lightweight JSON status endpoint for explorer / mission-control monitoring.

- Default listen address: `0.0.0.0`
- Default port: `9860`
- Endpoints: `GET /healthz`, `GET /health`, `GET /status`

Optional env overrides:

```bash
RELAY_HEALTH_ENABLED=1
RELAY_HEALTH_HOST=0.0.0.0
RELAY_HEALTH_PORT=9860
```

Example from another LAN host:

```bash
curl http://192.168.11.11:9860/healthz | jq .
```

Example explorer backend wiring:

```bash
CCIP_RELAY_HEALTH_URL=http://192.168.11.11:9860/healthz
CCIP_RELAY_HEALTH_URLS=mainnet-weth=http://192.168.11.11:9860/healthz,mainnet-cw=http://192.168.11.11:9863/healthz,bsc=http://192.168.11.11:9861/healthz,avax=http://192.168.11.11:9862/healthz
```

Recommended systemd ports when running multiple relay workers on the same host:

- Mainnet WETH (default `.env`): `9860`
- Mainnet cW (`ccip-relay-mainnet-cw.service`): `9863`
- BSC: `9861`
- Avalanche: `9862`

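
When several workers run side by side, the `name=url` list in `CCIP_RELAY_HEALTH_URLS` can also drive a simple operator probe. The sketch below is an assumption-laden example: `health_url_for` and `check_worker` are hypothetical helpers, and the probe only looks at the HTTP status because the JSON body's fields are not documented here.

```bash
# Hypothetical helper: look up one worker's URL in CCIP_RELAY_HEALTH_URLS,
# which uses the "name=url,name=url" form shown above.
health_url_for() {
  local wanted="$1" entry
  local -a entries=()
  IFS=',' read -r -a entries <<< "${CCIP_RELAY_HEALTH_URLS:-}"
  for entry in "${entries[@]}"; do
    if [ "${entry%%=*}" = "$wanted" ]; then
      echo "${entry#*=}"   # everything after the first "="
      return 0
    fi
  done
  return 1
}

# Probe one worker; curl -f turns HTTP error statuses into a non-zero exit.
check_worker() {
  local url
  url="$(health_url_for "$1")" || { echo "no health URL for $1" >&2; return 1; }
  curl -fsS --max-time 5 "$url" >/dev/null
}
```

For example, `check_worker bsc && echo "bsc relay reachable"` would probe the BSC worker on port `9861` given the wiring above.
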
### BSC profile (`start-relay.sh bsc`)

- **Source:** Chain 138 public RPC (`RPC_URL_138` in `.env.bsc`).
- **Destination:** `BSC_RELAY_RPC_URL` in `smom-dbis-138/.env` (ngrok to operator BSC node on chain 56).
- **Upstream (not used for relay txs):** `BSC_RPC_URL` / Infura, used for operator `cast` and health cross-checks.
- Sync + restart on r630-01: `../../../../scripts/deployment/sync-ccip-relay-bsc-r630-01.sh`
- Verify: `../../../../scripts/verify/check-bsc-relay-rpc.sh`

## Critical Requirements

- Relayer key must hold native gas on destination chain.
- Destination relay bridge must hold enough WETH for payouts.
- Explicit profile token overrides like `DEST_WETH9_ADDRESS` win over the generic multichain token map. This keeps relay-backed destinations pointed at their bridge-managed wrapped token instead of a public native wrapped asset.
- Source bridge destination mapping must point to the correct destination relay bridge.
- Source router `feeToken()` must be a deployed ERC20 with sufficient deployer balance.

## Fast Status Checks

Check source destination mappings:
```bash
cast call 0xcacfd227A040002e49e2e01626363071324f820a "destinations(uint64)" 11344663589394136015 --rpc-url https://rpc.public-0138.defi-oracle.io
cast call 0xcacfd227A040002e49e2e01626363071324f820a "destinations(uint64)" 6433500567565415381 --rpc-url https://rpc.public-0138.defi-oracle.io
```

Check message settlement:
```bash
cast call 0x886C6A4ABC064dbf74E7caEc460b7eeC31F1b78C "processedTransfers(bytes32)(bool)" <bsc_message_id> --rpc-url https://bsc.publicnode.com
cast call 0x3f8C409C6072a2B6a4Ff17071927bA70F80c725F "processedTransfers(bytes32)(bool)" <avax_message_id> --rpc-url https://avalanche-c-chain.publicnode.com
```

Check destination bridge liquidity:
```bash
cast call <dest_weth> "balanceOf(address)(uint256)" <dest_relay_bridge> --rpc-url <dest_rpc>
```
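
To turn the liquidity check into a pass/fail test, the returned balance has to be compared with the amount a pending message will release. Because `uint256` values can exceed bash's 64-bit arithmetic, the sketch below compares decimal strings instead. `wei_gte` is a hypothetical helper; it assumes both inputs are plain non-negative decimal strings.

```bash
# Hypothetical helper: succeed when balance >= required.
# $1 = balance in wei, $2 = required amount in wei (decimal strings).
wei_gte() {
  local a="$1" b="$2"
  # Drop leading zeros so that comparing lengths is meaningful.
  a="${a#"${a%%[!0]*}"}"
  b="${b#"${b%%[!0]*}"}"
  if [ "${#a}" -ne "${#b}" ]; then
    # More digits means a larger number.
    if [ "${#a}" -gt "${#b}" ]; then return 0; else return 1; fi
  fi
  # Equal length: lexicographic order of digit strings equals numeric order.
  [[ "$a" == "$b" || "$a" > "$b" ]]
}

# Usage sketch: keep only the first field of the cast output, since some cast
# versions append a human-readable annotation after the raw number.
#   balance="$(cast call <dest_weth> "balanceOf(address)(uint256)" <dest_relay_bridge> --rpc-url <dest_rpc>)"
#   wei_gte "${balance%% *}" 1000000000000000 && echo "bridge can cover 0.001 WETH"
```
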