# Simulation Scorecard Contract

Simulator runs must emit a **scorecard JSON** so results are comparable and deployability can be gated. This document defines the output schema and pass/fail gates.

---
## 1. Scorecard JSON schema

Every run (hub-only, full-quote, bridge shock) should produce a scorecard with at least:

| Field | Type | Description |
|-------|------|-------------|
| `capture_mean` | number | Mean router capture ratio (fraction 0–1) across chains/tokens |
| `capture_p95` | number | 95th percentile capture |
| `churn_mean` | number | Mean inventory churn per epoch |
| `churn_p95` | number | 95th percentile churn |
| `churn_max` | number | Max churn observed |
| `intervention_cost_total` | number | Total intervention cost (bridge/mint/burn) in run |
| `intervention_cost_per_1M_volume` | number | Intervention cost per 1M routed volume |
| `peak_deviation_bps` | number | Peak peg deviation in basis points |
| `reflexive_route_count` | number | Count of multi-hop routes through multiple PMM pools (reflexivity) |
| `drain_half_life_epochs` | object | Per (token, chain): epochs until PMM inventory halves under routing pressure; **too short = routing magnet** |
| `path_concentration_index` | number | HHI on path shares (0–1); **high = you dominate execution**; lower = flow diversified (safer) |
| `arb_volume_total` | number | Total volume traded by arb step (PR#2) |
| `arb_profit_total` | number | Arb profit from **execution** (actual PMM output vs oracle), not mid (PR#2 refinement) |
| `peak_deviation_bps_pre_arb` | number | Max pool \|δ\| before arb step (diagnostic: is arb doing the work?) |
| `peak_deviation_bps_post_arb` | number | Max pool \|δ\| after arb (current primary gate) |
| `peak_deviation_bps_post_bot` | number | Max pool \|δ\| after bot (inventory rebalancing effect) |
| `intervention_cost_inject_total` | number | Bot inject (bridge-in) cost only |
| `intervention_cost_withdraw_total` | number | Bot withdraw cost only |
| `intervention_cost_by_chain` | object | Per chain: `{ inject, withdraw }`; shows which chains are liquidity sinks |
| `scenario` | string | e.g. `hub_only_11`, `full_quote_1_56_137`, `bridge_shock_137_56` |
| `runId` | string | Optional run identifier |
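The `path_concentration_index` row describes an HHI on path shares. As a minimal sketch of that calculation, the index is the sum of squared volume shares. The function name and the input shape (a mapping from path identifier to routed volume) are assumptions for illustration, not the simulator's actual API:

```python
def path_concentration_index(volume_by_path: dict[str, float]) -> float:
    """Herfindahl-Hirschman index (HHI) over path volume shares, in [0, 1]."""
    total = sum(volume_by_path.values())
    if total <= 0:
        return 0.0  # no routed volume: treated as no concentration (assumption)
    return sum((v / total) ** 2 for v in volume_by_path.values())


# Hypothetical path labels and volumes, for illustration only.
even = path_concentration_index({"a": 100.0, "b": 100.0, "c": 100.0, "d": 100.0})
skewed = path_concentration_index({"a": 900.0, "b": 50.0, "c": 50.0})
print(round(even, 4), round(skewed, 4))
```

With four equally used paths the index equals 1/4; it approaches 1 as a single path takes most of the flow.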

**Example (minimal):**

```json
{
  "scenario": "hub_only",
  "capture_mean": 0.18,
  "capture_p95": 0.28,
  "churn_mean": 0.45,
  "churn_p95": 0.82,
  "churn_max": 1.1,
  "intervention_cost_total": 1200,
  "intervention_cost_per_1M_volume": 8.5,
  "peak_deviation_bps": 18,
  "reflexive_route_count": 0
}
```
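Before gates are applied, a scorecard can be given a quick structural check. The sketch below is a hand-rolled stand-in that uses only the Python standard library. It is not the project's validator, and the list of required fields is an assumption taken from the minimal example rather than from the schema file:

```python
import json
from numbers import Real

# Assumed required fields, copied from the minimal example (not from the schema file).
REQUIRED_NUMERIC = (
    "capture_mean", "capture_p95", "churn_mean", "churn_p95", "churn_max",
    "intervention_cost_total", "intervention_cost_per_1M_volume",
    "peak_deviation_bps", "reflexive_route_count",
)


def is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


def check_scorecard(scorecard: dict) -> list[str]:
    """Return a list of problems; an empty list means the basic shape is fine."""
    problems = []
    if not isinstance(scorecard.get("scenario"), str):
        problems.append("scenario: missing or not a string")
    for field in REQUIRED_NUMERIC:
        if not is_number(scorecard.get(field)):
            problems.append(f"{field}: missing or not a number")
    capture = scorecard.get("capture_mean")
    if is_number(capture) and not 0 <= capture <= 1:
        problems.append("capture_mean: expected a fraction in [0, 1]")
    return problems


# Deliberately incomplete input: churn_mean is a string, most fields are absent.
raw = '{"scenario": "hub_only", "capture_mean": 0.18, "churn_mean": "0.45"}'
issues = check_scorecard(json.loads(raw))
print(issues)
```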
A machine-readable schema lives in `config/scorecard-schema.json` for validation.

---
## 2. Pass/fail gates (“deployable” scenarios)

The gates below are taken from [10-behavioral-stability-analysis.md](10-behavioral-stability-analysis.md):

| Gate | Condition | Rationale |
|------|-----------|-----------|
| **Churn (normal)** | `churn_mean` in [0.3, 0.8] | Healthy baseline |
| **Churn (stress)** | `churn_max` < 1.5 | Avoid constant bot intervention |
| **Capture (baseline)** | `capture_mean` in [0.10, 0.30] | Peg stabilizer, not global venue |
| **Intervention** | `intervention_cost_per_1M_volume` stable (no explosive jump vs baseline) | Cost should stay roughly linear in volume |
| **Full-quote vs hub** | If full-quote: `churn_mean` increase vs hub < 50% | Don’t deploy full-quote if churn jumps >50% |
| **Peak deviation** | `peak_deviation_bps` below the circuit-break threshold (e.g. 200 bps for USD) | Stay inside band |
| **Drain half-life** | `drain_half_life_epochs` not collapsing under full-quote vs hub | Routing magnet check |
| **Path concentration** | `path_concentration_index` not spiking during bridge shock | Diversified routing |
| **Reflexivity** | `reflexive_route_count` low relative to total routes | Avoid feedback loops |
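The gates with fixed numeric thresholds can be applied mechanically. The following is a minimal sketch in Python, assuming a scorecard dict shaped like the minimal example in section 1. The function name, the optional `hub_churn_mean` argument, the inclusive range bounds, and the default 200 bps circuit-break value are illustrative assumptions. The relative gates (intervention stability, drain half-life, path concentration, reflexivity) need a baseline run and thresholds this document does not fix, so they are left out.

```python
def evaluate_gates(card, hub_churn_mean=None, circuit_break_bps=200.0):
    """Return {gate name: passed} for the gates that have fixed numeric thresholds."""
    results = {
        # Range bounds are treated as inclusive (assumption).
        "churn_normal": 0.3 <= card["churn_mean"] <= 0.8,
        "churn_stress": card["churn_max"] < 1.5,
        "capture_baseline": 0.10 <= card["capture_mean"] <= 0.30,
        "peak_deviation": card["peak_deviation_bps"] < circuit_break_bps,
    }
    # Full-quote vs hub: only checked when a positive hub-only baseline is supplied.
    if hub_churn_mean is not None and hub_churn_mean > 0:
        increase = (card["churn_mean"] - hub_churn_mean) / hub_churn_mean
        results["full_quote_vs_hub"] = increase < 0.5
    return results


card = {"churn_mean": 0.45, "churn_max": 1.1, "capture_mean": 0.18, "peak_deviation_bps": 18}
results = evaluate_gates(card)
deployable = all(results.values())
print(results, deployable)
```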

**Sanity checks (PR#2):** Arb volume should rise when k is tight; bot interventions should rise when inventory targets are low.

**Pass:** All gates are satisfied for the scenario.

**Fail:** Any gate is violated. Do not treat the scenario as deployable without a parameter change or a topology reduction.

---
## 3. Phase 0 comparison (three scenarios)

Run and compare:

1. **Hub-only** across all 11 chains
2. **Full-quote** only on 1, 56, 137
3. **Bridge shock** (e.g. 137 → 56)

Compare deltas:

- **churn +%** (full-quote vs hub)
- **intervention cost +%** (full-quote vs hub)
- **peak deviation** under shock
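These deltas reduce to simple arithmetic on pairs of scorecards. A minimal sketch, assuming each scenario's scorecard has been loaded as a dict; the helper names, the choice of `intervention_cost_per_1M_volume` as the cost metric, and the sample values are hypothetical:

```python
def pct_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a relative change")
    return (candidate - baseline) / baseline * 100.0


def compare_phase0(hub: dict, full_quote: dict, shock: dict) -> dict:
    """Collect the three Phase 0 deltas from the hub, full-quote, and shock scorecards."""
    return {
        "churn_pct": pct_change(hub["churn_mean"], full_quote["churn_mean"]),
        "intervention_cost_pct": pct_change(
            hub["intervention_cost_per_1M_volume"],
            full_quote["intervention_cost_per_1M_volume"],
        ),
        "peak_deviation_bps_under_shock": shock["peak_deviation_bps"],
    }


# Illustrative numbers only, not simulator output.
hub = {"churn_mean": 0.40, "intervention_cost_per_1M_volume": 8.0}
full_quote = {"churn_mean": 0.50, "intervention_cost_per_1M_volume": 10.0}
shock = {"peak_deviation_bps": 120}
deltas = compare_phase0(hub, full_quote, shock)
print(deltas)
```

In this made-up case churn and intervention cost both rise by about 25%, which would stay under the 50% churn limit used by the full-quote gate.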

If `churn_mean` rises by more than 50% under full-quote relative to hub-only, that is a clear “don’t deploy full-quote” signal.

---
## 4. Phase 0: Runnable scenarios and knob guidance

**Exact scenario JSONs to run** (in `config/scenarios/`):

| Scenario file | Description | Expected pass |
|---------------|-------------|---------------|
| `hub_only_11.json` | Hub topology, all 11 chains, 720 epochs | `churn_mean` in [0.3, 0.8], `capture_mean` in [0.10, 0.30], `churn_max` < 1.5 |
| `full_quote_1_56_137.json` | Full-quote on Ethereum, BSC, Polygon; 720 epochs | Same gates; `churn_mean` increase vs `hub_only_11` < 50% |
| `bridge_shock_137_56.json` | Hub on 1/56/137; 5% migration 137 to 56 over 24 epochs | `peak_deviation_bps` < 200; damped re-center (not resonant). **Note:** Shock is a **stress injection** (paired local sell/buy), not cross-chain router equilibrium; see §5. |

**One command = one scorecard = pass/fail:** Run the sim with a scenario JSON, validate the output against `config/scorecard-schema.json`, then apply the gates from section 2.

**If a run fails, which knob to turn first:**

| Symptom | First knob | Then |
|---------|------------|------|
| Capture too high | Increase `feeBps` | Increase `k` |
| Churn too high | Reduce pool count (hub model only) | Increase `k` |
| Intervention cost explodes | Increase latency penalty `rho` or widen bands | Add caps (`maxTradeSizeUnits`, `maxDailyNotional`) |
| Drain half-life too short | Increase `k` or lower depth | Consider `publicRoutingEnabled` false on defense pools |
| Path concentration too high | Widen topology or increase fee on dominant pools | Reduce single-pool magnetism |

---
## 5. Bridge shock modeling (Phase 0)

The **bridge shock** scenario (`bridge_shock_137_56.json`) is implemented as a **stress injection**, not as cross-chain path enumeration:

- Each epoch during the shock window, the sim adds **paired local trades**: sell cW→hub on the “from” chain (137), buy cW with hub on the “to” chain (56), at a magnitude that sums to the configured migration over the window.
- This tests **corridor defense under forced migration** (can arb + bot keep deviation and intervention in check?), which is what matters operationally for Phase 0.
- It does **not** model a router endogenously choosing to bridge because it’s cheaper; that requires cross-chain path selection (PR#3). When you add cross-chain routing, you can validate whether the same stress emerges from router equilibrium.
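As a rough illustration of the paired-trade injection described above, the sketch below splits a configured migration evenly across the shock window and emits one sell on the source chain and one buy on the destination chain per epoch. The even split, the function name, the trade dictionary shape, and the inventory figure are all assumptions for illustration; the simulator's actual scheduling may differ.

```python
def shock_trades(inventory, migration_fraction, window_epochs, from_chain, to_chain, start_epoch=0):
    """Build paired local trades whose per-epoch sizes sum to the configured migration."""
    per_epoch = inventory * migration_fraction / window_epochs
    trades = []
    for epoch in range(start_epoch, start_epoch + window_epochs):
        # Sell cW for hub on the source chain, buy cW with hub on the destination chain.
        trades.append({"epoch": epoch, "chain": from_chain, "side": "sell_cw", "amount": per_epoch})
        trades.append({"epoch": epoch, "chain": to_chain, "side": "buy_cw", "amount": per_epoch})
    return trades


# 5% migration from chain 137 to chain 56 over 24 epochs; the inventory is a made-up figure.
trades = shock_trades(1_000_000, 0.05, 24, from_chain=137, to_chain=56)
sold = sum(t["amount"] for t in trades if t["side"] == "sell_cw")
bought = sum(t["amount"] for t in trades if t["side"] == "buy_cw")
print(len(trades), round(sold, 6), round(bought, 6))
```

Because the trades are injected rather than chosen by a router, the total migrated amount is fixed by configuration regardless of pool prices.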

Be explicit when interpreting results: shock metrics answer “given forced migration, does the system damp?” not “does routing naturally push flow across chains?”

---
## 6. Confirming EUR defaults

Run the **hub-only baseline** with (a) USD-only tokens and (b) USD + EUR tokens, then compare `churn_mean`, `churn_max`, `peak_deviation_bps`, and `intervention_cost_per_1M_volume`. If EUR tokens meaningfully worsen these metrics, increase `eurDefaults.k` (e.g. 0.25), widen the EUR bands in `peg-bands.json`, and/or add routing caps (`maxTradeSizeUnits`) for EUR pools.