smom-dbis-138/orchestration/portal/MONITORING_SETUP.md
defiQUG 1fb7266469 Add Oracle Aggregator and CCIP Integration
- Introduced Aggregator.sol for Chainlink-compatible oracle functionality, including round-based updates and access control.
- Added OracleWithCCIP.sol to extend Aggregator with CCIP cross-chain messaging capabilities.
- Created .gitmodules to include OpenZeppelin contracts as a submodule.
- Developed a comprehensive deployment guide in NEXT_STEPS_COMPLETE_GUIDE.md for Phase 2 and smart contract deployment.
- Implemented Vite configuration for the orchestration portal, supporting both Vue and React frameworks.
- Added server-side logic for the Multi-Cloud Orchestration Portal, including API endpoints for environment management and monitoring.
- Created scripts for resource import and usage validation across non-US regions.
- Added tests for CCIP error handling and integration to ensure robust functionality.
- Included various new files and directories for the orchestration portal and deployment scripts.
2025-12-12 14:57:48 -08:00

# 🔍 Service Monitoring - Besu, Cacti, Firefly, Chainlink CCIP
## Overview
The orchestration portal includes a monitoring system for its blockchain and interoperability services: Besu, Cacti, Firefly, and Chainlink CCIP.
## Monitored Services
### 1. **Besu (Hyperledger Besu)**
- Block number and chain status
- Peer count and network connectivity
- Sync status (syncing/synced/behind)
- Gas price and network metrics
- Pending transactions
- Resource usage (CPU, memory, disk)
- Chain ID and network ID
### 2. **Cacti (Hyperledger Cacti)**
- Connector count
- Active connections
- Transaction metrics (total, pending, failed)
- Average latency
- Resource usage (CPU, memory)
### 3. **Firefly (Hyperledger Firefly)**
- Namespace count
- Active APIs
- Transaction metrics (total, pending, failed)
- Average latency
- Resource usage (CPU, memory)
- Database connections
### 4. **Chainlink CCIP (Cross-Chain Interoperability Protocol)**
- Router address and configuration
- Active chains count
- Message metrics (total, pending, failed)
- Average latency
- Token transfers
- Fee token balance
- Resource usage (CPU, memory)
## Features
### ✅ Implemented
1. **Monitoring API Endpoints**
   - `/api/monitoring/dashboard` - Overall health summary
   - `/api/monitoring/besu` - Besu metrics
   - `/api/monitoring/cacti` - Cacti metrics
   - `/api/monitoring/firefly` - Firefly metrics
   - `/api/monitoring/chainlink-ccip` - Chainlink CCIP metrics
   - `/api/monitoring/services/:type/:name/status` - Service status
   - `/api/monitoring/environments/:name/collect` - Collect all metrics
2. **Database Schema**
   - `service_monitoring` table for metrics storage
   - Indexed for fast queries
   - Supports historical data
3. **Monitoring Service**
   - `MonitoringService` class for metrics collection
   - Simulated metrics (ready for Prometheus integration)
   - Health status calculation
   - Automatic metric storage
4. **Vue Dashboard**
   - `MonitoringDashboard.vue` - Main monitoring view
   - Service-specific components:
     - `BesuMonitoring.vue`
     - `CactiMonitoring.vue`
     - `FireflyMonitoring.vue`
     - `ChainlinkCCIPMonitoring.vue`
   - `ServiceHealthCard.vue` - Health overview cards
   - `MetricCard.vue` - Individual metric display
5. **Real-time Updates**
   - WebSocket integration
   - Auto-refresh every 30 seconds
   - Live updates on metric collection
## API Usage
### Get Monitoring Dashboard
```http
GET /api/monitoring/dashboard
```
Response:
```json
{
  "besu": {
    "total": 5,
    "healthy": 4,
    "degraded": 1,
    "unhealthy": 0,
    "services": [...]
  },
  "cacti": {...},
  "firefly": {...},
  "chainlinkCcip": {...}
}
```
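The per-service counts in this response can be derived from a list of service statuses. The following is a minimal sketch of that aggregation, not the portal's actual code: the `ServiceEntry` shape and the `summarize` helper are assumptions, since the `services` array is elided in the example.

```typescript
// Hypothetical types inferred from the example response; the real API
// may carry additional fields inside each "services" entry.
type HealthStatus = "healthy" | "degraded" | "unhealthy";

interface ServiceEntry {
  name: string;
  status: HealthStatus;
}

interface ServiceSummary {
  total: number;
  healthy: number;
  degraded: number;
  unhealthy: number;
  services: ServiceEntry[];
}

// Count services per status to build one summary block (e.g. the "besu" key).
function summarize(services: ServiceEntry[]): ServiceSummary {
  const summary: ServiceSummary = {
    total: services.length,
    healthy: 0,
    degraded: 0,
    unhealthy: 0,
    services,
  };
  for (const s of services) {
    summary[s.status] += 1;
  }
  return summary;
}

const besu = summarize([
  { name: "besu-validator-1", status: "healthy" },
  { name: "besu-validator-2", status: "degraded" },
]);
console.log(besu.total, besu.healthy, besu.degraded, besu.unhealthy);
```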
### Get Besu Metrics
```http
GET /api/monitoring/besu?environment=workload-azure-eastus&service=besu-validator-1
```
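A client can assemble this request with the standard `URL` API so that filter values are encoded correctly. This is a sketch under assumptions: `monitoringUrl` and `MetricsFilter` are hypothetical names, and only the `environment` and `service` query parameters shown in the example are taken from this document.

```typescript
// Hypothetical filter type; both parameters are optional in this sketch.
interface MetricsFilter {
  environment?: string;
  service?: string;
}

// Build a metrics URL for one service type (besu, cacti, firefly, chainlink-ccip).
function monitoringUrl(
  baseUrl: string,
  serviceType: string,
  filter: MetricsFilter = {},
): string {
  const url = new URL(
    `/api/monitoring/${encodeURIComponent(serviceType)}`,
    baseUrl,
  );
  if (filter.environment) url.searchParams.set("environment", filter.environment);
  if (filter.service) url.searchParams.set("service", filter.service);
  return url.toString();
}

console.log(
  monitoringUrl("http://localhost:5000", "besu", {
    environment: "workload-azure-eastus",
    service: "besu-validator-1",
  }),
);
```

The resulting string can be passed to `fetch` or any other HTTP client.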
### Collect All Metrics
```http
POST /api/monitoring/environments/workload-azure-eastus/collect
```
## Database Schema
### service_monitoring
```sql
CREATE TABLE service_monitoring (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  service_name TEXT NOT NULL,
  service_type TEXT NOT NULL,
  environment TEXT,
  metric_name TEXT NOT NULL,
  metric_value REAL NOT NULL,
  metric_unit TEXT,
  status TEXT NOT NULL,
  timestamp TEXT NOT NULL,
  metadata TEXT
);
```
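On the application side, a collected metric has to be mapped onto these columns before it is stored. A minimal sketch follows, assuming that `timestamp` holds ISO 8601 text and `metadata` holds JSON-encoded text (the schema only says `TEXT`); `MetricSample` and `toRow` are hypothetical names.

```typescript
// Row shape mirroring the service_monitoring columns (id is assigned by the database).
interface ServiceMonitoringRow {
  service_name: string;
  service_type: string;
  environment: string | null;
  metric_name: string;
  metric_value: number;
  metric_unit: string | null;
  status: string;
  timestamp: string; // assumed: ISO 8601 text
  metadata: string | null; // assumed: JSON-encoded text
}

// Hypothetical in-memory representation of one collected metric.
interface MetricSample {
  serviceName: string;
  serviceType: string;
  environment?: string;
  name: string;
  value: number;
  unit?: string;
  status: string;
  metadata?: Record<string, unknown>;
}

// Convert a sample into a row; nullable columns default to null.
function toRow(sample: MetricSample, now: Date = new Date()): ServiceMonitoringRow {
  return {
    service_name: sample.serviceName,
    service_type: sample.serviceType,
    environment: sample.environment ?? null,
    metric_name: sample.name,
    metric_value: sample.value,
    metric_unit: sample.unit ?? null,
    status: sample.status,
    timestamp: now.toISOString(),
    metadata: sample.metadata ? JSON.stringify(sample.metadata) : null,
  };
}

console.log(
  toRow({
    serviceName: "besu-validator-1",
    serviceType: "besu",
    environment: "workload-azure-eastus",
    name: "peer_count",
    value: 8,
    status: "healthy",
  }),
);
```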
## Integration with Prometheus
The monitoring service is designed to integrate with Prometheus. Replace the simulated metrics with actual Prometheus queries:
```typescript
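// Example: replace the simulated collectBesuMetrics with Prometheus queries.
// Presumably a method of MonitoringService; `prometheus` stands for a query client,
// and the metric names are illustrative, so check them against the Besu /metrics output.
async collectBesuMetrics(environment: string, serviceName: string): Promise<BesuMetrics> {
  // Query Prometheus for the current block number
  const blockNumber = await prometheus.query(
    `besu_blockchain_blockNumber{instance="${serviceName}"}`
  );
  // Query Prometheus for the connected peer count
  const peerCount = await prometheus.query(
    `besu_network_peers{instance="${serviceName}"}`
  );
  // ... etc
}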
```
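The `prometheus.query` client above is not defined in this document. One possible way to back it is the Prometheus HTTP API, whose instant-query endpoint (`/api/v1/query`) returns sample values as strings inside a vector result. The sketch below covers only URL construction and response parsing; `instantQueryUrl` and `firstValue` are hypothetical helper names.

```typescript
// Subset of the Prometheus instant-query response used by this sketch.
interface PromSample {
  metric: Record<string, string>;
  value: [number, string]; // [unix timestamp, sample value as a string]
}

interface PromQueryResponse {
  status: "success" | "error";
  data?: { resultType: string; result: PromSample[] };
  error?: string;
}

// Build the instant-query URL for a PromQL expression.
function instantQueryUrl(prometheusUrl: string, promql: string): string {
  const url = new URL("/api/v1/query", prometheusUrl);
  url.searchParams.set("query", promql);
  return url.toString();
}

// Return the first sample of a vector result as a number, or null when
// the query failed, matched nothing, or returned a non-numeric value.
function firstValue(resp: PromQueryResponse): number | null {
  if (resp.status !== "success" || !resp.data) return null;
  if (resp.data.resultType !== "vector") return null;
  const first = resp.data.result[0];
  if (!first) return null;
  const parsed = Number(first.value[1]);
  return Number.isNaN(parsed) ? null : parsed;
}

const sample: PromQueryResponse = {
  status: "success",
  data: {
    resultType: "vector",
    result: [{ metric: { instance: "besu-validator-1" }, value: [1700000000, "1234"] }],
  },
};
console.log(firstValue(sample));
```

Returning `null` instead of throwing lets the caller decide whether a missing metric should count as degraded or simply be skipped.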
## Metrics Collected
### Besu
- `block_number` - Current block number
- `peer_count` - Number of connected peers
- `cpu_usage` - CPU utilization percentage
- `memory_usage` - Memory utilization percentage
- `disk_usage` - Disk utilization percentage
### Cacti
- `connector_count` - Number of connectors
- `active_connections` - Active connection count
- `failed_transactions` - Failed transaction count
- `average_latency` - Average transaction latency
### Firefly
- `namespace_count` - Number of namespaces
- `active_apis` - Active API count
- `failed_transactions` - Failed transaction count
- `database_connections` - Database connection count
### Chainlink CCIP
- `active_chains` - Number of active chains
- `total_messages` - Total message count
- `failed_messages` - Failed message count
- `average_latency` - Average message latency
## Health Status Calculation
### Healthy
- All critical metrics within normal ranges
- No failed operations above threshold
- Resource usage below warning levels
### Degraded
- Some metrics outside optimal range
- Increased failed operations
- Resource usage approaching limits
### Unhealthy
- Critical metrics in danger zone
- High failure rates
- Resource usage at critical levels
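
The three levels can be expressed as a small classification function. This document does not state the actual thresholds, so every number below is a placeholder assumption, and `classify`, `HealthInput`, and `Thresholds` are hypothetical names.

```typescript
type HealthStatus = "healthy" | "degraded" | "unhealthy";

interface HealthInput {
  cpuUsage: number; // percent
  memoryUsage: number; // percent
  failed: number; // failed operations in the sampling window
  total: number; // all operations in the sampling window
}

interface Thresholds {
  resourceWarn: number;
  resourceCritical: number;
  failureWarn: number;
  failureCritical: number;
}

// Placeholder values, NOT taken from the monitoring service.
const DEFAULT_THRESHOLDS: Thresholds = {
  resourceWarn: 75,
  resourceCritical: 90,
  failureWarn: 0.01,
  failureCritical: 0.05,
};

// The worst signal wins: any critical reading makes the service unhealthy,
// any warning-level reading makes it degraded.
function classify(input: HealthInput, t: Thresholds = DEFAULT_THRESHOLDS): HealthStatus {
  const peak = Math.max(input.cpuUsage, input.memoryUsage);
  const failureRate = input.total > 0 ? input.failed / input.total : 0;
  if (peak >= t.resourceCritical || failureRate >= t.failureCritical) return "unhealthy";
  if (peak >= t.resourceWarn || failureRate >= t.failureWarn) return "degraded";
  return "healthy";
}

console.log(classify({ cpuUsage: 80, memoryUsage: 50, failed: 0, total: 100 }));
```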
## Access
**URL**: `http://localhost:5000/monitoring` or `/monitoring` in the Vue app
## Future Enhancements
- [ ] Prometheus integration (replace simulated metrics)
- [ ] Grafana dashboards export
- [ ] Alert rules and thresholds
- [ ] Historical trend analysis
- [ ] Custom metric queries
- [ ] Service-specific dashboards
- [ ] Export metrics to CSV/JSON
- [ ] Metric comparison across environments
- [ ] Performance benchmarking
## Configuration
### Environment Variables
```bash
# Prometheus endpoint (when integrated)
PROMETHEUS_URL=http://prometheus:9090
# Metrics collection interval in milliseconds (30 seconds)
METRICS_COLLECTION_INTERVAL=30000
```
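A loader for these two variables might look like the sketch below. The fallback values mirror the examples in this section and are assumptions rather than documented defaults; `loadConfig` and `MonitoringConfig` are hypothetical names.

```typescript
interface MonitoringConfig {
  prometheusUrl: string;
  collectionIntervalMs: number;
}

// Read monitoring settings from an environment-like map.
// Invalid or missing interval values fall back to 30000 ms.
function loadConfig(env: Record<string, string | undefined>): MonitoringConfig {
  const raw = env["METRICS_COLLECTION_INTERVAL"];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return {
    prometheusUrl: env["PROMETHEUS_URL"] ?? "http://prometheus:9090",
    collectionIntervalMs: Number.isFinite(parsed) && parsed > 0 ? parsed : 30000,
  };
}

console.log(loadConfig({ METRICS_COLLECTION_INTERVAL: "15000" }));
```

In a Node.js server this would typically be called as `loadConfig(process.env)`.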
## Testing
Test monitoring endpoints:
```bash
# Get dashboard
curl http://localhost:5000/api/monitoring/dashboard
# Get Besu metrics
curl "http://localhost:5000/api/monitoring/besu?environment=workload-azure-eastus"
# Collect metrics
curl -X POST http://localhost:5000/api/monitoring/environments/workload-azure-eastus/collect
```
---
**Status**: ✅ **Monitoring system implemented with simulated metrics; Prometheus integration pending**
**Last Updated**: 2024-11-19
**Version**: 1.0.0