smom-dbis-138/docs/deployment/VM_DEPLOYMENT_TROUBLESHOOTING.md
defiQUG 1fb7266469 Add Oracle Aggregator and CCIP Integration
- Introduced Aggregator.sol for Chainlink-compatible oracle functionality, including round-based updates and access control.
- Added OracleWithCCIP.sol to extend Aggregator with CCIP cross-chain messaging capabilities.
- Created .gitmodules to include OpenZeppelin contracts as a submodule.
- Developed a comprehensive deployment guide in NEXT_STEPS_COMPLETE_GUIDE.md for Phase 2 and smart contract deployment.
- Implemented Vite configuration for the orchestration portal, supporting both Vue and React frameworks.
- Added server-side logic for the Multi-Cloud Orchestration Portal, including API endpoints for environment management and monitoring.
- Created scripts for resource import and usage validation across non-US regions.
- Added tests for CCIP error handling and integration to ensure robust functionality.
- Included various new files and directories for the orchestration portal and deployment scripts.
2025-12-12 14:57:48 -08:00

# VM Deployment Troubleshooting Guide
## Common Issues and Solutions
### VM Not Accessible
**Symptoms:**
- Cannot SSH into VM
- Ping fails
- Connection timeout
**Solutions:**
1. Check VM status:
```bash
az vm show --resource-group $RESOURCE_GROUP --name $VM_NAME --show-details
```
2. Check Network Security Group rules:
```bash
az network nsg rule list --resource-group $RESOURCE_GROUP --nsg-name $NSG_NAME
```
3. Restart VM:
```bash
az vm restart --resource-group $RESOURCE_GROUP --name $VM_NAME
```
4. Check public IP:
```bash
az vm show --resource-group $RESOURCE_GROUP --name $VM_NAME --show-details --query "publicIps" -o tsv
```
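The commands above assume that `RESOURCE_GROUP`, `VM_NAME`, and `NSG_NAME` are already set in the current shell; this guide does not say where they come from. As a minimal sketch, a guard like the following fails early with a clear message instead of letting `az` run with empty arguments. The helper name `require_vars` is hypothetical and not part of the repository.

```bash
# Hypothetical helper (an assumption, not from the repository): verify that
# every named variable is set and non-empty before running the az commands.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing required variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: require_vars RESOURCE_GROUP VM_NAME NSG_NAME || exit 1
```

The function reports every missing variable in one pass rather than stopping at the first, which tends to save a round trip when several are unset.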
### Besu Container Not Starting
**Symptoms:**
- Container exits immediately
- Container status shows "Exited"
- No logs available
**Solutions:**
1. Check container logs:
```bash
ssh besuadmin@$VM_IP "docker logs besu-validator-0"
```
2. Check Docker service:
```bash
ssh besuadmin@$VM_IP "systemctl status docker"
```
3. Check systemd service:
```bash
ssh besuadmin@$VM_IP "systemctl status besu.service"
```
4. Check configuration file:
```bash
ssh besuadmin@$VM_IP "cat /opt/besu/config/besu-config.toml"
```
5. Check disk space:
```bash
ssh besuadmin@$VM_IP "df -h"
```
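When the container shows "Exited" and the logs are empty, the exit code is often the quickest hint. The sketch below maps a few well-known codes to likely causes. The mapping follows general Docker and Unix signal conventions, not anything Besu-specific, and the helper name `describe_exit_code` is hypothetical.

```bash
# Hypothetical helper: translate a container exit code into a likely cause.
# The mapping reflects common Docker/Unix conventions, not Besu-specific codes.
describe_exit_code() {
    case "$1" in
        0)   echo "exited cleanly (check the restart policy and the command)" ;;
        125) echo "docker run itself failed (bad flag or daemon error)" ;;
        126) echo "container command could not be invoked (permissions?)" ;;
        127) echo "container command not found" ;;
        137) echo "killed by SIGKILL (often the OOM killer)" ;;
        143) echo "terminated by SIGTERM (normal stop request)" ;;
        *)   echo "application error; inspect the container logs" ;;
    esac
}

# Example (container name taken from this guide):
#   code=$(ssh besuadmin@$VM_IP "docker inspect --format '{{.State.ExitCode}}' besu-validator-0")
#   describe_exit_code "$code"
```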
### Genesis File Not Found
**Symptoms:**
- Besu fails to start
- Error: "Genesis file not found"
**Solutions:**
1. Check if genesis file exists:
```bash
ssh besuadmin@$VM_IP "ls -la /opt/besu/config/genesis.json"
```
2. Download genesis file manually:
```bash
ssh besuadmin@$VM_IP "wget -O /opt/besu/config/genesis.json '$GENESIS_FILE_URL'"
```
3. Copy genesis file from local:
```bash
scp config/genesis.json besuadmin@$VM_IP:/opt/besu/config/genesis.json
```
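One caveat with the download step: `wget -O` creates the output file even when the download fails, so `ls` can show a `genesis.json` that is empty. The following sketch adds a basic sanity check. It assumes only that a Besu genesis file contains a top-level `"config"` key; the helper name `check_genesis` is hypothetical, and the check does not validate the JSON itself.

```bash
# Hypothetical helper: basic sanity check for a downloaded genesis file.
# It only catches empty or obviously wrong files; it does not validate JSON.
check_genesis() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi
    # Besu genesis files carry a top-level "config" object.
    if ! grep -q '"config"' "$file"; then
        echo "no \"config\" key found in $file" >&2
        return 1
    fi
    echo "ok: $file"
}

# Example, run on the VM: check_genesis /opt/besu/config/genesis.json
```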
### Validator Keys Not Found
**Symptoms:**
- Validator node fails to start
- Error: "Validator key not found"
**Solutions:**
1. Check keys directory:
```bash
ssh besuadmin@$VM_IP "ls -la /opt/besu/keys/"
```
2. Download keys from Key Vault:
```bash
az keyvault secret show --vault-name $KEY_VAULT_NAME --name "validator-key-0" --query value -o tsv | ssh besuadmin@$VM_IP "cat > /opt/besu/keys/validator-key.txt"
```
3. Set correct permissions:
```bash
ssh besuadmin@$VM_IP "chmod 600 /opt/besu/keys/*"
```
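To confirm that the permission fix took effect, it can help to list any key file that is still readable by group or others. A minimal sketch, assuming the key directory from this guide; the helper name `find_loose_keys` is hypothetical.

```bash
# Hypothetical helper: list regular files under a key directory whose mode is
# not exactly 600. Prints nothing when every file is locked down.
find_loose_keys() {
    find "$1" -type f ! -perm 600
}

# Example, run on the VM against the directory used in this guide:
#   find_loose_keys /opt/besu/keys
```

An empty result means every file matches mode 600. The sketch does not check file ownership, which may also matter if the container runs as a different user.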
### Network Connectivity Issues
**Symptoms:**
- Nodes cannot peer
- P2P connection fails
- RPC endpoint not accessible
**Solutions:**
1. Check P2P port:
```bash
telnet $SENTRY_IP 30303
```
2. Check RPC port:
```bash
curl http://$RPC_IP:8545
```
3. Check firewall rules:
```bash
ssh besuadmin@$VM_IP "sudo ufw status"
```
4. Check NSG rules:
```bash
az network nsg rule list --resource-group $RESOURCE_GROUP --nsg-name $NSG_NAME
```
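An open port shows only that something is listening; it does not show whether the node has peers. One way to check, sketched below, is to call the standard `net_peerCount` JSON-RPC method and convert the hex result to a decimal number. This assumes the NET API is enabled on the RPC endpoint. The helper name `rpc_result_to_decimal` is hypothetical, and it uses `sed` rather than a real JSON parser.

```bash
# Hypothetical helper: extract the hex "result" field from a JSON-RPC
# response and print it as a decimal number. Uses sed instead of jq so it
# runs on a minimal image; it is not a general JSON parser.
rpc_result_to_decimal() {
    hex=$(printf '%s\n' "$1" |
        sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F]\{1,\}\)".*/\1/p')
    [ -n "$hex" ] || return 1
    # $hex is guaranteed by the pattern above to be a plain 0x... literal.
    echo "$(($hex))"
}

# Example: ask the node how many peers it has.
#   response=$(curl -s -X POST -H 'Content-Type: application/json' \
#       --data '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}' \
#       "http://$RPC_IP:8545")
#   rpc_result_to_decimal "$response"
```

A result of 0 alongside an open P2P port usually points at peer configuration or NSG rules rather than at the process itself.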
### High Resource Usage
**Symptoms:**
- VM is slow
- High CPU usage
- High memory usage
**Solutions:**
1. Check resource usage:
```bash
ssh besuadmin@$VM_IP "top -bn1 | head -20"
ssh besuadmin@$VM_IP "docker stats --no-stream"
```
2. Check Besu JVM settings:
```bash
ssh besuadmin@$VM_IP "grep BESU_OPTS /opt/besu/docker-compose.yml"
```
3. Scale up VM:
```bash
az vm resize --resource-group $RESOURCE_GROUP --name $VM_NAME --size Standard_D8s_v3
```
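Before resizing the VM, it may be worth comparing the JVM heap limit with the memory the VM actually has. The sketch below pulls a `-Xmx` flag out of a `BESU_OPTS` line. Whether `BESU_OPTS` sets `-Xmx` at all in this deployment is an assumption, and the helper name `extract_max_heap` is hypothetical.

```bash
# Hypothetical helper: print the last -Xmx flag found in a BESU_OPTS line,
# or nothing if the line sets no explicit maximum heap.
extract_max_heap() {
    printf '%s\n' "$1" | grep -o -- '-Xmx[0-9]\{1,\}[kKmMgG]\{0,1\}' | tail -n 1
}

# Example:
#   line=$(ssh besuadmin@$VM_IP "grep BESU_OPTS /opt/besu/docker-compose.yml")
#   extract_max_heap "$line"
```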
### Disk Space Issues
**Symptoms:**
- Besu fails to write
- "No space left on device" error
**Solutions:**
1. Check disk usage:
```bash
ssh besuadmin@$VM_IP "df -h"
```
2. Clean up unused Docker data and old logs:
```bash
ssh besuadmin@$VM_IP "docker system prune -f"
ssh besuadmin@$VM_IP "find /opt/besu/logs -name '*.log' -mtime +7 -delete"
```
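3. Resize disk (the VM may need to be deallocated first; afterwards, expand the partition and filesystem inside the VM):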
```bash
az disk update --resource-group $RESOURCE_GROUP --name $DISK_NAME --size-gb 512
```
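The `df -h` output is easy to scan by eye on one VM, but a threshold filter is handier across several. The sketch below reads `df -P` output (the portable, one-line-per-filesystem format) and prints mount points at or above a given usage percentage. The helper name `full_filesystems` and the default of 80 are assumptions for illustration.

```bash
# Hypothetical helper: read `df -P` output on stdin and print every mount
# point whose usage is at or above the given percentage (default 80).
# Mount points containing spaces are not handled.
full_filesystems() {
    awk -v limit="${1:-80}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, use "%"
    }'
}

# Example: ssh besuadmin@$VM_IP "df -P" | full_filesystems 90
```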
### Cloud-init Issues
**Symptoms:**
- VM not configured properly
- Docker not installed
- Services not started
**Solutions:**
1. Check cloud-init logs:
```bash
ssh besuadmin@$VM_IP "sudo cat /var/log/cloud-init-output.log"
```
2. Re-run cloud-init:
```bash
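ssh besuadmin@$VM_IP "sudo cloud-init clean"
ssh besuadmin@$VM_IP "sudo cloud-init init"
ssh besuadmin@$VM_IP "sudo cloud-init modules --mode=config"
ssh besuadmin@$VM_IP "sudo cloud-init modules --mode=final"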
```
3. Manually run setup script:
```bash
ssh besuadmin@$VM_IP "sudo /opt/besu/setup.sh"
```
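Before re-running anything, it is worth checking whether cloud-init finished, is still running, or failed. `cloud-init status` prints a `status:` line for this, and `cloud-init status --wait` blocks until the run completes. A small sketch for scripting against that output; the helper name `cloud_init_state` is hypothetical.

```bash
# Hypothetical helper: read `cloud-init status` output on stdin and print
# just the state word (for example "done", "running", or "error").
cloud_init_state() {
    sed -n 's/^status:[[:space:]]*//p' | head -n 1
}

# Example:
#   state=$(ssh besuadmin@$VM_IP "cloud-init status" | cloud_init_state)
#   [ "$state" = "done" ] || echo "cloud-init state: $state"
```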
### Key Vault Access Issues
**Symptoms:**
- Cannot download keys from Key Vault
- "Access denied" error
**Solutions:**
1. Check Managed Identity:
```bash
az vm identity show --resource-group $RESOURCE_GROUP --name $VM_NAME
```
2. Check Key Vault access policy:
```bash
az keyvault show --name $KEY_VAULT_NAME --query "properties.accessPolicies"
```
3. Add access policy:
```bash
PRINCIPAL_ID=$(az vm identity show --resource-group $RESOURCE_GROUP --name $VM_NAME --query "principalId" -o tsv)
az keyvault set-policy --name $KEY_VAULT_NAME --object-id $PRINCIPAL_ID --secret-permissions get list
```
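Access policies apply only when the vault uses the access-policy permission model. If the vault was created with Azure RBAC authorization instead, `az keyvault set-policy` does not grant access, and a role assignment (for example the built-in "Key Vault Secrets User" role) is needed. Which model this deployment uses is not stated in this guide, so the sketch below only classifies the value of the vault's `enableRbacAuthorization` property; the helper name `vault_access_model` is hypothetical.

```bash
# Hypothetical helper: classify the permission model from the value of the
# vault's enableRbacAuthorization property. An empty or false value is
# treated as the access-policy model.
vault_access_model() {
    case "$1" in
        true|True) echo "rbac" ;;
        *)         echo "access-policy" ;;
    esac
}

# Example:
#   flag=$(az keyvault show --name $KEY_VAULT_NAME \
#       --query "properties.enableRbacAuthorization" -o tsv)
#   vault_access_model "$flag"
```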
## Diagnostic Commands
### Check VM Status
```bash
az vm list --resource-group $RESOURCE_GROUP --show-details
```
### Check Container Status
```bash
ssh besuadmin@$VM_IP "docker ps -a"
```
### Check Service Status
```bash
ssh besuadmin@$VM_IP "systemctl status besu.service"
```
### Check Logs
```bash
# Besu logs
ssh besuadmin@$VM_IP "docker logs besu-validator-0"
# System logs
ssh besuadmin@$VM_IP "journalctl -u besu.service -n 100"
# Cloud-init logs
ssh besuadmin@$VM_IP "sudo cat /var/log/cloud-init-output.log"
```
### Check Network
```bash
# Check connectivity
ping $VM_IP
# Check ports
nmap -p 30303,8545,8546,9545 $VM_IP
# Check reverse DNS for the IP
nslookup $VM_IP
```
### Check Resources
```bash
# CPU and memory
ssh besuadmin@$VM_IP "top -bn1 | head -20"
# Disk usage
ssh besuadmin@$VM_IP "df -h"
# Network usage (iftop is interactive and needs root, so request a TTY)
ssh -t besuadmin@$VM_IP "sudo iftop"
```
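Most diagnostics above go through `ssh besuadmin@$VM_IP "..."`. When the VM is unreachable, a bare `ssh` can hang for a long time or stop at a password prompt. The sketch below wraps the call with a connection timeout and batch mode. It assumes key-based authentication, since `BatchMode=yes` disables password prompts; the wrapper name `run_remote`, the 10-second timeout, and the `DRY_RUN` switch are all hypothetical.

```bash
# Hypothetical wrapper around the ssh calls in this guide. BatchMode makes
# ssh fail instead of prompting for a password, and ConnectTimeout bounds the
# wait when the VM is unreachable. Set DRY_RUN=1 to print the command only.
run_remote() {
    host=$1
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ssh -o BatchMode=yes -o ConnectTimeout=10 besuadmin@$host $*"
    else
        ssh -o BatchMode=yes -o ConnectTimeout=10 "besuadmin@$host" "$@"
    fi
}

# Example: run_remote "$VM_IP" "docker ps -a"
```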
## Getting Help
If you encounter issues not covered here:
1. Check the [main troubleshooting guide](../docs/TROUBLESHOOTING.md)
2. Review [VM deployment documentation](VM_DEPLOYMENT.md)
3. Check Besu logs for detailed error messages
4. Review Azure VM logs in the Azure Portal
5. Check Network Security Group rules
6. Verify Key Vault access policies
## Prevention
To prevent common issues:
1. **Regular Monitoring**: Use monitoring scripts to catch issues early
2. **Backup**: Regularly back up VM data
3. **Updates**: Keep VMs and Docker images updated
4. **Resource Planning**: Monitor resource usage and scale as needed
5. **Security**: Regularly review and update NSG rules and Key Vault policies