Enhance Proxmox VM deployment documentation
- Added a reference to the comprehensive VM Deployment Plan for better deployment strategy understanding.
- Included a quick start guide for deploying infrastructure VMs.
- Emphasized the importance of reviewing the VM Deployment Plan before deployment to optimize resource allocation.
- Updated the documentation index to include the new VM Deployment Plan link for improved navigation.
@@ -382,6 +382,9 @@ kubectl apply -f gitops/apps/monitoring/
|
||||
|
||||
### 6.4 Proxmox VM Deployment
|
||||
|
||||
**See**: [VM Deployment Plan](vm/VM_DEPLOYMENT_PLAN.md) for the full deployment strategy, resource allocation, and phased deployment approach.
|
||||
|
||||
**Quick Start**:
|
||||
```bash
|
||||
# 1. Deploy infrastructure VMs first
|
||||
kubectl apply -f examples/production/nginx-proxy-vm.yaml
|
||||
@@ -397,6 +400,8 @@ kubectl get proxmoxvm -A -w
|
||||
kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox --tail=50 -f
|
||||
```
|
||||
|
||||
**Important**: Review the [VM Deployment Plan](vm/VM_DEPLOYMENT_PLAN.md) before deployment to understand the resource constraints and the revised allocation strategy.
|
||||
|
||||
### 6.5 GitOps Setup (ArgoCD)
|
||||
|
||||
```bash
|
||||
@@ -621,6 +626,7 @@ k6 run scripts/k6-load-test.js
|
||||
- **[Deployment Plan](./deployment_plan.md)** - Phased rollout plan
|
||||
- **[System Architecture](./system_architecture.md)** - Overall architecture
|
||||
- **[Hardware BOM](./hardware_bom.md)** - Hardware specifications
|
||||
- **[VM Deployment Plan](vm/VM_DEPLOYMENT_PLAN.md)** - Comprehensive VM deployment plan with resource allocation
|
||||
- **[VM Specifications](vm/VM_SPECIFICATIONS.md)** - Complete VM specifications and deployment patterns
|
||||
- **[VM Creation Procedure](vm/VM_CREATION_PROCEDURE.md)** - Step-by-step VM deployment guide
|
||||
|
||||
|
||||
547
docs/vm/VM_DEPLOYMENT_PLAN.md
Normal file
@@ -0,0 +1,547 @@
|
||||
# VM Deployment Plan
|
||||
|
||||
**Date**: 2025-01-XX
|
||||
**Status**: Ready for Review
|
||||
**Version**: 2.0
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
This document provides a comprehensive deployment plan for all virtual machines in the Sankofa Phoenix infrastructure. The plan includes hardware capabilities, resource allocation, deployment priorities, and step-by-step deployment procedures.
|
||||
|
||||
### Key Constraints
|
||||
|
||||
- **ML110-01 (Site-1)**: 6 CPU cores, 256 GB RAM
|
||||
- **R630-01 (Site-2)**: 28 CPU cores, 768 GB RAM
|
||||
- **Total VMs to Deploy**: 30 VMs
|
||||
- **Deployment Method**: Crossplane Proxmox Provider via Kubernetes
|
||||
|
||||
---
|
||||
|
||||
## Hardware Capabilities
|
||||
|
||||
### Site-1: ML110-01
|
||||
|
||||
**Location**: 192.168.11.10
|
||||
**Hardware Specifications**:
|
||||
- **CPU**: Intel Xeon E5-2603 v3 @ 1.60GHz
|
||||
- **CPU Cores**: 6 cores (6 threads, no hyperthreading)
|
||||
- **RAM**: 256 GB (251 GiB usable, ~244 GB available for VMs)
|
||||
- **Storage**:
|
||||
- local-lvm: 794.3 GB available
|
||||
- ceph-fs: 384 GB available
|
||||
- **Network**: vmbr0 (1GbE)
|
||||
|
||||
**Resource Allocation Strategy**:
|
||||
- Reserve 1 core for Proxmox host (5 cores available for VMs)
|
||||
- Reserve 8 GB RAM for Proxmox host (~248 GB available for VMs)
|
||||
- Suitable for: Light-to-medium workloads, infrastructure services
|
||||
|
||||
### Site-2: R630-01
|
||||
|
||||
**Location**: 192.168.11.11
|
||||
**Hardware Specifications**:
|
||||
- **CPU**: Intel Xeon E5-2660 v4 @ 2.00GHz (dual socket)
|
||||
- **CPU Cores**: 28 cores (56 threads with hyperthreading)
|
||||
- **RAM**: 768 GB (755 GiB usable, ~744 GB available for VMs)
|
||||
- **Storage**:
|
||||
- local-lvm: 171.3 GB available
|
||||
- Ceph OSD: Configured
|
||||
- **Network**: vmbr0 (10GbE capable)
|
||||
|
||||
**Resource Allocation Strategy**:
|
||||
- Reserve 2 cores for Proxmox host (26 cores available for VMs)
|
||||
- Reserve 16 GB RAM for Proxmox host (~752 GB available for VMs)
|
||||
- Suitable for: High-resource workloads, compute-intensive applications, blockchain nodes
|
||||
|
||||
---
|
||||
|
||||
## VM Inventory and Resource Requirements
|
||||
|
||||
### Summary Statistics
|
||||
|
||||
| Category | Count | Total CPU | Total RAM | Total Disk |
|----------|-------|-----------|-----------|------------|
| **Phoenix Infrastructure** | 8 | 36+ cores | 88+ GiB | 1,150+ GiB |
| **Core Infrastructure** | 2 | 4 cores | 8 GiB | 30 GiB |
| **SMOM-DBIS-138 Blockchain** | 16 | 72 cores | 144 GiB | 320 GiB |
| **Test/Example VMs** | 4 | 16 cores | 32 GiB | 200 GiB |
| **TOTAL** | **30** | **128+ cores** | **272+ GiB** | **1,700+ GiB** |

Figures marked `+` exclude the three Phoenix gateways whose resources are still TBD. All other figures are the sums of the per-VM allocations listed in the deployment schedule.
|
||||
|
||||
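**Note**: The CPU total (128 cores) far exceeds the 31 cores available to VMs across both nodes combined (5 on ML110-01, 26 on R630-01), so distributing VMs across the two nodes does not resolve the shortfall by itself. See the Resource Allocation Analysis and Revised Deployment Plan sections.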
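
The gap between requested and allocatable CPU can be checked with a few lines of shell. This is a minimal sketch using only figures stated in this plan; the helper names `allocatable` and `sum` are illustrative and not part of any project script.

```bash
#!/usr/bin/env bash
# Sketch: compare requested vCPUs with the cores left after host reservations.
# Helper names are hypothetical; the numbers come from this plan.

# Cores available to VMs on a node: physical cores minus host reservation.
allocatable() { echo $(( $1 - $2 )); }

# Sum any number of integer arguments.
sum() {
  local total=0 n
  for n in "$@"; do total=$(( total + n )); done
  echo "$total"
}

ml110=$(allocatable 6 1)      # ML110-01: 6 cores, 1 reserved for the host
r630=$(allocatable 28 2)      # R630-01: 28 cores, 2 reserved for the host
capacity=$(sum "$ml110" "$r630")
requested=128                 # TOTAL row of the summary table

echo "allocatable cores: $capacity"
echo "requested vCPUs:   $requested"
if [ "$requested" -gt "$capacity" ]; then
  echo "CPU is overcommitted across both nodes combined"
fi
```

The comparison treats one vCPU as one physical core, which is the same conservative assumption the rest of this plan makes.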
|
||||
|
||||
---
|
||||
|
||||
## VM Deployment Schedule
|
||||
|
||||
### Phase 1: Core Infrastructure (Priority: CRITICAL)
|
||||
|
||||
**Deployment Order**: Deploy these first as they support other services.
|
||||
|
||||
#### 1.1 Nginx Proxy VM
|
||||
- **Node**: ml110-01
|
||||
- **Site**: site-1
|
||||
- **Resources**: 2 CPU, 4 GiB RAM, 20 GiB disk
|
||||
- **Purpose**: Reverse proxy and SSL termination
|
||||
- **Dependencies**: None
|
||||
- **Deployment File**: `examples/production/nginx-proxy-vm.yaml`
|
||||
|
||||
#### 1.2 Cloudflare Tunnel VM
|
||||
- **Node**: r630-01
|
||||
- **Site**: site-2
|
||||
- **Resources**: 2 CPU, 4 GiB RAM, 10 GiB disk
|
||||
- **Purpose**: Cloudflare Tunnel for secure outbound connectivity
|
||||
- **Dependencies**: None
|
||||
- **Deployment File**: `examples/production/cloudflare-tunnel-vm.yaml`
|
||||
|
||||
**Phase 1 Resource Usage**:
|
||||
- **ML110-01**: 2 CPU, 4 GiB RAM, 20 GiB disk
|
||||
- **R630-01**: 2 CPU, 4 GiB RAM, 10 GiB disk
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Phoenix Infrastructure Services (Priority: HIGH)
|
||||
|
||||
**Deployment Order**: Deploy in dependency order.
|
||||
|
||||
#### 2.1 DNS Primary Server
|
||||
- **Node**: ml110-01
|
||||
- **Site**: site-1
|
||||
- **Resources**: 4 CPU, 8 GiB RAM, 50 GiB disk
|
||||
- **Purpose**: Primary DNS server (BIND9)
|
||||
- **Dependencies**: None
|
||||
- **Deployment File**: `examples/production/phoenix/dns-primary.yaml`
|
||||
|
||||
#### 2.2 Git Server
|
||||
- **Node**: ml110-01
|
||||
- **Site**: site-1
|
||||
- **Resources**: 8 CPU, 16 GiB RAM, 500 GiB disk
|
||||
- **Purpose**: Git repository hosting (Gitea/GitLab)
|
||||
- **Dependencies**: DNS (optional)
|
||||
- **Deployment File**: `examples/production/phoenix/git-server.yaml`
|
||||
|
||||
#### 2.3 Email Server
|
||||
- **Node**: ml110-01
|
||||
- **Site**: site-1
|
||||
- **Resources**: 8 CPU, 16 GiB RAM, 200 GiB disk
|
||||
- **Purpose**: Email services (Postfix/Dovecot)
|
||||
- **Dependencies**: DNS (optional)
|
||||
- **Deployment File**: `examples/production/phoenix/email-server.yaml`
|
||||
|
||||
#### 2.4 DevOps Runner
|
||||
- **Node**: ml110-01
|
||||
- **Site**: site-1
|
||||
- **Resources**: 8 CPU, 16 GiB RAM, 200 GiB disk
|
||||
- **Purpose**: CI/CD runner (Jenkins/GitLab Runner)
|
||||
- **Dependencies**: Git Server (optional)
|
||||
- **Deployment File**: `examples/production/phoenix/devops-runner.yaml`
|
||||
|
||||
#### 2.5 Codespaces IDE
|
||||
- **Node**: ml110-01
|
||||
- **Site**: site-1
|
||||
- **Resources**: 8 CPU, 32 GiB RAM, 200 GiB disk
|
||||
- **Purpose**: Cloud IDE (code-server)
|
||||
- **Dependencies**: None
|
||||
- **Deployment File**: `examples/production/phoenix/codespaces-ide.yaml`
|
||||
|
||||
#### 2.6 AS4 Gateway
|
||||
- **Node**: ml110-01
|
||||
- **Site**: site-1
|
||||
- **Resources**: TBD
|
||||
- **Purpose**: AS4 messaging gateway
|
||||
- **Dependencies**: DNS, Email
|
||||
- **Deployment File**: `examples/production/phoenix/as4-gateway.yaml`
|
||||
|
||||
#### 2.7 Business Integration Gateway
|
||||
- **Node**: ml110-01
|
||||
- **Site**: site-1
|
||||
- **Resources**: TBD
|
||||
- **Purpose**: Business integration services
|
||||
- **Dependencies**: DNS
|
||||
- **Deployment File**: `examples/production/phoenix/business-integration-gateway.yaml`
|
||||
|
||||
#### 2.8 Financial Messaging Gateway
|
||||
- **Node**: ml110-01
|
||||
- **Site**: site-1
|
||||
- **Resources**: TBD
|
||||
- **Purpose**: Financial messaging services
|
||||
- **Dependencies**: DNS
|
||||
- **Deployment File**: `examples/production/phoenix/financial-messaging-gateway.yaml`
|
||||
|
||||
**Phase 2 Resource Usage**:
|
||||
- **ML110-01**: 36+ CPU, 88+ GiB RAM, 1,150+ GiB disk (excluding the three gateways whose resources are TBD)
|
||||
- **R630-01**: 0 CPU, 0 GiB RAM, 0 GiB disk
|
||||
|
||||
**⚠️ WARNING**: Phase 2 exceeds ML110-01 CPU capacity (6 cores total, 5 after the host reservation). Some VMs need to be moved to R630-01 or have their allocations reduced.
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: SMOM-DBIS-138 Blockchain Infrastructure (Priority: HIGH)
|
||||
|
||||
**Deployment Order**: Deploy validators first, then sentries, then RPC nodes, then services.
|
||||
|
||||
#### 3.1 Validators (Site-1: ml110-01)
|
||||
- **smom-validator-01**: 6 CPU, 12 GiB RAM, 20 GiB disk
|
||||
- **smom-validator-02**: 6 CPU, 12 GiB RAM, 20 GiB disk
|
||||
- **smom-validator-03**: 6 CPU, 12 GiB RAM, 20 GiB disk
|
||||
- **smom-validator-04**: 6 CPU, 12 GiB RAM, 20 GiB disk
|
||||
- **Total**: 24 CPU, 48 GiB RAM, 80 GiB disk
|
||||
- **Deployment Files**: `examples/production/smom-dbis-138/validator-*.yaml`
|
||||
|
||||
**⚠️ WARNING**: The validators require 24 CPU cores, but ML110-01 has only 6 (5 after the host reservation). **RECOMMENDATION**: Move validators to R630-01 or reduce CPU allocation.
|
||||
|
||||
#### 3.2 Sentries (Distributed)
|
||||
- **Site-1 (ml110-01)**:
|
||||
- **smom-sentry-01**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **smom-sentry-02**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **Site-2 (r630-01)**:
|
||||
- **smom-sentry-03**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **smom-sentry-04**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **Total**: 16 CPU, 32 GiB RAM, 80 GiB disk
|
||||
- **Deployment Files**: `examples/production/smom-dbis-138/sentry-*.yaml`
|
||||
|
||||
#### 3.3 RPC Nodes (Site-2: r630-01)
|
||||
- **smom-rpc-node-01**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **smom-rpc-node-02**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **smom-rpc-node-03**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **smom-rpc-node-04**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **Total**: 16 CPU, 32 GiB RAM, 80 GiB disk
|
||||
- **Deployment Files**: `examples/production/smom-dbis-138/rpc-node-*.yaml`
|
||||
|
||||
#### 3.4 Services (Site-2: r630-01)
|
||||
- **smom-management**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **smom-monitoring**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **smom-services**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **smom-blockscout**: 4 CPU, 8 GiB RAM, 20 GiB disk
|
||||
- **Total**: 16 CPU, 32 GiB RAM, 80 GiB disk
|
||||
- **Deployment Files**: `examples/production/smom-dbis-138/{management,monitoring,services,blockscout}.yaml`
|
||||
|
||||
**Phase 3 Resource Usage**:
|
||||
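- **ML110-01**: 8 CPU, 16 GiB RAM, 40 GiB disk (sentries 01-02 only; assumes the validators are relocated)
- **R630-01**: 40 CPU, 80 GiB RAM, 200 GiB disk (sentries 03-04, RPC nodes, and services; relocated validators would add 24 CPU, 48 GiB RAM, 80 GiB disk)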
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Test/Example VMs (Priority: LOW)
|
||||
|
||||
**Deployment Order**: Deploy after production VMs are stable.
|
||||
|
||||
- **vm-100**: ml110-01, 2 CPU, 4 GiB RAM, 50 GiB disk
|
||||
- **basic-vm**: ml110-01, 2 CPU, 4 GiB RAM, 50 GiB disk
|
||||
- **medium-vm**: ml110-01, 4 CPU, 8 GiB RAM, 50 GiB disk
|
||||
- **large-vm**: ml110-01, 8 CPU, 16 GiB RAM, 50 GiB disk
|
||||
|
||||
**Phase 4 Resource Usage**:
|
||||
- **ML110-01**: 16 CPU, 32 GiB RAM, 200 GiB disk
|
||||
|
||||
---
|
||||
|
||||
## Resource Allocation Analysis
|
||||
|
||||
### ML110-01 (Site-1) - Resource Constraints
|
||||
|
||||
**Available Resources**:
|
||||
- CPU: 5 cores (6 - 1 reserved)
|
||||
- RAM: ~248 GB (256 - 8 reserved)
|
||||
- Disk: 794.3 GB (local-lvm) + 384 GB (ceph-fs)
|
||||
|
||||
**Requested Resources** (Phases 1-2):
- CPU: 38+ cores ⚠️ **ABOUT 7.6x AVAILABLE CAPACITY**
- RAM: 92+ GiB ✅ Within capacity
- Disk: 1,170+ GiB ⚠️ **EXCEEDS CAPACITY**

**Requested Resources** (Phases 1-3, sentries only from Phase 3):
- CPU: 46+ cores ⚠️ **ABOUT 9.2x AVAILABLE CAPACITY**
- RAM: 108+ GiB ✅ Within capacity
- Disk: 1,210+ GiB ⚠️ **EXCEEDS CAPACITY**
|
||||
|
||||
**Recommendations**:
|
||||
1. **Move high-CPU VMs to R630-01**: Git Server, Email Server, DevOps Runner, Codespaces IDE
|
||||
2. **Reduce CPU allocations**: Use 2-4 cores instead of 8 cores for most services
|
||||
3. **Use Ceph storage**: Move large disk VMs to Ceph storage
|
||||
4. **Prioritize critical services**: Deploy only essential services on ML110-01
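
To gauge how far a given set of VMs overshoots a node before moving or shrinking them, the ratio of requested vCPUs to allocatable cores is a quick indicator. The sketch below uses a hypothetical helper, `overcommit_ratio`, and the validator figures from Phase 3 as its example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: requested vCPUs divided by allocatable cores,
# printed with one decimal place. awk is used for the floating-point division.
overcommit_ratio() {
  awk -v req="$1" -v avail="$2" 'BEGIN { printf "%.1f\n", req / avail }'
}

# Four validators at 6 vCPUs each against the 5 cores left on ML110-01.
overcommit_ratio 24 5     # prints 4.8

# The same validators against the 26 cores left on R630-01.
overcommit_ratio 24 26    # prints 0.9
```

A ratio above 1.0 means the VMs would oversubscribe the node if each vCPU is counted as a full core.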
|
||||
|
||||
### R630-01 (Site-2) - Resource Capacity
|
||||
|
||||
**Available Resources**:
|
||||
- CPU: 26 cores (28 - 2 reserved)
|
||||
- RAM: ~752 GB (768 - 16 reserved)
|
||||
- Disk: 171.3 GB (local-lvm) + Ceph OSD
|
||||
|
||||
**Requested Resources** (Phase 3, excluding validators):
- CPU: 40 cores ⚠️ **ABOUT 1.5x AVAILABLE CAPACITY**
- RAM: 80 GiB ✅ Within capacity
- Disk: 200 GiB ⚠️ **EXCEEDS local-lvm CAPACITY**
|
||||
|
||||
**Recommendations**:
|
||||
1. **Reduce CPU allocations**: Use 2-3 cores per validator instead of 6
|
||||
2. **Use Ceph storage**: Move VM disks to Ceph storage
|
||||
3. **Optimize resource allocation**: Share resources more efficiently
|
||||
|
||||
---
|
||||
|
||||
## Revised Deployment Plan
|
||||
|
||||
### Optimized Resource Allocation
|
||||
|
||||
#### ML110-01 (Site-1) - Light Workloads Only
|
||||
|
||||
**Phase 1: Core Infrastructure**
|
||||
- Nginx Proxy VM: 2 CPU, 4 GiB RAM, 20 GiB disk ✅
|
||||
|
||||
**Phase 2: Phoenix Infrastructure (Reduced)**
|
||||
- DNS Primary: 2 CPU, 4 GiB RAM, 50 GiB disk ✅
|
||||
- Git Server: **MOVE TO R630-01** or reduce to 2 CPU
|
||||
- Email Server: **MOVE TO R630-01** or reduce to 2 CPU
|
||||
- DevOps Runner: **MOVE TO R630-01** or reduce to 2 CPU
|
||||
- Codespaces IDE: **MOVE TO R630-01** or reduce to 2 CPU, 16 GiB RAM
|
||||
- AS4 Gateway: 2 CPU, 4 GiB RAM, 50 GiB disk ✅
|
||||
- Business Integration Gateway: 2 CPU, 4 GiB RAM, 50 GiB disk ✅
|
||||
- Financial Messaging Gateway: 2 CPU, 4 GiB RAM, 50 GiB disk ✅
|
||||
|
||||
**Phase 3: Blockchain (Sentries Only)**
|
||||
- smom-sentry-01: 2 CPU, 4 GiB RAM, 20 GiB disk ✅
|
||||
- smom-sentry-02: 2 CPU, 4 GiB RAM, 20 GiB disk ✅
|
||||
|
||||
**ML110-01 Total**: 14 CPU cores requested for the VMs marked ✅ (22 if the four movable VMs stay here at 2 CPU each), 5 available ⚠️ **Still exceeds capacity**
|
||||
|
||||
**Final Recommendation**: Deploy only 2-3 critical VMs on ML110-01, move rest to R630-01.
|
||||
|
||||
#### R630-01 (Site-2) - Primary Compute Node
|
||||
|
||||
**Phase 1: Core Infrastructure**
|
||||
- Cloudflare Tunnel VM: 2 CPU, 4 GiB RAM, 10 GiB disk ✅
|
||||
|
||||
**Phase 2: Phoenix Infrastructure (Moved)**
|
||||
- Git Server: 4 CPU, 16 GiB RAM, 500 GiB disk (use Ceph)
|
||||
- Email Server: 4 CPU, 16 GiB RAM, 200 GiB disk (use Ceph)
|
||||
- DevOps Runner: 4 CPU, 16 GiB RAM, 200 GiB disk (use Ceph)
|
||||
- Codespaces IDE: 4 CPU, 32 GiB RAM, 200 GiB disk (use Ceph)
|
||||
|
||||
**Phase 3: Blockchain Infrastructure**
|
||||
- Validators (4x): 3 CPU each = 12 CPU, 12 GiB RAM each = 48 GiB RAM, 80 GiB disk (use Ceph)
|
||||
- Sentries (2x): 2 CPU each = 4 CPU, 4 GiB RAM each = 8 GiB RAM, 40 GiB disk
|
||||
- RPC Nodes (4x): 2 CPU each = 8 CPU, 4 GiB RAM each = 16 GiB RAM, 80 GiB disk (use Ceph)
|
||||
- Services (4x): 2 CPU each = 8 CPU, 4 GiB RAM each = 16 GiB RAM, 80 GiB disk (use Ceph)
|
||||
|
||||
**R630-01 Total**: 50 CPU cores requested (2 + 16 + 12 + 4 + 8 + 8), 26 available ⚠️ **About 1.9x capacity**
|
||||
|
||||
**Final Recommendation**: Reduce CPU allocations further or deploy in batches.
|
||||
|
||||
---
|
||||
|
||||
## Deployment Execution Plan
|
||||
|
||||
### Step 1: Pre-Deployment Verification
|
||||
|
||||
```bash
# 1. Verify Proxmox nodes are accessible
./scripts/check-proxmox-quota-ssh.sh

# 2. Verify images are available
./scripts/verify-image-availability.sh

# 3. Check Crossplane provider is ready
kubectl get providerconfig -n crossplane-system
kubectl get pods -n crossplane-system -l app=crossplane-provider-proxmox
```
|
||||
|
||||
### Step 2: Deploy Phase 1 - Core Infrastructure
|
||||
|
||||
```bash
# Deploy Nginx Proxy (ML110-01)
kubectl apply -f examples/production/nginx-proxy-vm.yaml

# Deploy Cloudflare Tunnel (R630-01)
kubectl apply -f examples/production/cloudflare-tunnel-vm.yaml

# Monitor deployment
kubectl get proxmoxvm -w
```
|
||||
|
||||
**Wait for**: Both VMs to be in "Running" state before proceeding.
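
The wait can be scripted instead of watched by hand. The following is a sketch only: it assumes the `ProxmoxVM` resource reports its state at `.status.state`, which this plan does not confirm, so check the CRD for the real field before relying on it. The function name `wait_for_vm` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: poll a ProxmoxVM until it reports the wanted state or a timeout expires.
# ASSUMPTION: the state is exposed at .status.state (not confirmed by this plan).
# Usage: wait_for_vm <name> [state] [timeout-seconds] [interval-seconds]
wait_for_vm() {
  local name=$1 want=${2:-Running} timeout=${3:-600} interval=${4:-10}
  local waited=0 state=""
  while [ "$waited" -lt "$timeout" ]; do
    state=$(kubectl get proxmoxvm "$name" -o jsonpath='{.status.state}' 2>/dev/null)
    if [ "$state" = "$want" ]; then
      echo "$name is $want"
      return 0
    fi
    sleep "$interval"
    waited=$(( waited + interval ))
  done
  echo "timed out waiting for $name (last state: ${state:-unknown})" >&2
  return 1
}
```

For example, `wait_for_vm <vm-name> && echo "safe to continue"` blocks until the VM is up, where `<vm-name>` is the `metadata.name` from the applied manifest.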
|
||||
|
||||
### Step 3: Deploy Phase 2 - Phoenix Infrastructure
|
||||
|
||||
```bash
# Deploy DNS Primary (ML110-01)
kubectl apply -f examples/production/phoenix/dns-primary.yaml

# Wait for DNS to be ready, then deploy other services
kubectl apply -f examples/production/phoenix/git-server.yaml
kubectl apply -f examples/production/phoenix/email-server.yaml
kubectl apply -f examples/production/phoenix/devops-runner.yaml
kubectl apply -f examples/production/phoenix/codespaces-ide.yaml
kubectl apply -f examples/production/phoenix/as4-gateway.yaml
kubectl apply -f examples/production/phoenix/business-integration-gateway.yaml
kubectl apply -f examples/production/phoenix/financial-messaging-gateway.yaml
```
|
||||
|
||||
**Note**: Adjust node assignments and CPU allocations based on resource constraints.
|
||||
|
||||
### Step 4: Deploy Phase 3 - Blockchain Infrastructure
|
||||
|
||||
```bash
# Deploy validators first
kubectl apply -f examples/production/smom-dbis-138/validator-01.yaml
kubectl apply -f examples/production/smom-dbis-138/validator-02.yaml
kubectl apply -f examples/production/smom-dbis-138/validator-03.yaml
kubectl apply -f examples/production/smom-dbis-138/validator-04.yaml

# Deploy sentries
kubectl apply -f examples/production/smom-dbis-138/sentry-01.yaml
kubectl apply -f examples/production/smom-dbis-138/sentry-02.yaml
kubectl apply -f examples/production/smom-dbis-138/sentry-03.yaml
kubectl apply -f examples/production/smom-dbis-138/sentry-04.yaml

# Deploy RPC nodes
kubectl apply -f examples/production/smom-dbis-138/rpc-node-01.yaml
kubectl apply -f examples/production/smom-dbis-138/rpc-node-02.yaml
kubectl apply -f examples/production/smom-dbis-138/rpc-node-03.yaml
kubectl apply -f examples/production/smom-dbis-138/rpc-node-04.yaml

# Deploy services
kubectl apply -f examples/production/smom-dbis-138/management.yaml
kubectl apply -f examples/production/smom-dbis-138/monitoring.yaml
kubectl apply -f examples/production/smom-dbis-138/services.yaml
kubectl apply -f examples/production/smom-dbis-138/blockscout.yaml
```
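
The Step 4 ordering (validators, then sentries, then RPC nodes, then services) can also be generated rather than typed out, which makes it harder to apply a tier out of order. This is a sketch; the function names `phase3_manifests` and `deploy_phase3` are illustrative, and the paths are the ones used in the commands for this step.

```bash
#!/usr/bin/env bash
# Sketch: emit the Phase 3 manifests in deployment order and apply them one by one.
base=examples/production/smom-dbis-138

# Print manifest paths in order: validators, sentries, RPC nodes, services.
phase3_manifests() {
  local tier i name
  for tier in validator sentry rpc-node; do
    for i in 01 02 03 04; do
      echo "$base/$tier-$i.yaml"
    done
  done
  for name in management monitoring services blockscout; do
    echo "$base/$name.yaml"
  done
}

# Apply each manifest in order and stop at the first failure.
deploy_phase3() {
  local manifest
  for manifest in $(phase3_manifests); do
    kubectl apply -f "$manifest" || return 1
  done
}
```

Running `phase3_manifests` by itself prints the list without touching the cluster, which is a cheap way to review the order before calling `deploy_phase3`.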
|
||||
|
||||
### Step 5: Deploy Phase 4 - Test VMs (Optional)
|
||||
|
||||
```bash
# Deploy test VMs only if resources allow
kubectl apply -f examples/production/vm-100.yaml
kubectl apply -f examples/production/basic-vm.yaml
kubectl apply -f examples/production/medium-vm.yaml
kubectl apply -f examples/production/large-vm.yaml
```
|
||||
|
||||
---
|
||||
|
||||
## Monitoring and Verification
|
||||
|
||||
### Real-Time Monitoring
|
||||
|
||||
```bash
# Watch all VM deployments
kubectl get proxmoxvm -A -w

# Check specific VM status
kubectl describe proxmoxvm <vm-name>

# Check controller logs
kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox --tail=100 -f
```
|
||||
|
||||
### Resource Monitoring
|
||||
|
||||
```bash
# Check Proxmox node resources
./scripts/check-proxmox-quota-ssh.sh

# Check VM resource usage
kubectl get proxmoxvm -A -o wide
```
|
||||
|
||||
### Post-Deployment Verification
|
||||
|
||||
```bash
# List any VMs that are not Running (ideally only the header line is printed)
kubectl get proxmoxvm -A | grep -v Running

# Check VM IP addresses
kubectl get proxmoxvm -A -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.network.ipAddress}{"\n"}{end}'

# Verify guest agents
./scripts/verify-guest-agent.sh
```
|
||||
|
||||
---
|
||||
|
||||
## Risk Mitigation
|
||||
|
||||
### Resource Overcommitment
|
||||
|
||||
**Risk**: Requested resources exceed available capacity.
|
||||
|
||||
**Mitigation**:
|
||||
1. Deploy VMs in batches, monitoring resource usage
|
||||
2. Reduce CPU allocations where possible
|
||||
3. Use Ceph storage for large disk requirements
|
||||
4. Move high-resource VMs to R630-01
|
||||
5. Consider adding additional Proxmox nodes
|
||||
|
||||
### Deployment Failures
|
||||
|
||||
**Risk**: VM creation may fail due to resource constraints or configuration errors.
|
||||
|
||||
**Mitigation**:
|
||||
1. Validate all VM configurations before deployment
|
||||
2. Check Proxmox quotas before each deployment
|
||||
3. Monitor controller logs for errors
|
||||
4. Have rollback procedures ready
|
||||
5. Test deployments on non-critical VMs first
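
Mitigation 1 can be partly automated with a server-side dry run, which sends each manifest to the API server for schema and admission checks without creating anything. The sketch below wraps `kubectl apply --dry-run=server` in a hypothetical helper, `validate_manifests`. It does not contact Proxmox, so it will not catch capacity problems.

```bash
#!/usr/bin/env bash
# Sketch: validate manifests against the API server without creating resources.
# Returns non-zero if any manifest is rejected and reports which ones failed.
validate_manifests() {
  local manifest failed=0
  for manifest in "$@"; do
    if ! kubectl apply --dry-run=server -f "$manifest" >/dev/null; then
      echo "validation failed: $manifest" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `validate_manifests examples/production/phoenix/*.yaml` checks every Phase 2 manifest in one pass.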
|
||||
|
||||
### Network Issues
|
||||
|
||||
**Risk**: Network connectivity problems may prevent VM deployment or operation.
|
||||
|
||||
**Mitigation**:
|
||||
1. Verify network bridges exist on all nodes
|
||||
2. Test network connectivity before deployment
|
||||
3. Configure proper DNS resolution
|
||||
4. Verify firewall rules allow required traffic
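
The bridge check in mitigation 1 can be done over SSH before any VM is applied. This sketch assumes key-based SSH access to the Proxmox hosts and that `ip` is available there; the helper name `check_bridge` and the login user are assumptions, while `vmbr0` and the node addresses come from the hardware section.

```bash
#!/usr/bin/env bash
# Sketch: confirm a network bridge exists on a Proxmox host.
# ASSUMPTION: non-interactive (key-based) SSH access to the host is configured.
# Usage: check_bridge <user@host> [bridge]
check_bridge() {
  local host=$1 bridge=${2:-vmbr0}
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" ip link show "$bridge" >/dev/null 2>&1; then
    echo "$host: $bridge present"
  else
    echo "$host: $bridge missing or host unreachable" >&2
    return 1
  fi
}
```

For example, `check_bridge root@192.168.11.10 && check_bridge root@192.168.11.11` covers both nodes, assuming `root` is the login user in your environment.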
|
||||
|
||||
---
|
||||
|
||||
## Deployment Timeline
|
||||
|
||||
### Estimated Timeline
|
||||
|
||||
- **Phase 1 (Core Infrastructure)**: 30 minutes
|
||||
- **Phase 2 (Phoenix Infrastructure)**: 2-4 hours
|
||||
- **Phase 3 (Blockchain Infrastructure)**: 3-6 hours
|
||||
- **Phase 4 (Test VMs)**: 1 hour (optional)
|
||||
|
||||
**Total Estimated Time**: 6.5-11.5 hours including the optional Phase 4, or 5.5-10.5 hours without it (excluding verification and troubleshooting)
|
||||
|
||||
### Critical Path
|
||||
|
||||
1. Core Infrastructure (Nginx, Cloudflare Tunnel) → 30 min
|
||||
2. DNS Primary → 15 min
|
||||
3. Git Server, Email Server → 1 hour
|
||||
4. DevOps Runner, Codespaces IDE → 1 hour
|
||||
5. Blockchain Validators → 2 hours
|
||||
6. Blockchain Sentries → 1 hour
|
||||
7. Blockchain RPC Nodes → 1 hour
|
||||
8. Blockchain Services → 1 hour
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Review and Approve**: Review this plan and approve resource allocations
|
||||
2. **Update VM Configurations**: Update VM YAML files with optimized resource allocations
|
||||
3. **Pre-Deployment Checks**: Run all pre-deployment verification scripts
|
||||
4. **Execute Deployment**: Follow deployment steps in order
|
||||
5. **Monitor and Verify**: Continuously monitor deployment progress
|
||||
6. **Post-Deployment**: Verify all services are operational
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [VM Deployment Checklist](./VM_DEPLOYMENT_CHECKLIST.md) - Step-by-step checklist
|
||||
- [VM Creation Procedure](./VM_CREATION_PROCEDURE.md) - Detailed creation procedures
|
||||
- [VM Specifications](./VM_SPECIFICATIONS.md) - Complete VM specifications
|
||||
- [Deployment Requirements](../deployment/DEPLOYMENT_REQUIREMENTS.md) - Overall deployment requirements
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: 2025-01-XX
|
||||
**Status**: Ready for Review
|
||||
**Maintainer**: Infrastructure Team
|
||||
**Version**: 2.0
|
||||
|
||||