# Remaining Steps - Proxmox VE Deployment

**Generated:** 2025-11-27
**Based on:** Current status review and bring-up checklist

This document provides a comprehensive, prioritized list of all remaining steps to complete the Proxmox VE → Azure Arc → Hybrid Cloud Stack deployment.

## Priority Legend

- 🔴 **Critical/Blocking** - Must be completed before other work can proceed
- 🟠 **High Priority** - Core infrastructure required for deployment
- 🟡 **Medium Priority** - Service deployment and configuration
- 🟢 **Low Priority** - Optimization, hardening, and polish

---

## 🔴 Critical/Blocking Items

### 1. Azure Subscription Verification

**Status:** ⏳ PENDING
**Blocking:** Azure Arc onboarding, resource creation

**Actions:**
- [ ] Verify Azure subscription status: `az account show`
- [ ] Check if subscription is enabled (currently documented as disabled)
- [ ] Re-enable subscription in Azure Portal if needed
- [ ] Verify subscription ID: `fc08d829-4f14-413d-ab27-ce024425db0b`
- [ ] Verify tenant ID: `fb97e99d-3e94-4686-bfde-4bf4062e05f3`

**Commands:**
```bash
az account show
az account list
```

**Reference:** `docs/temporary/DEPLOYMENT_STATUS.md`

---

## 🟠 High Priority: Core Infrastructure

### 2. Proxmox Cluster Configuration

#### 2.1 Create Cluster on ML110

**Status:** ⏳ PENDING
**Server:** ML110 (192.168.1.206)

**Actions:**
- [ ] SSH to ML110: `ssh root@192.168.1.206`
- [ ] Set environment variables:
  ```bash
  export CLUSTER_NAME=hc-cluster
  export NODE_ROLE=create
  ```
- [ ] Run cluster setup script: `./infrastructure/proxmox/cluster-setup.sh`
- [ ] Verify cluster creation: `pvecm status`
- [ ] Verify node count: `pvecm nodes`

**Script:** `infrastructure/proxmox/cluster-setup.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 2

#### 2.2 Join R630 to Cluster

**Status:** ⏳ PENDING
**Server:** R630 (192.168.1.49)

**Actions:**
- [ ] SSH to R630: `ssh root@192.168.1.49`
- [ ] Set environment variables:
  ```bash
  export CLUSTER_NAME=hc-cluster
  export NODE_ROLE=join
  export CLUSTER_NODE_IP=192.168.1.206
  export ROOT_PASSWORD=
  ```
- [ ] Run cluster setup script: `./infrastructure/proxmox/cluster-setup.sh`
- [ ] Verify cluster membership: `pvecm status`
- [ ] Verify both nodes visible: `pvecm nodes`

**Script:** `infrastructure/proxmox/cluster-setup.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 2

#### 2.3 Verify Cluster Health

**Status:** ⏳ PENDING

**Actions:**
- [ ] Check cluster quorum: `pvecm status` (the output should report `Quorate: Yes`)
- [ ] Verify cluster services: `systemctl status pve-cluster`
- [ ] Test cluster communication between nodes
- [ ] Verify shared configuration: `ls -la /etc/pve/nodes/`

**Commands:**
```bash
# Quorum and membership are reported by `pvecm status`.
# `pvecm expected <votes>` changes the expected vote count; it is not a status check.
pvecm status
pvecm nodes
```

---

### 3. Storage Configuration

#### 3.1 Configure NFS Storage on ML110

**Status:** ⏳ PENDING
**Server:** ML110 (192.168.1.206)

**Prerequisites:**
- NFS server available (Router server at 10.10.10.1 or configured location)
- NFS export path: `/mnt/storage` (or as configured)

**Actions:**
- [ ] SSH to ML110: `ssh root@192.168.1.206`
- [ ] Set environment variables:
  ```bash
  export NFS_SERVER=10.10.10.1   # Adjust if different
  export NFS_PATH=/mnt/storage   # Adjust if different
  export STORAGE_NAME=router-storage
  export CONTENT_TYPES=images,iso,vztmpl,backup
  ```
- [ ] Run NFS storage script: `./infrastructure/proxmox/nfs-storage.sh`
- [ ] Verify storage: `pvesm status`
- [ ] Test storage access

**Script:** `infrastructure/proxmox/nfs-storage.sh`
**Alternative:** `infrastructure/storage/configure-proxmox-storage.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 5

#### 3.2 Configure NFS Storage on R630

**Status:** ⏳ PENDING
**Server:** R630 (192.168.1.49)

**Actions:**
- [ ] SSH to R630: `ssh root@192.168.1.49`
- [ ] Set environment variables (same as ML110)
- [ ] Run NFS storage script: `./infrastructure/proxmox/nfs-storage.sh`
- [ ] Verify storage: `pvesm status`
- [ ] Verify shared storage accessible from both nodes

**Script:** `infrastructure/proxmox/nfs-storage.sh`

#### 3.3 Verify Shared Storage

**Status:** ⏳ PENDING

**Actions:**
- [ ] Verify storage visible on both nodes: `pvesm status`
- [ ] Test storage read/write from both nodes
- [ ] Verify storage content types configured correctly
- [ ] Document storage configuration

**Commands:**
```bash
pvesm status
# `pvesm list` requires a storage ID; this uses the STORAGE_NAME value from 3.1.
pvesm list router-storage
```

---

### 4. Network/VLAN Configuration

#### 4.1 Configure VLAN Bridges on ML110

**Status:** ⏳ PENDING
**Server:** ML110 (192.168.1.206)

**Required VLANs:**
- VLAN 10: Management
- VLAN 20: Infrastructure
- VLAN 30: Services
- VLAN 40: Monitoring
- VLAN 50: CI/CD
- VLAN 60: Development
- VLAN 99: External/Cloudflare

**Actions:**
- [ ] SSH to ML110: `ssh root@192.168.1.206`
- [ ] Review network topology: `docs/architecture/network-topology.md`
- [ ] Run VLAN configuration script: `./infrastructure/network/configure-proxmox-vlans.sh`
- [ ] Verify bridges created: `ip addr show` or Proxmox web UI
- [ ] Test VLAN connectivity

**Script:** `infrastructure/network/configure-proxmox-vlans.sh`
**Alternative:** `infrastructure/proxmox/configure-proxmox-vlans.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 4

#### 4.2 Configure VLAN Bridges on R630

**Status:** ⏳ PENDING
**Server:** R630 (192.168.1.49)

**Actions:**
- [ ] SSH to R630: `ssh root@192.168.1.49`
- [ ] Run VLAN configuration script: `./infrastructure/network/configure-proxmox-vlans.sh`
- [ ] Verify bridges created: `ip addr show` or Proxmox web UI
- [ ] Verify VLAN configuration matches ML110
- [ ] Test VLAN connectivity

**Script:** `infrastructure/network/configure-proxmox-vlans.sh`

#### 4.3 Verify Network Configuration

**Status:** ⏳ PENDING

**Actions:**
- [ ] Verify all VLAN bridges on both nodes
- [ ] Test VLAN isolation
- [ ] Test inter-VLAN routing (if applicable)
- [ ] Document network configuration

**Commands:**
```bash
ip addr show
cat /etc/network/interfaces
```

---

### 5. Azure Arc Onboarding

#### 5.1 Create Azure Resource Group

**Status:** ⏳ PENDING
**Blockers:** Azure subscription must be enabled

**Actions:**
- [ ] Load environment variables from `.env`
- [ ] Verify Azure CLI authenticated: `az account show`
- [ ] Set subscription: `az account set --subscription "$AZURE_SUBSCRIPTION_ID"`
- [ ] Create resource group:
  ```bash
  az group create \
    --name "$AZURE_RESOURCE_GROUP" \
    --location "$AZURE_LOCATION"
  ```
- [ ] Verify resource group: `az group show --name "$AZURE_RESOURCE_GROUP"`

**Reference:** `docs/temporary/NEXT_STEPS.md` Section 2

#### 5.2 Onboard ML110 to Azure Arc

**Status:** ⏳ PENDING
**Server:** ML110 (192.168.1.206)

**Actions:**
- [ ] SSH to ML110: `ssh root@192.168.1.206`
- [ ] Set environment variables:
  ```bash
  export RESOURCE_GROUP=HC-Stack   # or from .env
  export TENANT_ID=
  export SUBSCRIPTION_ID=
  export LOCATION=eastus           # or from .env
  export TAGS="type=proxmox,host=ml110"
  ```
- [ ] Run onboarding script: `./scripts/azure-arc/onboard-proxmox-hosts.sh`
- [ ] Verify agent installed: `azcmagent show`
- [ ] Verify connection: Check Azure Portal

**Script:** `scripts/azure-arc/onboard-proxmox-hosts.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 6

#### 5.3 Onboard R630 to Azure Arc

**Status:** ⏳ PENDING
**Server:** R630 (192.168.1.49)

**Actions:**
- [ ] SSH to R630: `ssh root@192.168.1.49`
- [ ] Set environment variables (same as ML110, change TAGS):
  ```bash
  export TAGS="type=proxmox,host=r630"
  ```
- [ ] Run onboarding script: `./scripts/azure-arc/onboard-proxmox-hosts.sh`
- [ ] Verify agent installed: `azcmagent show`
- [ ] Verify connection: Check Azure Portal

**Script:** `scripts/azure-arc/onboard-proxmox-hosts.sh`

#### 5.4 Verify Azure Arc Integration

**Status:** ⏳ PENDING

**Actions:**
- [ ] Verify both servers in Azure Portal: Azure Arc → Servers
- [ ] Check server status (should be "Connected")
- [ ] Verify tags applied correctly
- [ ] Test Azure Policy assignment (if configured)
- [ ] Verify Azure Monitor integration (if configured)

**Reference:** `docs/deployment/azure-arc-onboarding.md`

---

### 6. Cloudflare Configuration

#### 6.1 Configure Cloudflare Credentials

**Status:** ⏳ PENDING

**Actions:**
- [ ] Create Cloudflare API token: https://dash.cloudflare.com/profile/api-tokens
- [ ] Add to `.env` file:
  ```bash
  CLOUDFLARE_API_TOKEN=
  CLOUDFLARE_ACCOUNT_EMAIL=
  ```
- [ ] Verify credentials not committed to git (check `.gitignore`)
- [ ] Test Cloudflare API access (if script available)

**Reference:** `docs/temporary/DEPLOYMENT_STATUS.md` Section "Cloudflare Configuration Pending"

---

## 🟡 Medium Priority: Service Deployment

### 7. VM Template Creation

#### 7.1 Verify/Create Ubuntu 24.04 Template

**Status:** ⏳ PENDING
**Note:** VM 9000 exists on ML110 but may need configuration

**Actions:**
- [ ] Check existing template VM 9000 on ML110
- [ ] Verify template configuration:
  - Cloud-init enabled
  - QEMU agent enabled
  - Proper disk size
  - Network configuration
- [ ] If template needs creation:
  - [ ] Upload Ubuntu 24.04 ISO to Proxmox storage
  - [ ] Create VM from ISO
  - [ ] Install Ubuntu 24.04
  - [ ] Install QEMU guest agent
  - [ ] Install Azure Arc agent (optional, for template)
  - [ ] Configure cloud-init
  - [ ] Convert to template
- [ ] Verify template accessible from both nodes (if clustered)

**Scripts:**
- `scripts/vm-management/create/create-proxmox-template.sh`
- `scripts/vm-management/create/create-template-via-api.sh`

**Reference:** `docs/operations/proxmox-ubuntu-images.md`

---

### 8. Service VM Deployment

#### 8.1 Deploy Cloudflare Tunnel VM

**Status:** ⏳ PENDING

**VM Specifications:**
- **VM ID:** 100 (or next available)
- **Name:** cloudflare-tunnel
- **IP:** 192.168.1.60/24
- **Gateway:** 192.168.1.254
- **VLAN:** 99
- **CPU:** 2 cores
- **RAM:** 4GB
- **Disk:** 40GB
- **Template:** ubuntu-24.04-cloudinit

**Actions:**
- [ ] Create VM from template (via Terraform or Proxmox API)
- [ ] Configure network (VLAN 99)
- [ ] Configure IP address (192.168.1.60/24)
- [ ] Start VM
- [ ] Verify VM accessible

**Scripts:**
- Terraform: `terraform/proxmox/`
- API: `scripts/vm-management/create/create-vms-from-template.sh`

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 8.2 Deploy K3s Master VM

**Status:** ⏳ PENDING

**VM Specifications:**
- **VM ID:** 101 (or next available)
- **Name:** k3s-master
- **IP:** 192.168.1.188/24
- **Gateway:** 192.168.1.254
- **VLAN:** 30 (Services)
- **CPU:** 4 cores
- **RAM:** 8GB
- **Disk:** 80GB
- **Template:** ubuntu-24.04-cloudinit

**Actions:**
- [ ] Create VM from template
- [ ] Configure network (VLAN 30)
- [ ] Configure IP address (192.168.1.188/24)
- [ ] Start VM
- [ ] Verify VM accessible

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 8.3 Deploy Git Server VM

**Status:** ⏳ PENDING

**VM Specifications:**
- **VM ID:** 102 (or next available)
- **Name:** git-server
- **IP:** 192.168.1.121/24
- **Gateway:** 192.168.1.254
- **VLAN:** 50 (CI/CD)
- **CPU:** 4 cores
- **RAM:** 8GB
- **Disk:** 100GB
- **Template:** ubuntu-24.04-cloudinit

**Actions:**
- [ ] Create VM from template
- [ ] Configure network (VLAN 50)
- [ ] Configure IP address (192.168.1.121/24)
- [ ] Start VM
- [ ] Verify VM accessible

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 8.4 Deploy Observability VM

**Status:** ⏳ PENDING

**VM Specifications:**
- **VM ID:** 103 (or next available)
- **Name:** observability
- **IP:** 192.168.1.82/24
- **Gateway:** 192.168.1.254
- **VLAN:** 40 (Monitoring)
- **CPU:** 4 cores
- **RAM:** 8GB
- **Disk:** 200GB
- **Template:** ubuntu-24.04-cloudinit

**Actions:**
- [ ] Create VM from template
- [ ] Configure network (VLAN 40)
- [ ] Configure IP address (192.168.1.82/24)
- [ ] Start VM
- [ ] Verify VM accessible

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

---

### 9. OS Installation on VMs

#### 9.1 Install Ubuntu 24.04 on All VMs

**Status:** ⏳ PENDING
**Note:** This requires manual console access

**Actions (for each VM):**
- [ ] Access Proxmox Web UI: https://192.168.1.206:8006 or https://192.168.1.49:8006
- [ ] For each VM (100, 101, 102, 103):
  - [ ] Click on VM → Console
  - [ ] Ubuntu installer should boot from ISO/cloud-init
  - [ ] Complete installation with appropriate IP configuration:
    - **VM 100 (cloudflare-tunnel):** IP: 192.168.1.60/24, Gateway: 192.168.1.254
    - **VM 101 (k3s-master):** IP: 192.168.1.188/24, Gateway: 192.168.1.254
    - **VM 102 (git-server):** IP: 192.168.1.121/24, Gateway: 192.168.1.254
    - **VM 103 (observability):** IP: 192.168.1.82/24, Gateway: 192.168.1.254
  - [ ] Create user account (remember for SSH)
  - [ ] Verify SSH access

**Reference:** `docs/temporary/COMPLETE_STATUS.md` Step 1

#### 9.2 Verify OS Installation

**Status:** ⏳ PENDING

**Actions:**
- [ ] Run VM status check: `./scripts/check-vm-status.sh` (if available)
- [ ] Verify network connectivity from each VM
- [ ] Verify SSH access to each VM
- [ ] Verify Ubuntu 24.04 installed correctly
- [ ] Verify QEMU guest agent working

**Scripts:**
- `scripts/check-vm-status.sh` (if exists)
- `scripts/vm-management/monitor/check-vm-disk-sizes.sh`

---

### 10. Service Configuration

#### 10.1 Configure Cloudflare Tunnel

**Status:** ⏳ PENDING
**VM:** cloudflare-tunnel (192.168.1.60)

**Actions:**
- [ ] SSH to cloudflare-tunnel VM
- [ ] Install cloudflared:
  ```bash
  curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared
  chmod +x /usr/local/bin/cloudflared
  ```
- [ ] Authenticate: `cloudflared tunnel login`
- [ ] Create tunnel: `cloudflared tunnel create azure-stack-hci`
- [ ] Configure tunnel routes (see `docs/deployment/cloudflare-integration.md`)
- [ ] Configure tunnel for:
  - Windows Admin Center (if applicable)
  - Proxmox UI
  - Dashboards
  - Git/CI services
- [ ] Set up systemd service for cloudflared
- [ ] Test external access

**Script:** `scripts/setup-cloudflare-tunnel.sh` (if available)
**Reference:** `docs/deployment/cloudflare-integration.md`

#### 10.2 Deploy and Configure K3s

**Status:** ⏳ PENDING
**VM:** k3s-master (192.168.1.188)

**Actions:**
- [ ] SSH to k3s-master VM
- [ ] Install K3s: `curl -sfL https://get.k3s.io | sh -`
- [ ] Verify K3s running: `kubectl get nodes`
- [ ] Get kubeconfig: `sudo cat /etc/rancher/k3s/k3s.yaml`
- [ ] Configure kubectl access
- [ ] Install required addons (if any)
- [ ] Onboard to Azure Arc (if applicable):
  ```bash
  export RESOURCE_GROUP=HC-Stack
  export CLUSTER_NAME=proxmox-k3s-cluster
  ./infrastructure/kubernetes/arc-onboard-k8s.sh
  ```

**Script:** `scripts/setup-k3s.sh` (if available)
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 10.3 Set Up Git Server

**Status:** ⏳ PENDING
**VM:** git-server (192.168.1.121)

**Actions:**
- [ ] SSH to git-server VM
- [ ] Choose Git server (Gitea or GitLab CE)
- [ ] Install Git server:
  - **Gitea:** `./infrastructure/gitops/gitea-deploy.sh`
  - **GitLab CE:** `./infrastructure/gitops/gitlab-deploy.sh`
- [ ] Configure Git server:
  - Admin account
  - Repository creation
  - User access
- [ ] Create initial repositories
- [ ] Configure GitOps workflows

**Scripts:**
- `scripts/setup-git-server.sh` (if available)
- `infrastructure/gitops/gitea-deploy.sh`
- `infrastructure/gitops/gitlab-deploy.sh`

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 10.4 Deploy Observability Stack

**Status:** ⏳ PENDING
**VM:** observability (192.168.1.82)

**Actions:**
- [ ] SSH to observability VM
- [ ] Deploy Prometheus:
  - Install Prometheus
  - Configure scrape targets
  - Set up retention policies
- [ ] Deploy Grafana:
  - Install Grafana
  - Configure data sources (Prometheus)
  - Import dashboards
  - Configure authentication
- [ ] Configure monitoring for:
  - Proxmox hosts
  - VMs
  - Kubernetes cluster
  - Network metrics
  - Storage metrics
- [ ] Set up alerting rules

**Script:** `scripts/setup-observability.sh` (if available)
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 10.5 Configure GitOps Workflows

**Status:** ⏳ PENDING

**Actions:**
- [ ] Create Git repository in Git server
- [ ] Copy `gitops/` directory to repository
- [ ] Configure Flux or ArgoCD (if applicable)
- [ ] Set up CI/CD pipelines
- [ ] Configure automated deployments
- [ ] Test GitOps workflow

**Reference:** `docs/operations/runbooks/gitops-workflow.md`

---

## 🟢 Low Priority: Optimization & Hardening

### 11. Security Hardening

#### 11.1 Create RBAC Accounts for Proxmox

**Status:** ⏳ PENDING

**Actions:**
- [ ] Review RBAC guide: `docs/security/proxmox-rbac.md`
- [ ] Create service accounts for automation
- [ ] Create operator accounts with appropriate roles
- [ ] Generate API tokens for service accounts
- [ ] Document RBAC account usage
- [ ] Update automation scripts to use API tokens instead of root
- [ ] Test API token authentication
- [ ] Remove or restrict root API access (if desired)

**Reference:** `docs/security/proxmox-rbac.md`

#### 11.2 Review Firewall Rules

**Status:** ⏳ PENDING

**Actions:**
- [ ] Review firewall configuration on both Proxmox hosts
- [ ] Verify only necessary ports are open
- [ ] Configure firewall rules for cluster communication
- [ ] Document firewall configuration
- [ ] Test firewall rules

#### 11.3 Configure Security Policies

**Status:** ⏳ PENDING

**Actions:**
- [ ] Review Azure Policy assignments
- [ ] Configure security baselines
- [ ] Enable Azure Defender (if applicable)
- [ ] Configure update management
- [ ] Review secret management
- [ ] Perform security scan

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 10

---

### 12. Monitoring Setup

#### 12.1 Configure Monitoring Dashboards

**Status:** ⏳ PENDING

**Actions:**
- [ ] Configure Grafana dashboards for:
  - Proxmox hosts
  - VMs
  - Kubernetes cluster
  - Network performance
  - Storage performance
- [ ] Set up Prometheus alerting rules
- [ ] Configure alert notifications
- [ ] Test alerting

#### 12.2 Configure Azure Monitor

**Status:** ⏳ PENDING

**Actions:**
- [ ] Enable Log Analytics workspace
- [ ] Configure data collection rules
- [ ] Set up Azure Monitor alerts
- [ ] Configure log queries
- [ ] Test Azure Monitor integration

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 10

---

### 13. Performance Tuning

**Status:** ⏳ PENDING

**Actions:**
- [ ] Review storage performance
- [ ] Optimize VM resource allocation
- [ ] Tune network settings
- [ ] Optimize Proxmox cluster settings
- [ ] Run performance benchmarks
- [ ] Document performance metrics

---

### 14. Documentation Updates

**Status:** ⏳ PENDING

**Actions:**
- [ ] Update `docs/temporary/COMPLETE_STATUS.md` with actual status
- [ ] Update `docs/temporary/DEPLOYMENT_STATUS.md` with current blockers
- [ ] Update `docs/temporary/NEXT_STEPS.md` with completed items
- [ ] Create runbooks for common operations
- [ ] Document network topology
- [ ] Document storage configuration
- [ ] Create troubleshooting guides

---

## Summary Checklist

### Critical (Must Complete First)
- [ ] Azure subscription verification/enablement

### High Priority (Core Infrastructure)
- [ ] Proxmox cluster configuration
- [ ] NFS/shared storage configuration
- [ ] Network/VLAN configuration
- [ ] Azure Arc onboarding (both servers)
- [ ] Cloudflare credentials configuration

### Medium Priority (Service Deployment)
- [ ] VM template creation/verification
- [ ] Service VM deployment (4 VMs)
- [ ] OS installation on VMs
- [ ] Service configuration (Cloudflare, K3s, Git, Observability)

### Low Priority (Optimization)
- [ ] Security hardening (RBAC, firewalls)
- [ ] Monitoring setup
- [ ] Performance tuning
- [ ] Documentation updates

---

## Estimated Timeline

- **Week 1:** Critical and High Priority items (Infrastructure foundation)
- **Week 2:** Medium Priority items (Service deployment)
- **Week 3-4:** Low Priority items (Optimization and hardening)

**Total Estimated Time:** 3-4 weeks for complete deployment

---

## Quick Reference

### Key Scripts
- Cluster Setup: `infrastructure/proxmox/cluster-setup.sh`
- NFS Storage: `infrastructure/proxmox/nfs-storage.sh`
- VLAN Configuration: `infrastructure/network/configure-proxmox-vlans.sh`
- Azure Arc: `scripts/azure-arc/onboard-proxmox-hosts.sh`
- Health Check: `scripts/health/check-proxmox-health.sh`
- Status Query: `scripts/health/query-proxmox-status.sh`

### Key Documentation
- Status Review: `docs/PROXMOX_STATUS_REVIEW.md`
- Bring-Up Checklist: `docs/deployment/bring-up-checklist.md`
- Azure Arc Onboarding: `docs/deployment/azure-arc-onboarding.md`
- Cloudflare Integration: `docs/deployment/cloudflare-integration.md`
- Proxmox RBAC: `docs/security/proxmox-rbac.md`

### Server Information
- **ML110:** 192.168.1.206:8006
- **R630:** 192.168.1.49:8006
- **Cluster Name:** hc-cluster (to be created)
- **Resource Group:** HC-Stack (to be created)

---

**Last Updated:** 2025-11-27
**Next Review:** After completing Phase 1 (Infrastructure Foundation)
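
Several steps in this document (5.1 and 6.1 in particular) depend on values loaded from `.env`, and an unset variable tends to surface only as a confusing `az` or API error. As a minimal sketch, a preflight helper along the following lines could report which variables are still empty before a step is run. The function name `missing_vars` is hypothetical and not part of the repository; the variable names are the ones used in the steps above, and the `.env` location in the usage comment is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper (not an existing repository script).
# Prints the name of every listed variable that is unset or empty and
# returns non-zero if any were found.
missing_vars() (
  status=0
  for name in "$@"; do
    # Indirect lookup of the variable named by $name. eval is acceptable here
    # only because the names are hard-coded by the caller, never user input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
      status=1
    fi
  done
  # The function body is a subshell, so exit sets the function's return status.
  exit "$status"
)

# Example usage before step 5.1 (assumes .env is in the current directory):
#   set -a; . ./.env; set +a
#   missing_vars AZURE_SUBSCRIPTION_ID AZURE_RESOURCE_GROUP AZURE_LOCATION || exit 1
```

The body avoids bash-only features so that it also runs under a plain POSIX `sh`, which may matter on a freshly installed host.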