# Remaining Steps - Proxmox VE Deployment

**Generated:** 2025-11-27
**Based on:** Current status review and bring-up checklist

This document provides a comprehensive, prioritized list of all remaining steps to complete the Proxmox VE → Azure Arc → Hybrid Cloud Stack deployment.

## Priority Legend

- 🔴 **Critical/Blocking** - Must be completed before other work can proceed
- 🟠 **High Priority** - Core infrastructure required for deployment
- 🟡 **Medium Priority** - Service deployment and configuration
- 🟢 **Low Priority** - Optimization, hardening, and polish

---

## 🔴 Critical/Blocking Items

### 1. Azure Subscription Verification
**Status:** ⏳ PENDING
**Blocking:** Azure Arc onboarding, resource creation

**Actions:**
- [ ] Verify Azure subscription status: `az account show`
- [ ] Check if subscription is enabled (currently documented as disabled)
- [ ] Re-enable subscription in Azure Portal if needed
- [ ] Verify subscription ID: `fc08d829-4f14-413d-ab27-ce024425db0b`
- [ ] Verify tenant ID: `fb97e99d-3e94-4686-bfde-4bf4062e05f3`

**Commands:**
```bash
az account show
az account list
```

**Reference:** `docs/temporary/DEPLOYMENT_STATUS.md`

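
The subscription check above can be scripted so later steps fail fast while the subscription is still disabled. This is a minimal sketch, not part of the repository: the helper names `subscription_is_enabled` and `check_subscription` are hypothetical, and it assumes the Azure CLI is installed and logged in.

```bash
# Sketch: gate the deployment on the subscription state.
# `az account show --query state -o tsv` prints the state field of the
# active subscription (for example "Enabled" or "Disabled").

# Hypothetical helper: succeeds only when the given state is "Enabled".
subscription_is_enabled() {
  [ "$1" = "Enabled" ]
}

# Hypothetical wrapper: asks the Azure CLI for the state, then checks it.
check_subscription() {
  local state
  state="$(az account show --query state -o tsv)" || return 2
  if subscription_is_enabled "$state"; then
    echo "Subscription is enabled"
  else
    echo "Subscription state is '$state'; re-enable it in the Azure Portal" >&2
    return 1
  fi
}

# Usage (on a machine with the Azure CLI): check_subscription
```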
---

## 🟠 High Priority: Core Infrastructure

### 2. Proxmox Cluster Configuration

#### 2.1 Create Cluster on ML110
**Status:** ⏳ PENDING
**Server:** ML110 (192.168.1.206)

**Actions:**
- [ ] SSH to ML110: `ssh root@192.168.1.206`
- [ ] Set environment variables:
  ```bash
  export CLUSTER_NAME=hc-cluster
  export NODE_ROLE=create
  ```
- [ ] Run cluster setup script: `./infrastructure/proxmox/cluster-setup.sh`
- [ ] Verify cluster creation: `pvecm status`
- [ ] Verify node count: `pvecm nodes`

**Script:** `infrastructure/proxmox/cluster-setup.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 2

#### 2.2 Join R630 to Cluster
**Status:** ⏳ PENDING
**Server:** R630 (192.168.1.49)

**Actions:**
- [ ] SSH to R630: `ssh root@192.168.1.49`
- [ ] Set environment variables:
  ```bash
  export CLUSTER_NAME=hc-cluster
  export NODE_ROLE=join
  export CLUSTER_NODE_IP=192.168.1.206
  # Replace the placeholder; keep the quotes so special characters survive
  export ROOT_PASSWORD='<ML110_root_password>'
  ```
- [ ] Run cluster setup script: `./infrastructure/proxmox/cluster-setup.sh`
- [ ] Verify cluster membership: `pvecm status`
- [ ] Verify both nodes visible: `pvecm nodes`

**Script:** `infrastructure/proxmox/cluster-setup.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 2

#### 2.3 Verify Cluster Health
**Status:** ⏳ PENDING

**Actions:**
- [ ] Check cluster quorum: `pvecm status` (look for `Quorate: Yes`)
- [ ] Verify cluster services: `systemctl status pve-cluster`
- [ ] Test cluster communication between nodes
- [ ] Verify shared configuration: `ls -la /etc/pve/nodes/`

**Commands:**
```bash
pvecm status   # quorum: look for "Quorate: Yes"
pvecm nodes
# Note: `pvecm expected <votes>` changes the expected vote count;
# it is a recovery tool, not a health check.
```

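
The health checks above can be turned into a small pass/fail script. The sketch below parses the text that `pvecm status` and `pvecm nodes` print; the function names are hypothetical, and the field layout (`Quorate:` line, numeric node ID in the first column) is assumed from typical Proxmox VE output and may differ between releases.

```bash
# Sketch: parse pvecm output instead of reading it by eye.

# Succeeds when the `pvecm status` text on stdin reports quorum.
is_quorate() {
  grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Prints the number of member rows in the `pvecm nodes` text on stdin.
# Member rows are assumed to start with a numeric node ID.
node_count() {
  awk '$1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }'
}

# Usage on a cluster node:
#   pvecm status | is_quorate && echo "quorum OK"
#   [ "$(pvecm nodes | node_count)" -eq 2 ] && echo "both nodes present"
```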
---

### 3. Storage Configuration

#### 3.1 Configure NFS Storage on ML110
**Status:** ⏳ PENDING
**Server:** ML110 (192.168.1.206)

**Prerequisites:**
- NFS server available (Router server at 10.10.10.1 or configured location)
- NFS export path: `/mnt/storage` (or as configured)

**Actions:**
- [ ] SSH to ML110: `ssh root@192.168.1.206`
- [ ] Set environment variables:
  ```bash
  export NFS_SERVER=10.10.10.1 # Adjust if different
  export NFS_PATH=/mnt/storage # Adjust if different
  export STORAGE_NAME=router-storage
  export CONTENT_TYPES=images,iso,vztmpl,backup
  ```
- [ ] Run NFS storage script: `./infrastructure/proxmox/nfs-storage.sh`
- [ ] Verify storage: `pvesm status`
- [ ] Test storage access

**Script:** `infrastructure/proxmox/nfs-storage.sh`
**Alternative:** `infrastructure/storage/configure-proxmox-storage.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 5

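
For orientation, the variables above map naturally onto a single `pvesm add nfs` call. The sketch below only prints that command as a dry run. Whether `nfs-storage.sh` runs exactly this is an assumption, and `nfs_add_command` is a hypothetical helper, not a function from the repository.

```bash
# Sketch: print the pvesm command that the step 3.1 variables imply.
# Arguments: storage name, NFS server, export path, content types.
nfs_add_command() {
  printf 'pvesm add nfs %s --server %s --export %s --content %s\n' \
    "$1" "$2" "$3" "$4"
}

# Usage with the values from step 3.1:
#   nfs_add_command router-storage 10.10.10.1 /mnt/storage images,iso,vztmpl,backup
```

Printing the command first makes it easy to review the server and export path before anything is written to the cluster-wide storage configuration.
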
#### 3.2 Configure NFS Storage on R630
**Status:** ⏳ PENDING
**Server:** R630 (192.168.1.49)

**Actions:**
- [ ] SSH to R630: `ssh root@192.168.1.49`
- [ ] Set environment variables (same as ML110)
- [ ] Run NFS storage script: `./infrastructure/proxmox/nfs-storage.sh`
- [ ] Verify storage: `pvesm status`
- [ ] Verify shared storage accessible from both nodes

**Script:** `infrastructure/proxmox/nfs-storage.sh`

#### 3.3 Verify Shared Storage
**Status:** ⏳ PENDING

**Actions:**
- [ ] Verify storage visible on both nodes: `pvesm status`
- [ ] Test storage read/write from both nodes
- [ ] Verify storage content types configured correctly
- [ ] Document storage configuration

**Commands:**
```bash
pvesm status
pvesm list
```

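
The verification steps above can be sketched as two small checks: one that reads `pvesm status` output and one that performs a write/read probe. The function names are hypothetical. The column layout (name in column 1, status in column 3) and the `/mnt/pve/<storage>` mount path are assumptions based on default Proxmox VE behaviour.

```bash
# Sketch: verify shared storage from each node.

# Succeeds when the `pvesm status` text on stdin lists the named
# storage with status "active".
storage_is_active() {
  awk -v name="$1" '$1 == name && $3 == "active" { found = 1 } END { exit !found }'
}

# Succeeds when a small probe file can be written, read back, and
# removed in the given directory.
can_write() {
  local probe="$1/.rw-probe-$$"
  printf 'ok' > "$probe" && [ "$(cat "$probe")" = "ok" ] && rm -f "$probe"
}

# Usage on each node:
#   pvesm status | storage_is_active router-storage && echo "storage active"
#   can_write /mnt/pve/router-storage && echo "read/write OK"
```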
---

### 4. Network/VLAN Configuration

#### 4.1 Configure VLAN Bridges on ML110
**Status:** ⏳ PENDING
**Server:** ML110 (192.168.1.206)

**Required VLANs:**
- VLAN 10: Management
- VLAN 20: Infrastructure
- VLAN 30: Services
- VLAN 40: Monitoring
- VLAN 50: CI/CD
- VLAN 60: Development
- VLAN 99: External/Cloudflare

**Actions:**
- [ ] SSH to ML110: `ssh root@192.168.1.206`
- [ ] Review network topology: `docs/architecture/network-topology.md`
- [ ] Run VLAN configuration script: `./infrastructure/network/configure-proxmox-vlans.sh`
- [ ] Verify bridges created: `ip addr show` or Proxmox web UI
- [ ] Test VLAN connectivity

**Script:** `infrastructure/network/configure-proxmox-vlans.sh`
**Alternative:** `infrastructure/proxmox/configure-proxmox-vlans.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 4

#### 4.2 Configure VLAN Bridges on R630
**Status:** ⏳ PENDING
**Server:** R630 (192.168.1.49)

**Actions:**
- [ ] SSH to R630: `ssh root@192.168.1.49`
- [ ] Run VLAN configuration script: `./infrastructure/network/configure-proxmox-vlans.sh`
- [ ] Verify bridges created: `ip addr show` or Proxmox web UI
- [ ] Verify VLAN configuration matches ML110
- [ ] Test VLAN connectivity

**Script:** `infrastructure/network/configure-proxmox-vlans.sh`

#### 4.3 Verify Network Configuration
**Status:** ⏳ PENDING

**Actions:**
- [ ] Verify all VLAN bridges on both nodes
- [ ] Test VLAN isolation
- [ ] Test inter-VLAN routing (if applicable)
- [ ] Document network configuration

**Commands:**
```bash
ip addr show
cat /etc/network/interfaces
```

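
Checking seven VLANs on two nodes by eye is error-prone, so the bridge check above can be sketched as a script that lists what is missing. The naming scheme `vmbr<VLAN ID>` is purely an assumption for illustration; pass a different prefix if `configure-proxmox-vlans.sh` names the interfaces differently. `missing_vlan_bridges` is a hypothetical helper.

```bash
# Sketch: report required VLAN bridges that are absent on this node.
REQUIRED_VLANS="10 20 30 40 50 60 99"

# Reads `ip -o link show` text on stdin and prints one line per missing
# interface. The optional first argument is the name prefix
# (default "vmbr", which is an assumption).
missing_vlan_bridges() {
  local prefix="${1:-vmbr}" present vlan
  # Field 2 of each line is the interface name; drop any "@parent" suffix.
  present="$(awk -F': ' '{ sub(/@.*/, "", $2); print $2 }')"
  for vlan in $REQUIRED_VLANS; do
    if ! printf '%s\n' "$present" | grep -qxF "${prefix}${vlan}"; then
      echo "${prefix}${vlan}"
    fi
  done
}

# Usage on each node:
#   ip -o link show | missing_vlan_bridges
# Empty output means every required bridge exists.
```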
---

### 5. Azure Arc Onboarding

#### 5.1 Create Azure Resource Group
**Status:** ⏳ PENDING
**Blockers:** Azure subscription must be enabled

**Actions:**
- [ ] Load environment variables from `.env`
- [ ] Verify Azure CLI authenticated: `az account show`
- [ ] Set subscription: `az account set --subscription "$AZURE_SUBSCRIPTION_ID"`
- [ ] Create resource group:
  ```bash
  az group create \
    --name "$AZURE_RESOURCE_GROUP" \
    --location "$AZURE_LOCATION"
  ```
- [ ] Verify resource group: `az group show --name "$AZURE_RESOURCE_GROUP"`

**Reference:** `docs/temporary/NEXT_STEPS.md` Section 2

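
Because this checklist may be run more than once, it can help to make the resource group step idempotent. The sketch below uses `az group exists`, which prints `true` or `false`, before creating anything. `ensure_resource_group` is a hypothetical helper, not a script in the repository.

```bash
# Sketch: create the resource group only when it does not exist yet.
# Arguments: resource group name, Azure location.
ensure_resource_group() {
  local name="$1" location="$2"
  if [ "$(az group exists --name "$name")" = "true" ]; then
    echo "Resource group $name already exists"
  else
    az group create --name "$name" --location "$location" --output none
    echo "Created resource group $name in $location"
  fi
}

# Usage after loading .env:
#   ensure_resource_group "$AZURE_RESOURCE_GROUP" "$AZURE_LOCATION"
```
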
#### 5.2 Onboard ML110 to Azure Arc
**Status:** ⏳ PENDING
**Server:** ML110 (192.168.1.206)

**Actions:**
- [ ] SSH to ML110: `ssh root@192.168.1.206`
- [ ] Set environment variables:
  ```bash
  export RESOURCE_GROUP=HC-Stack # or from .env
  # Replace the placeholders below; unquoted <...> would be parsed as a redirection
  export TENANT_ID='<tenant_id>'
  export SUBSCRIPTION_ID='<subscription_id>'
  export LOCATION=eastus # or from .env
  export TAGS="type=proxmox,host=ml110"
  ```
- [ ] Run onboarding script: `./scripts/azure-arc/onboard-proxmox-hosts.sh`
- [ ] Verify agent installed: `azcmagent show`
- [ ] Verify connection: Check Azure Portal

**Script:** `scripts/azure-arc/onboard-proxmox-hosts.sh`
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 6

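
A forgotten or still-placeholder variable is an easy way to end up with a half-onboarded host, so a preflight check before running the onboarding script can help. This is a minimal sketch: `missing_vars` is a hypothetical helper, and the `azcmagent connect` call in the comment only illustrates how the variables line up with the agent's flags, not what `onboard-proxmox-hosts.sh` necessarily runs.

```bash
# Sketch: refuse to onboard while required variables are unset or
# still hold a <placeholder> value.

# Prints the name of every listed variable that is empty, unset, or
# still looks like "<...>".
missing_vars() {
  local var value
  for var in "$@"; do
    eval "value=\${$var:-}"
    case "$value" in
      '' | '<'*'>') echo "$var" ;;
    esac
  done
}

# Usage:
#   missing="$(missing_vars RESOURCE_GROUP TENANT_ID SUBSCRIPTION_ID LOCATION TAGS)"
#   [ -z "$missing" ] || { echo "Set these first: $missing" >&2; exit 1; }
#
# The variables correspond to these agent flags (illustration only):
#   azcmagent connect --resource-group "$RESOURCE_GROUP" --tenant-id "$TENANT_ID" \
#     --subscription-id "$SUBSCRIPTION_ID" --location "$LOCATION" --tags "$TAGS"
```
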
#### 5.3 Onboard R630 to Azure Arc
**Status:** ⏳ PENDING
**Server:** R630 (192.168.1.49)

**Actions:**
- [ ] SSH to R630: `ssh root@192.168.1.49`
- [ ] Set environment variables (same as ML110, change TAGS):
  ```bash
  export TAGS="type=proxmox,host=r630"
  ```
- [ ] Run onboarding script: `./scripts/azure-arc/onboard-proxmox-hosts.sh`
- [ ] Verify agent installed: `azcmagent show`
- [ ] Verify connection: Check Azure Portal

**Script:** `scripts/azure-arc/onboard-proxmox-hosts.sh`

#### 5.4 Verify Azure Arc Integration
**Status:** ⏳ PENDING

**Actions:**
- [ ] Verify both servers in Azure Portal: Azure Arc → Servers
- [ ] Check server status (should be "Connected")
- [ ] Verify tags applied correctly
- [ ] Test Azure Policy assignment (if configured)
- [ ] Verify Azure Monitor integration (if configured)

**Reference:** `docs/deployment/azure-arc-onboarding.md`

---

### 6. Cloudflare Configuration

#### 6.1 Configure Cloudflare Credentials
**Status:** ⏳ PENDING

**Actions:**
- [ ] Create Cloudflare API token: https://dash.cloudflare.com/profile/api-tokens
- [ ] Add to `.env` file:
  ```bash
  CLOUDFLARE_API_TOKEN=<your_token>
  CLOUDFLARE_ACCOUNT_EMAIL=<your_email>
  ```
- [ ] Verify credentials not committed to git (check `.gitignore`)
- [ ] Test Cloudflare API access (if script available)

**Reference:** `docs/temporary/DEPLOYMENT_STATUS.md` Section "Cloudflare Configuration Pending"

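
The credential checks above can be sketched without a dedicated script. The helpers below are hypothetical: they read a `KEY=value` file, reject empty or `<placeholder>` values, and show in comments how the token and the ignore rule could be verified with Cloudflare's token verification endpoint and `git check-ignore`. Quoted values and `export` prefixes in `.env` are not handled in this sketch.

```bash
# Sketch: sanity-check Cloudflare entries in a .env file.

# Prints the value of KEY from a KEY=value file (last assignment wins).
env_value() {
  awk -F= -v key="$2" '$1 == key { sub(/^[^=]*=/, ""); val = $0 } END { print val }' "$1"
}

# Succeeds when KEY has a value that is neither empty nor "<...>".
has_real_value() {
  case "$(env_value "$1" "$2")" in
    '' | '<'*'>') return 1 ;;
    *) return 0 ;;
  esac
}

# Usage:
#   has_real_value .env CLOUDFLARE_API_TOKEN || echo "token missing" >&2
#   git check-ignore -q .env || echo ".env is NOT ignored by git" >&2
#   curl -s -H "Authorization: Bearer $(env_value .env CLOUDFLARE_API_TOKEN)" \
#     https://api.cloudflare.com/client/v4/user/tokens/verify
```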
---

## 🟡 Medium Priority: Service Deployment

### 7. VM Template Creation

#### 7.1 Verify/Create Ubuntu 24.04 Template
**Status:** ⏳ PENDING
**Note:** VM 9000 exists on ML110 but may need configuration

**Actions:**
- [ ] Check existing template VM 9000 on ML110
- [ ] Verify template configuration:
  - Cloud-init enabled
  - QEMU agent enabled
  - Proper disk size
  - Network configuration
- [ ] If template needs creation:
  - [ ] Upload Ubuntu 24.04 ISO to Proxmox storage
  - [ ] Create VM from ISO
  - [ ] Install Ubuntu 24.04
  - [ ] Install QEMU guest agent
  - [ ] Install Azure Arc agent (optional, for template)
  - [ ] Configure cloud-init
  - [ ] Convert to template
- [ ] Verify template accessible from both nodes (if clustered)

**Scripts:**
- `scripts/vm-management/create/create-proxmox-template.sh`
- `scripts/vm-management/create/create-template-via-api.sh`

**Reference:** `docs/operations/proxmox-ubuntu-images.md`

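
The template checks above can be read straight from `qm config 9000`. The sketch below scans that output for the three settings that matter most for cloning. `template_problems` is a hypothetical helper, and the key names (`template`, `agent`, and a drive whose volume name contains `cloudinit`) are assumed from typical `qm config` output.

```bash
# Sketch: list what is missing from a template's configuration.
# Reads `qm config <vmid>` text on stdin; prints one line per problem.
template_problems() {
  local config
  config="$(cat)"
  printf '%s\n' "$config" | grep -q '^template: 1' ||
    echo "not marked as template"
  printf '%s\n' "$config" | grep -Eq '^agent: (enabled=)?1' ||
    echo "QEMU guest agent not enabled"
  printf '%s\n' "$config" | grep -Eq '^(ide|scsi|sata)[0-9]+: .*cloudinit' ||
    echo "no cloud-init drive"
}

# Usage on ML110:
#   qm config 9000 | template_problems
# Empty output means all three checks passed.
```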
---

### 8. Service VM Deployment

#### 8.1 Deploy Cloudflare Tunnel VM
**Status:** ⏳ PENDING

**VM Specifications:**
- **VM ID:** 100 (or next available)
- **Name:** cloudflare-tunnel
- **IP:** 192.168.1.60/24
- **Gateway:** 192.168.1.254
- **VLAN:** 99
- **CPU:** 2 cores
- **RAM:** 4GB
- **Disk:** 40GB
- **Template:** ubuntu-24.04-cloudinit

**Actions:**
- [ ] Create VM from template (via Terraform or Proxmox API)
- [ ] Configure network (VLAN 99)
- [ ] Configure IP address (192.168.1.60/24)
- [ ] Start VM
- [ ] Verify VM accessible

**Scripts:**
- Terraform: `terraform/proxmox/`
- API: `scripts/vm-management/create/create-vms-from-template.sh`

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 8.2 Deploy K3s Master VM
**Status:** ⏳ PENDING

**VM Specifications:**
- **VM ID:** 101 (or next available)
- **Name:** k3s-master
- **IP:** 192.168.1.188/24
- **Gateway:** 192.168.1.254
- **VLAN:** 30 (Services)
- **CPU:** 4 cores
- **RAM:** 8GB
- **Disk:** 80GB
- **Template:** ubuntu-24.04-cloudinit

**Actions:**
- [ ] Create VM from template
- [ ] Configure network (VLAN 30)
- [ ] Configure IP address (192.168.1.188/24)
- [ ] Start VM
- [ ] Verify VM accessible

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 8.3 Deploy Git Server VM
**Status:** ⏳ PENDING

**VM Specifications:**
- **VM ID:** 102 (or next available)
- **Name:** git-server
- **IP:** 192.168.1.121/24
- **Gateway:** 192.168.1.254
- **VLAN:** 50 (CI/CD)
- **CPU:** 4 cores
- **RAM:** 8GB
- **Disk:** 100GB
- **Template:** ubuntu-24.04-cloudinit

**Actions:**
- [ ] Create VM from template
- [ ] Configure network (VLAN 50)
- [ ] Configure IP address (192.168.1.121/24)
- [ ] Start VM
- [ ] Verify VM accessible

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 8.4 Deploy Observability VM
**Status:** ⏳ PENDING

**VM Specifications:**
- **VM ID:** 103 (or next available)
- **Name:** observability
- **IP:** 192.168.1.82/24
- **Gateway:** 192.168.1.254
- **VLAN:** 40 (Monitoring)
- **CPU:** 4 cores
- **RAM:** 8GB
- **Disk:** 200GB
- **Template:** ubuntu-24.04-cloudinit

**Actions:**
- [ ] Create VM from template
- [ ] Configure network (VLAN 40)
- [ ] Configure IP address (192.168.1.82/24)
- [ ] Start VM
- [ ] Verify VM accessible

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

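
The four VM specifications above follow the same pattern, so they can be expressed as one table and expanded into `qm` commands. The sketch below is a dry run that only prints the commands. The template ID `9000`, the bridge `vmbr0`, and the disk name `scsi0` are assumptions, and the repository's Terraform or API scripts may do this differently.

```bash
# Sketch: print the qm commands implied by the VM specifications (dry run).
TEMPLATE_ID=9000        # assumption: the template from step 7.1
BRIDGE=vmbr0            # assumption: VLAN-aware bridge name
DISK=scsi0              # assumption: boot disk of the template
GATEWAY=192.168.1.254

# Columns: vmid name ip vlan cores memory_mb disk_gb
VM_TABLE='100 cloudflare-tunnel 192.168.1.60 99 2 4096 40
101 k3s-master 192.168.1.188 30 4 8192 80
102 git-server 192.168.1.121 50 4 8192 100
103 observability 192.168.1.82 40 4 8192 200'

# Prints four commands per VM: clone, configure, resize, start.
plan_vm_commands() {
  printf '%s\n' "$VM_TABLE" |
    while read -r vmid name ip vlan cores mem disk; do
      echo "qm clone $TEMPLATE_ID $vmid --name $name --full"
      echo "qm set $vmid --cores $cores --memory $mem --net0 virtio,bridge=$BRIDGE,tag=$vlan --ipconfig0 ip=$ip/24,gw=$GATEWAY"
      echo "qm resize $vmid $DISK ${disk}G"
      echo "qm start $vmid"
    done
}

# Usage: review the plan first, then run it on a Proxmox node:
#   plan_vm_commands          # inspect
#   plan_vm_commands | sh -e  # execute
```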
---

### 9. OS Installation on VMs

#### 9.1 Install Ubuntu 24.04 on All VMs
**Status:** ⏳ PENDING
**Note:** This requires manual console access

**Actions (for each VM):**
- [ ] Access Proxmox Web UI: https://192.168.1.206:8006 or https://192.168.1.49:8006
- [ ] For each VM (100, 101, 102, 103):
  - [ ] Click on VM → Console
  - [ ] Ubuntu installer should boot from ISO/cloud-init
  - [ ] Complete installation with appropriate IP configuration:
    - **VM 100 (cloudflare-tunnel):** IP: 192.168.1.60/24, Gateway: 192.168.1.254
    - **VM 101 (k3s-master):** IP: 192.168.1.188/24, Gateway: 192.168.1.254
    - **VM 102 (git-server):** IP: 192.168.1.121/24, Gateway: 192.168.1.254
    - **VM 103 (observability):** IP: 192.168.1.82/24, Gateway: 192.168.1.254
  - [ ] Create user account (remember for SSH)
  - [ ] Verify SSH access

**Reference:** `docs/temporary/COMPLETE_STATUS.md` Step 1

#### 9.2 Verify OS Installation
**Status:** ⏳ PENDING

**Actions:**
- [ ] Run VM status check: `./scripts/check-vm-status.sh` (if available)
- [ ] Verify network connectivity from each VM
- [ ] Verify SSH access to each VM
- [ ] Verify Ubuntu 24.04 installed correctly
- [ ] Verify QEMU guest agent working

**Scripts:**
- `scripts/check-vm-status.sh` (if exists)
- `scripts/vm-management/monitor/check-vm-disk-sizes.sh`

---

### 10. Service Configuration

#### 10.1 Configure Cloudflare Tunnel
**Status:** ⏳ PENDING
**VM:** cloudflare-tunnel (192.168.1.60)

**Actions:**
- [ ] SSH to cloudflare-tunnel VM
- [ ] Install cloudflared (writing to `/usr/local/bin` requires root):
  ```bash
  sudo curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared
  sudo chmod +x /usr/local/bin/cloudflared
  ```
- [ ] Authenticate: `cloudflared tunnel login`
- [ ] Create tunnel: `cloudflared tunnel create azure-stack-hci`
- [ ] Configure tunnel routes (see `docs/deployment/cloudflare-integration.md`)
- [ ] Configure tunnel for:
  - Windows Admin Center (if applicable)
  - Proxmox UI
  - Dashboards
  - Git/CI services
- [ ] Set up systemd service for cloudflared
- [ ] Test external access

**Script:** `scripts/setup-cloudflare-tunnel.sh` (if available)
**Reference:** `docs/deployment/cloudflare-integration.md`

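
To make the route configuration above more concrete, the sketch below renders a cloudflared `config.yml` with one ingress rule per service and the catch-all rule that must come last. Everything specific in it is an assumption for illustration: the hostnames, the credentials path, and the service ports (3000 is the default for both Grafana and Gitea). `render_tunnel_config` is a hypothetical helper; the authoritative routes are in `docs/deployment/cloudflare-integration.md`.

```bash
# Sketch: render a cloudflared config for the services in this deployment.
# Arguments: tunnel ID (from `cloudflared tunnel create`), DNS domain.
render_tunnel_config() {
  local tunnel_id="$1" domain="$2"
  cat <<EOF
tunnel: $tunnel_id
credentials-file: /etc/cloudflared/$tunnel_id.json
ingress:
  - hostname: proxmox.$domain
    service: https://192.168.1.206:8006
    originRequest:
      noTLSVerify: true
  - hostname: grafana.$domain
    service: http://192.168.1.82:3000
  - hostname: git.$domain
    service: http://192.168.1.121:3000
  - service: http_status:404
EOF
}

# Usage:
#   render_tunnel_config <tunnel-id> example.com | sudo tee /etc/cloudflared/config.yml
```

The `noTLSVerify` setting is included because the Proxmox UI typically serves a self-signed certificate; drop it once a trusted certificate is installed.
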
#### 10.2 Deploy and Configure K3s
**Status:** ⏳ PENDING
**VM:** k3s-master (192.168.1.188)

**Actions:**
- [ ] SSH to k3s-master VM
- [ ] Install K3s: `curl -sfL https://get.k3s.io | sh -`
- [ ] Verify K3s running: `sudo kubectl get nodes` (the default kubeconfig is readable by root only)
- [ ] Get kubeconfig: `sudo cat /etc/rancher/k3s/k3s.yaml`
- [ ] Configure kubectl access
- [ ] Install required addons (if any)
- [ ] Onboard to Azure Arc (if applicable):
  ```bash
  export RESOURCE_GROUP=HC-Stack
  export CLUSTER_NAME=proxmox-k3s-cluster
  ./infrastructure/kubernetes/arc-onboard-k8s.sh
  ```

**Script:** `scripts/setup-k3s.sh` (if available)
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

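
The kubeconfig that K3s writes points at `https://127.0.0.1:6443`, so it only works on the VM itself. For the "Configure kubectl access" step, a copy used from a workstation needs the server address rewritten. The sketch below does that with `sed`; `remote_kubeconfig` is a hypothetical helper, and the target file name in the usage comment is an assumption.

```bash
# Sketch: rewrite the K3s kubeconfig for use from another machine.
# Reads the kubeconfig on stdin and replaces the loopback API address
# with the given host (default: the k3s-master IP from this document).
remote_kubeconfig() {
  local host="${1:-192.168.1.188}"
  sed "s#https://127\.0\.0\.1:6443#https://${host}:6443#"
}

# Usage from a workstation (replace <user> with the account created in step 9.1):
#   ssh <user>@192.168.1.188 sudo cat /etc/rancher/k3s/k3s.yaml \
#     | remote_kubeconfig > ~/.kube/k3s-config
#   KUBECONFIG=~/.kube/k3s-config kubectl get nodes
```
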
#### 10.3 Set Up Git Server
**Status:** ⏳ PENDING
**VM:** git-server (192.168.1.121)

**Actions:**
- [ ] SSH to git-server VM
- [ ] Choose Git server (Gitea or GitLab CE)
- [ ] Install Git server:
  - **Gitea:** `./infrastructure/gitops/gitea-deploy.sh`
  - **GitLab CE:** `./infrastructure/gitops/gitlab-deploy.sh`
- [ ] Configure Git server:
  - Admin account
  - Repository creation
  - User access
- [ ] Create initial repositories
- [ ] Configure GitOps workflows

**Scripts:**
- `scripts/setup-git-server.sh` (if available)
- `infrastructure/gitops/gitea-deploy.sh`
- `infrastructure/gitops/gitlab-deploy.sh`

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 10.4 Deploy Observability Stack
**Status:** ⏳ PENDING
**VM:** observability (192.168.1.82)

**Actions:**
- [ ] SSH to observability VM
- [ ] Deploy Prometheus:
  - Install Prometheus
  - Configure scrape targets
  - Set up retention policies
- [ ] Deploy Grafana:
  - Install Grafana
  - Configure data sources (Prometheus)
  - Import dashboards
  - Configure authentication
- [ ] Configure monitoring for:
  - Proxmox hosts
  - VMs
  - Kubernetes cluster
  - Network metrics
  - Storage metrics
- [ ] Set up alerting rules

**Script:** `scripts/setup-observability.sh` (if available)
**Reference:** `docs/deployment/bring-up-checklist.md` Phase 8

#### 10.5 Configure GitOps Workflows
**Status:** ⏳ PENDING

**Actions:**
- [ ] Create Git repository in Git server
- [ ] Copy `gitops/` directory to repository
- [ ] Configure Flux or ArgoCD (if applicable)
- [ ] Set up CI/CD pipelines
- [ ] Configure automated deployments
- [ ] Test GitOps workflow

**Reference:** `docs/operations/runbooks/gitops-workflow.md`

---

## 🟢 Low Priority: Optimization & Hardening

### 11. Security Hardening

#### 11.1 Create RBAC Accounts for Proxmox
**Status:** ⏳ PENDING

**Actions:**
- [ ] Review RBAC guide: `docs/security/proxmox-rbac.md`
- [ ] Create service accounts for automation
- [ ] Create operator accounts with appropriate roles
- [ ] Generate API tokens for service accounts
- [ ] Document RBAC account usage
- [ ] Update automation scripts to use API tokens instead of root
- [ ] Test API token authentication
- [ ] Remove or restrict root API access (if desired)

**Reference:** `docs/security/proxmox-rbac.md`

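
For the "Test API token authentication" step, Proxmox VE expects the token in an `Authorization` header of the form `PVEAPIToken=USER@REALM!TOKENID=SECRET`. The sketch below builds that header and shows a read-only test call in a comment. The account name `automation@pve` and token ID `deploy` are hypothetical examples; the accounts and roles to actually create are defined in `docs/security/proxmox-rbac.md`.

```bash
# Sketch: build the Authorization header for a Proxmox API token.
# Arguments: user@realm, token ID, token secret.
api_token_header() {
  printf 'Authorization: PVEAPIToken=%s!%s=%s\n' "$1" "$2" "$3"
}

# Usage (hypothetical account and token names):
#   header="$(api_token_header automation@pve deploy "$PVE_TOKEN_SECRET")"
#   curl -sk -H "$header" https://192.168.1.206:8006/api2/json/version
# The -k flag skips certificate verification and is only appropriate
# while the hosts still use self-signed certificates.
```
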
#### 11.2 Review Firewall Rules
**Status:** ⏳ PENDING

**Actions:**
- [ ] Review firewall configuration on both Proxmox hosts
- [ ] Verify only necessary ports are open
- [ ] Configure firewall rules for cluster communication
- [ ] Document firewall configuration
- [ ] Test firewall rules

#### 11.3 Configure Security Policies
**Status:** ⏳ PENDING

**Actions:**
- [ ] Review Azure Policy assignments
- [ ] Configure security baselines
- [ ] Enable Microsoft Defender for Cloud, formerly Azure Defender (if applicable)
- [ ] Configure update management
- [ ] Review secret management
- [ ] Perform security scan

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 10

---

### 12. Monitoring Setup

#### 12.1 Configure Monitoring Dashboards
**Status:** ⏳ PENDING

**Actions:**
- [ ] Configure Grafana dashboards for:
  - Proxmox hosts
  - VMs
  - Kubernetes cluster
  - Network performance
  - Storage performance
- [ ] Set up Prometheus alerting rules
- [ ] Configure alert notifications
- [ ] Test alerting

#### 12.2 Configure Azure Monitor
**Status:** ⏳ PENDING

**Actions:**
- [ ] Enable Log Analytics workspace
- [ ] Configure data collection rules
- [ ] Set up Azure Monitor alerts
- [ ] Configure log queries
- [ ] Test Azure Monitor integration

**Reference:** `docs/deployment/bring-up-checklist.md` Phase 10

---

### 13. Performance Tuning
**Status:** ⏳ PENDING

**Actions:**
- [ ] Review storage performance
- [ ] Optimize VM resource allocation
- [ ] Tune network settings
- [ ] Optimize Proxmox cluster settings
- [ ] Run performance benchmarks
- [ ] Document performance metrics

---

### 14. Documentation Updates
**Status:** ⏳ PENDING

**Actions:**
- [ ] Update `docs/temporary/COMPLETE_STATUS.md` with actual status
- [ ] Update `docs/temporary/DEPLOYMENT_STATUS.md` with current blockers
- [ ] Update `docs/temporary/NEXT_STEPS.md` with completed items
- [ ] Create runbooks for common operations
- [ ] Document network topology
- [ ] Document storage configuration
- [ ] Create troubleshooting guides

---

## Summary Checklist

### Critical (Must Complete First)
- [ ] Azure subscription verification/enablement
- [ ] Proxmox cluster configuration
- [ ] NFS/shared storage configuration
- [ ] Network/VLAN configuration

### High Priority (Core Infrastructure)
- [ ] Azure Arc onboarding (both servers)
- [ ] Cloudflare credentials configuration

### Medium Priority (Service Deployment)
- [ ] VM template creation/verification
- [ ] Service VM deployment (4 VMs)
- [ ] OS installation on VMs
- [ ] Service configuration (Cloudflare, K3s, Git, Observability)

### Low Priority (Optimization)
- [ ] Security hardening (RBAC, firewalls)
- [ ] Monitoring setup
- [ ] Performance tuning
- [ ] Documentation updates

---

## Estimated Timeline

- **Week 1:** Critical and High Priority items (Infrastructure foundation)
- **Week 2:** Medium Priority items (Service deployment)
- **Week 3-4:** Low Priority items (Optimization and hardening)

**Total Estimated Time:** 3-4 weeks for complete deployment

---

## Quick Reference

### Key Scripts
- Cluster Setup: `infrastructure/proxmox/cluster-setup.sh`
- NFS Storage: `infrastructure/proxmox/nfs-storage.sh`
- VLAN Configuration: `infrastructure/network/configure-proxmox-vlans.sh`
- Azure Arc: `scripts/azure-arc/onboard-proxmox-hosts.sh`
- Health Check: `scripts/health/check-proxmox-health.sh`
- Status Query: `scripts/health/query-proxmox-status.sh`

### Key Documentation
- Status Review: `docs/PROXMOX_STATUS_REVIEW.md`
- Bring-Up Checklist: `docs/deployment/bring-up-checklist.md`
- Azure Arc Onboarding: `docs/deployment/azure-arc-onboarding.md`
- Cloudflare Integration: `docs/deployment/cloudflare-integration.md`
- Proxmox RBAC: `docs/security/proxmox-rbac.md`

### Server Information
- **ML110:** 192.168.1.206:8006
- **R630:** 192.168.1.49:8006
- **Cluster Name:** hc-cluster (to be created)
- **Resource Group:** HC-Stack (to be created)

---

**Last Updated:** 2025-11-27
**Next Review:** After completing Phase 1 (Infrastructure Foundation)