loc_az_hci/docs/temporary/NEXT_STEPS.md
defiQUG c39465c2bd
Initial commit: loc_az_hci (smom-dbis-138 excluded via .gitignore)
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-08 09:04:46 -08:00


# Next Steps - Azure Stack HCI Deployment
## ✅ Completed
- [x] Environment configuration (`.env` file setup)
- [x] Proxmox credential structure (best practices with `PVE_ROOT_PASS`)
- [x] Connection testing script created and verified
- [x] Both Proxmox servers tested and accessible:
  - HPE ML110 Gen9: `192.168.1.206:8006`
  - Dell R630: `192.168.1.49:8006`
- [x] Documentation updated with security best practices
## 🎯 Immediate Next Steps (Priority Order)
### 1. Complete Environment Configuration
**Status**: Partially complete - Proxmox configured, Azure/Cloudflare pending
```bash
# Edit .env file and configure remaining credentials
nano .env # or use your preferred editor
```
**Required:**
- [ ] `AZURE_SUBSCRIPTION_ID` - Get from: `az account show --query id -o tsv`
- [ ] `AZURE_TENANT_ID` - Get from: `az account show --query tenantId -o tsv`
- [ ] `AZURE_RESOURCE_GROUP` - Set to: `HC-Stack` (or your preferred name)
- [ ] `AZURE_LOCATION` - Set to: `eastus` (or your preferred region)
- [ ] `CLOUDFLARE_API_TOKEN` - Create at: https://dash.cloudflare.com/profile/api-tokens
- [ ] `CLOUDFLARE_ACCOUNT_EMAIL` - Your Cloudflare account email
**Verify configuration:**
```bash
# Test Proxmox connections (already working)
./scripts/utils/test-proxmox-connection.sh
# Test Azure CLI connection
az account show
# Verify environment variables loaded
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
echo "Azure Subscription: $AZURE_SUBSCRIPTION_ID"
echo "Azure Tenant: $AZURE_TENANT_ID"
```
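The `echo` lines above only show two of the required values. As a minimal sketch, a small helper can report every required variable that is still unset or empty. The function name `check_required_vars` is hypothetical and not part of this repository, and it relies on bash indirect expansion.

```bash
# Report any required variables that are unset or empty.
# Returns 0 when all are present, 1 otherwise.
# NOTE: check_required_vars is an illustrative helper, not a repository script.
check_required_vars() {
  local missing=0 name
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name
    if [ -z "${!name:-}" ]; then
      echo "MISSING: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (after loading .env):
# check_required_vars AZURE_SUBSCRIPTION_ID AZURE_TENANT_ID \
#   AZURE_RESOURCE_GROUP AZURE_LOCATION \
#   CLOUDFLARE_API_TOKEN CLOUDFLARE_ACCOUNT_EMAIL
```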
### 2. Azure Prerequisites Setup
**Create Azure Resource Group:**
```bash
# Load environment variables
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
# Login to Azure
az login
# Set subscription
az account set --subscription "$AZURE_SUBSCRIPTION_ID"
# Create resource group
az group create \
  --name "$AZURE_RESOURCE_GROUP" \
  --location "$AZURE_LOCATION"
# Verify
az group show --name "$AZURE_RESOURCE_GROUP"
```
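If this step is likely to be re-run, it may help to check for the resource group before creating it. The sketch below uses `az group exists`, which prints `true` or `false`. The wrapper name `ensure_resource_group` is hypothetical.

```bash
# Create the resource group only if it does not exist yet.
# NOTE: ensure_resource_group is an illustrative helper, not a repository script.
ensure_resource_group() {
  local name="$1" location="$2"
  # `az group exists` prints "true" or "false"
  if [ "$(az group exists --name "$name")" = "true" ]; then
    echo "Resource group '$name' already exists"
  else
    az group create --name "$name" --location "$location" --output none
    echo "Resource group '$name' created in $location"
  fi
}

# Usage:
# ensure_resource_group "$AZURE_RESOURCE_GROUP" "$AZURE_LOCATION"
```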
**Verify prerequisites (including Azure CLI):**
```bash
# Check prerequisites
./scripts/utils/prerequisites-check.sh
```
### 3. Proxmox Cluster Configuration
**Current Status**: Both servers are accessible but may not be clustered yet.
**Option A: If servers are already clustered:**
```bash
# Verify cluster status (run on one of the Proxmox hosts)
pvecm status
pvecm nodes
```
**Option B: If servers need to be clustered:**
**On ML110 (192.168.1.206):**
```bash
# SSH to the server
ssh root@192.168.1.206
# Configure network (if needed)
export NODE_IP=192.168.1.206
export NODE_GATEWAY=192.168.1.254 # Adjust based on your network
export NODE_HOSTNAME=pve-ml110
# Run configuration scripts (if available)
# ./infrastructure/proxmox/network-config.sh
# ./infrastructure/proxmox/cluster-setup.sh
```
**On R630 (192.168.1.49):**
```bash
# SSH to the server
ssh root@192.168.1.49
# Configure network (if needed)
export NODE_IP=192.168.1.49
export NODE_GATEWAY=192.168.1.254 # Adjust based on your network
export NODE_HOSTNAME=pve-r630
export CLUSTER_NODE_IP=192.168.1.206
# Run configuration scripts (if available)
# ./infrastructure/proxmox/network-config.sh
# export NODE_ROLE=join
# ./infrastructure/proxmox/cluster-setup.sh
```
**Verify cluster:**
```bash
# From either Proxmox host
pvecm status
pvecm nodes
```
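For scripted checks, the quorum state can be read from the `pvecm status` output instead of by eye. This sketch assumes the output contains a `Quorate: Yes` line, as recent Proxmox VE releases print. Adjust the pattern if your version formats it differently. The function name is hypothetical.

```bash
# Reads `pvecm status` output on stdin; succeeds if the cluster reports quorum.
# ASSUMPTION: the status output contains a line of the form "Quorate: Yes".
cluster_is_quorate() {
  grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Usage on a Proxmox host:
# pvecm status | cluster_is_quorate && echo "cluster has quorum"
```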
### 4. Azure Arc Onboarding
**Onboard Proxmox Hosts to Azure Arc:**
**On ML110:**
```bash
# SSH to ML110
ssh root@192.168.1.206
# The AZURE_* variables must exist on this host first (copy .env over and load it, or set them manually)
export RESOURCE_GROUP="${AZURE_RESOURCE_GROUP:-HC-Stack}"
export TENANT_ID="${AZURE_TENANT_ID}"
export SUBSCRIPTION_ID="${AZURE_SUBSCRIPTION_ID}"
export LOCATION="${AZURE_LOCATION:-eastus}"
export TAGS="type=proxmox,host=ml110"
# Run onboarding script
./scripts/azure-arc/onboard-proxmox-hosts.sh
```
**On R630:**
```bash
# SSH to R630
ssh root@192.168.1.49
# The AZURE_* variables must exist on this host first (copy .env over and load it, or set them manually)
export RESOURCE_GROUP="${AZURE_RESOURCE_GROUP:-HC-Stack}"
export TENANT_ID="${AZURE_TENANT_ID}"
export SUBSCRIPTION_ID="${AZURE_SUBSCRIPTION_ID}"
export LOCATION="${AZURE_LOCATION:-eastus}"
export TAGS="type=proxmox,host=r630"
# Run onboarding script
./scripts/azure-arc/onboard-proxmox-hosts.sh
```
**Verify in Azure Portal:**
- Navigate to: Azure Portal → Azure Arc → Servers
- Both Proxmox hosts should appear as "Connected"
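
The same check can be done from the command line. The sketch below assumes the Azure CLI `connectedmachine` extension is installed (`az extension add --name connectedmachine`). The wrapper name `list_arc_servers` is hypothetical.

```bash
# List Arc-enabled servers in the resource group with their connection status.
# ASSUMPTION: the `connectedmachine` Azure CLI extension is installed.
list_arc_servers() {
  local rg="${1:-${AZURE_RESOURCE_GROUP:-HC-Stack}}"
  az connectedmachine list \
    --resource-group "$rg" \
    --query "[].{name:name, status:status}" \
    --output table
}

# Usage:
# list_arc_servers            # uses AZURE_RESOURCE_GROUP, or HC-Stack if unset
# list_arc_servers my-group   # explicit resource group
```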
### 5. Create Service VMs
**Using Terraform (Recommended):**
```bash
cd terraform/proxmox
# Create terraform.tfvars (the unquoted heredoc expands PVE_ROOT_PASS, so load .env first)
cat > terraform.tfvars <<EOF
proxmox_host = "192.168.1.206" # or 192.168.1.49
proxmox_username = "root@pam"
proxmox_password = "${PVE_ROOT_PASS}"
proxmox_node = "pve" # Adjust based on your node name
EOF
# Initialize and apply
terraform init
terraform plan
terraform apply
```
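Writing the root password into `terraform.tfvars` leaves it on disk in plain text. One alternative, sketched here, is to pass it through the environment: Terraform reads `TF_VAR_<name>` as the value of the input variable `<name>`. This assumes the module declares a `proxmox_password` variable, as the tfvars above suggest. The helper name is hypothetical.

```bash
# Export the Proxmox password for Terraform without writing it to a file.
# ASSUMPTION: the Terraform module declares an input variable `proxmox_password`.
export_tf_secrets() {
  if [ -z "${PVE_ROOT_PASS:-}" ]; then
    echo "PVE_ROOT_PASS is not set; load .env first" >&2
    return 1
  fi
  # Terraform reads TF_VAR_<name> as the value of input variable <name>
  export TF_VAR_proxmox_password="$PVE_ROOT_PASS"
}

# Usage:
# export_tf_secrets && terraform plan
```

With this approach the `proxmox_password` line can be left out of `terraform.tfvars`.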
**Or manually via Proxmox Web UI:**
- Access: `https://192.168.1.206:8006` or `https://192.168.1.49:8006`
- Create VMs for:
  - Kubernetes (K3s)
  - Cloudflare Tunnel
  - Git Server (Gitea/GitLab)
  - Observability (Prometheus/Grafana)
### 6. Cloudflare Tunnel Setup
**Prerequisites:**
- Cloudflare account with Zero Trust enabled
- Ubuntu VM deployed in VLAN 99 (or appropriate network)
**Setup Tunnel:**
```bash
# On Ubuntu Tunnel VM
# Install cloudflared
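# Writing to /usr/local/bin requires root; -f makes curl fail on HTTP errors instead of saving an error page
sudo curl -fsSL https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared
sudo chmod +x /usr/local/bin/cloudflared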
# Authenticate
cloudflared tunnel login
# Create tunnel
cloudflared tunnel create azure-stack-hci
# Configure tunnel (see docs/cloudflare-integration.md)
```
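The integration guide remains the reference for the tunnel configuration. As a rough sketch of its shape, the helper below writes a minimal `config.yml` with one ingress rule plus the catch-all rule that cloudflared expects as the last entry. The hostname, service URL, and function name are hypothetical. The credentials path assumes `cloudflared tunnel login` was run as root.

```bash
# Write a minimal cloudflared config for a single published service.
# ASSUMPTIONS: credentials live under /root/.cloudflared (login run as root);
# hostname and service values are placeholders supplied by the caller.
write_tunnel_config() {
  local tunnel_id="$1" hostname="$2" service="$3" out="$4"
  cat > "$out" <<EOF
tunnel: ${tunnel_id}
credentials-file: /root/.cloudflared/${tunnel_id}.json
ingress:
  - hostname: ${hostname}
    service: ${service}
  # Catch-all rule; the last ingress rule must match all remaining traffic
  - service: http_status:404
EOF
}

# Usage (tunnel ID is printed by `cloudflared tunnel create`):
# write_tunnel_config "<TUNNEL_ID>" grafana.example.com http://10.0.0.10:3000 \
#   /root/.cloudflared/config.yml
```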
**Reference:**
- [Cloudflare Integration Guide](docs/cloudflare-integration.md)
### 7. Kubernetes (K3s) Deployment
**On K3s VM:**
```bash
# Install K3s
./infrastructure/kubernetes/k3s-install.sh
# Onboard to Azure Arc
export RESOURCE_GROUP="${AZURE_RESOURCE_GROUP:-HC-Stack}"
export TENANT_ID="${AZURE_TENANT_ID}"
export SUBSCRIPTION_ID="${AZURE_SUBSCRIPTION_ID}"
export LOCATION="${AZURE_LOCATION:-eastus}"
export CLUSTER_NAME=proxmox-k3s-cluster
./infrastructure/kubernetes/arc-onboard-k8s.sh
```
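After onboarding, the cluster's Arc connection state can be confirmed from a workstation with the Azure CLI. This sketch assumes the `connectedk8s` extension is installed and reads the `connectivityStatus` property of the connected cluster. The function name is hypothetical.

```bash
# Succeeds if the Arc-enabled cluster reports "Connected".
# ASSUMPTION: the `connectedk8s` Azure CLI extension is installed.
arc_cluster_connected() {
  local cluster="${1:-proxmox-k3s-cluster}"
  local rg="${2:-${AZURE_RESOURCE_GROUP:-HC-Stack}}"
  local status
  status="$(az connectedk8s show --name "$cluster" --resource-group "$rg" \
    --query connectivityStatus --output tsv)"
  echo "Arc connectivity for $cluster: ${status:-unknown}"
  [ "$status" = "Connected" ]
}

# Usage:
# arc_cluster_connected proxmox-k3s-cluster "$AZURE_RESOURCE_GROUP"
```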
### 8. GitOps Setup
**Deploy Git Server:**
**Option A: Gitea (Recommended for small deployments):**
```bash
./infrastructure/gitops/gitea-deploy.sh
```
**Option B: GitLab CE:**
```bash
./infrastructure/gitops/gitlab-deploy.sh
```
**Configure GitOps:**
1. Create Git repository in your Git server
2. Copy `gitops/` directory to repository
3. Configure GitOps in Azure Portal or using Flux CLI
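
For step 3, one CLI route is the Azure `k8s-configuration` extension, which creates a Flux configuration on an Arc-enabled cluster. The sketch below is an assumption about how this could look here: the repository URL, the configuration name `cluster-config`, the kustomization name `apps`, and the wrapper function are all hypothetical, and the cluster must be able to reach the Git server.

```bash
# Create a Flux configuration on the Arc-enabled cluster.
# ASSUMPTIONS: the `k8s-configuration` Azure CLI extension is installed;
# names (cluster-config, apps) and the repository URL are placeholders.
create_flux_config() {
  local repo_url="$1" path="${2:-./gitops}"
  az k8s-configuration flux create \
    --resource-group "${AZURE_RESOURCE_GROUP:-HC-Stack}" \
    --cluster-name "${CLUSTER_NAME:-proxmox-k3s-cluster}" \
    --cluster-type connectedClusters \
    --name cluster-config \
    --namespace cluster-config \
    --scope cluster \
    --url "$repo_url" \
    --branch main \
    --kustomization name=apps path="$path" prune=true
}

# Usage:
# create_flux_config https://git.example.internal/ops/gitops.git ./gitops
```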
### 9. Security Hardening
**Create RBAC Accounts for Proxmox:**
```bash
# Follow the guide
cat docs/security/proxmox-rbac.md
# Create service accounts
# Create operator accounts
# Generate API tokens
# Replace root usage in automation
```
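The RBAC guide is the authority for the account layout. As a hedged illustration of the commented steps, the sketch below uses `pveum` to create a service account, grant it a role, and issue an API token. The user name `automation@pve`, the token name `terraform`, and the `PVEVMAdmin` role are assumptions, not values from this repository. Note that `--privsep 0` gives the token the same privileges as its user.

```bash
# Create a service account with an API token for automation (run on a Proxmox host).
# ASSUMPTIONS: user, token, and role names are placeholders; follow the RBAC guide
# for the actual layout and use the narrowest role that works.
create_automation_account() {
  local user="${1:-automation@pve}" token="${2:-terraform}" role="${3:-PVEVMAdmin}"
  pveum user add "$user" --comment "Automation service account"
  pveum acl modify / --users "$user" --roles "$role"
  # Prints the token secret once; store it securely
  pveum user token add "$user" "$token" --privsep 0
}

# Usage:
# create_automation_account automation@pve terraform PVEVMAdmin
```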
**Reference:**
- [Proxmox RBAC Guide](docs/security/proxmox-rbac.md)
### 10. Monitoring and Observability
**Deploy Monitoring Stack:**
```bash
# Deploy via GitOps or manually
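# --create-namespace avoids a failure when the monitoring namespace does not exist yet
helm install prometheus ./gitops/apps/prometheus -n monitoring --create-namespace
helm install grafana ./gitops/apps/grafana -n monitoring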
```
**Configure Azure Monitor:**
- Enable Log Analytics workspace
- Configure data collection rules
- Set up alerting
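
The first bullet can also be done from the CLI. A minimal sketch, assuming a workspace name of `hc-stack-logs` (hypothetical) and the resource group and location already used in this document:

```bash
# Create a Log Analytics workspace in the deployment resource group.
# ASSUMPTION: the workspace name is a placeholder chosen for illustration.
create_log_workspace() {
  local workspace="${1:-hc-stack-logs}"
  local rg="${AZURE_RESOURCE_GROUP:-HC-Stack}"
  local location="${AZURE_LOCATION:-eastus}"
  az monitor log-analytics workspace create \
    --resource-group "$rg" \
    --workspace-name "$workspace" \
    --location "$location"
}

# Usage:
# create_log_workspace hc-stack-logs
```

Data collection rules and alerts depend on what you want to collect, so they are left to the portal or a later step.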
## 📋 Detailed Checklists
For comprehensive step-by-step instructions, refer to:
1. **[Bring-Up Checklist](docs/bring-up-checklist.md)** - Complete day-one installation guide
2. **[Deployment Guide](docs/deployment-guide.md)** - Detailed deployment phases
3. **[Azure Arc Onboarding](docs/azure-arc-onboarding.md)** - Azure integration steps
4. **[Cloudflare Integration](docs/cloudflare-integration.md)** - Secure external access
## 🔧 Useful Commands
**Test Connections:**
```bash
# Test Proxmox connections
./scripts/utils/test-proxmox-connection.sh
# Check prerequisites
./scripts/utils/prerequisites-check.sh
```
**Verify Configuration:**
```bash
# Check .env file
# Note: this prints secrets to the terminal
grep -v '^#' .env | grep -v '^$'
# Verify Azure connection
az account show
# Check Proxmox cluster (from Proxmox host)
pvecm status
```
**Load Environment Variables:**
```bash
# Source .env file
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
```
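The one-liner above sets variables in the current shell but does not export them, so child processes such as the repository scripts may not see them. A possible alternative, assuming `.env` contains only shell-compatible `KEY=value` lines and comments, is to source it with auto-export enabled. The function name `load_env` is hypothetical.

```bash
# Source a .env file and export every variable it assigns.
# ASSUMPTION: the file contains only shell-compatible KEY=value lines and comments.
load_env() {
  local file="${1:-.env}"
  [ -f "$file" ] || { echo "No such file: $file" >&2; return 1; }
  set -a   # export every variable assigned while sourcing
  # shellcheck disable=SC1090
  . "$file"
  set +a
}

# Usage:
# load_env .env && ./scripts/utils/prerequisites-check.sh
```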
## 🚨 Troubleshooting
**If Proxmox connection fails:**
- Verify internal IPs are correct in `.env`
- Check firewall rules for port 8006
- Verify Proxmox services are running
- Test web UI access in browser
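
To separate a network problem from a Proxmox service problem, a plain TCP check against port 8006 can help. This sketch relies on bash's built-in `/dev/tcp` redirection and the coreutils `timeout` command. The function name is hypothetical.

```bash
# Succeeds if a TCP connection to host:port opens within the timeout (seconds).
# Uses bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
check_port() {
  local host="$1" port="$2" wait="${3:-3}"
  timeout "$wait" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Usage:
# check_port 192.168.1.206 8006 && echo "ML110 API port reachable"
# check_port 192.168.1.49 8006  && echo "R630 API port reachable"
```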
**If Azure Arc onboarding fails:**
- Verify Azure CLI is authenticated: `az account show` (run `az login` if it fails)
- Check network connectivity (outbound HTTPS 443)
- Verify resource group exists
- Review agent status and logs: `azcmagent show`, then `journalctl -u himdsd`
**If scripts fail:**
- Ensure `.env` file is properly configured
- Check script permissions: `find scripts -name '*.sh' -exec chmod +x {} +` (a `**` glob only recurses in bash when `globstar` is enabled)
- Verify all prerequisites are installed
## 📚 Documentation Reference
- [Complete Architecture](docs/complete-architecture.md)
- [Network Topology](docs/network-topology.md)
- [Hardware BOM](docs/hardware-bom.md)
- [PCIe Allocation](docs/pcie-allocation.md)
- [Runbooks](docs/runbooks/)
## 🎯 Success Criteria
You'll know you're ready for the next phase when:
- [x] Both Proxmox servers are accessible and tested
- [ ] Azure credentials configured and verified
- [ ] Cloudflare credentials configured
- [ ] Azure resource group created
- [ ] Proxmox cluster configured (if applicable)
- [ ] Azure Arc agents installed on Proxmox hosts
- [ ] Service VMs created
- [ ] Cloudflare Tunnel configured
- [ ] Kubernetes cluster deployed
- [ ] GitOps repository configured
---
**Current Status**: Proxmox environment configuration complete; Azure and Cloudflare credentials still pending.
**Recommended Next Action**: Complete Azure and Cloudflare credential configuration, then proceed with Azure Arc onboarding.