# Complete Azure Stack HCI Architecture

## Overview

This document describes the complete architecture for a local Azure Stack HCI environment with Cloudflare Zero Trust, Azure Arc governance, Proxmox VE virtualization, and Ubuntu service VMs. The system transforms your environment into a local Azure "cloud" using Azure Stack HCI principles.

## Core Objectives

- **Local Azure cloud:** Govern on-prem servers with Azure Arc and adopt Azure operations practices
- **Hyper-converged stack:** Proxmox VE for virtualization, Ubuntu VMs for services, centralized storage via external shelves
- **Secure edge:** Cloudflare Zero Trust/Tunnel to expose services without inbound ports
- **High-availability networking:** 4× 1Gbps Spectrum WAN, multi-WAN failover/policy routing, QAT-accelerated VPN/TLS offload
- **Unified ops:** CI/CD, monitoring, and consistent configuration across all nodes

## Architecture Diagram

```
┌────────────────────────────────────────────────────────────────────┐
│                            Azure Portal                            │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐        │
│  │ Azure Arc      │  │ Azure Policy   │  │ Azure Monitor  │        │
│  │ Servers        │  │                │  │                │        │
│  └────────────────┘  └────────────────┘  └────────────────┘        │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐        │
│  │ Arc K8s        │  │ GitOps         │  │ Defender       │        │
│  │                │  │ (Flux)         │  │ for Cloud      │        │
│  └────────────────┘  └────────────────┘  └────────────────┘        │
└────────────────────────────────────────────────────────────────────┘
                                   │
                                   │ Azure Arc Connection
                                   │
┌────────────────────────────────────────────────────────────────────┐
│                     On-Premises Infrastructure                     │
│                                                                    │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │           Router/Switch/Storage Controller Server            │  │
│  │  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐  │  │
│  │  │ Windows Server │  │ OpenWrt VM     │  │ Storage S2D    │  │  │
│  │  │ Core + Hyper-V │  │ (mwan3)        │  │ Pools          │  │  │
│  │  │                │  │                │  │                │  │  │
│  │  │ Azure Arc      │  │ 4× WAN         │  │ 4× Shelves     │  │  │
│  │  │ Agent          │  │ (Spectrum)     │  │ (via LSI HBAs) │  │  │
│  │  └────────────────┘  └────────────────┘  └────────────────┘  │  │
│  │          │                   │                   │           │  │
│  └──────────┼───────────────────┼───────────────────┼───────────┘  │
│             │                   │                   │              │
│  ┌──────────▼───────────────────▼───────────────────▼───────────┐  │
│  │                 Proxmox VE Hosts (Existing)                  │  │
│  │  ┌────────────────┐  ┌────────────────┐                      │  │
│  │  │ HPE ML110      │  │ Dell R630      │                      │  │
│  │  │ Gen9           │  │                │                      │  │
│  │  │                │  │                │                      │  │
│  │  │ Azure Arc      │  │ Azure Arc      │                      │  │
│  │  │ Agent          │  │ Agent          │                      │  │
│  │  └────────────────┘  └────────────────┘                      │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                                                                    │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │                      Ubuntu Service VMs                      │  │
│  │  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐  │  │
│  │  │ Cloudflare     │  │ Reverse        │  │ Observability  │  │  │
│  │  │ Tunnel VM      │  │ Proxy VM       │  │ VM             │  │  │
│  │  │                │  │                │  │                │  │  │
│  │  │ Azure Arc      │  │ Azure Arc      │  │ Azure Arc      │  │  │
│  │  │ Agent          │  │ Agent          │  │ Agent          │  │  │
│  │  └────────────────┘  └────────────────┘  └────────────────┘  │  │
│  │  ┌────────────────┐                                          │  │
│  │  │ CI/CD VM       │                                          │  │
│  │  │                │                                          │  │
│  │  │ Azure Arc      │                                          │  │
│  │  │ Agent          │                                          │  │
│  │  └────────────────┘                                          │  │
│  └──────────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────┘
                                   │
                                   │ Cloudflare Tunnel (Outbound Only)
                                   │
┌────────────────────────────────────────────────────────────────────┐
│                       Cloudflare Zero Trust                        │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐        │
│  │ Zero Trust     │  │ WAF            │  │ Tunnel         │        │
│  │ Policies       │  │ Rules          │  │ Endpoints      │        │
│  └────────────────┘  └────────────────┘  └────────────────┘        │
└────────────────────────────────────────────────────────────────────┘
```

## Physical Infrastructure

### Router/Switch/Storage Controller Server (New)

- **Chassis:** Entry-level Supermicro/Dell mini-server
- **CPU:** Intel Xeon E-2100 or similar (6-8 cores), PCIe 3.0 support
- **Memory:** 8× 4GB DDR4 ECC RDIMM = 32GB (reused from R630)
- **Storage:** 256GB SSD (OS, configs), optional mirrored boot
- **PCIe Cards:**
  - Intel i350-T4: 4× 1GbE (WAN - Spectrum connections)
  - Intel X550-T2:
    2× 10GbE RJ45 (future uplinks or high-performance server links)
  - Intel i225 Quad-Port: 4× 2.5GbE (LAN to key servers)
  - Intel i350-T8: 8× 1GbE (LAN to remaining servers)
  - Intel QAT 8970: Crypto acceleration (TLS/IPsec/compression)
  - 2× LSI 9207-8e: SAS HBAs for 4 external shelves

### Proxmox VE Hosts (Existing)

- **HPE ProLiant ML110 Gen9:**
  - CPU: Intel Xeon E5-series
  - Memory: Remaining DDR4 ECC RDIMM after Router allocation
  - Storage: Local SSDs/HDDs for OS and VM disks
  - Networking: 1GbE onboard NICs; optional Intel add-in NICs
- **Dell PowerEdge R630:**
  - CPU: Intel Xeon E5 v3/v4 dual-socket
  - Memory: Remaining DDR4 ECC RDIMM (32GB spare pool noted)
  - Storage: PERC or HBA with SSDs
  - Networking: 1/10GbE depending on NICs installed

### Storage Shelves

- **Quantity:** 4 external SAS JBOD shelves
- **Connectivity:** Each shelf connects to the LSI HBAs over external SAS; the 9207-8e exposes SFF-8088 ports, so shelves with SFF-8644 ports need SFF-8088-to-SFF-8644 cables; dual-pathing optional
- **Role:** Backing storage for VMs, Kubernetes PVCs, and NAS services

### WAN Connectivity

- **Providers:** 4× Spectrum Internet 1Gbps
- **Termination:** i350-T4 on Router server
- **Routing:** Multi-WAN policy routing and failover; per-ISP health checks

## Software Stack

### Router Server

- **Base OS:** Windows Server Core with Hyper-V (for HCI integration) or Proxmox VE (uniform virtualization)
- **Network Services:**
  - OpenWrt VM: Multi-WAN (mwan3), firewall, VLANs, policy routing
  - Intel PROSet drivers for all NICs
  - QAT drivers/qatlib + OpenSSL QAT engine
- **Storage Services:**
  - LSI HBAs: IT mode, mpt3sas driver, attach shelves
  - Storage Spaces Direct: Pools/volumes for VM and app storage
  - Optional ZFS on Linux (VM or host) for NAS
- **Management:**
  - Windows Admin Center (WAC): Cluster lifecycle, health
  - Azure Arc agent: Connected Machine agent on Linux VMs/hosts

### Proxmox VE (ML110, R630)

- **Hypervisor:** Latest Proxmox VE
- **Guests:** Ubuntu LTS for app services, Cloudflare Tunnel endpoints, monitoring, logging, Arc agents
- **Storage:** Connect to shelves via exported protocols
  (NFS/iSCSI) or pass-through HBAs/volumes
- **Networking:** Tag VLANs per VM bridge; allocate vNICs tied to VLAN schema

### Ubuntu Service VMs

- **Cloudflare Tunnel (Zero Trust):** `cloudflared` to publish internal apps (WAC, dashboards, SSH, selected services) without inbound ports
- **Azure Arc agent:** Connected Machine agent to enroll Linux VMs and hosts for Azure Policy, Monitor, Defender, and Update Manager
- **Observability:** Prometheus, Grafana, Loki/OpenSearch for logs; syslog from Router and Proxmox nodes
- **Reverse proxy:** NGINX/Traefik with mTLS, integrated behind Cloudflare
- **Automation/CI:** GitLab Runner/Jenkins agents for local CI/CD pipelines

## Key Integrations

### Cloudflare

- **Zero Trust/Tunnel:** Use `cloudflared` on an Ubuntu VM in VLAN 99 to expose:
  - Management portals: WAC, Proxmox UI, dashboards (restrict via SSO/MFA)
  - Developer services: Git, CI, internal APIs
- **Policies:** SSO (Azure AD/Okta), device posture checks, least privilege
- **WAF and routing:** Protect public ingress; no inbound ports on Spectrum WAN CPE

### Azure Arc

- **Targets:** Ubuntu service VMs, optionally Proxmox hosts (as Linux), and Windows management VM
- **Process:** Install Connected Machine agent; validate Arc connection; enable Azure Policy, Monitor, Defender, and Update Manager
- **Proxy considerations:** If outbound traffic is restricted, onboard the Connected Machine agent through a proxy; Microsoft documents the supported proxy onboarding methods

## High-Level Data Flows

- **North-south:** 4× Spectrum WAN → Router (OpenWrt VM) → Cloudflare Tunnel, outbound only, for published services
- **East-west:** VLAN-segmented traffic across Proxmox nodes, Ubuntu VMs, storage shelves; QAT accelerates crypto within Router server for site-to-site VPN if needed
- **Storage:** Router server's HBAs → shelves; exports (NFS/SMB/iSCSI) → Proxmox/Ubuntu VMs

## Security Model

- **Perimeter:** No inbound ports; Cloudflare Tunnel + Zero Trust policies
- **Identity:** SSO + MFA for management; role-based access
- **Network:** Inter-VLAN default deny; explicit allow for
  app→storage, monitoring→inbound
- **Supply chain:** Signed commits/artifacts; secret vault (no secrets in repos)
- **Azure governance:** Policies for baseline configuration and updates via Arc

## Milestones for Success

1. **Foundation** - Hardware ready, base software installed
2. **Infrastructure Automation** - Azure Arc agents installed, storage configured
3. **Networking and Storage Services** - OpenWrt VM with multi-WAN, VLAN segmentation, storage exports
4. **VM and Platform** - Ubuntu VMs deployed, Proxmox bridges mapped to VLANs
5. **Secure External Access and Governance** - Cloudflare Tunnel published, Azure governance via Arc
6. **Operations and Continuous Improvement** - Observability dashboards live, runbooks documented

## Related Documentation

- [Hardware BOM](hardware-bom.md) - Complete bill of materials
- [PCIe Allocation](pcie-allocation.md) - Slot allocation map
- [Network Topology](network-topology.md) - VLAN/IP schema and routing
- [Cloudflare Integration](cloudflare-integration.md) - Tunnel and Zero Trust setup
- [Azure Arc Onboarding](azure-arc-onboarding.md) - Agent installation and governance
- [Bring-Up Checklist](bring-up-checklist.md) - Day-one installation guide
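
## Illustrative Sketch: Multi-WAN Failover

The multi-WAN policy routing with per-ISP health checks listed under WAN Connectivity can be pictured with a small model. The Python sketch below is only an illustration of the idea: traffic is shared across the links whose last health check passed, and a failed link drops out. It is not how mwan3 is implemented or configured. The `WanLink` type, the `traffic_shares` function, the link names `wan1` to `wan4`, and the equal weights are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WanLink:
    """One WAN uplink. Names and weights are hypothetical."""
    name: str
    weight: int    # relative share of traffic while the link is healthy
    healthy: bool  # outcome of the most recent per-ISP health check


def traffic_shares(links: list[WanLink]) -> dict[str, float]:
    """Split traffic across healthy links in proportion to their weights.

    Unhealthy links receive no share, which models failover. An empty
    dict means no uplink is currently usable.
    """
    up = [link for link in links if link.healthy]
    total = sum(link.weight for link in up)
    if total == 0:
        return {}
    return {link.name: link.weight / total for link in up}


if __name__ == "__main__":
    # Four equal-weight uplinks, with wan3 simulated as failing its check.
    wans = [
        WanLink("wan1", weight=1, healthy=True),
        WanLink("wan2", weight=1, healthy=True),
        WanLink("wan3", weight=1, healthy=False),
        WanLink("wan4", weight=1, healthy=True),
    ]
    for name, share in traffic_shares(wans).items():
        print(f"{name}: {share:.0%}")
```

In this model a recovered link rejoins the pool as soon as its `healthy` flag flips back; a real deployment would likely add hysteresis so a flapping link does not churn routes.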
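
## Illustrative Sketch: Inter-VLAN Default Deny

The Security Model's network rule (inter-VLAN default deny with explicit allows) amounts to an allow-list lookup. The Python sketch below shows that logic under stated assumptions: the zone names `app`, `storage`, and `monitoring` are hypothetical placeholders for the VLAN schema, and "monitoring→inbound" is read here as "the monitoring zone may initiate connections into any other zone", which is an interpretation rather than something the source spells out. It is a model of the policy, not firewall configuration.

```python
# Explicit (source zone, destination zone) pairs that may talk.
# Zone names are hypothetical; the real schema belongs to the network plan.
ALLOWED_FLOWS: set[tuple[str, str]] = {
    ("app", "storage"),
}

# Zones allowed to initiate connections into every other zone.
# Assumption: this is what "monitoring→inbound" means.
ALLOWED_TO_ANY: set[str] = {"monitoring"}


def is_allowed(src_zone: str, dst_zone: str) -> bool:
    """Return True if a new connection from src_zone to dst_zone is permitted."""
    if src_zone == dst_zone:
        # Same-VLAN traffic is outside the inter-VLAN policy.
        return True
    if src_zone in ALLOWED_TO_ANY:
        return True
    # Default deny: anything not explicitly listed is rejected.
    return (src_zone, dst_zone) in ALLOWED_FLOWS


if __name__ == "__main__":
    for src, dst in [("app", "storage"), ("storage", "app"), ("monitoring", "app")]:
        verdict = "allow" if is_allowed(src, dst) else "deny"
        print(f"{src} -> {dst}: {verdict}")
```

The rules are directional: allowing `app` to reach `storage` does not allow `storage` to open connections back to `app`, which matches how a stateful firewall treats the initiating side.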