# Network Topology

## Overview

This document describes the network architecture and topology for the Proxmox Azure Arc Hybrid Cloud Stack.

## Network Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                     Internet / Azure Cloud                      │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 │ VPN / Internet
                                 │
┌─────────────────────────────────────────────────────────────────┐
│                       On-Premises Network                       │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Management Network (192.168.1.0/24)                      │  │
│  │                                                           │  │
│  │    ┌──────────────┐          ┌──────────────┐             │  │
│  │    │  PVE Node 1  │          │  PVE Node 2  │             │  │
│  │    │ 192.168.1.10 │          │ 192.168.1.11 │             │  │
│  │    │    vmbr0     │          │    vmbr0     │             │  │
│  │    └──────┬───────┘          └──────┬───────┘             │  │
│  │           │                         │                     │  │
│  │           └────────────┬────────────┘                     │  │
│  │                        │                                  │  │
│  │                  ┌─────▼─────┐                            │  │
│  │                  │  Switch   │                            │  │
│  │                  │ / Router  │                            │  │
│  │                  └───────────┘                            │  │
│  │                        │                                  │  │
│  │            ┌───────────┼───────────┐                      │  │
│  │            │           │           │                      │  │
│  │     ┌──────▼───┐ ┌─────▼────┐  ┌───▼────┐                 │  │
│  │     │  K3s VM  │ │  Git VM  │  │ Other  │                 │  │
│  │     │  .1.188  │ │  .1.60   │  │  VMs   │                 │  │
│  │     └──────────┘ └──────────┘  └────────┘                 │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Storage Network (Optional - 10.0.0.0/24)                 │  │
│  │                                                           │  │
│  │    ┌──────────────┐          ┌──────────────┐             │  │
│  │    │  PVE Node 1  │          │  PVE Node 2  │             │  │
│  │    │    vmbr1     │          │    vmbr1     │             │  │
│  │    │  10.0.0.10   │          │  10.0.0.11   │             │  │
│  │    └──────┬───────┘          └──────┬───────┘             │  │
│  │           │                         │                     │  │
│  │           └────────────┬────────────┘                     │  │
│  │                        │                                  │  │
│  │                  ┌─────▼─────┐                            │  │
│  │                  │    NFS    │                            │  │
│  │                  │  Server   │                            │  │
│  │                  │ 10.0.0.100│                            │  │
│  │                  └───────────┘                            │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Kubernetes Pod Network (10.244.0.0/16)                   │  │
│  │                                                           │  │
│  │  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │  │
│  │  │  Besu Pod    │   │ Firefly Pod  │   │  Chainlink   │   │  │
│  │  │ 10.244.1.10  │   │ 10.244.1.20  │   │ 10.244.1.30  │   │  │
│  │  └──────────────┘   └──────────────┘   └──────────────┘   │  │
│  │                                                           │  │
│  │  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │  │
│  │  │  Blockscout  │   │    Cacti     │   │    NGINX     │   │  │
│  │  │ 10.244.1.40  │   │ 10.244.1.50  │   │ 10.244.1.60  │   │  │
│  │  └──────────────┘   └──────────────┘   └──────────────┘   │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```

## Network Segments

### 1. Management Network (192.168.1.0/24)

**Purpose**: Primary network for Proxmox nodes, VMs, and management traffic

**Components**:
- Proxmox Node 1: `192.168.1.10`
- Proxmox Node 2: `192.168.1.11`
- K3s VM: `192.168.1.188`
- Git Server (Gitea/GitLab): `192.168.1.60`
- Gateway: `192.168.1.1`
- DNS: `192.168.1.1` (or your DNS server)

**Traffic**:
- Proxmox web UI access
- SSH access to nodes and VMs
- Azure Arc agent communication
- Cluster communication (Corosync)
- VM management

**Firewall Rules**:
- Allow: SSH (22), HTTPS (443), Proxmox API (8006)
- Allow: Azure Arc agent ports (outbound)
- Allow: Cluster communication (5404-5412 UDP)

### 2. Storage Network (10.0.0.0/24) - Optional

**Purpose**: Dedicated network for storage traffic (NFS, iSCSI)

**Components**:
- Proxmox Node 1: `10.0.0.10`
- Proxmox Node 2: `10.0.0.11`
- NFS Server: `10.0.0.100`

**Traffic**:
- NFS storage access
- VM disk I/O
- Cluster storage replication

**Benefits**:
- Isolates storage traffic from management
- Reduces network congestion
- Better performance for storage operations

### 3. Kubernetes Pod Network (10.244.0.0/16)

**Purpose**: Internal Kubernetes pod networking (managed by the Flannel CNI). `10.244.0.0/16` is a custom pod CIDR; the K3s default is `10.42.0.0/16`.

**Components**:
- Pod IPs assigned automatically
- Service IPs: `10.43.0.0/16` (K3s default)
- Cluster DNS: `10.43.0.10` (K3s default)

**Traffic**:
- Inter-pod communication
- Service discovery
- Ingress traffic routing

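
The segments above only stay isolated if their address ranges never overlap. The following is a minimal sketch, using Python's standard `ipaddress` module, that checks the CIDRs listed in this document and looks up which segment an address belongs to. The `SEGMENTS` table and both function names are illustrative assumptions, not part of the stack itself.

```python
import ipaddress
from itertools import combinations

# Segment CIDRs as listed in this document (names are illustrative).
SEGMENTS = {
    "management": ipaddress.ip_network("192.168.1.0/24"),
    "storage": ipaddress.ip_network("10.0.0.0/24"),
    "pods": ipaddress.ip_network("10.244.0.0/16"),
    "services": ipaddress.ip_network("10.43.0.0/16"),
}


def overlapping_segments(segments):
    """Return the pairs of segment names whose address ranges overlap."""
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(segments.items(), 2)
        if net_a.overlaps(net_b)
    ]


def segment_of(address, segments=SEGMENTS):
    """Return the name of the segment containing address, or None."""
    ip = ipaddress.ip_address(address)
    for name, net in segments.items():
        if ip in net:
            return name
    return None
```

An empty list from `overlapping_segments(SEGMENTS)` means the plan is free of collisions; `segment_of("192.168.1.188")` should report the management segment for the K3s VM.
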
## Network Configuration

### Proxmox Bridge Configuration

**vmbr0 (Management)**:
```bash
auto vmbr0
iface vmbr0 inet static
    address 192.168.1.10/24
    gateway 192.168.1.1
    bridge-ports eth0
    bridge-stp off
    bridge-fd 0
```

**vmbr1 (Storage - Optional)**:
```bash
auto vmbr1
iface vmbr1 inet static
    address 10.0.0.10/24
    bridge-ports eth1
    bridge-stp off
    bridge-fd 0
```

### Kubernetes Network

**K3s Default Configuration**:
- CNI: Flannel
- Pod CIDR: `10.42.0.0/16`
- Service CIDR: `10.43.0.0/16`
- Cluster DNS: `10.43.0.10`

**Custom Configuration** (if needed):
```yaml
# /etc/rancher/k3s/config.yaml
cluster-cidr: "10.244.0.0/16"
service-cidr: "10.245.0.0/16"
cluster-dns: "10.245.0.10"
```

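
When the CIDRs are customized, the cluster DNS address has to sit inside the service CIDR, and the pod and service ranges should not overlap. A minimal sketch of that sanity check, assuming a hypothetical helper named `check_k3s_network` (it is not part of K3s):

```python
import ipaddress


def check_k3s_network(cluster_cidr, service_cidr, cluster_dns):
    """Return a list of problems found in a K3s CIDR/DNS combination."""
    pods = ipaddress.ip_network(cluster_cidr)
    services = ipaddress.ip_network(service_cidr)
    dns = ipaddress.ip_address(cluster_dns)

    problems = []
    if pods.overlaps(services):
        problems.append("cluster-cidr and service-cidr overlap")
    if dns not in services:
        problems.append("cluster-dns is outside service-cidr")
    return problems


# Values from the custom configuration in this document.
print(check_k3s_network("10.244.0.0/16", "10.245.0.0/16", "10.245.0.10"))
```

An empty list means the combination is consistent. Changing `service-cidr` without also changing `cluster-dns` is the mistake this check is meant to catch.
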
## Port Requirements

### Proxmox Nodes
- **8006**: Proxmox web UI (HTTPS)
- **22**: SSH
- **5404-5412**: Corosync cluster communication (UDP)
- **3128**: SPICE proxy (optional)

### Azure Arc Agents
- **Outbound HTTPS (443)**: Azure Arc connectivity
- **Outbound TCP 443**: Azure Monitor, Azure Policy

### Kubernetes (K3s)
- **6443**: Kubernetes API server
- **10250**: Kubelet API
- **8472**: Flannel VXLAN (UDP)
- **51820-51821**: Flannel WireGuard (UDP)

### Application Services
- **8545**: Besu RPC (HTTP)
- **8546**: Besu RPC (WebSocket)
- **30303**: Besu P2P
- **5000**: Firefly API
- **6688**: Chainlink API
- **4000**: Blockscout
- **80/443**: NGINX Proxy
- **80**: Cacti

### Git Servers
- **3000**: Gitea web UI
- **2222**: Gitea SSH
- **8080**: GitLab web UI
- **2222**: GitLab SSH

## Network Security

### Firewall Recommendations

**Proxmox Nodes**:
```bash
# Allow cluster communication
ufw allow 5404:5412/udp

# Allow Proxmox API
ufw allow 8006/tcp

# Allow SSH
ufw allow 22/tcp
```

**Kubernetes Nodes**:
```bash
# Allow Kubernetes API
ufw allow 6443/tcp

# Allow Flannel networking
ufw allow 8472/udp
ufw allow 51820:51821/udp
```

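
After applying firewall rules, it helps to confirm from another host that the expected TCP ports still answer. The sketch below uses only Python's standard `socket` module; `tcp_port_open` and `check_ports` are hypothetical helper names. It covers TCP only, so the UDP ports (Corosync, Flannel VXLAN, WireGuard) need a different test.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(targets, timeout=2.0):
    """Map each (host, port) pair to its reachability as True or False."""
    return {
        (host, port): tcp_port_open(host, port, timeout)
        for host, port in targets
    }
```

For example, `check_ports([("192.168.1.10", 8006), ("192.168.1.10", 22), ("192.168.1.188", 6443)])` probes the Proxmox web UI, SSH, and the Kubernetes API using the addresses from this document.
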
### Network Policies (Kubernetes)

Example network policy to restrict traffic:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: blockchain-network-policy
  namespace: blockchain
spec:
  podSelector: {}   # applies to every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Accept traffic only from namespaces labeled name=hc-stack
    # (the "name" label must be added to the namespace manually)
    - from:
        - namespaceSelector:
            matchLabels:
              name: hc-stack
  egress:
    # Permit traffic only to namespaces labeled name=blockchain;
    # DNS lookups to kube-system need an additional egress rule
    - to:
        - namespaceSelector:
            matchLabels:
              name: blockchain
```

## DNS Configuration

### Internal DNS

**Hosts File** (for local resolution):
```
192.168.1.188 k3s.local
192.168.1.60 git.local gitea.local
192.168.1.10 pve-node-1.local
192.168.1.11 pve-node-2.local
```

### Service Discovery

**Kubernetes DNS**:
- Service names resolve to cluster IPs
- Format: `<service-name>.<namespace>.svc.cluster.local`
- Example: `besu.blockchain.svc.cluster.local`

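
The naming format above is mechanical enough to build in code. A small sketch, where `service_fqdn` is a hypothetical helper and `cluster.local` is assumed to be the cluster domain, as in the example:

```python
def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name for a Kubernetes Service."""
    for label in (service, namespace):
        # Service and namespace names are single DNS labels, so no dots.
        if not label or "." in label:
            raise ValueError(f"invalid DNS label: {label!r}")
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_fqdn("besu", "blockchain"))
# prints: besu.blockchain.svc.cluster.local
```

Pods in the same namespace can usually use the short service name alone; the full name matters for calls that cross namespaces.
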
## Load Balancing

### NGINX Ingress Controller

- **Type**: LoadBalancer or NodePort
- **Ports**: 80 (HTTP), 443 (HTTPS)
- **Backend Services**: All application services

### Proxmox Load Balancing

- Use Proxmox HA groups to control which nodes each VM runs on (placement and failover, not per-request balancing)
- For per-request balancing, run multiple VMs behind a load balancer

## Network Monitoring

### Tools
- **Cacti**: Network traffic monitoring
- **Azure Monitor**: Network metrics via Azure Arc
- **Kubernetes Metrics**: Pod and service network stats

### Key Metrics
- Bandwidth utilization
- Latency between nodes
- Packet loss
- Connection counts

---

## Azure Stack HCI VLAN Schema

### Overview

The Azure Stack HCI environment segments its network into VLANs for security, isolation, and scalability.

### VLAN Definitions

#### VLAN 10 - Core Storage (10.10.10.0/24)

**Purpose:** Storage network for shelves, NAS services, and backup

**Components:**
- Storage shelves: 10.10.10.2-10.10.10.9
- NAS services: 10.10.10.10
- Backup services: 10.10.10.20
- Router server storage interface: 10.10.10.1

**Traffic:**
- Storage I/O (NFS, SMB, iSCSI)
- Backup operations
- Storage replication

**Firewall Rules:**
- Default: Allow storage protocols
- Restrict: No internet access
- Allow: Compute nodes → Storage

#### VLAN 20 - Compute (10.10.20.0/24)

**Purpose:** Hypervisor traffic, Proxmox migrations, VM management

**Components:**
- Proxmox Node 1 (ML110): 10.10.20.10
- Proxmox Node 2 (R630): 10.10.20.20
- Router server compute interface: 10.10.20.1
- Future compute nodes: 10.10.20.30+

**Traffic:**
- Proxmox cluster communication
- VM migrations
- Hypervisor management
- Storage access (to VLAN 10)

**Firewall Rules:**
- Default: Allow cluster communication
- Allow: Proxmox API (8006)
- Allow: Corosync (5404-5412 UDP)
- Allow: Storage access (VLAN 10)

#### VLAN 30 - App Tier (10.10.30.0/24)

**Purpose:** Web/API services, internal applications

**Components:**
- Web services: 10.10.30.10-10.10.30.30
- API services: 10.10.30.40-10.10.30.50
- Reverse proxy: 10.10.30.10
- Router server app interface: 10.10.30.1

**Traffic:**
- HTTP/HTTPS traffic
- API requests
- Application-to-application communication

**Firewall Rules:**
- Default: Allow HTTP/HTTPS
- Allow: Reverse proxy → Apps
- Allow: Monitoring access (VLAN 40)

#### VLAN 40 - Observability (10.10.40.0/24)

**Purpose:** Monitoring, logging, metrics collection

**Components:**
- Prometheus: 10.10.40.10
- Grafana: 10.10.40.20
- Loki/OpenSearch: 10.10.40.30
- Router server monitoring interface: 10.10.40.1

**Traffic:**
- Metrics collection
- Log aggregation
- Dashboard access
- Alert notifications

**Firewall Rules:**
- Default: Allow monitoring protocols
- Allow: Prometheus scraping
- Allow: Grafana access (from management VLAN)
- Allow: Log collection

#### VLAN 50 - Dev/Test (10.10.50.0/24)

**Purpose:** Lab workloads, development, testing

**Components:**
- Dev VMs: 10.10.50.10-10.10.50.30
- Test VMs: 10.10.50.40-10.10.50.60
- CI/CD services: 10.10.50.70
- Router server dev interface: 10.10.50.1

**Traffic:**
- Development traffic
- Testing operations
- CI/CD pipelines
- Git operations

**Firewall Rules:**
- Default: Restrict to dev/test only
- Allow: Git access
- Allow: CI/CD operations
- Block: Production network access

#### VLAN 60 - Management (10.10.60.0/24)

**Purpose:** WAC, Azure Arc, SSH, hypervisor management

**Components:**
- Router server management interface: 10.10.60.1
- Jump host: 10.10.60.10
- Windows Admin Center: 10.10.60.20
- Azure Arc agents: 10.10.60.30+

**Traffic:**
- Management protocols (SSH, RDP, WAC)
- Azure Arc agent communication
- Administrative access
- System updates

**Firewall Rules:**
- Default: Restrict access
- Allow: SSH (22) from trusted sources
- Allow: WAC (443) from trusted sources
- Allow: Azure Arc outbound (443)
- Block: Inbound from internet

#### VLAN 99 - Utility/DMZ (10.10.99.0/24)

**Purpose:** Proxies, bastions, Cloudflare tunnel hosts

**Components:**
- Cloudflare Tunnel VM: 10.10.99.10
- Reverse proxy: 10.10.99.20
- Bastion host: 10.10.99.30
- Router server DMZ interface: 10.10.99.1

**Traffic:**
- Cloudflare Tunnel outbound (443)
- Reverse proxy traffic
- External access (via Cloudflare)
- DMZ services

**Firewall Rules:**
- Default: Restrict to DMZ only
- Allow: Cloudflare Tunnel outbound (443)
- Allow: Reverse proxy → Internal services
- Block: Direct internet access (except Cloudflare)

### Physical Port Mapping (Router Server)

#### WAN Ports (i350-T4)

- **WAN1:** Spectrum modem/ONT #1 → VLAN untagged
- **WAN2:** Spectrum modem/ONT #2 → VLAN untagged
- **WAN3:** Spectrum modem/ONT #3 → VLAN untagged
- **WAN4:** Spectrum modem/ONT #4 → VLAN untagged

#### 10GbE Ports (X550-T2)

- **10GbE-1:** Reserved for future 10GbE switch or direct server link
- **10GbE-2:** Reserved for future 10GbE switch or direct server link

#### 2.5GbE LAN Ports (i225 Quad-Port)

- **LAN2.5-1:** Direct to HPE ML110 Gen9 → VLAN 20 (compute)
- **LAN2.5-2:** Direct to Dell R630 → VLAN 20 (compute)
- **LAN2.5-3:** Key service #1 → VLAN 30 (app tier)
- **LAN2.5-4:** Key service #2 → VLAN 30 (app tier)

#### 1GbE LAN Ports (i350-T8)

- **LAN1G-1:** Server/appliance #1 → Appropriate VLAN
- **LAN1G-2:** Server/appliance #2 → Appropriate VLAN
- **LAN1G-3:** Server/appliance #3 → Appropriate VLAN
- **LAN1G-4:** Server/appliance #4 → Appropriate VLAN
- **LAN1G-5:** Server/appliance #5 → Appropriate VLAN
- **LAN1G-6:** Server/appliance #6 → Appropriate VLAN
- **LAN1G-7:** Server/appliance #7 → Appropriate VLAN
- **LAN1G-8:** Server/appliance #8 → Appropriate VLAN

### IP Address Allocation Examples

```
VLAN 10 (Storage): 10.10.10.0/24
  - Router: 10.10.10.1
  - NAS: 10.10.10.10
  - Backup: 10.10.10.20

VLAN 20 (Compute): 10.10.20.0/24
  - Router: 10.10.20.1
  - ML110: 10.10.20.10
  - R630: 10.10.20.20

VLAN 30 (App Tier): 10.10.30.0/24
  - Router: 10.10.30.1
  - Reverse Proxy: 10.10.30.10
  - Apps: 10.10.30.20-50

VLAN 40 (Observability): 10.10.40.0/24
  - Router: 10.10.40.1
  - Prometheus: 10.10.40.10
  - Grafana: 10.10.40.20
  - Loki: 10.10.40.30

VLAN 50 (Dev/Test): 10.10.50.0/24
  - Router: 10.10.50.1
  - Dev VMs: 10.10.50.10-30
  - Test VMs: 10.10.50.40-60
  - CI/CD: 10.10.50.70

VLAN 60 (Management): 10.10.60.0/24
  - Router: 10.10.60.1
  - Jump Host: 10.10.60.10
  - WAC: 10.10.60.20
  - Arc Agents: 10.10.60.30+

VLAN 99 (DMZ): 10.10.99.0/24
  - Router: 10.10.99.1
  - Cloudflare Tunnel: 10.10.99.10
  - Reverse Proxy: 10.10.99.20
  - Bastion: 10.10.99.30
```

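
The allocation above follows one pattern: VLAN `N` maps to `10.10.N.0/24`, and the router server takes `.1` in each subnet. The sketch below encodes that pattern with Python's standard `ipaddress` module. The `VLANS` table and the function names are illustrative assumptions, not an existing tool.

```python
import ipaddress

# VLAN IDs and names as defined in this document.
VLANS = {
    10: "Storage",
    20: "Compute",
    30: "App Tier",
    40: "Observability",
    50: "Dev/Test",
    60: "Management",
    99: "DMZ",
}


def vlan_subnet(vlan_id):
    """Return the /24 assigned to a VLAN ID (10.10.<id>.0/24)."""
    if vlan_id not in VLANS:
        raise KeyError(f"unknown VLAN {vlan_id}")
    return ipaddress.ip_network(f"10.10.{vlan_id}.0/24")


def router_address(vlan_id):
    """Return the router interface address: the first host (.1) of the subnet."""
    return vlan_subnet(vlan_id)[1]


def vlan_of(address):
    """Return the VLAN ID whose subnet contains address, or None."""
    ip = ipaddress.ip_address(address)
    for vlan_id in VLANS:
        if ip in vlan_subnet(vlan_id):
            return vlan_id
    return None
```

For instance, `vlan_of("10.10.40.20")` identifies Grafana's address as belonging to VLAN 40, and `router_address(99)` gives the DMZ gateway.
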
### Inter-VLAN Routing

**Default Policy:** Deny all inter-VLAN traffic

**Allowed Routes:**
- Management (60) → All VLANs (administrative access)
- Compute (20) → Storage (10) (storage access)
- App Tier (30) → Storage (10) (application storage)
- Observability (40) → All VLANs (monitoring access)
- DMZ (99) → App Tier (30), Management (60) (reverse proxy access)

**Firewall Rules:**
- Explicit allow rules for required traffic
- Default deny for all other inter-VLAN traffic
- Log all denied traffic for security monitoring

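
The default-deny policy and the allowed routes above can be modeled as a small lookup table. This is a sketch for reasoning about the rule set, not firewall configuration; `ALLOWED_ROUTES` and `is_allowed` are hypothetical names. It assumes each route describes who may open a connection, with return traffic handled by a stateful firewall.

```python
ALL = "all"

# (source VLAN, destination VLAN or ALL), taken from the allowed routes above.
ALLOWED_ROUTES = {
    (60, ALL),  # Management -> all VLANs
    (20, 10),   # Compute -> Storage
    (30, 10),   # App Tier -> Storage
    (40, ALL),  # Observability -> all VLANs
    (99, 30),   # DMZ -> App Tier
    (99, 60),   # DMZ -> Management
}


def is_allowed(src_vlan, dst_vlan):
    """Default deny: permit only routes listed in ALLOWED_ROUTES."""
    if src_vlan == dst_vlan:
        # Traffic inside one VLAN is not routed between VLANs.
        return True
    return (
        (src_vlan, dst_vlan) in ALLOWED_ROUTES
        or (src_vlan, ALL) in ALLOWED_ROUTES
    )
```

Note the asymmetry: `is_allowed(20, 10)` is true because compute may reach storage, while `is_allowed(10, 20)` is false because storage may not initiate connections to compute.
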
### Multi-WAN Configuration

**WAN Interfaces:**
- 4× Spectrum 1Gbps connections via i350-T4
- Each WAN on separate interface (WAN1-4)

**Load Balancing:**
- mwan3 for multi-WAN load balancing
- Per-ISP health checks
- Automatic failover

**Policy Routing:**
- Route specific traffic over specific WANs
- Balance traffic across all WANs
- Failover to remaining WANs if one fails

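
The three policy-routing behaviors above (pinning, balancing, failover) can be illustrated with a short selection function. This is a conceptual sketch only: it is not mwan3 configuration and does not claim to reproduce how mwan3 distributes traffic. `pick_wan`, `flow_key`, and `pinned` are hypothetical names.

```python
import zlib

WANS = ["WAN1", "WAN2", "WAN3", "WAN4"]


def pick_wan(flow_key, healthy, pinned=None):
    """Choose a WAN for a flow.

    flow_key: string identifying the flow, for example "src_ip:dst_ip".
    healthy:  set of WAN names that currently pass their health check.
    pinned:   optional preferred WAN for policy-routed traffic.
    """
    candidates = [wan for wan in WANS if wan in healthy]
    if not candidates:
        return None  # every uplink is down
    if pinned in candidates:
        return pinned  # honor the policy route while that WAN is healthy
    # Stable hash so the same flow maps to the same WAN on every call.
    index = zlib.crc32(flow_key.encode()) % len(candidates)
    return candidates[index]
```

When a pinned WAN fails its health check, the flow falls through to the hash step and lands on one of the remaining healthy uplinks, which is the failover behavior described above.
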