Technical Deep Dive: Management Access, Data Access, and the Chisel Tunnel
This document explains one of the most important ideas in the whole platform: the difference between managing an Azure resource and accessing the private data inside that resource. It then shows why some Terraform operations fail from a public automation runner and how the Chisel tunnel solves that problem.
Table of Contents
- Control Plane vs Data Plane: The Fundamental Divide
- Key Vault Operations: What Requires Data Plane Access
- Terraform + Key Vault: When Operations Fail
- Chisel: How the Tunnel Works
- Deployment Scenarios: Who Needs What
1. Control Plane vs Data Plane: The Fundamental Divide
What is the Control Plane?
The control plane is Azure's management layer. It is the part of the platform that lets you create, update, delete, and describe resources:
- Creating, updating, deleting resources
- Reading metadata and configuration
- Managing RBAC and policies
Endpoint: management.azure.com
Authentication: OpenID Connect, often shortened to OIDC, works well here because these calls go to Azure's public management endpoint.
What is the Data Plane?
The data plane is where real data access happens. This is the layer you touch when you read, write, upload, download, or query actual tenant or platform data:
- Reading/writing secrets from Key Vault
- Accessing storage blobs
- Querying databases
- Any operation that touches actual data
Endpoint: Service-specific (e.g., *.vault.azure.net, *.blob.core.windows.net)
Authentication: A valid token is not enough on its own. When private endpoints are enabled, authentication can succeed while the request still fails, because the caller has no network path to the service endpoint.
Why the Divide?
Azure separates these two layers for security and architectural reasons:
- Control plane can use service principals and managed identities with OIDC
- Data plane requires direct network access to the service endpoint
- Private endpoints block public access to data plane endpoints
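The divide shows up directly in Terraform. A minimal sketch (resource and vault names are illustrative): the first data source only talks to management.azure.com and works from a public runner, while the second must reach the vault's own hostname.

```hcl
// Control plane: reads vault metadata via management.azure.com.
// Works from a public runner with OIDC alone.
data "azurerm_key_vault" "meta" {
  name                = "myvault"
  resource_group_name = "my-rg"
}

// Data plane: reads the secret value via myvault.vault.azure.net.
// Fails from a public runner once private endpoints are enabled.
data "azurerm_key_vault_secret" "value" {
  name         = "my-secret"
  key_vault_id = data.azurerm_key_vault.meta.id
}
```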
2. Key Vault Operations: What Requires Data Plane Access
Key Vault Endpoints
- Control Plane: management.azure.com, used to create the vault, assign permissions, and read configuration details.
- Data Plane: *.vault.azure.net, used to read and write secrets, keys, and certificates.
Operations That Require Data Plane Access
Reading Secrets
// This REQUIRES data plane access
data "azurerm_key_vault_secret" "example" {
  name         = "my-secret"
  key_vault_id = "/subscriptions/.../keyvaults/myvault"
}
// When private endpoints are enabled:
// ✗ FAILS: Cannot reach myvault.vault.azure.net from GitHub runner
// ✓ WORKS: Can reach myvault.vault.azure.net from within VNet or via Chisel
Writing Secrets
// This REQUIRES data plane access
resource "azurerm_key_vault_secret" "example" {
  name         = "my-secret"
  value        = "secret-value"
  key_vault_id = azurerm_key_vault.main.id

  // Terraform must connect to: myvault.vault.azure.net
  // When private endpoints are enabled:
  // ✗ FAILS: GitHub runner cannot reach myvault.vault.azure.net
  // ✓ WORKS: Chisel tunnel provides access to myvault.vault.azure.net
}
Why These Fail Even When OpenID Connect Works
GitHub Runner (OIDC Authentication)
↓
Azure AD (Token: aud=management.azure.com)
↓
Azure ARM API (management.azure.com) ✓ WORKS
↓
Terraform creates resource
↓
Terraform tries to write secret
↓
needs to access: myvault.vault.azure.net ✗ BLOCKED
↑
Private endpoint blocks public access
GitHub runner has no route to VNet
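The control-plane half of this flow is what Terraform's azurerm provider configures with a single flag. A sketch, assuming the usual ARM_CLIENT_ID / ARM_TENANT_ID / ARM_SUBSCRIPTION_ID environment variables are set by the workflow:

```hcl
provider "azurerm" {
  features {}

  // Exchange the GitHub Actions OIDC token for an Azure AD token.
  // The resulting token targets management.azure.com: control plane only.
  // It does not create a network path to *.vault.azure.net.
  use_oidc = true
}
```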
3. Terraform + Key Vault: When Operations Fail
The following examples show how Terraform interacts with Key Vault. In this repository, the real implementation is spread across infra-ai-hub/stacks/ rather than one single main.tf file, but the same networking rule still applies.
Creating Key Vault (Control Plane - Works)
resource "azurerm_key_vault" "main" {
  name                = var.key_vault_name
  location            = var.location
  resource_group_name = azurerm_resource_group.main.name
}
// ✓ WORKS with OIDC
// Uses: management.azure.com
// No private endpoint yet, so no blocking
Creating Private Endpoint (Control Plane - Works)
resource "azurerm_private_endpoint" "key_vault_pe" {
  name                = "${var.app_name}-kv-pe"
  location            = var.location
  resource_group_name = azurerm_resource_group.main.name
  subnet_id           = module.network.private_endpoint_subnet_id

  private_service_connection {
    name                           = "${var.app_name}-kv-psc"
    private_connection_resource_id = azurerm_key_vault.main.id
    is_manual_connection           = false
    subresource_names              = ["vault"]
  }
}
// ✓ WORKS with OIDC
// Uses: management.azure.com
// This BLOCKS public access to *.vault.azure.net
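Strictly speaking, the private endpoint adds a private path; it is the vault-level firewall that denies the public one. A hedged sketch of how the vault definition above is typically extended (this repository's exact settings may differ):

```hcl
resource "azurerm_key_vault" "main" {
  name                = var.key_vault_name
  location            = var.location
  resource_group_name = azurerm_resource_group.main.name
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "standard"

  // Deny public data plane traffic; only the private endpoint path remains.
  network_acls {
    default_action = "Deny"
    bypass         = "AzureServices"
  }
}
```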
Waiting for DNS (DNS Propagation Workaround)
resource "null_resource" "wait_for_key_vault_private_dns" {
  triggers = {
    private_endpoint_id = azurerm_private_endpoint.key_vault_pe.id
    // ... other values
  }

  provisioner "local-exec" {
    command = <<-EOT
      # Wait for DNS to be ready before trying data plane operations.
      # Azure creates private DNS records asynchronously, so poll until
      # the vault hostname resolves.
      until nslookup ${azurerm_key_vault.main.name}.vault.azure.net > /dev/null 2>&1; do
        echo "Waiting for private DNS..."
        sleep 10
      done
    EOT
  }

  depends_on = [azurerm_private_endpoint.key_vault_pe]
}
// Why necessary?
// After private endpoint creation, Azure asynchronously:
// 1. Creates private DNS zone: privatelink.vaultcore.azure.net
// 2. Links it to the VNet
// 3. Creates A records for the private endpoint
// 4. This can take 30-120 seconds
//
// If we try to create secrets immediately:
// ✗ FAILS: Cannot resolve myvault.vault.azure.net
// ✓ WORKS: After DNS is ready, myvault.vault.azure.net resolves to private IP
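An alternative to polling is a fixed delay via the hashicorp/time provider; a sketch, not the approach this repository uses:

```hcl
// Fixed delay instead of a DNS poll. Simpler, but wastes time when DNS
// is ready early and can still be too short on a slow propagation.
resource "time_sleep" "wait_for_private_dns" {
  depends_on      = [azurerm_private_endpoint.key_vault_pe]
  create_duration = "120s"
}
```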
Creating Secrets (Data Plane - FAILS without Chisel)
resource "azurerm_key_vault_secret" "secret_one" {
  name            = "example-secret-test-one"
  value           = random_password.secret_one.result
  key_vault_id    = azurerm_key_vault.main.id
  expiration_date = "2025-12-31T23:59:59Z"

  depends_on = [null_resource.wait_for_key_vault_private_dns]
}

resource "azurerm_key_vault_secret" "secret_two" {
  name            = "example-secret-test-two"
  value           = random_password.secret_two.result
  key_vault_id    = azurerm_key_vault.main.id
  expiration_date = "2025-12-31T23:59:59Z"

  depends_on = [null_resource.wait_for_key_vault_private_dns]
}
// ✗ FAILS with OIDC + Private Endpoints
// Terraform tries to connect to: myvault.vault.azure.net
// GitHub runner has no route to the private IP
//
// ✓ WORKS with:
// 1. Self-hosted runner inside VNet
// 2. Bastion + Jumpbox (VPN to VNet)
// 3. Chisel tunnel (App Service → VNet → *.vault.azure.net)
4. Chisel: How the Tunnel Works
The Problem Chisel Solves
BEFORE Chisel:
┌──────────────┐ ┌──────────────┐
│ GitHub Runner │ ──OIDC──> │ Azure ARM API │
│ (Public) │ │ mgmt.azure.com│
└──────────────┘ └──────┬───────┘
│
│ Creates resources
↓
┌──────────────┐ ┌──────▼───────┐
│ GitHub Runner │ │ Private │
│ ✗ No route │ │ Endpoint │
│ to VNet │ │ *.vault.azure.net │
└──────────────┘ │ BLOCKED │
└──────────────┘
AFTER Chisel:
┌──────────────┐ ┌──────────────┐
│ GitHub Runner │ ──OIDC──> │ Azure ARM API │
│ (Public) │ │ mgmt.azure.com│
└──────┬───────┘ └──────┬───────┘
│ │
│ Creates │
↓ │
┌──────▼──────────┐ ┌─────▼──────┐
│ App Service │ │ Private │
│ (Chisel Server) │────────> │ Endpoint │
│ VNet Integrated │ │ *.vault.azure.net │
└──────┬──────────┘ │ ACCESSIBLE │
│ │
│ Chisel tunnel │
↓ (HTTPS/WebSocket) │
┌──────▼──────────┐ │
│ Local Machine │ ←────────────┘
│ (Chisel Client) │
└─────────────────┘
How Chisel Works (Technical Deep Dive)
Step 1: App Service Deployment (Control Plane - Works with OIDC)
From initial-setup/infra/modules/azure-proxy/main.tf:
resource "azurerm_linux_web_app" "azure_proxy" {
  name                = "${var.app_name}-${var.app_env}-azure-proxy-${random_string.proxy_dns_suffix.result}"
  resource_group_name = var.resource_group_name
  location            = var.location
  service_plan_id     = azurerm_service_plan.azure_proxy_asp.id

  // ✓ KEY: VNet Integration
  virtual_network_subnet_id = var.app_service_subnet_id

  // ✓ KEY: Managed Identity for Docker pulls
  identity {
    type = "SystemAssigned"
  }

  site_config {
    application_stack {
      docker_image_name   = var.azure_proxy_image
      docker_registry_url = var.container_registry_url
    }
  }

  app_settings = {
    // ✓ KEY: Chisel authentication
    CHISEL_AUTH = "${random_uuid.proxy_chisel_username.result}:${random_password.proxy_chisel_password.result}"
  }
}
// ✓ This WORKS with OIDC because:
// 1. Creates App Service (management.azure.com) ✓
// 2. Pulls Docker image (docker.io) ✓
// 3. Deploys to App Service (management.azure.com) ✓
// 4. Integrates with VNet (management.azure.com) ✓
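For the managed-identity image pulls to work, the App Service identity typically also needs pull rights on the container registry. A hedged sketch; var.container_registry_id and the resource name are assumptions, not copied from the repository:

```hcl
// Hypothetical: grant the proxy's system-assigned identity AcrPull on
// the registry referenced by var.container_registry_id (assumed name).
resource "azurerm_role_assignment" "proxy_acr_pull" {
  scope                = var.container_registry_id
  role_definition_name = "AcrPull"
  principal_id         = azurerm_linux_web_app.azure_proxy.identity[0].principal_id
}
```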
Step 2: Chisel Server Runs in App Service
From azure-proxy/chisel/start-chisel.sh:
#!/bin/sh
set -e
# Chisel runs as a server inside the App Service
# Listens on port 80 (configured by PORT env var)
chisel server \
  --port "$PORT" \
  --auth "$CHISEL_AUTH" \
  --socks5 \
  --reverse
# What this does:
# --port 80: Listen on HTTP (App Service handles HTTPS termination)
# --auth: Require authentication (prevents unauthorized access)
# --socks5: Enable SOCKS5 proxy (for general traffic)
# --reverse: Enable reverse tunneling (clients can expose local ports)
Step 3: Local Machine Connects
From azure-proxy/chisel/README.md:
# Developer runs this on their laptop:
docker run --rm -it -p 5462:5432 jpillora/chisel:latest client \
  --auth "tunnel:XXXXXXX" \
  https://${azure-proxy-app-service-url} \
  0.0.0.0:5432:${postgres_hostname}:5432
# What happens:
# 1. Chisel client on laptop connects to Chisel server in App Service
# 2. Connection goes: laptop → Internet → App Service (public)
# 3. App Service is VNet-integrated
# 4. Chisel forwards: laptop:5462 → App Service → VNet → postgres:5432
Step 4: Traffic Flow with Private Endpoints
┌──────────────┐
│ Laptop │
│ Chisel Client│
│ Port 5432 │
└──────┬───────┘
│
│ 1. Connect to App Service
│ HTTPS / WebSocket
↓
┌──────▼──────────┐
│ App Service │
│ (Chisel Server) │
│ Public endpoint │
│ VNet integrated │
└──────┬──────────┘
│
│ 2. Inside VNet
│ Private IPs
↓
┌──────▼──────────────────┐
│ Private Endpoint │
│ Subnet │
│ Routes to: │
│ myvault.vault.azure.net │
└──────┬──────────────────┘
│
│ 3. Data plane access
│ *.vault.azure.net
↓
┌──────▼──────────┐
│ Key Vault │
│ Data operations │
│ - Get secrets │
│ - Set secrets │
│ - List secrets │
└─────────────────┘
✓ SUCCESS: Terraform can read/write secrets!
Why Chisel vs Alternatives?
| Method | Cost | Setup | Use Case | Limitations |
|---|---|---|---|---|
| Bastion + Jumpbox | $200-300/mo | High | Admin access, full VM | Expensive, overkill for single developer |
| Self-hosted Runner | $100/mo | Medium | CI/CD with secrets | Always running, needs management |
| Chisel + App Service | $15/mo | Low | Local dev, on-demand access | Manual connection needed |
5. Deployment Scenarios: Who Needs What
Scenario 1: Platform Team (Deploys Landing Zone)
Who: Platform Services Team, Infrastructure Admins
What they deploy:
- ✓ Network (VNets, subnets, NSGs)
- ✓ Key Vault + Private Endpoints
- ✓ App Service Plans + Azure Proxy (Chisel)
- ✓ Bastion + Jumpbox (optional)
- ✓ Self-hosted runners (optional)
- ✓ API Management, App Gateway
- ✓ Azure AI Foundry
When they need data plane access:
- Creating Key Vault secrets (initial setup)
- Testing private endpoints
- Debugging network connectivity
- Manual secret rotation
Recommended access method:
- Option 1: Chisel tunnel ($15/mo) - for most developers
- Option 2: Bastion + Jumpbox ($200-300/mo) - for admins who need full VM
- Option 3: GitHub-hosted runners + Chisel tunnel - for CI/CD (use the .deployer-using-secure-tunnel workflow)
Scenario 2: Project Teams (Deploy Applications)
Who: Ministry Application Teams
What they deploy:
- ✓ App Service Plans (their apps)
- ✓ Container Apps (their workloads)
- ✓ Storage Accounts (their data)
- ✓ Application Code
What they DON'T deploy:
- ✗ Network infrastructure (provided by platform)
- ✗ Bastion/Jumpbox (platform team maintains)
- ✗ Chisel (platform team provides endpoint)
- ✗ Key Vault (may use shared or create their own)
- ✗ Private endpoints (configured by platform)
When they need data plane access:
- Reading secrets from Key Vault (their app needs secrets)
- Writing secrets during deployment
- Testing their apps against private endpoints
Recommended access method:
- Use platform's Chisel endpoint - provided by platform team
- Public GitHub runners - for control plane operations
- No self-hosted needed - platform provides infrastructure
Scenario 3: Solo Developer
Who: Single developer, proof-of-concept work
What they deploy:
- ✓ Their application code
- ✓ Minimal infrastructure if needed
Recommended setup:
- Chisel tunnel only ($15/mo) - connect laptop to VNet
- Public GitHub runners (free) - for infrastructure deploys
Skip:
- ✗ Bastion + Jumpbox ($200-300/mo) - too expensive
- ✗ Self-hosted runners ($100/mo) - not needed for solo work
6. Portal → Hub Key Vault Integration
The Tenant Onboarding Portal reads APIM subscription keys from the hub Key Vault at request time, rather than storing them in its own data store. This section describes the end-to-end flow and access model.
Access Flow
- An authenticated tenant-admin user requests credentials in the portal UI (env tab selected).
- The portal backend (AppController.getCredentials) verifies the session and confirms the user is in the tenant's admin_users list.
- HubKeyVaultService selects the SecretClient for the requested environment and calls getSecret for {tenant}-apim-primary-key, {tenant}-apim-secondary-key, and {tenant}-apim-rotation-metadata.
- The portal's system-assigned Managed Identity authenticates to the hub Key Vault via DefaultAzureCredential. No connection strings or stored credentials are used.
- The response is returned over the authenticated session: keys are immediately written to the clipboard client-side and never stored in React state or rendered as DOM text.
RBAC Model
The portal MI is granted Key Vault Secrets User on each hub Key Vault (one azurerm_role_assignment per environment in tenant-onboarding-portal/infra/main.tf, gated on var.hub_keyvault_id_{env} != ""). This is a read-only RBAC role — the portal can retrieve secret values but cannot create, update, or delete secrets.
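The per-environment role assignment described above can be sketched as follows. Resource and variable names here are illustrative, not copied from tenant-onboarding-portal/infra/main.tf:

```hcl
// Hypothetical: one assignment per environment, created only when a hub
// Key Vault ID is supplied for that environment.
resource "azurerm_role_assignment" "portal_kv_secrets_user_dev" {
  count = var.hub_keyvault_id_dev != "" ? 1 : 0

  scope                = var.hub_keyvault_id_dev
  role_definition_name = "Key Vault Secrets User" // read-only: get secret values
  principal_id         = azurerm_linux_web_app.portal.identity[0].principal_id
}
```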
APIM Tenant Info
The GET /tenants/:name/tenant-info endpoint proxies the APIM /{tenant}/internal/tenant-info policy (which already existed — no APIM changes were required). The portal uses the per-env apimGatewayUrl setting and the tenant's primary key to authenticate the internal call.
Configuration
Six environment variables wire the hub Key Vault URIs and APIM gateway URLs into the portal backend:
- PORTAL_HUB_KEYVAULT_URL_{DEV,TEST,PROD}: hub Key Vault URI per environment
- PORTAL_APIM_GATEWAY_URL_{DEV,TEST,PROD}: APIM (or App Gateway) public URL per environment
These are populated by the CI/CD pipeline from hub Terraform remote state (via the collect-hub-outputs matrix job — see ADR on hub output collection).
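One way this wiring could look in the portal's app settings; a sketch only, since the actual values flow in from the pipeline rather than being declared statically:

```hcl
// Hypothetical app_settings fragment; in practice these values come from
// hub Terraform remote state via the collect-hub-outputs matrix job.
app_settings = {
  PORTAL_HUB_KEYVAULT_URL_DEV  = var.hub_keyvault_url_dev
  PORTAL_HUB_KEYVAULT_URL_TEST = var.hub_keyvault_url_test
  PORTAL_HUB_KEYVAULT_URL_PROD = var.hub_keyvault_url_prod
  PORTAL_APIM_GATEWAY_URL_DEV  = var.apim_gateway_url_dev
  PORTAL_APIM_GATEWAY_URL_TEST = var.apim_gateway_url_test
  PORTAL_APIM_GATEWAY_URL_PROD = var.apim_gateway_url_prod
}
```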
Summary
Why We Need Key Vault Access
- Security: Private endpoints prevent public access to secrets
- Compliance: BC Gov requires zero-trust, no public data plane access
- Operations: Applications need to read secrets at runtime
- Automation: Terraform needs to write secrets during deployment
Why OIDC Alone Isn't Enough
- OIDC provides access to management.azure.com (control plane)
- Private endpoints block *.vault.azure.net (data plane)
- GitHub runners have no route to VNet private IPs
- Even with valid tokens, network access is required
How Chisel Solves This
- App Service runs in VNet (via VNet integration)
- Chisel server in App Service bridges public → private
- Developer connects laptop to App Service via HTTPS
- Traffic flows: laptop → App Service → VNet → private endpoints
- Result: Data plane access without exposing services publicly
When to Use Each Method
- Platform Team: Deploy all, use Chisel or Bastion for access
- Project Teams: Deploy apps only, use platform's Chisel
- Solo Dev: Deploy minimal, use Chisel only
- CI/CD with secrets: Use GitHub-hosted runners + Chisel tunnel (.deployer-using-secure-tunnel workflow)
- Admin work: Use Bastion + Jumpbox