Terraform on Azure: AKS, Key Vault, and Production-Ready Infrastructure

January 12, 2026·14 min read·Omphora Engineering

Terraform on Azure: what's different from AWS

If you've written Terraform for AWS, moving to Azure feels familiar in structure but different in details. The resource model maps reasonably well: VPC → Virtual Network, EC2 → Virtual Machine, EKS → AKS, IAM Role → Managed Identity. But the Azure provider has its own patterns, naming conventions, and gotchas.

This guide builds a production-grade Azure environment from scratch: networking, AKS, Key Vault, managed identities, and a remote state backend in Azure Storage.

Remote state with Azure Storage

Configure remote state before you write anything else. The storage account that holds the state cannot live in the state it stores, so create it outside Terraform, either manually or with a one-time bootstrap script:

# Bootstrap script — run once
RESOURCE_GROUP="terraform-state-rg"
STORAGE_ACCOUNT="tfstate${RANDOM}"   # must be globally unique
CONTAINER="tfstate"

az group create --name "$RESOURCE_GROUP" --location eastus
az storage account create \
  --name "$STORAGE_ACCOUNT" \
  --resource-group "$RESOURCE_GROUP" \
  --sku Standard_LRS \
  --encryption-services blob
az storage container create \
  --name "$CONTAINER" \
  --account-name "$STORAGE_ACCOUNT"

echo "Storage account: $STORAGE_ACCOUNT"

Configure the backend in Terraform:

terraform {
  required_version = ">= 1.6"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.90"
    }
  }

  backend "azurerm" {
    resource_group_name  = "terraform-state-rg"
    storage_account_name = "tfstate12345"   # your storage account name
    container_name       = "tfstate"
    key                  = "production.tfstate"
  }
}

provider "azurerm" {
  features {
    key_vault {
      # Vaults with purge protection enabled cannot be purged, so do not attempt it on destroy
      purge_soft_delete_on_destroy    = false
      recover_soft_deleted_key_vaults = true
    }
  }
}

Resource groups and naming

Azure uses Resource Groups as logical containers — every resource belongs to one. Use a consistent naming convention:

variable "environment" {
  type    = string
  default = "prod"
}

variable "location" {
  type    = string
  default = "eastus"
}

resource "azurerm_resource_group" "main" {
  name     = "omphora-${var.environment}-rg"
  location = var.location

  tags = {
    environment = var.environment
    managed_by  = "terraform"
  }
}
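
One way to keep the convention consistent is to compute the prefix and tags once and reuse them. The sketch below is illustrative only: `local.name_prefix`, `local.common_tags`, and the network security group are hypothetical names introduced here, not part of the original configuration.

```hcl
# Hypothetical locals: compute the naming prefix and shared tags in one place.
locals {
  name_prefix = "omphora-${var.environment}"

  common_tags = {
    environment = var.environment
    managed_by  = "terraform"
  }
}

# Example consumer: any resource can reuse the prefix and tags.
resource "azurerm_network_security_group" "aks" {
  name                = "${local.name_prefix}-aks-nsg"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name

  tags = local.common_tags
}
```

With this in place, renaming the project or adding a tag becomes a single-line change rather than an edit to every resource.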

Virtual Network with proper subnet segmentation

resource "azurerm_virtual_network" "main" {
  name                = "omphora-${var.environment}-vnet"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  address_space       = ["10.0.0.0/8"]

  tags = { environment = var.environment }
}

# AKS nodes — large range for pod IP allocation
resource "azurerm_subnet" "aks" {
  name                 = "aks-subnet"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.1.0.0/16"]
}

# Private endpoints (databases, Key Vault)
resource "azurerm_subnet" "private_endpoints" {
  name                 = "private-endpoints-subnet"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.2.0.0/24"]

  private_endpoint_network_policies_enabled = false
}

# Application Gateway (if using for ingress)
resource "azurerm_subnet" "appgw" {
  name                 = "appgw-subnet"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.3.0.0/24"]
}
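
If you would rather derive the subnet ranges than hard-code them, Terraform's built-in `cidrsubnet(prefix, newbits, netnum)` function can reproduce the same layout. This is a sketch under the assumption that you keep the `10.0.0.0/8` address space above; the `local.vnet_cidr` and `local.subnet_cidrs` names are hypothetical.

```hcl
locals {
  vnet_cidr = "10.0.0.0/8"

  # cidrsubnet adds `newbits` to the prefix length and selects network `netnum`.
  subnet_cidrs = {
    aks               = cidrsubnet(local.vnet_cidr, 8, 1)    # /8 + 8  = /16 -> 10.1.0.0/16
    private_endpoints = cidrsubnet(local.vnet_cidr, 16, 512) # /8 + 16 = /24 -> 10.2.0.0/24
    appgw             = cidrsubnet(local.vnet_cidr, 16, 768) # /8 + 16 = /24 -> 10.3.0.0/24
  }
}
```

A subnet would then use `address_prefixes = [local.subnet_cidrs.aks]`. The network numbers 512 and 768 are 2 x 256 and 3 x 256, which is what places the /24 ranges at `10.2.0.0` and `10.3.0.0`.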

AKS cluster with Workload Identity

AKS Workload Identity is the Azure equivalent of AWS IRSA — it lets pods authenticate to Azure services using a Kubernetes ServiceAccount, without storing credentials anywhere.

data "azurerm_client_config" "current" {}

resource "azurerm_kubernetes_cluster" "main" {
  name                = "omphora-${var.environment}-aks"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  dns_prefix          = "omphora-${var.environment}"
  kubernetes_version  = "1.29"

  # Enable OIDC issuer for Workload Identity
  oidc_issuer_enabled       = true
  workload_identity_enabled = true

  default_node_pool {
    name                = "system"
    node_count          = 2
    vm_size             = "Standard_D4s_v3"
    vnet_subnet_id      = azurerm_subnet.aks.id
    os_disk_type        = "Ephemeral"
    os_disk_size_gb     = 64   # ephemeral OS disks must fit in the VM cache (100 GiB on Standard_D4s_v3)
    enable_auto_scaling = true
    min_count           = 2
    max_count           = 5

    upgrade_settings {
      max_surge = "33%"
    }
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin     = "azure"
    network_policy     = "calico"
    load_balancer_sku  = "standard"
    outbound_type      = "loadBalancer"
    service_cidr       = "172.16.0.0/16"
    dns_service_ip     = "172.16.0.10"
  }

  azure_active_directory_role_based_access_control {
    managed                = true
    azure_rbac_enabled     = true
    admin_group_object_ids = [var.aks_admin_group_id]
  }

  # Azure Monitor integration
  oms_agent {
    log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id
  }

  tags = { environment = var.environment }
}

# User node pool for workloads (keeps system pool clean)
resource "azurerm_kubernetes_cluster_node_pool" "workload" {
  name                  = "workload"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
  vm_size               = "Standard_D8s_v3"
  vnet_subnet_id        = azurerm_subnet.aks.id
  os_disk_type          = "Ephemeral"
  enable_auto_scaling   = true
  min_count             = 1
  max_count             = 20

  node_taints = []
  node_labels = { role = "workload" }
}
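
The cluster definition references `var.aks_admin_group_id` and `azurerm_log_analytics_workspace.main`, neither of which is declared above. A minimal sketch of both follows; the workspace name, SKU, and retention period are assumptions to adjust for your environment.

```hcl
# Object ID of the directory group whose members get cluster admin access.
variable "aks_admin_group_id" {
  type        = string
  description = "Object ID of the Microsoft Entra ID group granted AKS admin access"
}

# Workspace that receives container logs from the oms_agent block.
resource "azurerm_log_analytics_workspace" "main" {
  name                = "omphora-${var.environment}-logs" # assumed name
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  sku                 = "PerGB2018"
  retention_in_days   = 30 # assumed retention

  tags = { environment = var.environment }
}
```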

Azure Key Vault for secrets

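Key Vault is Azure's managed secrets service. Create one per environment, keeping in mind that vault names are globally unique and limited to 3-24 alphanumeric characters and hyphens: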

resource "azurerm_key_vault" "main" {
  name                        = "omphora-${var.environment}-kv"
  location                    = azurerm_resource_group.main.location
  resource_group_name         = azurerm_resource_group.main.name
  tenant_id                   = data.azurerm_client_config.current.tenant_id
  sku_name                    = "standard"
  soft_delete_retention_days  = 90
  purge_protection_enabled    = true   # required for compliance

  # Disable public access — access via private endpoint only
  public_network_access_enabled = false

  network_acls {
    default_action = "Deny"
    bypass         = "AzureServices"
  }
}

# Private endpoint so AKS pods can reach Key Vault over the VNet
resource "azurerm_private_endpoint" "key_vault" {
  name                = "omphora-${var.environment}-kv-pe"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  subnet_id           = azurerm_subnet.private_endpoints.id

  private_service_connection {
    name                           = "kv-connection"
    private_connection_resource_id = azurerm_key_vault.main.id
    subresource_names              = ["vault"]
    is_manual_connection           = false
  }
}
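
A private endpoint on its own does not change name resolution: unless the vault's hostname resolves to the endpoint's private IP, pods will still look up the public address, which the vault now rejects. The usual approach is a private DNS zone linked to the VNet. The sketch below assumes the standard `privatelink.vaultcore.azure.net` zone for Key Vault; the resource and link names are hypothetical.

```hcl
# Private DNS zone that Azure uses for Key Vault private endpoints.
resource "azurerm_private_dns_zone" "key_vault" {
  name                = "privatelink.vaultcore.azure.net"
  resource_group_name = azurerm_resource_group.main.name
}

# Link the zone to the VNet so workloads inside it resolve the private IP.
resource "azurerm_private_dns_zone_virtual_network_link" "key_vault" {
  name                  = "omphora-${var.environment}-kv-dns-link"
  resource_group_name   = azurerm_resource_group.main.name
  private_dns_zone_name = azurerm_private_dns_zone.key_vault.name
  virtual_network_id    = azurerm_virtual_network.main.id
}

# To register the endpoint's A record automatically, add this block inside
# resource "azurerm_private_endpoint" "key_vault":
#
#   private_dns_zone_group {
#     name                 = "default"
#     private_dns_zone_ids = [azurerm_private_dns_zone.key_vault.id]
#   }
```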

Workload Identity: connecting pods to Key Vault

This is the pattern that replaces service principal credentials in pods:

# Managed identity for the application
resource "azurerm_user_assigned_identity" "my_app" {
  name                = "my-app-${var.environment}"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
}

# Grant the identity access to Key Vault secrets
resource "azurerm_key_vault_access_policy" "my_app" {
  key_vault_id = azurerm_key_vault.main.id
  tenant_id    = data.azurerm_client_config.current.tenant_id
  object_id    = azurerm_user_assigned_identity.my_app.principal_id

  secret_permissions = ["Get", "List"]
}

# Federated credential links the K8s ServiceAccount to the managed identity
resource "azurerm_federated_identity_credential" "my_app" {
  name                = "my-app-k8s"
  resource_group_name = azurerm_resource_group.main.name
  parent_id           = azurerm_user_assigned_identity.my_app.id
  audience            = ["api://AzureADTokenExchange"]
  issuer              = azurerm_kubernetes_cluster.main.oidc_issuer_url
  subject             = "system:serviceaccount:my-app:my-app"
}

In Kubernetes, create the ServiceAccount with the Workload Identity annotation:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: my-app
  annotations:
    azure.workload.identity/client-id: "<managed-identity-client-id>"
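
The `<managed-identity-client-id>` placeholder is the `client_id` attribute of the user-assigned identity. One way to surface it, sketched here with hypothetical output names, is to export it from Terraform together with the tenant ID:

```hcl
# Client ID to paste into the ServiceAccount annotation and SecretProviderClass.
output "my_app_client_id" {
  value = azurerm_user_assigned_identity.my_app.client_id
}

# Tenant ID for the SecretProviderClass tenantId parameter.
output "tenant_id" {
  value = data.azurerm_client_config.current.tenant_id
}
```

After an apply, `terraform output -raw my_app_client_id` prints the value so it can be templated into the Kubernetes manifests.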

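Mount Key Vault secrets into pods with the Secrets Store CSI Driver and its Azure provider (on AKS, a managed add-on enabled through the `key_vault_secrets_provider` block on the cluster resource):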

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: my-app-keyvault
  namespace: my-app
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    clientID: "<managed-identity-client-id>"
    keyvaultName: "omphora-prod-kv"
    tenantId: "<tenant-id>"
    objects: |
      array:
        - |
          objectName: database-url
          objectType: secret

No credentials are stored anywhere in the cluster. The pod's ServiceAccount token is exchanged for a short-lived Azure access token, and that token authorizes access to Key Vault.

Module structure for a real project

Don't write a 500-line main.tf. Split into modules:

infrastructure/
  main.tf           # top-level module calls
  variables.tf
  outputs.tf
  backend.tf

  modules/
    networking/     # VNet, subnets, NSGs
    aks/            # cluster, node pools, workload identity
    keyvault/       # Key Vault, private endpoint, policies
    monitoring/     # Log Analytics, Diagnostic settings

Each module gets its own variables.tf, outputs.tf, and main.tf. The root module wires them together, passing outputs from one module as inputs to another.

# main.tf
module "networking" {
  source      = "./modules/networking"
  environment = var.environment
  location    = var.location
}

module "aks" {
  source              = "./modules/aks"
  environment         = var.environment
  location            = var.location
  resource_group_name = azurerm_resource_group.main.name
  aks_subnet_id       = module.networking.aks_subnet_id
  # oidc_issuer_url is an output of this module; passing a module's own
  # output back in as an input would create a dependency cycle.
}

module "keyvault" {
  source                     = "./modules/keyvault"
  environment                = var.environment
  resource_group_name        = azurerm_resource_group.main.name
  private_endpoint_subnet_id = module.networking.private_endpoint_subnet_id
}
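
For the root module to read values such as `module.networking.aks_subnet_id`, each child module has to declare them as outputs. A minimal sketch, assuming the modules contain the resources defined earlier in this guide under the same resource names:

```hcl
# modules/networking/outputs.tf
output "aks_subnet_id" {
  description = "Subnet ID for AKS node pools"
  value       = azurerm_subnet.aks.id
}

output "private_endpoint_subnet_id" {
  description = "Subnet ID for private endpoints"
  value       = azurerm_subnet.private_endpoints.id
}

# modules/aks/outputs.tf
output "oidc_issuer_url" {
  description = "OIDC issuer URL used by federated identity credentials"
  value       = azurerm_kubernetes_cluster.main.oidc_issuer_url
}
```

Terraform infers the dependency order from these references, so the networking module is applied before anything that consumes its subnet IDs.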

Authenticate in CI/CD without service principal credentials

Use OIDC authentication from GitHub Actions to avoid storing Azure credentials as secrets:

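# .github/workflows/terraform.yml
name: terraform
on:
  push:
    branches: [main]   # assumed trigger; adjust to your branching model

permissions:
  id-token: write   # allows the job to request a GitHub OIDC token
  contents: read

jobs:
  apply:
    runs-on: ubuntu-latest
    env:
      ARM_USE_OIDC: "true"
      ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
      ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
      ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Terraform Init
        run: terraform init -input=false

      - name: Terraform Apply
        run: terraform apply -auto-approve -input=false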

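The GitHub Actions OIDC token is exchanged for a short-lived Azure token, so no client secret is stored in GitHub and there is nothing to rotate. The three values above are identifiers, not credentials. The exchange only works if the Azure identity has a federated credential that trusts the repository's OIDC subject.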
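
On the Azure side, the trust relationship can be declared with the same federated credential resource used for pods. The sketch below is one possible setup, not the only one: the identity name, the `example-org/example-repo` repository, the branch, and the Contributor scope are all placeholders and assumptions. Because the pipeline needs this identity before it can run, it typically belongs in the bootstrap configuration rather than in the state the pipeline manages.

```hcl
# Hypothetical identity that the GitHub Actions workflow logs in as.
resource "azurerm_user_assigned_identity" "github_ci" {
  name                = "github-ci-${var.environment}"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
}

# Trust OIDC tokens issued by GitHub for one repository and branch.
resource "azurerm_federated_identity_credential" "github_ci" {
  name                = "github-actions-main"
  resource_group_name = azurerm_resource_group.main.name
  parent_id           = azurerm_user_assigned_identity.github_ci.id
  audience            = ["api://AzureADTokenExchange"]
  issuer              = "https://token.actions.githubusercontent.com"
  subject             = "repo:example-org/example-repo:ref:refs/heads/main" # placeholder repo
}

# Assumed scope and role; narrow both to what the pipeline actually manages.
resource "azurerm_role_assignment" "github_ci" {
  scope                = azurerm_resource_group.main.id
  role_definition_name = "Contributor"
  principal_id         = azurerm_user_assigned_identity.github_ci.principal_id
}

# Value for the AZURE_CLIENT_ID secret in GitHub.
output "github_ci_client_id" {
  value = azurerm_user_assigned_identity.github_ci.client_id
}
```

The `subject` must match the token GitHub issues exactly, so a workflow running on a different branch, tag, or environment needs its own federated credential.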
