Terraform on AWS: A Production-Ready Architecture Guide

May 10, 2026·15 min read·Omphora Engineering

The Terraform module problem

Most teams start with Terraform in a single main.tf file. It works great for 10 resources. By the time you hit 200 resources across three environments, it's a maintenance nightmare — duplicated code, inconsistent naming, state conflicts, and no clear ownership.

This guide covers how to structure Terraform for teams that need to scale: repository layout, remote state, reusable and versioned modules, CI/CD, and security scanning.

Repository structure

infrastructure/
  modules/
    vpc/              # Reusable VPC module
    eks/              # Reusable EKS module
    rds/              # Reusable RDS module
    iam/              # IAM roles and policies
  environments/
    dev/
      main.tf         # Calls modules with dev values
      terraform.tfvars
    staging/
    production/
  global/
    iam/              # Cross-account roles
    route53/          # DNS zones
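
As a rough sketch of how an environment directory might consume a shared module under this layout, the dev main.tf could look like the following. The CIDR ranges and tag values are illustrative assumptions; the input names match the variables the VPC module declares later in this guide.

```hcl
# environments/dev/main.tf (illustrative sketch; all values are assumptions)
module "vpc" {
  # Relative path from environments/dev/ to the shared module
  source = "../../modules/vpc"

  name                 = "dev"
  cidr                 = "10.10.0.0/16"
  availability_zones   = ["us-east-1a", "us-east-1b"]
  private_subnet_cidrs = ["10.10.1.0/24", "10.10.2.0/24"]
  public_subnet_cidrs  = ["10.10.101.0/24", "10.10.102.0/24"]

  tags = {
    Environment = "dev"
  }
}
```

Staging and production would call the same module with their own values, so the environments differ only in inputs, not in resource definitions.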

Remote state with S3 + DynamoDB

Never use local state in a team environment. Configure remote state from day one:

terraform {
  backend "s3" {
    bucket         = "my-company-terraform-state"
    key            = "production/eks/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-locks"

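    # Assume role for state access (placeholder 12-digit account ID).
    # Terraform 1.6+ deprecates top-level role_arn in favor of an assume_role block.
    role_arn = "arn:aws:iam::123456789012:role/terraform-state-access"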
  }
}

The DynamoDB table holds the state lock, which prevents two concurrent applies from writing to the same state and corrupting it:

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-state-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
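
The state bucket itself also has to exist before the backend can use it. A minimal sketch, assuming the bucket name from the backend block above and assuming you want versioning, default encryption, and public access blocked (none of which the original configuration spells out):

```hcl
# Sketch of the state bucket; usually created once in a separate bootstrap configuration.
resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-company-terraform-state"
}

# Versioning keeps earlier state files recoverable after a bad write.
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state at rest by default (assumption: the AWS-managed KMS key is acceptable).
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# State often contains secrets, so block every form of public access.
resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

These resources cannot live in the configuration whose state they store, since the backend must exist before the first init.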

Writing reusable modules

A good Terraform module has three things: a clear interface (variables), an implementation (resources), and documented outputs:

# modules/vpc/variables.tf
variable "name" {
  description = "VPC name, used as prefix for all resources"
  type        = string
}

variable "cidr" {
  description = "VPC CIDR block"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "List of AZs to deploy subnets into"
  type        = list(string)
}

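variable "private_subnet_cidrs" {
  description = "CIDR blocks for the private subnets"
  type        = list(string)
}

variable "public_subnet_cidrs" {
  description = "CIDR blocks for the public subnets"
  type        = list(string)
}

variable "tags" {
  description = "Tags applied to all resources created by the module"
  type        = map(string)
  default     = {}
}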
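
The other two parts, implementation and outputs, might look roughly like this. It is a partial sketch under assumptions: the resource names, the output names, and the way subnets are spread across AZs are illustrative choices, and public subnets, route tables, and gateways are left out for brevity.

```hcl
# modules/vpc/main.tf (partial sketch)
resource "aws_vpc" "this" {
  cidr_block           = var.cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = merge(var.tags, { Name = var.name })
}

resource "aws_subnet" "private" {
  count = length(var.private_subnet_cidrs)

  vpc_id     = aws_vpc.this.id
  cidr_block = var.private_subnet_cidrs[count.index]

  # Cycle through the AZ list if there are more subnets than AZs
  availability_zone = var.availability_zones[count.index % length(var.availability_zones)]

  tags = merge(var.tags, { Name = "${var.name}-private-${count.index}" })
}

# modules/vpc/outputs.tf (sketch)
output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.this.id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets, in the same order as private_subnet_cidrs"
  value       = aws_subnet.private[*].id
}
```

Other modules, such as EKS or RDS, would then take module.vpc.vpc_id and module.vpc.private_subnet_ids as inputs instead of looking the VPC up themselves.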

Module versioning

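For team use, pin each module call to a tagged release so that a change to the module cannot reach an environment until its ref is bumped: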

module "vpc" {
  source  = "git::https://github.com/your-org/terraform-modules.git//vpc?ref=v1.4.2"

  name               = "production"
  cidr               = "10.0.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  # ...
}
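
One way pinned refs pay off is staged rollout: a lower environment can move to a newer tag while production stays on the current one. A sketch, assuming a hypothetical v1.5.0 release and illustrative CIDR values:

```hcl
# environments/staging/main.tf (sketch; v1.5.0 is a hypothetical newer tag)
module "vpc" {
  source = "git::https://github.com/your-org/terraform-modules.git//vpc?ref=v1.5.0"

  name                 = "staging"
  cidr                 = "10.1.0.0/16"
  availability_zones   = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnet_cidrs = ["10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]
  public_subnet_cidrs  = ["10.1.101.0/24", "10.1.102.0/24", "10.1.103.0/24"]
}
```

Once the plan and apply look clean in staging, promoting the change to production is a one-line edit to the ref in the production module call.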

CI/CD for Terraform

Use GitHub Actions with OIDC for automated plan and apply:

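on:
  pull_request:
    paths:
      - 'infrastructure/**'
  push:
    branches: [main]
    paths:
      - 'infrastructure/**'

permissions:
  id-token: write  # Lets jobs request an OIDC token
  contents: read

defaults:
  run:
    working-directory: infrastructure/environments/production  # Assumed target environment

jobs:
  plan:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.TERRAFORM_ROLE_ARN }}  # Hypothetical repository variable
          aws-region: us-east-1
      - run: terraform init -input=false
      - name: Terraform Plan
        run: terraform plan -input=false -out=plan.tfplan
      # Post the plan output as a PR comment here (step omitted)

  apply:
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production  # Requires approval
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.TERRAFORM_ROLE_ARN }}  # Hypothetical repository variable
          aws-region: us-east-1
      - run: terraform init -input=false
      - name: Terraform Apply
        run: terraform apply -input=false -auto-approve  # Re-plans: the PR's plan.tfplan is not available in this run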
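
On the AWS side, OIDC needs an identity provider and a role that GitHub Actions is allowed to assume. A minimal sketch, assuming a hypothetical repository your-org/infrastructure and a hypothetical role name terraform-ci; the permissions attached to the role are left out.

```hcl
# Registers GitHub's OIDC issuer with IAM.
# Assumption: a recent AWS provider; older versions also require thumbprint_list.
resource "aws_iam_openid_connect_provider" "github" {
  url            = "https://token.actions.githubusercontent.com"
  client_id_list = ["sts.amazonaws.com"]
}

# Trust policy: only workflows from the named repository may assume the role.
data "aws_iam_policy_document" "github_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:your-org/infrastructure:*"] # Hypothetical repository
    }
  }
}

resource "aws_iam_role" "terraform_ci" {
  name               = "terraform-ci" # Hypothetical role name
  assume_role_policy = data.aws_iam_policy_document.github_trust.json
}
```

Narrowing the sub condition further, for example to a single branch or to the production environment, limits which workflow runs can obtain credentials.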

Security scanning with Checkov

Add Checkov to your Terraform CI pipeline to catch security issues before they reach production:

- name: Run Checkov
  uses: bridgecrewio/checkov-action@v12  # Pin a release tag rather than tracking master
  with:
    directory: infrastructure/
    framework: terraform
    output_format: sarif
    soft_fail: false
    check: CKV_AWS_*  # All AWS checks

Common issues Checkov catches: public S3 buckets, unencrypted RDS, security groups open to 0.0.0.0/0, missing CloudTrail logging.
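
For illustration, here is roughly what a configuration looks like once two of those findings are addressed: the database is encrypted at rest and its security group admits traffic only from inside the VPC. All names and sizes are assumptions, and other Checkov checks may still flag this sketch.

```hcl
variable "vpc_id" {
  type = string
}

variable "vpc_cidr" {
  type = string
}

variable "db_subnet_group_name" {
  type = string
}

resource "aws_security_group" "db" {
  name        = "example-db"
  description = "PostgreSQL access from inside the VPC only"
  vpc_id      = var.vpc_id

  ingress {
    description = "PostgreSQL from the VPC CIDR, not 0.0.0.0/0"
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr]
  }
}

resource "aws_db_instance" "example" {
  identifier        = "example-db"
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20

  username                    = "app"
  manage_master_user_password = true # Password is generated and stored in Secrets Manager

  storage_encrypted      = true  # Encrypts data at rest
  publicly_accessible    = false # No public endpoint
  db_subnet_group_name   = var.db_subnet_group_name
  vpc_security_group_ids = [aws_security_group.db.id]
}
```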

Key practices

  1. Modules for everything reusable — no copy-paste between environments
  2. Remote state from day one — S3 + DynamoDB, one state file per environment per service
  3. OIDC in CI/CD — no AWS credentials in GitHub secrets
  4. Checkov in CI — catch security issues before apply
  5. Module versioning — pin module versions with git tags, never use main
  6. terraform-docs — auto-generate module documentation
