The chart that works in dev but breaks in prod
Most Helm charts start simple. A Deployment, a Service, maybe an Ingress. You paste in the image name, run helm install, and it works. Then you need to deploy it to staging with different resource limits. Then to prod with different environment variables and secrets. Then a second service with a similar structure.
By that point, the chart is a mess of conditionals, hard-coded values, and environment-specific overrides scattered across three values.yaml files whose purpose nobody remembers.
This guide covers how to structure Helm charts from the start for multi-environment, production use.
Chart structure
A production-ready chart looks like this:
my-service/
  Chart.yaml
  values.yaml              # base defaults; no environment-specific values here
  values-staging.yaml      # staging overrides
  values-prod.yaml         # prod overrides
  templates/
    _helpers.tpl           # named templates and label helpers
    deployment.yaml
    service.yaml
    ingress.yaml
    hpa.yaml
    pdb.yaml
    serviceaccount.yaml
    configmap.yaml
    externalsecret.yaml    # if using External Secrets Operator
The key rule: values.yaml contains safe defaults that work for local development. Environment-specific files only contain overrides — never duplicate the full structure.
The _helpers.tpl file
Named templates in _helpers.tpl prevent you from copying the same label selectors and resource names across every template file. Always define these:
{{/* Standard labels for all resources */}}
{{- define "my-service.labels" -}}
helm.sh/chart: {{ include "my-service.chart" . }}
app.kubernetes.io/name: {{ include "my-service.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/* Selector labels — these NEVER change after initial deploy */}}
{{- define "my-service.selectorLabels" -}}
app.kubernetes.io/name: {{ include "my-service.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
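The label helpers above call my-service.name and my-service.chart, and later templates call my-service.fullname, none of which are shown in this guide. A minimal sketch of those three, modeled on what helm create scaffolds; the nameOverride and fullnameOverride values are assumptions that do not appear in this guide's values.yaml:

```yaml
{{/* Chart name, overridable via the (assumed) nameOverride value */}}
{{- define "my-service.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/* Release-qualified name used for resource names */}}
{{- define "my-service.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/* Chart name and version, made safe for use as a label value */}}
{{- define "my-service.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
```

The truncation to 63 characters is there because Kubernetes label values and many resource names are limited to that length.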
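Critical: selector labels on a Deployment cannot be changed after creation. The spec.selector field is immutable, so if a chart change alters them, Kubernetes rejects the update and the upgrade fails. Use the stable subset (name + instance) for selectors, and the full set for labels on the pod template, which must always include the selector labels.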
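To make the split concrete, here is a sketch of how a deployment.yaml might use the two label helpers. It assumes a my-service.fullname helper exists and reads the image, service, resources, and autoscaling keys from the values.yaml described in the next section; it is an illustration, not the full template a production chart would need:

```yaml
# templates/deployment.yaml (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "my-service.fullname" . }}
  labels:
    {{- include "my-service.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  # Leave replicas unset when an HPA manages the count,
  # otherwise each upgrade resets it to replicaCount.
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      # Stable subset only: this field is immutable after creation
      {{- include "my-service.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        # Full set; it contains the selector labels, so the selector still matches
        {{- include "my-service.labels" . | nindent 8 }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - containerPort: {{ .Values.service.targetPort }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```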
Values hierarchy: defaults vs overrides
Your base values.yaml should have every key the chart uses, with sensible defaults:
# values.yaml: complete defaults
replicaCount: 1
image:
  repository: ""        # required, no default
  tag: ""               # required, set at deploy time
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 80
  targetPort: 8080
ingress:
  enabled: false
  className: nginx
  host: ""
  tls: false
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi
autoscaling:
  enabled: false
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
podDisruptionBudget:
  enabled: false
  minAvailable: 1
env: {}                 # key-value environment variables
envFrom: []             # references to ConfigMaps/Secrets
serviceAccount:
  create: true
  annotations: {}       # used for IRSA on EKS
Then values-prod.yaml only overrides what differs:
# values-prod.yaml: only overrides
replicaCount: 3
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 2
    memory: 1Gi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
podDisruptionBudget:
  enabled: true
  minAvailable: 2
ingress:
  enabled: true
  host: api.yourdomain.com
  tls: true
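When an override file is layered over the base values, Helm deep-merges maps key by key, so any key the override leaves out keeps its base value. Assuming the two files exactly as shown above, the effective ingress and service values for a prod release would work out to:

```yaml
# Effective values after layering values-prod.yaml over values.yaml (excerpt)
ingress:
  enabled: true             # from values-prod.yaml
  className: nginx          # inherited from values.yaml
  host: api.yourdomain.com  # from values-prod.yaml
  tls: true                 # from values-prod.yaml
service:                    # not mentioned in values-prod.yaml, so all base defaults
  type: ClusterIP
  port: 80
  targetPort: 8080
```

Lists are the exception: a list in an override file replaces the base list wholesale rather than merging with it, which matters for keys like envFrom.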
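Deploy with the command below. The chart's own values.yaml is loaded automatically, so only the override file needs to be passed:
helm upgrade --install my-service ./my-service -f ./my-service/values-prod.yaml --set image.tag="$IMAGE_TAG"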
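Because image.repository and image.tag default to empty strings, a forgotten --set would otherwise render an image reference of just ":". One way to fail early is Helm's required function in the Deployment template. This is a sketch of the image line only, and the error messages are illustrative wording:

```yaml
# templates/deployment.yaml (container image line only)
image: "{{ required "image.repository is required" .Values.image.repository }}:{{ required "image.tag is required, e.g. --set image.tag=<tag>" .Values.image.tag }}"
```

With this in place, helm template and helm upgrade --install abort with the given message instead of producing a manifest that only fails later at image pull time.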
Secrets: never put them in values.yaml
This is the most common mistake. Secrets do not belong in Helm values files, not even base64-encoded, since base64 is an encoding and not encryption. If your values-prod.yaml has a database password, it stays in Git history even after you delete it.
A better approach is the External Secrets Operator. Your chart creates an ExternalSecret resource, and the operator syncs the actual secret from AWS Secrets Manager or HashiCorp Vault into a Kubernetes Secret in the cluster:
# templates/externalsecret.yaml
{{- if .Values.externalSecrets.enabled }}
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: {{ include "my-service.fullname" . }}
  labels:
    {{- include "my-service.labels" . | nindent 4 }}
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: {{ include "my-service.fullname" . }}
    creationPolicy: Owner
  data:
    {{- range .Values.externalSecrets.keys }}
    - secretKey: {{ .targetKey }}
      remoteRef:
        key: {{ $.Values.externalSecrets.path }}
        property: {{ .remoteKey }}
    {{- end }}
{{- end }}
In values:
externalSecrets:
  enabled: true
  path: production/my-service
  keys:
    - targetKey: DATABASE_URL
      remoteKey: database_url
    - targetKey: API_KEY
      remoteKey: api_key
Now secrets live in AWS Secrets Manager, rotated values are picked up on the next refreshInterval sync (1h here), and your Git history has no credentials.
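The ExternalSecret only produces a Kubernetes Secret; the pod still has to reference it. A sketch of one way to wire it in, using the envFrom key already present in values.yaml. The Secret name my-service is an assumption: it holds only if the my-service.fullname helper renders to that for your release name, so adjust it to whatever your helper produces:

```yaml
# values-prod.yaml (addition)
envFrom:
  - secretRef:
      name: my-service   # assumed to equal the rendered "my-service.fullname"

# templates/deployment.yaml (inside the container spec)
          {{- with .Values.envFrom }}
          envFrom:
            {{- toYaml . | nindent 12 }}
          {{- end }}
```

Each key in the synced Secret (DATABASE_URL and API_KEY in the example above) then becomes an environment variable in the container. Environment variables are read at container start, so running pods need a restart to see a rotated value.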
Resource requests and limits: get these right
Requests set too low cause scheduling and contention problems. Limits set too low cause OOMKills and CPU throttling. Both hurt reliability.
Requests determine where the pod schedules. If you set 100m CPU and 128Mi memory, the scheduler only places the pod on a node with at least that much unreserved capacity. Set too low and you'll have noisy neighbour problems, because nodes get packed with pods that use more than they asked for. Set too high and pods won't schedule.
Limits determine the ceiling. CPU limits cause throttling: even if a node has spare CPU, Kubernetes won't let the container use more than the limit. For latency-sensitive services, set CPU limits generously or omit them. Memory limits cause OOMKills: a container that exceeds its limit is killed and restarted. Always set them anyway, because unbounded memory use can exhaust the node and take other pods down with it.
A reasonable starting point for a typical web service:
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: 1000m      # 4x request headroom for bursts
    memory: 512Mi   # 2x request, tight enough to catch leaks
Use Vertical Pod Autoscaler (VPA) in recommendation mode to tune these after running in production for a week.
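Recommendation mode corresponds to updateMode "Off" on a VerticalPodAutoscaler object. A minimal sketch, assuming the VPA components are installed in the cluster and that the Deployment is named my-service:

```yaml
# vpa.yaml (sketch; requires the VPA components to be installed in the cluster)
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-service
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service      # assumed Deployment name
  updatePolicy:
    updateMode: "Off"     # compute recommendations only; never evict or resize pods
```

The recommendations show up in the object's status, so kubectl describe vpa my-service is enough to read them and copy the numbers back into your values files.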
Pod Disruption Budgets for zero-downtime deploys
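Without a PDB, voluntary disruptions such as node drains and autoscaler consolidation can evict all replicas at the same time. (Rolling updates are paced separately, by the Deployment's update strategy.) A PDB caps how many pods can be evicted at once: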
# templates/pdb.yaml
{{- if .Values.podDisruptionBudget.enabled }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ include "my-service.fullname" . }}
spec:
  {{- if .Values.podDisruptionBudget.minAvailable }}
  minAvailable: {{ .Values.podDisruptionBudget.minAvailable }}
  {{- else if .Values.podDisruptionBudget.maxUnavailable }}
  maxUnavailable: {{ .Values.podDisruptionBudget.maxUnavailable }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "my-service.selectorLabels" . | nindent 6 }}
{{- end }}
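Enable PDBs in production. A minAvailable: 1 PDB ensures at least one replica stays available during node maintenance, preventing a Karpenter consolidation from briefly taking your service offline. Run at least two replicas alongside it: with a single replica, minAvailable: 1 blocks the drain entirely.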
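The template above checks minAvailable first, and the base values.yaml sets it to 1, so the maxUnavailable branch is only reached once that default is removed. A sketch of an override that switches to maxUnavailable, relying on Helm's behaviour of deleting a key when an override sets it to null; the value 1 is illustrative:

```yaml
# values-staging.yaml (hypothetical override)
podDisruptionBudget:
  enabled: true
  minAvailable: null    # drops the base default so the else-if branch is taken
  maxUnavailable: 1     # allow at most one pod to be evicted at a time
```

A PodDisruptionBudget accepts only one of the two fields, which is why the template renders them as alternatives rather than side by side.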
Chart versioning and App versioning
Keep these separate. Chart.yaml has two version fields:
apiVersion: v2
name: my-service
version: 1.3.0 # chart version — bump on any chart changes
appVersion: "2.1.4" # application version — your Docker image tag
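Bump the chart version when you change templates or add features to the chart itself. The app version tracks your application's release. Note that --set cannot change appVersion, because it is chart metadata rather than a value; to stamp it, pass --app-version to helm package. In CI, the usual approach is to set the image tag at deploy time instead:
helm upgrade --install my-service ./my-service --set image.tag=${{ github.sha }}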
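If you would rather have appVersion drive the image tag whenever no tag is passed, a common pattern (the one helm create scaffolds) is to fall back to it in the Deployment template. This is an alternative to treating image.tag as strictly required, shown here for the image line only:

```yaml
# templates/deployment.yaml (container image line only)
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
```

With this fallback, a release installed without --set image.tag runs the version recorded in Chart.yaml, which keeps the two fields meaningfully linked.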
Lint and test before merging
# Lint the chart with the prod overrides (values.yaml is loaded automatically)
helm lint ./my-service -f ./my-service/values-prod.yaml
# Render templates without installing to catch template errors, then validate the output
helm template my-service ./my-service -f ./my-service/values-prod.yaml --set image.tag=test | kubectl apply --dry-run=client -f -
Add both to your CI pipeline. Helm template errors are fast and cheap to catch in a PR. Finding them in production is not.
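Since the deploy command earlier uses GitHub Actions syntax, here is a sketch of what those checks might look like as a workflow. The file name, job name, action versions, and the example/my-service repository placeholder are all assumptions, and the kubectl dry-run step is left out on the assumption that the runner has no cluster to validate against:

```yaml
# .github/workflows/chart-ci.yaml (hypothetical file name)
name: chart-ci
on:
  pull_request:
    paths:
      - "my-service/**"
jobs:
  lint-and-render:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/setup-helm@v4
      - name: Lint with prod overrides
        run: helm lint ./my-service -f ./my-service/values-prod.yaml
      - name: Render with prod overrides
        # Folded into a single command line; output is discarded, only the exit code matters
        run: >
          helm template my-service ./my-service
          -f ./my-service/values-prod.yaml
          --set image.repository=example/my-service
          --set image.tag=test
          > /dev/null
```

Repeating the two steps with values-staging.yaml would cover the other environment's overrides in the same job.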