Venturing Beyond Hello World: Mastering Containerization and Orchestration
Context: Everything here lives in personal repos (docker_multilang_project, Car-Match backend, ProjectHub proxy). I’ve never owned production containers.
AI assist: ChatGPT condensed my Docker/EKS notes; I validated each callout against the actual repos and AWS lab scripts on 2025-10-15.
Status: Learning log. Use it to gauge my current level, not as evidence of SRE tenure.
Reality snapshot
- Day-to-day dev happens in Docker Compose (Node + Python + Postgres or Mongo).
- When I want to stretch, I follow AWS workshops to stand up an EKS cluster with Terraform, deploy the same services, and watch how health checks + autoscaling behave.
- Observability means stdout JSON logs, `/healthz` routes, and occasionally Prometheus/Grafana during labs. No 24/7 pager yet.
Compose: my default sandbox
Stack anatomy
```yaml
services:
  api:
    build: ./api
    env_file: .env.api
    ports: ["4000:4000"]
    depends_on: [db]
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:4000/healthz"]
      interval: 30s
      timeout: 5s
      retries: 3
  frontend:
    build:
      context: ./frontend
      target: production
    ports: ["8080:80"]
    depends_on: [api]
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
```
- Why it works: health checks gate traffic, `.env` files keep secrets out of the compose file, and named volumes preserve data between runs.
- Observability: services log JSON with request IDs so `docker compose logs api` stitches calls together.
- Chaos drills: I kill containers with `docker compose kill api` to verify the frontend fails gracefully and recovers once the container restarts (drill sketch below).
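A minimal version of that drill, using the service names from the compose file above (the exact sequence is mine, not from a runbook):

```sh
# Kill the api container abruptly (no graceful shutdown)
docker compose kill api

# Confirm the frontend degrades instead of crashing
docker compose ps
docker compose logs -f frontend

# Restart the api and wait for its healthcheck to go green
docker compose up -d api
docker compose ps api
```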
Lessons
- Multi-stage builds keep images small (e.g., `node:20-alpine` + `npm ci` in the builder stage; see the sketch after this list).
- Mounting local certs into containers makes HTTPS dev possible without messing with the host.
- Documenting every command (`docs/dev-runbook.md`) stops classmates from asking, "why doesn't this container start?"
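A minimal sketch of that multi-stage pattern, assuming an `api/` Node service that builds to `dist/` (paths and entrypoint are illustrative, not copied from the repo):

```dockerfile
# Builder stage: full toolchain, reproducible installs
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage: runtime dependencies only
FROM node:20-alpine AS production
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
EXPOSE 4000
CMD ["node", "dist/server.js"]
```

Naming the final stage (`AS production`) is what makes compose's `build.target` and `docker build --target` able to pick it explicitly.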
EKS labs: leveling up (still sandboxed)
What I practice
- Terraform provisioning – VPC, node groups, IAM roles. All lives in `labs/eks/terraform`.
- Deployments & services – basic `Deployment` + `Service` manifests, ConfigMaps for environment variables, Secrets for credentials.
- Ingress – AWS Load Balancer Controller with TLS certs for the sample domain.
- Observability – Prometheus + Grafana via Helm, scraping the demo pods.
- Autoscaling & rolling updates – HPA based on CPU + custom metrics, `kubectl rollout status` demos (HPA sketch after the manifest below).
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: $ECR_URI/api:${GITHUB_SHA}
          ports:
            - containerPort: 4000
          envFrom:
            - secretRef:
                name: api-secrets
          readinessProbe:
            httpGet:
              path: /ready
              port: 4000
            initialDelaySeconds: 5
            periodSeconds: 10
```
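And the CPU half of that HPA, sketched against the Deployment above (the replica bounds and 70% target are lab values, not tuned numbers; the custom-metrics variant additionally needs a metrics adapter, so only the CPU path is shown):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # requires cpu requests on the pod spec
```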
Honest caveats
- Traffic is synthetic (k6 scripts + curl). No paying customers.
- IAM roles follow workshop defaults. Before touching production I’d need a full review.
- I rely on AWS Cloud9 + workshop accounts. Costs are low, but I still tear everything down immediately after lab time.
Tooling & guardrails
- Container builds: `npm run build:docker` or `scripts/docker-build.sh` ensures multi-stage builds use the same base images (rough shape sketched after this list).
- Security: `npm audit`, `docker scan`, and occasional Trivy runs keep dependencies honest. Findings go in the repo issues list.
- Docs: every repo has `docs/runbook.md` (start/stop commands, health checks, log locations, TODOs). For EKS labs I add Terraform diagrams + `destroy` instructions.
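Roughly the shape of `scripts/docker-build.sh`; the image name, tag logic, and Trivy thresholds here are illustrative assumptions, not the script verbatim:

```sh
#!/usr/bin/env bash
set -euo pipefail

# Illustrative defaults; override via environment
IMAGE="${IMAGE:-docker-multilang/api}"
TAG="${TAG:-$(git rev-parse --short HEAD)}"

# Build the named production stage so dev and CI share base images
docker build --target production -t "${IMAGE}:${TAG}" ./api

# Fail loudly on known HIGH/CRITICAL vulnerabilities
trivy image --severity HIGH,CRITICAL --exit-code 1 "${IMAGE}:${TAG}"
```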
What I’m working on next
- Add automated smoke tests (k6 or Playwright) that hit the running Compose stack on CI before merging.
- Package Terraform/EKS lab into a repeatable template so I can spin it up faster (and share with classmates).
- Explore App Mesh or Linkerd to understand service meshes before I make claims about them.
- Figure out how to shrink cold-start time on the Render backend (Car-Match) without leaving the free tier.