# Everything as code without the sprawl

> Terraform, Ansible, GitHub, Sentry, local machines, and the console work I try to turn into repeatable code.

Infrastructure code becomes dangerous when it starts trying to be a platform before the team has
platform problems.

I still want more of the operating path in code, not less. GitHub repos, branch protections, Sentry
projects, Cloudflare DNS, buckets, local development machines, build runners, tunnels, and service
deploys all become easier to trust when the desired state is reviewable and rerunnable.

The line I try to hold is simple: use the console to learn the system; use code to operate it.

Terraform and Ansible can stay useful for years if they have a small job. Terraform owns durable
shape: cloud resources, repo settings, DNS, Sentry projects, and other things that should survive a
person leaving the browser tab. Ansible owns machine and service setup where an image or managed
service is not the better answer. Everything else needs to justify itself.

The sprawl starts with good intentions: one module for reuse, one role for consistency, one helper
script for convenience. Six months later, nobody can tell whether a value comes from a variable, a
default, a workspace, a secret store, or the one manual thing everyone forgot.

## Console for discovery, code for operation

The web console is fine for exploration. It is a bad permanent interface for repeated work.

If I create the same kind of GitHub repo twice, the repo shape belongs in Terraform. If I add a
Sentry project for every service, that belongs in Terraform too. If a Mac build machine needs the
same shell, runtimes, mobile tooling, and GitHub runner setup as the next one, that belongs in
Ansible. The point is not purity. The point is that the next person can review a diff, run a plan,
and understand the operating surface.

```mermaid
flowchart LR
  A[discover in console] --> B[name the durable shape]
  B --> C[encode desired state]
  C --> D[plan or check]
  D --> E[apply]
  E --> F[rerun safely]
  F --> G[agent or teammate can inspect diff]
```

<Principle title="The repeat path belongs in code">
  A console click can teach you what a system needs. The second time the team depends on that shape,
  move it into a repo where it can be reviewed, tested, and rerun.
</Principle>

## Modules should hide repetition, not decisions

A Terraform module is useful when the decision has already been made.

If every caller overrides half the inputs, the module is not an abstraction. It is a negotiation layer. I would rather have three boring resources in the root module than a clever shared module with twenty knobs and no clear owner.

The test I use is simple: can a new engineer open the environment file and understand what will exist after `terraform apply`? If not, the module is probably hiding the wrong thing.

```hcl title="module-boundary.tf"
module "service_bucket" {
  source = "../modules/private-bucket"

  name        = "inference-artifacts"
  environment = var.environment
}
```

The module can standardize encryption, labels, lifecycle policy, and access patterns. It should not
hide whether the bucket exists, who owns it, or why it is there.

The same rule applies to GitHub and Sentry. A module that creates repositories, branches, branch
protections, vulnerability alerts, teams, collaborators, and Actions secrets is valuable because the
organization's delivery rules become explicit. A module that hides those decisions behind twenty
inputs is just a web console with worse ergonomics.

## State is a production dependency

Treat state like a database, not a cache.

That means remote state, locking, least-privilege access, and a written recovery path. It also means
small state files. One giant state for the whole company feels convenient until a harmless change to
a dashboard service needs to refresh half the cloud.

Split by ownership and blast radius:

- network and shared foundations;
- data stores;
- application services;
- observability;
- experiments and temporary environments.

That split keeps a small service deploy from becoming an infrastructure ceremony.

```mermaid
flowchart LR
  A[foundation state] --> B[data stores]
  A --> C[app services]
  C --> D[observability]
  C --> E[temporary envs]
  B -. separate owner .-> D
```

<Principle title="State follows ownership">
  A state file should be small enough that the team changing it understands the blast radius. Shared
  foundations, data, apps, and experiments should not all wait behind one lock.
</Principle>

## Ansible is for the parts images do not cover

I still like Ansible when the job is explicit: bootstrap a host, configure a service, rotate a
config, repair drift, or document an operational procedure as code. I do not like it as a hidden
runtime where business logic slowly moves into YAML.

The local machine is part of the platform when the team depends on it. Fish, mise, uv, Docker,
mobile build tooling, local LLM stacks, GitHub Actions runners, and Cloudflare Tunnel setup are not
glamorous, but they decide how quickly a teammate can become useful.

Good roles are idempotent, noisy in the right places, and boring to re-run:

```yaml title="role-rules.yml"
- name: write service config
  ansible.builtin.template:
    src: service.env.j2
    dest: /etc/my-service/env
    mode: '0600'
  notify: restart service
```

The handler matters. The mode matters. The fact that secrets are not printed matters. Those details
are the difference between automation and a script that makes people nervous.

For agents, this matters even more. A browser-only setup is invisible to the model. A role, a vars
file, a Make target, and a check-mode run are facts it can inspect before touching anything.

## Leave some things manual on purpose

Not every rare operation deserves a pipeline. Some risky actions should stay manual with a runbook,
a checklist, and a second pair of eyes.

The goal is not maximum automation. The goal is reliable operations. Automate the repetitive and
error-prone paths. Keep the irreversible paths explicit until the team has enough reps to encode
them safely.

<Tradeoff title="Manual can be the safer interface">
  If a rare operation is irreversible and poorly understood, encoding it too early only makes the
  risky path faster. Keep it manual until the checks, owners, and failure modes are boring.
</Tradeoff>

That is how everything as code stays useful: fewer knobs, smaller states, idempotent roles, tested
modules, explicit manual exceptions, and no pretending that code removes the need to think.
