| name | description |
|---|---|
| infra-ops | Canonical conventions for Manohar's self-hosted infrastructure (Hetzner CX32 + Dokploy + Tailscale + Forgejo). Use whenever creating or editing a service, writing a Dokploy compose file, running SSH ops on the server, deploying via Forgejo, or touching networking/UFW. Encodes the script-first workflow, compose label requirements, overlay-vs-bridge networking rules, and the deploy loop so these directions never need restating. |
Infra Ops — house style
Server
- Host `manohar-ubuntu`: Hetzner CX32 (4 vCPU / 7.6 GB / 75 GB), Ubuntu 24, Docker 29, Helsinki.
- SSH (Tailscale-only; user is always `root`): `SSH_AUTH_SOCK=$(launchctl getenv SSH_AUTH_SOCK) ssh -i ~/.ssh/id_ed25519 root@100.75.128.45 'bash -s' < /local/script.sh`
- Tailscale IP `100.75.128.45` | public IPv4 `77.42.82.225`
- NEVER use `-t` (no pseudo-TTY). NEVER heredoc over SSH.
- Tailscale node idle = online, not down. Re-auth prompt is normal: approve, then kill+restart any wedged session.
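
The SSH pipe above can be wrapped in a small local helper so the flags are never retyped. This is a minimal sketch: the function name `run_remote` is hypothetical, and it assumes macOS on the local side (it calls `launchctl`, as the command above does).

```shell
# Hypothetical helper; the name run_remote is not part of these conventions.
# Pipes a local script to the server over the Tailscale IP, with no pseudo-TTY.
run_remote() {
  local script="${1:-}"
  if [ ! -f "$script" ]; then
    echo "run_remote: no such script: $script" >&2
    return 2
  fi
  # No -t flag: stdin carries the script body, so no TTY is requested.
  SSH_AUTH_SOCK="$(launchctl getenv SSH_AUTH_SOCK)" \
    ssh -i ~/.ssh/id_ed25519 root@100.75.128.45 'bash -s' < "$script"
}
```

Usage would look like `run_remote ~/MyProjects/some-change.sh`, where the script path is only an example.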
Script-first (never deviate)
- Write scripts locally to `~/MyProjects/` via Desktop Commander `write_file` (NOT the sandbox).
- Execute remotely via the ssh pipe above (`'bash -s' < script.sh`).
- Never patch files in place on the server bypassing git.
- Backup-before-change: write a rollback script to `/opt/<service>/` before modifying configs.
- Dead-man's-switch for risky ops: a verify step that proves success before the change is trusted.
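
The backup-before-change rule could be sketched as a function placed at the top of a change script. The name `make_rollback`, the backup file naming, and the timestamp format are all assumptions for illustration; the sketch also assumes paths contain no single quotes.

```shell
# Hypothetical helper: copies a config aside and writes a rollback script
# that restores it. Prints the path of the rollback script on success.
make_rollback() {
  local target="${1:-}" outdir="${2:-}"
  local stamp backup rollback
  if [ ! -f "$target" ]; then
    echo "make_rollback: missing file: $target" >&2
    return 1
  fi
  stamp="$(date +%Y%m%d-%H%M%S)"
  backup="${outdir}/$(basename "$target").${stamp}.bak"
  rollback="${outdir}/rollback-${stamp}.sh"
  mkdir -p "$outdir"
  cp -p "$target" "$backup"
  # The rollback script simply copies the saved file back over the target.
  printf "#!/bin/sh\nset -eu\ncp -p '%s' '%s'\n" "$backup" "$target" > "$rollback"
  chmod +x "$rollback"
  printf '%s\n' "$rollback"
}
```

A change script might call `make_rollback /opt/<service>/some.conf /opt/<service>` first, apply the change, run its verify step, and run the printed rollback script if verification fails.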
Dokploy compose conventions
Dokploy deploys compose as a swarm stack, so Traefik routing needs BOTH label sets:
- container-level `labels:` (docker provider) AND `deploy: labels:` (swarm provider); mirror them exactly.
- No `container_name:` (swarm assigns names).
- Attach `dokploy-network` (`external: true`) for Traefik ingress.
- Deploy only through the Dokploy UI (not `docker stack deploy` by hand).
- `/etc/dokploy/compose/*/code/` is OVERWRITTEN on every redeploy; never treat it as source of truth.
- Standard Traefik labels (replace SVC / HOST / PORT):
  - `traefik.enable=true`
  - `traefik.docker.network=dokploy-network`
  - ``traefik.http.routers.SVC.rule=Host(`HOST`)``
  - `traefik.http.routers.SVC.entrypoints=websecure`
  - `traefik.http.routers.SVC.tls.certresolver=letsencrypt`
  - `traefik.http.services.SVC.loadbalancer.server.port=PORT`
- Scaffold to copy: `templates/dokploy-service.compose.yml`
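
Because the same labels have to appear under both `labels:` and `deploy: labels:`, generating them once reduces the chance of the two sets drifting apart. A minimal sketch, assuming a hypothetical helper named `traefik_labels` that fills in SVC, HOST, and PORT:

```shell
# Hypothetical helper: prints the standard Traefik label set, one per line,
# for pasting under both the container-level and deploy-level label keys.
traefik_labels() {
  local svc="${1:-}" host="${2:-}" port="${3:-}"
  if [ -z "$svc" ] || [ -z "$host" ] || [ -z "$port" ]; then
    echo "usage: traefik_labels SVC HOST PORT" >&2
    return 2
  fi
  printf 'traefik.enable=true\n'
  printf 'traefik.docker.network=dokploy-network\n'
  # Backticks are literal here because the format string is single-quoted.
  printf 'traefik.http.routers.%s.rule=Host(`%s`)\n' "$svc" "$host"
  printf 'traefik.http.routers.%s.entrypoints=websecure\n' "$svc"
  printf 'traefik.http.routers.%s.tls.certresolver=letsencrypt\n' "$svc"
  printf 'traefik.http.services.%s.loadbalancer.server.port=%s\n' "$svc" "$port"
}
```

For example, `traefik_labels whoami whoami.example.com 8080` prints six lines; the service name, hostname, and port in that call are placeholders.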
Networking (the rules that bite)
- `dokploy-network` is a swarm OVERLAY → containers on it CANNOT reach the host (not `10.0.1.1`, not the Tailscale IP) and cannot cleanly egress to a tailnet peer.
- To reach the host OR a tailnet peer from a container, give it a second bridge network; its gateway (`172.x.0.1`) is the host, which then routes/masquerades out. Precedents: n8n → `172.19.0.1`; tiger-bridge `tiger-net` → `172.18.0.1`; ha-proxy uses this for tailnet egress.
- UFW: `ufw allow` covers bridge subnets (`172.x`). It does NOT expose docker-published ports; those need `ufw-docker allow PORT` (DOCKER-USER chain).
- Always `ufw reload` after rule changes; verify with `iptables -L ufw-user-input -n -v`.
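
To confirm which `172.x.0.1` address a given bridge network actually uses, the gateway can be read from Docker rather than guessed. A minimal sketch, assuming a hypothetical helper named `bridge_gateway` and a network with a single IPv4 IPAM entry:

```shell
# Hypothetical helper: prints the gateway address Docker recorded for a
# network. On a bridge network this is the host-side address.
bridge_gateway() {
  local net="${1:-}" gw
  gw="$(docker network inspect -f '{{range .IPAM.Config}}{{.Gateway}}{{end}}' "$net")" || return 1
  if [ -z "$gw" ]; then
    echo "bridge_gateway: no gateway recorded for $net" >&2
    return 1
  fi
  printf '%s\n' "$gw"
}
```

Run on the server (through the ssh pipe), `bridge_gateway tiger-net` should print the address that containers on that network use to reach the host.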
Deploy loop
- Git-driven services: source in `~/MyProjects/<svc>/`, Forgejo remote `git.manohargupta.com/manohar/<svc>`. Push → Forgejo webhook → Dokploy rebuild. No manual server steps.
- infra repo = local `~/MyProjects/deployments/` (remote `manohar/infra`), pushes over HTTPS `:443`. Flat `*.compose.yml` files and per-service subfolders are both fine.
- Manual (non-Dokploy) stacks (Tiger `/opt/tiger/`, LiteLLM, code-server): compose lives in the repo, deployed by hand.
Working style
- Root cause before fix; state tradeoffs between fix paths.
- One mini-question / understanding check per major topic.
- Explicit risk flag before any change touching security, stability, or data.
- Token-efficient: batch ops, don't re-explain established context.
- Don't redo security hardening (UFW/ufw-docker/fail2ban/SSH) — it's done.