
WireGuard saves us ₹1,00,00,000 annually

Author: Raj Sharma



TL;DR — One tiny Arm EC2 + a stack of cheap on‑prem workstations = cloud‑grade uptime without cloud‑grade invoices. We save roughly ₹1 crore every year and still run everything with five devs and two interns.


I joined xcelerator last year. One of my first weekend hacks was an AI interview-prep agent for students; it required long-running TCP connections to stay stable.

Networking was a pain: Cloudflare Tunnels struggled with connections that stayed open longer than 60–120 seconds, so for our AI workloads they simply weren’t enough. Short-lived traffic was mostly fine, but whenever our ISP rotated the public IP, cloudflared got stuck in a timeout loop. We had to restart the service to recover, and every open connection dropped when we did.

I spun up a t4g.micro instance in AWS, wired a WireGuard tunnel to our on-prem production subnet, and it just worked. Restarting the tunnel (rarely required) now simply resumes live sessions once it’s back up; the only downside is a brief latency bump while WireGuard renegotiates, and clients never see an error.
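
The moving parts are genuinely small. Here’s a minimal sketch of that pairing; the keys, interface names, tunnel subnet (10.99.0.0/24) and LAN subnet (192.168.10.0/24) are placeholders, not our actual values:

```ini
# /etc/wireguard/wg0.conf on the t4g.micro relay (illustrative values)
[Interface]
Address    = 10.99.0.1/24
ListenPort = 51820
PrivateKey = <relay-private-key>

[Peer]
# the on-prem gateway box
PublicKey  = <onprem-public-key>
AllowedIPs = 10.99.0.2/32, 192.168.10.0/24   # tunnel IP + on-prem subnet

# /etc/wireguard/wg0.conf on the on-prem gateway
[Interface]
Address    = 10.99.0.2/24
PrivateKey = <onprem-private-key>

[Peer]
PublicKey           = <relay-public-key>
Endpoint            = <elastic-ip>:51820
AllowedIPs          = 10.99.0.0/24
PersistentKeepalive = 25   # keeps the NAT mapping warm from behind the ISP router
```

The on-prem side always dials out to the Elastic IP, so nothing in AWS ever needs to know our current public address; that is what makes ISP IP rotations a non-event.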

This setup handles up to 2.3 million requests per day at peak without even trying.

We’ve kept it minimal ever since: plain VMs, Elastic IPs, and self-managed Postgres. If AWS hikes prices or goes dark, we can shift to another provider over a weekend.

Counting the money

WireGuard isn’t what saves the ₹1 crore; the hybrid cloud layout does. What WireGuard gives us is a stable tunnel anchored to an Elastic IP (no ₹12 k ISP static-IP fee, and it survives local line cuts), which keeps the whole setup scalable, cheap, and private.

10 × 28‑vCPU workstations — ₹8 L one‑time (AWS equivalent: m6i.8xlarge × 10 ≈ ₹1.16 Cr per year)

1 × GPU workstation — ₹2.5 L one‑time (AWS equivalent: g5.4xlarge ≈ ₹12.2 L per year)

Power (12 months) — ₹1.13 L

One‑year on‑prem cost (hardware + power): ₹11.6 L. AWS bill for the same year: about ₹1.28 Cr. That’s ≈ ₹1.17 Cr saved each year.

(There’s also a t4g.micro relay at ₹500 / month—tiny next to a crore-sized AWS bill, but still a line‑item we track.)

Cost disclaimer

The price gap we quote compares our on-prem workstations with AWS on-demand rates. A 1-year reserved instance paid up-front trims that AWS bill by roughly one-third, and a 3-year commitment brings it down by about one-half. Even at those longer terms the cloud still costs several times more than the hardware we own, but the headline saving does shrink.
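
For rough scale, applying those discounts to the on-demand figure above: a one-third cut lands the 1-year reserved bill around ₹85 L (₹1.28 Cr × ⅔), and a one-half cut lands the 3-year bill around ₹64 L per year, against the ≈ ₹11.6 L we spent on hardware and power.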

As a startup, we can’t justify locking cash into 3-year reservations; we rely on the flexibility of on-demand capacity to grow or reconfigure at short notice. That is why the main numbers in this post use on-demand pricing and why the 1- and 3-year options are mentioned only here for context.

Bandwidth costs average about $300 (≈ ₹25 k) per month at peak usage, or around ₹3 lakh per year if traffic stays at that level; that is still only a small slice of the overall saving.

How the pieces fit today

[Diagram: WireGuard hybrid architecture]

Workstations. Ten 28‑thread workstations we bought in bulk directly from the vendor at a healthy discount, plus one RTX A4500 tower that later swallowed a spare GTX 1060 I had lying around.

Tiny cloud footprint. The t4g.micro is our single public IP and the WireGuard peer.

DIY NAT. A t4g.nano running nftables avoids AWS’s eye‑wateringly priced NAT Gateway (a minimal sketch follows after this list).

Self‑healing script. A simple wg‑monitor script restarts the tunnel if the ISP rotates our address (sketched below).

Postgres. A self‑managed read replica in AWS keeps the data warm.

Queue gateway. Trigger.dev’s Next.js UI lives here. It started with 16 GB RAM, hit OOM walls, so we bumped it to 32 GB—no crashes since (or yet?).

Queue worker. A sibling workstation that chews through the jobs the gateway dispatches; CPU‑spiky, memory‑light.
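
For the DIY NAT box above, the whole trick fits in a handful of nftables commands. A minimal sketch, assuming the private subnet is 10.0.0.0/16 and the instance’s interface is ens5 (both placeholders):

```bash
# enable forwarding on the t4g.nano, then masquerade the private subnet
sysctl -w net.ipv4.ip_forward=1

nft add table ip nat
nft add chain ip nat postrouting '{ type nat hook postrouting priority 100 ; }'
nft add rule  ip nat postrouting ip saddr 10.0.0.0/16 oifname "ens5" masquerade
```

On AWS you also have to disable the instance’s source/destination check, otherwise EC2 drops the forwarded packets.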

I still get Bezel alerts when Sentry spikes to 50 % CPU, and it’s oddly comforting: the graph jumps, nobody panics, everything keeps ticking.
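
The self-healing script doesn’t need to be clever. Something along these lines covers it; the interface name, peer IP and thresholds are illustrative, not our exact script:

```bash
#!/usr/bin/env bash
# wg-monitor (sketch): bounce the tunnel if the peer stops answering and
# the last handshake is stale, e.g. after the ISP rotates our public IP.
set -euo pipefail

IFACE="wg0"
PEER_IP="10.99.0.1"        # tunnel address of the AWS relay
MAX_HANDSHAKE_AGE=180      # seconds

while true; do
  if ! ping -c 1 -W 3 "$PEER_IP" > /dev/null 2>&1; then
    last=$(wg show "$IFACE" latest-handshakes | awk '{print $2}' | head -n 1)
    now=$(date +%s)
    if [ -z "$last" ] || [ $((now - last)) -gt "$MAX_HANDSHAKE_AGE" ]; then
      echo "$(date -Is) tunnel stale, restarting ${IFACE}" >&2
      systemctl restart "wg-quick@${IFACE}"
    fi
  fi
  sleep 30
done
```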

Disaster recovery & high availability

What happens right now

Our Docker Swarm spans all on‑prem hosts, with a mirror Swarm running in AWS.

Critical workloads run as duplicate stacks; Postgres streams WAL to the self‑managed read‑replica mentioned above.

The same Nginx gateway (t4g.micro) fronts both sites. Its upstream list includes both on‑prem and AWS manager IPs, so traffic flips automatically if on‑prem goes dark — instant in practice, worst case ≈ 5 minutes if Nginx needs to reload.
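
That upstream arrangement is plain Nginx. A trimmed sketch, with hostnames, IPs and ports as placeholders and TLS details omitted:

```nginx
upstream app {
    server 10.99.0.2:8080 max_fails=3 fail_timeout=10s;   # on-prem Swarm manager, reached over WireGuard
    server 10.0.1.20:8080 backup;                          # AWS mirror Swarm manager
}

server {
    listen 80;                      # TLS termination omitted in this sketch
    server_name app.example.com;

    location / {
        proxy_pass http://app;
        proxy_next_upstream error timeout http_502 http_503;
    }
}
```

With backup, Nginx only routes to the AWS mirror once the on-prem peers are marked failed, which matches the behaviour described above. The WAL streaming is likewise stock Postgres; a rough outline, with the replication user and subnet as placeholders:

```bash
# postgresql.conf on the on-prem primary
#   wal_level = replica
#   max_wal_senders = 5
# pg_hba.conf on the primary: let the AWS replica connect over the tunnel
#   host  replication  replicator  10.99.0.0/24  scram-sha-256

# on the AWS replica: clone the primary and start as a standby
pg_basebackup -h 10.99.0.2 -U replicator -D /var/lib/postgresql/data -R
# -R writes standby.signal and primary_conninfo, so the replica follows the primary automatically
```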

Where we’re going next

Swap the mirrored Swarm for a self‑managed Kubernetes control plane in the cloud; the on‑prem rigs re‑join as worker / backup nodes.

Stand up a Traefik ingress gateway inside that cluster; the public Nginx simply proxies to the Traefik endpoint, so fail‑over remains effectively instant.

Internal employee IDP, OpenProject, and AI‑automation dashboards stay 100 % on‑prem to keep operating costs near zero.

(A deeper Kubernetes post is on my backlog.)

What’s running on‑prem

App workstations (3) — Docker Swarm; Go & Next.js microservices (≈ 2 % CPU, 10 % RAM each).

Database workstations (2) — Postgres; secondary peaks at ≈ 50 % CPU on heavy reads.

Monitoring workstation — Sentry (3.5 % CPU, 18 GB RAM) + Bezel.

Queue workstation — Queue gateway & workers; dispatches long‑running jobs.

Dev sandbox workstation — Safe testing area for experiments.

Experimental workstation — Mastodon and other heavy side projects for our community.

VPN gateways — Three gateways expose the internal LAN to dev and employee laptops via Netbird, with per‑user ACLs so student logins see only whitelisted sandboxes and bare‑metal test boxes.

Student sandbox workstation 001 — First in a planned series for hackathons and research projects.

GPU workstation — RTX A4500 + GTX 1060 (AI inference).

What the savings buy us

Real compute for students. Hackathon teams and university clubs deploy straight to our test boxes through Cloudflare tunnels — docker compose up and they’re live (a minimal compose file is sketched after this list).

Tiny team, big footprint. Five devs + two interns keep infra humming for 100k+ students; no SRE war room, no surprise bills.

Freedom to tinker. Need a new service? Spin it tonight. Need AI muscle? Trigger a job on the queue workstation—it automatically off‑loads the GPU‑heavy inference to the GPU workstation. Cost isn’t a blocker.

Open‑source everything. Nearly every back‑office task — employee IDP, project management, AI automations — is solved with a self‑hosted OSS app (Zitadel, OpenProject, Netbird, Dify, etc.). License cost: ₹0; hardware cost: we already own the rigs. We’ll break down those software‑license savings in a separate blog.
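
The student deploy flow from the first bullet really is that small. An illustrative docker-compose.yml; the image, port and tunnel token are placeholders:

```yaml
services:
  web:
    build: .                      # the student's project
    expose:
      - "3000"

  tunnel:
    image: cloudflare/cloudflared:latest
    command: tunnel run --token ${CLOUDFLARE_TUNNEL_TOKEN}
    depends_on:
      - web
```

With token-based tunnels the ingress mapping (which hostname points at http://web:3000) lives in the Cloudflare dashboard, so the compose file stays this small.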


Takeaways

You don’t need a fleet of EC2s to look like you have one.

Bulk‑buying hardware + one Arm micro instance saves silly money.

Simplicity scales teams as well as traffic.

Stick to cloud primitives (VMs + routing). It keeps us vendor‑agnostic and ready to migrate if costs or outages demand.

Questions, ideas, memes? Ping dev@xcelerator.co.in, browse xcelerator.co.in, or join us via our careers page.