I Set Up a Staging Server on EC2… and Everything Broke First

It started like most engineering sessions do: with a whiteboard, a cup of chai, and dangerous levels of confidence.
"It'll take a few hours," I said.
It did not take a few hours.
This is not a polished guide. This is what actually happened: the confusion, the wrong assumptions, the "now what just broke" moments, when I tried to deploy two backend services and a database across EC2 instances, wire them together with Nginx, and somehow come out alive on the other side.
The Plan (Simple… in Our Heads)
The architecture looked clean on paper:
- One EC2 for PostgreSQL (the database)
- One EC2 for two backend services (running via Docker)
- Nginx as a reverse proxy to route traffic
- Everything talking to each other privately
- Done by evening
Browser
   │
   ▼
[Nginx EC2]
   ├──▶ :8080 → Main Service (Docker)
   └──▶ :8081 → Book Service (Docker)
   │
   ▼
[PostgreSQL EC2]
It looked easy. Very easy.
Act 1: The Database That Refused to Talk
I spun up the first EC2, installed PostgreSQL, and set everything up.
sudo apt install postgresql -y
sudo -u postgres psql
Created the database, the user, the schemas. Felt good. Connected the backend.
And then:
connection refused
The Problem
I stared at the error for a while. The database is running. So what's the issue?
After some digging, the answer was embarrassingly simple:
PostgreSQL was only listening on localhost, meaning it would only accept connections from inside the same machine.
Our backend lived on a completely different EC2. It was knocking on a door that wasn't even facing the street.
The Fix
I updated /etc/postgresql/*/main/postgresql.conf:
listen_addresses = '*'
And updated /etc/postgresql/*/main/pg_hba.conf to allow the backend's private IP:
host all all <backend-private-ip>/32 md5
Then opened port 5432 in the AWS Security Group for the DB instance, but only for the backend EC2's private IP. Not the whole internet.
Inbound Rule:
Type: PostgreSQL | Port: 5432 | Source: <backend-private-ip>/32
Restarted PostgreSQL:
sudo systemctl restart postgresql
First small win. Backend connected.
Act 2: The Schema That Existed But Didn't Work
Next error, arriving quickly:
permission denied for schema public
I did create the schema. So why is permission denied?
The Problem
Creating a schema and granting access to it are two different things. I had done the first. Not the second.
The Fix
GRANT USAGE ON SCHEMA public TO your_user;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO your_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO your_user;
Reconnected. Worked.
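You can also ask Postgres directly whether the grants landed, instead of reconnecting and hoping. This check is my addition; your_user is the article's placeholder:

```sql
-- Run in psql; both should return t after the grants above
SELECT has_schema_privilege('your_user', 'public', 'USAGE');
SELECT has_schema_privilege('your_user', 'public', 'CREATE');
```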
Act 3: Tables That Were Never Born
I barely had time to feel good before:
relation "users" does not exist
What's left to go wrong now?
The Problem
Database sync was turned off in our backend config. The ORM never ran migrations. The tables simply did not exist.
The Fix
Enabled sync temporarily, restarted the backend:
DB_SYNC=true
Tables got created. I turned sync back off immediately; you don't leave that on in a real environment.
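For context, this is roughly how a Node ORM config might gate sync behind that flag. A sketch assuming a TypeORM-style setup; shouldSync and the options shape are my inventions, only DB_SYNC comes from the article:

```javascript
// Sketch: gate ORM schema sync behind the DB_SYNC env flag.
// Only the literal string 'true' enables it; anything else fails safe to off,
// so a forgotten or mistyped value cannot silently rewrite your schema.
function shouldSync(env) {
  return env.DB_SYNC === 'true';
}

// Hypothetical TypeORM-style options consuming the flag
const dataSourceOptions = {
  type: 'postgres',
  synchronize: shouldSync(process.env), // never leave true in a real environment
};

module.exports = { shouldSync, dataSourceOptions };
```

The fail-safe default matters: the dangerous state (sync on) should require an explicit, exact value.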
Act 4: Docker Networking Is a Different Universe
With the database finally stable, I moved to the backend EC2.
Two services. I containerized both with Docker.
main-service → port 8080
book-service → port 8081
Started the containers. Both showed as running. Opened the browser.
502 Bad Gateway
It must be an Nginx problem, I thought.
I was wrong.
The Real Problem
Our docker-compose.yml had the ports mapped incorrectly:
# What I wrote
ports:
  - "3000:8080"
  - "3001:8081"
So the host machine was listening on 3000 and 3001 (the left side of each host:container mapping). But Nginx was sending traffic to 8080 and 8081. The requests were going to ports where nobody was home.
The Fix
# What it should be
ports:
  - "8080:8080"
  - "8081:8081"
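Zoomed out, the corrected mappings in a full compose file might look like this; the service names and build paths are placeholders I've assumed, not our actual project layout:

```yaml
# Sketch of the corrected docker-compose.yml; names are illustrative.
services:
  main-service:
    build: ./main-service
    ports:
      - "8080:8080"   # host:container; Nginx targets the host side
  book-service:
    build: ./book-service
    ports:
      - "8081:8081"
```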
Restarted. Tried again.
Still not working.
Act 5: The localhost That Lied
Same 502. Different reason.
I checked the backend code. Found this:
app.listen(PORT)
The Problem
Inside a Docker container, localhost means inside the container: not the host machine, not the network. Our app was listening, but only to itself.
From outside the container, it was completely unreachable.
The Fix
app.listen(PORT, '0.0.0.0')
0.0.0.0 means: listen on all interfaces, including the one Docker uses to talk to the outside world.
Rebuilt the image. Restarted.
Backend responded. Finally.
Act 6: Nginx Was Quietly Eating Our Routes
New error:
Cannot GET /users
The backend was alive. Nginx was routing. But something was getting lost in between.
The Problem
Our Nginx config looked like this:
location /api/ {
    proxy_pass http://localhost:8080/;
}
That trailing slash on proxy_pass tells Nginx to strip the /api prefix before forwarding.
So a request to /api/users was arriving at the backend as /users.
But the backend was registered to handle /api/users.
The Fix
location /api/ {
    proxy_pass http://localhost:8080;
}
No trailing slash. Nginx now forwards the full path as-is.
/api/users → /api/users ✓
Reloaded Nginx:
sudo nginx -s reload
Routes resolved correctly.
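With two services behind one Nginx, the same no-trailing-slash rule applies per location. A sketch of how the final routing might look; the paths match the final architecture below, and the localhost upstreams are my assumption:

```nginx
# One location per service, no trailing slash on proxy_pass,
# so /api/main/users reaches main-service as /api/main/users unchanged.
location /api/main/ {
    proxy_pass http://localhost:8080;
}

location /api/book/ {
    proxy_pass http://localhost:8081;
}
```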
Act 7: The Mistake That Locked Us Out
This one still hurts a little.
I was setting up the firewall on the DB EC2 Instance. Typed this:
sudo ufw allow 5432/tcp
sudo ufw allow 8080/tcp
sudo ufw allow 8081/tcp
sudo ufw allow 80/tcp
sudo ufw enable
And immediately lost SSH access.
The terminal froze. The connection timed out. I was locked out of my own server.
What Happened
I never ran:
sudo ufw allow 22/tcp
The moment UFW was enabled, it blocked all incoming traffic, including our SSH session. Port 22 was closed. I was on the outside.
The Recovery (No Serial Connect Available)
Step 1: Stop the instance from the AWS Console.
Step 2: Edit User Data (Actions → Instance Settings → Edit User Data):
Content-Type: multipart/mixed; boundary="//"
MIME-Version: 1.0
--//
Content-Type: text/cloud-config
#cloud-config
cloud_final_modules:
- [scripts-user, always]
--//
Content-Type: text/x-shellscript
#!/bin/bash
ufw disable
iptables -F
service ufw stop
--//--
Step 3: Start the instance.
The script ran on boot, cleared the firewall rules, and SSH came back.
Step 4: Do it right this time:
# SSH first. Always.
sudo ufw allow 22/tcp
# Then everything else
sudo ufw allow 80/tcp
sudo ufw allow 8080/tcp
sudo ufw allow 8081/tcp
# Then enable
sudo ufw enable
Never again.
The Final Architecture (That Actually Worked)
After everything, this is what I had running:
Internet
   │
   ▼
[Nginx - EC2 #2] (:80)
   ├── /api/main/* ──▶ main-service (:8080) [Docker]
   └── /api/book/* ──▶ book-service (:8081) [Docker]
   │
   │ Private IP connection
   ▼
[PostgreSQL - EC2 #1] (:5432)
Security Groups:
- DB EC2: only accepts :5432 from the backend EC2's private IP
- Backend EC2: accepts :80 from anywhere, :22 from your IP only
UFW on both instances: SSH allowed first. Always.
What I Actually Learned
Not theory. Just the things that burned us:
- PostgreSQL doesn't open to the network by default; you have to configure it explicitly, and carefully
- Docker networking is its own world: localhost inside a container is not the host machine
- 0.0.0.0 is not optional when your app needs to be reachable from outside a container
- Nginx's trailing slash on proxy_pass silently rewrites your paths
- UFW with no SSH rule = instant lockout; recovery is possible but painful
- Small config mistakes cost the most time: not the hard problems, the quiet ones
Final Thought
This whole setup took way longer than it should have.
Not because it was complicated (the architecture is genuinely simple), but because of a dozen small assumptions I made without checking. Each one cost us twenty minutes of confusion.
That's how this works. You read the docs, you think you understand, and then reality introduces itself.
The only way through is to break things, understand why they broke, and fix them properly.
Now you know what i know.
If this saved you even one "now what just broke" moment, you're welcome.
Shubham
Full Stack Developer
