This project was built as part of the Cloud Engineering courseware, Learn to Cloud. Visit here to learn more: https://learntocloud.guide/

In this project, we’ve deployed an existing journal API + Database to Azure using a two-tier architecture. The primary goal of this exercise was to understand cloud architecture patterns, especially regarding networking. We also delve into some Linux Systems Administration.

Goals#

Let’s first define our end goals for the environment. Beyond the application functioning as expected, we want to accomplish the following:

SSH to the VMs should be restricted to your own IP address. Optionally, configure a cloud-native solution (Azure Bastion).
The API should only accept HTTP/HTTPS traffic from the internet.
The database should only accept connections directly from the API server.
The database should be backed up regularly to Azure Blob Storage.
The API and database should start automatically when the server boots or restarts.

Networking#

First, we create a VNet with the address space 192.168.1.0/24. We don’t need extensive space for this VNet as its use will be limited to this project.

Subnets#

Next, we’ll define our subnets. We want to separate the app and database into different subnets. We’ll need to apply different traffic rules to each tier of our application in order to follow the “principle of least privilege”.
Optionally, include an Azure Bastion subnet. If you intend to deploy a Bastion to this project, Azure requires a subnet named exactly “AzureBastionSubnet” with a CIDR block of /26 or larger.
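To sanity-check subnet sizes before carving up the /24, the arithmetic can be sketched in a few lines of shell. The function name is illustrative. The only Azure-specific fact it relies on is that Azure reserves five addresses in every subnet. The /27 in the example is the range that appears later in the pg_hba.conf entry.

```shell
#!/usr/bin/env bash
# usable_hosts PREFIX -> number of assignable addresses in an Azure subnet.
# Azure reserves 5 addresses per subnet (network, gateway, two for DNS, broadcast).
usable_hosts() {
  local prefix=$1
  local total=$(( 1 << (32 - prefix) ))   # total addresses in the CIDR block
  echo $(( total - 5 ))
}

# Example: the /24 VNet, a /27 tier subnet, and the /26 Bastion minimum.
for p in 24 27 26; do
  echo "/$p -> $(usable_hosts "$p") usable addresses"
done
```

A /27 leaves 27 assignable addresses, which is plenty for a single VM per tier, and a /26 Bastion subnet still fits comfortably inside the /24.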

Network Security Groups#

Database#

Here’s our configuration for the database subnet. Apart from Azure NSG defaults, we allow inbound from the API subnet with a destination of port 5432 so that our API can communicate with our database.

API#

Here’s the API subnet’s NSG. Apart from defaults, we allow inbound SSH specifically from our own public IP address. Inbound HTTP to the API VM is allowed from the Internet on port 80. We also allow the AzureLoadBalancer service tag for the LB health probe. Because the API VM has no public IP, traffic only reaches it via the Load Balancer.
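I created these rules in the portal, but the same rules for both subnets could be sketched with the Azure CLI roughly as follows. The resource group, NSG names, rule names, and priorities are hypothetical. The API subnet range is assumed to be the 192.168.1.0/27 that appears later in pg_hba.conf, and 203.0.113.10 stands in for your own public IP.

```shell
#!/usr/bin/env bash
# Hypothetical resource group name; substitute your own.
RG="${RG:-journal-rg}"

# nsg_allow_inbound NSG RULE_NAME PRIORITY SOURCE PORT
# Adds one inbound TCP allow rule to the given Network Security Group.
nsg_allow_inbound() {
  local nsg=$1 name=$2 priority=$3 src=$4 port=$5
  az network nsg rule create \
    --resource-group "$RG" \
    --nsg-name "$nsg" \
    --name "$name" \
    --priority "$priority" \
    --direction Inbound \
    --access Allow \
    --protocol Tcp \
    --source-address-prefixes "$src" \
    --destination-port-ranges "$port"
}

# Example calls (run after `az login`):
#   nsg_allow_inbound db-nsg  AllowPostgresFromApi  100 192.168.1.0/27  5432
#   nsg_allow_inbound api-nsg AllowSshFromMyIp      100 203.0.113.10/32 22
#   nsg_allow_inbound api-nsg AllowHttpFromInternet 110 Internet        80
```

Keeping each rule to one source and one port makes it easy to see that nothing broader than the stated goals has been opened.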

NAT Gateway#

A NAT Gateway with its own public IP address is associated with both subnets and handles outbound (egress) traffic for our VMs.

Public Load Balancer#

Since we don’t want our API VM to have a public IP we use a Public Load Balancer to handle inbound traffic. As part of the config, we map the front end port 80 to the back end port 80. It also uses an HTTP health probe on /health mapped to port 80 to help determine availability. I’m using a Public Load Balancer here for simplicity. In a production setup you’ll want to front the app with Application Gateway or Azure Front Door.
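For reference, the health probe described above could be created from the Azure CLI along these lines. This is only a sketch: the resource group, load balancer, and probe names are hypothetical, and I set mine up through the portal.

```shell
#!/usr/bin/env bash
RG="${RG:-journal-rg}"   # hypothetical resource group name
LB="${LB:-journal-lb}"   # hypothetical load balancer name

# HTTP probe: the backend counts as healthy while GET /health on port 80 returns 200.
create_health_probe() {
  az network lb probe create \
    --resource-group "$RG" \
    --lb-name "$LB" \
    --name http-health \
    --protocol Http \
    --port 80 \
    --path /health
}

# Usage (after `az login`): create_health_probe
```

The load-balancing rule that maps front-end port 80 to back-end port 80 then needs to reference this probe.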

SSH#

When creating your two VMs, make sure to choose the SSH key authentication method. Upon deployment, you’ll receive an option to download a new key pair.
Copy the files into your .ssh folder on your local machine. For me, on Windows, that was “C:\Users\<username>\.ssh”.

Then, edit your C:\Users\<username>\.ssh\config file to allow for the connection. In my setup, I wanted to ensure the database VM has no public IP attached at all, so I use ProxyJump through the API VM to reach the database VM over SSH. Make sure your Load Balancer includes an inbound NAT rule that targets the API VM’s port 22.

# API VM access through LB Public IP
Host JournalAPI
  HostName <LB Public IP>
  User <username>
  IdentityFile ~/.ssh/JournalAPI_key.pem
  IdentitiesOnly yes
  Port 22
  ServerAliveInterval 60

# DB VM accessible only through API VM
Host JournalDB
  HostName <JournalDB private IP>
  User <username>
  IdentityFile ~/.ssh/JournalDB_key.pem
  IdentitiesOnly yes
  Port 22
  # ProxyJump to handle SSH to this VM
  ProxyJump JournalAPI
  ServerAliveInterval 60
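Before moving on, it can help to confirm that both host aliases actually work. A minimal sketch, assuming a Bash-compatible shell (Git Bash or WSL on Windows) and the two aliases from the config above; the function name is my own:

```shell
#!/usr/bin/env bash
# check_ssh_hosts ALIAS... : try a non-interactive login to each alias
# from ~/.ssh/config and report which ones succeed.
check_ssh_hosts() {
  local host failed=0
  for host in "$@"; do
    # BatchMode disables password prompts; ConnectTimeout bounds the wait.
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" true; then
      echo "$host: ok"
    else
      echo "$host: FAILED"
      failed=1
    fi
  done
  return "$failed"
}

# Usage: check_ssh_hosts JournalAPI JournalDB
```

If JournalAPI succeeds but JournalDB fails, the problem is most likely on the hop between the two VMs rather than in the Load Balancer NAT rule.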

Then make sure you have the VS Code extension “Remote - SSH” installed.

If you’ve set up the networking correctly, you’ll now be able to SSH in from VS Code using the extension.

API VM#

Create your VM and upload your app into the /opt directory. At this stage, make sure to fulfill certain prerequisites, such as installing venv tooling, creating a venv, setting your environment variables, installing your requirements.txt, etc. Make sure to configure the DATABASE_URL environment variable. As the API and DB VM live in the same VNet, traffic stays internal. Nothing external can hit the database. The app will connect to the database via this URL.
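As an illustration of the environment variable step, here is a small sketch that writes the .env file. It assumes the app accepts DATABASE_URL in the common postgresql:// URL form, which you should check against your app’s own documentation. The function name and the example values are hypothetical.

```shell
#!/usr/bin/env bash
# write_env_file PATH USER PASSWORD HOST DBNAME
# Writes a DATABASE_URL line (assumed postgresql:// URL form) and restricts
# the file to its owner, since it contains a password.
# Note: a password with special characters would need URL-encoding first.
write_env_file() {
  local path=$1 user=$2 password=$3 host=$4 db=$5
  (
    umask 077   # create the file without group/other permissions
    printf 'DATABASE_URL=postgresql://%s:%s@%s:5432/%s\n' \
      "$user" "$password" "$host" "$db" > "$path"
  )
  chmod 600 "$path"
}

# Usage (placeholder values):
#   write_env_file /opt/journalapi/.env <db_user> <db_password> <JournalDB private IP> career_journal
```

The host part is the database VM’s private IP, which is why the connection never leaves the VNet.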

We want to create a systemd service to autostart our app upon VM boot.
Create the service account:

sudo adduser --system --group --home /opt/journalapi journalapi

Ensure the service account owns all of the application files.

sudo mkdir -p /opt/journalapi/journal-starter
sudo chown -R journalapi:journalapi /opt/journalapi

Next we configure the systemd unit. Here’s what mine ended up looking like:

[Unit]
Description=Journal API (Uvicorn)
Wants=network-online.target
After=network-online.target

[Service]
Environment=PYTHONPATH=/opt/journalapi/journal-starter:/opt/journalapi/journal-starter/api

User=journalapi
Group=journalapi
WorkingDirectory=/opt/journalapi/journal-starter
EnvironmentFile=/opt/journalapi/.env

ExecStart=/opt/journalapi/venv/bin/uvicorn \
  --app-dir /opt/journalapi/journal-starter \
  api.main:app \
  --host 127.0.0.1 \
  --port 8000

Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target

Finally, enable it.

sudo systemctl daemon-reload
sudo systemctl enable --now journalapi
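To confirm the service actually came up, a small polling helper can be handy, since Uvicorn may take a moment to start listening. This is a sketch: the function name is my own, and it assumes the app exposes the /health endpoint on port 8000 that the reverse proxy and health probe rely on.

```shell
#!/usr/bin/env bash
# wait_for_http URL [ATTEMPTS]
# Polls URL once per second until it answers without an HTTP error.
wait_for_http() {
  local url=$1 attempts=${2:-10} i
  for (( i = 1; i <= attempts; i++ )); do
    # -f makes curl fail on HTTP status >= 400; -sS stays quiet but shows errors.
    if curl -fsS -o /dev/null "$url"; then
      echo "up after $i attempt(s)"
      return 0
    fi
    sleep 1
  done
  echo "no response from $url after $attempts attempts" >&2
  return 1
}

# Usage: show recent service logs if the app never answers.
#   wait_for_http http://127.0.0.1:8000/health || journalctl -u journalapi -n 50 --no-pager
```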

Because we don’t want to open port 8000 to the Internet, we use an Nginx reverse proxy that maps port 80 to 127.0.0.1:8000.
First install Nginx.

sudo apt update && sudo apt install -y nginx

Our configuration for the reverse proxy lives here: /etc/nginx/sites-available/journalapi

My file looks like this:

server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name _;

    location /health {
        proxy_pass http://127.0.0.1:8000/health;    # forward to app health
    }

    location / {
        proxy_set_header Host $host;                         # preserve Host header
        proxy_set_header X-Real-IP $remote_addr;             # client IP for logs
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; # proxy chain
        proxy_set_header X-Forwarded-Proto http;             # scheme
        proxy_pass http://127.0.0.1:8000;
    }
}

You’ll then need to enable the site by creating a symlink in sites-enabled and reloading Nginx.

sudo ln -s /etc/nginx/sites-available/journalapi /etc/nginx/sites-enabled/journalapi
sudo systemctl reload nginx
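One thing to watch for: on Debian/Ubuntu packages, the stock site in sites-enabled/default also declares "listen 80 default_server", which conflicts with the config above. A hedged sketch of a safer enable step, with a function name of my own choosing, that removes the stock site and validates the config before reloading:

```shell
#!/usr/bin/env bash
# enable_nginx_site NAME: link the site, drop the stock default site,
# validate the configuration, and reload only if validation passes.
enable_nginx_site() {
  local name=$1
  sudo ln -sf "/etc/nginx/sites-available/$name" "/etc/nginx/sites-enabled/$name"
  # The packaged default site also claims default_server on port 80.
  sudo rm -f /etc/nginx/sites-enabled/default
  # nginx -t parses the full configuration without applying it.
  sudo nginx -t && sudo systemctl reload nginx
}

# Usage: enable_nginx_site journalapi
```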

DB VM#

Now for our database VM. Install PostgreSQL on your VM.

Add this to your postgresql.conf file:

listen_addresses = '*'

Add this to your pg_hba.conf file:

host    all     all     192.168.1.0/27     scram-sha-256
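The pg_hba.conf entry only matches clients whose address falls inside 192.168.1.0/27, so it is worth checking that your API VM’s private IP is in that range. The helpers below are an illustrative sketch in plain Bash. The function names are my own and the addresses in the usage lines are examples.

```shell
#!/usr/bin/env bash
# ip_to_int A.B.C.D -> the address as a 32-bit integer.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# ip_in_cidr IP NETWORK/PREFIX -> exit status 0 if IP lies inside the range.
ip_in_cidr() {
  local ip=$1 net=${2%/*} prefix=${2#*/}
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}

# Usage:
#   ip_in_cidr 192.168.1.5  192.168.1.0/27 && echo "matches the pg_hba rule"
#   ip_in_cidr 192.168.1.40 192.168.1.0/27 || echo "outside the range, would be rejected"
```

A /27 starting at 192.168.1.0 covers 192.168.1.0 through 192.168.1.31. Also note that a change to listen_addresses only takes effect after PostgreSQL is restarted, for example with sudo systemctl restart postgresql.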

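In a psql shell, create your DB role, database, and schema.

-- The password must be a quoted string literal
CREATE ROLE <user_here> WITH LOGIN PASSWORD '<password_here>' NOSUPERUSER;

CREATE DATABASE career_journal OWNER <user_here>;

\c career_journal

CREATE TABLE IF NOT EXISTS entries (
    id VARCHAR PRIMARY KEY,
    data JSONB NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE NOT NULL,
    updated_at TIMESTAMP WITH TIME ZONE NOT NULL
);

-- Only needed if you ran the statements above as the postgres superuser:
-- the table is then owned by postgres, so hand it over to the app role
ALTER TABLE entries OWNER TO <user_here>;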

Backups#

We’ll want to configure some type of backup solution for our database. I decided to back up my database to Azure Blob Storage, with a cron job that runs a dump and uploads it with azcopy daily.

Create a data disk for your DB VM.

You’ll need to attach it to the VM. This guide can help: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/attach-disk-portal

Then you can mount the data disk and move the current PostgreSQL data directory onto it.
Consult your AI chat of choice for instructions on this, as it is somewhat complicated. I’m still learning how to do this and relied on ChatGPT for assistance here.

Next we need to create the storage account and an Azure Blob container.

Then install azcopy: https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10?tabs=apt
Create a script that will create a pg dump and azcopy it to the blob storage.

Mine looks like this:

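#!/bin/bash
# Stop on the first error so a failed dump is never uploaded
set -euo pipefail

# POSTGRES_USER, POSTGRES_DB and POSTGRES_PASSWORD must already be set in the environment
DATE=$(date +%F_%H-%M-%S)

FILE="/tmp/pgdump_$DATE.sql"

# pg_dump only sees PGPASSWORD if it is exported
export PGPASSWORD="$POSTGRES_PASSWORD"

pg_dump -U "$POSTGRES_USER" "$POSTGRES_DB" > "$FILE"

# Replace <storageacct> and <sas_token> with your actual values
azcopy copy "$FILE" "https://<storageacct>.blob.core.windows.net/pgdumps?<sas_token>"

# Delete the local file afterwards to save space
rm "$FILE"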

Make it executable:

chmod +x <path to script>

Schedule a daily cron job to run the script.

Open the postgres user’s crontab:

sudo -u postgres crontab -e

Then add this entry, which runs the script every day at 05:00:

0 5 * * * /var/lib/postgresql/scripts/pg_backup.sh

To test it, run the script manually:

sudo -u postgres /var/lib/postgresql/scripts/pg_backup.sh

Then check that a backup was created in your Azure Blob container.

One quick note: In a real, much larger database, you’ll want to compress the dumps for faster transfer times and to keep storage costs down. Research “gzip” for this.
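As a rough sketch of what that could look like, the dump can be piped straight through gzip so the uncompressed SQL never touches the disk. The function name and the output path in the usage line are hypothetical, and this is a variation on my script above rather than what I currently run.

```shell
#!/usr/bin/env bash
set -o pipefail   # make a pg_dump failure fail the whole pipeline

# dump_compressed DBNAME OUTFILE
# Streams pg_dump through gzip and verifies the archive afterwards.
dump_compressed() {
  local db=$1 out=$2
  pg_dump "$db" | gzip -9 > "$out" || return 1
  gzip -t "$out"   # confirm the archive is readable before uploading it
}

# Usage:
#   dump_compressed career_journal "/tmp/pgdump_$(date +%F_%H-%M-%S).sql.gz"
```

The resulting .sql.gz file can be uploaded with azcopy exactly like the plain dump, and restored later by decompressing it into psql (for example, gunzip -c <file> | psql <database>).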

How to verify everything works#

From the API VM:

curl -i http://127.0.0.1/health

From your local PC’s terminal to test internet accessibility:

curl -i http://<LB_PUBLIC_IP>/
curl -i http://<LB_PUBLIC_IP>/health

From your local browser, navigate to the docs page:

http://<frontend IP of load balancer>/docs

Create some sample entries from this page. Then, explore your database in a psql client from your database VM terminal.
Run a query. For example:

SELECT * FROM public.entries;

You should see your entries listed in the result.

How to improve this project#

The app currently serves only HTTP. Adding HTTPS on port 443, for example with a Let’s Encrypt certificate, would bring this project closer to a production-grade state.

You’ll notice all the resources were created through the portal.
All of this can certainly be done through the Azure CLI.

However, even better would be to define these resources via Terraform.
I intend to do that in the near future, as well as apply other DevOps practices to this project.

If you notice any other ways this project can be improved, please let me know! I’m actively working on learning more regarding best practices.

kalebcastillo

azure-two-tier-app
