root@localhost:~#

Blog posts


Why Snaps are Bad

03/18/2024

I was performing a migration on a self-hosted instance of GitLab when I ran into a strange "bug" with Docker resulting in containers being inaccessible. I had set up a Docker container with GitLab a couple of years ago to store my side projects so I could have redundancy and be able to work on them from my laptop, desktop, etc. Something drew my attention to my version needing an upgrade, so I attempted to swap in the latest GitLab image in my docker-compose.yml and carry on. It did not work; GitLab informed me that I was upgrading from x to y and should read some documentation on how to perform the migration, which was nice. It's a little more involved than I was hoping: you must upgrade from version 13.9.x -> 14.0.12, 14.0.12 -> 14.3.6... until the target version (16.x.x). Okay, I can just run those migrations and do other stuff asynchronously and it won't be a big deal. I tried accessing the server after the first migration and got no response. Huh? Ping. Fine. SSH still working... Maybe try another container, perhaps a vanilla nginx one? That didn't work either. curl localhost? Works just fine. After sifting through many Google results for "container not accessible outside host", I found this little gem:

  • https://askubuntu.com/questions/1423293/ubuntu-22-04-docker-containers-not-accessible-from-outside

Which points out a known bug:

  • https://bugs.launchpad.net/ubuntu/+source/ufw/+bug/1968608

Older Ubuntu servers (20.04) use iptables-legacy. There are two backends, xtables and netfilter, and as long as you use one or the other exclusively, things should be fine. If you don't, undefined firewall behavior may occur. Ubuntu even creates symlinks to help ensure this consistency. However, snap packages have been known to ship with their own iptables or nftables.
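For reference, here's a quick way to see which backend is holding which rules (a rough sketch; it assumes both the iptables-legacy and iptables-nft wrapper commands are installed, as they are on recent Ubuntu):

    update-alternatives --display iptables        # which backend the iptables command points at
    sudo iptables-legacy-save | grep -c DOCKER    # Docker rules written via the legacy/xtables backend
    sudo iptables-nft-save | grep -c DOCKER       # Docker rules written via the nftables backend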

Did I use snap?

    sudo snap services
    Service                          Startup  Current   Notes
    docker.dockerd                   enabled  active    -

Was it updated?

    name:      docker
      latest/stable:    24.0.5   2024-02-01 (2915) 136MB

Lo and behold:

  • https://github.com/docker-snap/docker-snap/issues/68

Looking at the output of iptables-save / iptables-legacy-save, I can see that the new Docker routing rules are not there, but all my old ones are (I can tell by the ports certain apps use). iptables-nft-save, on the other hand, does show the new routing rules. Was the snap Docker distribution switching to netfilter a recent major change?

  • https://github.com/docker-snap/docker-snap/releases

Looks like it.

Therefore, it seems that during an unattended upgrade, a version of Docker that uses netfilter instead of the xtables backend preferred by the system was installed, resulting in undefined firewall behavior (a FORWARD DROP rule that appeared in iptables-legacy but not in iptables) and making my containers unreachable from outside the host.

That was fun.

tl;dr - The snap distribution of Docker ignored my system's preferences, resulting in undefined firewall behavior.


Writing a static site with Next.js

03/13/2024

I just rewrote my site with Next.js, coming from Python's Lektor. It only took a few hours, looks much better, and will be easier to customize in the future. Here are some packages I used to help make this go smoothly:

Mantine

A newer UI library that people have been talking about. I used it on the last app I was working on and it was nice to work with.

Unified & Remark

My old blog posts were stored as "content.lr" files with all the data (including content/body) in the frontmatter. So all I had to do was write some logic in getStaticProps to loop through those files and parse the data into json. This seems to be a pretty standard way of doing things:

import rehypeDocument from 'rehype-document'
import rehypeFormat from 'rehype-format'
import rehypeStringify from 'rehype-stringify'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import {unified} from 'unified'
import {reporter} from 'vfile-reporter'

const file = await unified()
  .use(remarkParse)
  .use(remarkRehype)
  .use(rehypeDocument, {title: 'šŸ‘‹šŸŒ'})
  .use(rehypeFormat)
  .use(rehypeStringify)
  .process('# Hello world!')

// report any parsing messages, then print the resulting HTML
console.error(reporter(file))
console.log(String(file))

  • https://www.npmjs.com/package/unified

The only thing left after that was to sort the posts, then write one map loop to create a posts index at the top and another to render the content into blog posts using dangerouslySetInnerHTML (love that name).
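Roughly what that looks like, as a sketch - the content directory, frontmatter field names, and parsing logic here are assumptions rather than my exact code:

import fs from 'fs'
import path from 'path'
import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeStringify from 'rehype-stringify'

export async function getStaticProps() {
  const dir = path.join(process.cwd(), 'content')
  const posts = await Promise.all(
    fs.readdirSync(dir).map(async (name) => {
      // content.lr files separate fields with lines containing only "---"
      const raw = fs.readFileSync(path.join(dir, name), 'utf8')
      const fields = Object.fromEntries(
        raw.split(/\n---\n/).map((block) => {
          const [key, ...rest] = block.split(':')
          return [key.trim(), rest.join(':').trim()]
        })
      )
      const html = String(
        await unified()
          .use(remarkParse)
          .use(remarkRehype)
          .use(rehypeStringify)
          .process(fields.body ?? '')
      )
      return {title: fields.title ?? name, date: fields.pub_date ?? null, html}
    })
  )
  posts.sort((a, b) => new Date(b.date) - new Date(a.date)) // newest first
  return {props: {posts}}
}

// ...and in the page component: one map for the index, one for the post bodies
export default function Blog({posts}) {
  return posts.map((post) => (
    <article key={post.title}>
      <h2>{post.title}</h2>
      <div dangerouslySetInnerHTML={{__html: post.html}} />
    </article>
  ))
}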

Comparison

The downside to this approach is that there are no GUI/CMS-like features for creating and editing posts. But I have markdown support installed in VSCode, and frankly I'm enjoying writing in here more than in Lektor's confusing web interface (adding posts by running a development server, navigating to the page, clicking "create subpage"... you get the picture). Lektor also lets you upload image files, so that's another thing I'll have to write. Perhaps a bigger drawback of this implementation is that I have to write pagination myself, whereas that came built in with Lektor. Even so, when I was trying to update my Lektor version I was getting a strange error in the pagination, and even the quickstart example had build errors (not sure if it was a theme issue). So here I am.

Deployment is the same: run build and copy the output to /var/www/html. It's so refreshingly easy. Lektor did have some feature where you could put the server IP and it would run rsync as a deploy command or something, but I switched to Cloudflare pages recently and use wrangler (see /blog#clis-the-cloud-and-design).

Thinking about writing your own blog with a Javascript framework? I recommend it!


CLIs, the cloud, and design

10/24/2023

At my last job I wrote the CLI for deploying AI/ML models. I had never written a CLI before. It's not particularly difficult to do, but they are deceptively difficult to get "just right". Since my time at that company I've been writing and deploying a lot of sites. Part of my goal with these projects is to keep them no more complex or resource-demanding than necessary, since I'll be footing the server bill each month.

So I started demoing different providers and their offerings. GCP was nice but not ideal for my use case. I had already used AWS before. Then I stumbled upon Cloudflare - they offer free static site hosting that is far easier to set up than GCP's. So I migrated my blog again and came up with a few static sites. You can deploy projects via the GUI, or from the command line via wrangler, their CLI, which is now part of their "workers-sdk" repository^1. It's written in Javascript (which surprised me - Python or Go would have been my first two choices), easy to install, and strangely similar to the CLI I wrote. There are some bigger differences though, and in hindsight I can see why Cloudflare's approach is better.

Auth and prompting the user

The first thing that stuck out: if the user hasn't passed a token, launch an OAuth workflow via the browser. All I had to do was log in to Cloudflare in my browser, where my username and password were already saved. Letting the user's browser handle the credential concern is desirable.

$ npx wrangler pages deploy html/ --project-name myproject --skip-caching
Attempting to login via OAuth...
Opening a link in your default browser: https://dash.cloudflare.com/oauth2/auth?response_type=code&client_id=[redacted]&redirect_uri=http%3A%2F%2Flocalhost%3A8976%2Foauth%2Fcallback&scope=account%3Aread%20user%3Aread%20workers%3Awrite%20workers_kv%3Awrite%20workers_routes%3Awrite%20workers_scripts%3Awrite%20workers_tail%3Aread%20d1%3Awrite%20pages%3Awrite%20zone%3Aread%20ssl_certs%3Awrite%20constellation%3Awrite%20offline_access&state=lcykA2LiHgli6Az1DFDjbquPHBXWRlnr&code_challenge=_9Jkt0gMzkwSfM-vy9xrSC-yy9y7fa0OYCf5zMYS1Pg&code_challenge_method=S256
Successfully logged in.
šŸŒ  Uploading... (2/2)

āœØ Success! Uploaded 2 files (0.74 sec)

āœØ Deployment complete! Take a peek over at [redacted]

Seeing is knowing

Another takeaway was command line semantic consistency and visualizing the command taxonomy^2. I remember implementing a command that was not intuitive and having to refactor it; that could have been avoided by drawing out the planned command(s) beforehand.

Looking for inspiration (and choosing the right one)

Whenever I faced a problem that I didn't have much experience in, or had no intuition for, I would begin searching GitHub for code that solved a similar problem. I did consult Heroku's CLI, but I passed over it. Cloudflare took inspiration from it in the form of "colon-delimited command namespacing". I like how it makes it painfully clear what a command is operating on. User-proofing your interfaces is difficult.

Avoid logic that requires passing in raw ID strings

I am guilty of the third takeaway in the Cloudflare blog post: my delete command required the GUID of the deployment to delete. Being able to issue a command based on a name is a better user experience than having to paste a long string into the command line. I myself am so used to doing this that it simply doesn't bother me anymore, and so I never thought of it. It's dangerous to assume that your experience will be the same as others'!

Conclusion

Designing a CLI just right is more difficult than you would think: it needs to be intuitive for users with a variety of computing backgrounds, and it takes collaboration to do well. While I didn't write the perfect CLI, I did add some interesting features that most CLIs don't have:

  • Typo resilience - a mistyped command will still work (deply vs deploy) by comparing string similarity against the list of available commands (see the sketch after this list).
  • Aliased commands. Sometimes users can't remember the command and type something similar but different (ls instead of list). A few of the commands were aliased like that.
  • Customized command groupings. Most CLIs when invoked with --help give you a large list of alphabetically sorted commands/options. This is notoriously difficult to read, so we hooked into the CLI framework and modified how commands are printed.
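Here's the gist of the typo resilience as a Python sketch (the command list and alias map are made up, and the real version hooked into our CLI framework rather than standing alone):

import difflib

COMMANDS = ["deploy", "delete", "list", "login", "logs"]   # illustrative command set
ALIASES = {"ls": "list", "rm": "delete"}                   # the aliasing idea from above

def resolve_command(typed: str) -> str | None:
    """Map what the user typed to a known command, tolerating small typos."""
    if typed in COMMANDS:
        return typed
    if typed in ALIASES:
        return ALIASES[typed]
    close = difflib.get_close_matches(typed, COMMANDS, n=1, cutoff=0.7)
    return close[0] if close else None

print(resolve_command("deply"))   # -> deploy
print(resolve_command("ls"))      # -> list (via the alias map; it's too short for fuzzy matching)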

More Time

05/31/2023

At the end of March I received some unfortunate news that many have been receiving lately - me and several others got laid off. This kind of took me by surprise since I was always under the impression that things were OK, that our company was ahead of the curve on this (they let go some people over the summer as well). But I understand it's hard to avoid pressures that not even FAANG companies are insulated from. The silver lining in this? I have 45+ more hours per week to do whatever I please with. Time that cannot be bought anywhere else.

After nearly 7 years of working I realized that I might not get this same opportunity again. So, earlier this month I began writing some apps again. I've conceived of, implemented, and deployed 3 apps so far - one per week. They range in complexity from a simple static site with some vanilla JS/CSS to a frontend + backend with a sqlite database. It seems like an efficient pace, and assuming I have no idea which idea is good or will succeed, I won't lose too much time on any one project. I have some access to A.I. now, which I'm experimenting with, and it seems to be adding some value to my development process. I'm finding that chatGPT is actually a nice programming assistant that definitely cannot do everything I can do. But it can answer my questions pretty well and has only hallucinated badly once (imaginary imports and functions).

What will the next generation of apps based on A.I. look like? I think it's already clear that information retrieval is a huge success - I'm testing the limits of this and how much of an "expert" a language model can be. Initial results are impressive but flawed. Hallucination was again a big problem. Each version is improving so much that I'm confident the issue won't be nearly as much of a problem in the future. Mechanisms/workarounds for this already exist - simply tell the agent it is wrong and it attempts to correct its answer. But for a model to be used in production in a broad range of industries, hallucinations are unacceptable. Imagine if it started giving you fake news and facts, dubious medical advice, buggy code... That could give rise to a whole new class of lawsuits.


Advent of Code - not another Leetcode clone

01/31/2022

Lately I've been doing Advent of Code challenges. Somebody suggested it to me recently and I thought why not - many people use this as a measure of proficiency. It's entirely possible that I've been using frameworks / other people's code so much that I got a little rusty at solving problems with just the standard library. I've done some Leetcode and HackerRank before, so I was expecting something exactly like those platforms. I was wrong. Those other sites are platforms with a network and the most common interview questions. Advent of Code is different.

Advent of Code is not nearly as sophisticated - you code locally on your machine and submit the answer in a text form. What it lacks in front end tech it makes up for in originality: each problem is posed as some fictional event occurring on a submarine that is delving into the unknown depths to retrieve the keys to Santa's sleigh. Or something. The problems are released one per day each December. There is a leaderboard - top scorers are determined by how quickly they submit the correct answer. There are a number of unique inputs/outputs to minimize cheating. Each problem has two parts - the first is relatively easier, and the second usually adds another layer of complexity or demands a scalable solution. The site notes that every problem can be computed in 15 seconds or less on old hardware. If you don't have a good solution it could take your code hours or days to finish!

I think my two favorite things about these challenges are the novelty of the problems and the simplicity of the platform. I don't need to write code in the browser, and I prefer running code locally so I can debug if needed. I'm posting my solutions on GitHub. I'm definitely looking forward to next year's problems and may even experiment with different languages and benchmark my answers.


Hacktoberfest 2021

11/29/2021

This year I did something I had been meaning to do for a while: contribute to open source projects. I always loved the idea of contributing back to the open source code I use so much, but spending time on my own projects took priority. One thing stopping me from contributing to open source sooner was not knowing exactly how to get started. It's a common issue for many people because each project has its own contributing guidelines and level of openness to contributions.

However, with a little push from Hacktoberfest I was able to jump in and make some quality contributions in just a few weeks, completing the challenge of 4 accepted pull requests. I was even able to find a bug in one of Microsoft's projects! While my number of lines of code was relatively small, most of the time was spent reviewing open issues, understanding the code base, and trying to gauge whether I could solve the issue quickly or not. Some projects label open issues with "good first issue" so newcomers have a place to get started. Having a set of projects participating in Hacktoberfest with this guidance allowed me to contribute successfully.

And I learned a lot in the process! Every project is different when it comes to the process for committing code. From conforming to style to running and passing tests, there are a number of checks in place to ensure the quality of the project stays high. Though I knew git from private repositories, I definitely became more familiar with git commands and more nuanced things like resetting, rebasing, and squashing. I would like to keep contributing to projects that I use frequently, possibly even adding features of my own. It is time consuming to filter through issues and find a good one to create a pull request for. I may even open source some of my own code if I can refactor it and clean it up enough to be used by other people.


Making Sense of Nonsense

07/25/2021

For my next project I decided I needed to try something that could possibly directly support my hobby financially. So I started creating a very high-level trading algorithm. Instead of focusing on standard mathematical metrics (moving averages, Bollinger bands, RSI, etc.), which have been done to death and optimized at higher frequencies than I'll ever be able to achieve on my own, I decided to look at something a little more niche: /r/wallstreetbets sentiment. Don't worry, it's not another mention-frequency-based meme stock buying bot. I try to decide which stocks to trade based on the positive and negative sentiment present in the body text. The idea is that there is volatility in these stocks and that it may be possible to day trade these tickers based on what people are about to do (assuming my parsing logic extracts the signal the same way humans interpret the posts). Sentiment analysis of user comments is already a challenging task on its own. Having a computer program work its way through double negatives, sarcasm, jokes, and idioms - in a community that does these things to the fullest extent - is challenging to say the least. One example is a comment that goes like this:

"Buying put options on $GME is free money"

The sentiment of the comment is positive; however, the trading action of buying a put is negative (it means betting the stock will go down in price).

Extracting an accurate signal from this type of data is very tricky and time consuming, but I was able to come up with a quick solution to account for these comments: I monkey patched a sentiment analysis algorithm. I have used gevent monkey patching before in production, where I was facing certain constraints with async code. This time I wanted to learn something new and decided to do my own monkey patching. Now my algorithm knows the meaning of words in the context of trading.
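I haven't named the analyzer above, but to illustrate the idea, here's what that kind of patch can look like with a VADER-style analyzer (vaderSentiment), whose word-to-score lexicon is just a dict. The words and scores below are examples, not my actual tuning:

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

# Override/extend the lexicon with trading-specific meanings (VADER scores run roughly -4..+4)
analyzer.lexicon.update({
    "put": -2.0, "puts": -2.0,    # buying puts is bearish, even when the tone is upbeat
    "call": 1.5, "calls": 1.5,    # buying calls is bullish
    "moon": 2.5,
    "bagholder": -2.0,
})

print(analyzer.polarity_scores("Buying put options on $GME is free money"))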

Another interesting problem that came up is the potential for fake accounts / bots to spread misinformation. Institutional investors have been keen on sentiment for trading for years now (the Bloomberg terminal has a section for Twitter sentiment) and similarly equipped groups could manipulate the impulsivity of /r/wallstreetbets to their advantage.

My next steps are to pull in sentiments about a company from a variety of other sources - Twitter has a simple API as well. There are pretty sophisticated financial APIs out there that I am researching. The most challenging thing won't be simply answering whether "this news is good news" or "this financial data is bad", but deciding what to do with that particular stock, since apparently stocks get dumped on good news.

There are a number of algorithmic trading libraries available in the Python ecosystem. One I want to use is backtrader. The simple interface that "gets out of your way" is what I am looking for. It allows you to define custom data sources, do backtesting, and trade live.


Learning Golang

06/16/2021

I decided it was a good time to learn another programming language. So I chose Go. Why did I choose Go? I don't know, somebody gave a presentation about it once and it seemed kind of interesting. I see other people use Go. It is used in many network and devops projects (ones that I use, like Docker). Why not see what it's about?

I created some simple web apps (see: hget.org) to learn it. I used the Gin web framework ^0 because even though the standard http and net libraries are great, I did not want to spend time writing this code.

Instead I wanted to write something more like this:

	r := gin.Default()

	r.LoadHTMLGlob("templates/*")

	r.GET("/", func(c *gin.Context) {
		c.HTML(http.StatusOK, "base.tmpl", gin.H{
			"title":       "IP",
			"curl":        "curl -H \"Content-Type: application/json\" ip.hget.org/api/",
			"curl_result": "{\"ip\":[\"127.0.0.1\"],\"port\":[\"55636\"]}",
		})
	})

That way I could reuse the same base template and then write some Javascript on the front end to build the main body of the page. Add an nginx config that defines a reverse proxy for different subdomains (roughly sketched below) and I could create 3+ websites for the cost of 1 (domains and server bills add up fast).
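The config looks roughly like this - ip.hget.org is real, but the second subdomain and the ports are made up for illustration:

# one server block per subdomain, each proxying to a different Go binary
server {
    listen 80;
    server_name ip.hget.org;

    location / {
        proxy_pass http://127.0.0.1:8081;
        proxy_set_header Host $host;
    }
}

server {
    listen 80;
    server_name headers.hget.org;

    location / {
        proxy_pass http://127.0.0.1:8082;
        proxy_set_header Host $host;
    }
}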

What did I like about Go? Coming from Python it was refreshing to be able to compile the app, scp it to the server, and execute it. No requirements file. No "pip install -r requirements.txt". No "version 5.3 of x is required but 4.1 is installed" errors. No WSGI. No mucking around with Python versions and virtualenvs (relevant). Because of static typing and a compiler that forces me to write

if err != nil { 
    panic(err) 
}

every time, there are very few surprises (I would handle errors better than this in a more serious project). Also, having written a fair amount of async code, I can appreciate Goroutines. Having a synchronous API for asynchronous code is, in my opinion and in most cases, preferable. I liked structs and how they are embeddable with field promotion. Interfaces are great tools for design. In my time poking around the standard lib source code, I found much of it to be extremely concise and readable. It reminded me of what I learned in Patterns of Enterprise Application Architecture. So much so that I question if Python really is the least complex and most readable language after all. On the whole, I feel like Go forces me to be a better programmer.


How to Make a Podcast Feed with Django

04/29/2021

Django comes with a syndication framework that lets you create RSS and Atom feeds easily^0. If you already have your model set up all you need to do is subclass the Feed class, add it to your urls.py, and map your model's fields to the XML fields:

from django.contrib.syndication.views import Feed


class PodcastFeed(Feed):
    copyright_text = "Copyright (C) 2021, In Shape Mind"
    title = "In Shape Mind Podcast"
    link = "/podcast/"
    description = "News articles, transcribed from text, from a variety of quality sources."

    def categories(self):
        return 'news', 'current events', 'trending'

    def feed_copyright(self):
        return self.copyright_text

    def items(self):
        return PodcastArticle.objects.order_by('-id')[:65]

    def item_title(self, item):
        return item.article.title

    def item_description(self, item):
        return item.article.summary

    def item_link(self, item):
        return item.mp3_file.url

    def item_author_name(self, item):
        return item.article.author

    def item_pubdate(self, item):
        return item.article.publish_date

    def item_categories(self, item):
        return item.article.keywords

Since my site^1 already has tons of quality textual data, I thought why not run it through a text-to-speech program and make it a podcast app as well? To get this to work I created the PodcastArticle model, which has a one-to-one relationship with my Article model. The only thing left to do after that was create the mp3 files from the article text. The most difficult part was realizing that making the storage target work in development as well as production is as easy as:

from django.conf import settings
from django.core.files.storage import FileSystemStorage
from django.db import models

if not settings.DEBUG:
    from inshapemind.storage_backends import PublicMediaStorage


def select_storage():
    return FileSystemStorage() if settings.DEBUG else PublicMediaStorage()


class PodcastArticle(models.Model):
    article = models.OneToOneField('djangonewspaper.NewspaperArticle', null=True, on_delete=models.SET_NULL)
    mp3_file = models.FileField(upload_to='podcasts/', storage=select_storage())

I needed to be selective about which articles I transcribed to audio to keep the quality high, so I just created a celery task and inserted it into the part of my application that searches articles for "trending" keywords (but also filters on completeness of data). So now my celery workers can create a corresponding PodcastArticle object & mp3 file (hosted on CDN) that is added to my Podcast RSS feed. I think I wrote ~ 100 lines of code in all. Cool!
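A rough sketch of that Celery task - the TTS helper and the article field names are placeholders, not my actual code:

from celery import shared_task
from django.core.files.base import ContentFile

from djangonewspaper.models import NewspaperArticle   # app name taken from the FK above
from .models import PodcastArticle


def text_to_speech(text):
    """Placeholder: swap in whatever TTS engine you use (gTTS, a cloud TTS API, etc.)."""
    raise NotImplementedError


@shared_task
def create_podcast_article(article_id):
    """Turn one trending article into an mp3 and a corresponding PodcastArticle row."""
    article = NewspaperArticle.objects.get(pk=article_id)
    audio = text_to_speech(article.text)
    podcast = PodcastArticle(article=article)
    # FileField.save() writes through whichever storage select_storage() picked
    podcast.mp3_file.save(f"{article_id}.mp3", ContentFile(audio))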


Bitcoin Orderbook with Nodejs and Vuejs

02/13/2021

For this project I wanted to combine several Bitcoin exchange APIs into one order book chart to give a more holistic impression of liquidity. An order book is a cumulative chart of bid/ask volume that slopes downward, then upward, from left to right. The inspiration came from wanting to see the overall order book, which was previously available on data.bitcoinity.org; however, their order book chart stopped working for me. So I decided to make one myself. Most exchanges have a public order book endpoint, and the data comes back in a similar structure. Just a little data transformation was needed to map the responses to a common format, and then I could wrangle it all into a chart.js component. I would rather have used Python for this part, but Javascript is nice too because of its map/filter/reduce functions. I actually preferred this to my usual Django or Flask apps because there is less overhead involved in getting an asynchronous server set up.
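The transformation is mostly map/sort/reduce. Here's a rough sketch of the normalize-and-accumulate step (the level data and shapes are simplified placeholders - each exchange's response differs slightly):

// example price levels merged from multiple exchanges (placeholder numbers)
const combinedBids = [['42000.5', '0.30'], ['41990.0', '1.20'], ['42010.0', '0.50']]
const combinedAsks = [['42020.0', '0.40'], ['42050.0', '2.00'], ['42030.0', '0.70']]

const normalize = (levels) =>
  levels.map(([price, amount]) => ({ price: Number(price), amount: Number(amount) }))

// bids accumulate from the best (highest) bid downward, asks from the best (lowest) ask upward
const cumulate = (levels, descending) =>
  [...levels]
    .sort((a, b) => (descending ? b.price - a.price : a.price - b.price))
    .reduce((acc, level) => {
      const total = (acc.length ? acc[acc.length - 1].total : 0) + level.amount
      return [...acc, { price: level.price, total }]
    }, [])

console.log(cumulate(normalize(combinedBids), true))
console.log(cumulate(normalize(combinedAsks), false))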

I chose to use vue-chartjs because I had used it at work before and had a good experience. A simple vue-chartjs component looks something like this:

Vue-Chartjs component with local data

import { Bar } from 'vue-chartjs'

export default {
  extends: Bar,
  data: () => ({
    chartdata: {
      labels: ['January', 'February'],
      datasets: [
        {
          label: 'Data One',
          backgroundColor: '#f87979',
          data: [40, 20]
        }
      ]
    },
    options: {
      responsive: true,
      maintainAspectRatio: false
    }
  }),

  mounted () {
    this.renderChart(this.chartdata, this.options)
  }
}

I wanted to serve the chart page with the aggregated data local to the chart definition, but it is kind of clunky to render JSON data within an HTML template (I toyed around with MustacheJS). Instead, I composed my Vue app so that it has an async mounted() method that calls back to the Node server for the aggregated data and then renders the chart; otherwise the chart renders with no data because the API call is still in flight. However, doing things this way cost me some flexibility in declaring options for my chart. The options I set to add axis labels and format the price into a more readable $xx,xxx format do not take effect. If I go against what the documentation says and pass the options in directly, rather than as a this.options reference, the axis labels and formatters work, but the rest of the chart breaks.
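The async variant I'm describing looks roughly like this (the endpoint path is an assumption):

import { Bar } from 'vue-chartjs'

export default {
  extends: Bar,
  data: () => ({
    chartdata: null,
    options: { responsive: true, maintainAspectRatio: false }
  }),

  async mounted () {
    // fetch the aggregated order book from the Node server, then render
    const res = await fetch('/api/orderbook')
    this.chartdata = await res.json()
    this.renderChart(this.chartdata, this.options) // documented form; see the options caveat above
  }
}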

I developed and tested with a local Docker image using docker-compose.yml. I mounted the project folder in the container and configured Nodejs to allow for hot reloading. That way I could rapidly develop my backend code and see my changes live. For the Vue app all I had to do was run npm run build and add the dist folder to the express.static paths and voilà, a Nodejs/Vuejs app.

Overall the project was fun to work on; I enjoyed getting more exposure to Javascript and its ecosystem. I liked using Nodejs for an API-centered project, as it's easier to work with JSON data compared to my experience in Python, where I would constantly be doing .get(key, {}).get(key2, None)... However, I can't help but feel I'm missing the Javascript equivalent of Pandas in this toolset. This makes me want to wrap a Vuejs app in a Python Flask backend for my next webapp.


Fullstack news aggregator webapp with Django, Postgres, and Docker

12/14/2020

Overview

In this post I will outline my use of the LEPP stack (Linux, Nginx, PostgreSQL, Python) to create a fairly complex web app (inshapemind.com, now defunct) with relatively little overhead. You may already be familiar with each component, but I will also point out some interesting patterns I've adopted. For those unfamiliar with the stack: Linux is used because it is free and open source, ubiquitous in the cloud, and extremely robust. Nginx is used for those same reasons, as well as being performant. PostgreSQL follows the open source trend and also has good security practices. Python is used for access to web frameworks and libraries that can save you a lot of time (Django, requests, pandas, and newspaper are some of my favorites).

Goal

My goal was to create a web app that could aggregate content from various news outlets and wrangle the important data into a model defined within Django's ORM. I did this by using what would probably be regarded as an obscene number of third-party libraries (don't reinvent the wheel) and by integrating them with Django's application patterns. Yes, I could probably remove half my dependencies by writing a handful of functions with the standard lib, but the goal is a working prototype first and foremost.

On the front end I wanted a responsive, crisp, and readable interface. The most important thing was to present the articles that people want to read, or make them otherwise searchable/browseable. My target was pages with 1 MB or less of data and load times in the 500-750ms range. My goal for the style of the interface was for it to be tolerable (I prefer backend development).

Component Choices

Django

It was either this or Flask. I chose Django since it has an excellent object-relational mapping (ORM), plenty of libraries like allauth, django-rest-framework, and mail/storage providers like anymail and storages. Not having to wrestle with user creation, password validation and storage, verification, forms, etc will literally save me hundreds of hours of headache and provide my users with a better experience and security.

To give you an idea of how simple querying the database in Django is, look at these typical examples:

>>> queryset = MyModel.objects.all()
# or
>>> queryset = MyModel.objects.filter(publish_date__range=[startdate, enddate], language='en', video=False).prefetch_related('author', 'domain').order_by('-publish_date')

The API covers 99% of SQL functionality^0

  • https://django-allauth.readthedocs.io/en/latest/
  • https://www.django-rest-framework.org/tutorial/quickstart/

Newspaper3k

This library allows you to "scrape" or "crawl" websites and extract articles. It has a simple API and makes use of multiprocessing to fetch pages quickly (but not so quickly as to overwhelm the host server). It works by abstracting the DOM, iterating over elements, and assigning them scores to decide which text is part of the article. There are parsers to extract data like title, authors, publish date, and so on. It is not always reliable since every site is different, so I had to write some additional parsers for cases where the data isn't extracted. I made a separate Django app based on newspaper, created models for the articles, authors, and domains, and made a management command to run a script that gathers articles from a defined list of websites. So adding more articles looks like:

docker-compose -f production.yml run django python manage.py cron

The management command allows for passing in lists of websites, overriding the default whitelist. That solves the content aspect of the site.
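Under the hood of a command like that, the core newspaper3k loop looks roughly like this (the site URL and article limit are placeholders):

import newspaper

# memoize_articles skips URLs that were already seen on previous runs
source = newspaper.build('https://example-news-site.com', memoize_articles=True)

for article in source.articles[:20]:
    article.download()
    article.parse()
    print(article.title, article.authors, article.publish_date)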

VueJS

I wanted users to be able to do simple things like follow sites and authors, and bookmark articles. I envisioned having a profile page where you could keep track of articles and get a customized feed based on your preferences. The issue with Django is that to add or remove an object you typically make a POST request, which gets handled by a view and redirects you to a GET request (another database hit). This didn't strike me as very efficient or modern, so I created a REST endpoint and added some VueJS to my templates. VueJS is amazing for applications with data - working with lists and API endpoints is a breeze ^1. You can create a web app entirely with VueJS and serve it as a SPA, or you can add the .js files to your page and make use of its functionality. I feel like I'm cheating a bit here and having my cake and eating it too, but it works for now.
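For a sense of what that endpoint looks like, here's a django-rest-framework sketch (the profile/bookmarks fields are assumptions, not my actual models):

from rest_framework import status
from rest_framework.decorators import api_view, permission_classes
from rest_framework.permissions import IsAuthenticated
from rest_framework.response import Response


@api_view(['POST', 'DELETE'])
@permission_classes([IsAuthenticated])
def bookmark(request, article_id):
    """Add or remove a bookmark; the Vue code calls this with fetch/axios, no page reload."""
    bookmarks = request.user.profile.bookmarks       # assumed M2M field on a profile model
    if request.method == 'POST':
        bookmarks.add(article_id)
        return Response(status=status.HTTP_201_CREATED)
    bookmarks.remove(article_id)
    return Response(status=status.HTTP_204_NO_CONTENT)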

PostgreSQL

Postgres was probably the easiest decision to make. It performs. It is secure. But most of all, it has a Django integration ^2. The full-text search is perfect for my text-heavy application.

Nginx

Nginx hasn't failed me yet. It acts as a reverse proxy, passing requests to the WSGI server (gunicorn). Here is a simplified version of my config:

upstream django {
    server django:5000;
}

server {
    listen 80;
    server_tokens off;
    server_name inshapemind.com;
    return 301 https://inshapemind.com$request_uri;
}

server {

    listen 443 ssl;
    server_name inshapemind.com;
    ssl_certificate /etc/nginx/certs/fullchain.pem;
    ssl_certificate_key /etc/nginx/certs/privkey.pem;

    location / {
        proxy_pass http://django;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_redirect off;
    }
}

It's good practice to put a traditional web server in front here because Nginx handles the raw connections (TLS, slow clients, lots of them at once) better than the WSGI server would on its own. Usually you would also have a line here for serving static assets, but I have a CDN set up for that.

Docker

Most people have heard of Docker by now, but I doubt everyone knows the full extent of what it has to offer. Yes, it allows you to run containerized applications, and yes, Kubernetes uses it. I like it because it allows you to manage multiple containers and their configuration in a YAML file with docker-compose. Furthermore, you can provision and manage remote servers with docker-machine*. For cloud providers like Digital Ocean that have an API, you can create an instance from the command line by passing in the right parameters. The Docker engine is then installed, and you can ssh into the instance simply by invoking docker-machine ssh myproject. Here is what my YAML configuration looks like:

version: '3'

services:
  django: &django
    build:
      context: .
      dockerfile: ./compose/production/django/Dockerfile
    image: inshapemind_production_django
    depends_on:
      - postgres
      - redis
    env_file:
      - ./.envs/.production/.django
      - ./.envs/.production/.postgres
    volumes:
      - newspaper_cache:/tmp/.newspaper_scraper
      - nltk:/root/nltk_data
    expose:
      - 5000
    command: /start

  postgres:
    build:
      context: .
      dockerfile: ./compose/production/postgres/Dockerfile
    image: inshapemind_production_postgres
    volumes:
      - production_postgres_data:/var/lib/postgresql/data
      - production_postgres_data_backups:/backups
    env_file:
      - ./.envs/.production/.postgres

  nginx:
    build:
      context: .
      dockerfile: ./compose/production/nginx/Dockerfile
    image: inshapemind_production_nginx
    depends_on:
      - django
    ports:
      - "0.0.0.0:80:80"
      - "0.0.0.0:443:443"
    volumes:
      - /etc/letsencrypt/live/inshapemind.com/fullchain.pem:/etc/nginx/certs/fullchain.pem:ro
      - /etc/letsencrypt/live/inshapemind.com/privkey.pem:/etc/nginx/certs/privkey.pem:ro

  redis:
    image: redis:5


volumes:
  production_postgres_data: {}
  production_postgres_data_backups: {}
  newspaper_cache: {}
  nltk: {}

Notice how easily I've added volumes for Django and Nginx: the cert files are bind-mounted from the host, and the named volumes persist across container rebuilds. The newspaper cache contains the memoization information so I don't hit the same URL multiple times. I have another container for acquiring a Letsencrypt cert, which is then mounted in the production Nginx container.

Summary

That's my LEPP stack. It is easy to develop and deploy. I hope you learned something new and/or will consider some of these technologies in your next project if you haven't already.

  • docker-machine has been deprecated. Try "Docker Desktop" instead