Humans of the 12th Armored Division

From a 17,000-row Excel spreadsheet to a contentful website with external API integrations.

Features

  • 17,000 man roster
  • JWT admin auth
  • OpenSearch API integration
  • Discourse custom content

Tech Stack

  • SvelteKit
  • Tailwind CSS
  • Flask
  • SQLite
  • Vercel

Overview

Screenshot of the desktop landing page.
The website landing page on desktop.

Humans of the 12th Armored Division, also known as the 12th Armored Directory, is a website and repository for information relating to the soldiers who served in the U.S. Army’s so-called “Mystery Division” during World War II. Commissioned by a citizen historian, the site aspires to shed light on the stories of these men’s lives and their experiences, fill in missing gaps in the existing record, and make this information accessible to the broader public.

The design of the site is informed by the client’s research processes. Starting from the structure of an initial roster spreadsheet, I designed a relational database schema and API, integrated each roster member’s page with relevant external APIs, and added support for community-editable content that can be attached to a roster member’s page. I was fully responsible for everything from the database schema to the frontend design and implementation.

Architecture

{
  person_id: 1,
  first_name: 'Jack',
  last_name: 'White',
  units: [
    { company: '', battalion: '' },
    { company: '', battalion: '' }
  ],
  addresses: [
    { city: 'New York', state: 'NY' }
  ]
}
Sample JSON response from the API.

First, I converted the data in the spreadsheet my client provided to a relational schema. This required some normalization, since the spreadsheet encoded information about multiple addresses and battalion units inconsistently — sometimes with duplicate rows, other times in extra columns. The final schema assigns a unique numerical ID to each roster member, which is then used as a foreign key to the tables containing addresses, battalion units, and other details.

After importing the normalized roster into a SQLite database, I used Flask to serve a lightweight JSON API that would be used by the frontend (implemented in SvelteKit) to populate each page. I also created indices over names, locations, and units in order to speed up common queries and avoid full database scans.

JSON Web Token authentication

Frontend
user submits username and password
Middleware
forward credentials
Backend
If credentials are authorized...
respond with access and refresh tokens
Set-Cookie containing access and refresh tokens
tokens stored in HTTPOnly cookie
Diagram describing the JWT login flow.

The client requested the ability to create and edit roster member pages, which meant that I needed to implement some form of authentication. I opted for JSON Web Tokens (JWTs), as I had used them in a previous project and found them particularly useful in “stateless” hosting environments such as Vercel, the service I use to deploy the frontend (as well as this portfolio!).

I used Flask-JWT-Extended to protect the relevant POST, PUT, and DELETE routes on the backend. This Flask extension provides helper functions for creating, verifying, and refreshing our authentication tokens, which are read from the Authorization HTTP header. I use an asymmetric signing algorithm to encode the tokens so they can be verified at the frontend without sharing the private key across the two separate applications.

Details

In order to use these same tokens for authorization on the frontend — that is, the actual routes that allow access to the administrative interface — I set up some hooks in the SvelteKit middleware which run on every request.

The login flow works as follows:

  1. The SvelteKit middleware forwards the submitted username and password to the backend.
  2. The backend verifies the credentials and responds with the access and refresh tokens.
  3. The middleware receives these tokens and stores their values in a secure HTTPOnly cookie with SameSite=Strict to prevent cross-site scripting attacks.

After the cookie is set, the middleware verifies the access token on every request using the public key, and automatically refreshes it using the refresh token if it’s about to expire. If the token fails verification or cannot be refreshed, the cookie is deleted. If the token passes verification, then the user is allowed to access protected frontend routes. These routes in turn proxy any updates to the backend, setting both the Authorization header as well as a custom header to provide extra assurance that the request is coming from the client rather than an unauthorized origin.

Integrating the OpenSearch API

Library search results embedded on a roster member's page.

The beginnings of the 12th Armored Directory can be traced back to my client’s efforts to research, recontextualize, and augment existing online information about the 12th Armored Division. While much of this information is scattered across the web, we also found a few key institutional archives and repositories with open APIs.

In particular, The Portal to Texas History hosts a wealth of material related to the 12th Armored Division, including scrapbooks, photographs, and newsletters. I was able to create a component that taps into the OpenSearch API to embed a search within these archives for material related to a roster member, along with a small contextual snippet that links to the resources on the Portal’s website.

Custom content with Discourse

The client expressed interest in creating a community connected to the website. For this purpose, we spun up a small server to run the popular Discourse forum software, hosted on a subdomain of the main website.

Later, the client also expressed interest in adding more “unstructured” data, such as links, images, and files, to the roster member pages. We discussed the possibility that these pages could be edited not just by the site admin but by the community, and this led to the idea of using Discourse itself as the content management system.

We found a number of benefits to extending the site to leverage Discourse rather than writing a fully custom solution. Crucially, Discourse affords community-editable pages without complicating our existing authentication schema. Because Discourse itself already supports media uploads, we would not need to implement a separate file hosting solution. In addition, it provides an interface familiar to any forum user for editing and uploading content, and even allows for full-text search of this content.

Details

I added the ability to connect roster member pages to a Discourse topic by adding an admin option to create a Discourse topic on each roster member page. To prepopulate the fields and add two-way linking (from the backend to Discourse and vice-versa), I used the Discourse Custom Wizard to add a person_id field to these special wiki topics and create them from a template. This content can then be edited on Discourse by any user with the appropriate permissions.

Any page that has an associated Discourse topic retrieves and embeds the extra content via the Discourse API. The custom content can also be viewed on Discourse itself, with a backlink to the roster member page on the main website.

Takeaways

I was responsible for everything from the backend database architecture to the frontend design and development of this website. As the solo developer on this project, I had the opportunity to not only make implementation choices at every level of the stack, but to then also observe the ramifications of these decisions. Decisions made at the level of the backend, such as database architecture, ultimately cascade down to every part of both the developer and user experience. I realized that friction at the frontend can signal a design issue that may exist farther down the pipeline.

I am still actively working with this client, and thus this site is still an work in progress. Throughout this project, I have learned to put an emphasis on designing with flexibility in mind, so the site can accommodate iteration and experimentation without ever having to rewrite existing parts of it from scratch.

A tool that I used for the first time on this project was Tailwind CSS. Although I’m quite comfortable with writing CSS, Tailwind sped up the frontend design process considerably and made it possible for me to focus my attention on backend concerns and features such as the API integrations.