A static site generator for a research group website using Poole and git

The whole idea of static site generators is interesting, especially for someone who has had to deal with the pain of running websites on content management systems. I've been dabbling with a static site generator for our research group website and I think it's a good plan.

What's a static site generator?

First, here's a quick historical introduction to get you up to speed on what I'm talking about:

  1. When the web was invented it was largely based on "static" HTML webpages with no interaction, no user-customised content, etc.
  2. If you wanted websites to remember user details, or generate content based on the weather, or based on some database, the server-side system had to run software to do the clever stuff. Eventually these evolved into widely-used "content management systems" (CMSes) - such as Drupal, WordPress, Plone, MediaWiki.
  3. However, CMSes can be a major pain in the arse to look after. For example:
    • They're very heavily targeted by spammers these days. You can't just leave the site and forget about it, especially if you actually want to use CMS features such as user-edited content, or comments - you need to moderate the site.
    • You often have to keep updating the CMS software, for security patches, new versions of programming languages, etc.
    • They can be hard to move from one web host to another - the new host will often have the wrong version of PHP or whatever.
  4. More recently, HTML+CSS+JavaScript have developed to the point where they can do a lot of whizzy stuff themselves (such as loading and formatting data from some data source), so there's not quite as much need for the server-side cleverness. This led to the idea of a static site generator - why not do all the clever stuff at the point of authoring the site, rather than at the point of serving the site? The big big benefit there is that the server can be completely dumb, just serving files over the network as if it was 1994 again.
    • That gets rid of many security issues and compatibility issues.
    • It also frees you up a bit: you can use whatever software you like to generate the content, it doesn't have to be software that's designed for responding to HTTP requests.
    • It does prevent you from doing certain things - you can't really have a comments system (as in many blogs) if it's purely client-side, for example. There are workarounds but it's still a limitation.

It's not as if SSGs are poised to wipe out CMSes, not at all. But an SSG can be a really neat alternative for managing a website, if it suits your needs. There are lots of nice static site generators out there.

Static site generators for academic websites

So here in academia, we have loads of old websites everywhere. Some of them are plain HTML, some of them are CMSes set up by PhD students who left years ago, some of them are big whizzy CMSes that the central university admin paid millions for and that don't quite do everything you want.

If you're setting up a new research group website, questions that come to mind are:

  • How much pain would it take to convince the IT department to install this specific version of Python/PHP/Ruby, plus all the weird little plugins that this software demands?
  • Who's going to maintain the website for years, applying security patches, dealing with hacks, etc?
  • If I go through this hassle of setting up a CMS, which of its whizzy features do I actually want to use? Often you don't really care about many core CMS features, and the features you do want (such as publications lists) are handled by some half-baked plugin that a half-distracted academic cobbled together years ago and now doesn't work properly.

So using a static site generator (SSG) might be a really handy idea, and that's what I've done. I used a static site generator called Poole, which is written in Python. It appealed to me because of how minimal it is.


It has one HTML template which you can make yourself, and then it takes content written in Markdown syntax and puts the two together to produce your HTML website. It lets you embed bits of Python code in the Markdown too, if there's any whizzy stuff needed during page generation. And that's it, it doesn't do anything else. Fab!
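To make the "template plus content plus embedded Python" idea concrete, here's a rough sketch using only the Python standard library. It is not Poole's actual code or syntax: the `$title`/`$content` placeholders, the `{{ ... }}` markers and the function names `expand_python` and `render_page` are all made up for illustration, and the Markdown-to-HTML conversion step is left out entirely.

```python
import re
from string import Template

# Hypothetical one-file page template; the placeholder names are assumptions.
PAGE_TEMPLATE = Template("<html><body><h1>$title</h1>\n$content\n</body></html>")


def expand_python(text, variables):
    """Replace each {{ expression }} in text with the result of evaluating it.

    The expression is evaluated with eval(), so this is only suitable for
    content written by people you trust (e.g. your own research group).
    """
    def evaluate(match):
        return str(eval(match.group(1).strip(), {}, dict(variables)))

    return re.sub(r"\{\{(.+?)\}\}", evaluate, text)


def render_page(title, body, variables):
    """Expand embedded expressions in body, then drop it into the template."""
    content = expand_python(body, variables)
    return PAGE_TEMPLATE.substitute(title=title, content=content)


if __name__ == "__main__":
    members = ["Ada Example", "Bob Sample"]  # made-up data
    print(render_page("People", "We have {{ len(members) }} members.",
                      {"members": members}))
```

The point is just that all of this happens once, on the authoring side, and the output is a plain HTML file that any dumb web server can serve.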

But there's more: how do people in our research group edit the site? Do they need to understand this crazy little coding system? No! I plugged Poole together with GitHub for editing the Markdown pages. The Markdown files are in a GitHub project. As with any GitHub project, anyone can propose a change to one of the text files. If they're not pre-authorised then it becomes a "Pull Request" which someone like me checks before approving. Then, I have a little script that regularly checks the GitHub project and regenerates the site if the content has changed.

(This is edging a little bit more towards the CMS side of things, with the server actually having to do stuff. But the neat thing is firstly that this auto-update is optional - this paradigm would work even if the server couldn't regularly poll GitHub, for example - and secondly, because Poole is minimal the server requirements are minimal. It just needs Python plus the python-markdown module.)
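For the curious, the "check and regenerate" script could look roughly like the sketch below. This is not the actual script we run: the function names are invented, and `rebuild` is a placeholder for whatever command regenerates your site. It assumes the site source is a local git clone that can be fast-forwarded from its remote.

```python
import subprocess


def git(repo_dir, *args):
    """Run a git command inside repo_dir and return its stdout, stripped."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def update_if_changed(repo_dir, rebuild, git=git):
    """Pull the repository and call rebuild() only if HEAD has moved.

    Returns True if the site was rebuilt, False if nothing had changed.
    The git parameter exists so that the runner can be swapped out for testing.
    """
    before = git(repo_dir, "rev-parse", "HEAD")
    git(repo_dir, "pull", "--ff-only")
    after = git(repo_dir, "rev-parse", "HEAD")
    if before == after:
        return False
    rebuild()
    return True
```

Something like cron can then call `update_if_changed` every few minutes. If the server can't run it at all, you just run the rebuild by hand and upload the result.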

We did need a couple of whizzy things for the research site: a publications list, and a listing of research group members. We wanted these to come from data such as a spreadsheet, so that the same data could be used in multiple pages and easily updated. This is achieved via the embedded bits of Python code I mentioned: we have publications stored in BibTeX files, and people stored in a CSV file, and the Python loads the data and transforms it into HTML.
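As a flavour of the CSV half of that, here's a minimal sketch of turning a people file into an HTML list. The column names `name` and `role` and the example rows are assumptions for illustration, not our real data, and the real code does a bit more formatting.

```python
import csv
import html
import io


def people_to_html(csv_text):
    """Turn CSV text with 'name' and 'role' columns into an HTML list."""
    rows = csv.DictReader(io.StringIO(csv_text))
    items = [
        "<li>{} ({})</li>".format(html.escape(row["name"]),
                                  html.escape(row["role"]))
        for row in rows
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    # Made-up example data; in practice this would be read from the CSV file.
    example = "name,role\nAda Example,PhD student\nBob Sample,Lecturer\n"
    print(people_to_html(example))
```

The publications list works the same way in spirit, except that parsing BibTeX needs either a third-party library or some hand-rolled code, so I've left it out here.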

It's really neat that the SSG means we have all our content stored in a really portable format: a single git repository containing some of the most widely-handled file formats: Markdown, BibTeX and CSV.

So where is this website? Here: http://c4dm.eecs.qmul.ac.uk/

