pelican - On Web Security

Setting Up a New Sphinx Documentation Framework

When having to write documentation for different formats, I always use the reStructuredText [1] (or reST) format. As this is something that happens quite often, it made sense to put some effort in automating the set up of a new documentation framework, a reusable set up script.

Setting up a framework

The standard documentation framework that I use consists of Sphinx [2], which takes care of converting source pages written in reST into several formats: For example HTML, but also PDF or something more exotic like ePub files. Note that Sphinx already comes with a setup script, sphinx-quickstart [3] - but this doesn't take care of deploying files.

In order to be able to create a reusable framework, I split the necessary files into three groups:

The Sphinx configuration itself,
version information, and
a LaTeX formatting template.

The Sphinx configuration

This part consists of two different files; A generic Makefile [4] to build the different artifact types - as well as a Sphinx configuration file (conf.py [5]) containing basic information about the project, and plugin details. These files rarely change after having initialized the framework.

Version information

The version information (version, or build number) can change per release, and is therefore contained in a separate …

more ...

Generate list of used content tags for Pelican

If your Pelican-generated site uses lots of different tags for articles, it can be difficult to remember or use tag names consistently. Therefore I needed a quick method to print (comma separated) unique tags that were stored in text files.

This shell one-liner from within the content directory will sort and show all tags from reStructuredText (*.rst ) files:

grep -h '^:tags:' *.rst | sed -e 's/^:tags:\s*//;s/\s*,\s*/\n/g' | sort -u

First grep will filter on the :tags: property and will only print out the matching line (without filename, thanks to the -h flag).

Then sed will remove the :tags: keyword (and trailing spaces), and all tags will be split using newline characters.

Finally, sort takes care of sorting and only printing unique entries.

Analogous, one can do the same for categories:

grep -h '^:category:' *.rst | sed -e 's/^:category:\s*//' | sort -u

As Pelican only allows one category, this is somewhat simpler.

For maximum readability, tr can convert the newlines into spaces, so that the output is one big line:

grep -h '^:tags:' *.rst | sed -e 's/^:tags:\s*//;s/\s*,\s*/\n/g' | sort -u | tr '\n' ' '; echo

The last echo is meant to end …

more ...

Convert WordPress to static site generator Pelican

Pelican

After a number of years using WordPress as blogging software, I converted the site to a static site generator: Pelican.

Pelican converts reStructuredText into static HTML. No more PHP, no more databases, but straight static HTML.

The process of converting the site was relatively painless. The conversion tool did a great job of converting an XML export of WordPress into reStructuredText pages.

What needed (and still needs) some manual care were/are the code blocks (the biggest reason of the move from WordPress to Pelican) in articles, and the escaping of variables. WordPress gets pretty complex once you're trying to use it for code snippets and console outputs. The reStructuredText is much more flexible and allows you to edit the site using any text editor. There are tools to do that with WordPress and its API, but it always felt like a difficult workaround.

I thought about keeping the URLs as-is: Over the years the number of visitors of the site has steadily risen, as has the level of indexing by search engines. You don't want dead links - but on the other hand, a transition to another content management system would be the perfect moment to 'clean up' the category …

more ...