Two-level tagging

Have you ever had trouble deciding where to store a file on your hard disk? Or worse, had trouble finding it later?

When you store a file on your hard disk, you have to decide which folder to put it in. That folder can in turn live inside other folders. This results in a hierarchy, known in computer science as a *tree*.

The main problem with trees is that sometimes you want things to live in multiple places.

Tagging provides an alternative system. Tags are a lot like folders, except that things can belong to multiple tags. However, the tags can’t themselves belong to anything. So you have just one level of organisation with no nesting.

The main problem with single-level tagging is that it’s too simple. We want to be able to use fine-grained categories (e.g. ‘lesser spotted greeb’) that themselves belong to higher-level categories (e.g. ‘greeb’, or even ‘bird’ or ‘animal’). But we said that tags can’t themselves belong to tags.

Described like this, perhaps the solution will seem obvious to you too. We want things to belong to multiple tags, and for those tags to sometimes belong to other tags.

I built this into Emacs Freex, my note-taking system.

For instance, I have tagged this blog post with ‘data structure’ and ‘blog’. In turn ‘data structure’ is tagged with ‘computer science’ and ‘blog’ is tagged with ‘writing’. So I can find this blog post later in various ways, including by intersecting ‘computer science’ and ‘writing’.

This gives you the best of both worlds: things belong to multiple categories, along with a hierarchy of categories.
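As a rough illustration, here is a minimal Python sketch of the idea. It is not how Emacs Freex stores things; the `TagStore` class and its method names are made up for this example. Items carry tags, tags can carry other tags, and a search matches an item if every requested tag is reachable from one of the item's own tags.

```python
from collections import defaultdict


class TagStore:
    """Hypothetical store: items carry tags, and tags may carry tags."""

    def __init__(self):
        self.item_tags = defaultdict(set)    # item -> tags applied directly
        self.tag_parents = defaultdict(set)  # tag -> tags applied to that tag

    def tag_item(self, item, *tags):
        self.item_tags[item].update(tags)

    def tag_tag(self, tag, *parents):
        self.tag_parents[tag].update(parents)

    def expand(self, tag):
        """Return the tag plus every tag reachable through its parents."""
        seen, stack = set(), [tag]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.tag_parents.get(current, ()))
        return seen

    def all_tags(self, item):
        """Direct tags of an item plus everything those tags belong to."""
        result = set()
        for tag in self.item_tags.get(item, ()):
            result |= self.expand(tag)
        return result

    def find(self, *tags):
        """Items whose expanded tag set contains every requested tag."""
        wanted = set(tags)
        return sorted(
            item for item in self.item_tags if wanted <= self.all_tags(item)
        )


store = TagStore()
store.tag_item("Two-level tagging", "data structure", "blog")
store.tag_tag("data structure", "computer science")
store.tag_tag("blog", "writing")

print(store.find("computer science", "writing"))  # ['Two-level tagging']
```

The `expand` step follows tags-of-tags as far as they go, so the same sketch would cope with ‘lesser spotted greeb’ belonging to ‘greeb’, which belongs to ‘bird’.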

Blogging with WordPress and Emacs

When it comes to tools, I am a hedgehog rather than a fox. I like to have a small number of tools, and to know them well.

I recently resolved to start writing again. But I decided that I needed to sharpen my pencils first.

I have plans on how publishing and sharing should work. Grand plans. Too grand, perhaps.

So for now, I wrote something simple for myself. Now I can type away, press buttons… publish.

If you like Emacs, Python and WordPress, this might be interesting to you too. If not, it certainly won’t be.

wordpress-python-emacs GitHub repository

Most of the work is being done by this great Python/WordPress library. Thank you.

I wrote some simple Python scripts. One grabs all my existing blog posts. One looks through their titles, and checks them against the filename to see if this is a new post.

And then there’s a very simple Emacs function that calls them to save/publish the current text file.
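The title check is roughly this shape. This is a sketch rather than the code in the repository: the function names are invented, the list of posts is a plain list of dicts standing in for whatever the fetching script returns, and I am assuming the convention that a file called 'My post.txt' holds the post titled 'My post'.

```python
import os


def title_from_filename(path):
    """Assumed convention: 'My post.txt' holds the post titled 'My post'."""
    return os.path.splitext(os.path.basename(path))[0]


def find_existing_post(path, posts):
    """Return the id of the post whose title matches the filename, or None.

    `posts` is a list of {"id": ..., "title": ...} dicts (a hypothetical
    stand-in for the output of the script that grabs existing posts).
    """
    wanted = title_from_filename(path).strip().lower()
    for post in posts:
        if post["title"].strip().lower() == wanted:
            return post["id"]
    return None


posts = [{"id": 1, "title": "Two-level tagging"}]

# An existing title means "update that post"; None means "create a new one".
print(find_existing_post("/home/me/blog/Two-level tagging.txt", posts))
print(find_existing_post("/home/me/blog/Something new.txt", posts))
```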

I could add more things: deleting posts, or a proper workflow for moving from draft to published. Maybe later.

I wrote this post, then hit M-x wordpress-publish-this-file.

A wiki for spaces. A town anyone can edit. School architecture founded on mnemonic principles

When we think of wikis, we think of text, like the Wikipedia. But this notion of content that anyone can view and anyone can edit has barely unfurled its wings. What if we were to apply it to space?

For instance, imagine growing a World of Warcraft town as a community. Each person could design and improve upon the buildings, fill the walls with graffiti, neighborhoods would define themselves… the ease and pace of iteration might even generate new ideas about town planning.

Alternatively, let’s build on Ed Cooke’s fantastic plan for school architecture in the future:

Children, well known to be compulsive absorbers of information, crucially learn what they are interested in. Like all animals, they are interested in spaces.

I’d like to see schools’ spatial layout reflect the history of Western culture, and thereby implicitly teach it. A snake-like line of school buildings could begin at one end in Ancient times and run on, in temporally organized fashion, up to the computer science blocks of the present day. Key themes and figures from each epoch could provide the names for classrooms, which could also reflect some of the architecture, customs and furniture of the day.

Because in five years of school, everyone learns every detail of the spatial organisation of the buildings, and because memories always attach to the spaces in which they were first formed, merely attending such a school would give one a wonderfully detailed sense of the history and structure of Western civilisation. And it wouldn’t need to be prescriptive, for one could take advantage of the second source of childrens’ interest – things they have a role in – to redouble the effect. Each year-group could, over the course of five years, reconsider, re-design and re-build one of the twelve epochs/buildings.

Convincing someone to build a school organized on mnemonic principles is going to be tricky. But in the meantime, perhaps schools’ online presence might take the form of a spatial wiki. Students could make changes ranging from decor to naming to overall organization, shaping their online school to their memories and vice versa. We love to deeply inhabit our environment by shaping it – what could be better than exercising our rich faculty for spatial navigation imaginatively?


Most wikis require you to perform one of two contortions to create a link:

  • Use CamelCase. Much like a camel, this is robust, but tiring to finger.
  • Wrap things in [“symbols that are hard to type”].

In both cases, you need to know in advance that you plan to create a link, and be enough of a disciplined philistine to overcome the effort and overlook the ugliness.

Auto-links are the solution [1] – here’s how they work. Say you create a page called ‘Camel case’. Now, type Camel case anywhere else, and that ‘Camel case’ text will be turned into an auto-link as you go. In other words, the wiki notices that you’ve typed the name of an existing page in the midst of your text, and automatically creates a link for you. If you go back and edit the text, the link goes away. [2]

Links between pages become evident to readers without any extra effort on the part of the writer. If I type ‘MySQL’ and an auto-link appears, it’s easy to see that a relevant page about it already exists.

Having used such a system for a long time, I have come to appreciate the tiny flash of satisfaction at seeing a link appear with no extra effort, confirming that the page does indeed exist [3], and making navigation while editing a breeze. Pages that I wrote years ago are now festooned with links to pages that were created long afterwards. Indeed, the most satisfying feeling of all is when an auto-link pops up to a page I’d forgotten I wrote. Lazy serendipity!

[1] see Per Sederberg’s implementation in Emacs Freex mode, though we called them ‘implicit links’ back then

[2] To do this the way God intended requires running a regex containing all the page titles in your wiki over what you type on every keystroke – this is very nearly instantaneous for even 10k documents.

[3] For extra points, allow pages to have multiple aliases, so that (for instance) ‘database’, ‘databases’ and ‘MySQL’ all point to the same page.
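To make footnotes [2] and [3] concrete, here is a small Python sketch of the one-big-regex approach, with aliases. It is not the Emacs Freex implementation: the function names and the `[[page|text]]` output format are invented for illustration, and a real editor would highlight the match in place rather than rewrite the text.

```python
import re


def build_link_pattern(titles):
    """Compile one regex that matches any of the given titles or aliases."""
    # Longest first, so 'Camel case' wins over a shorter page called 'Camel'.
    ordered = sorted(titles, key=len, reverse=True)
    alternatives = "|".join(re.escape(title) for title in ordered)
    # Lookarounds stop a title from matching in the middle of a longer word.
    return re.compile(r"(?<!\w)(?:%s)(?!\w)" % alternatives, re.IGNORECASE)


def auto_link(text, pages):
    """Wrap every known title or alias in `text` in a (hypothetical) link.

    `pages` maps each title or alias to the canonical page it points at.
    """
    if not pages:
        return text
    lookup = {name.lower(): target for name, target in pages.items()}
    pattern = build_link_pattern(lookup)

    def replace(match):
        target = lookup[match.group(0).lower()]
        return "[[%s|%s]]" % (target, match.group(0))

    return pattern.sub(replace, text)


pages = {
    "Camel case": "Camel case",
    "MySQL": "MySQL",
    "database": "MySQL",   # aliases, as in footnote [3]
    "databases": "MySQL",
}

print(auto_link("Our databases use Camel case names.", pages))
# Our [[MySQL|databases]] use [[Camel case|Camel case]] names.
```

Because links are recomputed from the text each time, editing a title out of a sentence makes the link vanish, and creating a new page makes links to it appear in old pages without anyone touching them.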


Every time a programmer goes away for a few days, a piece of infrastructure they know best breaks. That’s just Murphy’s Algorithm.

If they had only written a 100-word overview with some examples, that would have saved someone else a painstaking day figuring out how things should work, why they suddenly don’t, and righting the world once more.

How do you make it likely that everyone writes down what they know while it’s still fresh? Think of edits as conversions (in the analytics sense) – our funnel stretches from signup to viewing to editing, and we want to maximize the number of edits.

How do we optimize the ‘edit’ conversion rate for a wiki?

  • Editing should happen in the same mode as viewing. If you have to click ‘edit’, then wait for a page refresh, then scroll down inside a teeny textbox in a browser, then hit save to see your changes … those steps create a barrier to entry. The conversion rate of views to edits will drop dramatically. Typos, inaccuracies, inscrutabilities and out-of-datenesses will accumulate.
  • It needs to be as available as possible. All and only your team can access and contribute to it, even if they’re on a different computer or offline.
  • Consolidate everything in one or two searchable places. When it’s hard to find something, you won’t want to start looking. If you have to search one by one through a wiki, your email, a bug tracker, the version control commit log and comments in the codebase, you’ll end up just tapping someone on the shoulder – the knowledge will never get planted in a way that it can grow.
  • No special knowledge. Wiki markups are confusing and confusable. WYSIWYG editors are a good start – but editing text in most browser textboxes feels like typing with chopsticks. And proprietary document formats are opaque and constricting.
  • No barrier to exit. I want to be able to easily (and ideally automatically) grab a dump of all our documentation, both as a backup and as an export.

After going over these requirements again and again, these are the best solutions I’ve come up with for Memrise:

  • A few monolithic Google Docs. This has worked reasonably well, except that Google Docs still falters in an unwieldy and buggy way when dealing with even medium-sized documents. Boooo!
  • Etherpad clones. They seem pretty expensive for multi-user monthly subscriptions, and seem weak at linking and searching. Plus, they don’t work offline, and I don’t trust the companies behind them to be around in 5 years’ time.
  • Text files in Dropbox. The main downside to this is that you can’t easily inter-link text files, and they lack formatting which makes them ugly to read. But they have no barriers to entry whatsoever.

In an ideal world, someone would build a nice (optionally hosted?) wiki solution pulling and formatting Dropbox text files as webpages to give you the best of both worlds, perhaps combined with a few desktop apps and extensions to make offline viewing and editing more pleasant.

Neuroscience notes

The neuroscience notes for my PhD qualifying exam are now online as a single compressed tarball and as browsable individual webpages. If you use Emacs Muse, then you can also grab the muse files as a tarball (or by changing the .html to .muse).

The notes are in wiki form – in other words, I tried to give every brain region, tract, disorder, function and topic its own page. Emacs Muse automatically created the hyperlinks as I was typing (thanks to Per Sederberg’s implicit linking patch). In reality, many of the pages are missing or sparse, since this is a pretty gargantuan task. There’s also the possibility that things are inaccurate. For instance, I’m pretty sure there’s some confusion regarding the nucleus reticularis vs nucleus reticularis pontis oralis, since I didn’t initially realize that they’re distinct brain regions with similar names…

Anyway, you’re welcome to use, modify and distribute these in any way you’d like, though I’d appreciate a shout-out if you do. Consider them to be released with a Creative Commons Attribution 3.0 license, though I’ve been lazy and not included the license file. I’d doubly appreciate hearing about any flaws or confusions you find. I’m still working on these, and so I may one day release an updated version.

Obviously, to some degree this is reinventing the wheel. Most of these notes are from Kandel & Schwartz (4th edition), and you can find most of this stuff online, e.g. Wikipedia has some good pages on the brain, though the usual caveats apply. I also learned a lot about the pros and cons of wikifying knowledge along the way, but that’s the topic of another post.