Creating a blog with Jupyter notebooks

Assuming you have already installed Jupyter Notebook, you will need to do the following:

Installing and configuring a Nikola blog

  1. First you'll need to create a directory structure as follows:

     - /blog
     -- /posts
     -- /output
    • /blog is the root directory for everything you'll be doing with your blog
    • /posts is where you'll store your Jupyter notebooks
    • /output will contain the code generated for your blog
  2. Run the following command to install Nikola (the static website generator which will do most of the heavy lifting)[1]:

    pip install --upgrade "Nikola[extras]"

  3. Change directory to your blog root:

    cd blog

  4. Start up Nikola, following the prompts to configure your new blog:

    nikola init .

  5. Open /blog/conf.py and edit the POSTS and PAGES settings so that they include the .ipynb lines shown below. This allows Nikola to treat .ipynb files as blog posts and pages.

     POSTS = (
         ("posts/*.rst", "posts", "post.tmpl"),
         ("posts/*.md", "posts", "post.tmpl"),
         ("posts/*.txt", "posts", "post.tmpl"),
         ("posts/*.html", "posts", "post.tmpl"),
         ("posts/*.ipynb", "posts", "post.tmpl"),
     )
     PAGES = (
         ("pages/*.rst", "pages", "page.tmpl"),
         ("pages/*.md", "pages", "page.tmpl"),
         ("pages/*.txt", "pages", "page.tmpl"),
         ("pages/*.html", "pages", "page.tmpl"),
         ("pages/*.ipynb", "pages", "page.tmpl"),
     )
  6. Write your blog post in Jupyter, saving the .ipynb file to /posts.

  7. You will need to explicitly add the following metadata to your notebook (in the Jupyter menu, select Edit > Edit Notebook Metadata). Change the metadata to match your post.[2]

     "nikola": {
         "title": "Creating a blog with Jupyter notebooks",
         "slug": "creating-a-blog-with-jupyter-notebooks",
         "date": "2017-09-09 21:09:01 UTC+10:00"
     }
  8. Run nikola build each time you update your /posts, which will generate your site and store it in /output!

  9. If you're going to be publishing your blog on GitHub (like me), you can push the contents of /output to your website repo.
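
Step 7 can also be scripted rather than done through the Jupyter menu. The following is a minimal sketch, assuming the notebook is stored as ordinary JSON with a top-level "metadata" object (the usual .ipynb layout). The function name add_nikola_metadata is hypothetical; it is not part of Nikola or Jupyter.

```python
import json
from pathlib import Path


def add_nikola_metadata(notebook_path, title, slug, date):
    """Write a "nikola" block into a notebook's top-level metadata.

    Hypothetical helper: it edits the .ipynb file as plain JSON.
    """
    path = Path(notebook_path)
    notebook = json.loads(path.read_text(encoding="utf8"))

    # Keep any metadata Jupyter already stored (kernelspec, language_info, ...).
    metadata = notebook.setdefault("metadata", {})
    metadata["nikola"] = {"title": title, "slug": slug, "date": date}

    path.write_text(json.dumps(notebook, indent=1), encoding="utf8")
    return metadata["nikola"]
```

For example, calling add_nikola_metadata("posts/my-post.ipynb", "My post", "my-post", "2017-09-09 21:09:01 UTC+10:00") on a notebook at that (made-up) path would add the same block as shown in step 7. Close the notebook in Jupyter first, so that a later save from the browser does not overwrite the change.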

[1]Problems installing Nikola?

I ran into some issues installing Nikola on OS X with Anaconda. Specifically, gcc in Anaconda was the culprit. Resolution:

  • conda remove gcc to uninstall gcc provided by Anaconda

Your shell will then default to the system gcc. You can check this by running which gcc, which should output /usr/bin/gcc.

If this still doesn't resolve the issue, you may need to install a more up-to-date gcc:

  1. Install Homebrew
  2. brew install gcc (you may be prompted to install Developer Tools)
  3. brew unlink gcc
  4. brew link --overwrite gcc

which gcc should now show /usr/local/Cellar/gcc/7.2.0. 👍

[2]Inferring Nikola post metadata

Like me, you probably want as little as possible to come between your latest notebook hack and your awesome new blog.

Nikola parses Jupyter notebooks with a plugin. With a small modification, that plugin can infer all of the Nikola post metadata automatically. For me, the plugin file was here (though the path may differ for you):

~/anaconda/lib/python3.5/site-packages/nikola/plugins/compile/ipynb.py

To automagically infer the required metadata, you can replace the read_metadata() function in the file above with the following code:

def read_metadata(self, post, file_metadata_regexp=None, unslugify_titles=False, lang=None):
    """Read metadata directly from an ipynb file.

    As ipynb files support arbitrary metadata as JSON, the metadata used by
    Nikola is assumed to be in the 'nikola' subfield. Any of title, slug and
    date missing from that subfield is inferred from the file itself.
    """
    # io, os, nbformat, current_nbformat and LocaleBorg are assumed to be
    # already imported at the top of ipynb.py.
    from datetime import datetime

    self._req_missing_ipynb()
    if lang is None:
        lang = LocaleBorg().current_lang
    source = post.translated_source_path(lang)
    with io.open(source, "r", encoding="utf8") as in_file:
        nb_json = nbformat.read(in_file, current_nbformat)

    # Infer the title and slug from the filename, and the date from the filesystem.
    title = os.path.splitext(os.path.basename(source))[0]
    slug = title.lower().replace(' ', '-')
    # %H (zero-padded hour) is portable; %k is a platform-specific extension.
    date = datetime.fromtimestamp(os.path.getctime(source)).strftime('%Y-%m-%d %H:%M:%S')
    implicit = {'title': title, 'slug': slug, 'date': date}

    # Metadata might not exist in two-file posts or in hand-crafted
    # .ipynb files, hence the empty-dict defaults.
    explicit = nb_json.get('metadata', {}).get('nikola', {})

    # Explicit values override inferred ones (dict unpacking needs Python 3.5+).
    return {**implicit, **explicit}

With this small modification, we instruct Nikola to infer the title and slug values based on the filename, and the date value based on the filesystem. Magical! ✨
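
To try the inference on its own before touching the plugin, here is a standalone sketch of the same idea. The function name infer_post_metadata is hypothetical and the function is independent of Nikola; it only mirrors the title, slug and date logic described above.

```python
import os
from datetime import datetime


def infer_post_metadata(source, explicit=None):
    """Infer title, slug and date from a file path; explicit values win."""
    title = os.path.splitext(os.path.basename(source))[0]
    slug = title.lower().replace(" ", "-")
    # getctime() is the creation time on Windows but the last metadata
    # change on Unix, so the inferred date can shift unexpectedly.
    date = datetime.fromtimestamp(os.path.getctime(source)).strftime("%Y-%m-%d %H:%M:%S")

    implicit = {"title": title, "slug": slug, "date": date}
    return {**implicit, **(explicit or {})}
```

Passing explicit={"title": "A nicer title"} keeps the inferred slug and date but replaces the title, which is the same precedence the modified plugin function uses.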

Update: The makers of Nikola have suggested some official methods for achieving this that are built right into the existing workflow:

    "Titles and slugs can be done via FILE_METADATA_REGEXP, and auto dates are prone to issues.
    Better: import files with `nikola new_post -i`"
    Nikola Generator (@GetNikola), September 12, 2017
    https://twitter.com/GetNikola/status/907570254611484672
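
As a rough illustration of the first suggestion, the sketch below shows how a regular expression with named groups can pull a title out of a filename. The pattern is an assumption made up for this example, and preview_metadata is a hypothetical helper for trying the pattern with Python's re module; it does not show how Nikola applies the setting internally. Check the Nikola handbook for the exact semantics before adding FILE_METADATA_REGEXP to conf.py.

```python
import re

# Assumed pattern: capture the notebook filename (without extension) as the title.
FILE_METADATA_REGEXP = r".*/(?P<title>[^/]+)\.ipynb$"


def preview_metadata(path):
    """Return the named groups the pattern extracts from a path, or {}."""
    match = re.match(FILE_METADATA_REGEXP, path)
    return match.groupdict() if match else {}
```

With this pattern, a path such as posts/My First Post.ipynb yields a title of "My First Post", while files that are not notebooks yield nothing.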