Migrating From Ghost to Hugo (Again)

A year ago I migrated from Ghost to Hugo, and at about the same time my rate of publishing dropped dramatically. Part of the reason is that most of my writing is done in small bursts, opening the Ghost editor whenever I have 10-20 minutes. With Hugo this wasn’t really possible (for me), and most of my issues with Ghost (upgrading) were made moot by running the official Docker image.

Fast forward a year, and Ghost has released version 1. It still works well, and yet I’m still not satisfied with it.

With version 1 came almost daily releases, and while updating my Docker image is a small task, doing it daily becomes a chore. The promised dashboard was cancelled, and users are referred to a rather expensive 3rd party solution instead.

Ghost Dashboard

The editor has been redesigned, and while it’s not all bad, it’s not great either. Also, with regards to integrating 3rd party apps, things like Slack have taken priority over Disqus, which seems like an odd choice to me, at least for a platform that wants to be a blogging platform. Disqus support is easy to configure, though it requires modifying theme files, which in turn are overwritten every time you update the theme.

So, after doing constant updates for the past 4-6 weeks, tweaking configuration files, HTML files, etc., I decided I’d finally had enough, and started the journey back to Hugo. I came up with a plan that should make it easier to edit on the go, while still keeping it relatively simple.

Migrating

Initially I just wanted to run my script from last time, but Ghost has changed the layout of its exported JSON, so that no longer worked, and GhostToHugo, a promising tool I found on GitHub, also failed. I spent about an hour studying the code for the migration tool, considering whether I should submit a patch for it, but realized it would take a long time to write, whereas writing a “one off” tool would take around an hour or so.

So, after studying the layout of the new JSON export format, I came up with this (Gist link):

# coding: utf-8
import json
import re
import argparse
import sys
import os


def read_file(filename):
    with open(filename, 'rt') as f:
        dat = f.read()
        db = json.loads(dat)
        return db


def parse_args():
    ap = argparse.ArgumentParser()
    ap.add_argument('INPUT_FILE', help="Input Ghost .json file",
                    type=str, action='store')
    ap.add_argument('OUTPUT_DIR', help="Output Directory",
                    type=str, action='store')
    args = ap.parse_args()
    return args


def get_tags(db, postid):
    dtags = db['db'][0]['data']['tags']
    post_tags = db['db'][0]['data']['posts_tags']
    tt = [t['tag_id'] for t in post_tags if t['post_id'] == postid]
    rtags = [tag['name'] for tag in dtags if tag['id'] in tt]
    return rtags


def get_author(db, author_id):
    authors = db['db'][0]['data']['users']
    for a in authors:
        if a['id'] == author_id:
            return a['name']
    return None


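def fix_links(markdown):
    # Prefix site-local link targets (starting with "/", but not /images)
    # with /post. [^)]* stops at the first ")" so that several links on
    # the same line are each rewritten.
    regex = r"(\[[^\]]*\]\()(\/(?!images)[^)]*)(\))"
    reg = re.compile(regex, re.MULTILINE)
    ret = reg.sub(r"\1/post\2\3", markdown)
    return ret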


def get_posts(db):
    ret = dict()
    for post in db['db'][0]['data']['posts']:
        pid = post['id']
        author_id = get_author(db, post['author_id'])
        title = post['title']
        slug = post['slug']
        page = post['page']
        created = post['created_at']
        updated = post['updated_at']
        published = post['published_at']
        tags = get_tags(db, pid)
        markdown = ''
        try:
            doc = json.loads(post['mobiledoc'])
            markdown = doc['cards'][0][1]['markdown']
            # fix images
            markdown = markdown.replace('/content/images', '/images')
            # fix links, prefix /post
            markdown = fix_links(markdown)
        except (KeyError, IndexError, TypeError, ValueError):
            # no usable markdown card in this post; keep the body empty
            pass
        draft = post['status'] == 'draft'

        out = '---\n'
        # json.dumps quotes the title so a ":" or "#" in it cannot break the YAML
        out += 'title: {}\n'.format(json.dumps(title))
        out += 'slug: {}\n'.format(slug)

        out += 'author: {}\n'.format(author_id) if author_id else ''
        out += 'lastmod: {}\n'.format(updated) if updated else ''
        out += 'date: {}\n'.format(published) if published else ''
        out += 'draft: true\n' if draft else ''

        tstring = "tags: ["
        for t in tags:
            tstring += '"{}", '.format(t)
        if len(tags):
            tstring = tstring[:-2]
        tstring += ']\n'
        out += tstring
        out += '---\n\n'
        out += markdown

        ret[slug] = out
    return ret


def write_posts(outdir, posts):
    for k in posts.keys():
        fname = os.path.join(outdir, "{}.markdown".format(k))
        with open(fname, 'wt') as of:
            of.write(posts[k])


def main():
    args = parse_args()
    db = read_file(args.INPUT_FILE)
    posts = get_posts(db)
    print('Converted {} posts'.format(len(posts)))
    write_posts(args.OUTPUT_DIR, posts)


if __name__ == '__main__':
    main()

Run the script as follows:

python ghost2hugo.py EXPORT_FILE HUGO_POST_DIRECTORY

So for instance, if your file is named assertion-failed.2017-09-19.json and your Hugo site lives in ~/blog, you’d run it like this:

python ghost2hugo.py ~/Downloads/assertion-failed.2017-09-19.json ~/blog/content/post
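For reference, the lookups in the script assume an export shaped roughly like the hand-made example below. This is only a sketch: every ID and value is invented, the card name "card-markdown" is an assumption (the script never reads it), and a real export contains many more keys.

```python
import json

# Hand-made stand-in for a Ghost 1.x export. All IDs and values are invented;
# only the keys that the migration script reads are included.
export = {
    "db": [{
        "data": {
            "posts": [{
                "id": "p1", "title": "Hello", "slug": "hello",
                "author_id": "u1", "status": "published",
                # mobiledoc is itself a JSON string; each card is
                # [card_name, payload], and the script reads payload['markdown'].
                "mobiledoc": json.dumps(
                    {"cards": [["card-markdown", {"markdown": "Hi *there*"}]]}),
            }],
            "tags": [{"id": "t1", "name": "hugo"},
                     {"id": "t2", "name": "python"},
                     {"id": "t3", "name": "ghost"}],
            "posts_tags": [{"post_id": "p1", "tag_id": "t1"},
                           {"post_id": "p1", "tag_id": "t2"}],
            "users": [{"id": "u1", "name": "Example Author"}],
        }
    }]
}

data = export["db"][0]["data"]
post = data["posts"][0]

# posts_tags is a join table: collect the tag ids for this post,
# then map those ids to tag names.
tag_ids = {pt["tag_id"] for pt in data["posts_tags"]
           if pt["post_id"] == post["id"]}
tag_names = [t["name"] for t in data["tags"] if t["id"] in tag_ids]

# The post body lives in the first card of the parsed mobiledoc.
body = json.loads(post["mobiledoc"])["cards"][0][1]["markdown"]

print(tag_names, body)
```

The two-step tag lookup mirrors what get_tags does in the script, and the last lookup is the same path get_posts follows to find the Markdown.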

The script tries to fix links automatically: it rewrites image paths from /content/images to /images, and prefixes local links (links starting with /) with /post.
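To make the two rewrites concrete, here is a small standalone sketch of them applied to a made-up snippet. The sample text and the slug in it are invented, and the pattern here uses [^)]* so that it stops at the first closing parenthesis, which is an assumption about the intended behaviour rather than a copy of the script.

```python
import re

# Matches inline Markdown links whose target starts with "/" but is not
# under /images. Group 2 is the target path.
LOCAL_LINK = re.compile(r"(\[[^\]]*\]\()(/(?!images)[^)]*)(\))")


def rewrite(markdown):
    # Step 1: Ghost image paths (/content/images) become /images.
    markdown = markdown.replace('/content/images', '/images')
    # Step 2: remaining site-local links get a /post prefix.
    return LOCAL_LINK.sub(r"\1/post\2\3", markdown)


# Invented sample: one image, one local link, one external link.
sample = (
    "![shot](/content/images/2017/09/shot.png)\n"
    "See [the first migration](/migrating-to-hugo/) "
    "and [Hugo](https://gohugo.io/).\n"
)

print(rewrite(sample))
```

The image ends up under /images and is skipped by the second step, the local link gains the /post prefix, and the external link is left alone because its target does not start with a slash.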

Posts to Hugo workflow revisited

As I mentioned above, the last time I migrated to a static blog engine, my posting frequency dropped a lot. I’m not a frequent poster, and posts usually come in bursts, but at any time I have about 10 drafts on various topics that I add a bit to every now and then. I do this whenever I have 10-20 minutes to spare, so it is absolutely crucial to me that I can edit posts “on the go”, across all my platforms.

I keep my blog source in git, and I don’t use Dropbox, or any other cloud file sync solution. I do use Resilio Sync, and can access my NAS through VPN.

Instead I opted for Working Copy for accessing my git repository from iOS.

On the desktop I use the command line and Sublime Text for finalizing posts before publishing, and on iOS I use a mix of Byword and Editorial. I’ve written a small Workflow for Editorial that I use for creating new posts on the go.
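A "new post" helper can be quite small. The sketch below is not the Editorial workflow mentioned above, only an illustration of the idea: the function names, the slug rule, and the choice of front matter fields (matching what the migration script writes) are all assumptions.

```python
import datetime
import json
import re


def slugify(title):
    # Lowercase, then collapse every run of characters other than
    # a-z and 0-9 into a single hyphen.
    return re.sub(r'[^a-z0-9]+', '-', title.lower()).strip('-')


def new_post(title, now=None):
    """Return (filename, text) for an empty draft with Hugo front matter."""
    now = now or datetime.datetime.now()
    slug = slugify(title)
    text = (
        '---\n'
        'title: {}\n'
        'slug: {}\n'
        'date: {}\n'
        'draft: true\n'
        'tags: []\n'
        '---\n\n'
    ).format(json.dumps(title), slug, now.strftime('%Y-%m-%dT%H:%M:%S'))
    return '{}.markdown'.format(slug), text


filename, text = new_post('Migrating From Ghost to Hugo (Again)',
                          datetime.datetime(2017, 9, 19, 8, 30))
print(filename)
print(text)
```

The helper only returns the filename and text; where the file gets written (for example into content/post in the repository) is left to whatever editor or app calls it.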

Working Copy is what ties it all together on iOS. From it I can edit Markdown files, upload photos, and push my changes back to my Git server.

I still miss the ability to author a complete post with images and all on iOS, preview it, and just upload it to my repository, but finalizing posts on the desktop isn’t that big a deal compared to the actual writing.

Future optimizations

I’m working on a way to automatically publish the blog whenever I push something to the Git repo. I have something that “almost works” using Hazel on a Mac, but I’d like it to run on a server, rather than relying on whether my Mac is running or not.
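One common way to do this on a server is a Git post-receive hook that checks out the pushed branch and runs Hugo. The following is a minimal sketch under stated assumptions, not a finished setup: the three paths are hypothetical, the branch is assumed to be master, and the block only prints the commands instead of running them.

```python
import subprocess


def build_commands(git_dir, work_tree, public_dir):
    """Return the commands a publish hook would run, as argument lists."""
    # Check the pushed branch out of the bare repository into a work tree.
    # 'master' is an assumption; use whichever branch holds the blog.
    checkout = ['git', '--git-dir={}'.format(git_dir),
                '--work-tree={}'.format(work_tree),
                'checkout', '-f', 'master']
    # Build the site from the work tree into the web server's directory.
    build = ['hugo', '--source', work_tree, '--destination', public_dir]
    return [checkout, build]


def publish(git_dir, work_tree, public_dir):
    # check_call raises CalledProcessError if a command fails, so a failed
    # checkout stops the build.
    for cmd in build_commands(git_dir, work_tree, public_dir):
        subprocess.check_call(cmd)


# Dry run with hypothetical paths: show what would be executed.
for cmd in build_commands('/srv/git/blog.git', '/srv/blog/src',
                          '/srv/blog/public'):
    print(' '.join(cmd))
```

Saved as hooks/post-receive in the bare repository (made executable, with a Python shebang line and a call to publish in place of the dry run), something along these lines would rebuild the site on every push.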

ghost  hugo  python  blog