I moved my NAS to the cloud

and you could too

This blog has been in a state of neglect for a long time. Not because I meant it to be, but as they say, “Life is what happens to you while you’re busy making other plans”, and life has been happening a lot to me these past few years. To the point where I’ve had little time for personal projects, and when I finally made something, it either turned out to be a bad idea or wasn’t in a state I wanted to show. I have about 20 “draft” posts waiting for their finishing touches, but I just never found the time.

Part of what has been keeping me busy is that I self-host everything except mail. It’s not a new thing. I’ve been self-hosting for decades, ever since my first 2/0.5 Mbit ADSL connection, but after the Snowden revelations I doubled down on bringing everything back home.

I’ve been through multiple iterations of hardware: from a custom-built FreeBSD server running ZFS RAIDZ1, to Synology, to multiple Raspberry Pis (this blog started out on a Raspberry Pi B), a few years on The Perfect Media Server, and up to my latest iteration of a Synology DS918+ for storage and a Dell PowerEdge T30 for hosting services (accessing storage over NFSv4 with Kerberos).

I spent at least an hour every day making sure everything ran like a well-oiled machine: checking logs for hardware errors, intruders, failed backups, and software updates. I also spent quite a few hours worrying about when a drive would fail, and looking for deals on replacement drives.

When 2021 rolled around, I decided I didn’t want to spend my spare time being a system administrator. I already maintain a small fleet (<100) of servers at work on top of my responsibilities as software architect and lead/senior developer. I didn’t want to worry about multiple backups, local and remote, or wonder every time thunder rolled whether it would fry the UPS.

Baby steps

Shortly after New Year’s, I handed in my two weeks’ notice as a home system administrator and migrated every publicly available self-hosted service to cloud hosting. And not VPS hosting, as that’s essentially just self-hosting on somebody else’s hardware.

After shutting down my self-hosted services I turned off my PowerEdge, deleted a handful of firewall rules, and could relax a little. I still had my NAS running, but it was only accessible from my LAN, and Synology boxes are more or less appliances you plug in and leave running. Don’t expose them to the internet, though!

This “scratched my itch” for a month or so. I started having more spare time, and it was time to tinker again.

I’ve been using Nextcloud in combination with Resilio Sync for “cloud storage” for the past 5-6 years, and it has served me really well, but it requires me to host “something”. I still want “on the move” access to my files, though.

Sensitive files have always been accessible only on my LAN or through VPN, and I’d have to find a solution for that as well once I was no longer self-hosting.

A lot has happened to cloud storage since I last used it some 8 years ago. While privacy is still an issue, things like GDPR and Privacy Shield have made it a little better.

I also reevaluated my threat scenario. While I’ve encrypted everything at home for the past decade or so, chances are nobody really cares how fast my GSD puppy grew. I evaluated what data I keep around, and it’s mostly family photos, various documents, and music/movies. While I wouldn’t want it to happen, most of it would probably survive just fine in a public GitHub repository without any damage to me at all.

Getting more serious

I considered my options for weeks. I read reviews, evaluated privacy statements, and checked “survivability” of multiple cloud providers.

I wanted privacy, zero-knowledge encryption, speed, a provider likely to be in business for the foreseeable future, and something that preferably worked on both laptops and smartphones. As with most wishlists, not all wishes are granted.

As I read more and more reviews, it became clear that the choice is between “big, but not much privacy” and “small and private, but no guarantee your data won’t just disappear overnight”. As I already described, most of my data isn’t really that secret, so I went with the former.

A battle plan

I found a great offer on what I perceive as the “least bad big cloud provider”, which surprisingly turns out to be OneDrive. Dropbox was my favorite, but much to my surprise Dropbox will scan your data, to the point of looking inside ZIP files. Nothing Google does surprises me anymore, and I actively avoid their services whenever I can.

I briefly considered using Boxcryptor for securing my OneDrive documents, but ultimately decided against it, as it would cost as much as (if not more than) the cloud storage itself to get the entire family running it. I’d also be completely dependent on Boxcryptor surviving, and while they look like they’re in good shape, it’s a very small niche they occupy.

Instead I settled on keeping only photos and non-sensitive documents in OneDrive, and keeping sensitive documents manually encrypted inside LUKS containers or encrypted sparse bundles. This means sensitive documents are unavailable on the go, but that’s OK as I rarely need them, and I usually have my laptop with me anyway.
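For the curious, one of those LUKS containers is nothing more than a file treated as an encrypted block device. A minimal sketch of creating one might look like this (the path, size and filesystem are just examples, not my actual layout):

# create a 10GB container file and format it as LUKS (run as root)
fallocate -l 10G /srv/sensitive.img
cryptsetup luksFormat /srv/sensitive.img
cryptsetup open /srv/sensitive.img sensitive
mkfs.ext4 /dev/mapper/sensitive
mkdir -p /mnt/sensitive
mount /dev/mapper/sensitive /mnt/sensitive

# ...work with the files, then lock it up again
umount /mnt/sensitive
cryptsetup close sensitive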

I set up my NAS to synchronize the OneDrive accounts with its local storage so that it could make backups both locally and remotely. (You do back up your cloud storage, right?)
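The Synology has its own tools for that kind of synchronization, but the same idea is easy to sketch with rclone for anyone without one. Assuming one OneDrive remote per family member (the remote names and paths below are made up):

# mirror each OneDrive account to local storage so it gets caught by the normal backups
rclone sync onedrive_alice: /volume1/backup/onedrive/alice --progress
rclone sync onedrive_bob: /volume1/backup/onedrive/bob --progress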

I now had all the important stuff safely stowed away in the cloud, and my NAS was reduced to a very expensive backup box and media player.

The final piece of the puzzle

I was content to run my NAS as an overpriced media player for a while, but when the final part of the puzzle dawned on me, I was kinda angry with myself for not seeing it before.

I’ve used the excellent rclone tool before, and I vaguely remembered that it can mount cloud storage over FUSE, but my experiences with FUSE have not exactly been great (or reliable, for that matter).

In the meantime, VFS cache functionality has been added to rclone, meaning that, given a large enough cache drive, things in the cloud can appear to be local.

It was something to try anyway.

Rclone also ticks the privacy box, as it supports client-side encryption.

OneDrive only provides 1TB per user account, so I would need an additional service, preferably something affordable. As the data I was planning on storing is mostly rips of my own DVDs/Blu-ray discs, or media I’ve purchased, I would be sad to see it lost, but it’s not irreplaceable.

For this task I settled on Jottacloud. I’ve used them before, and from Europe they’re fast and reliable.

Fitting all the pieces together

Armed with a big toolbox I fired up my PowerEdge T30 and started configuring away.

Considering I would need to cache rather large files, I set aside a 1TB SSD for VFS cache. Combined with a long cache time, files could live “forever” on local storage.

Setting it up was rather easy. Rclone has extensive documentation on setting up backends, e.g. Jottacloud.

I set up a “raw” Jottacloud endpoint and layered an encrypted backend on top of it.
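For reference, the end result in rclone.conf looks roughly like this: two remotes, where the crypt remote simply points at a folder on the raw one. The token and passwords below are placeholders (rclone config fills them in for you), and the folder name is arbitrary:

[jottacloud]
type = jottacloud
token = {"access_token":"...","expiry":"..."}

[jottacloud_encrypted]
type = crypt
remote = jottacloud:encrypted
filename_encryption = standard
directory_name_encryption = true
password = <obscured passphrase>
password2 = <obscured salt>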

With that done, it was time to mount it automatically at boot. I ended up with something like this

[Unit]
Description=RClone Jottacloud
Wants=network-online.target
After=network-online.target

[Service]
Type=notify
KillMode=none
RestartSec=5
Environment=RCLONE_CONFIG=/root/.config/rclone/rclone.conf

ExecStart=/usr/bin/rclone mount jottacloud_encrypted: /mnt/jotta/ \
  --dir-cache-time 1000h \
  --vfs-cache-mode full \
  --vfs-cache-max-size 500G \
  --vfs-read-ahead 256M \
  --vfs-cache-max-age 1000h \
  --cache-dir /mnt/cache/ \
  --uid 1002 \
  --syslog \
  --user-agent randomuseragent1234 \
  --rc \
  --rc-no-auth \
  --allow-other \
  --poll-interval 60s \
  --jottacloud-hard-delete \
  --fast-list \
  --umask 002

ExecStop=/bin/fusermount -uz /mnt/jotta/
ExecStartPost=/usr/bin/rclone rc vfs/refresh recursive=true --rc-addr 127.0.0.1:5572 _async=true
Restart=on-failure

[Install]
WantedBy=default.target
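To get it mounted at boot, the unit just needs to be enabled. I’m assuming here that it was saved as /etc/systemd/system/rclone-jotta.service (name it whatever you like):

systemctl daemon-reload
systemctl enable --now rclone-jotta.service
systemctl status rclone-jotta.service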

I set a cache time of 1000 hours, as well as 500GB of space for caching. In practice, the most recently used files (up to 500GB worth) will more or less always be available locally.

There’s a small “gotcha” with rclone. Initially I ran the systemd service as my own user, but without the --allow-other flag nobody else (not even root) can read the files. As I need access for multiple users, I mount as root and set the owner of the files to my Plex user (the --uid flag above). Nobody besides me interacts directly with the files; everybody else goes through Plex or similar tools.
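If you do want to keep the mount running as an unprivileged user and still share it with others, FUSE only honors --allow-other once it has been allowed system-wide:

# required for --allow-other when the mount does NOT run as root (append as root)
echo "user_allow_other" >> /etc/fuse.conf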

With that out of the way it was time to sync things over.

mkdir /mnt/jotta/media    
rclone sync -P /mnt/media/ jottacloud_encrypted:/media/

After a few hours (OK, it was a few days) I had my media collection in Jottacloud, and after setting up Samba and pointing it at the mounted drive it was time to test things.
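The Samba part is nothing fancy, just a share pointing at the mount. Something along these lines (the share name, group and options are examples, not a prescription):

[media]
   path = /mnt/jotta/media
   read only = yes
   browseable = yes
   valid users = @family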

Testing it out

The first thing I tried was to create a 1GB file of all zeroes.

➜  ~ time dd if=/dev/zero of=/mnt/jotta/test.dat bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 2.6527 s, 395 MB/s
dd if=/dev/zero of=/mnt/jotta/test.dat bs=1M count=1000  0.00s user 0.71s system 26% cpu 2.660 total

395 MB/s. Not too bad. Of course, the data has only been written to the cache and is still being uploaded to the cloud in the background, and once the cache drive runs full, transfer speeds will fall to whatever your upload speed is.

Next up a 1GB file of random data

➜  ~ time dd if=/dev/urandom of=/mnt/jotta/test.dat bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 5.17169 s, 203 MB/s
dd if=/dev/urandom of=/mnt/jotta/test.dat bs=1M count=1000  0.00s user 3.38s system 64% cpu 5.279 total

About half the speed of all zeroes.

I ran the second test after things had settled down from the first, so some of the difference may be compression somewhere along the line, and some of it is simply /dev/urandom being slower to generate than /dev/zero (note the higher CPU usage above). Still, 200 MB/s is about twice the speed of gigabit Ethernet.

Next up I copied the test file within Jottacloud and attempted to download the copy. Because I have long cache times, I needed to refresh the remote before downloading.

~# rclone copy jottacloud_encrypted:test.dat jottacloud_encrypted:test2.dat
~# /usr/bin/rclone rc vfs/refresh recursive=true --fast-list --rc-addr 127.0.0.1:5572

{
        "result": {
                "": "OK"
        }
}

And then on to downloading the file.

➜  ~ time dd if=/mnt/jotta/test2.dat/test.dat of=/dev/null bs=1M
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 37.5651 s, 27.9 MB/s
dd if=/mnt/jotta/test2.dat/test.dat of=/dev/null bs=1M  0.01s user 0.47s system 1% cpu 37.569 total

Since the copy wasn’t in the local cache yet, downloading it was the only option. It managed to read the file at 27.9 MB/s, which should be plenty fast for most media consumption.

I then tried “downloading” it again, as it should now be in the cache. The use case here would be that I add some new media and want to watch it later on.

➜  ~ time dd if=/mnt/jotta/test2.dat/test.dat of=/dev/null bs=1M
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 2.33898 s, 448 MB/s
dd if=/mnt/jotta/test2.dat/test.dat of=/dev/null bs=1M  0.01s user 0.46s system 19% cpu 2.409 total

448 MB/s. Plenty fast for streaming, even multiple streams.

Backing up

With everything now in the cloud, I put a whole bunch of spinning rust to rest, and my “NAS” at home consists of a 1TB SSD and an 8TB USB drive for backups, and only the backups are stored locally.

Like the Jottacloud mount, I have the OneDrive accounts mounted under each user’s home directory, under Documents. This allows me to back up /home and OneDrive in the same go.
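Those are ordinary rclone mounts too, one per account, pointed into each home directory. A rough sketch for a single user (the remote name, paths and IDs are assumptions):

# hypothetical per-user OneDrive mount
rclone mount onedrive_alice: /home/alice/Documents/OneDrive \
  --vfs-cache-mode full --cache-dir /mnt/cache/onedrive_alice \
  --uid 1000 --gid 1000 --allow-other --daemon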

OneDrive data lives on the user machines as well as in OneDrive, so a local backup fulfills the 3-2-1 backup strategy.

Media data will have to get by with a single backup. Losing it is inconvenient, but not devastating.

I use borgmatic for backing up, and when backing up FUSE filesystems there’s a tweak that needs to be set to avoid downloading everything again. I learned from my time with The Perfect Media Server and mergerfs that FUSE doesn’t bother with stable inodes, and it doesn’t have to. Backup programs, however, like them a lot: by caching and comparing the inode of a file, they can tell if it has been changed. It’s not always possible to tell from the inode alone, but it’s a good indicator.

Borg uses inodes as well, but fortunately it’s a quick fix in borgmatic.

    # Mode in which to operate the files cache. See
    # https://borgbackup.readthedocs.io/en/stable/usage/create.html#description for
    # details.
    files_cache: ctime,size

Keeping track of things

I now had a working system with automated backups, but I was still missing alerts whenever things weren’t working. I decided early on I wouldn’t spend every day going through log files, so something that emails me or sends a notification through Pushover would be optimal.

After a little searching I found Healthchecks. It integrates with borgmatic and has a free tier that allows you to monitor 3 jobs, and as I only need one for the local backup, that’s more than enough.

Simply register your job and add the callback URL to borgmatic:

hooks:
    healthchecks: https://hc-ping.com/addffa72-da17-40ae-be9c-ff591afb942a

Healthchecks also notifies you if a backup is late.
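You can also ping a check by hand to confirm the whole chain works before trusting it with real backups; Healthchecks just expects an HTTP request to the same URL as in the hook:

# manual test ping (same URL as in the borgmatic hook)
curl -fsS -m 10 --retry 5 https://hc-ping.com/addffa72-da17-40ae-be9c-ff591afb942a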

Final touches

My NAS has been running in the cloud for over a month now, and I’ve yet to experience major problems. I’m aware of the pitfalls that --vfs-cache-mode full brings along, and I’m fully prepared to lose any files still being uploaded if the power goes out. I also still keep the UPS connected.

I did see some unexpected behavior with Docker when the remote mount restarts, but for now I’ve solved it by simply making Docker dependent on the rclone mount, so when rclone restarts, Docker will restart after it.

[Unit]
Before=smbd.service docker.service
PartOf=docker.service smbd.service

[Install]
WantedBy=multi-user.target docker.service smbd.service borgmatic.service

Borgmatic is triggered by a timer, so I added borgmatic.service to the WantedBy list, just to make sure the mount is actually up when borgmatic runs.
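These directives are merged into the [Unit] and [Install] sections of the rclone unit from earlier. After editing, systemd needs a reload before it picks up the new dependencies (again assuming the unit is called rclone-jotta.service):

systemctl daemon-reload
# show which units now depend on the mount
systemctl list-dependencies --reverse rclone-jotta.service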

An unexpected side effect of the cloud setup is that I can now carry a Raspberry Pi in my EDC bag with a complete rclone setup, and I suddenly have my NAS in my pocket. Plug it in and you’ve got “local” access to your NAS files (don’t do that in an airport, please!).

Besides not having to listen to metal spinning, I also save a bit of money. The way I normally calculate TCO on my NAS boxes is to assume a 5-year lifetime, and in the case of my 918+ with drives that comes out to $380/year. Add to that about $130/year in power (38W, roughly 330 kWh/year at $0.4/kWh), and the total lands at roughly $510/year, or about $43/month. My current cloud setup costs less than half of that.

And there you have it. While not as technical as usual, I hope this at least inspired you to go and do amazing things.

Who knows if my newfound freedom will eventually result in more technical posts sometime in the future. For now I have movies I purchased before I had kids that I’ve not yet gotten around to watching, so I guess I’ll start there.
