Making an API for Calibre on Heroku

Pawit Pornkitprasan
May 26, 2019


Introduction

Ever since I switched to Chromium OS, I have tried to do more and more of my activities “in the cloud” when viable, as that means I have less software to manage on my computer. One thing I sometimes do is convert eBooks into the .azw3 format for my Kindle using the excellent Calibre eBook management software.

On Chromium OS, I can continue to use Calibre thanks to Crouton or Crostini. However, I wanted to see if I could get away with not installing the software locally. Converting a document to .azw3 is a relatively complex process, so the easiest way out is to wrap Calibre’s ebook-convert command in a REST API.

The Solution

I need somewhere to host this API. The host must allow me to run arbitrary binaries. I certainly do not want to rent a VPS just to host an API I use a few times per week at best. Searching for no-cost options, I stumbled upon Heroku. I have used Heroku in the past, and their auto-sleep model works really well in this situation. If no one is using the app, it goes to sleep, conserving the free quota. Whenever I need it, the app automatically wakes up within a few seconds.

The next challenge is to install Calibre on Heroku. I know that you can run arbitrary binaries on Heroku using their “custom buildpack” support. Even better, unlike Docker images, you can combine multiple buildpacks into one, as each just puts some binaries into the resulting image.

Luckily, someone had already made a buildpack to install Calibre. I fixed it up a little, stacked it with the Node.js buildpack and wrote a simple API to call the ebook-convert utility.
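To illustrate what “calling the ebook-convert utility” from Node.js could look like, here is a minimal sketch. It is not the code from the repository: the function names, the `ALLOWED_FORMATS` list and the way the output path is derived are all assumptions made for the example. The only thing taken from Calibre is the basic command form, `ebook-convert input_file output_file`.

```javascript
const { execFile } = require("node:child_process");
const path = require("node:path");

// Hypothetical allow-list of output formats accepted by this sketch.
const ALLOWED_FORMATS = new Set(["azw3", "epub", "mobi"]);

// Build the argument vector for `ebook-convert <input> <output>`.
// The output path is derived from the input path, so the caller only picks a format.
function buildConvertArgs(inputPath, format) {
  if (!ALLOWED_FORMATS.has(format)) {
    throw new Error(`Unsupported output format: ${format}`);
  }
  const parsed = path.parse(inputPath);
  const outputPath = path.join(parsed.dir, `${parsed.name}.${format}`);
  return [inputPath, outputPath];
}

// Run ebook-convert without a shell. Arguments are passed as an array,
// so spaces or shell metacharacters in a path are never interpreted by a shell.
function convert(inputPath, format) {
  const args = buildConvertArgs(inputPath, format);
  return new Promise((resolve, reject) => {
    execFile("ebook-convert", args, (error) => {
      if (error) reject(error);
      else resolve(args[1]); // path of the converted file
    });
  });
}
```

Using `execFile` with an argument array instead of building a command string for `exec` is a deliberate choice here: it avoids shell interpretation of anything that originates from a request.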

Security

Whenever I execute a command from code with user-supplied input, I get very nervous about security, as such code is often full of exploitable holes. Luckily, in this case, it’s a private-use API, so all input can be trusted. Nevertheless, while coding, I made sure to follow good security practices.

Since the API includes functionality to upload and download files, I made sure that the user cannot traverse directories to upload or download arbitrary files. In fact, I took the easy way out by generating a UUID for each new upload and using it as the file name, so users cannot control where their uploads end up.
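The UUID approach can be sketched roughly as follows. This is an illustration under assumptions, not the repository’s code: `UPLOAD_DIR`, `newUploadPath` and `resolveDownloadPath` are hypothetical names, and the strict pattern check on download is one possible way to enforce the same idea.

```javascript
const { randomUUID } = require("node:crypto");
const path = require("node:path");

// Hypothetical storage directory; the real project may use a different one.
const UPLOAD_DIR = "/tmp/uploads";

// Matches the lowercase 8-4-4-4-12 hex form produced by randomUUID().
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/;

// Pick a storage path for a new upload.
// The client-supplied file name is ignored entirely.
function newUploadPath() {
  const id = randomUUID();
  return { id, filePath: path.join(UPLOAD_DIR, id) };
}

// Resolve a download request to a path inside UPLOAD_DIR.
// Anything that is not a well-formed UUID is rejected with null.
function resolveDownloadPath(id) {
  if (typeof id !== "string" || !UUID_PATTERN.test(id)) {
    return null;
  }
  return path.join(UPLOAD_DIR, id);
}
```

Because the pattern allows only hex digits and hyphens, an identifier such as `../../etc/passwd` never reaches `path.join`, so there is nothing to traverse with.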

However, there are still these potential issues:

  • HTML can be uploaded and served from the web server, which can lead to script-injection attacks and potentially to cookie theft if the domain has other valuable cookies. This can be fixed by serving only “safe” content types (e.g. images) as-is and serving all unknown content as application/octet-stream.
  • Directory traversal from Calibre. Since users can upload HTML files for Calibre to convert, and since an HTML file can refer to external resources, it may be possible to trick Calibre into loading arbitrary files into the conversion result. I know of no foolproof method to fix this except to ensure that the API is run inside a container or a chroot with no access to sensitive information. In this case, running the API in its own Dyno (Heroku’s container technology) provides a good level of defense.
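The content-type fix suggested in the first point could look something like the sketch below. The allow-list and the `downloadHeaders` function are hypothetical. The `Content-Disposition: attachment` and `X-Content-Type-Options: nosniff` headers are extra hardening that I am assuming here; they go beyond what is described above.

```javascript
// Hypothetical allow-list: only these types are served with their real content type.
// image/svg+xml is deliberately left out because SVG files can contain scripts.
const SAFE_CONTENT_TYPES = new Set(["image/png", "image/jpeg", "image/gif"]);

// Return response headers for a stored file, given the content type it claims to have.
function downloadHeaders(claimedType) {
  // Drop parameters such as "; charset=utf-8" and normalise case before comparing.
  const type = String(claimedType || "").split(";")[0].trim().toLowerCase();
  if (SAFE_CONTENT_TYPES.has(type)) {
    return { "Content-Type": type, "X-Content-Type-Options": "nosniff" };
  }
  // Everything else is served as an opaque download rather than rendered.
  return {
    "Content-Type": "application/octet-stream",
    "Content-Disposition": "attachment",
    "X-Content-Type-Options": "nosniff",
  };
}
```

With this in place, an uploaded HTML file is offered as a download rather than rendered in the context of the API’s domain.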

Source Code

The source code for the API can be found on GitHub: https://github.com/pawitp/ebook-convert-api
