Converting images with Node.js - Part 3: Documents
Documents can be turned into images in Node.js, but this often requires multiple steps. This article will demonstrate how to use a custom Sharp installation in combination with additional programs to achieve this. The installation steps required vary from case to case, so it is recommended to pick only the programs you need for your particular use-case.
Details about the libraries and build process required to make a custom installation of sharp are described in the previous article. This article will not repeat theory or context, but installation steps required to achieve a functional prototype will be provided.
The ability to turn documents into images has many productive use-cases such as thumbnails in a list of assets, previews before opening, downloading or sending, and web-based document readers which avoid the need for client-side plugins in some applications.
Turning images into documents will not be covered in this article. It is possible, but it rarely has a useful real-world application. You could put a scaled down image on a one-page PDF, but this is generally less useful than providing the same content as an image. Document recovery from images can be achieved with OCR (Optical Character Recognition) among other techniques, but this is out of scope for this article, which is about file conversions.
Below is an example of how to decode documents on Ubuntu 24.04, a popular Linux distribution. It should be relatively simple to adapt these instructions to other distributions to suit your own needs.
Installing prerequisites
First, update the apt package manager:
apt update
In order to create a custom libvips build which can be used to decode documents you need to install some packages. Starting with build tools:
apt install -y build-essential pkg-config meson ninja-build cmake libglib2.0-dev libexpat1-dev
For parity with the standard distribution of Sharp you will need to install the same libraries:
apt install -y libjpeg-turbo8-dev libpng-dev libwebp-dev libtiff-dev libgif-dev libcgif-dev libimagequant-dev libheif-dev libdav1d-dev liblcms2-dev libexif-dev zlib1g-dev libbrotli-dev
However, if you are certain you will never use one or more of the image types these libraries support, you are free to omit them. Read the previous article for more details.
Sections below will explain optional codecs, any of which can be skipped. Once you have installed all codecs relevant to your use-case feel free to continue to the Building libvips and Sharp section of this article.
PDF
The Portable Document Format (PDF) was released in 1993, inspired by PostScript’s page description model. The format is patented by Adobe, but licences are royalty-free. Anyone may create applications that can read and write PDF files.
PDFs are not text and images, but a declarative format interpreted by a complex rendering engine. Computation takes place on the machine that renders the PDF, meaning your server in this case. It is impossible to exhaustively defend against every past and future edge case which causes memory corruption, excessive CPU usage, or long execution times. No PDF rendering library will do that for you. Therefore PDF rendering runtimes should be limited in resource use and execution time.
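One way to enforce a time limit is to do the rendering in a separate process that can be killed when it overruns. The following is a minimal sketch using Node's built-in child_process module. The helper name runWithTimeout is hypothetical, and it only covers execution time: memory and CPU limits would still have to come from the operating system or the container.

```javascript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Runs a command in a separate process and kills it when it takes
// longer than timeoutMs. Hypothetical helper, not part of Sharp.
async function runWithTimeout(command, args, timeoutMs) {
  try {
    const { stdout } = await execFileAsync(command, args, {
      timeout: timeoutMs,     // the child is killed after this many milliseconds
      killSignal: 'SIGKILL',  // do not give a stuck renderer the chance to ignore the signal
      maxBuffer: 1024 * 1024, // cap captured stdout/stderr at 1 MiB
    });
    return { ok: true, stdout };
  } catch (error) {
    // error.killed is true when Node terminated the child itself,
    // which in this sketch happens when the timeout expired
    return { ok: false, timedOut: error.killed === true, message: error.message };
  }
}
```

A call such as runWithTimeout('node', ['render-pdf.mjs', 'test.pdf'], 5000) would give a hypothetical render-pdf.mjs script, which contains the actual Sharp call, five seconds to finish.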
Sharp can render PDF with PDFium, Poppler, ImageMagick or GraphicsMagick. This article describes a solution with Poppler, because Ubuntu’s package manager provides it. You can install it with:
apt install -y libpoppler-glib-dev
After completing the build steps for vips and Sharp below, you should be able to decode .pdf files:
await sharp('test.pdf').png().toFile('from_pdf.png');
Microsoft Office Documents
Microsoft Office, also known as MS Office or just Office, is a suite of applications offered by Microsoft. Its widespread adoption means that transforming a file in one of its proprietary formats to an image is a common challenge. Unfortunately, vips is not capable of decoding Microsoft Office documents.
Fortunately, the open-source LibreOffice suite is capable of decoding Office documents and encoding their contents as PDF, which vips can decode. LibreOffice is a 1.2GB package that needs to be installed in the runtime environment, as it cannot be built and shipped within Sharp or any other npm package:
apt install -y libreoffice
Making calls to LibreOffice from the Node.js runtime can be achieved with the libreoffice-convert npm package. This uses the soffice command in order to convert a document in an Office format to PDF. Install with:
npm i libreoffice-convert
N.B. The LibreOffice program is an external process which runs directly on the machine hosting the Node.js runtime. It could therefore have failures and attack surfaces with significantly higher impact, including memory leaks, process spawning, or even remote code execution.
Conversion is therefore done in two steps: first from Microsoft Office document to PDF with LibreOffice, then from PDF to image through Sharp. This additional step adds hidden complexity and runtime dependencies, making these transformations significantly harder to deploy and maintain than anything Sharp can handle by itself.
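Because LibreOffice runs as an external process, it can help to see roughly what such a call looks like when made directly, with a time limit and a throwaway profile directory. This is a sketch under assumptions, not what the npm package does internally: it assumes soffice is on the PATH, and the function names sofficeArgs and convertToPdf are hypothetical.

```javascript
import { execFile } from 'node:child_process';
import { mkdtemp } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { basename, extname, join } from 'node:path';
import { pathToFileURL } from 'node:url';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Builds the argument list for a headless LibreOffice conversion to PDF.
// profileDir gives this run its own user profile, so it does not share
// state with other conversions.
function sofficeArgs(inputPath, outDir, profileDir) {
  return [
    `-env:UserInstallation=${pathToFileURL(profileDir).href}`,
    '--headless',
    '--convert-to', 'pdf',
    '--outdir', outDir,
    inputPath,
  ];
}

// Converts one document and returns the path of the PDF soffice writes.
// The process is killed when it runs longer than timeoutMs.
async function convertToPdf(inputPath, timeoutMs = 30_000) {
  const workDir = await mkdtemp(join(tmpdir(), 'soffice-'));
  const args = sofficeArgs(inputPath, workDir, join(workDir, 'profile'));
  await execFileAsync('soffice', args, { timeout: timeoutMs, killSignal: 'SIGKILL' });
  // soffice names the output after the input file, with a .pdf extension
  return join(workDir, `${basename(inputPath, extname(inputPath))}.pdf`);
}
```

Killing the launcher may not stop every child process it started, so container-level limits remain advisable on top of a timeout like this.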
The first step is to transcode the document to PDF and store it in a buffer, which Sharp can process. This is the same for every format and doesn’t need to be repeated:
import {readFileSync} from 'fs';
import {promisify} from 'util';
import libre from 'libreoffice-convert';
// libre.convert expects a callback; promisify it so it can be awaited
libre.convertAsync = promisify(libre.convert);
// a buffer containing a PDF file, generated from test.docx
const pdfBuffer = await libre.convertAsync(readFileSync('test.docx'), '.pdf', undefined);
DOCX (Word)
The Office Open XML Document format replaced the Microsoft Word Binary File Format in 2007, when Microsoft introduced Office Open XML. Microsoft holds the patent, but has issued an irrevocable promise to never assert claims against anyone for making, using, selling, offering for sale, importing or distributing any implementation.
Turning .docx files into images requires an intermediary step. Transforming a Word document to PDF to image is a common method of achieving this:
const pdfBuffer = await libre.convertAsync(readFileSync('test.docx'), '.pdf', undefined);
await sharp(pdfBuffer).png().toFile('from_docx.png');
XLSX (Excel)
The Office Open XML Workbook format replaced the Excel Binary File Format in 2007, when Microsoft introduced Office Open XML. XLSX is considered an open format; even though Microsoft holds the patent, they issued an irrevocable promise to let anyone use it.
Strictly speaking, Excel files are not documents, because they contain grids, sheets and formulas. Page size, zoom level, and scaling are presentation choices to be made when converting an XLSX file to a document or image.
Turning .xlsx files into images requires an intermediary step. Transforming an Excel file to PDF to image is a common method of achieving this:
const pdfBuffer = await libre.convertAsync(readFileSync('test.xlsx'), '.pdf', undefined);
await sharp(pdfBuffer).png().toFile('from_xlsx.png');
PPTX (PowerPoint)
The Office Open XML Presentation format replaced the PowerPoint Binary Presentation format in 2007, when Microsoft introduced Office Open XML. PPTX is considered an open format; even though Microsoft holds the patent, they issued an irrevocable promise to let anyone use it.
PowerPoint presentations are more suitable for conversion to a document or image than Word or Excel files, because they have a clear page size and fixed layout.
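A presentation usually has many slides, and each slide becomes one page of the intermediate PDF. By default Sharp reads only the first page of a multi-page input, so rendering every slide means looping over the pages. The sketch below assumes that the page and density constructor options and the pages metadata field behave for PDF input as the Sharp documentation describes. The function name renderPages is hypothetical, and sharp is passed in as an argument purely to keep the sketch self-contained.

```javascript
// Renders every page of a PDF buffer to a PNG buffer, one page at a time.
// `sharp` is the function exported by the sharp package, passed in by the caller.
async function renderPages(sharp, pdfBuffer, density = 150) {
  // metadata() reports how many pages the input contains
  const { pages = 1 } = await sharp(pdfBuffer).metadata();
  const images = [];
  for (let page = 0; page < pages; page++) {
    // page is zero-based; density is the resolution in DPI used to rasterize
    images.push(await sharp(pdfBuffer, { page, density }).png().toBuffer());
  }
  return images;
}

// Usage with the real module (not executed here):
//   import sharp from 'sharp';
//   const slides = await renderPages(sharp, pdfBuffer, 150);
```

Rendering the pages sequentially rather than in parallel keeps peak memory use closer to that of a single page.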
Turning .pptx files into images requires an intermediary step. Transforming a PowerPoint presentation to PDF to image is a common method of achieving this:
const pdfBuffer = await libre.convertAsync(readFileSync('test.pptx'), '.pdf', undefined);
await sharp(pdfBuffer).png().toFile('from_pptx.png');
Ghostscript and ImageMagick
In order to render other formats libvips can be built with ImageMagick, which in turn can delegate to Ghostscript. This opens up the ability to decode a large set of exotic and legacy formats so Sharp can create an image. These extra capabilities come at the cost of performance, stability and security which should be accounted for.
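ImageMagick reads a set of resource limits from environment variables such as MAGICK_MEMORY_LIMIT and MAGICK_TIME_LIMIT. One way to account for the costs mentioned above is to set these for the process that does the decoding. The sketch below only builds such an environment. The function name withMagickLimits and the default values are assumptions for illustration, and whether the limits are honoured when ImageMagick is loaded through libvips should be verified in your own environment.

```javascript
// Returns a copy of baseEnv with ImageMagick resource limits added.
// Environment variables are always strings, so numbers are converted.
function withMagickLimits(baseEnv, limits = {}) {
  const { memory = '256MiB', map = '512MiB', disk = '1GiB', timeSeconds = 30 } = limits;
  return {
    ...baseEnv,
    MAGICK_MEMORY_LIMIT: memory,            // heap memory for pixel data
    MAGICK_MAP_LIMIT: map,                  // memory-mapped pixel cache
    MAGICK_DISK_LIMIT: disk,                // on-disk pixel cache
    MAGICK_TIME_LIMIT: String(timeSeconds), // maximum run time in seconds
  };
}

// Usage (not executed here): pass the result as the `env` option when
// starting the process that decodes untrusted files, for example
//   execFile('node', ['render.mjs'], { env: withMagickLimits(process.env) });
```

The base environment is copied rather than modified, so the limits apply only to the child process that receives them.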
ImageMagick is an older image transformation program than libvips. It is slower when rasterizing the same file, but supports a much broader range of formats. ImageMagick has a long history of reported security vulnerabilities and should therefore be run in isolated, temporary environments with limits on runtime, memory and CPU usage when processing untrusted inputs. Install with:
apt install -y imagemagick libmagickwand-dev libmagickcore-dev
Ghostscript is a PostScript and PDF interpreter. Rendering PDFs can be done without it as described above, but Ghostscript offers the ability to decode additional niche and legacy formats. PostScript is an executable scripting language; interpreting it brings all the risks involved with that, including long execution times, memory corruption, and remote code execution. Install with:
apt install -y ghostscript
PSD (Photoshop)
The Photoshop document format was released in 1990. Though the format is proprietary, and PSD and Photoshop are trademarked, Adobe has historically tolerated ingestion by programs such as GIMP, Affinity Photo, and Clip Studio Paint.
Transforming a .psd file with Sharp often relies on ImageMagick, and can be done with:
await sharp('test.psd').png().toFile('from_psd.png');
EPS
Encapsulated PostScript is a legacy, print-oriented format developed in 1987. EPS files are still encountered in desktop publishing (DTP) workflows. EPS is an open, publicly specified format.
Rasterizing an .eps file requires Ghostscript to be installed, and can be done with:
await sharp('test.eps').png().toFile('from_eps.png');
DjVu
The DjVu format (pronounced déjà vu) was released in 1998 by AT&T Research to make scanned documents smaller and faster to transmit than PDF. It can use a background, foreground and optional text layer in order to apply compression selectively. Modern broadband connections make the advantages the format offers less relevant. DjVu is an open, non-proprietary format that can be freely implemented.
Transforming a .djvu file with Sharp often relies on ImageMagick, and can be done with:
await sharp('test.djvu').png().toFile('from_djvu.png');
Building libvips and Sharp
In order to make the codec libraries you have installed available from Node.js you will have to build both libvips and Sharp.
Building libvips can be done with the following commands, which the previous article explains in detail:
curl -L -O https://github.com/libvips/libvips/releases/download/v8.18.0/vips-8.18.0.tar.xz
tar xf vips-8.18.0.tar.xz
cd vips-8.18.0
meson setup build --prefix=/usr/local
ninja -C build
ninja -C build install
ldconfig
cd ..
These are the commands for (re-)building Sharp; an explanation can be found in the previous article:
npm i node-addon-api
npm explore sharp -- npm run build
An explanation of the portability limitations of this build can be found in the previous article as well.
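After rebuilding, it can be useful to check which loaders the build actually picked up before relying on them. Sharp exposes this through its format property. The sketch below inspects an object of that shape; the function name missingLoaders is hypothetical, and the ids 'pdf' and 'magick' are assumed to be the ones Sharp uses for the Poppler and ImageMagick loaders.

```javascript
// Returns the ids from `wanted` that cannot be used as input from a buffer.
// `formats` is expected to have the shape of sharp.format.
function missingLoaders(formats, wanted) {
  return wanted.filter((id) => formats[id]?.input?.buffer !== true);
}

// Usage with the real module (not executed here):
//   import sharp from 'sharp';
//   console.log(sharp.versions); // versions of libvips and its dependencies
//   console.log(missingLoaders(sharp.format, ['pdf', 'magick']));
```

An empty array suggests the optional loaders were found at build time; any id that is returned points at a library that was missing or not detected when libvips was built.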
Conclusion
Other document formats can be transformed into images in similar ways, though they often require an extra step, such as LibreOffice, as demonstrated above. Keep in mind that every additional tool brings its own operational and security concerns and makes the pipeline more fragile, so the overall burden tends to compound rather than simply add up. It can work, but building and maintaining a robust solution is not trivial. In the next article we will examine converting videos to images for preview and thumbnail purposes.