I’m currently in the process of adding WebP support for my WordPress sites and one of the tasks involved is converting all existing images (JPEGs and PNGs) to WebP in bulk.

To do that, we’ll be stringing together the find command, xargs, and cwebp.

OS: Ubuntu 20.04. Host: Digital Ocean.

To save you some time, here are the commands I used:

# Convert PNGs
find . -iname "*.png" -print0 | xargs -0 -n 1 -P 0 -I '{}' cwebp '{}' -short -q 90 -alpha_q 100 -m 6 -o '{}'.webp

# Convert JPGs
find . \( -iname "*.jpg" -o -iname "*.jpeg" \) -print0 | xargs -0 -n 1 -P 0 -I '{}' cwebp -short -q 80 '{}' -o '{}'.webp

Simply put, these commands recursively convert all *.jpg, *.jpeg, and *.png files under the current directory to WebP. The originals are left in place; each WebP file is written next to its source.

As you can see, the command chain starts with find. Its search results are piped to xargs, which calls cwebp to do the actual conversion.

If you’re interested, let’s take these commands apart and look at the individual chunks that make up the commands.

find

find . -iname "*.png" -print0

The dot (.) instructs find to start the searching in the current directory.

-iname "*.png" tells find to look for files whose names match the pattern *.png. The i in -iname stands for insensitive, as in case-insensitive. This takes care of cases where a file extension is .PNG or .PnG

-print0 terminates each filename with a null character instead of a newline, so names that contain spaces survive the trip through the pipe. This is used in conjunction with xargs’s -0 option (more on that in the xargs section below)

find . \( -iname "*.jpg" -o -iname "*.jpeg" \) -print0

For this command, the goal is to return files that end in .jpg OR .jpeg.

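To do that, we use two -iname tests, -iname "*.jpg" and -iname "*.jpeg", joined by -o, which is the OR operator.

The pair of escaped parentheses groups the two tests so that -print0 applies to files matching either one. The backslashes stop the shell from interpreting the parentheses itself.

See the first command above for an explanation of -print0 and the dot (.).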
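
To see why the parentheses matter, here is a small sketch that runs the find expressions against a throwaway directory. The file names and the count_nul helper are made up for illustration. Without the parentheses, -print0 binds only to the second -iname test, so the .jpg files are silently dropped.

```shell
# Throwaway directory tree; the file names are made up for this demo.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/2020/08"
touch "$demo_dir/a.png" "$demo_dir/2020/08/b.PNG" \
      "$demo_dir/c.jpg" "$demo_dir/2020/08/d.JPEG" "$demo_dir/notes.txt"

# Count results by counting the null terminators that -print0 emits.
count_nul() { tr -cd '\0' | wc -c | tr -d ' '; }

# -iname matches both a.png and b.PNG.
png_count=$(find "$demo_dir" -iname "*.png" -print0 | count_nul)

# Grouped: -print0 applies to files matching either pattern.
jpg_grouped=$(find "$demo_dir" \( -iname "*.jpg" -o -iname "*.jpeg" \) -print0 | count_nul)

# Ungrouped: -print0 only applies to the *.jpeg test, so c.jpg is not printed.
jpg_ungrouped=$(find "$demo_dir" -iname "*.jpg" -o -iname "*.jpeg" -print0 | count_nul)

echo "png=$png_count grouped=$jpg_grouped ungrouped=$jpg_ungrouped"
# prints: png=2 grouped=2 ungrouped=1

rm -rf "$demo_dir"
```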


xargs

This is the complete xargs command I used:

xargs -0 -n 1 -P 0 -I '{}' cwebp '{}' -short -q 90 -alpha_q 100 -m 6 -o '{}'.webp

But let’s ignore the cwebp portion of the command for now and look at this example instead:

xargs -0 -n 1 -P 0 -I '{}' echo '{}'

-0 means that the input is terminated by a null character. This is used together with find’s -print0 option.

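-n 1 passes one filename to each command, and -P 0 tells xargs to run as many processes at once as possible. This can really speed things up if your machine has multiple CPU cores. (Strictly speaking, -I already runs one command per input item, so -n 1 is redundant here.)

An alternative is to explicitly specify the maximum number of processes to spawn: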

-n 1 -P 4 will spawn 4 processes at most

-I '{}' names a placeholder for the input. With this option set, we can do things like:

echo '{}': this is the command xargs runs for each input item. '{}' is replaced with the input when the command runs.
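
A quick way to get a feel for these options is to feed xargs a few null-terminated names by hand and let echo print the command that would run. In the sketch below, printf stands in for find -print0 and the file names are invented. Since -I already runs one command per input item, -n 1 is left out; some versions of GNU xargs may print a warning when both are given.

```shell
# printf stands in for `find ... -print0`; the names are invented for the demo.
# One name contains a space to show that null-terminated input keeps it intact.
output=$(printf '%s\0' "./2020/08/hello world.png" "./logo.PNG" |
  xargs -0 -P 2 -I '{}' echo 'cwebp {} -o {}.webp' | sort)

# Each line is the command xargs would have run for one input file.
# sort is only there because -P makes the output order unpredictable.
echo "$output"
```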

cwebp

cwebp is a lightweight utility that converts images to WebP format.

Install cwebp

sudo apt update
# On Ubuntu, cwebp ships in the webp package
sudo apt install webp

Next let’s take a look at the cwebp commands.

cwebp: JPEG to WebP

cwebp -short -q 80 '{}' -o '{}'.webp

I used this command to convert JPEGs to WebP.

-short prints a short summary as opposed to the more verbose default output. If you prefer, you can use -quiet instead, which runs the command in silent mode.

-q specifies the quality on a scale of 0 to 100, 100 being the highest (the default is 75). I settled on 80 because it works for me.

'{}' is the input file, which xargs fills in from find’s results. E.g. ./2020/08/hello-world.jpg

-o '{}'.webp the output filename is specified by -o, and for my implementation I’m appending .webp to the input filename. E.g. ./2020/08/hello-world.jpg.webp
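
The naming scheme can be tried out without running cwebp at all. The sketch below shows the appended name used in this post next to a hypothetical alternative that swaps the extension instead. Both function names are mine, not part of cwebp. One argument for appending: photo.jpg and photo.png in the same directory cannot collide on a single photo.webp.

```shell
# Hypothetical helpers, not part of cwebp.

# Scheme used in this post: append .webp to the full input name.
webp_name_append() { printf '%s\n' "$1.webp"; }

# Alternative: strip the last extension, then add .webp.
webp_name_replace() { printf '%s\n' "${1%.*}.webp"; }

webp_name_append  "./2020/08/hello-world.jpg"   # ./2020/08/hello-world.jpg.webp
webp_name_replace "./2020/08/hello-world.jpg"   # ./2020/08/hello-world.webp
```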

cwebp: PNG to WebP

cwebp '{}' -short -q 90 -alpha_q 100 -m 6 -o '{}'.webp

This is almost exactly the same as the cwebp command above, except for:

-alpha_q 100 sets the alpha channel (transparency) quality to 100

-m 6 raises the compression method to 6, the slowest setting (default is 4). This makes the conversion slower but can produce smaller files.

One thing to note here is that I’m doing a lossy conversion for PNGs, even though PNG is a lossless format.

The reason is that I’m really impressed by the quality of the resulting WebP images and the reduction in file size. You can read more about it here.

Sample Script

This is one version of the script I used that incorporates the commands above.

# -P 8 spawn 8 processes
# -P 0 spawn as many processes as possible

echo "Processing PNGS"
find . -iname "*.png" -print0 | xargs -0 -n 1 -P 0 -I '{}' cwebp '{}' -short -q 90 -alpha_q 100 -m 6 -o '{}'.webp

echo "Processing JPGS"
find . \( -iname "*.jpg" -o -iname "*.jpeg" \) -print0 | xargs -0 -n 1 -P 0 -I '{}' cwebp -short -q 80 '{}' -o '{}'.webp

Another version has cwebp set to -quiet. I have a hunch that cwebp might run faster in silence, but I have no data to back it up.

Further Discussion

Can I do the conversion directly on the production server?

I don’t recommend it unless (1) your server has many free CPU cores, or (2) your website has no traffic, or (3) your website doesn’t have a lot of images.

For (1), make sure you explicitly set the number of processes xargs spawns (for example, -P 4); otherwise your website will slow to a crawl while the script is running.
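
One way to keep the site responsive is to cap the process count and lower the scheduling priority of the workers. This is a minimal sketch, assuming nproc and nice from GNU coreutils are available (they are on a stock Ubuntu install). The workers variable, the leave-one-core-free rule, and the demo files are my own choices, and echo stands in for cwebp so the pipeline is a safe dry run.

```shell
# Leave one core free for the web server, but always use at least one process.
cores=$(nproc 2>/dev/null || echo 1)
workers=$(( cores > 1 ? cores - 1 : 1 ))

# Demo directory with two invented files so the dry run has something to list.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.png" "$demo_dir/b.PNG"

# Dry run: echo prints the cwebp command instead of running it.
# nice -n 19 gives xargs and the processes it spawns the lowest CPU priority.
planned=$(find "$demo_dir" -iname "*.png" -print0 |
  nice -n 19 xargs -0 -P "$workers" -I '{}' echo cwebp -quiet -q 90 '{}' -o '{}'.webp)

echo "$planned"
rm -rf "$demo_dir"
```

To run it for real, drop the echo and point find at the uploads directory.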

Here are two methods I used and they are specific to the Digital Ocean stack and WordPress.

A. Test Server

  1. Create a snapshot of the production droplet
  2. Create a test server using the snapshot
  3. Bulk convert images on the test server
  4. Test
  5. Attach block storage volume to test server
  6. Copy over the images (wp-content/uploads) to the block storage volume
  7. Detach storage from test server
  8. Attach storage to production server
  9. Deploy images to production server from storage (i.e. replace wp-content/uploads)
  10. Set file permission and ownership

This is the safer approach because it allows for testing on the test server.
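
For the last step, permissions and ownership can be reset along these lines. This is a sketch under assumptions: 755 for directories and 644 for files are common WordPress defaults rather than something this post prescribes, and www-data is the usual web server user on Ubuntu but may differ on your stack. The function name is hypothetical.

```shell
# Hypothetical helper: reset modes under an uploads directory.
# Assumes 755 for directories and 644 for files.
reset_upload_modes() {
  find "$1" -type d -exec chmod 755 {} +
  find "$1" -type f -exec chmod 644 {} +
}

# Typical use on the production server (www-data is an assumption):
#   reset_upload_modes wp-content/uploads
#   sudo chown -R www-data:www-data wp-content/uploads
```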

B. Worker Server

  1. Attach block storage to Website 1
  2. Copy the uploads directory to block storage. While you’re at it, also make a backup.
  3. Repeat Steps 1 and 2 for Website 2 … Website n
  4. Detach block storage
  5. Create a new droplet that has as many CPU cores as your wallet can afford – this is the worker server
  6. Attach block storage to worker
  7. Bulk convert images
  8. Detach block storage
  9. Attach block storage to Website 1
  10. Copy the uploads directory from block storage to production server
  11. Set file permission and ownership
  12. Do some testing on the production server. Clear the cache if everything looks good.
  13. Repeat Steps 9-12 for Website 2 … Website n

This is the riskier approach but much faster if you have to bulk convert images from multiple servers.

What I ended up doing was a combination of A and B.

How much time does the conversion take?

On a single-core server with 2 GB of RAM, my throughput is about 16 images per second.

On an 8-core server with 16 GB of RAM, my throughput is about 125 to 247 images per second.

Obviously if your images are bigger, it will take more time to convert.

But the conversion can be done faster if you have more CPU cores.
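
As a rough planning aid, the figures above can be turned into a time estimate by dividing the image count by the throughput. The sketch below assumes throughput stays constant, which it will not if image sizes vary a lot. The function name and the 16,000-image example are invented.

```shell
# Hypothetical helper: seconds needed for $1 images at $2 images per second,
# rounded up to the next whole second.
estimate_seconds() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Count the images that would be converted under the current directory.
image_count=$(find . -type f \( -iname "*.png" -o -iname "*.jpg" -o -iname "*.jpeg" \) | wc -l | tr -d ' ')
echo "$image_count images found"

estimate_seconds 16000 16    # single-core figure from above: 1000
estimate_seconds 16000 125   # lower 8-core figure from above: 128
```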

How much more space does WebP require?

For my setup, I’m seeing a 57%-65% increase in disk space usage, because the WebP files are stored alongside the originals. For example, one of my websites initially used 2 GB for images. After conversion, that number became 3.3 GB. That’s 1.3 GB more, or a 65% increase. Your mileage may vary.
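
To measure this on your own uploads directory, the byte totals can be added up per extension. This is a minimal sketch, assuming GNU find (for -printf) and awk. The function names are made up, and the percentage uses integer arithmetic, so it rounds down.

```shell
# Total bytes of regular files under $1 whose names match the
# case-insensitive pattern $2.
total_bytes() {
  find "$1" -type f -iname "$2" -printf '%s\n' |
    awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Extra space as a percentage of the originals:
# $1 = size of the originals, $2 = size of the added WebP files.
percent_extra() {
  echo $(( $2 * 100 / $1 ))
}

# The example from this post, in MB: 2 GB of originals plus 1.3 GB of WebP.
percent_extra 2000 1300   # 65

# On a real site, something like:
#   total_bytes wp-content/uploads "*.webp"
```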

References

  1. find manual
  2. xargs manual
  3. cwebp documentation