Originally published January 7, 2018 @ 10:46 pm

I am not talking about hundreds or thousands of files. I am talking about hundreds of thousands. The usual “/bin/rm -rf *” may work but will take a while, or it may fail with the “Argument list too long” error. So here are a few examples showing how to delete many files in a hurry.

First, we need something to work with, like maybe a million empty files dumped into a single folder:
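Something like this does the trick (a sketch; `/tmp/test_dir` is an arbitrary scratch path used in the examples below):

```shell
# create a scratch directory and fill it with a million empty files
mkdir -p /tmp/test_dir
cd /tmp/test_dir
# seq feeds the names 1..1000000 to xargs, which batches them into touch calls
seq 1 1000000 | xargs touch
```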

Method 1: find + xargs (40 seconds)
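Along these lines (a sketch; the sample here is smaller than the million-file run that produced the timing above):

```shell
# start with a fresh sample directory of empty files
rm -rf /tmp/test_dir && mkdir -p /tmp/test_dir
(cd /tmp/test_dir && seq 1 50000 | xargs touch)
# delete with find + xargs: -print0/-0 keeps odd filenames safe,
# and xargs batches names into as few rm invocations as possible
find /tmp/test_dir -type f -print0 | xargs -0 rm -f
```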

Method 2: find -delete (51 seconds)
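A sketch of the same job using find’s built-in `-delete` action, which avoids spawning any rm processes at all:

```shell
# start with a fresh sample directory of empty files
rm -rf /tmp/test_dir && mkdir -p /tmp/test_dir
(cd /tmp/test_dir && seq 1 50000 | xargs touch)
# find unlinks each match itself; no external command is executed per file
find /tmp/test_dir -type f -delete
```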

Method 3: rsync (22 seconds)

Method 4: Perl (23 seconds)

Method 5: rm (19 seconds)
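And plain rm with a shell glob (a sketch; this is the variant that can blow up on very large directories):

```shell
# start with a fresh sample directory of empty files
rm -rf /tmp/test_dir && mkdir -p /tmp/test_dir
cd /tmp/test_dir && seq 1 50000 | xargs touch
# the shell expands * into the full file list; this fails with
# "Argument list too long" once the list exceeds the kernel's ARG_MAX limit
/bin/rm -f *
```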

Moral of the story: if the number of files doesn’t exceed rm’s argument list limit, don’t try to be clever and just use it. Otherwise, rsync seems like a viable alternative.

A different approach should be taken when deleting files from NFS mounts. Each delete operation is a round trip to the server, so the per-file network overhead dominates. The answer is to parallelize the deletion process. For example, imagine your NFS-mounted directory looks like so:
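For illustration, assume a layout along these lines (the mount point and folder names are hypothetical):

```
/mnt/nas/bigfolder_01
/mnt/nas/bigfolder_02
...
/mnt/nas/bigfolder_10
```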

Now, also imagine that each bigfolder_## contains a large number of files. You can do something like this:

This would start ten rsync processes in parallel. This will work faster than a single process, but there is a way of improving performance further by mounting each bigfolder_## individually (if the NAS allows you this option):
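A sketch of the per-mount variant; the server name, export paths, and mount points are hypothetical, and the mount commands need root privileges and a NAS that exports each folder separately:

```shell
# mount each bigfolder on its own mount point (hypothetical exports)
for i in $(seq -w 1 10); do
  mkdir -p /mnt/nas_$i
  mount -t nfs nas01:/export/bigfolder_$i /mnt/nas_$i
done
# one rsync per mount, each over its own NFS connection
mkdir -p /tmp/empty
for i in $(seq -w 1 10); do
  rsync -a --delete /tmp/empty/ /mnt/nas_$i/ &
done
wait
```

The gain comes from each mount getting its own connection and request queue, so the parallel deletes don’t contend on a single mount’s RPC slots.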