I am not talking about hundreds or thousands of files. I am talking about hundreds of thousands. The usual “/bin/rm -rf *” may work, but will take a while. Or it may fail with the “argument list too long” error. So here are a few examples showing you how to delete many files in a hurry.

First, we need something to work with, like maybe a million empty files dumped into a single folder:
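The setup command is not shown here, so the following is only a sketch of one way to do it. The scratch directory, the `file_` prefix and the `COUNT` variable are assumptions, and the default is kept small so the example finishes quickly.

```shell
# Scratch directory (hypothetical location; mktemp picks one under $TMPDIR or /tmp).
dir="$(mktemp -d)"
# Kept small so the sketch runs quickly; set COUNT=1000000 to match the text.
count="${COUNT:-1000}"
# seq emits 1..count, sed turns each number into a file name, and xargs hands
# the names to touch in batches that fit within the argument size limit.
( cd "$dir" && seq 1 "$count" | sed 's/^/file_/' | xargs touch )
# Count the entries without sorting them (-f), which matters at this scale.
ls -f "$dir" | grep -c '^file_'
```

Generating the names through a pipe rather than a shell glob avoids the "argument list too long" problem during setup as well.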

Method 1: find + xargs (40 seconds)
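The command behind this timing is not shown, so here is one common form of the technique, assuming a find and xargs that support `-print0` and `-0` (GNU and BSD both do). The temporary directory and its 1000 test files are stand-ins.

```shell
# Scratch data so the sketch is safe to run: 1000 empty files in a temp directory.
dir="$(mktemp -d)"
( cd "$dir" && seq 1 1000 | sed 's/^/file_/' | xargs touch )
# find streams NUL-terminated names; xargs -0 packs them into as few rm
# invocations as the argument size limit allows.
find "$dir" -maxdepth 1 -type f -print0 | xargs -0 rm -f
```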

Method 2: find + delete (51 seconds)
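A minimal sketch of this method, assuming a find that provides the `-delete` action (GNU and BSD find do). The directory and file names are made up for the example.

```shell
# Scratch data: 1000 empty files in a temp directory.
dir="$(mktemp -d)"
( cd "$dir" && seq 1 1000 | sed 's/^/file_/' | xargs touch )
# find removes each regular file itself, so no second process is spawned.
find "$dir" -maxdepth 1 -type f -delete
```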

Method 3: rsync (22 seconds)
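The usual rsync trick is to synchronize an empty directory onto the full one with `--delete`. The sketch below assumes rsync is installed and uses two throwaway directories; note that the trailing slashes matter.

```shell
# Scratch data: 1000 empty files in a temp directory, plus an empty source directory.
dir="$(mktemp -d)"
empty="$(mktemp -d)"
( cd "$dir" && seq 1 1000 | sed 's/^/file_/' | xargs touch )
# With --delete, rsync removes everything in the destination that is missing
# from the (empty) source, leaving the destination directory itself in place.
rsync -a --delete "$empty"/ "$dir"/
rmdir "$empty"
```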

Method 4: perl (23 seconds)

Method 5: rm (19 seconds)
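For comparison, a sketch of the plain rm approach on a small test directory. The `getconf ARG_MAX` line prints the system limit that decides whether the glob will fit; the directory and file names are assumptions.

```shell
# Scratch data: 1000 empty files in a temp directory.
dir="$(mktemp -d)"
( cd "$dir" && seq 1 1000 | sed 's/^/file_/' | xargs touch )
# Limit, in bytes, on the combined size of arguments and environment.
getconf ARG_MAX
# The shell expands ./* into one argument per file before rm starts, which is
# where "argument list too long" comes from when there are too many names.
( cd "$dir" && /bin/rm -rf ./* )
```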

Moral of the story: if the number of files doesn't exceed rm's argument size limit, don't try to be clever; just use rm. Otherwise, rsync seems a viable alternative.

A different approach should be taken when deleting files from NFS mounts. Each delete operation generates a lot of network overhead, so the answer is to parallelize the deletion process. For example, imagine your NFS-mounted directory looks like so:

Now, also imagine that each bigfolder_## contains a large number of files. You can do something like this:
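A sketch of the parallel deletion, run against a local stand-in for the mount. The ten folders named `bigfolder_01` to `bigfolder_10` and the 100 files in each are assumptions made so the example can run anywhere; rsync must be installed.

```shell
# Stand-in for the NFS mount: a temp directory with ten bigfolder_## subfolders.
mnt="$(mktemp -d)"
empty="$(mktemp -d)"
for i in $(seq -w 1 10); do
    mkdir "$mnt/bigfolder_$i"
    ( cd "$mnt/bigfolder_$i" && seq 1 100 | sed 's/^/file_/' | xargs touch )
done
# One background rsync per folder, each syncing the empty directory onto it.
for d in "$mnt"/bigfolder_*/; do
    rsync -a --delete "$empty"/ "$d" &
done
# Block until every background job has finished.
wait
rmdir "$empty"
```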

This would start ten rsync processes in parallel. This will work faster than a single process, but performance can be improved further by mounting each bigfolder_## individually (if the NAS allows you this option):
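What the individual mounts could look like depends entirely on the NAS, so this sketch only prints candidate commands instead of running them. The host `nas01`, the export path `/export/data`, the mount root `/mnt/data` and the function name `print_mounts` are all hypothetical.

```shell
# Hypothetical names: replace with the real NAS host, export and mount point.
nas_host="nas01"
export_root="/export/data"
mount_root="/mnt/data"

# Print one mount command per folder; nothing is mounted by this function,
# since mounting needs root privileges and a reachable NAS.
print_mounts() {
    for i in $(seq -w 1 10); do
        echo "mount -t nfs $nas_host:$export_root/bigfolder_$i $mount_root/bigfolder_$i"
    done
}

print_mounts
```

Once each folder is mounted separately, the same one-rsync-per-folder deletion can be pointed at the individual mount points.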

 

 
