Generating Random Text Files for Testing

November 3, 2023

Originally published November 3, 2020 @ 7:05 am

Sometimes you need dummy folder structures populated with random data for testing your various scripts and processes – backups, file transfers, encryption, compression, etc. Every time I need something like this, I end up writing a little script from scratch.

I’ve written many of these little loops that I’ve lost and forgotten. But here’s the final, definitive version. No more, I promise.

This nested loop creates two folder structures, each populated with between 2 and 12 subfolders. Every subfolder contains between 2 and 120 files. Each file is between 512 bytes and 256KB in size and consists of random alphanumeric characters folded into lines of a fixed width, picked per file from the range of 60 to 280 characters. So we are talking about 720MB tops (2 × 12 × 120 × 256KB). It’s a good random data set for running various tests.
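As a quick sanity check on that upper bound, the arithmetic can be sketched in the shell itself. The variable names here are only illustrative; the limits are the ones quoted above, with 256KB taken as 262144 bytes.

```shell
# Upper limits taken from the description above (variable names are illustrative)
sets=2
max_dirs=12
max_files=120
max_bytes=262144   # 256KB per file

# Worst case: every set, subfolder and file hits its maximum
total_files=$(( sets * max_dirs * max_files ))
total_mb=$(( total_files * max_bytes / 1024 / 1024 ))

echo "at most ${total_files} files, ${total_mb}MB"
# prints: at most 2880 files, 720MB
```

In practice the data set will usually be much smaller, since every count and size is drawn at random from its range.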

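set +o history
pids=()
for k in 1 2
do
  mkdir -p "test_set_0${k}"
  # 2 to 12 subfolders per set, zero-padded by seq -w
  for i in $(seq -w 01 $(shuf -i 02-12 -n 1))
  do
    mkdir -p "./test_set_0${k}/dir_${i}"
    echo "Populating ./test_set_0${k}/dir_${i}"
    # 2 to 120 files per subfolder, zero-padded by seq -w
    for j in $(seq -w 001 $(shuf -i 002-120 -n 1))
    do
      # Random alphanumerics, folded to one line width per file (60-280 chars),
      # cut off at a random size (512 bytes to 256KB), written in the background
      { tr -dc '[:alnum:]' </dev/urandom | fold -w $(shuf -i 60-280 -n 1) | head -c $(shuf -i 512-262144 -n 1) > "./test_set_0${k}/dir_${i}/file_${j}" & } 2>/dev/null 1>&2
      pids+=($!)
    done
  done
done
# Block until every background job has finished writing its file
wait "${pids[@]}"
set -o history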

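Now, if you’re wondering about the set +/-o history lines, they switch command history off while the loop runs and back on afterwards, so pasting the loop into an interactive shell does not clutter your history. (History expansion of the ! character is a separate setting, toggled with set +H and set -H.) The pids array collects the PID of every background job, so you can wait on all of them before using the data. As you can see, each file is generated by its own pipeline running in the background. This greatly speeds things up (at the expense of your CPUs, of course).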
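Here is a minimal sketch of the same background-job pattern on a smaller scale, showing one way the collected PIDs can be used. The names demo_dir and gen_file, the fixed count of five files, and the LC_ALL=C prefix are my additions for illustration, not part of the loop above. It assumes bash and GNU coreutils (shuf, seq, fold).

```shell
#!/usr/bin/env bash
# Illustrative only: demo_dir, gen_file and the count of 5 are assumptions
demo_dir=$(mktemp -d)
pids=()

gen_file() {
  # $1 = output path; width and size use the same ranges as the main loop
  local width size
  width=$(shuf -i 60-280 -n 1)
  size=$(shuf -i 512-262144 -n 1)
  # head exits after $size bytes, which ends the otherwise endless pipeline
  LC_ALL=C tr -dc '[:alnum:]' </dev/urandom | fold -w "$width" | head -c "$size" > "$1"
}

for j in $(seq -w 001 005); do
  gen_file "${demo_dir}/file_${j}" &   # run each generator in the background
  pids+=($!)                           # remember its PID
done

# Block until all background jobs have exited
wait "${pids[@]}"

file_count=$(find "$demo_dir" -type f | wc -l)
echo "created ${file_count} files in ${demo_dir}"
```

Without the wait, anything that runs right after the loop (a backup or a checksum pass, say) may see files that are still being written.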

Igor

Experienced Unix/Linux System Administrator with 20-year background in Systems Analysis, Problem Resolution and Engineering Application Support in a large distributed Unix and Windows server environment. Strong problem determination skills. Good knowledge of networking, remote diagnostic techniques, firewalls and network security. Extensive experience with engineering application and database servers, high-availability systems, high-performance computing clusters, and process automation.
