Originally published December 26, 2017 @ 10:44 pm
In this scenario, a user emailed some other user something that probably should not have been emailed. You don’t know who the users are or exactly what they sent. What you have is a bunch of PST files and a list of keywords. And this shell script.
Here’s how the script works. You dump the PST files into $indir (/downloads/input in the script below). I suggest using a local filesystem rather than a network-mounted one, for both performance and security reasons. Considering the potentially sensitive nature of this data, you may want to set up an encrypted filesystem that does not auto-mount on startup.
You then create the $keyword_list file containing one keyword per line and encrypt it with gpg using a passphrase. The script looks for the encrypted copy at ${keyword_list}.gpg and will prompt you for that passphrase when you launch it. Depending on the volume of PSTs, the conversion process may take some time.
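For what it's worth, the encryption step might look something like the sketch below. It assumes GnuPG 2.1 or later, which only honors a passphrase supplied in batch mode when --pinentry-mode loopback is also given. The scratch directory, the sample keywords and the demo passphrase are all made up for illustration, and the passphrase is fed on stdin so that it does not appear in the process list.

```shell
#!/bin/bash
# Sketch: build a keyword list and encrypt it symmetrically with gpg.
# Assumptions (not from the script): GnuPG 2.1+, a scratch directory from
# mktemp, and a hard-coded demo passphrase. Substitute your own values.

workdir="$(mktemp -d)"
keyword_list="${workdir}/keywords.txt"
demo_pass="demo-passphrase"    # demo only; prompt for a real one

# One keyword per line
printf '%s\n' "keyword_01" "keyword_02" "keyword_03" > "${keyword_list}"
original="$(cat "${keyword_list}")"
decrypted=""

if command -v gpg >/dev/null 2>&1; then
    # Keep the demo away from your real keyring
    export GNUPGHOME="${workdir}/gnupg"
    mkdir -m 700 "${GNUPGHOME}"

    # Encrypt; the passphrase is read from stdin (file descriptor 0)
    printf '%s\n' "${demo_pass}" |
        gpg --batch --yes --pinentry-mode loopback --passphrase-fd 0 \
            --symmetric --output "${keyword_list}.gpg" "${keyword_list}" 2>/dev/null
    chmod 600 "${keyword_list}.gpg"

    # Decrypt again to confirm the passphrase works before removing the plaintext
    decrypted="$(printf '%s\n' "${demo_pass}" |
        gpg --batch --pinentry-mode loopback --passphrase-fd 0 \
            --decrypt "${keyword_list}.gpg" 2>/dev/null)"

    if [ "${decrypted}" = "${original}" ]; then
        rm -f "${keyword_list}"
        echo "Encrypted list written to ${keyword_list}.gpg"
    else
        echo "Round trip failed; plaintext kept at ${keyword_list}" >&2
    fi

    # Stop the agent that gpg started for the scratch keyring
    gpgconf --kill gpg-agent 2>/dev/null
else
    echo "gpg not found; skipping the demo" >&2
fi
```

Checking the round trip before deleting the plaintext is a deliberate choice: a mistyped passphrase would otherwise leave you with a list you cannot read back.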
As the script digs through the emails, you may start seeing something along these lines:
PST: /downloads/input/username@domain.com
---------------------------------------
FILE: /downloads/input/username@domain.com/Inbox/104.eml
KEYS: keyword_01,keyword_07
DATE: Mon, 9 Jun 2010 20 08 16 -0400
FROM: username2@domain.com
TO: username@domain.com
SUBJ: Email subject

FILE: /downloads/input/username@domain.com/Inbox/269.eml
KEYS: keyword_03
DATE: Mon, 7 Jul 2010 12 58 29 -0400
FROM: username3@domain.com
TO: username@domain.com
SUBJ: Another email subject
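The DATE, FROM, TO and SUBJ fields in each record are lifted from the headers of the matching .eml file. A minimal sketch of that kind of header extraction follows. The sample message and the helper names header_value and header_addresses are made up for illustration and are not part of the script. Unlike the script's awk -F: approach, which is why the times above show up with spaces instead of colons, this version strips only the header name, so colons in the value survive. It looks at the first matching line only and does not handle folded (multi-line) headers.

```shell
#!/bin/bash
# Sketch: pull report fields out of a single .eml file.
# The sample message and the helper names are hypothetical.

sample_eml="$(mktemp)"
cat > "${sample_eml}" << 'EOF'
Date: Mon, 7 Jun 2010 20:08:16 -0400
From: "User Two" <username2@domain.com>
To: username@domain.com
Subject: Re: Email subject
Message-ID: <abc123@domain.com>

Body text mentioning keyword_01.
EOF

# Print the value of the first header line with the given name
header_value() {
    local name="$1" file="$2"
    grep -m1 -i "^${name}:" "${file}" | sed -E 's/^[^:]*:[[:space:]]*//'
}

# Print only the e-mail addresses from the first matching header line,
# sorted, de-duplicated and joined with commas
header_addresses() {
    local name="$1" file="$2"
    grep -m1 -i "^${name}:" "${file}" |
        grep -oiE '[A-Z0-9._%+-]+@([A-Z0-9-]+\.)+[A-Z]{2,}' |
        sort -u | paste -sd, -
}

echo "DATE: $(header_value Date "${sample_eml}")"
echo "FROM: $(header_addresses From "${sample_eml}")"
echo "TO:   $(header_addresses To "${sample_eml}")"
echo "SUBJ: $(header_value Subject "${sample_eml}")"
```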
You can then open the listed .eml files for more detail. I am sure there is a more civilized tool for this task, but all I had was bash. The script is below.
#!/bin/bash
# |
# ___/"\___
# __________/ o \__________
# (I) (G) \___/ (O) (R)
# 2017-12-15
# ----------------------------------------------------------------------------
# Convert *.pst mailbox files to text and scan for keywords
# ----------------------------------------------------------------------------
#
readpass() {
    # Read your GPG passphrase without echoing it to the terminal
    echo -n "Password: "
    read -rs p
    echo
    if [ -z "${p}" ]; then
        echo "No passphrase entered" >&2
        exit 1
    fi
}
configure() {
    # Install readpst, if not there already
    if [ ! -x /usr/bin/readpst ]; then
        yum -y install libpst.x86_64 || exit 1
    fi
    # Install the Silver Searcher, if not there already
    if [ ! -x /usr/bin/ag ]; then
        yum -y install the_silver_searcher || exit 1
    fi
    # Install GPG, if not there already
    if [ ! -x /usr/bin/gpg ]; then
        yum -y install gpg || exit 1
    fi
    # Put your *.pst files in here
    indir="/downloads/input"
    # Put your keywords in here, one per line
    # and encrypt it like so:
    # gpg --batch --symmetric --passphrase "${p}" "${keyword_list}" 2>/dev/null
    # chmod 600 "${keyword_list}.gpg"
    # /bin/rm -f "${keyword_list}"
    keyword_list="/tmp/keywords.txt"
    if [ ! -r "${keyword_list}.gpg" ]; then
        echo "Cannot read ${keyword_list}.gpg" >&2
        exit 1
    fi
    # Just in case you forgot
    chmod 400 "${keyword_list}.gpg"
}
extractpst() {
    # Find and convert *.pst files to text, one readpst job per CPU
    find "${indir}" -maxdepth 1 -mindepth 1 -type f -name "*.pst" | while IFS= read -r pst; do
        cd "${indir}" && readpst -j "$(grep -c processor /proc/cpuinfo)" -b -e "${pst}"
    done
}
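extractkeywords() {
    # Read keyword list into an array
    IFS=$'\n'; a=($(gpg --batch --decrypt --passphrase "${p}" "${keyword_list}.gpg" 2>/dev/null)); unset IFS
    # An empty array usually means a wrong passphrase; an empty pattern is useless
    if [ "${#a[@]}" -eq 0 ]; then
        echo "No keywords decrypted from ${keyword_list}.gpg" >&2
        exit 1
    fi
    # Join the keywords into one alternation pattern: key1|key2|key3
    s=$(for ((i = 0; i < ${#a[@]}; i++)) ; do echo -n "${a[$i]}|" ; done | sed 's/|$//g')
}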
findkeywords() {
    # Message-IDs already reported for this PST folder
    c=()
    # Files in this folder that contain at least one keyword
    IFS=$'\n'; b=($(ag -c "${s}" "${pst_folder}" | awk -F: '{print $1}' | sort -u)); unset IFS
    echo "PST: ${pst_folder}"
    echo "---------------------------------------"
    for ((i = 0; i < ${#b[@]}; i++)) ; do echo "${b[$i]}" ; done | while IFS= read -r line; do
        message_id="$(grep -oP -m1 "(?<=Message-ID: <).*(?=>$)" "${line}")"
        # Skip a message only if its exact Message-ID was already reported;
        # messages without a Message-ID cannot be de-duplicated, so report them
        if [ -z "${message_id}" ] || [ "$(printf '%s\n' "${c[@]}" | grep -cFx -- "${message_id}")" -eq 0 ]; then
            cat << EOF
FILE: ${line}
KEYS: $(grep -oP "${s}" "${line}" | sort -u | tr '\n' ', ' | sed -r 's/,$//g')
DATE: $(grep -P -m1 "^Date:" "${line}" | awk -F: '{$1=""; print $0}' | sed 's/^ //g')
FROM: $(grep -P -m1 "^From:" "${line}" | grep -Po '(?i)\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,}\b' | sort -u | tr '\n' ', ' | sed -r 's/,$//g')
TO: $(grep -P -m1 "^To:" "${line}" | grep -Po '(?i)\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,}\b' | sort -u | tr '\n' ', ' | sed -r 's/,$//g')
SUBJ: $(grep -P -m1 "^Subject:" "${line}" | awk -F: '{$1=""; print $0}' | sed 's/^ //g')
EOF
            echo
            c+=("${message_id}")
        fi
    done
}
find_do() {
    # Search and parse
    SAVEIFS=$IFS
    IFS=$(echo -en "\n\b")
    for pst_folder in $(find "${indir}" -maxdepth 1 -mindepth 1 -type d); do
        findkeywords
    done
    IFS=$SAVEIFS
}
# RUNTIME
readpass
configure
extractpst
extractkeywords
find_do
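
Two ideas carry most of the script: the keywords are joined into a single alternation pattern, and a message is reported only once per Message-ID. A stripped-down sketch of both, outside the script, might look like this. The function names join_keywords and first_sighting and the sample data are hypothetical, and the associative array assumes bash 4 or later. Keep in mind that the joined keywords are treated as one regular expression by the search tools, so a keyword containing characters such as . or | will behave as a pattern rather than as literal text.

```shell
#!/bin/bash
# Sketch: join a one-per-line keyword list into an alternation pattern and
# report each Message-ID only once. Names and sample data are hypothetical.

# Join the non-empty lines read from stdin with "|"
join_keywords() {
    local line pattern=""
    while IFS= read -r line; do
        [ -n "${line}" ] || continue                  # ignore blank lines
        pattern="${pattern:+${pattern}|}${line}"      # add "|" only between items
    done
    printf '%s\n' "${pattern}"
}

pattern="$(printf '%s\n' "alpha" "" "beta" "gamma" | join_keywords)"
echo "Pattern: ${pattern}"

# Message-IDs seen so far (associative arrays need bash 4 or later)
declare -A seen=()

# Succeeds the first time an ID is offered and fails for repeats.
# An empty ID always succeeds, since it cannot be de-duplicated.
first_sighting() {
    local id="$1"
    [ -n "${id}" ] || return 0
    [ -z "${seen[${id}]:-}" ] || return 1
    seen["${id}"]=1
}

reported=()
for id in "a@x" "b@x" "a@x"; do
    if first_sighting "${id}"; then
        reported+=("${id}")
    fi
done
echo "Reported: ${reported[*]}"
```

The lookup table replaces the linear scan over an array of IDs, which starts to matter once a mailbox holds thousands of matching messages.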
