
Plundering Facebook Photo Albums

November 19, 2023


Originally published November 19, 2020 @ 3:50 pm

Let’s imagine you need to download all the photos in a Facebook photo album. It can be a public album, a friend’s, or even your own. Sure, you can do this manually, but you probably don’t want to. And so I came up with a little bit of automation.

For any of the stuff below to work, you need to have a Facebook account with access to the photo album in question.

The excellent nixCraft blog by Vivek Gite has an equally wonderful Facebook page with lots of techie memes. I like memes. I like to steal memes. And save them for the time when our civilization destroys itself, and I will become the king of memes.

The only problem was that Mr. Gite has accumulated over four thousand of them in his Timeline Photos album. Contrary to popular belief, I don’t have the kind of time it would take to right-click on every picture in that album and save it.

Making the list

So the first step is to get a list of URLs for the photos you want to download. The simplest option is to go to the album, scroll to the bottom of it by hitting the “End” key like a maniac, and then use one of the URL clipper extensions. I use Firefox (and so should you), and my favorite URL picker is Link Gopher (also available for Chrome).

An alternative to scrolling to the bottom of the photo album yourself is to use an extension like FoxScroller. This URL-gathering step can also be entirely automated with the headless-browser process described below, but I had no time to play with this.

The list of links to the photos in a Facebook photo album would look something like this:

https://www.facebook.com/${user_id}/photos/a.${album_id}/${photo_id}/

# Example:

https://www.facebook.com/nixcraft/photos/a.431194973560553/3411636868849667/
https://www.facebook.com/nixcraft/photos/a.431194973560553/3411657742180913/
https://www.facebook.com/nixcraft/photos/a.431194973560553/3411663615513659/
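A URL clipper will usually hand you every link on the page, not just the photo links. Here is a minimal sketch of a filter that keeps only the links matching the pattern above. The function name filter_album_links is made up for illustration, and the assumption that stray query strings can be safely dropped is mine, so check it against your own list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read URLs on stdin, keep only album photo links
# shaped like https://www.facebook.com/<user>/photos/a.<album_id>/<photo_id>/
# and drop duplicates. The album ID is passed as the first argument.
filter_album_links() {
    local album_id="$1"
    # Strip any query string or fragment (assumption: they are not needed),
    # keep lines that match the album photo pattern, then de-duplicate.
    sed -E 's/[?#].*$//' \
        | grep -E "^https://www\.facebook\.com/[^/]+/photos/a\.${album_id}/[0-9]+/?$" \
        | sort -u
}
```

Usage would be something like `filter_album_links 431194973560553 < all_links.txt > photo_links.txt`, where both file names are placeholders.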

The problem with these links is that they are not the photos you want to download. These links will lead you to a JavaScript-infested page that will require authentication before it will generate a dynamic link to the actual photo. And that link will only work for you and your current login session. And it will expire soon too.

Because all of this complexity uses JS, you can’t use wget or curl. Ideally, what you need is a scriptable headless browser that supports JavaScript. PhantomJS is one of them, and that’s what I’ll use.

Getting the cookies

The next step is to authenticate with Facebook and save the session cookies for later use. The basic syntax is this:

phantomjs --load-images=true --local-storage-path=/tmp \
--disk-cache=true --disk-cache-path=/tmp \
--cookies-file=${cookies} --ignore-ssl-errors=true \
--ssl-protocol=any --web-security=true \
${basedir}/scripts/facebook_login.js "${ua}" "${login_url}"

# where
login_url="https://www.facebook.com/login/"
ua="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"

The key element here is the facebook_login.js script that you can get here. You will need to edit this file to insert your Facebook login credentials.

Grabbing the photos

The next move is to loop through the list of URLs you collected earlier, open them in PhantomJS, extract the dynamically-generated image link, and download it.

Here’s the basic syntax:

phantomjs --load-images=true --local-storage-path=/tmp \
--disk-cache=true --disk-cache-path=/tmp --cookies-file=${cookies} \
--ignore-ssl-errors=true --ssl-protocol=any --web-security=true \
${basedir}/scripts/phantomjs_render.js "${ua}" "${url}" \
${basedir}/tmp/${project}.png

The important component here is the phantomjs_render.js script that you can grab here. In addition to rendering the Web page, this script will also make a screenshot of it. This is not strictly necessary in our case. I was just too lazy to edit out this feature.
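The loop around that command could look roughly like the sketch below. This is not the author's script: process_url_list and the handler argument are hypothetical names, and the handler stands in for whatever function wraps the PhantomJS call and the download for a single URL.

```shell
#!/usr/bin/env bash
# Hypothetical driver loop: read one photo-page URL per line from a list file
# and pass each one to a handler command given as the second argument.
process_url_list() {
    local list_file="$1" handler="$2" url
    # The "|| [ -n ... ]" part picks up a last line with no trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines and lines that start with '#'
        case "$url" in
            ''|'#'*) continue ;;
        esac
        # Redirect stdin so the handler cannot swallow the rest of the list;
        # a failure on one URL is reported and the loop carries on.
        "$handler" "$url" < /dev/null || echo "WARN: failed on ${url}" >&2
    done < "$list_file"
}
```

Calling it with `echo` as the handler, as in `process_url_list photo_links.txt echo`, is a cheap dry run before pointing it at the real thing.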

Finally, we need to extract the correct image URL from the dynamically-generated Facebook HTML diarrhea. This piece here can use a bit more work, but it works for the most part.

# Get the photo file extension
ext="$(grep -oP "(?<=\.)[a-z]{3,4}(?=\?_nc_cat)" \
"${basedir}/tmp/temp.html" | head -1)"

# Extract the first photo URL, strip the JSON backslash escapes (\/ -> /),
# and wget it
wget -q "$(grep -oP "(?<=\"image\":\{\"uri\":\")[^\"]*(?=\",\"width\")" \
"${basedir}/tmp/temp.html" | head -1 | sed 's@\\@@g')" -O \
"${basedir}/data/${project}/${project}_$(shuf -i 100000-999999 -n 1).${ext}"
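To see what the extraction does without hitting Facebook, here is a small sketch you can run against a sample string. It assumes GNU grep with -P support. The function names are hypothetical, and the JSON shape is simply the one the regex above expects, so treat this as an illustration rather than a description of what Facebook serves today. The second helper is an optional alternative to the random number: it names the file after the photo ID in the page URL, which makes reruns overwrite rather than duplicate.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read rendered page HTML on stdin and print the first
# "image":{"uri":"..."} value with the JSON backslash escapes removed.
extract_image_uri() {
    grep -oP '(?<="image":\{"uri":")[^"]+(?=","width")' \
        | head -n 1 \
        | sed 's@\\@@g'
}

# Hypothetical helper: print the numeric photo ID, which is the last path
# component of an album photo URL.
photo_id_from_url() {
    local url="${1%/}"          # drop one trailing slash, if present
    printf '%s\n' "${url##*/}"  # keep everything after the last slash
}
```

For example, `extract_image_uri < "${basedir}/tmp/temp.html"` would print the cleaned-up image link, and `photo_id_from_url "${url}"` would give you a stable file name stem.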

To make things a bit more user-friendly, I put together this little script that should do all this stuff. Hopefully.

Igor

Experienced Unix/Linux System Administrator with 20-year background in Systems Analysis, Problem Resolution and Engineering Application Support in a large distributed Unix and Windows server environment. Strong problem determination skills. Good knowledge of networking, remote diagnostic techniques, firewalls and network security. Extensive experience with engineering application and database servers, high-availability systems, high-performance computing clusters, and process automation.
