4chan Scraper | Ben Pettis
Ben Pettis

4chan Scraper

January 01, 2019

A screenshot of the 4chan /pol/ politically incorrect imageboard.

This simple Python script uses 4chan's read-only API to scrape the front page of a given imageboard. It saves every image posted to the board and generates CSV files that record which threads were on the front page at a given time. Each thread gets its own folder of images and its own CSV file recording every reply in the thread. I have done some research on anonymous online communities, the ways their members communicate with one another, and how they are able to influence events in the physical world. Rather than browsing and downloading content from 4chan by hand, I built this script to collect the most recent content from a board automatically.
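To make the workflow concrete, here is a minimal sketch of how such a scraper could be put together. It is not the script itself. It assumes the endpoints described in 4chan's public API documentation (`a.4cdn.org/{board}/1.json` for the first index page, `a.4cdn.org/{board}/thread/{no}.json` for a thread, `i.4cdn.org` for images) and the post fields `no`, `sub`, `replies`, `tim`, and `ext`. The function names, CSV columns, and file layout are hypothetical, chosen only for illustration, and the per-thread reply CSV is left out for brevity.

```python
import csv
import io
import json
import os
import time
import urllib.request

API_ROOT = "https://a.4cdn.org"    # assumed JSON host, per the public API docs
IMAGE_ROOT = "https://i.4cdn.org"  # assumed media host
FIELDS = ["fetched_at", "thread_no", "subject", "replies"]  # hypothetical columns


def fetch_json(url):
    """Fetch and decode one JSON document."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def front_page_rows(index_page, fetched_at):
    """Flatten an index page into one row per thread, using its opening post."""
    rows = []
    for thread in index_page.get("threads", []):
        op = thread["posts"][0]
        rows.append({
            "fetched_at": fetched_at,
            "thread_no": op["no"],
            "subject": op.get("sub", ""),
            "replies": op.get("replies", 0),
        })
    return rows


def image_url(board, post):
    """Return the image URL for a post, or None if it has no attachment."""
    if "tim" not in post:
        return None
    return f"{IMAGE_ROOT}/{board}/{post['tim']}{post['ext']}"


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dict rows to CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def scrape_front_page(board, out_dir="."):
    """Save a front-page CSV, then each thread's images into its own folder."""
    fetched_at = int(time.time())
    rows = front_page_rows(fetch_json(f"{API_ROOT}/{board}/1.json"), fetched_at)
    csv_path = os.path.join(out_dir, f"{board}_{fetched_at}.csv")
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        f.write(rows_to_csv(rows, FIELDS))
    for row in rows:
        time.sleep(1)  # pause between requests to stay polite to the API
        thread = fetch_json(f"{API_ROOT}/{board}/thread/{row['thread_no']}.json")
        thread_dir = os.path.join(out_dir, str(row["thread_no"]))
        os.makedirs(thread_dir, exist_ok=True)
        for post in thread["posts"]:
            url = image_url(board, post)
            if url:
                time.sleep(1)
                name = f"{post['tim']}{post['ext']}"
                urllib.request.urlretrieve(url, os.path.join(thread_dir, name))


if __name__ == "__main__":
    scrape_front_page("pol")
```

Keeping the parsing functions separate from the network calls means they can be checked against a saved JSON response without touching the live site. The one-second pauses are a conservative guess at polite usage; check the API's current rules before running anything like this at scale.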