Published a dataset
https://cryptics.eigenfoo.xyz/

cryptics.eigenfoo.xyz is a dataset of cryptic crossword clues, collected from various blogs and publicly available digital archives. I originally started this project to practice my web scraping and data engineering skills, but as it has evolved, I have come to hope that it can be a resource for solvers and constructors of cryptic crosswords.

The project scrapes several blogs and digital archives for cryptic crosswords. From the collected web pages, it parses out each clue, its answer and clue number, the blogger’s explanation and commentary, and the puzzle title and publication date, and extracts them into a tabular dataset. The result (as of September 2021) is over half a million clues from cryptic crosswords published over the past twelve years, which makes for a rich and peculiar dataset.
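
To give a rough idea of what the parsing step involves, here is a minimal Python sketch. It is not the project's actual code. The line layout it expects (`<number> <clue> (<enumeration>) <ANSWER> - <explanation>`), the names `ClueRow`, `CLUE_PATTERN`, `parse_clue_line` and `rows_to_csv`, and the sample clue, puzzle title and date are all assumptions made up for illustration. Real blogs each format their posts differently, so a real scraper would likely need a separate parser per source.

```python
import csv
import io
import re
from dataclasses import asdict, dataclass, fields
from typing import Iterable, Optional


@dataclass
class ClueRow:
    """One row of the tabular dataset (field names are illustrative)."""
    clue_number: str
    clue: str
    answer: str
    explanation: str
    puzzle_title: str
    publication_date: str


# Hypothetical layout: "12a Some clue text (6) ANSWER - blogger's explanation"
CLUE_PATTERN = re.compile(
    r"^(?P<clue_number>\d+[ad])\s+"           # e.g. "12a" or "7d"
    r"(?P<clue>.+\(\d+(?:[,-]\d+)*\))\s+"     # clue text ending in "(6)" or "(3,4)"
    r"(?P<answer>[A-Z]+(?:[ -][A-Z]+)*)\s+"   # answer in capitals
    r"-\s+(?P<explanation>.+)$"               # free-text explanation
)


def parse_clue_line(
    line: str, puzzle_title: str, publication_date: str
) -> Optional[ClueRow]:
    """Return a ClueRow for a line in the assumed layout, or None if it does not match."""
    match = CLUE_PATTERN.match(line.strip())
    if match is None:
        return None
    return ClueRow(
        puzzle_title=puzzle_title,
        publication_date=publication_date,
        **match.groupdict(),
    )


def rows_to_csv(rows: Iterable[ClueRow]) -> str:
    """Serialize parsed rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ClueRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample line, title and date, purely for demonstration.
    sample = (
        "12a Silent, perhaps, when rearranged: pay attention (6) "
        "LISTEN - anagram of SILENT"
    )
    row = parse_clue_line(sample, "Example Puzzle 1", "2021-09-01")
    if row is not None:
        print(rows_to_csv([row]))
```

Returning `None` for lines that do not match, rather than raising an exception, lets a scraper skip the commentary and other non-clue text that surrounds the clues in a typical blog post.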