I saved a website from "the Nothing" (or from very bad people).
This is the website now: wineblogroll.com
It was hosted on Blogspot and suddenly became inaccessible to its owner, wine blogger Francesco Saverio Russo, because of a malicious action.
"Everything is lost!" he exclaimed. "It can't be lost!" I told myself and sent him a message on Facebook.
Long story short:
- with a Ruby gem (by Hartator) I recovered all the pages from the Wayback Machine and saved the articles' HTML files (I found 1027 articles)
- with a custom Python script of my own I changed the "date" information in the HTML files so that the WordPress plugin HTML Import 2 (by Stephanie Leary) would interpret it correctly
- HTML Import 2 was set up on the new WordPress website by my friend and colleague Alessio Dichio, and then (after some joyful testing) I ran the mass import of all the edited HTML files
- I set up some redirects
- Done
- bonus track: knowing that the images wouldn't stay long on Blogspot's servers, I wrote a second Python script to find every image link in the HTML files and download it. Result: 4307 images downloaded automatically.
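For the curious, the image step can be sketched roughly like this. It is not the original script, just a minimal illustration using only the Python standard library. It assumes the images appear as plain `<img src="...">` tags, and the names `ImageLinkParser`, `extract_image_links` and `download_images` are made up for this example.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class ImageLinkParser(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.links.append(src)


def extract_image_links(html_text, base_url=""):
    """Return the unique image URLs found in html_text, in document order.

    Relative links are resolved against base_url (assumed to be the
    address the page was originally served from).
    """
    parser = ImageLinkParser()
    parser.feed(html_text)
    parser.close()
    absolute = (urljoin(base_url, link) for link in parser.links)
    # dict.fromkeys removes duplicates while keeping the first-seen order
    return list(dict.fromkeys(absolute))


def download_images(links, target_dir):
    """Download every URL in links into target_dir (needs network access)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for index, link in enumerate(links):
        # Prefix a counter so two images with the same file name don't clash
        name = Path(urlparse(link).path).name or "image"
        destination = target / f"{index:05d}_{name}"
        with urlopen(link) as response:
            destination.write_bytes(response.read())
```

In practice you would loop over the saved HTML files, pass each file's text to `extract_image_links`, and hand the resulting list to `download_images`. A real run would also want error handling for links that no longer resolve.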
Lessons learned:
- always make a backup of everything (more than one is better)
- once something is on the web, it's hard for it to disappear (the Wayback Machine can save you a lot of time)
- Python is a great programming language: accessible even to non-experts and extremely powerful