Use PHP DOM Parser for more robust screen scraping

December 5, 2009

I’d just like to put this out there, as I just “failed” a “do-at-home” interview assignment: implementing a screen scraper in Java or PHP. I had written screen scrapers in PHP before (1-2 years ago), so I approached this assignment the same way, using regexes. Little did I know that the regexes would be one of the weak points of my submission; they wanted me to use a DOM parser instead. In hindsight, I should have looked into that, but it never occurred to me because other methods had worked for me in the past.

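So the moral of the story is to use a DOM parser when writing a screen scraper. In most cases it should be more robust than regex parsing, because it works on the parsed document tree rather than on the raw text, so changes in whitespace, attribute order, or nested tags are less likely to break it. Here is an example tutorial.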
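To make the difference concrete, here is a minimal sketch using PHP's built-in DOM extension (DOMDocument and DOMXPath). The sample markup, the "title" class, and the extractLinks() helper are all made up for illustration; a real scraper would fetch the page over HTTP and use whatever selectors fit the target site.

```php
<?php
// Hypothetical sample page. Note the line break inside the second <a> tag
// and the nested <b> element, both of which tend to trip up simple regexes.
$html = <<<HTML
<html><body>
  <ul id="posts">
    <li><a class="title" href="/a">First post</a></li>
    <li><a class="title"
           href="/b">Second <b>post</b></a></li>
  </ul>
</body></html>
HTML;

// Hypothetical helper: returns the href and text of every <a class="title">.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed, so buffer libxml's parse warnings
    // instead of letting them be emitted, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Query the parsed tree rather than pattern-matching the raw text.
    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//a[@class="title"]') as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
    return $links;
}

$links = extractLinks($html);
print_r($links);
```

The XPath expression only cares about the element and its class attribute, so the query does not need to change when the markup is reformatted, which is roughly the robustness the interviewers were probably after.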


POV-Ray Renderings in Space

December 4, 2009

While using POV-Ray in my Graphics class, I stumbled upon this story: apparently Mark Shuttleworth (of digital certificates and Ubuntu fame) rendered some POV-Ray scenes in space, on the International Space Station. Awesome!