Archive for the ‘web development’ Category

Use PHP DOM Parser for more robust screen scraping

December 5, 2009

I’d just like to put this out there, since I just “failed” a “do-at-home” interview assignment: implement a screen scraper in Java or PHP. I had written screen scrapers in PHP before (1-2 years ago), so I did this assignment the same way, using regexes. Little did I know that the regexes would be one of the weak points of my submission; they wanted me to use a DOM parser instead. In hindsight I should have looked into that, but it never occurred to me because I had always used other methods in the past.

So the moral of the story: use a DOM parser when writing screen scrapers. It should be more robust than regex parsing in most cases, because it works from the parsed document structure rather than from the exact text of the markup. Here is an example tutorial.

Hide Root Node of XML Bound TreeView (ASP.NET)

June 28, 2008

This blog has moved. Please visit http://suanaikyeo.com instead.

You have a TreeView with an XML file as its data source, but an XML file must have a single root element, and that element shows up in your TreeView. A lot of us don’t want it to show up, and there’s no obvious way to hide it. After an hour or so of searching for a solution, I found it here. This solution really needs to be more visible on the web!

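Basically you just need to set the XmlDataSource’s XPath property to XPath="/*/*". In XPath, the first /* selects the root element and the second /* selects each of its child elements, so the TreeView binds to the root’s children and the root itself is never shown. You can read up more about it here.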

Your Domain Name in XAMPP

May 4, 2008

This probably applies to many other things when working with XAMPP or any web server on your own machine, but I specifically used it with AJAX. Just determine your LAN (local/internal) IP address and use that IP address as your domain name. For example, if this is what you had on another web server,

var myDomain = "www.domainName.com/location/"; //where your proxy file is placed
var proxy_name = "phpProxy.php";  //proxy's filename
var actualReq = "http://search.yahoo.com/blablabla?=...."; //actual request
var url = "http://"+myDomain+proxy_name+"/?"+actualReq;
httpRequest("GET",url,true,handleResponse);
...

and your local IP is, for example, 192.168.1.100, replace

www.domainName.com/

with

192.168.1.100/

Also, when you open the file in your browser, use “http://192.168.1.100/….” instead of “http://localhost/….”.
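
One way to make this swap less error-prone is to keep the host in a single variable, so moving between the remote server and your local machine is a one-line change. The sketch below is only an illustration: buildProxyUrl, PROXY_HOST, the example IP address and the example.com request are all made-up names and values, not part of the original code.

```javascript
// Hypothetical helper (not from the original post): assembles the proxy URL
// in the same way as the snippet above, with the host passed in separately.
function buildProxyUrl(host, path, proxyName, actualReq) {
  return "http://" + host + path + proxyName + "/?" + actualReq;
}

// Assumption: this is your machine's LAN address; use your own.
var PROXY_HOST = "192.168.1.100";

// Assumption: the proxy file lives under /location/ and the request is a placeholder.
var url = buildProxyUrl(PROXY_HOST, "/location/", "phpProxy.php",
                        "http://example.com/feed?q=1");
```

To switch back to the remote server, only PROXY_HOST needs to change (for example to "www.domainName.com").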

Happy XAMPPing!

PHP Proxy for Cross-Domain Requests That Works!

May 3, 2008

So I’ve been working on some AJAX stuff, specifically with Yahoo! Weather, and I realized I needed some way to make cross-domain XMLHttpRequest calls. A PHP proxy seemed to me the easiest way to go, so I first tried what Yahoo! suggested. I then tried some similar methods from others here and here, both to no avail. Finally I asked around on forums and made the changes needed to get Yahoo’s example working. Just thought I’d share that with those of you who have similar troubles.

This is the PHP Proxy that I used. Note that:

  1. Your webserver’s PHP needs to support cURL
  2. This code doesn’t support POST requests, only GET.

<?php
// PHP Proxy example for Yahoo! Web services.
// This modified version handles HTTP GET requests only
//
// Author: Jason Levitt
// December 7th, 2005
// (Modified by Suan-Aik Yeo)
// Everything after the "?" in the proxy URL is used as the target URL
$url = $_SERVER["QUERY_STRING"];
// Open the Curl session
$session = curl_init($url);
// Don't return HTTP headers. Do return the contents of the call
curl_setopt($session, CURLOPT_HEADER, false);
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);
// Make the call
$xml = curl_exec($session);
curl_close($session);
// curl_exec() returns false on failure; report that instead of sending an empty body
if ($xml === false) {
    header("HTTP/1.1 502 Bad Gateway");
    exit;
}
// The web service returns XML. Set the Content-Type appropriately
header("Content-Type: text/xml");
echo $xml;
?>

When you’re using JavaScript to actually make the request, do something like this:

var proxy_name = "phpProxy.php";  //proxy's filename
var actualReq = "http://search.yahoo.com/blablabla?=...."; //actual request
var myDomain = "www.domainName.com/location/"; //where your proxy file is placed
var url = "http://"+myDomain+proxy_name+"/?"+actualReq;

//go on to do the usual httprequest steps...
httpRequest("GET",url,true,handleResponse);
...
...
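
The httpRequest function called above isn’t shown in this post. If you don’t already have something like it, a minimal sketch could look like the following. It is an assumption about what such a helper does, not the original code: the parameter names, the optional XhrCtor argument (only there so the function can be tried outside a browser) and the choice to pass the whole request object to the callback are all made up for illustration.

```javascript
// Hypothetical stand-in for the httpRequest() helper used above.
// method:     "GET" (the proxy in this post only handles GET)
// url:        the full proxy URL
// isAsync:    true for an asynchronous request
// onResponse: callback that receives the request object once it completes
// XhrCtor:    optional constructor, defaults to the browser's XMLHttpRequest
function httpRequest(method, url, isAsync, onResponse, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.onreadystatechange = function () {
    // readyState 4 means the request has finished
    if (xhr.readyState === 4) {
      onResponse(xhr);
    }
  };
  xhr.open(method, url, isAsync);
  xhr.send(null);
  return xhr;
}
```

Inside the callback, the proxy’s XML is available as xhr.responseXML (or as plain text in xhr.responseText), and xhr.status tells you whether the call succeeded.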

Hope this helps!