Using Curl 101

To use Curl in PHP, PHP must be compiled with the Curl extension (--with-curl). You’ll also want --with-openssl if you need to be able to hit https pages.
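
A quick way to confirm both pieces are present is to ask PHP itself. The sketch below is only an illustration: the helper name curl_supports_https() is made up for this article, and it assumes the array returned by curl_version() carries a 'protocols' list, as the PHP manual describes.

```php
<?php
// Hypothetical helper (not part of PHP): given the array returned by
// curl_version(), report whether https appears in the protocol list.
function curl_supports_https(array $version)
{
    return isset($version['protocols'])
        && in_array('https', $version['protocols'], true);
}

if (!extension_loaded('curl')) {
    print "Curl extension is missing; rebuild PHP with --with-curl\n";
} else {
    $version = curl_version();
    print "libcurl version: " . $version['version'] . "\n";
    print "https support: " . (curl_supports_https($version) ? "yes" : "no") . "\n";
}
```

If https support comes back as "no", the Curl library PHP was linked against was most likely built without SSL.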

Getting PHP working with Curl is a topic of its own; this article assumes that part is done and focuses on how to use it.

The code below is for PHP5, but I’m sure you could modify it to work with PHP4; you would just have to change the syntax a bit.

Here is a Curl class that I wrote. I call it class.Curl.php. Don’t worry about the details of this file; just create it, then look at how easy it is to use (example follows):

<?php
define('VERIFYHOST', false);
define('MAXREDIRS', 10);
#define('USERAGENT', "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
define('USERAGENT', "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0");

class Curl
{
  public $url;
  public $response_code;
  public $response_header;
  public $response_headers;
  public $response_body;
  public $cookieJar;
  private $response;
  private $ch;
  static $stderr = null;

  public function __construct($cookieJar=false)
  {
     $this->cookieJar = $cookieJar ? $cookieJar : tempnam("/tmp", "cookieJar");
     $this->ch = curl_init();
     curl_setopt($this->ch, CURLOPT_SSL_VERIFYPEER, VERIFYHOST);
     curl_setopt($this->ch, CURLOPT_SSL_VERIFYHOST, VERIFYHOST);
     curl_setopt ($this->ch, CURLOPT_USERAGENT, USERAGENT);
     curl_setopt ($this->ch, CURLOPT_COOKIEJAR, $this->cookieJar);
     curl_setopt ($this->ch, CURLOPT_COOKIEFILE, $this->cookieJar);
     curl_setopt ($this->ch, CURLOPT_CRLF, true);
     curl_setopt ($this->ch, CURLOPT_HEADER, true);
     curl_setopt ($this->ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
     curl_setopt ($this->ch, CURLOPT_ENCODING, "gzip"); // "" means all supported
     curl_setopt($this->ch, CURLOPT_FOLLOWLOCATION, true);
     curl_setopt($this->ch, CURLOPT_MAXREDIRS, MAXREDIRS);
     curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, true);
     curl_setopt($this->ch, CURLOPT_BINARYTRANSFER, true);
     curl_setopt($this->ch, CURLOPT_USERAGENT, USERAGENT);
     curl_setopt($this->ch, CURLOPT_CONNECTTIMEOUT, 300);

     curl_setopt($this->ch, CURLOPT_HTTPHEADER,
                            array(
                            "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
                            "Accept-Language: en-us,en;q=0.5",
                            "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7"
                            ));
     #curl_setopt($this->ch, CURLOPT_VERBOSE, true);
     #if ( self::$stderr == null )
     #  self::$stderr = fopen("/tmp/curlerror.log", "w");
     #fputs(self::$stderr, date("Y-m-d H:i:s")."\n");
     #curl_setopt($this->ch, CURLOPT_STDERR, self::$stderr);
  }

  public function __destruct()
  {
    // curl_close($this->ch);
  }

  public function nextpage($url, $method='GET', $data=false, $referer=false, $extraPost=false)
  {
     $this->url = $url;
     if ( $referer )
       curl_setopt ($this->ch, CURLOPT_REFERER, $referer);
     if (strtoupper($method)=='POST')
     {
       curl_setopt($this->ch, CURLOPT_POST, 1);
       // URL-encode each field; $data may be false when only $extraPost is sent
       $postdata = array();
       if ( is_array($data) )
         foreach ($data as $key=>$val)
           $postdata[] = urlencode($key)."=".urlencode($val);
       // $extraPost is an already-encoded query string fragment
       if ( $extraPost )
         $postdata[] = $extraPost;
       curl_setopt($this->ch, CURLOPT_POSTFIELDS, implode("&", $postdata));
     }
     else
       curl_setopt($this->ch, CURLOPT_HTTPGET, true);

     curl_setopt($this->ch, CURLOPT_URL, $url);

     $this->response = curl_exec($this->ch);
     $this->parse_response();

     $this->url = $this->getUrl();

     /*
     if ( preg_match("/\nLocation: ([^\r\n]+)/", $this->response_header, $matches) )
     {
       $url = $matches[1];
       $url = str_replace(":80", "", $url);

       return $this->nextpage($url);
     }
     */

     return $this->response_body;

     /*
     if ( preg_match("/\nLocation: ([^\n]+)/", $this->response_header, $matches) )
     {
       $this->__construct($matches[1], $this->cookieJar);
     }

     if ( preg_match("/\nRefresh: ([0-9]+); *URL=([^\n]+)/", $this->response_header, $matches) )
     {
       sleep(3);
       $this->__construct($matches[2], $this->cookieJar);
     }
     }
     */

  }

  private function parse_response()
  {
    // Split response into header and body sections
    // (split() was removed in PHP 7; preg_split() is the replacement)
    $parts = preg_split("/\r?\n\r?\n/", (string) $this->response, 2);
    $this->response_header = $parts[0];
    $this->response_body = isset($parts[1]) ? $parts[1] : "";

    $response_header_lines = preg_split("/\r?\n/", $this->response_header);

    // First line of headers is the HTTP status line
    $http_response_line = array_shift($response_header_lines);
    if(preg_match('@^HTTP/[0-9]\.[0-9] ([0-9]{3})@',$http_response_line, $matches))
    {
      $this->response_code = $matches[1];
    }

    // put the rest of the headers in an array
    $this->response_headers = array();
    $header = "";
    foreach($response_header_lines as $header_line)
    {
       // A line starting with a word character begins a new header;
       // anything else continues the previous one
       if ( preg_match("/^\w/", $header_line) )
         list($header,$value) = explode(': ', $header_line, 2);
       else
         $value = $header_line;
       // Repeated headers are joined with newlines
       if ( isset($this->response_headers[$header]) )
         $this->response_headers[$header] .= "\n" . $value;
       else
         $this->response_headers[$header] = $value;
    }
  }

  public function getUrl()
  {
    return curl_getinfo ( $this->ch, CURLINFO_EFFECTIVE_URL );
  }
}
?>
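
If you are curious what parse_response() does with the raw text Curl hands back, here is a small standalone sketch of the same idea that runs without a network connection. The function name split_http_response() and the canned response string are made up for illustration, and the sketch is simplified: it keeps only the last value of a repeated header and ignores continuation lines.

```php
<?php
// Hypothetical helper: split a raw HTTP response into status code,
// header array and body, roughly the way parse_response() does.
function split_http_response($raw)
{
    // Headers end at the first blank line
    $parts = preg_split("/\r?\n\r?\n/", $raw, 2);
    $lines = preg_split("/\r?\n/", $parts[0]);
    $body  = isset($parts[1]) ? $parts[1] : "";

    // Status line looks like "HTTP/1.1 200 OK"
    $code = null;
    if (preg_match('@^HTTP/[0-9]\.[0-9] ([0-9]{3})@', array_shift($lines), $m)) {
        $code = $m[1];
    }

    $headers = array();
    foreach ($lines as $line) {
        if (strpos($line, ': ') === false) {
            continue; // skip anything that is not "Name: value"
        }
        list($name, $value) = explode(': ', $line, 2);
        $headers[$name] = $value;
    }
    return array('code' => $code, 'headers' => $headers, 'body' => $body);
}

// Canned response, invented for this example
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nSet-Cookie: a=1\r\n\r\n<html>hi</html>";
$r = split_http_response($raw);
print $r['code'] . " " . $r['headers']['Content-Type'] . " " . $r['body'] . "\n";
```

In the class itself, the same pieces end up in $response_code, $response_headers and $response_body after each call to nextpage().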

OK, so now you have class.Curl.php. Here is how to use it; let’s do a USPS Track and Confirm:

<?php
  require_once("class.Curl.php");

  $url = "http://www.usps.gov/";
  $aCurl = new Curl();

  $homePage = $aCurl->nextpage($url);

  print "HOMEPAGE: $homePage\n";

  $url = "http://trkcnfrm1.smi.usps.com/PTSInternetWeb/InterLabelInquiry.do";
  $data = array();
  $data["CAMEFROM"] = "OK";
  $data["strOrigTrackNum"] = "9101150134711177503513";
  $data["Go to Track & Confirm"] = "Go";
  $resultPage = $aCurl->nextpage($url, 'POST', $data);

  print "Result Page: $resultPage\n";

?>

Pretty easy, eh? The hard part is knowing what data to put into the $data array. I suggest getting the Firefox HTTP LiveHeaders extension from Mozilla. With it, you can manually perform the steps you want to automate and log all the GET/POST data you sent along the way. The POST data is in URL-encoded format, so you need to convert it to PHP array format. For that I suggest this little utility I wrote:

I call this “urldecode.php”:

#!/usr/local/bin/php -q
<?php

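  $query = $argv[1];

  $parts = explode("&", $query);
  foreach ($parts as $part)
  {
    // Limit to 2 pieces so a value containing "=" stays intact;
    // array_pad() covers a key with no "=" at all
    list($key, $val) = array_pad(explode("=", $part, 2), 2, "");
    print "\$data[\"".urldecode($key)."\"] = \"".urldecode($val)."\";\n";
  }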
  }

?>

I use it like this, passing the string I got from HTTP LiveHeaders after performing the search manually with Firefox:

./urldecode.php 'CAMEFROM=OK&strOrigTrackNum=555555555555&Go+to+Track+%26+Confirm.x=21&Go+to+Track+%26+Confirm.y=9&Go+to+Track+%26+Confirm=Go'

which results in:

$data["CAMEFROM"] = "OK";
$data["strOrigTrackNum"] = "555555555555";
$data["Go to Track & Confirm.x"] = "21";
$data["Go to Track & Confirm.y"] = "9";
$data["Go to Track & Confirm"] = "Go";

I then copy and paste this result into my PHP code.
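
If you would rather skip the copy and paste step, the same decoding can build the array directly at run time. This is only a sketch and not part of the utility above: the function name query_to_array() is invented here. It decodes by hand because PHP’s built-in parse_str() rewrites dots and spaces in key names to underscores, which would mangle a key such as "Go to Track & Confirm.x". http_build_query() appears only as a check that the array encodes back to the captured string.

```php
<?php
// Hypothetical helper: turn a captured POST string into a PHP array,
// keeping key names exactly as the browser sent them.
function query_to_array($query)
{
    $data = array();
    foreach (explode("&", $query) as $part) {
        if ($part === "") {
            continue;
        }
        // Limit to 2 pieces so a value containing "=" stays intact
        $pair = explode("=", $part, 2);
        $key  = urldecode($pair[0]);
        $val  = isset($pair[1]) ? urldecode($pair[1]) : "";
        $data[$key] = $val;
    }
    return $data;
}

$captured = 'CAMEFROM=OK&strOrigTrackNum=555555555555&Go+to+Track+%26+Confirm.x=21&Go+to+Track+%26+Confirm=Go';
$data = query_to_array($captured);
print_r($data);

// Encoding the array again should give back the captured string
var_dump(http_build_query($data, '', '&') === $captured);
```

The resulting $data can be passed straight to nextpage($url, 'POST', $data).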
