
11.2. Fetching a URL with the GET Method

11.2.1. Problem

You want to retrieve the contents of a URL. For example, you want to include part of one web page in another page's content.

11.2.2. Solution

Pass the URL to fopen( ) and get the contents of the page with fread( ):

$page = '';
$fh = fopen('http://www.example.com/robots.txt','r') or die($php_errormsg);
while (! feof($fh)) {
    $page .= fread($fh,1048576);
}
fclose($fh);
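The fopen( ) and fread( ) loop above works on any stream handle, so it can be factored into a helper and tried out without a network connection. The following is a minimal sketch, not part of the original recipe: the read_all( ) name and the 8192-byte default chunk size are assumptions made for illustration, and a php://memory stream stands in for the remote file.

```php
<?php
// Hypothetical helper: read everything from an open stream handle
function read_all($fh, $chunk_size = 8192) {
    $data = '';
    while (! feof($fh)) {
        $chunk = fread($fh, $chunk_size);
        if ($chunk === false) {
            break; // read error: stop instead of looping forever
        }
        $data .= $chunk;
    }
    return $data;
}

// Exercise the helper on an in-memory stream instead of a URL
$fh = fopen('php://memory', 'w+');
fwrite($fh, "User-agent: *\nDisallow: /private/\n");
rewind($fh);
$page = read_all($fh);
fclose($fh);
```

With URL fopen wrappers enabled, the same helper should accept a handle returned by fopen('http://www.example.com/robots.txt','r').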

You can use the cURL extension:

$c = curl_init('http://www.example.com/robots.txt');
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
$page = curl_exec($c);
curl_close($c);

You can also use the HTTP_Request class from PEAR:

require 'HTTP/Request.php';

$r = new HTTP_Request('http://www.example.com/robots.txt');
$r->sendRequest();
$page = $r->getResponseBody();

11.2.3. Discussion

You can put a username and password in the URL if you need to retrieve a protected page. In this example, the username is david, and the password is hax0r. Here's how to do it with fopen( ):

$page = '';
$fh = fopen('http://david:hax0r@www.example.com/secrets.html','r') 
    or die($php_errormsg);
while (! feof($fh)) {
    $page .= fread($fh,1048576);
}
fclose($fh);

Here's how to do it with cURL:

$c = curl_init('http://www.example.com/secrets.html');
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_USERPWD, 'david:hax0r');
$page = curl_exec($c);
curl_close($c);

Here's how to do it with HTTP_Request:

$r = new HTTP_Request('http://www.example.com/secrets.html');
$r->setBasicAuth('david','hax0r');
$r->sendRequest();
$page = $r->getResponseBody();
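When the username and password go in the URL itself, as in the fopen( ) example, characters such as @ or : inside them would be read as URL delimiters. One way around this is to encode both parts before building the URL. The url_with_credentials( ) helper below is a hypothetical name introduced for this sketch, and it assumes the URL contains a scheme followed by ://.

```php
<?php
// Hypothetical helper: insert a username and password into a URL
function url_with_credentials($url, $user, $pass) {
    $pos = strpos($url, '://');
    if ($pos === false) {
        return $url; // no scheme found: leave the URL alone
    }
    // rawurlencode() escapes characters such as @ and : in each part
    $auth = rawurlencode($user) . ':' . rawurlencode($pass) . '@';
    return substr($url, 0, $pos + 3) . $auth . substr($url, $pos + 3);
}

$url = url_with_credentials('http://www.example.com/secrets.html',
                            'david', 'hax0r');
// $url is now 'http://david:hax0r@www.example.com/secrets.html'
```

The CURLOPT_USERPWD option and setBasicAuth( ) take the credentials separately from the URL, so this kind of encoding only matters for the URL-embedded form.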

While fopen( ) follows redirects in Location response headers, HTTP_Request does not. cURL follows them only when the CURLOPT_FOLLOWLOCATION option is set:

$c = curl_init('http://www.example.com/directory');
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_FOLLOWLOCATION, 1);
$page = curl_exec($c);
curl_close($c);
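When a client does not follow redirects on its own, the calling code has to find the Location header in the response and request that URL itself. The sketch below shows only the header lookup, working on an array of raw header lines. The redirect_target( ) name and the sample headers are made up for illustration; how you obtain the header lines depends on the client you use.

```php
<?php
// Hypothetical helper: return the Location target from raw header
// lines, or null if the response is not a redirect
function redirect_target(array $header_lines) {
    foreach ($header_lines as $line) {
        // header names are case-insensitive, hence the /i modifier
        if (preg_match('/^Location:\s*(.+?)\s*$/i', $line, $m)) {
            return $m[1];
        }
    }
    return null;
}

// Sample header lines, invented for this example
$headers = array(
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://www.example.com/directory/',
    'Content-Length: 0',
);
$target = redirect_target($headers);
```

A real redirect loop would also want a cap on the number of hops, so that two pages redirecting to each other cannot keep the script busy forever.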

cURL can do a few different things with the page it retrieves. If the CURLOPT_RETURNTRANSFER option is set, curl_exec( ) returns a string containing the page:

$c = curl_init('http://www.example.com/files.html');
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
$page = curl_exec($c);
curl_close($c);

To write the retrieved page to a file, open a file handle for writing with fopen( ) and set the CURLOPT_FILE option to the file handle:

$fh = fopen('local-copy-of-files.html','w') or die($php_errormsg);
$c = curl_init('http://www.example.com/files.html');
curl_setopt($c, CURLOPT_FILE, $fh);
curl_exec($c);
curl_close($c);
fclose($fh);

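To have cURL pass the retrieved data to a function as it arrives, set the CURLOPT_WRITEFUNCTION option to the name of the function. cURL calls the function with the cURL resource and a chunk of the page, possibly several times for one page, and the function must return the number of bytes it handled:

// save the URL and each chunk of the page in a database
// (the mysql_* functions were removed in PHP 7; use mysqli or PDO there)
function save_page($c,$page) {
    $info = curl_getinfo($c);
    mysql_query("INSERT INTO pages (url,page) VALUES ('" .
                mysql_escape_string($info['url']) . "', '" .
                mysql_escape_string($page) . "')");
    // tell cURL how many bytes were handled, or it aborts the transfer
    return strlen($page);
}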

$c = curl_init('http://www.example.com/files.html');
curl_setopt($c, CURLOPT_WRITEFUNCTION, 'save_page');
curl_exec($c);
curl_close($c);

If none of CURLOPT_RETURNTRANSFER, CURLOPT_FILE, or CURLOPT_WRITEFUNCTION is set, cURL prints out the contents of the returned page.
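A function set with CURLOPT_WRITEFUNCTION can be called more than once for a single page, each time with the next chunk of data, and it has to return the number of bytes it handled. If you want the whole page before acting on it, one option is to collect the chunks in an object. The PageCollector class and fetch_with_collector( ) function below are hypothetical names for this sketch; only the curl_* calls come from the cURL extension.

```php
<?php
// Collects the chunks that cURL hands to a write callback
class PageCollector {
    public $page = '';

    // cURL calls this once per chunk of the response body
    public function write($c, $chunk) {
        $this->page .= $chunk;
        // report every byte as handled so the transfer continues
        return strlen($chunk);
    }
}

// Hypothetical helper: fetch $url and return the assembled body
function fetch_with_collector($url) {
    $collector = new PageCollector();
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_WRITEFUNCTION, array($collector, 'write'));
    curl_exec($c);
    curl_close($c);
    return $collector->page;
}
```

Calling fetch_with_collector('http://www.example.com/files.html') should return the same string that CURLOPT_RETURNTRANSFER would, but the callback gives you a place to inspect or filter each chunk as it arrives.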

The fopen() function and the include and require directives can retrieve remote files only if URL fopen wrappers are enabled. URL fopen wrappers are enabled by default and are controlled by the allow_url_fopen configuration directive. On Windows, however, include and require can't retrieve remote files in versions of PHP earlier than 4.3, even if allow_url_fopen is on.
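Because the fopen( ) approach depends on allow_url_fopen, a script can check the setting before trying to open a URL and fall back to another method if it is off. The following is a small sketch; url_fopen_enabled( ) is a hypothetical helper name, not a built-in function.

```php
<?php
// Hypothetical helper: report whether URL fopen wrappers are enabled
function url_fopen_enabled() {
    // ini_get() returns the setting as a string such as "1" or ""
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

if (! url_fopen_enabled()) {
    print "allow_url_fopen is off; use cURL or HTTP_Request instead.\n";
}
```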

11.2.4. See Also

Recipe 11.3 for fetching a URL with the POST method; Recipe 8.13 for opening remote files with fopen( ); documentation on fopen( ) at http://www.php.net/fopen, include at http://www.php.net/include, curl_init( ) at http://www.php.net/curl-init, curl_setopt( ) at http://www.php.net/curl-setopt, curl_exec( ) at http://www.php.net/curl-exec, and curl_close( ) at http://www.php.net/curl-close; the PEAR HTTP_Request class at http://pear.php.net/package-info.php?package=HTTP_Request.




Copyright © 2003 O'Reilly & Associates. All rights reserved.