Apache: The Definitive Guide

Chapter 9. Proxy Server

Contents:

Proxy Directives
Caching
Setup

An important concern on the Web is keeping the Bad Guys out of your network (see Chapter 13, "Security"). One established technique is to keep the network hidden behind a firewall; this works well, but as soon as you do it, it also means that everyone on the same network suddenly finds that their view of the Net has disappeared (rather like people living near Miami Beach before and after the building boom). This becomes an urgent issue at Butterthlies, Inc., as competition heats up and naughty-minded Bad Guys keep trying to break our security and get in. We install a firewall and, anticipating the instant outcries from the marketing animals who need to get out on the Web and surf for prey, we also install a proxy server to get them out there.

So, in addition to the Apache that serves clients visiting our sites and is protected by the firewall, we need a copy of Apache to act as a proxy server to let us, in our turn, access other sites out on the Web. Without the proxy server, those inside are safe but blind.

9.1. Proxy Directives

We are not concerned here with firewalls, so we take them for granted. The interesting thing is how we configure the proxy Apache to make life with a firewall tolerable to those behind it.

site.proxy has three subdirectories: cache, proxy, and real. The Config file from ... /site.proxy/proxy is as follows:

User webuser
Group webgroup
ServerName www.butterthlies.com

Port 8000
ProxyRequests on
CacheRoot /usr/www/site.proxy/cache
CacheSize 100000

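The points to notice are that the proxy listens on port 8000, that ProxyRequests is turned on, and that CacheRoot and CacheSize set up a cache for it.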

9.1.1. ProxyRequests

ProxyRequests [on|off]
Default: off
Server config

This directive turns proxy serving on or off. Even if ProxyRequests is off, ProxyPass directives are still honored.
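
As a minimal sketch of the second point, a server might refuse general proxy requests while still honoring one mapping. The path /mirror and the host backend.example.com are hypothetical and do not appear in our site:

# Hypothetical example: do not act as a general-purpose proxy ...
ProxyRequests off
# ... but still pass requests for /mirror to the assumed backend host
ProxyPass /mirror http://backend.example.com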

9.1.2. ProxyRemote

ProxyRemote match remote-server
remote-server = protocol://hostname[:port]
Server config

This directive defines remote proxies to this proxy. match is either the name of a URL scheme that the remote server supports, a partial URL for which the remote server should be used, or " * " to indicate that the server should be contacted for all requests. remote-server is a partial URL for the remote server, and protocol is the protocol that should be used to communicate with it. Currently, only HTTP is supported by this module. For example:

ProxyRemote ftp http://ftpproxy.mydomain.com:8080
ProxyRemote http://goodguys.com/ http://mirrorguys.com:8000
ProxyRemote * http://cleversite.com

9.1.3. ProxyPass

ProxyPass path url
Server config

This directive runs on an ordinary server and maps requests for a named directory, and everything below it, onto requests to another server. So, on our ordinary Butterthlies site, we might want to pass requests for /secrets on to a proxy server, darkstar.com:

ProxyPass /secrets http://darkstar.com

Unfortunately, this is less useful than it might appear, since the proxy does not modify the HTML returned by darkstar.com. This means that URLs embedded in the HTML will refer to documents on the main server unless they have been written carefully. For example, suppose a document one.html is stored on darkstar.com with the URL http://darkstar.com/one.html, and we want it to refer to another document in the same directory. Then the following links will work, when accessed as http://www.butterthlies.com/secrets/one.html:

<A HREF="two.html">Two</A>
<A HREF="/secrets/two.html">Two</A>
<A HREF="http://darkstar.com/two.html">Two</A>

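But this example will not work, because the browser resolves it to http://www.butterthlies.com/two.html, which lies outside /secrets and so is never passed on to darkstar.com: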

<A HREF="/two.html">Not two</A>

When accessed directly, through http://darkstar.com/one.html, these links work:

<A HREF="two.html">Two</A>
<A HREF="/two.html">Two</A>
<A HREF="http://darkstar.com/two.html">Two</A>

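But the following doesn't, because it resolves to http://darkstar.com/secrets/two.html, and /secrets exists only as a proxied path on www.butterthlies.com: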

<A HREF="/secrets/two.html">Two</A>

9.1.4. ProxyDomain

ProxyDomain Domain 
Server config

This directive is only useful for Apache proxy servers within intranets. The ProxyDomain directive specifies the default domain to which the Apache proxy server will belong. If a request to a host without a domain name is encountered, a redirection response to the same host with the configured Domain appended will be generated.
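
For illustration only, assuming the intranet's domain is butterthlies.com, a sketch might be as follows. A domain is written here with a leading period:

# Assumed domain, for illustration
ProxyDomain .butterthlies.com

With this in place, a proxy request for http://www/ would be answered with a redirect to http://www.butterthlies.com/.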

9.1.5. NoProxy

NoProxy { Domain | SubNet | IpAddr | Hostname } 
Server config

This directive is only useful for Apache proxy servers within intranets. The NoProxy directive specifies a list of subnets, IP addresses, hosts, and/or domains, separated by spaces. A request to a host that matches one or more of these is always served directly, without forwarding to the configured ProxyRemote proxy server(s).
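
A sketch of how this might sit alongside ProxyRemote follows. The outer proxy firewall.butterthlies.com on port 81 and the internal subnet 192.168.112.0/21 are both hypothetical; neither appears in our configuration:

# Send every request to the hypothetical outer proxy ...
ProxyRemote * http://firewall.butterthlies.com:81
# ... except hosts in our own domain or on the assumed internal subnet,
# which are fetched directly
NoProxy .butterthlies.com 192.168.112.0/21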

9.1.6. ProxyPassReverse

ProxyPassReverse path url
Server config, virtual host

A reverse proxy is a way to share load between several servers -- the frontend server simply accepts requests and forwards them to one of several backend servers. The optional module mod_rewrite has some special stuff in it to support this. This directive lets Apache adjust the URL in the Location response header. If a ProxyPass (or mod_rewrite) has been used to do reverse proxying, then this directive will rewrite Location headers coming back from the reverse proxied server so that they look as if they came from somewhere else (normally this server, of course).
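
Continuing the /secrets example from the ProxyPass section, a sketch of the usual pairing might look like this. The redirect described afterward is an assumed behavior of darkstar.com, used only for illustration:

# Forward requests for /secrets to darkstar.com ...
ProxyPass /secrets http://darkstar.com
# ... and rewrite Location headers that come back from it
ProxyPassReverse /secrets http://darkstar.com

If darkstar.com answered a request with a redirect carrying Location: http://darkstar.com/two.html, the frontend would rewrite the header to http://www.butterthlies.com/secrets/two.html before returning it, so the client keeps coming back through the frontend rather than going to darkstar.com directly.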




Copyright © 2001 O'Reilly & Associates. All rights reserved.