Apache: The Definitive Guide

2.4. Setting Up a Unix Server

We can point httpd at our site with the -d flag (notice the full pathname to the site.toddle directory):

% httpd -d /usr/www/site.toddle

Since you will be typing this a lot, it's sensible to copy it into a script called go in /usr/local/bin by typing:

% cat > /usr/local/bin/go
httpd -d `pwd`
^d

^d is shorthand for CTRL-D, which ends the input and gets your prompt back. This go will work on every site.

Make go runnable and run it by typing the following (note that you have to be in the directory .../site.toddle when you run go):

% chmod +x /usr/local/bin/go
% go

This launches Apache in the background. Check that it's running by typing something like this (arguments to ps vary from Unix to Unix):

% ps -aux

This Unix utility lists all the processes running, among which you should find several httpds.[18]

[18]On System V-based Unix systems (as opposed to Berkeley-based), the command ps -ef should have a similar effect.

Sooner or later, you will finish testing and want to stop Apache. To do this, find the process ID (PID) using ps -aux and pass it to the Unix utility kill:

% kill PID

Alternatively, since Apache writes its PID in the file ... /logs/httpd.pid (by default -- see the PidFile directive), you can write yourself a little script, as follows:

kill `cat /usr/www/site.toddle/logs/httpd.pid`

You may prefer to put more generalized versions of these scripts somewhere on your path. For example, the following scripts will start and stop a server based in your current directory. go looks like this:

httpd -d `pwd`

and stop looks like this:

path=`pwd`
kill `cat $path/logs/httpd.pid`
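A slightly more defensive stop can be sketched as a pair of shell functions. The names read_pid and stop_here are our own illustration, not part of Apache, and the sketch assumes the default PID file location of logs/httpd.pid under the server root:

```shell
# read_pid: print the PID stored in <server-root>/logs/httpd.pid.
# Hypothetical helper; fails if the file is missing or does not
# contain a plain number.
read_pid() {
    pidfile="$1/logs/httpd.pid"
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content
    esac
    echo "$pid"
}

# stop_here: stop the server whose root is the current directory.
stop_here() {
    pid=$(read_pid "$(pwd)") || { echo "no usable PID file" >&2; return 1; }
    kill "$pid"
}
```

Checking the file first means a stale or empty PID file produces a message rather than a confusing error from kill.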

Or, if you don't plan to mess with many different configurations, use .../src/support/apachectl to start and stop Apache in the default directory. You might want to copy it into /usr/local/bin to get it onto the path. It uses the following flags:

usage: ./apachectl
(start|stop|restart|fullstatus|status|graceful|configtest|help)
start

Start httpd.

stop

Stop httpd.

restart

Restart httpd if running by sending a SIGHUP or start if not running.

fullstatus

Dump a full status screen; requires lynx and mod_status enabled.

status

Dump a short status screen; requires lynx and mod_status enabled.

graceful

Do a graceful restart by sending a SIGUSR1 or start if not running.

configtest

Do a configuration syntax test.

help

This screen.

When we typed go, nothing appeared to happen, but when we looked in the logs subdirectory, we found a file called error_log with the entry:

[<date>]:'mod_unique_id: unable to get hostbyname ("myname.my.domain")

This problem was, in our case, due to the odd way we were running Apache and will only affect you if you are running on a host with no DNS or on an operating system that has difficulty determining the local hostname. The solution was to edit the file /etc/hosts and add the line:

10.0.0.2 myname.my.domain myname

where 10.0.0.2 is the IP number we were using for testing.
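To confirm that an entry like this is in place, a small lookup over a hosts-format file can help. This is a sketch only; the function name hosts_lookup is our own, and it reads the file directly rather than asking the system resolver:

```shell
# hosts_lookup: print the IP address that a hosts-format file assigns
# to a name. Hypothetical helper for illustration.
hosts_lookup() {
    # $1 = hostname to find, $2 = hosts file (defaults to /etc/hosts)
    awk -v name="$1" '
        { sub(/#.*/, "") }                      # drop comments
        { for (i = 2; i <= NF; i++)             # field 1 is the address
              if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}
```

For example, hosts_lookup myname.my.domain prints 10.0.0.2 once the line above has been added, and prints nothing if the name is absent.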

However, our troubles were not yet over. When we reran httpd we received the following error message:

[<date>] - couldn't determine user name from uid

This means more than might at first appear. We had logged in as root. Because of the security worries of letting outsiders log in with superuser powers, Apache, having been started with root permissions so that it can bind to port 80, has attempted to change its user ID to -1. On many Unix systems, this ID corresponds to the user nobody: a harmless person. However, it seems that FreeBSD does not understand this notion, hence the error message.[19]

[19]In fact, this problem was fixed for FreeBSD shortly before this book went to press, but you may still encounter it on other operating systems.

2.4.1. Webuser and Webgroup

The remedy is to create a new person, called webuser, belonging to webgroup. The names are unimportant. The main thing is that this user should be in a group of its own and should not actually be used by anyone for anything else. On a FreeBSD system, you can run adduser to make this new person:

Enter username [a-z0-9]: webuser
Enter full name[]: webuser
Enter shell bash csh date no sh tcsh [csh]: no
Uid [some number]:
Login group webuser [webuser]: webgroup
Login group is ``webgroup''. Invite webuser into other
    groups: guest no [no]:
Enter password []: password

You then get the report:

Name: webuser
Password: password
Fullname: webuser
Uid: some number
Groups: webgroup
HOME: /home/webuser
Shell: /nonexistent
OK? (y/n) [y]:

Send message to ``webuser'' and: no route second_mail_address [no]:
Add anything to default message (y/n) [n]:
Send message (y/n) [y]: n
Add another user? (y/n) [y]:n

The bits of the script after OK are really irrelevant, but of course FreeBSD does not know that you are making a nonexistent user. Having told the operating system about this user, you now have to tell Apache. Edit the file httpd.conf to include the following lines:

User webuser
Group webgroup

The following are the interesting directives.

2.4.1.1. User

User unix-userid
Default: User #-1
Server config, virtual host

The User directive sets the user ID under which the server will answer requests. In order to use this directive, the standalone server must be run initially as root. unix-userid is one of the following:

username

Refers to the given user by name

#usernumber

Refers to a user by his or her number

The user should have no privileges that allow him or her to access files not intended to be visible to the outside world; similarly, the user should not be able to execute code that is not meant for httpd requests. It is recommended that you set up a new user and group specifically for running the server. Some administrators use user nobody, but this is not always possible or desirable. For example, mod_proxy's cache, when enabled, must be accessible to this user (see the CacheRoot directive in Chapter 9, "Proxy Server").

2.4.1.1.1. Notes

If you start the server as a non-root user, it will fail to change to the lesser-privileged user, and will instead continue to run as that original user. If you start the server as root, then it is normal for the parent process to remain running as root.

2.4.1.1.2. Security

Don't set User (or Group) to root unless you know exactly what you are doing and what the dangers are.

2.4.1.2. Group

Group unix-group
Default: Group #-1
Server config, virtual host

The Group directive sets the group under which the server will answer requests. In order to use this directive, the standalone server must be run initially as root. unix-group is one of the following:

groupname

Refers to the given group by name

#groupnumber

Refers to a group by its number

It is recommended that you set up a new group specifically for running the server. Some administrators use group nobody, but this is not always possible or desirable.

2.4.1.2.1. Note

If you start the server as a non-root user, it will fail to change to the specified group, and will instead continue to run as the group of the original user.

Now, when you run httpd and look for the PID, you will find that one copy belongs to root, and several others belong to webuser. Kill the root copy and the others will vanish.

2.4.2. Running Apache Under Unix

When you run Apache now, you may get the following error message:

httpd: cannot determine local hostname
Use ServerName to set it manually.

What Apache means is that you should put this line in the httpd.conf file:

ServerName yourmachinename

Finally, before you can expect any action, you need to set up some documents to serve. Apache's default document directory is ... /httpd/htdocs -- which you don't want to use because you are at /usr/www/site.toddle -- so you have to set it explicitly. Create ... /site.toddle/htdocs, and then in it create a file called 1.txt containing the immortal words "hullo world." Then add this line to httpd.conf:

DocumentRoot /usr/www/site.toddle/htdocs

The complete Config file, .../site.toddle/conf/httpd.conf, now looks like this:

User webuser
Group webgroup
ServerName yourmachinename
DocumentRoot /usr/www/site.toddle/htdocs
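With a Config file this small, it is easy to check what Apache will be told. The following sketch pulls the argument of a directive out of the file. The function name conf_value is our own; it matches directive names exactly (Apache itself ignores case) and handles only simple one-argument directives:

```shell
# conf_value: print the argument of the first occurrence of a directive
# in a Config file. Hypothetical helper; comment lines are skipped.
conf_value() {
    # $1 = directive name, $2 = path to httpd.conf
    awk -v d="$1" '
        /^[ \t]*#/ { next }                # ignore comment lines
        $1 == d    { print $2; exit }      # first match wins
    ' "$2"
}
```

For example, conf_value DocumentRoot /usr/www/site.toddle/conf/httpd.conf prints /usr/www/site.toddle/htdocs for the file shown above.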

When you fire up httpd, you should have a working web server. To prove it, start up a browser to access your new server, and point it at http://yourmachinename/. [21]

[21]Note that if you are on the same machine, you can use http://127.0.0.1/ or http://localhost/, but this can be confusing because virtual host resolution may cause the server to behave differently than if you had used the interface's "real" name.

As we know, http means use the HTTP protocol to get documents, and "/" on the end means go to the DocumentRoot directory you set in httpd.conf.

2.4.2.1. DocumentRoot

DocumentRoot directory-filename
Default: /usr/local/apache/htdocs
Server config, virtual host

This directive sets the directory from which Apache will serve files. Unless matched by a directive like Alias, the server appends the path from the requested URL to the document root to make the path to the document. For example:

DocumentRoot /usr/web

An access to http://www.my.host.com/index.html now refers to /usr/web/index.html.
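The mapping is plain concatenation, which a one-line function can illustrate. This is a sketch of the rule only: the name url_to_path is our own, and it ignores Alias and any other directive that changes the mapping:

```shell
# url_to_path: join a document root and a URL path as the text
# describes. Hypothetical helper for illustration.
url_to_path() {
    docroot="${1%/}"      # drop one trailing slash, if present
    printf '%s%s\n' "$docroot" "$2"
}
```

For example, url_to_path /usr/web /index.html prints /usr/web/index.html.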

There appears to be a bug in mod_dir that causes problems when the directory specified in DocumentRoot has a trailing slash (e.g., DocumentRoot /usr/web/), so please avoid that. It is worth bearing in mind that the deeper DocumentRoot goes, the longer it takes Apache to check out the directories. For the sake of performance, adopt the British Army's universal motto: KISS (Keep It Simple, Stupid)!

Lynx is the text browser that comes with FreeBSD and other flavors of Unix; if it is available, type:

% lynx http://yourmachinename/

You see:

INDEX OF /
* Parent Directory
* 1.txt

If you move to 1.txt with the down arrow, you see:

hullo world

If you don't have Lynx (or Netscape, or some other web browser) on your server, you can use telnet :[22]

[22]telnet is not really suitable as a web browser, though it can be a very useful debugging tool.

% telnet yourmachinename 80

Then type:

GET / HTTP/1.0 <CR><CR>

You should see:

HTTP/1.0 200 OK
Date: Sat, 24 Aug 1996 23:49:02 GMT
Server: Apache/1.3
Connection: close
Content-Type: text/html

<HEAD><TITLE>Index of /</TITLE></HEAD><BODY>
<H1>Index of </H1>
<UL><LI> <A HREF="/"> Parent Directory</A>
<LI> <A HREF="1.txt"> 1.txt</A>
</UL></BODY>
Connection closed by foreign host.

The stuff between the "<" and ">" is HTML, written by Apache, which, if viewed through a browser, produces the formatted message shown by Lynx earlier, and by Netscape in the next chapter.
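The request typed into telnet can also be generated by a script, which makes the two carriage returns explicit. The function name http_request is our own, and the usage line assumes a netcat-style tool called nc is installed, which is not something the examples above rely on:

```shell
# http_request: print a minimal HTTP/1.0 GET request for a path.
# Each line ends in CR LF, and an empty line ends the request.
http_request() {
    printf 'GET %s HTTP/1.0\r\n\r\n' "$1"
}

# Usage (nc is an assumption; substitute whatever tool you have):
#   http_request / | nc yourmachinename 80
```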

2.4.3. Several Copies of Apache

To get a display of all the processes running, run:

% ps -aux

Among a lot of Unix stuff, you will see one copy of httpd belonging to root, and a number that belong to webuser. They are similar copies, waiting to deal with incoming queries.
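To pick the httpd entries out of the listing, the output can be piped through a small filter. This sketch assumes a BSD-style listing in which the owner is column 1 and the command starts at column 11; column positions differ on other systems, and the name httpd_owners is our own:

```shell
# httpd_owners: read "ps -aux" style output on stdin and count the
# httpd processes belonging to each owner.
httpd_owners() {
    awk '$11 ~ /(^|\/)httpd$/ { n[$1]++ }
         END { for (u in n) print u, n[u] }' | sort
}

# Usage:
#   ps -aux | httpd_owners
```

On a healthy server the result should show one process for root and several for webuser.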

The root copy is still attached to port 80 -- thus its children will be also -- but it is not listening. This is because it is root and has too many powers. It is necessary for this "master" copy to remain running as root because only root can open ports below 1024. Its job is to monitor the scoreboard where the other copies post their status: busy or waiting. If there are too few waiting (default 5, set by the MinSpareServers directive in httpd.conf), the root copy starts new ones; if there are too many waiting (default 10, set by the MaxSpareServers directive), it kills some off. If you note the PID (shown by ps -ax, or ps -aux for a fuller listing; also to be found in ... /logs/httpd.pid) of the root copy and kill it with:

% kill PID

or use the stop script described in Section 2.4, "Setting Up a Unix Server," earlier in this chapter, you will find that the other copies disappear as well.
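The root copy's decision about spare children can be modeled in a few lines. This is a deliberately simplified sketch of the rule described above, not Apache's code: the real server also paces how quickly it starts and stops children, and the name spare_action is our own:

```shell
# spare_action: say what the parent would do for a given number of
# waiting (idle) children.
spare_action() {
    idle=$1
    min=${2:-5}      # MinSpareServers default
    max=${3:-10}     # MaxSpareServers default
    if [ "$idle" -lt "$min" ]; then
        echo start   # too few waiting: start new copies
    elif [ "$idle" -gt "$max" ]; then
        echo kill    # too many waiting: kill some off
    else
        echo hold    # within limits: do nothing
    fi
}
```

With the defaults, spare_action 3 prints start, spare_action 7 prints hold, and spare_action 12 prints kill.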

2.4.4. Unix Permissions

If Apache is to work properly, it's important to correctly set the file-access permissions. In Unix systems, there are three kinds of permissions: read, write, and execute. They attach to each object in three levels: user, group, and other or "rest of the world." If you have installed the demonstration sites, go to ... /site.cgi/htdocs and type:

% ls -l

You see:

-rw-rw-r-- 5 root bin 1575 Aug 15 07:45 form_summer.html

The first "-" indicates that this is a regular file. It is followed by three permission fields, each of three characters. They mean, in this case:

User (root)

Read yes, write yes, execute no

Group (bin)

Read yes, write yes, execute no

Other

Read yes, write no, execute no

When the permissions apply to a directory, the "x" execute permission means scan, the ability to see the contents and move down a level.

The permission that interests us is other, because the copy of Apache that tries to access this file belongs to user webuser and group webgroup. These were set up to have no affinities with root and bin, so that copy can gain access only under the other permissions, and the only one set is "read." Consequently, a Bad Guy who crawls under the cloak of Apache cannot alter or delete our precious form_summer.html; he can only read it.
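The three permission fields can be picked apart mechanically from the mode string that ls -l prints. The function name perm_field is our own, and the sketch looks only at the nine basic permission characters:

```shell
# perm_field: print the three permission characters for one class
# (user, group, or other) from an ls -l mode string.
perm_field() {
    # $1 = mode string such as -rw-rw-r--, $2 = user|group|other
    case "$2" in
        user)  printf '%s\n' "$1" | cut -c2-4 ;;
        group) printf '%s\n' "$1" | cut -c5-7 ;;
        other) printf '%s\n' "$1" | cut -c8-10 ;;
        *)     return 1 ;;
    esac
}
```

For form_summer.html, perm_field -rw-rw-r-- other prints r--, which is all that webuser gets.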

We can now write a coherent doctrine on permissions. We have set things up so that everything in our web site except the data vulnerable to attack has owner root and group wheel. We did this partly because it is a valid approach, but also because it is the only portable one. The files on our CD-ROM with owner root and group wheel have owner and group numbers "0" that translate into similar superuser access on every machine.

Of course, this only makes sense if the webmaster has root login permission, which we had. You may have to adapt the whole scheme if you do not have root login, and you should perhaps consult your site administrator.

In general, on a web site, everything should be owned by a user who is not webuser and a group that is not webgroup (assuming you use these terms for Apache configurations).

There are four kinds of files to which we want to give webuser access: directories, data, programs, and shell scripts. webuser must have scan permissions on all the directories, starting at root down to wherever the accessible files are. If Apache is to access a directory, that directory and all in the path must have x permission set for other. You do this by entering:

% chmod o+x each-directory-in-the-path
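Listing every directory in the path by hand is tedious, so a helper that expands a path into its parent directories can be useful. This is a sketch under simple assumptions (an absolute path with no repeated slashes), and the name path_dirs is our own:

```shell
# path_dirs: print every directory from the top of an absolute path
# down to the path itself, one per line.
path_dirs() {
    rest="${1#/}"                 # strip the leading slash
    cur=""
    while [ -n "$rest" ]; do
        part="${rest%%/*}"        # next path component
        cur="$cur/$part"
        echo "$cur"
        case "$rest" in
            */*) rest="${rest#*/}" ;;   # more components follow
            *)   rest="" ;;             # that was the last one
        esac
    done
}

# Usage (run as a user allowed to change these directories):
#   path_dirs /usr/www/site.toddle/htdocs | while read d; do chmod o+x "$d"; done
```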

In order to produce a directory listing (if this is required by, say, an index), the final directory must have read permission for other. You do this by typing:

% chmod o+r final-directory

It probably should not have write permission set for other:

% chmod o-w final-directory

In order to serve a file as data -- and this includes files like .htaccess (see Chapter 3, "Toward a Real Web Site") -- the file must have read permission for other:

% chmod o+r file

And, as before, deny write permission:

% chmod o-w file

In order to run a program, the file must have execute permission set for other:

% chmod o+x program

In order to execute a shell script, the file must have read and execute permission set for other:

% chmod o+rx script

2.4.5. A Local Network

Emboldened by the success of site.toddle, we can now set about a more realistic setup, without as yet venturing out onto the unknown waters of the Web. We need to get two things running: Apache under some sort of Unix and a GUI browser. There are two main ways this can be achieved:

We cannot hope to give detailed explanations for all possible variants of these situations. We expect that many of our readers will already be webmasters, familiar with these issues, who will want to skip the next section. Those who are new to the Web may find it useful to know what we did.

2.4.6. Our Experimental Micro Web

First, we had to install a network card on the FreeBSD machine. As it boots up, it tests all its components and prints a list on the console, which includes the card and the name of the appropriate driver. We used a 3Com card, and the following entries appeared:

...
1 3C5x9 board(s) on ISA found at 0x300
ep0 at 0x300-0x30f irq 10 on isa
ep0: aui/bnc/utp[*BNC*] address 00:a0:24:4b:48:23 irq 10
...

This indicated pretty clearly that the driver was ep0, and that it had installed properly. If you miss this at bootup, FreeBSD lets you hit the Scroll Lock key and page up till you see it, then hit Scroll Lock again to return to normal operation.

Once a card was working, we needed to configure its driver, ep0. We did this with the following commands:

ifconfig ep0 192.168.123.2
ifconfig ep0 192.168.123.3 alias netmask 0xFFFFFFFF
ifconfig ep0 192.168.124.1 alias

The alias keyword makes ifconfig bind an additional IP address to the same device. The netmask argument is needed to stop FreeBSD from printing an error message (for more on netmasks, see O'Reilly's TCP/IP Network Administration).

Note that the network numbers used here are suited to our particular network configuration. You'll need to talk to your network administrator to determine suitable numbers for your configuration. Each time we start up the FreeBSD machine to play with Apache, we have to run these commands. The usual way to do this is to add them to /etc/rc.local (or the equivalent location -- it varies from machine to machine, but whatever it is called, it is run whenever the system boots).

If you are following the FreeBSD installation or something like it, you also need to install IP addresses and their hostnames (if we were to be pedantic, we would call them fully qualified domain names, or FQDN) in the file /etc/hosts:

192.168.123.2 www.butterthlies.com
192.168.123.2 sales.butterthlies.com
192.168.123.3 sales-not-vh.butterthlies.com
192.168.124.1 www.faraway.com

Note that www.butterthlies.com and sales.butterthlies.com both have the same IP number. This is so we can demonstrate the new NameVirtualHost directive in the next chapter. We will need sales-not-vh.butterthlies.com in site.twocopy. Note also that this method of setting up hostnames is normally only appropriate when DNS is not available -- if you use this method, you'll have to do it on every machine that needs to know the names.
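As the hosts file grows, it helps to see at a glance which addresses carry more than one name, since those are the ones that need name-based virtual hosts. The following sketch does that for a hosts-format file; the function name shared_ips is our own:

```shell
# shared_ips: list each IP address in a hosts-format file that has more
# than one name, followed by those names, in file order.
shared_ips() {
    awk '
        { sub(/#.*/, "") }                        # drop comments
        NF >= 2 {
            if (!($1 in names)) order[++n] = $1   # remember first sighting
            for (i = 2; i <= NF; i++) {
                names[$1] = names[$1] " " $i
                count[$1]++
            }
        }
        END {
            for (k = 1; k <= n; k++)
                if (count[order[k]] > 1) print order[k] names[order[k]]
        }
    ' "${1:-/etc/hosts}"
}
```

Run against the four entries above, it prints a single line: 192.168.123.2 followed by www.butterthlies.com and sales.butterthlies.com.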



Copyright © 2001 O'Reilly & Associates. All rights reserved.