Unix Power Tools

40.7. Interruptible gets with wget

The GNU utility wget retrieves files over the Internet using HTTP, HTTPS, or FTP. The best thing about the utility is that if a download is interrupted and started again, wget can continue from where it left off, provided you ask for that behavior with the -c option.

Go to http://examples.oreilly.com/upt3 for more information on: wget

The wget utility is installed by default on many systems, but if you can't find it, you can download it from GNU at http://www.gnu.org/software/wget/wget.html.

The basic syntax for wget is very simple: type wget followed by the URL of the file or files you're trying to download:

wget http://www.somefile.com/somefile.htm
wget ftp://www.somefile.com/somefile

The file is downloaded and saved, and a status report is printed to the screen:

--16:51:58--  http://dynamicearth.com:80/index.htm
           => `index.htm'
Connecting to dynamicearth.com:80... connected!
HTTP request sent, awaiting response... 200 OK
Length: 9,144 [text/html]

    0K -> ........                                               [100%]

16:51:58 (496.09 KB/s) - `index.htm' saved [9144/9144]
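
In a script, it is usually easier to test wget's exit status than to read this on-screen report: wget exits with status 0 when the retrieval succeeds and with a nonzero status when it fails. The following is a minimal sketch of that idea. The function name fetch_or_report is hypothetical and is not part of wget.

```shell
# fetch_or_report URL
# Hypothetical helper: download one URL and report the outcome based on
# wget's exit status (0 = success, nonzero = failure).
fetch_or_report() {
    if wget "$1"; then
        echo "ok: $1"
    else
        fo_status=$?    # exit status of the failed wget call
        echo "failed (exit status $fo_status): $1" >&2
        return "$fo_status"
    fi
}
```

For example, `fetch_or_report http://www.somefile.com/somefile.htm || exit 1` would stop a script as soon as a download fails.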

By default, wget saves the file in your current directory. Also by default, wget does not resume an interrupted download at the point of interruption; you need to specify an option (-c) for this behavior. The wget options are listed in Table 40-2. Both the short and long forms of each option are shown, and short options that don't take an argument can be grouped together:

> wget -drc URL

For options that do take an argument, you don't have to separate the option and its argument with whitespace:

> wget -ooutput.file URL
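
Because resuming has to be requested with -c, a download that keeps getting cut off can be restarted in a loop until it completes. The sketch below is one way to do that, assuming the server supports continued retrieval. The function name fetch_resumable and its defaults (five attempts, ten seconds between them) are illustrative choices, not wget features.

```shell
# fetch_resumable URL [MAX_ATTEMPTS] [PAUSE_SECONDS]
# Hypothetical helper: re-run "wget -c" until it succeeds or the
# attempts run out.
fetch_resumable() {
    fr_url=$1
    fr_max=${2:-5}      # assumed default: 5 attempts
    fr_pause=${3:-10}   # assumed default: 10 seconds between attempts
    fr_n=1
    while [ "$fr_n" -le "$fr_max" ]; do
        # -c continues a partially downloaded file instead of starting over
        if wget -c "$fr_url"; then
            return 0
        fi
        echo "attempt $fr_n of $fr_max failed" >&2
        fr_n=$((fr_n + 1))
        # pause only if another attempt will follow
        if [ "$fr_n" -le "$fr_max" ]; then
            sleep "$fr_pause"
        fi
    done
    return 1
}
```

A call such as `fetch_resumable ftp://www.somefile.com/somefile 3 30` would try up to three times, waiting thirty seconds between attempts.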

Table 40-2. wget options

Option                                  Purpose                                           Example
-V                                      Get version of wget                               wget -V
-h or --help                            Get listing of wget options                       wget --help
-b or --background                      Go to background after start                      wget -b url
-e or --execute=COMMAND                 Execute a .wgetrc-style command                   wget -e COMMAND url
-o or --output-file=file                Log messages to file                              wget -o filename url
-a or --append-output=file              Append messages to log file                       wget -a filename url
-d or --debug                           Turn on debug output                              wget -d url
-q or --quiet                           Turn off wget's output                            wget -q url
-v or --verbose                         Turn on verbose output                            wget -v url
-nv or --non-verbose                    Turn off verbose output                           wget -nv url
-i or --input-file=file                 Read URLs from file                               wget -i inputfile
-F or --force-html                      Force input file to be treated as HTML            wget -F -i inputfile
-t or --tries=number                    Number of retries to get file                     wget -t 3 url
-O or --output-document=file            Write all documents to the specified file         wget -O savedfile -i inputfile
-nc or --no-clobber                     Don't clobber existing file                       wget -nc url
-c or --continue                        Continue getting a partially downloaded file      wget -c url
--dot-style=style                       Retrieval indicator style                         wget --dot-style=binary url
-N or --timestamping                    Turn on time-stamping                             wget -N url
-S or --server-response                 Print HTTP headers, FTP responses                 wget -S url
--spider                                Behave as a web spider; don't download            wget --spider url
-T or --timeout=seconds                 Set the timeout                                   wget -T 30 url
-w or --wait=seconds                    Wait specified seconds between retrievals         wget -w 20 url
-Y or --proxy=on/off                    Turn proxy on or off                              wget -Y on url
-Q or --quota=quota                     Specify download quota size                       wget -Q2M url
-nd or --no-directories                 Do not create directories in recursive download   wget -nd url
-x or --force-directories               Opposite of -nd                                   wget -x url
-nH or --no-host-directories            Disable host-prefixed directories                 wget -nH url
--cut-dirs=number                       Ignore number directory components                wget --cut-dirs=3 url
-P or --directory-prefix=prefix         Set directory prefix                              wget -P test url
--http-user=user --http-passwd=passwd   Set username and password                         wget --http-user=user --http-passwd=password url
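
Several of these options are commonly combined. As a sketch, the hypothetical function below (fetch_list is not part of wget, and its three arguments are placeholders you supply) reads URLs from a file, resumes partial downloads, limits retries and timeouts, and appends terse messages to a log:

```shell
# fetch_list URL_FILE DEST_DIR LOG_FILE
# Hypothetical helper combining several options from Table 40-2.
fetch_list() {
    fl_urls=$1
    fl_dest=$2
    fl_log=$3
    if [ ! -r "$fl_urls" ]; then
        echo "cannot read URL list: $fl_urls" >&2
        return 2
    fi
    # -c     continue partial files    -t 3   at most three tries per file
    # -T 30  thirty-second timeout     -w 20  wait twenty seconds between files
    # -nv    terse messages            -a     append messages to LOG_FILE
    # -P     save under DEST_DIR       -i     read URLs from URL_FILE
    wget -c -t 3 -T 30 -w 20 -nv -a "$fl_log" -P "$fl_dest" -i "$fl_urls"
}
```

It might be called as `fetch_list urls.txt downloads fetch.log`, where urls.txt holds one URL per line.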

-- SP




Copyright © 2003 O'Reilly & Associates. All rights reserved.