Checking your site
This section shows which command I use to check my own site. You should be able to adapt it easily to your case. First, here is the command:
    bigbro \
      -mapfrom "^http://pauillac\.inria\.fr/~fpottier/" \
      -mapto file://$HOME/public_html/ \
      -rec "^file:" \
      -local -remote \
      -proxy www-rocq.inria.fr:8080 \
      -noproxy "^http://.*\.inria\.fr/" \
      -timeout 600 \
      -gentle 5 \
      -oraw stdout \
      -ohtml report.html \
      -failures \
      -fragments \
      -ignore "^http://www\.imdb\.com/M/" \
      http://pauillac.inria.fr/~fpottier/

(The \ characters at the end of each line indicate that this is a single command, even though it is written on several lines for clarity.)
Here is an explanation of the options used above. First, I define a mapping (-mapfrom and -mapto), which tells Big Brother that any document belonging to my site can be read directly from the public_html subdirectory of my home directory. (Using $HOME in this way is Unix-specific, but you can specify a full path explicitly under Windows.) Then, I enable recursion within my site. Determining whether a file belongs to my site is easy: if it does, then it resides on disk, so it has a file: URL. This explains why I used -rec "^file:". Next, I enable checking of both local and remote links (-local -remote). I then let Big Brother know about my proxy (note that the proxy's name ends with a custom port number, which comes after a colon). The proxy is unnecessary when accessing machines within the domain inria.fr, hence the use of the -noproxy option.

Next, I set the timeout value to 10 minutes (-timeout 600) and, to avoid consuming too much server time, I specify that at least 5 seconds should elapse between two requests to the same server (-gentle 5). (This is especially important when using a proxy, since nearly all requests are sent to the proxy.) Next, I request "raw" output onscreen (-oraw stdout) and human-readable output to a file called report.html (-ohtml report.html). Displaying failures only (-failures) saves time and makes the report more readable. I want fragments to be checked (-fragments). I use -ignore to avoid checking some URLs which I know will cause failures. Finally, I tell Big Brother where to start by specifying the main URL for my site.
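To make the mapping and the recursion test more concrete, here is a small shell sketch that imitates them with sed and grep. It is only an illustration, not Big Brother's own code: it assumes that the part of a URL matched by -mapfrom is simply replaced by the -mapto value. The function names map_url and is_local and the file name papers.html are made up for the example.

```shell
#!/bin/sh
# Illustration only: imitate the -mapfrom/-mapto rewriting with sed.
# Assumption: the matched prefix is replaced by the -mapto value.
# (Also assumes $HOME contains no | or & characters, which sed would misread.)
map_url() {
    printf '%s\n' "$1" |
        sed -e "s|^http://pauillac\.inria\.fr/~fpottier/|file://$HOME/public_html/|"
}

# Imitate the -rec "^file:" test: succeed only for URLs that begin with file:
is_local() {
    printf '%s\n' "$1" | grep -q '^file:'
}

# papers.html is a hypothetical document name.
mapped=$(map_url "http://pauillac.inria.fr/~fpottier/papers.html")
echo "$mapped"
if is_local "$mapped"; then
    echo "mapped to disk, so recursion would apply"
fi
```

A URL outside the site, such as one under http://www.imdb.com/, is left unchanged by map_url and fails the is_local test, which is why recursion stays within the site.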
That's it! It might seem overwhelming at first sight, but remember that once the command has been stored in a script, all that's needed is to run the script whenever you want your site verified.
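As a rough sketch of such a script, the command above can be wrapped in a shell function. The BIGBRO variable and the check_site function name are hypothetical conveniences, not part of Big Brother; the options are exactly those shown earlier, with quoting added around the values that contain $HOME or a tilde.

```shell
#!/bin/sh
# Hypothetical wrapper around the command shown above.
# BIGBRO names the program to run; setting BIGBRO=echo prints the
# arguments instead of running the check.
BIGBRO="${BIGBRO:-bigbro}"

check_site() {
    "$BIGBRO" \
        -mapfrom "^http://pauillac\.inria\.fr/~fpottier/" \
        -mapto "file://$HOME/public_html/" \
        -rec "^file:" \
        -local -remote \
        -proxy www-rocq.inria.fr:8080 \
        -noproxy "^http://.*\.inria\.fr/" \
        -timeout 600 \
        -gentle 5 \
        -oraw stdout \
        -ohtml report.html \
        -failures \
        -fragments \
        -ignore "^http://www\.imdb\.com/M/" \
        "http://pauillac.inria.fr/~fpottier/"
}

# Run the check only if the program can actually be found.
if command -v "$BIGBRO" >/dev/null 2>&1; then
    check_site
else
    echo "$BIGBRO: command not found, nothing checked" >&2
fi
```

Once saved under a name of your choice and made executable, the script can be run by hand or from a scheduler whenever the site changes.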