
Configuring local checks: -index and -localhtml

There are a few things Big Brother needs to know when checking local (file) links.

The first is the default index file name. Whenever a local link points to a directory, Big Brother looks for an index file inside it. (The index file is the file that should be displayed instead of the directory listing. As its name implies, it is supposed to describe what's in the directory, but it does not have to. It is also all right if there is no index file at all.) If recursion is on and the index file's URL matches the recursion regexp, the file is opened and checked. If you don't use the -index option, Big Brother behaves as if you had said

-index "index.html"

This setting affects local links only, because in the case of remote links, replacing the directory with its index file is done transparently by the remote server.
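The lookup described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not Big Brother's actual code; the function name resolve_local_target and its parameters are hypothetical.

```python
import os

def resolve_local_target(path, index_name="index.html"):
    """Return the file to check for a local link, or None if there is none.

    Hypothetical helper: index_name plays the role of the -index option.
    """
    if os.path.isdir(path):
        candidate = os.path.join(path, index_name)
        # A directory without an index file is acceptable: nothing to open.
        return candidate if os.path.isfile(candidate) else None
    # The link points to a regular file: check that file itself.
    return path
```

Under these assumptions, a link to a directory resolves to its index file when one exists, and a directory without an index file simply yields nothing to open.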

Additionally, Big Brother needs to be able to tell which files are HTML files. (This is necessary when recursion is on, because it would make no sense to download, say, an image file and try to analyze it.) Remote files pose no problem, because the server supplies the necessary information. Local files do, because no such information is available for them. So Big Brother needs you to supply a regular expression using the -localhtml option. When Big Brother finds a link to a local file, it determines whether it is an HTML file by matching its name against this regular expression. If the name matches, the file is assumed to be an HTML document. If you don't use the -localhtml option, Big Brother behaves as if you had typed

-localhtml "\.s?htm"

which means that a file is considered an HTML file if its name contains .htm or .shtm. Here is another example. My site contains files in English and in French, whose names end in .html.en and .html.fr, respectively. I can let Big Brother know about it by typing

-localhtml "\.html\.(en|fr)$"
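To see how the two patterns above classify file names, here is a small Python sketch. It assumes that Python's re module interprets these particular patterns the same way Big Brother does, which is plausible for such simple expressions but not guaranteed; the function name is_local_html and the sample file names are made up for illustration.

```python
import re

# The default pattern quoted above, and the English/French variant.
DEFAULT_LOCALHTML = r"\.s?htm"
EN_FR_LOCALHTML = r"\.html\.(en|fr)$"

def is_local_html(filename, pattern=DEFAULT_LOCALHTML):
    """Hypothetical helper: True if the name matches the pattern anywhere."""
    # re.search (not re.match) is used so that the pattern may match
    # in the middle of the name, i.e. the name "contains" .htm or .shtm.
    return re.search(pattern, filename) is not None

if __name__ == "__main__":
    for name in ["page.html", "page.shtml", "logo.png", "doc.html.en"]:
        print(name, is_local_html(name), is_local_html(name, EN_FR_LOCALHTML))
```

Under these assumptions, the default pattern already accepts doc.html.en, since that name contains .htm. The stricter pattern, on the other hand, accepts only names ending in .html.en or .html.fr, so a plain page.html would no longer be treated as HTML.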


François Pottier, May 5, 2004
