GeneWeb - Access restrictions
With GeneWeb, you can:
If your database is hosted on someone else's site, only the first three points concern you.
The first two points (global accesses) work only in server mode. In cgi mode, you have to use the mechanisms provided by the HTTP server you depend on.
Since the topic is access restrictions, the next section deals with robots and the ways to prevent them from causing harm. The last section describes an even cruder way of denying access: the black list.
In this case, only people who have given the correct "wizard" password have the right to make updates. Everyone else can browse normally but does not see the clickable text "update" in the personal pages.
To install a "wizard" password, proceed as follows:
When a "wizard" password is installed, the welcome page (the one with the flags) shows a field for entering passwords.
After you enter the password, the welcome page is displayed again. You can tell that you are in "wizard" mode because the texts "modify family" and "modify notes" are displayed.
While browsing, the personal pages now give you access to "update".
The other genealogical information (parents, children) remains displayed.
If there is a "friend" password, only people who have provided this password (or the "wizard" password) can see the private information.
To install the "friend" password, proceed as for the "wizard" password above. If you do it through the file "foo.gwf", the variable name is "friend_passwd".
Warning: this works only in server mode. If you are in cgi mode, you have to use the means provided by the HTTP server you depend on (the present documentation does not explain how, because it depends on the server).
You must first create an authorization file. This text file holds lines of the form "user:password". For example:
    smith:ex23zuu
    martin:2wxuz4
To install this authorization file, proceed as follows:
When you access this database, your browser pops up a window in which you must give a valid user name and its corresponding password.
In the above example, you have to enter "smith" in the "user" field and "ex23zuu" in the "password" field, or "martin" in the "user" field and "2wxuz4" in the "password" field.
If your entry is not validated, you have no access at all to the database.
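The check described above can be sketched in a few lines of Python. This is only an illustration of how "user:password" lines might be read and compared against what the visitor typed; it is not the code gwd uses, and the function names parse_auth_lines and is_authorized are hypothetical.

```python
def parse_auth_lines(text):
    """Turn "user:password" lines into a dict; blank or malformed lines are skipped."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so a password may itself contain ":".
        user, password = line.split(":", 1)
        users[user] = password
    return users


def is_authorized(users, user, password):
    """Return True only if the user exists and the password matches exactly."""
    return user in users and users[user] == password


# The two entries used as the example in this documentation.
AUTH_TEXT = "smith:ex23zuu\nmartin:2wxuz4\n"
users = parse_auth_lines(AUTH_TEXT)

print(is_authorized(users, "smith", "ex23zuu"))
print(is_authorized(users, "smith", "2wxuz4"))
```

The second call pairs a valid user name with another user's password, which is refused: each user name is accepted only with its own password.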
The databases are then accessible only through this access protocol. If, however, some database has a specific global access restriction (see previous section), that specific restriction applies instead.
Warning: as for the previous section, this works only in server mode, not cgi mode.
Here too, you have to create an authorization file. Only the site administrator can do this: if your database is hosted elsewhere, this does not concern you.
The global access authorization is set in the parameters of the "gwd" command. The option is named "-auth" and must be followed by the name of the authorization file.
Launching such robots is usually a bad idea, because the number of possible pages is almost infinite. For example, if the robot starts from a personal page, clicks on "relationship computing", then on the spouse, it computes all the relationship links between the person and his spouse; it can then click on all the details of these links, and on all the intermediate persons, and so on.
The people who launch robots often reason like this: rather than spend hours clicking through this interesting site, I am going to download all its pages and read them quietly afterwards, at home, freeing my telephone line.
Unfortunately, since the GeneWeb service is a maze with no exit, this achieves only one thing: filling the person's disk with HTML pages.
From the site owner's point of view, receiving so many requests so quickly, sometimes more than 10 per second, is generally unwelcome, because:
However, the pages that GeneWeb generates clearly tell the robot (in the protocol) not to go on exploring from them.
"Good" robots, those which index the Web pages of the whole world, respect this protocol and don't insist. But whoever launches a robot can perfectly well tell it to ignore this protocol.
Against these impolite robots, gwd provides the option "-robot_xcl". It is based on observing the rate of requests coming from the same place.
The parameters are two numbers, separated by a comma. The first one is a number "x" of accesses and the second one a number "y" of seconds. If some address makes more than "x" accesses in "y" seconds, it is automatically registered in a black list and all its future requests are refused with an appropriate message.
gwd -robot_xcl 100,150
If more than 100 connections in 150 seconds are detected from the same place, the originating address is denied access until the site owner decides to "free" it.
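The rule "more than x accesses in y seconds" can be illustrated with a small Python sketch. It assumes a sliding window per address, which is one plausible reading of the rule; how gwd actually counts is not specified here, and the class name RobotExcluder and its methods are hypothetical.

```python
from collections import defaultdict, deque


class RobotExcluder:
    """Blacklist an address making more than max_accesses in window_seconds."""

    def __init__(self, max_accesses, window_seconds):
        self.max_accesses = max_accesses
        self.window_seconds = window_seconds
        self.history = defaultdict(deque)  # address -> recent access times
        self.blacklist = set()

    def allow(self, address, now):
        """Record an access at time `now` (seconds); return False if refused."""
        if address in self.blacklist:
            return False
        times = self.history[address]
        times.append(now)
        # Forget accesses that fell out of the observation window.
        while now - times[0] >= self.window_seconds:
            times.popleft()
        if len(times) > self.max_accesses:
            self.blacklist.add(address)
            del self.history[address]
            return False
        return True

    def free_all(self):
        """Rough analogue of the site owner freeing the forbidden addresses."""
        self.blacklist.clear()


# Same shape as "-robot_xcl 100,150", with small numbers for readability.
excluder = RobotExcluder(max_accesses=3, window_seconds=10)
fast = [excluder.allow("192.0.2.1", t) for t in (0, 1, 2, 3)]
slow = [excluder.allow("192.0.2.2", t) for t in (0, 20, 40, 60)]
print(fast, slow)
```

In this sketch the fourth rapid request from the first address is refused and every later request from it stays refused until free_all is called, while the second address, whose requests are spaced out, is never blocked.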
To free the forbidden addresses, the site owner just has to delete the file named "robot" located in the directory "cnt"; this is in any case displayed in the access traces of gwd.
Build a file named "gwd.xcl" and put it in the same directory as the command "gwd".
Edit this text file and put in it the addresses (one per line) you want to refuse. You can put "*" to indicate any sequence of characters. Example:
This forbids access from addresses like "email@example.com", "firstname.lastname@example.org", "email@example.com", and so on. If you just put a line with "*", all addresses are denied access (including yours).
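To make the meaning of "*" concrete, here is a minimal Python sketch of matching an address against such patterns. It assumes that "*" is the only special character and that a pattern must match the whole address; the patterns and addresses shown are invented for illustration, and the function names are hypothetical rather than part of gwd.

```python
import re


def pattern_to_regex(pattern):
    """Compile a pattern where "*" stands for any sequence of characters."""
    # Escape everything literally, then join the pieces with ".*" for each "*".
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(parts))


def is_excluded(address, patterns):
    """Return True if the address matches at least one pattern in full."""
    return any(pattern_to_regex(p).fullmatch(address) for p in patterns)


# Hypothetical contents of a black list file, one pattern per line.
patterns = ["*.robots.example", "192.0.2.*"]

print(is_excluded("crawler.robots.example", patterns))
print(is_excluded("host.other.example", patterns))
print(is_excluded("anything.at.all", ["*"]))
```

The last call shows why a line containing only "*" is dangerous: it matches every address, the site owner's included.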