Local Network with dual OS Linux/Windows

Michel Quercia, October 1998
michel.quercia@prepas.org

Presentation

Carnot is a school in the center of Dijon with approximately 2700 students (700 in secondary school, 2000 in high school). It has a large set of computers: about 200 PC-type machines running Windows-3.11 or Windows-95 (client machines), Novell (administrative servers) and Linux 2.0.34 (servers and clients). These machines are spread throughout the school and connected through two local networks, the computer science laboratory network and the administrative network.

The network is split in two to protect the administration's computers from the computer science laboratory and to divide network traffic so that each network stays responsive.

The computer science laboratory consists of four rooms, each with about fifteen machines used for computer science lessons and experiments at all levels (high school and undergraduate). Depending on the day, one or two of these rooms are open for free access every afternoon and on Saturday morning so that students can browse the Internet, type reports or do any work requiring a computer. The laboratory currently has 74 machines.

History

At the beginning, the computer science laboratory had only PCs running MS-DOS, most of them 286, 386 and 486 machines, gradually replaced with Pentiums thanks to endowments from the rectorat and the "région Bourgogne".

In 1995-1996, in order to teach the Caml language in good conditions, we installed Linux on about twenty machines (486 and P-75) alongside Windows-3.11.

1996-1997: installation of a local network in two rooms out of four. Each of these rooms was equipped with a Linux file server (386SX-33) storing the software used by the clients under both Linux and Windows. In the middle of the school year the two rooms were connected to each other and to the only machine with Internet access, which went through an analog telephone line. This computer ran both Windows-95 and Linux, but did not yet share its Internet access with the other machines. We then put it permanently under Linux and set up a proxy server (httpd from CERN first, then apache) to allow the computers on the network to access the Internet. The analog phone line was replaced with an ISDN line in order to provide higher bandwidth.

1997-1998: connection of a third room to the network, replacement of the 386SX file servers with 486DX-33 machines and connection of the Internet machine to the administrative network. At this point about a hundred computers potentially had concurrent access to the Internet, the ISDN line being the bottleneck. The proxy logs showed 17000 downloads per day, which probably represents about 1700 documents browsed per day (one document triggers several downloads once its pictures are counted). During the year another modem was installed on the gateway to accept incoming phone calls, both for teachers wishing to transfer files between home and their personal directory at school and for remote administration of the network. Since the end of 1997: automatic servicing of the laboratory's machines at night and during holidays.

1998-1999: connection of the fourth room, complete automation of night computing, and departure from the school of the volunteer administrator, who now keeps an eye on the network from Nancy by phone.

Architecture

All clients are now equipped with both operating systems, Linux and MS-DOS/Windows-3.11 (Windows-95 for five machines); the system is chosen when booting the computer. Linux is mainly used for programming in Caml, for computer science TIPE (travaux d'initiative personnelle, personal initiative works) and by a few students during free access. The other activities, especially desktop publishing and computer algebra, are done under Windows; Internet browsing is done equally under Windows or Linux. Under Linux as under Windows, a computer stores on its local disk only the minimum software needed to boot and access the network (mainly TCP/IP software). In case of a general network failure (this has never happened so far) a computer is still usable (under Linux or Windows) but only for text processing.

All software used in the laboratory except the operating systems is stored on the Linux file servers. There is one server in each room to provide good hardware redundancy; the servers are identical (except for a few minor differences) and each of them can replace a failed server almost immediately. While testing the system we saw that a single file server can serve all the computers in the laboratory without visible performance problems (the file servers are 486DX-33; over one afternoon we measured an average load of 20% on a server, which means that the server spends 80% of its time... doing nothing).

Technically, a computer running Windows gains access to the software on the server by "connecting" the server's /home/workgroup directory as a remote disk (D:), and a computer running Linux by "mounting" the server's /usr directory in its local tree. The server accepts both connection types and exports the corresponding directories read-only to prevent a client from modifying the contents of the server's disk. To install new software we use a client computer and temporarily enable writing, subject to an access rights check, in one or more of the server's directories. After checking the installation and disabling write access, we transfer the software to the other servers so that it is available throughout the laboratory.

In addition, each server has a CD-ROM drive that is also exported to the clients, but with a password check to prevent students from using their own CDs in the laboratory (games CDs in particular). Finally, each server offers a black-and-white laser printer and a color deskjet without a password. Of the four servers only one actually has both printers attached; the three others have only the laser model and relay color jobs to the first one (a color job thus travels twice over the network, but the time lost is negligible and it simplifies the configuration of the computers).
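The print relay described above can be sketched as a small routing function. This is only an illustration: the server names and the function name are hypothetical, and the real setup is a matter of print spooler configuration on each server, not Python code.

```python
# Hypothetical name for the one server that has the color printer attached.
COLOR_SERVER = "server1"


def route_print_job(receiving_server, job_kind, color_server=COLOR_SERVER):
    """Return the list of servers a print job passes through.

    Every server prints laser jobs itself. Color jobs are printed locally
    only on the color server; the other servers relay them to it.
    """
    if job_kind == "laser":
        return [receiving_server]
    if job_kind == "color":
        if receiving_server == color_server:
            return [receiving_server]
        # The job crosses the network a second time to reach the color printer.
        return [receiving_server, color_server]
    raise ValueError(f"unknown job kind: {job_kind!r}")
```

The point of the design is that every client can be configured the same way (send all jobs to the server of its own room), at the cost of one extra hop for color jobs.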

Students have no storage rights on the computers of the network (they can use temporary directories, but these are wiped every day). The reason is that managing students' personal files would require a lot of storage and, above all, a level of security and backup operations incompatible with volunteer administration. On the other hand, interested teachers can have a personal directory on the network; they are then given an account with a password. One machine (a 386SX-16 running Linux) is devoted to the storage and backup of these personal directories, but teachers can get read-write access to their directory from any client by giving their password, whatever OS (Linux or Windows) the client is running. Since incoming phone calls were set up they can also access their directory from home (with a caller ID check done by the gateway). Students have read-only access, without a password, to the "public" subdirectory of each teacher, which lets a teacher make any file available to all students without the administrator's intervention.

Internet gateway

The gateway, also incorrectly called the "Internet machine", is actually the hub of the school's two networks (laboratory network and administrative network). It is the only machine physically connected to both networks, with two Ethernet boards, as well as to the phone line (with an ISDN board and two analog modems), so every communication between two machines on different networks, or between a machine in the school and the outside, has to cross the gateway. The gateway allows or denies the crossing depending on the caller and the addressee.

In practice, the gateway allows communication between any machine on the administrative network and one of the file servers, including the personal directories server, so that the whole school can use the software available in the laboratory. As for Internet access, only the "teacher" machines from both networks can access the Internet without restriction (through an address masquerading mechanism), whereas "ordinary" machines have to go through the apache proxy server and can only make http or ftp retrievals.
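The crossing rules above can be summarized as a decision function. The sketch below is a simplified model, not the gateway's actual configuration: the names are hypothetical, replies to allowed connections and incoming phone calls are ignored, and traffic that stays inside one network never reaches the gateway and is not modeled.

```python
from enum import Enum


class Net(Enum):
    LAB = "laboratory"
    ADMIN = "administrative"
    INTERNET = "internet"


def gateway_decision(src_net, src_is_teacher, dst_net, dst_is_file_server, protocol):
    """Decide what the gateway does with a new connection crossing it.

    Returns "forward", "masquerade", "proxy" or "deny".
    """
    if src_net is Net.INTERNET:
        # Simplification: nothing is accepted from the outside in this model.
        return "deny"
    if dst_net is Net.INTERNET:
        if src_is_teacher:
            # "Teacher" machines reach the Internet without restriction.
            return "masquerade"
        # "Ordinary" machines may only retrieve documents through the proxy.
        return "proxy" if protocol in ("http", "ftp") else "deny"
    if src_net is Net.ADMIN and dst_net is Net.LAB and dst_is_file_server:
        # The administrative network may use the laboratory's file servers.
        return "forward"
    # Everything else, in particular laboratory -> administration, is refused.
    return "deny"
```

For example, `gateway_decision(Net.LAB, False, Net.ADMIN, False, "telnet")` yields `"deny"`, which is the protection of the administration's machines that the two-network split is meant to provide.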

The purpose of these restrictions is, as far as the computer science laboratory is concerned, to protect the administration's machines against unauthorized access or sabotage and to limit what students can do on the Internet, considering that most of them are under 18. There is no automated censorship of the documents browsed; we expect students to follow the school regulations and count on the supervisors to prevent the retrieval of pornographic documents or documents dangerous to young people.

The Internet connection is provided by the rectorat of the "académie de Dijon" (Indunet network). The gateway automatically calls the rectorat when a communication has to be routed to the Internet and automatically hangs up after two minutes of inactivity or when the link goes down (which happens frequently, for reasons we do not understand). Communications with the rectorat normally go through the ISDN board at 64 Kbps, which is insufficient when traffic is high, but we cannot do better, especially since the link between the rectorat and Renater has a capacity of only 256 Kbps and the rectorat has to provide Internet access to all the schools of the academy (in practice those of Dijon and the surrounding area). In the beginning we devoted 100 MB of disk space to an apache cache in order to store the most requested documents and serve them faster, but experience showed that this cache was of little use: a document is seldom retrieved twice (the home pages of Internet search engines are retrieved frequently, but their content is always changing and the cached copy is out of date). For now, the cache is disabled. If the ISDN link fails, which is the case at present (the ISDN board died after work on the phone line), the gateway uses an analog modem at 28 Kbps. A second analog modem is devoted to incoming calls.
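The dial-on-demand behaviour described above (call when traffic has to be routed, hang up after two minutes of inactivity or when the link drops) amounts to a small state machine. The following is a minimal sketch of that logic only; the class and method names are hypothetical and the real work is done by the gateway's dialing software, not by code like this.

```python
IDLE_TIMEOUT_S = 120  # two minutes of inactivity, as stated in the text


class DialOnDemandLink:
    """Toy model of a link that is dialed on demand and dropped when idle."""

    def __init__(self, idle_timeout_s=IDLE_TIMEOUT_S):
        self.idle_timeout_s = idle_timeout_s
        self.connected = False
        self.last_activity = None
        self.calls_placed = 0

    def on_packet(self, now):
        """A packet has to be routed to the Internet at time `now` (seconds)."""
        if not self.connected:
            # No call in progress: dial the provider first.
            self.connected = True
            self.calls_placed += 1
        self.last_activity = now

    def tick(self, now, link_up=True):
        """Periodic check: hang up if the link is down or has been idle too long."""
        if not self.connected:
            return
        if not link_up or now - self.last_activity >= self.idle_timeout_s:
            self.connected = False
```

With this model, a packet at t = 0 places a call, a check at t = 60 keeps the line open, a check at t = 120 hangs up, and a packet at t = 130 places a second call.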

Automatic operations

Since November 1997 the laboratory has been working 24 hours a day. By day, during school time, the computers are used for lessons or free access, while at night and during holidays all machines run Linux to maintain and update themselves and to "work" on behalf of "Inria de Lorraine" (Loria). The trigger mechanism could not be simpler: at the end of the day the supervisor boots the computers in "maintenance mode", switches off the screens to save electricity and screen lifetime, then goes away...

Maintenance consists mainly in regenerating the DOS-Windows partition from an archive. This archive, stored on the file servers, is first transferred to the client over the network and then uncompressed and installed on the DOS partition. The same archive is used by all machines running Windows-3.11; the particulars of each machine are taken into account by modifying system files (especially system.ini) after the restoration to adapt them to the machine. In addition, to speed up the process, a local copy of the archive is stored on the Linux partition, and this copy is used for subsequent restorations until the server archive is modified.
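The local-copy optimization can be sketched as follows. The text does not say how a modified server archive is detected; this sketch assumes a comparison of file size and modification time, and the function names and paths are hypothetical.

```python
import shutil
from pathlib import Path


def _signature(path):
    """Cheap fingerprint of a file: size and modification time (whole seconds)."""
    st = path.stat()
    return (st.st_size, int(st.st_mtime))


def refresh_local_archive(server_archive, local_copy):
    """Make sure `local_copy` matches `server_archive`.

    Returns True if the archive had to be transferred again, False if the
    existing local copy could be reused for the restoration.
    """
    server_archive = Path(server_archive)
    local_copy = Path(local_copy)
    if local_copy.exists() and _signature(local_copy) == _signature(server_archive):
        return False
    # copy2 also copies the modification time, so the next comparison matches.
    shutil.copy2(server_archive, local_copy)
    return True
```

Comparing metadata rather than contents avoids reading the whole archive over the network just to decide that nothing has changed, which is the point of keeping a local copy.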

The benefit of this restoration is to guarantee a machine in perfect condition, whatever users have done to it, and virus-free, at least at 8 o'clock the next morning. If a machine malfunctions during the day, we can trigger a new restoration manually without difficulty; the operation takes less than five minutes. Moreover, if installing new software under Windows requires adding or modifying system files, all we need to do is make a new archive from the machine where the installation was done and put this archive on the file servers; the modifications will automatically be transferred to all machines on the network during the next restoration. In practice, when this situation occurs, we update the archive on a single file server and put the corresponding room under "observation" for a few days in order to detect any incompatibilities; we transfer the new archive to the other servers only after this observation period.

The few computers running Windows-95 cannot benefit from such an easy restoration because the machine-specific information is stored in a binary file (reg.dat) and we do not know how to modify that file automatically. Each of these Windows-95 machines has its own backup and is updated manually, which prevents us from extending Windows-95 to all the computers of the laboratory. Such an extension is in fact not necessary, at least for now, since all the software we use runs without problems under Windows-3.11. The look of the Windows-95 interface is not a big loss considering the ease of maintenance we gain by keeping 3.11.

The Linux partition of a machine is also restored by uncompressing an archive retrieved from a file server, but this is not done automatically because Linux's access rights mechanism gives (for now) enough protection against malfunctions caused by users. The only cases where such a restoration is necessary are hardware failure or the installation of a new computer. In such cases we do the restoration with rescue diskettes containing a small Linux system sufficient to access an archive on the network or on an external backup device (ZIP drive).

After restoring its DOS partition, each Linux machine launches a program for factoring large numbers written by Paul Zimmermann for the ECM-net project. The numbers to be factored and the number of attempts to make are stored on Loria's ftp server; the gateway fetches a new job each time a previous one is finished and distributes it to idle machines, and it reports the completion of jobs and any factors found by e-mail. The only manual intervention required, apart from booting the computers in maintenance mode, consists in filling in the job file at Loria and watching the network activity during holidays in case something happens (several times a file server or a whole room was stopped by a power cut caused by maintenance staff or an absent-minded teacher; in such a case someone has to physically go and turn the power back on, remote administration still has its limits...).
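How a job is shared among the idle machines is not detailed here. Since ECM attempts on a given number are independent of one another, one plausible scheme is to split the requested attempts evenly; the sketch below assumes exactly that, and the function name and the job format (a number plus an attempt count) are hypothetical.

```python
def split_job(number, attempts, idle_machines):
    """Split `attempts` factoring attempts on `number` among idle machines.

    Returns a dict mapping each machine to a (number, attempts) pair.
    Shares differ by at most one attempt; machines that would receive
    zero attempts are left out.
    """
    if not idle_machines:
        return {}
    base, extra = divmod(attempts, len(idle_machines))
    shares = {}
    for index, machine in enumerate(idle_machines):
        # The first `extra` machines take one additional attempt each.
        count = base + (1 if index < extra else 0)
        if count:
            shares[machine] = (number, count)
    return shares
```

For instance, ten attempts shared among three idle machines give shares of four, three and three attempts.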

Problems encountered

Since the network was installed in the first two rooms, which made the client machines almost totally dependent on the network and the file servers, that is for two years now, we have encountered a few more or less serious problems, but none of them stopped the network, even partially.

Hardware and software problems: on certain machines the Ethernet boards randomly disturb the operation of the mouse under Windows. This is probably due to a bug in the BIOS or in the Windows drivers, because the trouble ceases when we open a DOS session under Windows and does not reappear even after the DOS session is closed (this is now done automatically on those machines each time Windows is launched). Other machines, of two different models, "lose" the pointer under Word; this phenomenon is for now unexplained and unsolved.

Slow servers: the first file servers put into service were 386SX-33 machines. In "turbo" mode they managed to sustain the load of about fifteen clients provided the clients were not synchronized, but several times we found turbo mode disabled, which caused timeouts on the clients. Replacing these servers with 486DX-33 machines fixed the problem; the only 386SX still in service now stores the teachers' personal directories and is not very busy.

Bad Internet link: it causes untimely hangups. The problem is recurrent and does not seem to be related to the volume of Internet traffic; we have never observed a gateway overload, and since these hangups occur mainly in the afternoon we think the problem could lie on the provider's side (the rectorat). At the beginning these disturbances froze the ppp driver on the gateway; we have since activated a "dead link" detection mechanism: the ppp driver regularly sends echo requests to the rectorat and hangs up as soon as an echo does not come back, and the automatic dialing mechanism then triggers a new connection after a 30 s delay. This measure does not improve the quality of the link, but at least the gateway no longer freezes and re-establishes the link as soon as possible. In fact, the real problem here is the lack of technical knowledge at the school needed to diagnose the phenomenon and fix it.
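The dead-link detection can be illustrated with a small simulation. It models only the timing of the mechanism described above; the interval between echo requests is not given in the text, so the 10 s used below is an assumption, and the function name is hypothetical.

```python
REDIAL_DELAY_S = 30    # delay before redialing, as stated in the text
ECHO_INTERVAL_S = 10   # assumed value, not given in the text


def supervise_link(echo_replies, echo_interval_s=ECHO_INTERVAL_S,
                   redial_delay_s=REDIAL_DELAY_S):
    """Simulate echo-based dead-link detection.

    `echo_replies` holds one boolean per echo request: True if the reply
    came back, False if it was lost. Returns a list of (time, event)
    pairs, where event is "hangup" or "redial" and time is in seconds.
    """
    events = []
    now = 0
    for replied in echo_replies:
        now += echo_interval_s
        if not replied:
            # One missing echo is enough to declare the link dead.
            events.append((now, "hangup"))
            now += redial_delay_s
            events.append((now, "redial"))
    return events
```

With replies `[True, True, False, True]` the simulation hangs up at t = 30 s and redials at t = 60 s. The link is never left silently frozen, although, as the text notes, the quality of the link itself is unchanged.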

Administrator's error: it opened the directories exported to Windows clients to public write access on two of the three file servers that were in service at the time. The clients of those two servers were running the Maple program under Windows. For an unknown reason the program initiated writes on the servers and then ceased to work. As the third server was still intact and in read-only mode, we directed the whole laboratory to that server while restoring the two others over the network. Two hours later normal operation was back.


This article was written for the Netdays conference in Brest (October 21st, 1998). Please send comments or questions to Michel Quercia (michel.quercia@prepas.org).