This is an archive copy of the IUCr web site dating from 2008. For current content please visit https://www.iucr.org.


Documentation for IUCr WWW Mirror Servers

Some general background information on the establishment and structure of the IUCr-sponsored pan-crystallography information service may be found in the appropriate section of A Crystallographer's Guide to Internet Tools and Resources. The information in these pages supplements and expands on that article.

Accreditation

Only sites authorised by the Executive Secretary of the IUCr (execsec@iucr.org) in consultation with National Crystallographic Committees (or in some circumstances Regional Crystallographic Associations) should act as accredited mirrors of the services sponsored by the IUCr and available from its Chester master server.

Hardware requirements

A mirror server may run on any suitable platform with direct Internet access that can provide the traffic capacity envisaged for the community it is intended to serve. Sufficient disk space should be available. Currently the pan-crystallography services occupy over 740 MB of disk space; it would be prudent to expect this to grow by perhaps 100 MB per year.
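
As a rough illustration of the disk-space guidance above, a check along the following lines could be run before commissioning a mirror. It is only a sketch: the function names free_mb and enough_space are hypothetical, and the 1200 MB threshold in the usage comment is an assumed figure (roughly the current size plus a few years of growth), not a value specified by the IUCr.

```shell
#!/bin/sh
# Sketch only: helper names and the threshold are assumptions.

# free_mb DIR: print the free space on DIR's filesystem in whole megabytes.
free_mb() {
    # df -P forces one line per filesystem; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# enough_space FREE_MB REQUIRED_MB: succeed if FREE_MB >= REQUIRED_MB.
enough_space() {
    [ "$1" -ge "$2" ]
}

# Example use (not run automatically):
#   ROOTDIR=/usr/local/httpd/docs
#   enough_space "$(free_mb "$ROOTDIR")" 1200 || echo "low disk space" >&2
```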

Operating system requirements

In principle, any operating system that supports httpd server software and long file names may be used. Long file names are needed because many file names do not conform to the 8+3 DOS convention. The mode of mirroring documented below assumes that a Unix platform is in use.

Server software configuration

The web server must be configured in such a way that the directories /iucr-top, /sincris-top and /cww-top are visible at the root level of URLs served. That is, the main entry file index.html of the IUCr home page must be accessible as a URL of the form http://domain.name.of.server/iucr-top/index.html, and not at some lower level of directory hierarchy. This may be achieved by various means, such as

It is required that the server be configured to serve files with extensions .htm and .html as HTML files (i.e. as documents with MIME type text/html).

It is preferred that the server be configured so that a request for a URL of the form http://x.y.z/directory/ will serve by default the file index.html in the directory requested.
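
The layout requirement can be checked locally once files have been downloaded. The sketch below assumes that each of the three component directories contains an index.html file directly beneath the server's document root; this is stated above only for /iucr-top, so the other two are an assumption. The function name check_layout is hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes index.html exists in all three top-level directories.

# check_layout DOCROOT: report any component whose entry page is missing.
# Returns 0 if all three are present, 1 otherwise.
check_layout() {
    docroot=$1
    status=0
    for top in iucr-top sincris-top cww-top; do
        if [ ! -f "$docroot/$top/index.html" ]; then
            echo "missing: $docroot/$top/index.html" >&2
            status=1
        fi
    done
    return $status
}

# Example use (not run automatically):
#   check_layout /usr/local/httpd/docs && echo "layout looks correct"
```

A check of this kind inspects only the file system; it does not confirm that the web server maps these directories to the root level of its URLs or serves them with MIME type text/html.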

Mirroring the web contents

The web crawler software wget is the preferred software for downloading and updating the web pages distributed from Chester. A copy of the source distribution for version 1.4.3 (which is the version used at Chester) is available for download. It has been reported that there are problems with this version on some Linux platforms, and later versions may be available from GNU software archive sites.

wget should be run as often as considered necessary (in general once per day) to retrieve files from the separate pan-crystallography components. It is important that an appropriate set of options be supplied to the utility.

A possible shell script for downloading (and updating) all three major components is included below. In this script, wget transfers files through a privileged ftp account, the username and password of which are supplied to the accredited technical implementor. This username and password should be used in the download script.

#!/bin/sh

# You should CHANGE THE NEXT 5 LINES to suit your local setup
ROOTDIR=/usr/local/httpd/docs         # httpd server root directory
LOGDIR=/usr/local/logs                # directory for storing logs
NATLOGO=/usr/local/icons/iucrXX.gif   # location of customised home page logo
USER=foobar                           # privileged ftp user name
PASSWORD=foobar                       # privileged ftp account password

# Transfer the IUCr (Chester) pages
# (iucrhome.gif is rejected so that the standard logo is not downloaded)
wget -nh -nH -r -N -nr -l0 -R robots.txt -X /cgi-bin -R iucrhome.gif -np \
 "ftp://$USER:$PASSWORD@www.iucr.org/iucr-top/" -P "$ROOTDIR" > "$LOGDIR/iucr-top.log"

# Transfer the Crystallography World Wide pages
wget -nh -nH -r -N -nr -l0 -R robots.txt -X /cgi-bin -np \
 "ftp://$USER:$PASSWORD@www.iucr.org/cww-top/" -P "$ROOTDIR" > "$LOGDIR/cww-top.log"

# Transfer the SinCris pages
wget -nh -nH -r -N -nr -l0 -R robots.txt -X /cgi-bin -np \
 "ftp://$USER:$PASSWORD@www.iucr.org/sincris-top/" -P "$ROOTDIR" > "$LOGDIR/sincris-top.log"

# Copy the local customised logo on top of the standard one
cp "$NATLOGO" "$ROOTDIR/iucr-top/logos/iucrhome.gif"

Technical Note: The options listed in the above calls are
  -nh   Do not perform DNS lookup on host name
  -nH   Disable generation of host-prefixed directories
  -r    Recursively traverse directory tree
  -N    Turn on time-stamping
  -nr   Retain .listing files from FTP
  -l0   Recursively fetch to infinite depth (that's a zero after the l)
  -R    Reject files with name...
  -X    Reject directories with name...
  -np   Never fetch a parent of the root directory
  -P    Prefix of directory to which all downloaded files will be saved

Other ftp mirroring software may be used if appropriate.
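
Since the download is expected to run about once per day, it may be convenient to call the download script from a small wrapper that records each run and its exit status. The following is a sketch under stated assumptions: the function name run_mirror, the script and log paths, and the cron schedule shown in the comments are all hypothetical and should be adapted to the local setup.

```shell
#!/bin/sh
# Sketch only: names, paths and schedule are assumptions, not IUCr requirements.

# run_mirror SCRIPT LOGFILE: run the download script, append its output
# (stdout and stderr) to LOGFILE with start and finish markers, and return
# the script's exit status.
run_mirror() {
    script=$1
    logfile=$2
    rc=0
    echo "=== mirror run started: $(date) ===" >> "$logfile"
    sh "$script" >> "$logfile" 2>&1 || rc=$?
    echo "=== mirror run finished with status $rc ===" >> "$logfile"
    return "$rc"
}

# Example use (not run automatically):
#   run_mirror /usr/local/bin/iucr-mirror.sh /usr/local/logs/mirror-runs.log
#
# A crontab entry for a daily run at 03:15 might then look like:
#   15 3 * * * /usr/local/bin/iucr-mirror-daily.sh
```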

National identity

The mirror servers are intended to represent national resources, and this is emphasised through two mechanisms.

Customised logos: Each site should obtain a copy of the IUCr Home Page logo customised with its appropriate national emblem. The relevant file may be fetched from the URL http://www.iucr.org/iucr-top/logos/iucrXX.gif, where XX should be replaced by the appropriate ISO 2-letter code for the country (if the code is not known, or the relevant graphic file cannot be found, the IUCr Research & Development Officer should be contacted). This file should replace the default IUCr Home Page logo (/iucr-top/iucrhome.gif). The sample shell script shown above indicates how this might be done.

Customised Internet address: Each site will be referenced through the DNS by a name such as www.XX.iucr.org, where again XX stands for the 2-letter ISO code. Upon commissioning of a new mirror site, the IUCr Research & Development Officer should be contacted with details of the IP address of the mirror host, to allow the update of the .iucr.org zone files with the new address mapping.
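
Both national-identity names follow the same pattern of substituting the 2-letter country code for XX. A minimal sketch of that substitution is given below. The function names valid_cc, logo_url and mirror_host are hypothetical, and the code is passed through in whatever letter case is supplied, since the case expected by the Chester server is not stated here.

```shell
#!/bin/sh
# Sketch only: helper names are assumptions; letter case is left unchanged.

# valid_cc CODE: succeed if CODE is exactly two letters.
valid_cc() {
    case $1 in
        [A-Za-z][A-Za-z]) return 0 ;;
        *) return 1 ;;
    esac
}

# logo_url CODE: print the URL of the customised logo for country CODE.
logo_url() {
    valid_cc "$1" || return 1
    echo "http://www.iucr.org/iucr-top/logos/iucr$1.gif"
}

# mirror_host CODE: print the DNS name assigned to the mirror for CODE.
mirror_host() {
    valid_cc "$1" || return 1
    echo "www.$1.iucr.org"
}

# Example use (not run automatically):
#   logo_url fr       # prints http://www.iucr.org/iucr-top/logos/iucrfr.gif
#   mirror_host fr    # prints www.fr.iucr.org
```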


Updated 14 January 1999

Copyright © International Union of Crystallography

IUCr Webmaster