Merging data from a multi-detector continuous scanning powder diffraction system
Jon P. Wright, Gavin B. M. Vaughan and Andy N. Fitch,
European Synchrotron Radiation Facility, BP-220, F-38043, Grenoble Cedex, France.
E-mail: email@example.com - firstname.lastname@example.org - email@example.com ; WWW: http://www.esrf.fr/exp_facilities/ID31/contacts.html - http://www.esrf.fr/exp_facilities/ID11/handbook/staff/gavin/Gavin.html - http://www.esrf.fr/exp_facilities/ID11/handbook/staff/jon/jon.html
For a wide range of problems in powder diffraction, high angular resolution is crucial for untangling overlapped reflections. Higher resolution can always be obtained at the cost of a decreased count rate, and the scientist must balance the requirements for resolution against those of flux. The extremely high intensity and collimation of a synchrotron X-ray source provide an excellent basis for a general purpose high resolution powder diffraction instrument. At the ESRF a parallel beam geometry was selected and originally implemented using X-rays from a bending magnet source on beam line BM16, with nine analyser crystals mounted on a comb support with constant angular offsets of ~2 degrees (figure 1). That instrument has recently been moved to beam line ID31, where an undulator source gives a large enhancement in the incident X-ray flux and lower divergence in the incident beam.
Figure 1: Picture of the multianalyser stage.
In optimising the data collection rate from a high resolution instrument, one of the surprising but important factors turns out to be the time taken for the mechanical system to move. Collecting data with a step scan introduces a significant dead time while the detector arm is repositioned between measurements of adjacent data points. An alternative scheme is to drive the detector arm continuously and record accumulated counts and the detector arm's angular position repeatedly throughout the scan. This scheme is advantageous as the dead time is eliminated; however, some complications are introduced in that the data no longer arrive in a familiar format. Also, the data from the nine separate channels, and usually from a series of scans, need to be normalised and merged together to provide a convenient format for Rietveld refinement and other data analysis programs. These procedures are somewhat analogous to those in use with time of flight neutron diffractometers, where many detectors are rebinned to produce a histogram of data. We outline the main aspects of the data processing procedures implemented here at ESRF for data collected on the high-resolution powder diffractometer. Other procedures are available for data collected using area detectors, which have been described elsewhere [2].
The hardware is set up to record the state of the instrument as a function of time. The detector arm's angular position is read from the encoder mounted on the diffractometer axis, and the counts received in each of the nine detectors since the last angle reading are also recorded. The raw data produced by the instrument are a list of angle positions, times and counts, with each line in the data file representing the state of the instrument at a particular time. Thus the raw counts represent those photons which were detected during the time that the instrument moved from the previous angular position to the current one. The raw data files (in ASCII) contain a header giving the initial position of the detector arm, along with various other motors present on the beam line, in a standard ESRF SPEC format. Currently the system is triggered by a hardware clock, giving steps which are constant in time but not necessarily in angle. The angle encoder has a precision (step size) of 0.00005° and is accurate to ±0.5 arcsec in 360°, allowing a series of scans to be merged together without encountering problems due to changes in zero-shift. Offsets between the nine detector channels are constant (by mechanical construction) and are also independent of energy, since all nine analyser crystals are mounted on the same rigid support.
For very strongly diffracting samples the maximum count rate available from the scintillator detectors (~100 kHz) or the maximum angular velocity (16 deg/min) available from the motors can be rate limiting factors, although such cases are relatively rare. In practice the diffractometer is usually scanned at an angular velocity of the order of 1-10 degrees per minute, with a sampling time such that 2000 points per degree are measured (or 1 point for every 10 encoder steps).
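As a rough worked example of the numbers quoted above, the sampling interval needed to record N points per degree at an angular velocity v can be written as follows. The symbols v, N and Δt are introduced here for illustration only and do not appear in the instrument software.

```latex
\[
  \Delta t = \frac{1}{v\,N}, \qquad
  v = 10\ \mathrm{deg/min} = \tfrac{1}{6}\ \mathrm{deg/s},\quad
  N = 2000\ \mathrm{deg^{-1}}
  \;\Longrightarrow\;
  \Delta t = \frac{6}{2000}\ \mathrm{s} = 3\ \mathrm{ms}
\]
\[
  \text{angular width per sample} = \frac{1}{N} = 0.0005^{\circ}
  = 10 \times 0.00005^{\circ}\ \text{(10 encoder steps)}
\]
```

At 1 degree per minute the same sampling density corresponds to 30 ms per point.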
Rebinning the data is a straightforward computational procedure, provided it is assumed that the counts arrived uniformly in between the sampling times. In practice the experiment should always be carried out such that the sampling step size is much less than that required for the final binned histogram, which is normally the case with default software settings. Figure 2 illustrates the behaviour of the algorithm used, where the "spread over many bins" case should generally be avoided. The data which are produced are a histogram, which can potentially cause problems in some Rietveld software if a very coarse step size is used. (The peak function should be the integral over the bin width and not the value at the centre. The value at the centre is a sufficiently good approximation only when the function is linear over the bin width, which is usually the case if the step size is less than FWHM/5.)
Figure 2: Rebinning the raw data.
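The rebinning step can be sketched as below. This is a minimal illustration in Python and not the ESRF implementation: the function name `rebin_counts` and its arguments are hypothetical, and the only assumption carried over from the text is that counts arrive uniformly between successive angle readings, so each raw sample is shared among the bins it overlaps in proportion to the overlap.

```python
import numpy as np

def rebin_counts(angles, counts, bin_start, bin_width, n_bins):
    """Spread raw counts over regular 2theta bins (hypothetical helper).

    angles : n+1 encoder readings; angles[0] is the starting position
    counts : n values; counts[i] was accumulated between angles[i]
             and angles[i+1]
    Counts are assumed to arrive uniformly within each raw interval.
    """
    hist = np.zeros(n_bins)
    for lo, hi, c in zip(angles[:-1], angles[1:], counts):
        if hi <= lo:
            continue  # no forward movement; ignored in this sketch
        first = int(np.floor((lo - bin_start) / bin_width))
        last = int(np.floor((hi - bin_start) / bin_width))
        # Visit every bin the raw interval touches, clipped to the histogram
        for b in range(max(first, 0), min(last, n_bins - 1) + 1):
            b_lo = bin_start + b * bin_width
            b_hi = b_lo + bin_width
            overlap = min(hi, b_hi) - max(lo, b_lo)
            if overlap > 0:
                hist[b] += c * overlap / (hi - lo)
    return hist

# Three raw samples rebinned into two bins of width 1 starting at 0
print(rebin_counts([0.0, 0.5, 1.5, 2.0], [10, 20, 30], 0.0, 1.0, 2))
```

The same routine would be applied to the monitor counts, so that detector and monitor histograms stay on an identical grid.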
Since the nine channels are offset from each other it is necessary to rebin each channel separately, and also to rebin the incident beam monitor counts along with each channel, since the incident X-ray intensity may change significantly during the time it takes for detectors to reach equivalent places in the 2θ scan. Detector and monitor counts collected during different scans but for the same channel can simply be summed after they have been assigned to equivalent angular bins. Data points that are collected when the incident flux is below a threshold, such as during the few minutes it takes to refill the storage ring, are excluded.
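A minimal sketch of the low-flux rejection and the summation over scans is given below. The names `mask_low_flux` and `sum_scans`, and the use of a monitor count rate as the measure of incident flux, are assumptions made for illustration; the text does not specify how the threshold is defined in the actual software.

```python
import numpy as np

def mask_low_flux(det_counts, mon_counts, dt, min_monitor_rate):
    """Zero out raw samples taken while the incident flux was too low.

    Setting both detector and monitor counts to zero means a rejected
    sample contributes nothing to either histogram after rebinning.
    The rate threshold is a hypothetical parameter.
    """
    det = np.asarray(det_counts, dtype=float)
    mon = np.asarray(mon_counts, dtype=float)
    dt = np.asarray(dt, dtype=float)
    keep = mon / dt >= min_monitor_rate
    return np.where(keep, det, 0.0), np.where(keep, mon, 0.0)

def sum_scans(histograms):
    """Sum histograms of one channel from several scans (same bin grid)."""
    return np.sum(np.asarray(histograms, dtype=float), axis=0)

det, mon = mask_low_flux([5, 6, 7], [1000, 10, 1000], [1.0, 1.0, 1.0], 100.0)
print(det, mon)
print(sum_scans([[1, 2], [3, 4]]))
```

Filtering the raw samples before rebinning keeps the detector and monitor histograms consistent with each other, which matters because the signal is later formed from their ratio.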
Combining the data from the nine different channels requires some corrections to be made, apart from the simple addition of the angular offset. In the most common experimental arrangement the sample is mounted in a capillary and all the channels have equivalent geometric and absorption factors, so that only the detector efficiency needs to be taken into account. The relative efficiency of the detector channels can easily be determined from those data points where all of the nine channels have contributed and a capillary sample was used. For flat plate samples it is only possible to provide a θ-2θ geometry for one of the nine channels, meaning that appropriate corrections must be made before the data can be combined. These corrections are further complicated in the cases where the diffracted beam is clipped or obscured when a long sample length is illuminated by the X-ray beam, particularly at low angles. Capillary geometry is generally to be preferred for these reasons, as well as to avoid unwanted texture effects.
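One way the relative efficiencies could be estimated from the overlap region is sketched below. The text does not give the estimator used by the ESRF software, so the function `relative_efficiencies` and its choice of estimator (the ratio of summed detector counts to summed monitor counts, normalised to a mean of one) are assumptions. The normalisation was chosen because the "New Eff" values in the program output of figure 3 average to one.

```python
import numpy as np

def relative_efficiencies(det, mon):
    """Estimate per-channel efficiencies from bins seen by every channel.

    det, mon : arrays of shape (n_channels, n_bins), already rebinned onto
               a common 2theta grid with the channel offsets applied.
    Returns efficiencies normalised so that their mean is 1.
    """
    det = np.asarray(det, dtype=float)
    mon = np.asarray(mon, dtype=float)
    overlap = np.all(mon > 0, axis=0)  # bins visited by all channels
    ratio = det[:, overlap].sum(axis=1) / mon[:, overlap].sum(axis=1)
    return ratio / ratio.mean()

# Two channels, three bins; the last bin was only seen by channel 1
det = [[100, 200, 0], [110, 220, 50]]
mon = [[10, 10, 0], [10, 10, 10]]
print(relative_efficiencies(det, mon))
```

Summing before dividing follows the same reasoning the article gives for the merged signal: a quotient of sums is better behaved at low count rates than an average of per-bin quotients.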
Propagating the errors from the original counts to the final data file is very important for later analysis and refinement and requires an "esd" data format for most popular refinement packages. When some data points may only have been measured once by one channel but others may have been visited several times by all nine detectors the statistical weighting will be very different. A subtlety which arises in the error propagation comes from the usual approximation of taking sqrt(counts) to represent the Poisson distribution for the counting statistics. At very low count rates this approximation becomes progressively worse, failing completely when zero counts are recorded. The easiest way to avoid these pitfalls is to sum the data together as far as possible before taking the square roots, and also to apply the efficiency correction to the monitor, rather than the detector signal, since this will generally be the larger of the two. Within the actual software it is preferable to use the quotient of sums when combining detector and monitor counts from different channels, rather than the weighted sum of quotients, simply because the weights can be difficult to determine at low count rates.
In the current software the signal, $S$, is given by $S = C/M$, where $C$ is the sum over scans and channels of detected counts and $M$ is the equivalent sum of efficiency-corrected monitor counts. The error in the signal, $\langle S \rangle$, is then given by $\langle S \rangle^2 = (C + a)/M^2 + (C \langle M \rangle / M^2)^2$, where $\langle M \rangle$ is the error in the summed corrected monitor counts and $a$ is a constant that avoids taking sqrt(0) as the error in zero counts; it is set to 0.5 by default and may be user defined. A single channel has a corrected monitor value given by $m = n e$, where $n$ is the number of monitor counts and $e$ is the efficiency. The error in this value is given by $\langle m \rangle^2 = n (e^2 + n \langle e \rangle^2)$, where $\langle e \rangle$ is the error in the efficiency. Adding the errors for the nine individual monitor spectra in quadrature then provides the error in the summed corrected monitor counts, $\langle M \rangle$. The uncertainty in the detector efficiency is usually negligible in practice, provided a proper calibration is carried out.
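The expressions above can be written out for a single output bin as follows. This is a sketch under the stated definitions only; the function name `merged_signal` and its argument layout are hypothetical and are not taken from the ESRF program.

```python
import numpy as np

def merged_signal(C, n, e, e_err, a=0.5):
    """Signal S = C/M and its error for one bin (hypothetical helper).

    C     : detector counts summed over scans and channels
    n     : monitor counts for each contributing channel
    e     : efficiency of each contributing channel
    e_err : error in each efficiency
    a     : constant that keeps the error non-zero when C == 0
    """
    n = np.asarray(n, dtype=float)
    e = np.asarray(e, dtype=float)
    e_err = np.asarray(e_err, dtype=float)
    M = np.sum(n * e)                          # summed corrected monitor
    var_M = np.sum(n * (e**2 + n * e_err**2))  # <m>^2 added in quadrature
    S = C / M
    var_S = (C + a) / M**2 + (C * np.sqrt(var_M) / M**2) ** 2
    return S, np.sqrt(var_S)

# A bin with zero detected counts still receives a finite error
print(merged_signal(0, [100], [1.0], [0.0]))
```

With zero counts the error reduces to sqrt(a)/M, which shows the purpose of the constant a.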
The data analysis routines were originally developed as a collection of C programs which wrote intermediate results to files and were coordinated by Unix shell scripts. Coincident with the move of the instrument from the bending magnet source to the undulator, the software was completely rewritten in Fortran in order to cope with an increased data collection rate and exploit improved computer hardware. Sufficient computer memory is now available to retain the intermediate results without writing to file, allowing data to be processed using only a single pass through the raw data file. Figure 3 shows typical output from running the program.
High data collection rates with a fast turnover of experiments and frequent changes to the sample environment can all potentially lead to problems with the final dataset, which must be diagnosed as soon as possible. Comparing the signal from the nine channels and ensuring that all are measuring the same signal, at least within statistical expectations, may identify many problems. Similarly, the data from a series of scans may be compared to ensure that the sample has remained stable over time. This is an advantage of collecting a series of faster scans, compared to a single slow scan which must eventually be abandoned if it is later found that the sample was not in the same state at the beginning and end of the experiment.
diffract31:~% id31sum data/nac_050_18-12-2002_si111.dat 0.001 1 10 renorm
temp.res file found and read in
-30.000 < tth < 160.00 step= 0.10000E-02 npts= 190000
Processing scan 1 2 3 4 5 6 7 8 9 10
Determining detector efficiencies
Channel efficiences found from 43887 points where all detectors overlap
Efficiencies from temp.res file, the values found now are compared
Det      Offset     Old Eff     Old <E>     New Eff     New <E>
  0   8.0606637   1.0000000   0.0000000   1.0407409   0.0006292
  1   5.8858071   1.0000000   0.0000000   0.9677989   0.0006151
  2   3.9589980   1.0000000   0.0000000   0.9637898   0.0006241
  3   2.0975745   1.0000000   0.0000000   1.0696193   0.0006726
  4   0.0000000   1.0000000   0.0000000   1.0615594   0.0006832
  5  -1.9474072   1.0000000   0.0000000   0.9937005   0.0006717
  6  -3.9962779   1.0000000   0.0000000   0.9673909   0.0006772
  7  -6.0458136   1.0000000   0.0000000   0.9413822   0.0006848
  8  -8.0536742   1.0000000   0.0000000   0.9940181   0.0007245
Created temp.res file
R_exp = 5.446 with 10 pts having I/<I> less than 3, from 76116 pts obs
Reduced chi**2 for channel merge = 0.9876 from 535831 pairs of pts
0.54% differ by >3 sigma, 0.0174% by >6 sigma (ideally 0.04% and 0.0000%)
Wrote diag.mtv file, outliers at 6.0000
Time taken was 16.76/s
Figure 3: Typical output from the binning program. The "temp.res" file mentioned in the output is used to store efficiency and offset information between runs of the program.
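A consistency check between channels, of the kind summarised by the "Reduced chi**2 for channel merge" line in the program output, might look like the sketch below. The exact statistic computed by the ESRF program is not described in the text, so the pairwise definition used here, the function name `channel_merge_chi2` and the NaN convention for missing data are all assumptions.

```python
import numpy as np
from itertools import combinations

def channel_merge_chi2(S, sigma):
    """Compare every pair of channels bin by bin (hypothetical helper).

    S, sigma : arrays of shape (n_channels, n_bins) holding the normalised
               signal and its error; NaN in S marks bins without data.
    Returns the mean squared normalised difference, the number of pairs
    compared, and the fraction of pairs differing by more than 3 sigma.
    """
    S = np.asarray(S, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    z2 = []
    for i, j in combinations(range(S.shape[0]), 2):
        ok = np.isfinite(S[i]) & np.isfinite(S[j])
        var = sigma[i, ok] ** 2 + sigma[j, ok] ** 2
        z2.append((S[i, ok] - S[j, ok]) ** 2 / var)
    z2 = np.concatenate(z2)
    return z2.mean(), z2.size, np.mean(z2 > 9.0)

S = [[1.0, 2.0, np.nan], [1.0, 4.0, 3.0]]
sigma = [[1.0, 1.0, np.nan], [1.0, 1.0, 1.0]]
print(channel_merge_chi2(S, sigma))
```

A value close to one suggests the channels agree within counting statistics, while an excess of large deviations points to a problem such as sample instability.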
A graphic example of the dangers of combining data without performing checks, or indeed of carrying out only one slow scan with a single detector channel, is shown in figure 4. Such spurious peaks might provide an insurmountable challenge for indexing the diffraction pattern, but in fact correspond to thermal instability in the experiment. A hiccupping cryogenic gas stream was occasionally sending jets of warm shroud gas onto the capillary, causing brief temperature excursions of many tens of degrees Celsius. The thermal expansion of the sample was sufficient that, when one of the detectors was measuring the Bragg peak, the peak had moved to a completely different position.
Figure 4: Plots of data from individual detector channels, showing spurious data due to temperature instability. In the left diagram the peak in channel 6 is clearly displaced, but not due to an incorrect detector offset, as shown by the agreement of peak positions in the diagram on the right.
In conclusion, the data from the multi-detector continuous scans collected at ESRF are routinely found to give excellent Rietveld refinements and indexing figures of merit (M20) of the order of several hundred, and continuous scanning represents a much more efficient way to collect data than step scanning. Data with variable counting time strategies, particularly counting for longer at high angles for structural refinement problems, may easily be collected as a series of scans for later merging. The extra complications introduced by collecting data from multiple detectors and irregular step sizes can be treated in a simple way, provided appropriate statistics are applied.
Thanks to the ESRF instrument control software personnel, especially A. Beteva, J. Klora and A. Homs for the implementation of the continuous scanning control procedures.
1. J.-L. Hodeau, et al., SPIE Proceedings, 3448 (1998) 353-361.
2. A. P. Hammersley, FIT2D, http://www.esrf.fr/computing/scientific/FIT2D/
These pages are maintained by the Commission Last updated: 18 Sep 2008