# Statistical descriptors in crystallography

## Refinement

Refinement is the process of adjusting the parameters of a model to find values most nearly compatible with the observations. From an approximate set of starting parameters for a model obtained by the methods of crystal structure determination, the parameter estimates are varied to obtain a best fit between the `n` observed quantities `O`_{j} and the corresponding calculated quantities `C`_{j}. In most crystallographic work, the `O`_{j} are diffraction intensities, but non-diffraction data may also be included. Concerning correction factors to obtain structure amplitudes |`F`|^{2} or |`F`|, we refer to the discussion of the term *model*.

The most popular method in the physical sciences, and about the only one used in crystallography, is the method of least squares, which minimizes the weighted deviance `D`_{w} = **d**^{T}`W`**d**. Some statistical studies, however, suggest that least squares may not always be the best method. Tukey (1974) has asserted that chemists and physicists 'both make less of their data than they should and, too often, come to think better of their results than deserved'. While this may be true, no other methods have been shown to be convincing alternatives, except variants of least squares such as robust-resistant methods. Among the other methods described in textbooks (Kendall & Stuart, 1979; Eadie, Drijard, James, Roos & Sadoulet, 1971), maximum likelihood takes a prominent place, but it is equivalent to least squares for a *normal* distribution of errors. The popularity of least squares probably stems from the fact that the error distribution of the deviates `O`_{j} - `C`_{j} is not known (for the distribution of the net intensity see Wilson, 1980a). In the absence of this information, it is difficult to justify alternative approaches. In addition, most theoretical work assumes a normal distribution.
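As a minimal sketch of the quantity being minimized, the weighted deviance can be evaluated directly from its definition. The function name `weighted_deviance`, the numerical values and the choice of diagonal weights 1/σ² are assumptions made for illustration only; they are not taken from the text.

```python
import numpy as np

def weighted_deviance(observed, calculated, weight_matrix):
    """Return D_w = d^T W d for the deviates d = O - C."""
    d = np.asarray(observed, dtype=float) - np.asarray(calculated, dtype=float)
    W = np.asarray(weight_matrix, dtype=float)
    return float(d @ W @ d)

# Hypothetical observed and calculated quantities, for illustration only.
O = np.array([10.0, 20.0, 30.0])
C = np.array([9.0, 22.0, 30.0])
sigma = np.array([1.0, 2.0, 3.0])   # assumed standard uncertainties of the O_j
W = np.diag(1.0 / sigma**2)         # assumed diagonal weight matrix, w_j = 1 / sigma_j^2

D_w = weighted_deviance(O, C, W)
print(D_w)
```

With a diagonal `W` the expression reduces to the familiar sum of weighted squared deviates; a full matrix is handled by the same function.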

An authoritative discussion of the method of least squares is found in Prince (1982, 1985, 1989). Linear least squares is an **unbiased** estimator if the weights are independent of the observations, and if the model represents physical reality correctly for some set of values of the parameters. In particular, the **Gauss-Markov theorem** states that minimal variances of the estimates are obtained if the weight matrix `W` is chosen as the inverse of the variance-covariance matrix `V` of the joint p.d.f. of the observations. The inverse of the normal equations matrix is an unbiased estimate of the variance-covariance matrix of the model parameters **if, and only if**, `W` = `V`^{-1}. For any weighting scheme other than `W` = `V`^{-1}, the estimation of the variance-covariance matrix of the least-squares parameters should in principle be carried out using formulae given in Prince (1985, 1989) and in Rollett (1988), but this has never been tried in practice. The goodness of fit is expected to be 1·0. In crystallographic applications, the model is usually expressed by non-linear differentiable functions. The weighted deviance is minimized by iteratively linearizing the model functions with a Taylor expansion at approximate parameter values. The Gauss-Markov theorem applies then to the fit and parameter values at convergence insofar as the linearized functions are good approximations to the model functions in the vicinity of the minimum (see also Eadie *et al*., 1971). In practice, the number of parameters to be estimated is considerable, and the use of off-diagonal terms in the weight matrix `W` is cumbersome in several respects. For this reason, `W` is usually chosen as a diagonal matrix. This practice is often justified by the presumption that intensity measurements are uncorrelated. However, off-diagonal terms in `V` may well arise if non-random errors, such as absorption errors, are present or if the measurements have been systematically altered.
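The iterative linearization described above can be sketched as an undamped Gauss-Newton loop. Everything specific in this example is an assumption: the two-parameter exponential model, the function names, the synthetic noise-free data and the constant standard uncertainty used to build a diagonal `W`. The goodness of fit is computed with the usual definition, the square root of `D`_{w} divided by the number of observations minus the number of parameters. The inverse normal-equations matrix is returned as the parameter variance-covariance estimate, which according to the text is only appropriate when `W` = `V`^{-1}.

```python
import numpy as np

def model(params, x):
    """Hypothetical non-linear model: C_j = a * exp(-b * x_j)."""
    a, b = params
    return a * np.exp(-b * x)

def jacobian(params, x):
    """Partial derivatives of the model with respect to a and b."""
    a, b = params
    e = np.exp(-b * x)
    return np.column_stack([e, -a * x * e])

def gauss_newton(x, observed, W, start, max_iter=50, tol=1e-12):
    """Minimize d^T W d by repeated first-order Taylor expansion of the model."""
    params = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        d = observed - model(params, x)        # deviates O - C
        A = jacobian(params, x)                # design matrix of the linearized model
        N = A.T @ W @ A                        # normal-equations matrix
        shift = np.linalg.solve(N, A.T @ W @ d)
        params = params + shift
        if np.max(np.abs(shift)) < tol:
            break
    d = observed - model(params, x)
    A = jacobian(params, x)
    covariance = np.linalg.inv(A.T @ W @ A)    # valid estimate only if W = V^-1
    n, m = len(observed), len(params)
    gof = np.sqrt(d @ W @ d / (n - m))
    return params, covariance, gof

# Synthetic, noise-free observations generated from assumed "true" parameters.
x = np.linspace(0.0, 2.0, 8)
observed = model((5.0, 1.3), x)
W = np.eye(len(x)) / 0.05**2                   # assumed sigma = 0.05 for every observation

params, covariance, gof = gauss_newton(x, observed, W, start=(4.0, 1.0))
s_u = np.sqrt(np.diag(covariance))             # standard uncertainties of a and b
```

Because the synthetic data contain no noise, the fit recovers the generating parameters and the goodness of fit is essentially zero; it is not a realistic value. Production refinement programs typically add damping or shift limiting, which this sketch omits.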

The assumptions that the Gauss-Markov theorem is based upon are never realized in practice. In particular, the variances of the observations, whether they be derived from the spread of repeated or symmetry-equivalent observations about their average, or from Poisson statistics, are estimated from the observations themselves, by methods which are indeed a part of the model. The model is only an approximation to physical reality, and is often rather crude. The adjustable parameters may have no objective significance. Thus, harmonic and higher-order displacement formalisms only serve to parametrize the atomic p.d.f.s. The values obtained by a refinement depend on the number of terms used in the expansion, and consequently only the total estimated p.d.f.s may be physically meaningful. The true p.d.f.s they approximate are not even guaranteed to possess the higher moments appearing in the Gram-Charlier and Edgeworth expansion formalisms (Johnson & Levy, 1974).

The assumed symmetry of the crystal structure is part of the model and is in many cases higher than the observed symmetry of the diffraction intensities. These are affected by anisotropic effects such as absorption and extinction for which corrections are only approximate. The symmetry-equivalent intensities, even if corrected for these effects, thus may belong to different populations. Consequently, their symmetry equivalence should not in principle be used as a criterion for averaging. Practical considerations of computation time and data storage require however that some data reduction by way of averaging be carried out. Refinements, using a model that respects the assumed symmetry of the crystal structure, carried out by least squares on unaveraged and averaged data are only identical under certain strict conditions* rarely realized in practice. The significance of the standard uncertainties of the average values of corrected intensities may be doubtful (although a Bayesian interpretation is possible).

It is not surprising that the goodness-of-fit values of most least-squares calculations on real data are only accidentally within an acceptable range near 1·0. The uncertainties obtained for the estimates are thus of dubious significance. In fact, the results of independent determinations of the same structure may differ by much more, and hardly ever by less, than is allowed by statistical tests. The commonly applied procedure of multiplying s.u.s by the goodness-of-fit value has no statistical basis. From the point of view of the Gauss-Markov theorem and frequentist statistics, this situation can only be improved by making the model more realistic.

The Bayesian interpretation of statistics incorporates prior knowledge or beliefs in the p.d.f. of the observations. In fact, measurements in all physical science are necessarily conditioned by what we expect to find *a priori*, and are thus not independent of the model. The variance-covariance matrix of the observations reflects the author's estimation not only of the variability of repeated measurements, but also of the effects of approximate or omitted corrections which are akin to model deficiencies. The deviations from the average of symmetry-equivalent |`F`|^{2} values may serve to estimate the anisotropy of such effects, and the standard practice of averaging can thus be justified. The confidence of the crystallographer in the model and in the variance-covariance matrix of the observations is based on an examination of the deviates and may be modified during refinement. The use of restraints also fits into the Bayesian philosophy, their estimated variances being chosen by the scientist's confidence in the restrained features of the model. The error estimate of the results is represented by a standard uncertainty. Different scientists may arrive at different estimates, a degree of confidence being a subjective measure. The underlying model may be criticized and must, of course, be completely described.

\* These conditions are:

- the averaged data should consist of the weighted averages of the equivalent reflections calculated using the least-squares weights for the unaveraged data;
- the least-squares weights for the averaged data should be the sum of those before averaging;
- a weighted deviance term corresponding to the dispersion of the equivalent reflections about their weighted average should be added to the weighted deviance minimized for the averaged data. This deviance term does not alter the parameter values, but may affect the interpretation of the error estimates by altering the goodness of fit.
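The three conditions above can be checked numerically for a single set of symmetry-equivalent reflections that share one calculated value. The sketch below uses hypothetical intensities and weights, and the helper name `averaged_equivalents` is an assumption. It relies on the identity Σ w_i (O_i − C)² = (Σ w_i)(Ō − C)² + Σ w_i (O_i − Ō)², where Ō is the weighted average.

```python
import numpy as np

def averaged_equivalents(observed, weights):
    """Weighted mean, summed weight and dispersion term for one set of equivalents."""
    w_sum = weights.sum()
    mean = (weights * observed).sum() / w_sum
    dispersion = (weights * (observed - mean) ** 2).sum()
    return mean, w_sum, dispersion

# Hypothetical symmetry-equivalent observations and their least-squares weights.
O = np.array([101.0, 98.0, 104.0, 97.0])
w = np.array([0.5, 1.0, 0.25, 1.0])
C = 99.0   # calculated value, identical for all equivalents under the assumed symmetry

O_bar, w_bar, D_disp = averaged_equivalents(O, w)

D_unaveraged = (w * (O - C) ** 2).sum()
D_averaged = w_bar * (O_bar - C) ** 2 + D_disp   # averaged term plus dispersion term

# Gradients with respect to C agree, so the refined parameters are unchanged.
grad_unaveraged = -2.0 * (w * (O - C)).sum()
grad_averaged = -2.0 * w_bar * (O_bar - C)
```

The dispersion term does not depend on `C`, which is why it leaves the parameter values untouched while still contributing to the goodness of fit.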

© 1989, 1995 International Union of Crystallography

Updated 18th Sept. 1996
