# Statistical descriptors in crystallography

## Refinement on `I`, |`F`|^{2} or |`F`|?

There was strong disagreement among members of the Subcommittee over the question of whether 'observations' used in a refinement should be net integrated intensities `I`, values of |`F`|^{2} or values of |`F`|, or indeed whether it makes a difference. The critical factors in the transformation from peak and background scan intensities through net integrated intensities to |`F`|^{2} concern the application of correction terms associated with absorption, the Lorentz-polarisation factor, thermal diffuse scattering *etc*., whereas the change from |`F`|^{2} to |`F`| concerns the square-root function. The extraction of the square root is a non-linear operation that has the potential of introducing a bias proportional to the variance of |`F`|^{2} (Wilson, 1976b, 1979), with the additional problem of determining what to do for the very weak reflections where statistical fluctuations in the peak and background measurements may cause the net intensity to be negative. The widely used formula for the s.u. of |`F`|, `u`(|`F`|) = `u`(|`F`|^{2})/2|`F`|, may be appropriate for strong reflections, but must be modified for weak reflections in order to prevent `u`(|`F`|) from becoming infinite at |`F`| = 0. French & Wilson (1978) and Gonschorek (1985) have proposed methods for obtaining |`F`| and `u`(|`F`|) from |`F`|^{2} and `u`(|`F`|^{2}).
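The two effects mentioned above, the divergence of the simple s.u. formula and the bias introduced by the square root, can be illustrated numerically. The following is a minimal sketch and not a method from the cited papers: it assumes a Gaussian error on |`F`|^{2}, and the function names are hypothetical. The leading-order bias estimate, -`u`(|`F`|^{2})^{2}/(8|`F`|^{3}), is the standard second-order Taylor term for the square-root function.

```python
import math
import random


def su_F_simple(F2, u_F2):
    """Return (|F|, u(|F|)) from |F|^2 and u(|F|^2) by first-order propagation.

    Implements u(|F|) = u(|F|^2) / (2|F|). The formula diverges as
    |F|^2 -> 0, so non-positive inputs are rejected rather than guessed at.
    """
    if F2 <= 0.0:
        raise ValueError("simple propagation is undefined for |F|^2 <= 0")
    F = math.sqrt(F2)
    return F, u_F2 / (2.0 * F)


def sqrt_bias_second_order(F2, u_F2):
    """Leading-order bias of sqrt(x) when x has mean F2 and s.u. u_F2."""
    return -(u_F2 ** 2) / (8.0 * F2 ** 1.5)


def sqrt_bias_monte_carlo(F2, u_F2, n=200_000, seed=0):
    """Empirical bias of the square root over Gaussian draws (an assumption).

    Negative draws are skipped; for a strong reflection they essentially
    never occur, which is the only regime this sketch is meant for.
    """
    rng = random.Random(seed)
    total, kept = 0.0, 0
    for _ in range(n):
        x = rng.gauss(F2, u_F2)
        if x > 0.0:
            total += math.sqrt(x)
            kept += 1
    return total / kept - math.sqrt(F2)
```

For a strong reflection the simulated bias is small and negative and agrees with the second-order estimate. For weak reflections neither the simple s.u. formula nor this estimate applies, which is the situation the methods of French & Wilson (1978) and Gonschorek (1985) address.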

The partial derivative of the calculated quantity `C`_{j} = |`F`|^{n} = (`A`^{2} + `B`^{2})^{n/2} with respect to the variable `v`_{r} is

∂`C`_{j}/∂`v`_{r} = `n`|`F`|^{n - 2}{`A`(∂`A`/∂`v`_{r}) + `B`(∂`B`/∂`v`_{r})}.     (18)
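Equation (18) can be checked against a finite-difference derivative. The sketch below is illustrative only: the one-parameter model `A_of`, `B_of` is a hypothetical toy chosen for the check, not a structure-factor expression from the report.

```python
import math


def C(A, B, n):
    """Calculated quantity C = |F|^n = (A^2 + B^2)^(n/2)."""
    return math.hypot(A, B) ** n


def dC_dv(A, B, dA_dv, dB_dv, n):
    """Analytic derivative from equation (18).

    For n < 2 and A = B = 0, Python raises ZeroDivisionError, mirroring
    the undefined derivative at a zero calculated structure factor.
    """
    F = math.hypot(A, B)
    return n * F ** (n - 2) * (A * dA_dv + B * dB_dv)


# Hypothetical one-parameter toy model, used only for this check.
def A_of(v):
    return 3.0 * math.cos(v)


def B_of(v):
    return 1.0 + 2.0 * math.sin(v)


def dC_dv_numeric(v, n, h=1e-6):
    """Central finite difference of C along the toy model."""
    upper = C(A_of(v + h), B_of(v + h), n)
    lower = C(A_of(v - h), B_of(v - h), n)
    return (upper - lower) / (2.0 * h)


def dC_dv_analytic(v, n):
    """Equation (18) evaluated with the toy model's exact derivatives."""
    return dC_dv(A_of(v), B_of(v), -3.0 * math.sin(v), 2.0 * math.cos(v), n)
```

Comparing `dC_dv_numeric(v, n)` with `dC_dv_analytic(v, n)` for `n` = 1 (refinement on |`F`|) and `n` = 2 (refinement on |`F`|^{2}) should give agreement to within the finite-difference error.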

If the calculated structure factor is zero, i.e. `A` = `B` = 0, ∂`C`_{j}/∂`v`_{r} = 0 for `n` ≥ 2, and is undefined for `n` < 2. The contributions of the `j`th observation to the normal-equations matrix and vector are `w`_{j}(∂`C`_{j}/∂`v`_{r})(∂`C`_{j}/∂`v`_{s}) and `w`_{j}(`O`_{j} - `C`_{j})(∂`C`_{j}/∂`v`_{r}), respectively. If the weight is chosen according to `w`(|`F`_{j}|) = 4|`F`_{j}|^{2}`w`(|`F`_{j}|^{2}), then the contribution to the matrix is exactly the same, and the contribution to the vector nearly the same, for refinements on |`F`| and |`F`|^{2}. In many cases, omission of weak reflections has a negligible effect on the results (see recommendation 6), and the two kinds of refinement are then nearly identical. For this reason, the |`F`| versus |`F`|^{2} controversy is often considered to be irrelevant.
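The equivalence of the two refinements under this weighting can be verified for a single reflection. The sketch below rests on two assumptions that are not stated explicitly in the text: the |`F`_{j}| in the weight relation is taken to be the calculated value, and the derivatives of |`F`|^{2} are obtained from those of |`F`| by the chain rule. Function and argument names are hypothetical.

```python
import math


def normal_contributions(obs, calc, dcalc, weight):
    """Contribution of one observation to the normal equations.

    matrix[r][s] = w * dC/dv_r * dC/dv_s
    vector[r]    = w * (O - C) * dC/dv_r
    """
    matrix = [[weight * dr * ds for ds in dcalc] for dr in dcalc]
    vector = [weight * (obs - calc) * dr for dr in dcalc]
    return matrix, vector


def compare_F_and_F2(F_obs, F_calc, dF_dv, w_F2):
    """Return the contributions for refinement on |F| and on |F|^2."""
    # Refinement on |F|^2: d|F|^2/dv = 2|F| d|F|/dv (chain rule).
    dF2_dv = [2.0 * F_calc * d for d in dF_dv]
    on_F2 = normal_contributions(F_obs ** 2, F_calc ** 2, dF2_dv, w_F2)

    # Refinement on |F| with w(|F|) = 4|F|^2 w(|F|^2), |F| taken as F_calc
    # (an assumption of this sketch).
    w_F = 4.0 * F_calc ** 2 * w_F2
    on_F = normal_contributions(F_obs, F_calc, dF_dv, w_F)
    return on_F, on_F2
```

With these choices the matrix contributions agree to rounding error, while the vector contributions differ by the factor (|`F`_{o}| + |`F`_{c}|)/(2|`F`_{c}|), which is close to 1 whenever the observed and calculated values are close.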

**The arguments in favor of refinement on |`F`|** are based on a mathematical analysis by Prince & Nicholson (1985). They observe that different reflections have different leverage, which is a quantity that measures the influence of an individual reflection on the fit. It is proportional to the contributions to the matrix and vector described above. Because ∂|`F`|^{2}/∂`v`_{r} is small if |`F`|^{2} is small, weak reflections have little leverage in refinements on |`F`|^{2}, even if some of the derivatives ∂|`F`|/∂`v`_{r} are substantial. Including them in a refinement on |`F`|^{2} can do no harm, but no good either. In a refinement on |`F`|, care must be taken to give non-zero weights to the weak reflections, as mentioned before. Regarding the discontinuity of the partial derivatives of |`F`_{c}| at `F`_{c} = 0, Prince & Nicholson (1985) argue that the practice of assigning the phase of `F`_{c} to |`F`_{o}| in effect makes the 'observation' a Bayesian prior estimate of `F` (not |`F`|), and that the equivalent value for an unobserved reflection is `F` = 0 with a finite variance based on the threshold of observability. The partial derivatives of `F` with respect to the model parameters are continuous everywhere. These reflections, even though weak, may be sensitive to some parameters and do not lose their leverage. Their inclusion can improve the precision of parameter estimates.

**The arguments in favor of refinement on |`F`|^{2}** are based on the bias introduced by extracting the square root (Wilson, 1976b, 1979), and on the undesirable discontinuity of the partial derivatives of |`F`| at `F` = 0. In principle, refinements should be on quantities as near as possible to the actual observations, and all non-linearities should be part of the model. It is improper to exclude the weak reflections, where the bias may be a substantial fraction of the |`F`| value, by introducing an arbitrary cut-off at a minimum net intensity, be it zero or some positive value, because this results in a systematic inclusion of reflections with positive fluctuations and exclusion of reflections with negative fluctuations, and thus in a biased data set. Bias is therefore avoided by using **all** reflections in a refinement on |`F`|^{2}. Hirshfeld & Rabinovich (1973) recommend inclusion of negative net intensities with their negative values. The discontinuity of the partial derivatives of |`F`| is akin to the crystallographic phase problem. The model contains little information on the phase of `F` if the calculated structure factor is small, and none at all at `F`_{c} = 0. For these reflections, we thus do not have a good Bayesian prior estimate of `F`, hence their variances should be large and their small leverage in a refinement on |`F`|^{2} appears to be justified. The leverage is not given by the **observed**, but by the **calculated** value |`F`_{c}|^{2}. Thus, a weak reflection may have a non-negligible effect if |`F`_{c}|^{2} is considerably larger than |`F`_{o}|^{2}. If, on the other hand, |`F`_{c}|^{2} = 0 while |`F`_{o}|^{2} is larger than maybe 5 s.u.s, the structure presents an unsolved phase problem. Apart from a possible bias when refining on |`F`|, the main difference between refinements on |`F`| and |`F`|^{2} is equivalent to an up-weighting of weak reflections if `u`(|`F`|) is kept finite for |`F`| = 0. The same effect could be achieved by explicitly up-weighting the weak reflections. In some structures, weak reflections are of prime importance. Crystallographers working on such structures are urged to give serious attention to these opposing arguments (see recommendation 6).
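The selection bias caused by an intensity cut-off, as argued in the paragraph above, can be made concrete with a small simulation. This is a sketch under stated assumptions: measurement errors are taken to be Gaussian, repeated measurements of a single weak reflection stand in for a population of weak reflections, and the function name and parameters are hypothetical.

```python
import random


def mean_net_intensity(true_I, su, cutoff=None, n=100_000, seed=0):
    """Mean of simulated net intensities, optionally after a cut-off.

    Draws n Gaussian measurements (an assumption) with mean true_I and
    standard uncertainty su, and keeps only those above `cutoff` when a
    cut-off is given. Negative values are kept when cutoff is None.
    """
    rng = random.Random(seed)
    draws = (rng.gauss(true_I, su) for _ in range(n))
    kept = [x for x in draws if cutoff is None or x > cutoff]
    return sum(kept) / len(kept)


# Weak reflection: true net intensity of half an s.u.
mean_all = mean_net_intensity(0.5, 1.0)               # negatives retained
mean_cut = mean_net_intensity(0.5, 1.0, cutoff=0.0)   # negatives discarded
```

Retaining the negative values leaves the mean close to the true intensity, whereas discarding them shifts the mean of the surviving measurements upward, which is the bias the cut-off introduces.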

**The arguments in favor of refinement on `I`** are an extension of those for refinement on |`F`|^{2}. It has become customary to regard the relationships between peak and background measurements, net intensity and |`F`|^{2} as being linear transformations with constant factors. In this way, the measurement error (or the confidence in *a priori* estimates) of crystal dimensions, polarization ratio, changes in reference intensity for radiation damaged or decomposing crystals amongst other systematic effects are not taken into account. The transformation from peak and background measurements to net intensity values leads to the paradox of 'negative as-measured' intensities. As systematic error is a major source of trouble, some members of the Subcommittee propose investigation of methods permitting refinement of crystal structures on **all** observations, *i.e.* measured intensities as well as crystal dimensions, polarization ratio, crystal decomposition curves, *etc*. They propose inclusion in the model of additional, refineable parameters (*e.g.* crystal dimensions) and introduction of the corresponding observations as restraints. The advantages are clearly an improved modelling of the structure and more realistic error estimates on atomic parameters.

© 1989, 1995 International Union of Crystallography

Updated 23rd Sept. 1996
