S0275

PROTEIN PRECISION RE-EXAMINED: LUZZATI PLOTS DO NOT ESTIMATE FINAL ERRORS. D W J Cruickshank, Chemistry Department, UMIST, Manchester, M60 1QD, UK

The misuse of Luzzati plots of the residual $R$ versus $\sin\theta/\lambda$ to estimate final coordinate errors has stimulated a re-examination of protein precision. Luzzati (1952, Acta Cryst.) gave a theory for uncompleted refinements which estimated the r.m.s. shifts still needed to reach $R = 0$. His theory assumed no errors in $F_{\mathrm{obs}}$, an $F_{\mathrm{calc}}$ model that was perfect apart from coordinate errors, and a Gaussian error distribution that was the same for all atoms. These assumptions are invalid for proteins. Quite apart from the dependence on atomic number, it is well established that errors depend very strongly on atomic $B$ values. Nor do Luzzati plots provide an upper limit for $\langle\Delta r\rangle$.

Restrained refinement will be examined theoretically. Applied to the simplest protein model, two like atoms in one dimension, restrained refinement determines a length which is the weighted mean of the diffraction-only length and the geometric-dictionary length.
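
The weighted mean for the two-atom model can be sketched numerically. The abstract does not state the weights, so this sketch assumes the usual least-squares choice of inverse-variance weights; the function name and the numerical lengths and e.s.d.s are hypothetical and chosen only for illustration.

```python
def restrained_length(l_diff, sigma_diff, l_dict, sigma_dict):
    """Combine two estimates of one length by a weighted mean.

    Assumption (not from the abstract): each weight is 1/sigma**2,
    the standard least-squares choice.
    """
    w_diff = 1.0 / sigma_diff**2   # weight of the diffraction-only length
    w_dict = 1.0 / sigma_dict**2   # weight of the geometric-dictionary length
    length = (w_diff * l_diff + w_dict * l_dict) / (w_diff + w_dict)
    sigma = (w_diff + w_dict) ** -0.5  # e.s.d. of the combined length
    return length, sigma


# Hypothetical values in Angstrom: a loose diffraction-only length
# and a tighter dictionary length.
length, sigma = restrained_length(1.56, 0.04, 1.53, 0.02)
print(f"{length:.3f} {sigma:.3f}")  # prints "1.536 0.018"
```

Under this assumption the restrained length is pulled towards whichever source has the smaller e.s.d., and the combined e.s.d. is smaller than either input.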

By extending the order-of-magnitude error formula for small molecules given by Cruickshank (1960, Acta Cryst.), the e.s.d. for protein atom $i$ with $B = B_i$ is, very roughly,

$$\sigma(x_i) = k\,(N_i/p)^{1/2}\,[g(B_i)/g(B_w)]\,C^{-1/3}\,d_{\min}\,R,$$

where $k$ is about 1.0, $N_i = \sum_j Z_j^2 / Z_i^2$, $p = N_{\mathrm{obs}} - N_{\mathrm{params}}$, [provisionally] $g(B) \approx 1 + 0.04B + 0.003B^2$, $B_w$ is the Wilson $B$ for the structure, and $C$ is the fractional completeness of the data to $d_{\min}$. For example, if $N_i = 1000$, $p = 15000 - 4000$, $B_i = B_w$, $C = 0.9$, $d_{\min} = 1.4$ Å, and $R = 0.15$, then $\sigma(x_i) = 0.07$ Å. This approach reveals the basic statistical flaws in the use of Luzzati plots.
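
The worked example can be checked with a short script. This is a minimal sketch of the rough formula as stated, using the provisional g(B); the function and argument names are illustrative, and the Wilson B of 20 Å² used below is an assumed value (the abstract only specifies Bi = Bw, for which the ratio g(Bi)/g(Bw) is 1).

```python
def g(b):
    """Provisional B dependence quoted in the abstract."""
    return 1.0 + 0.04 * b + 0.003 * b**2


def sigma_x(n_i, n_obs, n_params, b_i, b_wilson, completeness,
            d_min, r_factor, k=1.0):
    """Very rough coordinate e.s.d. (in the units of d_min) for atom i."""
    p = n_obs - n_params  # observations minus parameters
    if p <= 0:
        raise ValueError("need more observations than parameters")
    return (k * (n_i / p) ** 0.5 * (g(b_i) / g(b_wilson))
            * completeness ** (-1.0 / 3.0) * d_min * r_factor)


# Values from the abstract's example; b_wilson = 20.0 is an assumption.
same_b = sigma_x(1000, 15000, 4000, 20.0, 20.0, 0.9, 1.4, 0.15)
print(f"{same_b:.3f}")  # prints "0.066", i.e. about 0.07 Angstrom

# Same structure, but an atom with twice the assumed Wilson B.
high_b = sigma_x(1000, 15000, 4000, 40.0, 20.0, 0.9, 1.4, 0.15)
print(f"{high_b:.3f}")  # prints "0.162"
```

With these assumed B values, doubling the atomic B raises the estimated e.s.d. by a factor of about 2.5, which illustrates why a single error figure for all atoms is misleading.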

Some authors have been able to invert the full LS matrix, and so obtain proper estimates of e.s.d.'s. Even when this is not possible, determined efforts should be made to use the information in a partial LS matrix.
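
The gain from inverting the full matrix can be illustrated with a toy two-parameter case. This sketch is not taken from the abstract: it assumes the standard least-squares result that the variance of parameter i is the i-th diagonal element of the inverse normal matrix times a variance factor, and compares it with a diagonal-only approximation. The function name and the matrix elements are hypothetical.

```python
def esd_full_vs_diagonal(a11, a12, a22, variance_factor=1.0):
    """E.s.d.s for a 2x2 normal matrix [[a11, a12], [a12, a22]].

    Returns (full, diagonal): e.s.d.s from the full inverse and from
    ignoring the off-diagonal term. variance_factor stands in for the
    usual weighted residual sum divided by (observations - parameters).
    """
    det = a11 * a22 - a12**2
    if a11 <= 0 or det <= 0:
        raise ValueError("normal matrix must be positive definite")
    full = ((variance_factor * a22 / det) ** 0.5,
            (variance_factor * a11 / det) ** 0.5)
    diagonal = ((variance_factor / a11) ** 0.5,
                (variance_factor / a22) ** 0.5)
    return full, diagonal


# Hypothetical, strongly correlated pair of parameters.
full, diagonal = esd_full_vs_diagonal(4.0, 3.0, 4.0)
print(f"{full[0]:.3f} {diagonal[0]:.3f}")  # prints "0.756 0.500"
```

In this toy case the diagonal-only estimate is too small by about a third, since neglecting correlation between parameters can only lower the apparent e.s.d.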