E1186

A NEW APPROACH TO MACROMOLECULAR REFINEMENT VIA BAYES' LAW AND BOLTZMANN'S DISTRIBUTION. Rob Grothe, Washington University, Dept. of Electrical Engineering, St. Louis MO, USA

The refinement of a macromolecular structure from crystal diffraction data can be formulated as follows: find the most likely mini-ensemble of structures given molecular energetics and observed data. The posterior (conditional) probability to be maximized can be expressed via Bayes' law as the product of two distributions, the prior and the data likelihood. The prior assigns a probability to each mini-ensemble, viewed as a single state, via Boltzmann's distribution of states for a canonical ensemble at ambient temperature; the mini-ensemble energy is the mean of the energy values computed for the individual structures under an energetic model. For a given mini-ensemble, a virtual asymmetric unit is constructed by averaging the structures, and a virtual crystal by symmetric replication. The data likelihood is the probability that the measurement of X-rays (modeled by a random process) diffracted by this crystal results in the observed data.
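
One way to write this formulation down is sketched below. The notation is an assumption introduced here for illustration and does not appear in the abstract: S = {s_1, ..., s_n} is a mini-ensemble of n structures, D the observed data, E the model energy of a single structure, k_B Boltzmann's constant, and T the ambient temperature.

```latex
% Bayes' law: posterior is proportional to likelihood times prior
P(S \mid D) \;\propto\; P(D \mid S)\, P(S)

% Boltzmann prior on the mini-ensemble, treated as a single state
P(S) \;\propto\; \exp\!\left(-\frac{\bar{E}(S)}{k_B T}\right),
\qquad
\bar{E}(S) \;=\; \frac{1}{n}\sum_{i=1}^{n} E(s_i)

% Refinement target: the most likely mini-ensemble
\hat{S} \;=\; \arg\max_{S}\; P(D \mid S)\,\exp\!\left(-\frac{\bar{E}(S)}{k_B T}\right)
```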

The Boltzmann relationship converts energy into probability, the common currency through which two disparate information sources, molecular energetics and diffraction data, can be unified via Bayes' law. Viewed in reverse, the relationship yields the molecular dynamics interpretation: the most likely ensemble minimizes the model energy function derived from the model probability. Because energy is proportional to the negative log of probability, the posterior energy is the sum of two terms corresponding to the two factors in the posterior probability. X-PLOR, a widely used refinement package, minimizes the sum of a model molecular energy and a term penalizing disagreement between structure and data; the user chooses the form of that term along with weighting factors. In this new approach, a model for the diffraction data is chosen, and the data-dependent term is derived from it.
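
The additive structure of the posterior energy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the author's implementation: the function names are hypothetical, energies are taken to be in kcal/mol (so the molar Boltzmann constant is used), and the data log-likelihood is passed in as a plain number rather than derived from a diffraction model.

```python
import math

# Molar Boltzmann (gas) constant in kcal/(mol*K); the unit choice is an assumption.
K_B = 0.0019872041


def mini_ensemble_energy(energies):
    """Mean of the per-structure model energies of a mini-ensemble."""
    return sum(energies) / len(energies)


def posterior_energy(energies, log_likelihood, temperature=298.0):
    """Posterior energy = molecular term + data term.

    Defined as -kT * log(unnormalized posterior), so the constant
    normalization factor of the posterior is dropped.
    """
    kT = K_B * temperature
    molecular_term = mini_ensemble_energy(energies)
    data_term = -kT * log_likelihood  # derived from the data model, not hand-weighted
    return molecular_term + data_term


def unnormalized_posterior(energies, log_likelihood, temperature=298.0):
    """Boltzmann prior times data likelihood, without normalization."""
    kT = K_B * temperature
    return math.exp(-mini_ensemble_energy(energies) / kT) * math.exp(log_likelihood)


# Two hypothetical candidate mini-ensembles: (per-structure energies, log-likelihood).
candidates = {
    "A": ([-10.0, -12.0], -5.0),
    "B": ([-11.0, -11.5], -3.0),
}
best_by_energy = min(candidates, key=lambda k: posterior_energy(*candidates[k]))
best_by_probability = max(candidates, key=lambda k: unnormalized_posterior(*candidates[k]))
```

Because the posterior energy is a monotonically decreasing function of the posterior probability, `best_by_energy` and `best_by_probability` select the same candidate; minimizing one is equivalent to maximizing the other.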

A refinement algorithm, using jump-diffusion random sampling, has been implemented on a 16k-processor parallel machine. Preliminary results have been obtained for refining BPTI, using diffraction data from the Protein Data Bank and the published structure as the initial state.