Structures may come across by chance,
thanks to Monte Carlo ;-)
The RMC (Reverse Monte Carlo) code built into ESPOIR was strongly inspired by the RMCA program written by Malcolm Howe, using version 2.11 for glass structure modelling (last altered on 26th May 1992). However, please note that a more recent version (RMCA 3.04) is available. See R.L. McGreevy, Nucl. Instr. and Meth. in Phys. Res. A354 (1995) 1-16, for one of the most recent reviews about RMC, and visit the RMC homepage at Studsvik. That paper states that "RMC modelling will certainly not enable ab initio crystal structure models to be obtained starting from random initial structure" (p. 10, § 4.2). On the contrary, ESPOIR shows that this is possible with an appropriate strategy, though the limit seems to be near 30 independent atoms. This limit can be overcome by using the Molecular Replacement (MR) method, also included in ESPOIR (version 2).
A more thorough introduction will be published in the next CPD Newsletter (you may download a preprint as a MS Word97 document compressed by Winzip).
ESPOIR in French = HOPE in English (something not to lose when dealing with structure determination). However, do not place too much hope in ESPOIR : you could be disappointed.
Running the program
This manual explains how ESPOIR can be used. Two files are necessary :
name.hkl containing the "|Fobs|", from any origin (powder data or single crystal)
name.dat containing the parameters defining the model (scratch or molecule fragment) and piloting the Monte Carlo process (it is like trying to pilot a bottle in the ocean : you can only hope to reach your expected destination, and ESPOIR may guide you somewhere you did not wish to go).
The name.dat file can be partly or fully prepared by answering the questions of PRESPOIR, in interactive mode.
The PC version runs by clicking on espoir.exe, which opens a DOS box asking for the generic name of the two files above (do not give any extension, only the name). Then you will have to wait, possibly a lot (between a few seconds and several days, depending on the complexity of your problem).
Source code and latest modifications
You may consider building a version for your own computer if you possess a FORTRAN compiler. The source code is included in the package. Moreover, the GNU licence allows you to hack the code at your convenience.
ESPOIR 0.9
The main modifications in ESPOIR 0.9 from the original RMCA code consisted
in adding the |F| calculations for working on crystalline compounds instead
of glasses, and the possibility to permute atoms.
MODIFICATIONS IN ESPOIR 1.0 (still available)
The main modification since the first ESPOIR 0.9 version, still available, is the possibility to work with any space group, not only P1. Moreover, the constraints on distances and coordination numbers were found useless and were removed (with or without those constraints, the retained atom moves are almost the same), which saves computer time. Some simulated annealing was introduced, progressively reducing the distances the atoms can move. The possibility to accept events that do not improve the fit was introduced, and is annealed too. When facing obviously false minima, with the structure model frozen at a high R factor, the calculation automatically restarts from a different random configuration, according to parameters selected by the user. Many simple and understandable parameters were added that allow the user to control more closely the way ESPOIR works. The |F| calculation was optimized by keeping everything in memory and changing only the parts of the arrays concerned by the particular moving atom or the permuting pair of atoms.
MODIFICATIONS IN ESPOIR 2.0
The two main modifications are :
- A fit is possible on a pseudo powder pattern regenerated from the extracted "|Fobs|". This allows the whole set of extracted structure factors to be kept, and speed is enhanced compared to a fit on the true pattern. Moreover, the step is variable : it is indexed on the FWHM. An output in .prf format readable by DMPLOT is built (however, it will be fully operational only if the step is constant, of course : use U=V=0 in the Caglioti law).
- Molecular replacement method by rotation + translation of (only) one fragment (molecule location). Test files for molecular replacement are : pyrene (from X-ray data) ; 1-methylfluorene (without C14) ; cimetidine (from synchrotron data) ; SDPDRR sample II (tetracycline hydrochloride, test successful without considering the Cl atom) ; SDPDRR sample I (cobalt amine, test successful by just searching for a CoN5O octahedron).
The minor modifications are :
- Option of minimal interatomic distances reinserted.
- Annealing law revisited.
- Special positions considered (not all possibilities, but you have the source code, don't you ?).
Version 2.01 allows some data affected by twinning by merohedry to be treated. See the parameter ns=2.
The possibility to cope with several fragments simultaneously, and with torsion angles, will be introduced in version 3, maybe.
This ESPOIR 2.0 package, espoir2.zip (600 KB), contains :
espoir.exe : ESPOIR 2.0, executable for Win95/98/NT
espoir.f : Fortran code
espoir.ico : Icon for ESPOIR
prespoir.exe : Executable for preparing the .dat file
prespoir.f : Fortran code
prespoir.ico : Icon for PRESPOIR
random.exe : Executable for random positions generation
espoir.html : This manual in HTML language
espoir.gif : Logo
al2o3.gif : Figure showing a pattern "regenerated" from the "|Fobs|"
license.html : The GNU license applying to this software, in HTML language
together with the example files (in principle, 3 files are given for each example : name.dat containing the instructions, name.hkl containing the reflections, and name.imp containing the results) :
The codes used below mean :
(MR) = Molecular replacement ;
(S) = from scratch (i.e. random starting model) ;
(P) = from regenerated powder pattern ;
(Fo) = from raw extracted "|Fobs|" ;
(Fc) = from calculated exact |F| ;
(GP) = guessed special positions ;
(DC) = distance constraint ;
(GO) = guessed occupation number ;
(CC) = Cartesian coordinates ;
(FC) = fractional coordinates ;
(PR) = restarted from a previous result ;
(EX) = experimental data ;
(CA) = calculated data : from ICSD or CSD atomic coordinates, a theoretical
powder pattern is built with the U,V,W of PbSO4 (Rietveld Round Robin),
then the "|Fobs|" are extracted from this pattern by FULLPROF ;
(E1) = with ESPOIR version 1 (beware of data incompatibilities with version 2).
Some .dat files contain unused data at the end : frequently the exact coordinates that you should find.
An example delivered with this version (cuvo3c.dat) is detailed below :
Test on CuVO3                               : text for this run
4.9646 5.4023 4.9154 90.32 119.13 63.93     : cell parameters
P -1                                        : SG : Space group
1.54056 4 5 3 1                             : wa, kxr, na, nt, ns
0.09 -0.03 0.04 3                           : U, V, W, Nstep (optional line if ns = 1)
cu v o                                      : atom type names (nt names, no capital letters)
1 1 3                                       : ni : numbers of atoms in each type (nt values)
1.1 0 0 0 1                                 : bov, nocc, ncon, nspe, ipri
1.0 0.008                                   : sigma, reject
3. 3. 3.                                    : delta : max moves for each atom type
2.                                          : nanneal : allowing to reduce delta according to a defined law (see below)
5000                                        : n1 : print after n1 moves
100000 20000                                : n2 and n3 : end at n2 moves; save at n3 moves
20000 0.25 2                                : nstart, rmax, ichi
10                                          : n4 : try one permutation after n4 moves or one translation after n4 rotations (MR)
10                                          : n5 : number of random different starting models
-0.0760 0.3091 -0.5059 1.                   : na lines of x,y,z coordinates and occupation number
-0.6436 -0.8249 -0.7986 1.                    to be given only if n5 = 0
0.3537 0.6792 0.6986 1.                       (given here just to show, since n5=10)
0.9300 -0.1930 0.9905 1.
-0.1040 -0.0487 -0.3215 1.                  : care to make a return here
Parameters definitions
text (format 20A4) A title for the run.
cell The cell parameters a, b, c, alpha, beta, gamma (free format, 6 real).
SG
The space group to be interpreted by Prof. Burzlaff's subroutine.
In principle, any SG (with inversion center at the origin) should work.
Examples (use blanks appropriately) :
P 1
P-1
P 21/C
P 21/N
C C
C 2/M
P MMM
P N M A
I M 2 M
F M M M
C M C M
C 21 21 21
I 41/A
I 4/M M M
...
P 63/M M M
R -3 C
P 3 2 1
...
I M 3
P -4 3 M
I M 3 M
F D 3 M
etc
Verify the printed symmetry operators (24 at most, since the atoms added by the F and I Bravais lattices have no influence on intensity, and the inversion center is treated separately).
wa, kxr, na, nt, ns
One real and 4 integers (free format)
wa = Wavelength ; only the following ones are recognized for the delta-f' and delta-f" anomalous dispersion terms for X-rays :
2.28962 1.93597 1.54051 0.70926 0.556363
kxr = Defines neutron data (kxr=0) or X-ray data (kxr=4)
na = Total number of atoms (max = 200) in the asymmetric unit
nt = Number of different atom types (max = 8)
ns = Code defining the job type according to the data
ns = 0 for working on "|Fobs|" (they must be quite good, without too much overlapping) or on single crystal data (ns = 0 recommended ! why degrade the data ?)
ns = 1 for working on the regenerated powder pattern (overlapping does not matter). Note that the profile shapes are Gaussian, but this is not so important : the trick is to treat overlapped data as overlapped data, no more ;-)
ns = 2 supposes that you have a twinning hypothesis on single crystal data. Only merohedry is considered, with 2 domains at 50% in volume each.
When ns = 2, the next 3 lines should be the 3x3 matrix transforming the hkl of domain 1 into the hkl of domain 2. For instance :
0 1 0
1 0 0
0 0 -1
will transform hkl into k h -l.
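The effect of such a matrix can be checked by hand before writing it in the .dat file. Below is a minimal sketch in Python (not part of ESPOIR ; the function name transform_hkl is hypothetical). It assumes that each row of the matrix gives one new index as a combination of h, k, l ; the example matrix is symmetric, so the other convention would give the same result here.

```python
def transform_hkl(matrix, hkl):
    """Apply a 3x3 integer matrix (listed row by row) to one (h, k, l) triplet.

    Assumption: new index i = sum over j of matrix[i][j] * hkl[j].
    """
    return tuple(sum(row[j] * hkl[j] for j in range(3)) for row in matrix)


# Matrix from the ns = 2 example: swaps h and k and changes the sign of l.
TWIN_MATRIX = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, -1],
]

if __name__ == "__main__":
    print(transform_hkl(TWIN_MATRIX, (1, 2, 3)))  # (2, 1, -3)
```

Applying the same matrix twice gives back the starting hkl, as expected for two twin domains related by a twofold operation.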
U, V, W, nstep
Optional line occurring only if ns = 1 (see above)
U, V, W = Caglioti law parameters refined when the "|Fobs|" were extracted
nstep = number of points that you estimate useful above the FWHM (try nstep = 3 to 5) ; if nstep is given smaller than 3, it will be reset to 3.
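To see how U, V, W and nstep interact, here is a minimal Python sketch (not taken from the ESPOIR source). It uses the usual Caglioti form FWHM**2 = U*tan(theta)**2 + V*tan(theta) + W, and assumes that angles and FWHM are in degrees 2-theta and that the local step is simply FWHM/nstep ; both points are assumptions, as are the function names.

```python
import math


def caglioti_fwhm(two_theta_deg, u, v, w):
    """FWHM from the Caglioti law: FWHM**2 = U*tan(theta)**2 + V*tan(theta) + W."""
    t = math.tan(math.radians(two_theta_deg / 2.0))
    fwhm_sq = u * t * t + v * t + w
    if fwhm_sq <= 0.0:
        raise ValueError("U, V, W give a non-positive FWHM**2 at this angle")
    return math.sqrt(fwhm_sq)


def local_step(two_theta_deg, u, v, w, nstep):
    """Assumed variable step: FWHM divided by nstep (nstep < 3 is reset to 3)."""
    nstep = max(nstep, 3)
    return caglioti_fwhm(two_theta_deg, u, v, w) / nstep


if __name__ == "__main__":
    # U, V, W, nstep values of the cuvo3c.dat example.
    for two_theta in (10.0, 40.0, 80.0, 120.0):
        print(two_theta, local_step(two_theta, 0.09, -0.03, 0.04, 3))
```

With U=V=0 the FWHM reduces to sqrt(W) at every angle, so the step becomes constant, which is the condition given above for a .prf file fully usable in DMPLOT.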
name1, name2...
n atom names (characters in format nA4 where n = nt)
DO NOT USE CAPITAL LETTERS
In principle, ionic definitions are recognized, like o-2, al+3, ca+2, f-1, ba+2, etc. (max = 8)
ni1, ni2...
nt values giving the number of atoms of each type in the asymmetric unit, in the same order as their names above (integers in free format).
(the sum of the ni values is = na, max 200)
bov, nocc, ncon, nspe, ipri
: one real and 4 integers (free format)
bov = Overall thermal B factor (real in free format)
Use a value near 1.0 or 1.5 for inorganic materials and 3.0 or more for organic compounds.
nocc = code for reading individual guessed occupation factors
if = 1 : a next optional line should give all the occup. factors
if = 0 : the program generates all occup = 1.
ncon = code for constraints on shortest interatomic distances
if = 1 : read rcut below
if = 0 : no constraints
nspe = code for general or special positions
if = 1 : read special positions codes nsp below
if = 0 : all atoms in general position, do not read anything
ipri = code for printing Iobs/Icalc and Fobs/Fcalc at the end of a run
if = 0 : no printout
if = 1 : printout
Of course, if nocc=1, and ncon=1, and nspe=1, then 3 optional lines should be given in that same order.
The special position codes (nsp) currently defined in the program are :
nsp code   position
1          x,y,z (general position)
2          x,x,x
6          x,0,z
9          x,y,0
10         0,y,z
13         x,0,0
12         0,y,0
8          0,0,z
4          x,1/4,z
5          0,1/2,z
7          x,0,1/4
11         1/2,y,0
3          0,0,0
If you want more : ask me for them or do it yourself (you have the source code, don't you ?). There are two places where you will have to act. In the main program :
 9000 IF(MR.NE.0) GO TO 400
      GO TO (1,2,3,4,5,6,7,8,9,10,11,12,13), nsp(i)
and in the esp_genmove subroutine.
So : beware that there could be up to 3 optional lines here :
occupation factors (reals in free format)
rcut values (reals in free format) (shortest interatomic distances, given in the order 11, 12, 13, 21, 23, 33 for 3 different atom types, for instance)
nsp codes for special positions (integers in free format)
If you have 100 atoms, your optional lines may extend over several lines. The important point is that the program finds the expected number of values.
sigma, reject
(reals)
sigma = standard deviation on |F| (an overall value).
The best is to explore different values. Data are arbitrarily normalized to have a mean |F|=50. You may try sigma=1. at the beginning, with delta values (see below) of the order of the maximum cell parameter, and then reduce to sigma = 0.1 or 0.01 in further tests, with delta values in the range 0.1-0.5 A.
This parameter is of no use if you select ichi=2 below.
reject = threshold for randomly accepting 40% of the moves that do not improve the fit
In any case, all events that lead to delta(R) < -reject are definitely rejected, where R (< 0) is the reliability on |F|.
Try reject = 0.01 or 0.005, and observe the number of kept events.
Remember that a global decrease of R is sought, so reduce reject if R does not decrease in the end.
This could help to avoid being trapped in a false minimum.
The value of reject is damped according to the nanneal parameter (see below) : it is progressively reduced to zero as n2 (see below), the total number of events, is reached.
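One possible reading of the reject test is sketched below in Python. This is an interpretation of the description above, not a transcription of the ESPOIR source : the function name accept_event, the sign convention delta(R) = R_old - R_new, and the way the 40% probability is applied are all assumptions.

```python
import random


def accept_event(r_old, r_new, reject, dump=1.0, rng=random, p_worse=0.4):
    """Decide whether a trial event is kept (one reading of the manual).

    Assumptions: delta(R) = r_old - r_new (positive when the fit improves);
    the reject threshold is multiplied by the annealing factor dump;
    non-improving events inside the threshold are kept with probability 0.4.
    """
    delta_r = r_old - r_new
    if delta_r >= 0.0:
        return True                      # the fit improves: always kept
    if delta_r < -reject * dump:
        return False                     # too much degradation: always rejected
    return rng.random() < p_worse        # small degradation: kept at random


if __name__ == "__main__":
    rng = random.Random(0)
    kept = sum(accept_event(0.300, 0.304, 0.008, rng=rng) for _ in range(10000))
    print(kept / 10000.0)  # close to 0.4
```

With dump = 0 (end of the run) the threshold vanishes, so only events that do not degrade the fit are kept, in agreement with the statement that reject is progressively reduced to zero.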
delta1, delta2..
(reals) The maximum move for each type of atom (max = 8 values).
Recommended values are in the range 0.1-0.5 in the final stages.
Use values of the order of 5 Angstroms at the beginning (or more, up to the cell parameters). Otherwise, you may get stuck in a false minimum.
A value of zero is possible, and will allow only some types of atoms to move.
delta values are progressively damped according to the nanneal parameter below.
nanneal
(real) Move amplitudes will be progressively reduced following the equation :
move=move*dump
dump=(1.-ngent/ngenmax)**nanneal
ngent = number of events generated so far during the program execution
ngenmax = maximum number of events allowed (see n2 below)
For nanneal=1, the reduction will be linear. It is suggested to use nanneal=2.
Note that dump applies to atom moves, but also to molecule translations if Rp(F) or RF < rmax (see the rmax definition below).
This is a way of doing some simulated annealing, sometimes avoiding the need for two steps (one step with large move amplitudes, and a subsequent step with smaller move amplitudes).
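The damping law above is easy to tabulate before choosing nanneal. A minimal Python sketch follows (the function names are hypothetical ; only the formula comes from this manual) :

```python
def damping_factor(ngent, ngenmax, nanneal):
    """dump = (1 - ngent/ngenmax)**nanneal, as defined above."""
    return (1.0 - ngent / ngenmax) ** nanneal


def damped_move(delta, ngent, ngenmax, nanneal):
    """Maximum move amplitude after ngent generated events."""
    return delta * damping_factor(ngent, ngenmax, nanneal)


if __name__ == "__main__":
    # delta = 3. and n2 = 100000 as in the cuvo3c.dat example, nanneal = 2.
    for ngent in (0, 25000, 50000, 75000, 100000):
        print(ngent, damped_move(3.0, ngent, 100000, 2.0))
```

Halfway through the run, nanneal=2 has already reduced a 3 Angstrom maximum move to 0.75 Angstrom, whereas nanneal=1 would still allow 1.5 Angstrom.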
n1
(integer) Determines how often a summary will be written to the standard output.
It will be written after every n1 generated events (moves + permutations), except that it will only occur when an event is accepted.
n2, n3
(integers) n2 = The total number of events the program should run for.
n3 = The number of events after which the results will be saved (possibly several times in a run).
nstart, rmax, ichi
If, after nstart (integer) events (moves + permutations), the R factor is still higher than rmax (real), then the program restarts from a new random configuration, unless the total number of allowed starting models (n5, see below) is already attained.
ichi (integer) determines the test made for accepting or rejecting an event :
ichi = 1 : the test is made on the decrease of
Sum on (|Fobs|-|Fcalc|)**2/sigma**2
ichi = 2 : the test is made on the decrease of R :
Sum on | |Fobs|-|Fcalc| | / Sum on |Fobs|
Try both ; however, there seems to be no clear difference.
nstart = 40000 and rmax = 0.2 - 0.3 is fine for small structures
nstart = 120000 and rmax = 0.35 - 0.40 could work for large structures
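The two figures of merit selected by ichi can be written in a few lines. The Python sketch below only illustrates the two sums given above ; it assumes that the |Fobs| and |Fcalc| lists are already on the same scale and in the same order, and the function names are hypothetical.

```python
def chi_squared(f_obs, f_calc, sigma):
    """ichi = 1 : sum of (|Fobs| - |Fcalc|)**2 / sigma**2, with one overall sigma."""
    return sum((fo - fc) ** 2 for fo, fc in zip(f_obs, f_calc)) / sigma ** 2


def r_factor(f_obs, f_calc):
    """ichi = 2 : sum of | |Fobs| - |Fcalc| | divided by the sum of |Fobs|."""
    return sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)


if __name__ == "__main__":
    f_obs = [10.0, 20.0, 30.0]
    f_calc = [12.0, 18.0, 30.0]
    print(chi_squared(f_obs, f_calc, sigma=2.0))  # 2.0
    print(r_factor(f_obs, f_calc))                # 4/60, about 0.067
```

The R expression does not contain sigma, which is why sigma is of no use when ichi=2.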
n4
(integer) This parameter may have 2 meanings, depending on whether the run is from scratch (random atoms) or from a molecular model.
If scratch : try permutations of atoms after n4 moves. Examples :
If n4 = 10, the ratio of atom moves to permutations will be 10 to 1.
If n4 = 1, only permutations will occur
If n4 = 0, only atom moves will occur
Beware that some combinations of constrained occupation numbers may not allow any permutations. If permutations are not allowed, the program will loop infinitely, but you will be given a message ;-)
Try n4 = 10 to 100 as you wish (most test files use n4=10)
Or, in the case of Molecular Replacement, try translations of the model after n4 rotations
If n4 = 10, the ratio of rotations to translations will be 10 to 1.
If n4 = 1, only translations will occur (not recommended in the general case)
If n4 = 0, only model rotations will occur (not recommended either)
Rotations are made around the center of gravity of the molecule or fragment
Try n4 = 2 to 25 or more if you wish (many test files use n4=2 or 4)
n5
(integer) |n5| is the number of runs (try 5, 10, 50 or 100... but mind the computer time)
if n5 > 0 the job concerns random starting models, and the data stop there
if n5 = 0 the job will reuse previous atomic coordinates, and the x,y,z,occup should be given just below
if n5 < 0 the job concerns Molecular Replacement, and the next line should be
either a,b,c,alpha,beta,gamma of the cell in which the molecule is described
or 0. 0. 0. 90. 90. 90. if the model is described with Cartesian coordinates
and the following lines will then be the x,y,z,occup values as described below
x,y,z,occup
(reals) na lines of atomic coordinates and occupation numbers (max 100 atoms)
To be given only if n5 = 0 or n5 < 0
occup=1 means a general position fully occupied
Note that if n5 is different from 0, then occup will always be = 1 for any atom (the program cannot decide in your place), or defined by the nocc parameters.
How many atoms in general and special positions ? You have to guess !
You may put there either :
- random coordinates obtained from RANDOM.EXE
- one result from a previous test, that you want to continue with different Monte Carlo parameters.
- your fragment or molecular model, in either Cartesian or fractional coordinates
The organization of the .hkl file is quite simple :
One line for the number N of hkl (N maximum is 1000), and then N lines including h, k, l, |Fobs|. An example is below. Data are not formatted, just list 3 integers and one real in free format.
A sufficient number of hkl could be 10 reflections for one atom in the asymmetric unit.
120
0 1 0 21.580         you may possibly find further values on these lines in the test files
0 0 1 39.622         but they are ignored
1 1 0 9.749
1 0 -1 29.746
1 0 0 195.923
... ... ...
4 1 -2 128.143
3 4 0 159.884
0 3 3 142.716
3 2 1 8.925
2 -2 0 349.860
Care to make a return here.
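A .hkl file laid out as described above can be checked with a short script before a long run. The following Python sketch is not part of the package ; the function name parse_hkl is hypothetical, and skipping blank lines is an assumption made for convenience. Values found after |Fobs| on a line are ignored, as stated above.

```python
def parse_hkl(text):
    """Read the content of a name.hkl file: N, then N lines of h k l |Fobs|."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])
    if n > 1000:
        raise ValueError("more than 1000 reflections")
    reflections = []
    for ln in lines[1:1 + n]:
        fields = ln.split()
        h, k, l = (int(x) for x in fields[:3])
        f_obs = float(fields[3])          # anything after |Fobs| is ignored
        reflections.append((h, k, l, f_obs))
    if len(reflections) != n:
        raise ValueError("expected %d reflections, found %d" % (n, len(reflections)))
    return reflections


SAMPLE = """3
0 1 0 21.580 0.5
0 0 1 39.622
1 0 -1 29.746
"""

if __name__ == "__main__":
    for refl in parse_hkl(SAMPLE):
        print(refl)
```

The count check catches the most common mistake, a first line N that no longer matches the number of reflections after manual editing.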
ESPOIR will create 5 output files :
name.imp : will contain all intermediate results
name.res : will contain the best result in SHELX format
name.spf : will contain the best configuration, almost ready for a symmetry search by PLATON
namenew.cfg : will contain the best configuration, ready for a copy-paste into name.dat for further cycles with n5=0, for instance.
name.prf : only if the option ns = 1 is chosen. Will contain the "observed" and regenerated powder patterns in a format readable by DMPLOT. Note however that when the U and V parameters are different from 0, the step will be variable, so that DMPLOT will not produce the correct angles and reflection positions.
ESPOIR needs a cell, a space group and structure factors. That means that you should have indexed the cell, guessed a space group, and either extracted the "|Fobs|" from a powder pattern by whatever means you find convenient (see the SDPD tutorial), or recorded single crystal data. Pay attention to the quality of your "|Fobs|"...
The n5 parameter is the key for applying the chosen strategy :
n5 > 0 : scratch (i.e. random starting model)
n5 < 0 : molecule location.
The other key is the ns parameter, depending on the data quality (overlap or not) :
ns = 0 : working on "|Fobs|" (they must be quite good, without too much overlapping) or on single crystal data (ns = 0 recommended ! why degrade the data with option ns=1 ?)
ns = 1 : working on the powder pattern regenerated from the "|Fobs|" (overlapping does not matter). Note that the profile shapes are Gaussian, but this is not so important : the trick is to treat overlapping data as overlapping data, and to gain speed, no more ;-)
How many reflections ?
The first 100 "|Fobs|" were sufficient for all Molecular Replacement (MR) examples.
Ten reflections per independent atom is the minimum for a scratch test.
How many Monte Carlo events ?
For scratch : 5000 to 8000000 were used, depending on the complexity of the problem.
For Molecular Replacement : 20000 to 1000000 should be sufficient.
The recommended strategy is to try all the possible space groups (for instance, you may have a case where you should try Immm, I222, I212121, Imm2, Im2m, and I2mm), and maybe also P1 if your problem does not exceed 30 independent atoms in that space group. Deciding whether atoms are on special or general positions is your problem : think. However, you will see below that Pb in PbSO4, in the Pnma space group, is always found near the special position x, 1/4, z, whether you decide to fix the occupation number to 1 or to 0.5 (heavy atom problems are the simplest).
ESPOIR generally works better in the P1 space group. If you decide to give P1 a try, you need to present the "|Fobs|" as if they corresponded to the P1 space group, by separating those hkl reflections having a multiplicity greater than 2 (anyway, do not separate hkl and -h-k-l, because the program adds them as if the data came from a powder diffraction measurement). For instance, in the monoclinic P21/m space group, you may have extracted the following "|Fobs|" :
1 0 1 35.
1 1 0 75.
0 1 1 27.
1 0 -1 38.
1 1 1 45.
1 0 2 34.
1 1 -1 89.
0 2 0 23.
And you may decide to try in triclinic acentric. Then, these data must be presented differently to ESPOIR in order to match the P1 space group :
0 1 0 0.       unique : add all forbidden reflections with "|Fobs|" = 0.
1 0 1 35.      unique (the 1 0 -1 should appear later)
1 1 0 75.      double (but keep the same "|Fobs|" value)
1 -1 0 75.
0 1 1 27.      double
0 -1 1 27.
1 0 -1 38.     unique
1 1 1 45.      double
1 -1 1 45.
1 0 2 34.      unique
1 1 -1 89.     double
1 -1 -1 89.
0 2 0 23.      unique
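The splitting shown in the second list can be automated for this particular case. The Python sketch below handles only a monoclinic cell with b as the unique axis, where a general hkl has multiplicity 4 and splits into hkl and h -k l once hkl and -h-k-l are kept together ; it does not add the forbidden reflections with "|Fobs|" = 0, which still have to be inserted by hand. The function name is hypothetical, and other crystal systems need their own rules.

```python
def expand_monoclinic_to_p1(reflections):
    """Duplicate monoclinic (b unique) reflections for a test in P1.

    Each input item is (h, k, l, f_obs). Reflections with k != 0 and
    (h, l) != (0, 0) are written twice, as h k l and h -k l, with the
    same |Fobs|; h0l and 0k0 reflections stay unique.
    """
    expanded = []
    for h, k, l, f_obs in reflections:
        expanded.append((h, k, l, f_obs))
        if k != 0 and (h != 0 or l != 0):
            expanded.append((h, -k, l, f_obs))
    return expanded


P21M_DATA = [
    (1, 0, 1, 35.0), (1, 1, 0, 75.0), (0, 1, 1, 27.0), (1, 0, -1, 38.0),
    (1, 1, 1, 45.0), (1, 0, 2, 34.0), (1, 1, -1, 89.0), (0, 2, 0, 23.0),
]

if __name__ == "__main__":
    for h, k, l, f_obs in expand_monoclinic_to_p1(P21M_DATA):
        print(h, k, l, f_obs)
```

Run on the eight reflections above, this reproduces the twelve non-zero lines of the P1 list, in the same order.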
Extinctions should obviously be considered as giving |Fhkl| = 0, and this is therefore quite useful information for ESPOIR that you should absolutely include. Of course, the "|Fobs|" quality is essential. When using a set of structure factors extracted from powder data by either the Pawley or Le Bail methods, you should keep only those reflections that are reasonably sure (use a dataset reduced with the OVERLAP software, for instance).
The starting configurations are built up by using a generator of random positions inserted in ESPOIR. But, if you want only one test with ESPOIR (n5=0), the program RANDOM provided with the package will help you to prepare data. Just give to RANDOM the number of random positions to generate (corresponding to na, the total number of atoms in the asymmetric unit). The result will be saved in a .cfg file that you may edit. Copy-paste your coordinates in the .dat file, and guess if your atoms occupy special or general positions. If n5 is equal to 1 or more, ESPOIR generates automatically the random starting configurations, with full occupation numbers for all atoms.
Note that you may not retrieve exactly the same results as for the examples below because the random number generator is really efficient (hence the bottle in the ocean)... Those examples correspond to moderately complex crystal structures, perhaps giving you the limits of feasibility with the current ESPOIR version 2.0. In fact, the first two cases really belong to the P-1 space group ; the third case (PbSO4) is one of the Rietveld Round Robin samples (Pnma). Sample I of the SDPD Round Robin (cobalt amine), for which no participant proposed a model, is also treated, as well as the famous cimetidine. But the SDPDRR sample II structure could not be obtained from scratch up to now (33 atoms in general position in the P212121 space group, which would lead to 132 atoms in P1 : not tried, we have to wait for 10 or 100GHz computer speed).
Those examples are sorted from the simplest to the most complex.
Endeavour challenge examples (Al2O3, aragonite, CaF2, calcite, forsterite)
The following examples are taken from the Endeavour software list of test files, in the context of the ESPOIR-Endeavour challenge announced on the SDPD Mailing List.
Al2O3
This example illustrates how guessed constraints on occupation numbers and special positions allow the solution to be obtained more reliably, provided the constraints are true...
The first example (al2o3.dat) has ns=1, and so works on the pattern regenerated from the "|Fobs|", and n5>0, thus working in scratch mode, with n5=10 (10 independent runs) :
Al2O3 R-3c
4.764 4.764 13.009 90.0 90.0 120.0
R -3 C
1.54056 4 2 2 1                     <-- last value ns = 1
0.02511 -0.04562 0.03019 3          if ns = 1 : this line occurs : U,V,W,nstep
al+3o-2                             use scattering factors of al+3 and o-2
1 1                                 one atom for each atom type
1.0 1 0 0 1
2. 3.                               occupation factors for Al and O
1.0 0.002
6. 6.                               maximum move for each atom type : 6 angstroms
2                                   annealing law : second order
5000
20000 20000
5000 0.2 2
10
10                                  <-- n5 = 10
The results lead to R values in the range 0.135-0.005, with a success
rate of 7/10 for R=0.005. The propositions at R=0.135 are false minima.
Below is the result :
66 moves acc. 18000 tested; Chi**2=0.491E-02, R=0.005
0 perm. acc. 1999 tested
1 events did not improved the fit, dump = 0.000000
Final coordinates x,y,z and occupation numbers
al 1 0.99153 0.99167 0.14792 2.000
o- 1 0.64258 0.96913 0.91818 3.000
The Al coordinates are not exactly 0,0,z
The true positions are :
Al 0.00000 0.00000 0.35210
O 0.30600 0.00000 0.25000
Now view the .prf file with the DMPLOT software :
You may note that the hkl markers are at the correct positions but the peaks are not (because the calculation is made with a variable step, indexed on the FWHM). Agreement between hkl markers and peak positions will only occur if U=V=0, giving a constant FWHM related to W. So, you have now seen what the pattern "regenerated" from the "|Fobs|" is. Of course, it bears no relation to the true powder pattern as far as peak heights are concerned. But this artifact makes it possible to cope with overlapping peaks.
The second example for Al2O3 does not work with ns=1 but with ns=0 (the fit is directly on the "|Fobs|") (al2o2F.dat) :
Al2O3 R-3c
4.764 4.764 13.009 90.0 90.0 120.0
R -3 C
1.54056 4 2 2 0                     <-- ns=0 , no need for the U,V,W,nstep line
al+3o-2
1 1
1.0 1 0 0 1
2. 3.
1.0 0.002
6. 6.
2
5000
20000 20000
5000 0.2 2
10
10
This is at least 5 times faster than the previous run on the regenerated pattern. The results are similar because there are not many overlapping problems in such a very simple case.
The third test for Al2O3 makes use of constraints on special positions : 0,0,z for Al and x,0,1/4 for O. Such positions could be guessed knowing the chemical formula and being sure of the space group (anyway, this will not always be so easy...) (al2o3GP.dat) :
Al2O3 R-3c
4.764 4.764 13.009 90.0 90.0 120.0
R -3 C
1.54056 4 2 2 1
0.02511 -0.04562 0.03019 3
al+3o-2
1 1
1.0 1 0 1 1                         We have here the codes for reading occupations and special positions
2. 3.                               occupations for Al and O
8 7                                 special positions : 8 is code for 0,0,z and 7 is for x,0,1/4
1.0 0.002
6. 6.
2
5000
20000 20000
5000 0.2 2
0                                   Note that in such a case, permutations are impossible. If you
10                                  try, the program will loop infinitely, but will tell you...
In this way, a 100% success rate is ensured. This is because the
number of degrees of freedom (DoF) was considerably reduced : 2 unknown
parameters instead of 6. And the final result is :
54 moves acc. 19999 tested; Chi**2=0.602E-02, R=0.006
0 perm. acc. 0 tested
17 events did not improved the fit, dump = 0.000000
Final coordinates x,y,z and occupation numbers
al 1 0.00000 0.00000 0.35210 2.000
o- 1 0.69395 0.00000 0.25000 3.000
Aragonite
This example is a bit more complex, with 4 independent atoms (12 DoF) (aragonite.dat) :
Aragonite CaCO3
4.961 7.967 5.741 90.00 90.00 90.00
P M C N
1.54056 4 4 3 1
0.02511 -0.04562 0.03019 3.
ca c o
1 1 2
1.0 1 0 0 1
0.5 0.5 0.5 1.
1.0 0.01
4. 4. 4.
2
5000
60000 60000
20000 0.3 2
10
10
We should find :
Ca 0.25000 0.41508 0.24046 0.50000
C  0.25000 0.76211 0.08518 0.50000
O1 0.25000 0.92224 0.09557 0.50000
O2 0.47347 0.68065 0.08726 1.00000
According to the formula and Z = 4, we know that Ca and C are necessarily on a special position with 4 equivalents. But the O atoms could be distributed either on one general plus one special position, or on 3 special positions. By chance ;-), the right choice was made here, and the success rate is 8/10 :
412 moves acc. 59999 tested; Chi**2=0.436E-01, R=0.044
87 perm. acc. 5999 tested
211 events did not improved the fit, dump = 0.000000
Final coordinates x,y,z and occupation numbers
ca 1 0.24469 0.08486 0.74000 0.500
c  1 0.74805 0.26311 0.41341 0.500
o  1 0.74901 0.42197 0.40669 0.500
o  2 0.02773 0.68130 0.08825 1.000
CaF2
This seems to be a rather simple example. However the success rate
(3/50) is far from the Al2O3 success rate (8/10). Why ? There seem to be
a lot of false minima with R~15% or higher (caf2.dat) :
CaF2 Fm3m
5.462 5.462 5.462 90.0 90.0 90.0
F M 3 M
1.54056 4 2 2 1
0.02511 -0.04562 0.03019 3.
ca+2f-1
1 1
1.0 1 0 0 1
1. 2.
1.0 0.02
6. 6.
2
5000
60000 60000
10000 0.1 2
10
50
The success rate would certainly have been higher if the special positions had been guessed.
787 moves acc. 59999 tested; Chi**2=0.951E-02, R=0.010
0 perm. acc. 5999 tested
401 events did not improved the fit, dump = 0.000000
Final coordinates x,y,z and occupation numbers
ca 1 0.50174 0.00162 0.49287 1.000
f- 1 0.26616 0.74359 0.74955 2.000
Expected positions were :
Ca 0. 0. 0.
F 1/4 1/4 1/4
Calcite
This case is just a bit more complex than Al2O3, with the same space
group (calcite.dat) :
Calcite CaCO3
4.990 4.990 17.061 90.0 90.0 120.0
R -3 C
1.54056 4 3 3 1
0.02511 -0.04562 0.03019 3.
ca c o
1 1 1
1.0 1 0 0 1
6. 6. 18.
1.0 0.005
9. 9. 9.
2
5000
60000 10000
20000 0.3 2
10
20
The expected results are :
Ca 0.00000 0.00000 0.00000
C 0.00000 0.00000 0.25000
O 0.25682 0.00000 0.25000
Again, the correct occupation numbers were guessed for the O atom, leading to a success rate of 8/20. Below is the result :
516 moves acc. 59999 tested; Chi**2=0.375E-02, R=0.004
3 perm. acc. 5999 tested
220 events did not improved the fit, dump = 0.000000
Final coordinates x,y,z and occupation numbers
ca 1 0.00124 0.00162 0.49998 6.000
c  1 0.33332 0.66213 0.91922 6.000
o  1 0.66684 0.07675 0.08548 18.000
Again, one has to retrieve the special positions, keeping the R Bravais lattice in mind.
Forsterite
Forsterite is the most complex example found in the Endeavour package,
with up to 6 independent atoms (forsterite.dat) :
Forsterite Mg2SiO4
4.755 10.198 5.979 90.0 90.0 90.0
P B N M
1.54056 4 6 3 1
0.02511 -0.04562 0.03019 3.
mg+2si+4o-2
2 1 3
1.0 1 0 0 1
0.5 0.5 0.5 0.5 0.5 1.0
1.0 0.005
5. 5. 5.
2
10000
100000 50000
40000 0.3 2
10
40
And the expected result is :
Miraculously, the true occupation numbers were guessed (but special positions were not forced to occur). The number of DoF is now 18. And the result is obtained with a rather low success rate of 3/40 :
362 moves acc. 99999 tested; Chi**2=0.173E-01, R=0.017
66 perm. acc. 9999 tested
101 events did not improved the fit, dump = 0.000000
Final coordinates x,y,z and occupation numbers
mg 1 0.00838 0.72263 0.75846 0.500
mg 2 0.49021 0.50067 0.00196 0.500
si 1 0.92644 0.40621 0.74431 0.500
o- 1 0.23412 0.90803 0.75301 0.500
o- 2 0.21983 0.44714 0.25644 0.500
o- 3 0.77804 0.33711 0.96673 1.000
That's all for the examples taken from the Endeavour side of the challenge. Now come the test files strictly from ESPOIR. Some of the results below were already obtained with the ESPOIR 1.0 version :
CuVO3
In order to illustrate the difference of behaviour of ESPOIR on the
same problem if treated in P 1 and in P -1, observe the results below
(corresponding to the test files, with some annealing) :
in P 1, with 10 atoms in general position, for 10 tests starting from different random models (file cuvo3.dat).
Test N°   Moves      Perm.      Events without     Starting R   Final R
          accepted   accepted   fit improvement    (%)          (%)
1         236        77         114                72.2         11.7
2         206        134        123                80.4         11.6
3         269        22         102                81.7         16.7
4         151        43         53                 80.4         3.8
5         264        35         99                 82.4         4.7
6         290        69         131                80.3         4.5
7         failed to attain R = 30% after 40000 events
8         239        69         93                 80.0         4.5
9         281        29         101                84.8         5.5
10        237        16         79                 85.3         11.8
in P -1, with 5 atoms guessed to be in general position (which is
true), for 10 tests (file cuvo3C.dat).
Test N°  Moves     Perm.     Events without   Starting R  Final R
         accepted  accepted  fit improvement  (%)         (%)
1        87        3         13               82.6        5.8
2 to 5   failed to attain R = 30% after 40000 events
6        109       7         17               80.8        3.7
7 and 8  failed to attain R = 30% after 40000 events
9        145       3         43               82.9        16.5
10       119       20        32               84.2        5.0
The difference is that when one atom is moved in P -1, two atoms actually move, related through an inversion centre placed at a completely arbitrary origin. Generally, you will observe far fewer accepted moves in any space group than for the same problem described in P 1. The problem is due (I think) to the impossibility of building a truly random starting model, except in P 1. Anyway, ESPOIR does the job in 4 tests out of 10, to be compared with a success rate of 9/10 in the P 1 space group.
This example (TeI) is more complex for two reasons : more atoms (16 in P 1), and almost identical scattering factors for both atom types. If ESPOIR succeeds here, this should mean that organic materials at least as complex as TeI could be solved from scratch by ESPOIR. Again, you can compare the performances in P 1 and in P -1.
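The coupling between an atom and its inversion mate can be illustrated with a few lines of Python. This is only a sketch and not ESPOIR code : the function names, the test position and the shift are arbitrary choices made for the illustration, and the inversion centre is assumed to sit at the origin.

```python
def wrap(v):
    """Bring fractional coordinates back into the [0, 1) interval."""
    return tuple(c % 1.0 for c in v)

def inversion_mate(xyz):
    """Image of a fractional position under -x, -y, -z (centre at the origin)."""
    return wrap(tuple(-c for c in xyz))

def move(xyz, delta):
    """Shift a fractional position by delta, as one Monte Carlo move would."""
    return wrap(tuple(c + d for c, d in zip(xyz, delta)))

atom = (0.10, 0.20, 0.30)      # arbitrary example position
delta = (0.05, 0.00, -0.02)    # arbitrary example move

mate_before = inversion_mate(atom)
mate_after = inversion_mate(move(atom, delta))

# Displacement of the mate caused by moving the independent atom
shift = tuple(round(a - b, 10) for a, b in zip(mate_after, mate_before))
print("independent atom moved by", delta)
print("inversion mate moved by  ", shift)
```

The mate moves by the opposite vector, so one event in P -1 displaces two atoms at once, whereas the same event in P 1 displaces only one.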
in P 1, with 16 atoms in general position, for 20 tests starting from different random models (file tei.dat).
Test N°  Moves     Perm.     Events without   Starting R  Final R
         accepted  accepted  fit improvement  (%)         (%)
1        126       411       0                92.2        7.3
2 to 11  failed
12       127       436       0                89.1        5.7
13-20    failed
in P -1, the above success rate (2/20) is reduced by a factor of 2,
but ESPOIR still works (file teiC.dat).
Test N°  Moves     Perm.     Events without   Starting R  Final R
         accepted  accepted  fit improvement  (%)         (%)
1 to 16  failed
17       100       189       0                97.0        5.4
18-20    failed
Clearly, one should not expect Te and I to be well differentiated here (they are not, of course). Looking at the interatomic distances nevertheless allows the Te and I atoms to be recognized (no direct I-I contact, whereas Te-I and Te-Te contacts are allowed).
For PbSO4 in P 1, there are 24 atoms, but the high success rate below (9/20) is
certainly due to the easy location of the 4 heavy Pb scatterers. Anyway, in
the best solutions, the S and many of the O atoms were also located (file
pbso4P1.dat).
Test N°  Moves     Perm.     Events without   Starting R  Final R
         accepted  accepted  fit improvement  (%)         (%)
1        51636     1267      24647            75.6        9.0
2        failed
3        55264     1335      26235            83.0        8.8
4        57811     1447      27484            83.5        10.4
5 to 10  failed
11       58931     1485      28067            84.6        10.5
12-13    failed
14       55237     1245      26162            82.8        8.8
15       failed
16       54465     1317      25839            80.5        10.1
17       56120     1304      26794            81.2        13.0
18       55048     1228      26217            83.5        10.6
19       51325     1479      24593            81.3        9.3
20       failed
Determining the space group, finding a new origin, if any.
Finding symmetry elements can then be attempted by using PLATON on the name.spf output :
TITL Test on PbSO4
CELL 8.4820 5.3980 6.9590 90.0000 90.0000 90.0000
SPGR P1
ATOM pb1 0.42876 0.65290 0.18283
ATOM pb2 0.81621 0.16184 0.49586
ATOM pb3 0.31170 0.14734 0.69931
ATOM pb4 0.93629 0.65219 0.02093
Running PLATON on the above .spf file, which contains only the Pb atoms as proposed by ESPOIR, is sufficient to retrieve the true space group (Pnma) :
====================================================================================================================================
ADDSYM - CHECK (cf. MISSYM (C): Le Page, Y., J. Appl. Cryst. (1987), 20, 264-269; J. Appl. Cryst. (1988), 21, 983-984)
------------------------------------------------------------------------------------------------------------------------------------
- This ADDSYM Search is run on ALL NON-H Chemical Types
- Number of Input Atoms Included in Search = 4
- Density based on Input Atom set = 4.319 g.cm-3
- Vol / Non-H atom = 79.7 Ang3
- The Structure implies the following Symmetry Elements subject to the Criteria: 1.00 Deg., (metric) 0.25 Ang. (distances) and 0.45 Ang. (inv. and transl.)

Symm.   Input     Reduced    (Ang)              (Deg)  (Ang)               Input Cell
Elem    Cell Row  Cell Row     d    Type  Dot   Angle  Max. dev.         x     y     z
--------------------------------------------------------------------------------
 a  *   [ 0 0 1]  [ 0-1 0]   6.959   2     1    0.00   0.070   through   0     0     0.102   Pb2 -Pb3   Glide = 1/2 0 0
 m  *   [ 0 1 0]  [ 1 0 0]   5.398   2     1    0.00   0.048   through   0     0.653 0       Pb2 -Pb2
 n  *   [ 1 0 0]  [ 0 0 1]   8.482   2     1    0.00   0.159   through   0.370 0     0       Pb4 -Pb2   Glide = 0 1/2 1/2
-1  *   ======================================         0.151   at        0.122 0.407 0.339   Pb3 -Pb4

Reduced->Convent   Input->Reduced   T = Input->Convent: a' = T a
--------------------------------------------------------------------------------
(  0  0  1 )     (  0 -1  0 )     ( -1  0  0 )   Det(T)
(  1  0  0 )  X  (  0  0  1 )  =  (  0 -1  0 )  =
(  0  1  0 )     ( -1  0  0 )     (  0  0  1 )   1.000

Cell     Lattice   a      b      c      alpha  beta   gamma  Volume  CrystalSystem  Laue
--------------------------------------------------------------------------------
Input    aP        8.482  5.398  6.959  90.00  90.00  90.00  319     Triclinic      -1
Reduced  P         5.398  6.959  8.482  90.00  90.00  90.00  319
Convent  oP        8.482  5.398  6.959  90.00  90.00  90.00  319     Orthorhombic   mmm

Conventional, New or Pseudo Symmetry
================================================================================
Space Group Pnma No: 62, Laue: mmm [Hall: -P 2ac 2n ]
Lattice Type oP, Centric, Orthorhombic, Order 8( 4) [Shoenflies: D2h^16 ]

Nr   ***** Symmetry Operation(s) *****
 1          X ,       Y ,       Z
 2    1/2 - X ,     - Y , 1/2 + Z
 3    1/2 + X , 1/2 - Y , 1/2 - Z
 4        - X , 1/2 + Y ,     - Z
 5        - X ,     - Y ,     - Z
 6    1/2 + X ,       Y , 1/2 - Z
 7    1/2 - X , 1/2 + Y , 1/2 + Z
 8          X , 1/2 - Y ,       Z

:: Origin shifted to:-0.122,-0.407, 0.339 after transformation
:: * Symmetry Elements preceded by an Asterisk are New and indicate
:: Missed/Pseudosymmetry Summary
:: M/P Test on PbSO4 aP => oP 0.000 0.00 0.500 100% Pnma
The ESPOIR proposition of S and O atoms is only partially exact for PbSO4 in the P 1 space group, although many atoms are close to their right positions. This is understandable because of the heavy weight of the Pb atoms.
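The mirror plane reported by PLATON can be checked by hand from the four Pb positions of the .spf file. The short Python sketch below is only an illustration, not part of ESPOIR or PLATON : the helper name deviation_from_mirror is invented here, and it is assumed that mirror planes perpendicular to b repeat every b/2, as they do in Pnma.

```python
B_AXIS = 5.398      # cell edge b in angstroms, from the CELL line of the .spf file
MIRROR_Y = 0.653    # height of the mirror plane reported by PLATON

# y coordinates of the four Pb atoms proposed by ESPOIR in P 1
pb_y = {"pb1": 0.65290, "pb2": 0.16184, "pb3": 0.14734, "pb4": 0.65219}

def deviation_from_mirror(y, y0, period=0.5):
    """Smallest fractional distance from y to the planes y0 + n * period."""
    d = (y - y0) % period
    return min(d, period - d)

# Convert each fractional deviation into angstroms along b
dev_ang = {name: deviation_from_mirror(y, MIRROR_Y) * B_AXIS
           for name, y in pb_y.items()}
max_dev = max(dev_ang.values())

for name, dev in dev_ang.items():
    print(f"{name}: {dev:.3f} A from the nearest mirror plane")
print(f"largest deviation: {max_dev:.3f} A")
```

All four Pb atoms lie within 0.05 Å of a plane at y = 0.653 or 0.153, and the largest deviation rounds to the 0.048 Å listed by PLATON for the m element, assuming that its "Max. dev." column measures this same quantity.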
In the case of CuVO3 and TeI, the best ESPOIR propositions are almost fully correct in the P 1 space group, so that PLATON has no difficulty locating the inversion center (the true structures are both P -1) from the whole set of atomic positions, provided the I atoms are also labelled Te, to allow for some misplacement.
In Pnma, the problem is of course to decide which and how many atoms are on special positions. With Z = 4, there is little doubt that Pb and S are on special positions, but the question remains open for the O atoms. You can postulate that, owing to the Pb weight, this will not be important and try all atoms in general position (this will only affect the scale factor).
The two ways to run the PbSO4 test case, either in automatic mode (all atoms in general position), or by guessing whether atoms could be on special positions, are shown below.
Best final result at R=5.1%, automatic run (all occupation factors = 1.) :
pb 1 0.68700 0.24790 0.16692 1.000
s  1 0.56372 0.26778 0.68455 1.000
o  1 0.41881 0.95491 0.19341 1.000
o  2 0.40791 0.23496 0.59896 1.000
o  3 0.31096 0.75743 0.45229 1.000
and the corresponding starting pbso4.dat file :
Test on PbSO4 Pnma 8.482 5.398 6.959 90. 90. 90. P N M A 1.54056 4 5 3 pb s o 1 1 3 1.0 1.0 0.005 5. 5. 5. 5. 20000 300000 100000 40000 0.3 2 10 10
The 10 tests give a 100% success rate with 5.1 < R < 12.8 %.
There is not much difference with a single test using a set of positions
produced by RANDOM and guessed occupation factors. Of course this
is still due to the high Pb weight.
Best result at R=5.5%, one run with guessed occupation factors
pb 1 0.81187 0.75229 0.16561 0.500
s  1 0.06220 0.26771 0.31318 0.500
o  1 0.08402 0.47410 0.19429 1.000
o  2 0.30732 0.74682 0.95795 0.500
o  3 0.90995 0.24051 0.40013 0.500
and the corresponding starting pbso41.dat file :
Test on PbSO4 Pnma 8.482 5.398 6.959 90. 90. 90. P N M A 1.54056 4 5 3 pb s o 1 1 3 1.0 1.0 0.005 5. 5. 5. 5. 20000 300000 100000 40000 0.3 2 10 0
0.9951962 0.2052748 0.1227539 0.5
0.4912747 0.8526265 0.0630862 0.5
0.3038542 0.9097319 0.2772063 1.0
0.3659430 0.3213414 0.7311531 0.5
0.0127584 0.2117186 0.1964869 0.5
The structure of this compound (Ba2CdP3O10(OH)) is solved in Im2m; however, all
the other possible space groups had to be tried. This example illustrates a more
complicated manual choice of the occupation numbers (file im2m.dat).
Ba2CdP3O10(OH) Im2m 11.9031 7.3407 5.5533 90.0 90.0 90.0 I M 2 M 1.54056 4 9 4 ba cd p o 1 1 2 5 1.0 1.0 0.001 5. 5. 5. 5. 5. 20000 1000000 100000 40000 0.3 2 10 0
0.3545507 0.4620205 0.2947413 0.5
0.4923450 0.7782467 0.7220183 0.25
0.0822101 0.1718911 0.3469331 0.5
0.3218593 0.5036837 0.9342188 0.25
0.5619266 0.7106229 0.0156163 1.0
0.6033217 0.8264415 0.4910100 0.5
0.5698345 0.9051973 0.0746788 0.5
0.9920360 0.9374517 0.8545054 0.5
0.8335328 0.2782474 0.2700855 0.25
and the result after 24 minutes on a Pentium II 333 MHz is :
201 moves acc. 980206 tested; Chi**2=0.879E-01, R=0.088
993 perm. acc. 98020 tested
527 events did not improved the fit
Final coordinates x,y,z and occupation numbers
ba 1 0.80476 0.53371 0.51686 0.500
cd 1 0.00181 0.80944 0.99895 0.250
p  1 0.87267 0.87782 0.71433 0.500
p  2 0.00169 0.39609 0.00658 0.250
o  1 0.33724 0.66500 0.95494 1.000
o  2 0.50340 0.00505 0.72417 0.500
o  3 0.09766 0.36232 0.03478 0.500
o  4 0.60878 0.52288 0.99350 0.500
o  5 0.38358 0.52165 0.99919 0.250
Many of the true positions are special positions with 0 or 1/2 coordinates.
You should not expect ESPOIR to give you such exact values. You
will have to consult the International Tables for Crystallography.
In such a case, the automatic mode will not work ! Can you easily guess
those occupation numbers ? Not for all the oxygen atoms, but there are not
many possibilities for the Ba, Cd and P atoms, so that a part of the
solution is attainable, at least. Fortunately, many space groups do not
present any special positions, and many organic compounds show all their
atoms in general position (like cimetidine in P21/n - last example
below). Ahem, note that the final structure was found distorted in the
monoclinic system, with beta=90.09°...
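Spotting the coordinates that sit close to 0 or 1/2 can be done mechanically before opening the International Tables. The Python sketch below is only an illustration : near_special is a hypothetical helper, not an ESPOIR routine, and the 0.01 tolerance is an arbitrary choice. It is applied to the Cd position of the result above.

```python
def near_special(value, tol=0.01, step=0.5):
    """Return the multiple of `step`, folded into [0, 1), lying within `tol`
    of `value`, or None if the coordinate looks like a free parameter."""
    nearest = round(value / step) * step
    if abs(value - nearest) <= tol:
        return nearest % 1.0
    return None

# Cd position proposed by ESPOIR in Im2m (see the final coordinates above)
cd = (0.00181, 0.80944, 0.99895)
flags = [near_special(c) for c in cd]

for axis, coord, flag in zip("xyz", cd, flags):
    status = "free" if flag is None else f"probably {flag}"
    print(f"{axis} = {coord:.5f} -> {status}")
```

Here x and z are flagged as probably 0 while y stays free, which suggests a site of the type 0, y, 0. Whether such a site exists, and with which multiplicity, still has to be checked in the space group tables.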
In all the previous examples, the structure factors presented to ESPOIR were excellent ones (as provided by a single crystal study). The present case is the SDPD round robin sample I, for which no participant proposed a model (although it was solved by the organizers, see : Solid State Sciences 1, 1999, 55-62). Below is shown the ESPOIR performance on this compound with good data and with selected "|Fobs|" extracted from the Round Robin X-ray pattern.
- Good data in P 1 (file coamin.dat). ESPOIR works fine for this 30-atom problem, but you will have to apply PLATON to search for the missing symmetry operators, and some atoms are certainly misplaced or not well identified. Have a look at the 2 Co atoms related by a y+1/2 translation, which is a good sign :
640 moves acc. 1981001 tested; Chi**2=0.783E-01, R=0.078
22227 perm. acc. 198100 tested
10743 events did not improved the fit
Final coordinates x,y,z and occupation numbers
co  1 0.08745 0.07298 0.55174 1.000
co  2 0.73690 0.57190 0.43416 1.000
n   1 0.11759 0.56399 0.53891 1.000
n   2 0.72046 0.12305 0.95530 1.000
n   3 0.65802 0.71939 0.60319 1.000
n   4 0.81151 0.43511 0.25895 1.000
n   5 0.82421 0.44626 0.66004 1.000
n   6 0.10960 0.62280 0.02686 1.000
n   7 0.66342 0.70737 0.19529 1.000
n   8 0.34270 0.98788 0.61994 1.000
n   9 0.16966 0.20956 0.79560 1.000
n  10 0.47434 0.48721 0.35990 1.000
n  11 0.17457 0.22209 0.37140 1.000
n  12 0.01310 0.94498 0.32230 1.000
c   1 0.01814 0.94085 0.73032 1.000
c   2 0.14872 0.06910 0.54082 1.000
o   1 0.85626 0.20192 0.98168 1.000
o   2 0.31789 0.98017 0.11658 1.000
o   3 0.24503 0.69159 0.54010 1.000
o   4 0.71507 0.00071 0.87474 1.000
o   5 0.56270 0.19712 0.95573 1.000
o   6 0.27031 0.69413 0.03566 1.000
o   7 0.57408 0.18812 0.46404 1.000
o   8 0.16760 0.47710 0.60812 1.000
o   9 0.96780 0.70361 0.02286 1.000
o  10 0.11143 0.49537 0.11406 1.000
o  11 0.66362 0.99035 0.37812 1.000
o  12 0.50667 0.47603 0.88218 1.000
o  13 0.83425 0.17602 0.44959 1.000
o  14 0.99477 0.67122 0.53212 1.000
- Good data in P21, which is more direct (cop21.dat file) :
344 moves acc. 1981123 tested; Chi**2=0.365E-01, R=0.037
15846 perm. acc. 198112 tested
7769 events did not improved the fit
Final coordinates x,y,z and occupation numbers
co 1 0.81950 0.23785 0.93907 1.000
n  1 0.74516 0.10563 0.69804 1.000
n  2 0.89369 0.37553 0.76021 1.000
n  3 0.26168 0.59025 0.88503 1.000
n  4 0.09917 0.87128 0.82629 1.000
n  5 0.80493 0.69140 0.46587 1.000
n  6 0.56948 0.32317 0.87206 1.000
c  1 0.79570 0.70602 0.94995 1.000
o  1 0.34462 0.11217 0.53715 1.000
o  2 0.74427 0.83145 0.88826 1.000
o  3 0.07663 0.13665 0.04575 1.000
o  4 0.59402 0.33707 0.37846 1.000
o  5 0.05891 0.11206 0.51844 1.000
o  6 0.79752 0.81448 0.38107 1.000
o  7 0.34391 0.11612 0.04028 1.000
The success rate here is 4/40, for 17 hours of calculation on a Pentium II 266 MHz, and the structure is almost perfect, with no error in the C, N and O assignments...
Now, if we examine the result with real data, as extracted from the powder pattern distributed with the SDPD Round Robin, the result is certainly not as beautiful. Zhu5.hkl corresponds to a set of reflections extracted by the Le Bail method with Fullprof, excluding those reflections having a neighbour at less than 0.05 degrees 2-theta. About 150 reflections (at the lowest angles) are used (for 15 atoms to be found in P21). Below are the best results :
179 moves acc. 1980792 tested; Chi**2=0.193 , R=0.193
5133 perm. acc. 198079 tested
2639 events did not improved the fit
Final coordinates x,y,z and occupation numbers
co 1 0.82093 0.36020 0.94044 1.000
n  1 0.37594 0.07268 0.14666 1.000
n  2 0.62631 0.90305 0.47297 1.000
n  3 0.83421 0.84308 0.44808 1.000
n  4 0.37878 0.39791 0.31452 1.000
n  5 0.78679 0.38652 0.44956 1.000
n  6 0.69884 0.09903 0.45740 1.000
c  1 0.16988 0.23056 0.21554 1.000
o  1 0.79673 0.36041 0.97713 1.000
o  2 0.63487 0.91408 0.05895 1.000
o  3 0.43222 0.33256 0.09743 1.000
o  4 0.43146 0.66930 0.73072 1.000
o  5 0.40683 0.95036 0.59913 1.000
o  6 0.78341 0.77434 0.91579 1.000
o  7 0.74343 0.48008 0.05556 1.000
This is not a complete solution but many atoms are already well placed, including the Co atom.
The success rate is of the order of 1/50. The best result is shown below with R = 3.7% after 8000000 moves and 7 hours of calculation (Pentium II 266 MHz). If you have access to a large and powerful computer, try to run more tests on it, and let me know the result.
432 moves acc. 7990498 tested; Chi**2=0.372E-01, R=0.037
15679 perm. acc. 799049 tested
7894 events did not improved the fit
Final coordinates x,y,z and occupation numbers
s  1 0.98255 0.91396 0.69786 1.000
c  1 0.08635 0.83805 0.07537 1.000
c  2 0.64513 0.60618 0.14642 1.000
c  3 0.06324 0.65779 0.72502 1.000
c  4 0.72905 0.74969 0.67829 1.000
c  5 0.12604 0.54853 0.25767 1.000
c  6 0.90039 0.78266 0.20823 1.000
c  7 0.03221 0.78999 0.16923 1.000
c  8 0.46558 0.90546 0.90365 1.000
c  9 0.47914 0.41234 0.50753 1.000
c 10 0.73601 0.53650 0.28145 1.000
n  1 0.87490 0.26547 0.77556 1.000
n  2 0.40569 0.09148 0.11088 1.000
n  3 0.29048 0.31663 0.43408 1.000
n  4 0.33178 0.01164 0.35182 1.000
n  5 0.95375 0.62665 0.56134 1.000
n  6 0.63445 0.05413 0.25410 1.000
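The success rates quoted in these examples (1/50 just above, 3/40 for forsterite, 4/10 for CuVO3 in P -1) can be turned into a rough number of independent tests to plan for. The Python sketch below is not part of ESPOIR. It assumes that every test starts from a new random model and succeeds independently with the same probability p, so that the chance of at least one success in n tests is 1 - (1 - p)^n ; the function name and the 95% target are choices made for the illustration.

```python
import math

def runs_needed(success_rate, confidence=0.95):
    """Smallest n such that 1 - (1 - success_rate)**n >= confidence."""
    if not 0.0 < success_rate < 1.0:
        raise ValueError("success_rate must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - success_rate))

# Success rates quoted in the examples of this manual
for label, rate in [("1/50", 1 / 50), ("3/40", 3 / 40), ("4/10", 4 / 10)]:
    print(f"success rate {label}: about {runs_needed(rate)} tests "
          f"for a 95% chance of at least one solution")
```

With a success rate of 1/50, about 149 tests are needed for a 95% chance of seeing at least one solution, against 39 tests at 3/40 and 6 tests at 4/10. The quoted rates come from small samples, so these figures are only orders of magnitude.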
Finding an octahedron
A powder pattern was calculated by using the SDPDRR sample I characteristics,
but keeping only 2 CoN6 octahedra related by the 21 axis. Then ESPOIR 2
was run with (zhutest.dat) :
Finding an octahedron CoN6 7.662 9.626 7.072 90. 106.20 90. P 21 1.54056 4 7 2 1 0.25000 -0.26925 0.12779 0.04 co n 1 6 1.5 0 1 0 1 5. 2.0 2.5 1. 0.01 7. 7. 2. 500 5000 5000 5000 0.25 2 2 -10
10. 10. 10. 90. 90. 90.
0.00 0.00 0.001 1.00
0.207 0.00 0.001 1.00
-0.207 0.00 0.001 1.00
0.00 0.207 0.001 1.00
0.00 -0.207 0.001 1.00
0.00 0.00 0.208 1.00
0.00 0.00 -0.206 1.00
Note that the chances of finding an octahedron are enhanced by a factor of 6 because of the equivalent positions which superpose the octahedron corners by 90° rotations. Thus the total number of events (rotations + translations) may be reduced to 5000 here for a full success ratio (10/10), using the first 100 reflections of the powder pattern. The final R values are in the range 0.150-0.052. Below is the best result. The original position was 0,0,0 for Co. The proposition 0.49,0.15,0.51 is correct, owing to the P21 space group (freedom along y).
16 rot+tr acc. 2500 gen. and 2499 tested; Chi**2=0.519E-01, R=0.052
6 trans. acc. 2498 tested
4 events did not improved the fit, dump = 0.000000
Final coordinates x,y,z and occupation numbers
co 1 0.49117 0.15466 0.50805 1.000
n  1 0.73961 0.25510 0.56978 1.000
n  2 0.24274 0.05423 0.44631 1.000
n  3 0.56692 0.02909 0.30678 1.000
n  4 0.41542 0.28024 0.70931 1.000
n  5 0.38305 0.29745 0.28762 1.000
n  6 0.59930 0.01188 0.72847 1.000
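The geometry of the model octahedron given in zhutest.dat can be verified quickly. The Python sketch below is an illustration only : it assumes that the seven coordinate lines of the file are fractional coordinates in the 10 Å cubic cell that precedes them (10. 10. 10. 90. 90. 90.), with Co listed first, and the variable names are invented here.

```python
import math

CELL_EDGE = 10.0  # angstroms, cubic fragment cell assumed from zhutest.dat

# Fragment coordinates as listed in zhutest.dat (Co first, then six N)
co = (0.00, 0.00, 0.001)
nitrogens = [
    (0.207, 0.00, 0.001),
    (-0.207, 0.00, 0.001),
    (0.00, 0.207, 0.001),
    (0.00, -0.207, 0.001),
    (0.00, 0.00, 0.208),
    (0.00, 0.00, -0.206),
]

def distance(p, q, edge=CELL_EDGE):
    """Distance in angstroms between two fractional positions in a cubic cell."""
    return edge * math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

co_n = [round(distance(co, n), 3) for n in nitrogens]
print("Co-N distances (A):", co_n)
```

Under this reading all six Co-N distances come out at 2.07 Å, so the small 0.001 offset along z shifts the whole fragment without distorting it.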
[Co(NH3)5CO3]NO3.H2O
Now, the question is : will ESPOIR locate the CoN5O octahedra
from the SDPDRR sample I data ? In fact the C atom was added without completing
the CO3 group, in order to build a CoN5OC unit. The test was from the true
SDPDRR data, using the first 100 extracted "|Fobs|" (zhu.dat) :
Test on Zhu3f 7.662 9.626 7.072 90. 106.20 90. P 21 1.54056 4 8 4 1 0.25000 -0.26925 0.12779 3. co n c o 1 5 1 1 1.5 0 1 0 1 5. 2.0 3.0 2.0 2.5 2.5 2.5 5. 1.25 2.5 1. 0.005 7. 7. 7. 7. 2. 5000 60000 30000 60000 0.30 2 4 -10
10. 10. 10. 90. 90. 90.
0.00 0.00 0.001 1.00
0.207 0.00 0.001 1.00
-0.207 0.00 0.001 1.00
0.00 0.207 0.001 1.00
0.00 -0.207 0.001 1.00
0.00 0.00 0.208 1.00
0.00 0.00 -0.336 1.00
0.00 0.00 -0.206 1.00
The final R values for 10 tests are in the range 0.296-0.327, with the best below :
41 rot. acc. 45000 gen. and 44965 tested; Chi**2=0.296 , R=0.296
10 trans. acc. 14999 tested
13 events did not improved the fit, dump = 0.000000
Final coordinates x,y,z and occupation numbers
co 1 0.79439 0.21918 0.44450 1.000
n  1 0.85490 0.10401 0.70181 1.000
n  2 0.73387 0.33436 0.18718 1.000
n  3 0.70579 0.38121 0.58665 1.000
n  4 0.88299 0.05716 0.30234 1.000
n  5 0.53432 0.13718 0.36395 1.000
c  1 0.21779 0.35268 0.57563 1.000
o  1 0.05446 0.30119 0.52504 1.000
This is to be compared to the positions in the publication (Solid State Sciences 1, 1999, 55-62) :
Co   0.8192 0.25*  0.4408
N 5  0.899  0.119  0.668
N 1  0.742  0.383  0.198
N 2  0.746  0.395  0.626
N 3  0.893  0.113  0.271
N 4  0.567  0.162  0.364
C 1  0.210  0.280  0.557
O 1  0.081  0.351  0.537
*fixed
So, it is a success, isn't it ? The remaining atoms should be found
by Fourier difference synthesis.
Cimetidine (pharmaceutical)
If you knew the full molecule of cimetidine, could you locate it with
ESPOIR version 2 ? Undoubtedly yes (cim0.dat) :
Test on cimetidine C10H16N6S 10.7001 18.8206 6.8255 90.0 111.28 90.0 P 21/N 1.52904 4 17 3 1 0.01176 -0.00481 0.00223 3. s c n 1 10 6 3.0 0 0 0 1 1.0 0.005 7. 7. 7. 2. 5000 400000 100000 400000 0.35 2 4 -10
10.7001 18.8206 6.8255 90.0 111.28 90.0
-0.01477 0.51390 0.29953 1.0000
0.23174 0.35195 0.77730 1.0000
0.08631 0.43676 0.67645 1.0000
0.02869 0.38828 0.76868 1.0000
-0.10273 0.38175 0.80964 1.0000
0.02448 0.51120 0.59391 1.0000
0.14323 0.49332 0.24463 1.0000
0.23598 0.56322 0.37858 1.0000
0.46070 0.50489 0.50155 1.0000
0.56444 0.44226 0.82692 1.0000
0.62392 0.54993 0.35794 1.0000
0.45268 0.47273 0.66131 1.0000
0.21030 0.41723 0.66634 1.0000
0.36874 0.54625 0.34632 1.0000
0.12808 0.33514 0.82688 1.0000
0.59352 0.50863 0.49126 1.0000
0.66651 0.58746 0.25019 1.0000
After 6 tests (300000 rotations + 100000 translations for each run, using the first 100 extracted "|Fobs|" for regenerating the powder pattern), the R values are in the range 0.514-0.121, with several proposals at R~0.25. The solution at R=0.121 is fully adequate as a starting model for a subsequent Rietveld refinement.
pyrene
This compound was used as an example for the GAP program (Genetic Algorithm
Program, still unavailable) (Zeit. Kristallogr. 212, 1997, 550-552). By
GAP, a solution for the whole C16D10 molecule (ToF neutron data) could
be obtained in 33 seconds.
ESPOIR is not able to do as well; however, the C16 group is located here from X-ray data. The R values are in the range 0.140-0.317 for 10 tests (pyrene.dat) :
Test on Pyrene 13.649 9.256 8.470 90.0 100.28 90.0 P 21/A 1.52904 4 16 1 1 0.02511 -0.04562 0.03019 3 c 16 3.0 0 1 0 1 1.3 1.0 0.01 7. 2. 2000 400000 100000 400000 0.25 2 4 -10
0.00 0.00 0.00 90.00 90.00 90.00
4.2223 -0.3721 3.4328 1.00
4.6117 0.2277 2.2644 1.00
3.9412 -0.0713 1.0618 1.00
4.2967 0.5248 -0.1984 1.00
3.6721 0.2194 -1.3151 1.00
2.5940 -0.6831 -1.3384 1.00
1.8878 -1.0209 -2.5169 1.00
0.8355 -1.9345 -2.4719 1.00
0.4684 -2.5417 -1.3284 1.00
1.1167 -2.2177 -0.1092 1.00
0.7541 -2.8416 1.1301 1.00
1.3758 -2.5658 2.2552 1.00
2.4843 -1.6059 2.2694 1.00
3.1909 -1.3069 3.4678 1.00
2.8695 -1.0098 1.0859 1.00
2.1862 -1.3042 -0.1133 1.00
The Cartesian coordinates were taken from the Cambridge Structural Database (CSD); ESPOIR recognizes them as Cartesian because a=b=c=0 is given for the fragment cell.
1-methylfluorene
The structure of this compound was determined by the OCTOPUS96 program
(still unavailable, Monte Carlo approach) (J. Mater. Chem. 6, 1996, 1601-1604).
The solution was from a C13 fragment at move 18251 in a run of 20000 events,
with Rwp = 33.7%.
The same conditions were used here : data range 5-55° (2-theta), with the C13 fragment; however, only the first 100 extracted "|Fobs|" were retained for building the regenerated powder pattern. The R values obtained for 10 runs are in the range 0.403-0.250 (success ratio : 2/10) for 400000 events (300000 rotations + 100000 translations, needing ~20 min for one run on a Pentium II 266 MHz). At R=0.25, the structure is determined, although one C atom is still missing. It should be located by Fourier difference (methyl.dat).
Test on 1-methylfluorene 14.2973 5.7011 12.3733 90.00 95.1060 90. P 21/N 1.52904 4 13 1 1 0.02511 -0.04562 0.03019 3 c 13 3.0 0 0 0 1 1.0 0.01 7. 2. 2000 400000 100000 400000 0.25 2 4 -10
14.297300 5.701100 12.3733 90.00 95.1060 90.
0.29800 1.24300 -0.03500 1.00
0.37100 1.08300 -0.04400 1.00
0.39900 0.91000 0.03700 1.00
0.33900 0.86300 0.11900 1.00
0.26500 1.01800 0.12500 1.00
0.19700 1.02800 0.20900 1.00
0.18600 0.88300 0.29700 1.00
0.11300 0.93100 0.36300 1.00
0.05300 1.12000 0.33600 1.00
0.06100 1.25900 0.24400 1.00
0.13500 1.21300 0.18300 1.00
0.16300 1.52200 0.08400 1.00
0.24300 1.20000 0.05000 1.00
tetracycline hydrochloride
This is the SDPDRR sample II. It was solved in due time by two participants,
one of whom used a GOM (Global Optimization Method, program DRUID, still
unavailable) and obtained the whole structure from the first 100 reflections
extracted by the Pawley method (the other successful participant used
the Patterson method). It was also solved after the deadline by the commercial
PowderSolve software (MSI).
ESPOIR version 2 cannot currently cope with a fragment and an isolated atom (Cl here) together. But can the tetracycline fragment, without the Cl atom, be located from the SDPDRR data ? Absolutely yes, by using the same model (tetracycline hexahydrate from CSD) that gave the success to the GOM procedure (testtetra.dat):
Test on tetracycline hydrochloride C22H25ClN2O8 10.980181 12.852233 15.733344 90.0 90.0 90.0 P 21 21 21 0.692 4 32 3 1 0.00600 0.00446 0.00257 3. c n o 22 2 8 3.5 0 0 0 1 1.0 0.005 8. 8. 8. 2 5000 400000 100000 400000 0.35 2 4 -10
10. 10. 10. 90. 90. 90.
-0.22081 -0.53245 -0.35051 1.0
-0.27849 -0.66194 -0.37186 1.0
-0.41052 -0.69479 -0.31734 1.0
-0.20611 -0.75874 -0.44949 1.0
-0.05317 -0.75703 -0.43833 1.0
-0.00174 -0.76502 -0.68536 1.0
0.14454 -0.87923 -0.52443 1.0
0.01312 -0.62129 -0.40945 1.0
0.07153 -0.62111 -0.26783 1.0
0.15275 -0.49490 -0.23964 1.0
0.19505 -0.48280 -0.09118 1.0
0.27360 -0.60591 -0.04888 1.0
0.27359 -0.35248 -0.07258 1.0
0.37728 -0.34364 0.01915 1.0
0.44414 -0.22175 0.03971 1.0
0.40770 -0.10912 -0.03139 1.0
0.30356 -0.11522 -0.12295 1.0
0.23606 -0.23711 -0.14507 1.0
0.12660 -0.24238 -0.24374 1.0
0.07787 -0.37109 -0.28573 1.0
-0.02680 -0.37653 -0.37392 1.0
-0.08748 -0.50566 -0.42679 1.0
-0.44952 -0.82301 -0.32213 1.0
0.00533 -0.83355 -0.55292 1.0
-0.26825 -0.44164 -0.28126 1.0
-0.48680 -0.60976 -0.26366 1.0
-0.25540 -0.85358 -0.51172 1.0
0.07413 -0.47422 -0.01185 1.0
0.27169 -0.00178 -0.19072 1.0
0.07709 -0.13458 -0.28894 1.0
-0.08530 -0.26761 -0.42377 1.0
-0.12576 -0.48989 -0.56368 1.0
After 10 tests (300000 rotations + 100000 translations for each run, using the first 100 extracted "|Fobs|" for regenerating the powder pattern), the R values are in the range 0.368-0.264. The solution at R=0.264 is fully adequate for attempting a subsequent Rietveld refinement and finding the remaining Cl atom by Fourier difference synthesis. You could also try to find it with ESPOIR, just by fixing the tetracycline model (with amplitudes of moves = 0) and using the "scratch" option to locate the Cl atom...
You may try to modify/improve ESPOIR in order to include the simultaneous search for several molecular fragments. If you do so, please contact me, and remember that you must keep the source code open (GNU license).
In the "scratch" option, ESPOIR uses "brute force", thanks to fast and cheap computers. Is it really more capable of solving your structure than the classical Patterson and direct methods ? This is not so sure. Also have a look at the more conventional methods described in the SDPD tutorial. More generally, visit the SDPD Database, and subscribe to the SDPD Mailing List.
Introduce the possibility of including several fragments to be translated and rotated simultaneously, or fragments + isolated atoms. Add the possibility of treating torsion angles.
If you want to help, or do the whole job, you are welcome !