Discussion List Archives

Re: Draft CIF twin dictionary for approval

Colleagues,

In principle I do not see a problem with combining the two loops, and there is an advantage in using a familiar structure.  There may be problems in the details.  For example giving an id a minus sign to indicate that the next line belongs to the same observation is a CIF no-no.  There is no significance in the order in which the lines in a list appear, but this should not be a serious problem.  The _datum_id will do the job and allow one to extract the list of just the observed structure factors.  Using the datum_id does not restrict one to grouping all the contributions to the same observation together. 

One reason for using two lists is to reduce the number of times Fobs has to be given.  Is there a problem, for example, if the values of Fobs for a given observed reflection are not the same?  With the two lists it is easier to provide Fobs and the corresponding Fcalc, which is a sum of the different twin contributions weighted according to the size of their contribution.  It is thus directly related to Fobs in a way that the individual twin contributions are not.  This Fcalc(sum) could, of course, also be included in a single list, but it would have to be repeated as often as the Fobs.
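As a rough illustration of the summed Fcalc described above, the following sketch adds up the individual twin contributions for each observation. The weights are assumed here to be one factor per twin individual (for instance a twin fraction); the thread itself leaves the exact meaning of the weights open. The function name, data layout and all numbers are hypothetical and chosen only for illustration.

```python
def summed_f2_calc(contributions, weights):
    """Return {datum_id: weighted sum of the individual calculated F^2}.

    contributions: list of (datum_id, individual_id, f2_calc_individual)
    weights:       dict individual_id -> weight (assumed: one per twin individual)
    """
    totals = {}
    for datum_id, individual_id, f2_calc in contributions:
        # Each contribution is scaled by the weight of its twin individual.
        totals[datum_id] = totals.get(datum_id, 0.0) + weights[individual_id] * f2_calc
    return totals


# Illustrative numbers only, not taken from any real data set.
weights = {1: 0.75, 2: 0.25}
contributions = [
    (1, 1, 100.0),  # observation 1, individual 1
    (1, 2, 200.0),  # observation 1, individual 2 (overlapping)
    (2, 1, 40.0),   # observation 2, individual 1 only
]
totals = summed_f2_calc(contributions, weights)
```

With two lists, each summed value would sit once beside its Fobs; in a single list it would be repeated on every line sharing the same datum_id.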

On a technical point, the names would have to be changed, as all the items in a list must belong to the same category and so start with the same category name, but this should not be a problem.

Clearly we need to have a look at this and get back to the core dictionary management group with a solution, but before we do this, it would be helpful to know if there are other suggestions we should consider.

David
Chair: Core dictionary management group

On 10/22/2013 3:56 AM, Tony Linden wrote:
Dear James,

Thanks for your thoughts.  I do appreciate all that you, Vic and others have put into this over a very long period and sorry to quibble at this late stage.  As far as I can recall, this is the first time I have seen at least the material relating to reflection handling.

Having thought a little more over this overnight and in light of George's remarks yesterday, my strong suggestion would be to avoid having two lists.  Given that the HKLF5 format has been around for over 20 years (the similar input for CRYSTALS even longer), people are already very used to working with that and mostly understand it and I think all software around can work with it.  So why re-invent a wheel that has been working well for 20+ years?  I think it would be simpler for USERS if the CIF definitions emulated the HKLF5 style, than have something new and (at first glance) not intuitive and potentially open to misunderstanding.  You can probably relatively easily morph what you have done already into that style.  Yes, one could run over your two lists (easy, yes, but the programmer must be very careful, I suspect) and generate an HKLF5 type file, but that requires either helper programs to be written (which would frustrate users if they have to have a separate step) or the existing software to be adapted (which you will note George is not that keen on these days).

Would the following work?  Needs only a single list.
  loop_
     _twin_refln_datum_id  (instead of "_twin_contribution_datum_id")
     _twin_contribution_index_h  (maybe "twin_individual_index_*" would be better??)
     _twin_contribution_index_k
     _twin_contribution_index_l
     _twin_refln_F_squared_calc
     _twin_refln_F_squared_meas
     _twin_refln_F_squared_sigma
     _twin_refln_include_status
     _twin_contribution_individual_id (or just "_twin_individual_id"??)

Note that datum_id and individual_id are essentially combined in HKLF5 (the former as a running counter is not really needed), with the negative sign for individual_id cleverly indicating there are more entries for other individuals to follow (immediately) for that reflection.

The above could be used for structure factor output as well as reflection input (by excluding the _calc and _status values from the loop).

The names  _twin_refln_F_squared_* could be modified if you like to make it clear that these values are the sum over all contributors to that reflection and not the individual values.  At least for _meas. In principle, _calc could be the actual calculated value for the individual.

By the way, what would be the weights in the description "The calculated F^2 for an observed peak is a weighted sum of the contribution of all the twin components"?  Aren't they unity?

Best wishes,
Tony




At 16:58 +1100 22.10.13, James Hester wrote:
Dear Core CMG members,

As one of those involved in 'gestating' the twinning dictionary let me discuss a few of Tony's concerns:

Tony writes:

"A fully overlapping reflection would have n individual contributors, each with its own values of h,k,l, but one cannot know the contribution of each to the measured F**2 and sigma (in the input data).  So how is that handled in the current definitions; i.e. I cannot see an example where there are two entries with different individual_id and h,k,l, but the same F_squared_meas and F_squared_sigma.  I am guessing that the _twin_contribution_ list in combination with the _twin_reflection_ list achieves this.  OK, but then you have two lists which must be given together, so things are getting complicated and long."

The logic of the two lists as surmised by Tony is correct.  The F_squared_meas and F_squared_sigma values for each observed spot are listed once, in the twin_refln loop, and the individuals contributing to each observed spot are listed once, in the twin_contributions loop.  Presumably the documentation on the dictionary needs to be improved to make this clear.

Tony continues:

"Furthermore, the appearance of all h,k,l assignments in the _twin_contribution_ list is then to some extent duplicated (and complicated) by the use of h,k,l again, but only from one component in the _twin_reflection_ list.  Understanding how these two lists are constructed, indexed and combined seems rather complicated and takes quite a bit of digesting of the dictionary to come to grips with.  Could it be simpler???"

The h,k,l and individual_id columns in the twin_refln list are optional and can be removed altogether from the specification with no loss of information.  We provided these in case they were seen as useful, but so far it looks like they are simply confusing.  Tony - how does your assessment change if h,k,l and individual_id are removed from twin_reflns?

A single list as similar as possible to the HKLF5 output of SHELX can be obtained by running over the twin_contribution loop and inserting the F_squared_calc, F_squared_meas, F_squared_sigma values from the twin_refln loop for the matching datum_id, which does not strike me as a complex operation.
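A minimal sketch of the merge described above, assuming both loops have already been read into plain Python structures. The function name and data layout are hypothetical, the datum_id values are invented for the example, and only the measured value and sigma are carried (a calculated value could be carried the same way). The intensities are borrowed from the HKLF5 example quoted elsewhere in this thread.

```python
def merge_twin_loops(twin_refln, twin_contribution):
    """Build one HKLF5-like row per contribution.

    twin_refln:        dict datum_id -> (f2_meas, f2_sigma)
    twin_contribution: list of (datum_id, h, k, l, individual_id)
    """
    # Group contributions by datum_id; loop order carries no meaning in CIF.
    groups = {}
    for datum_id, h, k, l, individual_id in twin_contribution:
        groups.setdefault(datum_id, []).append((h, k, l, individual_id))

    merged = []
    for datum_id in sorted(groups):
        f2_meas, f2_sigma = twin_refln[datum_id]
        members = groups[datum_id]
        for pos, (h, k, l, individual_id) in enumerate(members):
            # HKLF5 convention: the component number is negative on every
            # line of an observation except the last one.
            is_last = pos == len(members) - 1
            component = individual_id if is_last else -individual_id
            merged.append((h, k, l, f2_meas, f2_sigma, component))
    return merged


twin_refln = {1: (440232.0, 6723.0), 2: (336093.0, 5357.0)}
twin_contribution = [
    (2, -3, -1, 6, 1),   # deliberately out of order
    (1, -4, 2, -5, 2),
    (1, -4, -1, 6, 1),
]
merged = merge_twin_loops(twin_refln, twin_contribution)
```

Grouping by datum_id first means the contributions need not be adjacent in the CIF loop, which is the one place where some care is needed.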

all the best,
James.

On Mon, Oct 21, 2013 at 8:22 PM, Tony Linden <alinden@oci.uzh.ch> wrote:

Dear David,

I looked over the twin CIF definition document.  To be honest, at first reading the definitions needed and used to handle reflection data seem very complicated for general use.  Below are some thoughts. I don't know if these are addressed in the current effort or things not yet quite resolved.  Maybe they are, but I did not get my head around all this sufficiently well to know.

First, there are some typos I detected.
"the this" appears in several places and probably "this" has to be deleted.
In a description, the range 620 - 624 is given followed by 624 - 629. Probably the latter starts at 625.

Would the definitions for reflection data allow the full input reflection file to be reconstructed, e.g. for use in CRYSTALS, SHELXL, etc.?  I think the answer is yes, but I want to be sure people agree.

A fully overlapping reflection would have n individual contributors, each with its own values of h,k,l, but one cannot know the contribution of each to the measured F**2 and sigma (in the input data).  So how is that handled in the current definitions; i.e. I cannot see an example where there are two entries with different individual_id and h,k,l, but the same F_squared_meas and F_squared_sigma.  I am guessing that the _twin_contribution_ list in combination with the _twin_reflection_ list achieves this.  OK, but then you have two lists which must be given together, so things are getting complicated and long.  Furthermore, the appearance of all h,k,l assignments in the _twin_contribution_ list is then to some extent duplicated (and complicated) by the use of h,k,l again, but only from one component in the _twin_reflection_ list. Understanding how these two lists are constructed, indexed and combined seems rather complicated and takes quite a bit of digesting of the dictionary to come to grips with.  Could it be simpler???

Compare this with lines in the HKLF5 style input for SHELXL (CRYSTALS is similar)...
   h   k   l  Fo**2     sigma  component (=individual_id)
  -4   2  -5  440232    6723  -2
  -4  -1   6  440232    6723   1
  -3  -1   6  336093    5357   1
  -2  -1   6 2138204   47562   1
  -1  -1   6   71870    1617   1
   0   1  -6 2044486   44513  -2
   0  -1   6 2044486   44513   1
The first and last two lines are overlapping refls from both components (common F**2 and sigmas), lines 3-5 are non-overlaps from component 1 only.
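To make the sign convention concrete, here is a small sketch that reads lines like those above and assigns a running datum_id to each observation. It assumes whitespace-separated fields, which holds for this example but not necessarily for real fixed-width HKLF5 files where columns can run together. The function and field names are hypothetical.

```python
from collections import namedtuple

Contribution = namedtuple(
    "Contribution", "datum_id h k l f2_meas sigma individual_id"
)


def parse_hklf5(lines):
    """Assign a running datum_id to HKLF5-style lines.

    A negative component number means the next line belongs to the same
    observation; a positive one closes the observation.
    """
    rows = []
    datum_id = 1
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # this sketch simply skips blank or malformed lines
        h, k, l = (int(x) for x in fields[:3])
        f2_meas, sigma = float(fields[3]), float(fields[4])
        component = int(fields[5])
        rows.append(
            Contribution(datum_id, h, k, l, f2_meas, sigma, abs(component))
        )
        if component > 0:
            datum_id += 1  # observation complete; next line starts a new one
    return rows


example = """\
  -4   2  -5  440232    6723  -2
  -4  -1   6  440232    6723   1
  -3  -1   6  336093    5357   1
  -2  -1   6 2138204   47562   1
  -1  -1   6   71870    1617   1
   0   1  -6 2044486   44513  -2
   0  -1   6 2044486   44513   1
""".splitlines()

rows = parse_hklf5(example)
datum_ids = [r.datum_id for r in rows]
```

For the seven lines above this gives five observations, with the first two and the last two lines each sharing one datum_id.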

I am not trying to suggest we need to follow the SHELXL path always, but I am interested in keeping things relatively simple and easy to understand for users.  To me, the SHELXL idea seems less complicated than two separate _twin_contribution_ and _twin_reflection_ lists.

Just my thoughts...

P.S. Following our discussions in Warwick, I was working on a list of CIF items that we might think about updating or other new things needed.  I will get that to you eventually, but things have been too hectic lately for me to devote much time to it (which is why I could not accept James Hester's kind invitation to step into your shoes).

Best wishes,
Tony




Dear Colleagues,

After many years of gestation, a draft CIF dictionary of items for describing twinning in crystals is now available.  It is attached to this email which is being circulated to the core CIF Dictionary Management Group for your approval, this being the final step in the formal COMCIFS approval process.  If you are receiving this message you are invited to review the attached draft and either indicate your approval, or draw attention to potential problems, by replying to the core DMG list at coredmg@iucr.org. When approved the twinning dictionary will become an addendum to the coreCIF dictionary.

The draft is open for review for six weeks ending  on 11 November 2013.  If you have not replied by then, it will be assumed that you approve of the attached document as circulated.  If any questions are raised we will try if possible to resolve them within the review period.

The dictionary is also available at the URL https://github.com/jamesrhester/twinning-dic


I look forward to receiving your response.

David Brown
Chair of the core CIF Dictionary Management Group

Attachment converted: Macintosh HD:cif_twinning_ver0.6.dic (TEXT/R*ch) (0072E233)
Attachment converted: Macintosh HD:idbrown.vcf (TEXT/R*ch) (0072E234)
_______________________________________________
coreDMG mailing list
coreDMG@iucr.org
http://mailman.iucr.org/mailman/listinfo/coredmg



--
-----------------------------------------------------------------------
 Prof. Dr. Anthony Linden
 Editor, Acta Crystallographica Section C
 University of Zurich
 Institute of Organic Chemistry
 Winterthurerstrasse 190
 CH-8057 Zurich, Switzerland

 Phone:  +41 44 635 4228
 Fax:    +41 44 635 6812

 http://www.chem.uzh.ch/linden
 alinden@oci.uzh.ch
-----------------------------------------------------------------------
 2014: The International Year of Crystallography
 http://www.iycr2014.org
-----------------------------------------------------------------------
 The Zurich School of Crystallography,
 University of Zurich, June 7-20, 2015
 http://www.chem.uzh.ch/linden/zsc
=======================================================================




--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148




