(82) BNF; magic string; extension to mmCIF
- To: COMCIFS@iucr
- Subject: (82) BNF; magic string; extension to mmCIF
Dear Colleagues

Thanks to everyone who responded directly to David Brown regarding the vote
on the revised terms of reference for COMCIFS. I'm sorry that it is taking me
so long to pick up all the threads and unfinished business from my unexpected
trip away during late February.

This time I have a couple of queries from Peter Murray-Rust (PMR>) and a
request for technical approval of a set of data items to extend the mmCIF
dictionary.

D82.1 BNF description of STAR/CIF/dictionary CIF format
-------------------------------------------------------

PMR> Is there a definitive statement in precise terms (e.g. in BNF) of (a)
PMR> CIF (b) 'dictionary-enhanced CIF' (i.e. CIF-with-save-frames)? Although
PMR> there is prose describing CIF there are places where it is not clear. For
PMR> people coming from outside the Crystallographic community there has to be
PMR> a precise definition of the language.
PMR>
PMR> If there *is* a BNF or equivalent, could it please be mounted on iucr.org?

The "Backus-Naur Form" is a formal description of the syntax of a computer
language. Nick Spadaccini and Peter Keller worked some time ago on a STAR BNF,
and I append the last version of their work below. Members are welcome to
review and criticise the proposal if they so desire. If there are no
objections, I would like to post this version on the CIF web page. I trust
that it (or a revised version) will also find its way into International
Tables Vol. G.

I invite a proposal for a CIF form, and also for the data dictionary format
(question from a computer science viewpoint - are one or two different BNF
specifications required for DDL1 and DDL2 dictionaries? The DDL1 dictionaries
conform fully to the CIF syntax; DDL2 dictionaries also permit save frames).

==============================================================================
(Extracts from correspondence
 Date: Tue, 21 Nov 1995 12:10:17
 From: Peter Keller <bsspak@bath.ac.uk>
 To: Brian McMahon <bm@iucr.ac.uk>, Nick Spadaccini <nick@sdsc.edu>,
     Sydney Hall <syd@crystal.uwa.edu.au>)

A third attempt at a specification for STAR

Why do CIF programmers need to know about STAR?

I'm not a great expert on database theory, but I don't think that anyone
would contradict me in saying that there are three broad aspects to using any
datafile format. They are:

1) Lexical considerations: what arrangements of bytes (characters) are
allowed, how they are grouped into tokens (words), and how those tokens are
classified into various types.

2) Syntax or grammar: the function which each type of token has, the order in
which tokens can follow each other, and how they are grouped into larger
structures.

3) Semantics: the interpretation of the data, what it means, manipulation of
the information which it expresses.

For example, a sequence of non-space characters which begins with an
underscore is a token of type 'data name' [lexical analysis]. It identifies a
data value (which is another type of token), and must either be followed by a
single data value, or appear in the header of a loop structure [syntactical
analysis]. The semantics involve recognising the data name itself, in order
to attach a meaning to the data value which it identifies.

Now, for CIF's, STAR specifies all of the lexical and syntactical aspects, as
well as a very few of the semantics. (The bulk of the semantical aspects of
CIF are dealt with by DDL's and dictionaries). This means that any CIF file
must be a valid STAR file, which means, in turn, that CIF programmers must be
aware of the whole of the STAR syntax, not just the parts which are used by
CIF. For example, in STAR, a framecode is a sequence of non-blank characters,
the first of which is a '$'. Even though framecodes are not used in CIF's, it
is still necessary to be aware of them, to avoid trying to start a non quoted
text string with a '$'.

The most recent published specification of the STAR syntax appears in "The
STAR File: Detailed Specifications" (S.R.Hall and N.Spadaccini, J. Chem. Inf.
Comput. Sci, 34, p505-508 (1994)). This paper was itself intended to improve
the precision of an earlier specification; however, code based on the newer
specification still cannot be relied on to parse STAR files accurately
without making certain assumptions during lexical analysis. Also, there were
a number of typographical errors and omissions.

Hopefully, this specification will be useful to programmers writing their own
lexers and parsers directly. It is also intended to be easily convertible to
a working parser using code generators such as yacc/lex, bison/flex, PCCTS,
PRECCX, etc., with a minimum of interpretation required from the programmer.
It seeks to consolidate existing practice, and to prevent ambiguities arising
in the future. It is not intended to extend the syntax.

N.B. This syntax should be read in conjunction with the JCICS paper.

A few notes on the table:

1). A standard set of symbols is used in this table. They are:

    *       means "zero or more"
    +       means "at least one"
    {....}  groups a number of terms together.
    [....]  means "optional" (in a syntactic sense only - it does not mean
            that inserting or removing the item has no effect on the data
            representation being expressed.)
    |       means "or"
    ::=     means "is defined to be"
    <...>   is a syntactic or lexical construction. If it appears on the
            right hand side of a definition, the definition of that term is
            substituted in.

2). Items whose names are in UPPER CASE on the left-hand side of the
definitions below, are those which would conventionally be considered as
tokens during lexical analysis. This is only intended as a guide - in
particular, tokens which can be of arbitrary length, may need to be processed
in smaller portions in a particular implementation. For CIF's, this is only a
consideration for the <semi_colon_bounded_text_string> type - all others are
limited in size by the 80 characters to a line rule.

3). Comments. There are two possible ways of treating comments. The
characters from the initial '#' to the last character on the line (inclusive)
can simply be thrown away during lexical analysis. In this case, all the
<comment>* constructions can be dropped from the definition below. (This is
the treatment implied by the 1994 JCICS paper.)

However, an application may carry out an operation on the information
contained within a CIF (as opposed to the data structures themselves),
generating a new CIF in the process. In such a case, it will normally be
useful to retain comments in their original places in the output file, as far
as possible. For this purpose, the comment specifications below can be used.
The approach taken is that a comment is defined to be a token type. The file
can start with an arbitrary number of comments, and each non-comment token
can be followed by an arbitrary number of comments.

4). Whitespace. This is a potential minefield for precise definition, since
which whitespace characters are allowed at particular places in a STAR file
(or indeed any text file or stream) is system and protocol dependent. Getting
to grips with the implications of the BNF table is hard work, so the rules
are summarised here.

Basically, six whitespace characters are permitted, all of which may separate
tokens (in the sense of note 2 above, and the upper case definitions in the
table below):

    ht   (ASCII 9)
    lf   (ASCII 10)
    vt   (ASCII 11)
    ff   (ASCII 12)
    cr   (ASCII 13)
    ' '  (ASCII 32)

Of these, vt and ff may only appear between tokens, not within them. ht and
' ' (i.e. <blank>) are allowed within quoted strings, comments and semi-colon
bounded strings. The latter will contain <newline>'s as well. ff (but not vt)
is taken to imply a new line, and so can introduce a semi-colon bounded text
string and terminate a comment (and also delimit lines for the purposes of
lines in a CIF being no more than 80 characters).

<newline> subsumes cr and lf, but this does not imply that either or both are
appropriate in any particular context. Some specific possibilities are:

    Unix         <newline> ::= <lf>
    MacOS        <newline> ::= <cr>
    MS-DOS       <newline> ::= <cr><lf>   (raw file/fopen etc. in binary mode)
    MS-DOS       <newline> ::= '\n'       (fopen etc. in text mode)
  * ASCII FTP    <newline> ::= <cr><lf>   (RFC959)
  * Gopher       <newline> ::= <cr><lf>   (RFC1436)
  * HTTP         <newline> ::= [<cr>]<lf>

    (For HTTP, strictly <cr><lf>, but the recommendation for tolerant http
    clients and servers states that: "Lines should be regarded as terminated
    by the Line Feed, and the preceding Carriage Return character ignored."
    See section 2.2 and Appendix B of
    http://www.w3.org/pub/WWW/Protocols/HTTP1.0/draft-ietf-http-v10-spec-04.txt)

  * For ftp, gopher and http, think of a network client parsing a STAR
    information stream directly from a server's port.

Other environments may require extension of this picture - as a programmer,
you have to be familiar with your own development environment.

Note that tokens must be separated by at least one whitespace character. In
particular, it means that constructions such as the following are not
allowed, even though they are, in principle, interpretable:

_data.item_1
; a bit of
multi-line text
;_data.item_2 .....

There must be at least one whitespace character after the second semi-colon
here.

5). Quoted text strings. These are defined so that a quote of the same type
as that which encloses a string, can also appear in the middle of that
string, as long as it is followed by a non-blank character. For example, in:

    _data.item 'Peter's example'

the middle quote does not close the text string, because it is not followed
by the whitespace character which is necessary to separate tokens. It is the
third quote which fulfills that function.

Note, though, that the DDL version 2.x, and mmCIF, restrict the characters
which may appear in data items (check the contents of the ITEM_TYPE_LIST
category in the dictionary). The above example only became legal in a
macromolecular CIF at version 0.7.28 of mmCIF (1995-10-6), and then only for
certain types of data.

6). Non-quoted text string. This complicated-looking construction is intended
to distinguish between a semi-colon appearing in column 1, or elsewhere on
the line:

;A_few_characters

would be the first line of a <semi_colon_bounded_text_string>, and

    ;A_few_characters

is a <non_quoted_text_string>.

Also, a <non_quoted_text_string> is now explicitly prevented from starting
with a quote, so it is illegal at the lexical level for an opening quote to
appear without a corresponding closing quote on the same line.

7). During lexical analysis, tokens which begin with a fixed sequence of more
than one character must be checked for first. (These are <global_heading>,
<data_heading>, <save_heading>, and the fixed tokens "loop_", "save_" and
"stop_" ). Other token types can now be determined just from the whitespace
character preceding the token (the 'leading context') and the first character
of the token itself.

Note that of these tokens, only <data_heading> and "loop_" can appear in a
CIF. (CIF dictionaries are not themselves CIF data files, so they may contain
any of these tokens.) However, this does not mean that it is unnecessary for
a CIF application to check for the other tokens, since they still have
meaning at the STAR level.
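The order of checks described in note 7 can be sketched in Python as follows. This is only an illustrative sketch, not part of the specification: the function name classify_token, the at_line_start flag and the labels LOOP, STOP, SAVE_END and INVALID are assumptions made for the example, and the keywords are matched in lower case exactly as they are written in the table. The argument is the run of non-blank characters that begins a token, so a quoted string or comment is recognised from its opening characters only.

```python
# Hypothetical sketch of the two-step classification in note 7.
# Names and labels are illustrative; they are not defined by STAR or CIF.

# Fixed tokens that must match the whole word.
FIXED_WORDS = {
    "loop_": "LOOP",
    "stop_": "STOP",
    "save_": "SAVE_END",
    "global_": "GLOBAL_HEADING",
}

# Fixed prefixes that must be followed by at least one non-blank character.
PREFIXED = (
    ("data_", "DATA_HEADING"),
    ("save_", "SAVE_HEADING"),
)


def classify_token(word: str, at_line_start: bool) -> str:
    """Classify a token from the non-blank characters that begin it.

    at_line_start is the 'leading context': True if the word starts in
    column 1 (preceded by a newline or form feed), False if it follows
    blanks on the same line.
    """
    if not word:
        raise ValueError("empty token")

    # Step 1: tokens beginning with a fixed multi-character sequence.
    if word in FIXED_WORDS:
        return FIXED_WORDS[word]
    for prefix, kind in PREFIXED:
        if word.startswith(prefix) and len(word) > len(prefix):
            return kind

    # Step 2: leading context plus first character.
    first = word[0]
    if first == "_":
        return "DATA_NAME" if len(word) > 1 else "INVALID"
    if first == "$":
        return "FRAME_CODE" if len(word) > 1 else "INVALID"
    if first == "#":
        return "COMMENT"
    if first == '"':
        return "DOUBLE_QUOTED_TEXT_STRING"
    if first == "'":
        return "SINGLE_QUOTED_TEXT_STRING"
    if first == ";" and at_line_start:
        return "SEMI_COLON_BOUNDED_TEXT_STRING"
    return "NON_QUOTED_TEXT_STRING"
```

With this sketch, ;A_few_characters is classified as SEMI_COLON_BOUNDED_TEXT_STRING when at_line_start is True and as NON_QUOTED_TEXT_STRING when it is False, while an unquoted stop_ is classified as STOP rather than as a data value.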
If you want to assign the sequence of characters (s, t, o, p, _) to a data
name in a CIF, you have to quote it:

    _data.item 'stop_'

otherwise the CIF will not be a legal STAR file (or, at the very least, will
not mean what you want it to mean).

8). Loops. It is a requirement of STAR that a loop structure must have
complete loop packets, i.e. the number of data values in a nested loop which
are assigned to a particular <data_loop_definition> at its particular level
of nesting, must be an integral multiple of the number of <data_name>'s in
that <data_loop_definition>. CIF's don't allow nested loops, so this
condition can be expressed in a simpler way: the number of data values in a
loop structure must be an integral multiple of the number of data names in
the loop header. This sounds obvious, but it is very difficult to express in
a table such as the one below, so it is re-stated here.


<star_file> ::= <comment>* {<data_block> | <global_block>}*

<data_block> ::= <data_heading> <comment>* <data_block_body>*

<global_block> ::= <global_heading> <comment>* <data_block_body>+

<GLOBAL_HEADING> ::= global_

<data_block_body> ::= {<data> | <save_frame>}+

<data> ::= <data_name> <comment>* <data_value> <comment>*
         | <data_loop>

<save_frame> ::= <save_heading> <comment>* <data>* save_ <comment>*

<data_loop> ::= loop_ <comment>* <data_loop_definition> <data_loop_values>

<data_loop_definition> ::= <data_loop_field>+

<data_loop_field> ::= { <data_name> | <nested_loop> } <comment>*

<nested_loop> ::= loop_ <comment>* <data_loop_definition> [stop_]

<data_loop_values> ::= <data_loop_item>+

<data_loop_item> ::= { <data_value> | stop_ } <comment>*

<data_value> ::= <non_quoted_text_string>
               | <double_quoted_text_string>
               | <single_quoted_text_string>
               | <semi_colon_bounded_text_string>
               | <frame_code>

<DATA_HEADING> ::= data_ <non_blank_char>+

<DATA_NAME> ::= _ <non_blank_char>+

<SAVE_HEADING> ::= save_ <non_blank_char>+

<NON_QUOTED_TEXT_STRING> ::=
        { { <line_begin><ordinary_char> } |
          { <line_space> {<ordinary_char>|<semi_colon>} } }
        <non_blank_char>*

<DOUBLE_QUOTED_TEXT_STRING> ::= <D_quote> <D_quote_string> <D_quote>

<D_quote_string> ::= {<D_quote> <non_blank_char> | <not_a_D_quote>}*

<SINGLE_QUOTED_TEXT_STRING> ::= <S_quote> <S_quote_string> <S_quote>

<S_quote_string> ::= {<S_quote> <non_blank_char> | <not_a_S_quote>}*

<SEMI_COLON_BOUNDED_TEXT_STRING> ::=
        <line_begin><semi_colon> <char>* <newline>
        <line_of_text>*
        <semi_colon>

<FRAME_CODE> ::= $<non_blank_char>+

<COMMENT> ::= { <line_space> | <line_begin> } #<char>* <line_begin>

<line_begin> ::= beginning of a new line of input
                 (i.e. leading { <newline> | <form_feed> } )

<blank> ::= space (ASCII 32) | horizontal tab (ASCII 9)

<line_space> ::= end of leading, or start of trailing {<blank> | <vert_tab>}

<non_blank_char> ::= ! shriek -> ~ tilde (ASCII 33-126)

<char> ::= <blank> | <non_blank_char>

<line_of_text> ::= [ <not_a_semi_colon><char>* ] <newline>

<newline> ::= End of line character(s) as defined by operating system,
              file structure and system libraries.

<form_feed> ::= ff (ASCII 12)

<vert_tab> ::= vt (ASCII 11)

<D_quote> ::= " (ASCII 34)

<S_quote> ::= ' (ASCII 39)

<not_a_D_quote> ::= ! shriek (ASCII 33)
                  | # sharp -> ~ tilde (ASCII 35-126)
                  | <blank>

<not_a_S_quote> ::= ! shriek -> & ampersand (ASCII 33-38)
                  | ( left bracket -> ~ tilde (ASCII 40-126)
                  | <blank>

<ordinary_char> ::= ! shriek (ASCII 33)
                  | % percent (ASCII 37)
                  | & ampersand (ASCII 38)
                  | ( left bracket -> : colon (ASCII 40-58)
                  | < less than -> ^ caret (ASCII 60-94)
                  | ` backquote -> ~ tilde (ASCII 96-126)

<not_a_semi_colon> ::= ! shriek -> : colon (ASCII 33-58)
                     | < less than -> ~ tilde (ASCII 60-126)
                     | <blank>

<semi_colon> ::= ; semicolon (ASCII 59)

==============================================================================

D82.2 "Magic number" to identify CIF format internally
------------------------------------------------------

PMR> Since dictionaries and data instances have different syntax it is
PMR> important that the files have some internal means of identifying their
PMR> type. [More generally perhaps CIF files should have a magic string at the
PMR> top to indicate what their type is? e.g.
PMR> #CIF DDL1.0 Crystallographic Information File (iucr.org)]
PMR>
PMR> Data will increasingly be delivered by servers as streams (i.e. without
PMR> suffixes, although hopefully with chemical/MIME stamps :-). For example,
PMR> what does:
PMR> http://www.pdb.bnl.gov/pdb-cgi/send-cif?foo
PMR> send - a dictionary or an instance? The author of client-side software
PMR> will need to know what is apparently sent so they can check on integrity.

Note the decision by the imgCIF/CBF community to use the magic string

    ###_CRYSTALLOGRAPHIC_BINARY_FILE: VERSION 1.0

at the start of a crystallographic binary file.

D82.3 Request for technical approval of extension to mmCIF dictionary
---------------------------------------------------------------------

The following request has been submitted by the mmCIF dictionary management
group. Note their proposal to release these data items to the community for
content review on April 15. Absence of comment by a COMCIFS member by April
15 will be taken as approval of the technical compliance of the submitted
items with the CIF standard.

> To COMCIFS -
>
> In accordance with the time-table established by the executive of the mmCIF
> working group, we are now ready to submit to COMCIFS the data items that
> will constitute the extension of the mmCIF dictionary from Version 1.0 to
> Version 2.0.
>
> These proposed new data items were submitted from the community (in this
> case, from Kim Henrick of EBI), reviewed by the appropriate member(s) of
> the panel of editors of the mmCIF dictionary for content, and then given
> further editorial review by the executive of the mmCIF working group. It
> is our understanding that the next step in the process is to submit the
> data items to COMCIFS for a quick review of technical compliance with the
> CIF standard, before submitting them for review by the crystallographic
> community. Following community review and revision, we will submit these
> data items to COMCIFS for formal approval.
>
> We would like to release them for community review by April 15, and hope
> that the members of COMCIFS will find them technically sound in short
> order so that the process of community review may begin.
>
> Paula Fitzgerald
> Helen Berman
> John Westbrook

The data items are appended to this circular. I confirm that the file
fragment exhibits no syntax errors under 'vcif'.
Regards
Brian
_______________________________________________________________________________
Brian McMahon                                             tel: +44 1244 342878
Research and Development Officer                          fax: +44 1244 314888
International Union of Crystallography                 e-mail:  bm@iucr.ac.uk
5 Abbey Square, Chester CH1 2HU, England                        bm@iucr.org
_______________________________________________________________________________

- - - - -

#################
## PHASING_MIR ##
#################

#
############################################################
## proposed additional data items in an existing category ##
############################################################
#
#########################################
## Submitted by Kim Henrick            ##
## Content review by Paula Fitzgerald  ##
## Editorial review by HB, JW and PMDF ##
#########################################

save__phasing_MIR.d_res_high
    _item_description.description
;   The highest resolution for the interplanar spacing in the
    reflection data used for the native data set. This is the
    smallest d value.
;
    _item.name                  '_phasing_MIR.d_res_high'
    _item.category_id           phasing_MIR
    _item.mandatory_code        yes
    _item_aliases.alias_name    '_phasing_MIR.ebi_d_res_high'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    _item_units.code            angstroms
    save_

save__phasing_MIR.d_res_low
    _item_description.description
;   The lowest resolution for the interplanar spacing in the
    reflection data used for the native data set. This is the
    largest d value.
;
    _item.name                  '_phasing_MIR.d_res_low'
    _item.category_id           phasing_MIR
    _item.mandatory_code        yes
    _item_aliases.alias_name    '_phasing_MIR.ebi_d_res_low'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    _item_units.code            angstroms
    save_

save__phasing_MIR.fom
    _item_description.description
;   The mean value of the figure of merit m for all reflections
    phased in the native data set.

        int P~alpha~ exp(i*alpha) dalpha
    m = --------------------------------
              int P~alpha~ dalpha

    P~a~ = the probability that phase angle a is correct

    int is taken over the range alpha = 0 to 2 pi.
;
    _item.name                  '_phasing_MIR.fom'
    _item.category_id           phasing_MIR
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR.ebi_fom'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR.fom_acentric
    _item_description.description
;   The mean value of the figure of merit m for the acentric
    reflections phased in the native data set.

        int P~alpha~ exp(i*alpha) dalpha
    m = --------------------------------
              int P~alpha~ dalpha

    P~a~ = the probability that phase angle a is correct

    int is taken over the range alpha = 0 to 2 pi.
;
    _item.name                  '_phasing_MIR.fom_acentric'
    _item.category_id           phasing_MIR
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR.ebi_fom_acentric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR.fom_centric
    _item_description.description
;   The mean value of the figure of merit m for the centric
    reflections phased in the native data set.

        int P~alpha~ exp(i*alpha) dalpha
    m = --------------------------------
              int P~alpha~ dalpha

    P~a~ = the probability that phase angle a is correct

    int is taken over the range alpha = 0 to 2 pi.
;
    _item.name                  '_phasing_MIR.fom_centric'
    _item.category_id           phasing_MIR
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR.ebi_fom_centric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR.reflns
    _item_description.description
;   The total number of reflections phased in the native data set.
;
    _item.name                  '_phasing_MIR.reflns'
    _item.category_id           phasing_MIR
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR.ebi_reflns'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .  0
        0  0
    _item_type.code             int
    save_

save__phasing_MIR.reflns_acentric
    _item_description.description
;   The number of acentric reflections phased in the native data set.
;
    _item.name                  '_phasing_MIR.reflns_acentric'
    _item.category_id           phasing_MIR
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR.ebi_reflns_acentric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .  0
        0  0
    _item_type.code             int
    save_

save__phasing_MIR.reflns_centric
    _item_description.description
;   The number of centric reflections phased in the native data set.
;
    _item.name                  '_phasing_MIR.reflns_centric'
    _item.category_id           phasing_MIR
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR.ebi_reflns_centric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .  0
        0  0
    _item_type.code             int
    save_

save__phasing_MIR.reflns_criterion
    _item_description.description
;   Criterion used to limit the reflections used in the phasing
    calculations.
;
    _item.name                  '_phasing_MIR.reflns_criterion'
    _item.category_id           phasing_MIR
    _item_aliases.alias_name    '_phasing_MIR.ebi_reflns_criteria'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    _item.mandatory_code        no
    _item_type.code             text
    _item_examples.case         '> 4 \s(I)'
    save_

#####################
## PHASING_MIR_DER ##
#####################

#
############################################################
## proposed additional data items in an existing category ##
############################################################
#
#########################################
## Submitted by Kim Henrick            ##
## Content review by Paula Fitzgerald  ##
## Editorial review by HB, JW and PMDF ##
#########################################

save__phasing_MIR_der.power_acentric
    _item_description.description
;   The mean phasing power P for acentric reflections in this
    derivative. Phasing power is <FH / Lack_of_closure>.

               sum|Fh~calc~^2^|
    P = (----------------------------)^1/2^
         sum|Fph~obs~ - Fph~calc~|^2^

    Fph~obs~  = the observed structure factor amplitude of this
                derivative
    Fph~calc~ = the calculated structure factor amplitude of this
                derivative
    Fh~calc~  = the calculated structure factor amplitude from the
                heavy atom model

    sum is taken over the specified reflections
;
    _item.name                  '_phasing_MIR_der.power_acentric'
    _item.category_id           phasing_MIR_der
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der.ebi_power_acentric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR_der.power_centric
    _item_description.description
;   The mean phasing power P for centric reflections in this
    derivative. Phasing power is <FH / Lack_of_closure>.

               sum|Fh~calc~^2^|
    P = (----------------------------)^1/2^
         sum|Fph~obs~ - Fph~calc~|^2^

    Fph~obs~  = the observed structure factor amplitude of the
                derivative
    Fph~calc~ = the calculated structure factor amplitude of the
                derivative
    Fh~calc~  = the calculated structure factor amplitude from the
                heavy atom model

    sum is taken over the specified reflections
;
    _item.name                  '_phasing_MIR_der.power_centric'
    _item.category_id           phasing_MIR_der
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der.ebi_power_centric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR_der.R_cullis_acentric
    _item_description.description
;   Residual factor R~cullis~ for acentric reflections in this
    derivative.

    Cullis R factor is <Lack_of_closure>/<Isomorphous difference>.

    NB: This is tabulated for acentric and anomalous terms,
    extending the former definition.
;
    _item.name                  '_phasing_MIR_der.R_cullis_acentric'
    _item.category_id           phasing_MIR_der
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der.ebi_Rcullis_acentric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR_der.R_cullis_anomalous
    _item_description.description
;   Residual factor R~cullis~ for anomalous reflections in this
    derivative.

    Cullis R factor is <Lack_of_closure>/<Isomorphous difference>.

    NB: This is tabulated for acentric and anomalous terms,
    extending the former definition.

    Anomalous difference is |FPHi(+) - FPHi(-)|.
    Calculated anomalous difference is 2 * FHi" * sin(PHIx)
    where PHIx is the protein phase.
    Lack of closure is | Anom.Diff - Calc.Anom.Diff|
    Cullis Rfactor is <Lack_of_closure>/ <Anomalous difference>

    This is tabulated for acentric terms. Any value <1.0 means
    there is some contribution to the phasing from the anomalous
    data.

              Sum(h) Sum(phi) prob(phi) |Dano_obs(h) - Dano_calc(h,phi)|
    RC(ano) = ----------------------------------------------------------
                                Sum(h) |Dano_obs(h)|
;
    _item.name                  '_phasing_MIR_der.R_cullis_anomalous'
    _item.category_id           phasing_MIR_der
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der.ebi_Rcullis_anomalous'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR_der.R_cullis_centric
    _item_description.description
;   Residual factor R~cullis~ for centric reflections in this
    derivative.

                sum| |Fph~obs~ +/- Fp~obs~| - Fh~calc~ |
    R~cullis~ = ----------------------------------------
                        sum|Fph~obs~ - Fp~obs~|

    Fp~obs~  = the observed structure factor amplitude of the native
    Fph~obs~ = the observed structure factor amplitude of the
               derivative
    Fh~calc~ = the calculated structure factor amplitude from the
               heavy atom model

    sum is taken over the specified reflections

    Ref: Cullis, A. F., Muirhead, H., Perutz, M. F., Rossmann, M. G. &
         North, A. C. T. (1961). Proc. Roy. Soc. A265, 15-38.
;
    _item.name                  '_phasing_MIR_der.R_cullis_centric'
    _item.category_id           phasing_MIR_der
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der.ebi_Rcullis_centric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR_der.reflns_acentric
    _item_description.description
;   The number of acentric reflections used in phasing for this
    derivative.
;
    _item.name                  '_phasing_MIR_der.reflns_acentric'
    _item.category_id           phasing_MIR_der
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der.ebi_reflns_acentric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .  0
        0  0
    _item_type.code             int
    save_

save__phasing_MIR_der.reflns_anomalous
    _item_description.description
;   The number of anomalous reflections used in phasing for this
    derivative.
;
    _item.name                  '_phasing_MIR_der.reflns_anomalous'
    _item.category_id           phasing_MIR_der
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der.ebi_reflns_anomalous'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .  0
        0  0
    _item_type.code             int
    save_

save__phasing_MIR_der.reflns_centric
    _item_description.description
;   The number of centric reflections used in phasing for this
    derivative.
;
    _item.name                  '_phasing_MIR_der.reflns_centric'
    _item.category_id           phasing_MIR_der
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der.ebi_reflns_centric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .  0
        0  0
    _item_type.code             int
    save_

##########################
## PHASING_MIR_DER_SITE ##
##########################

#
############################################################
## proposed additional data items in an existing category ##
############################################################
#
#########################################
## Submitted by Kim Henrick            ##
## Content review by Paula Fitzgerald  ##
## Editorial review by HB, JW and PMDF ##
#########################################

save__phasing_MIR_der_site.occupancy_anom
    _item_description.description
;   The relative anomalous occupancy of the atom type present at
    this heavy-atom site in a given derivative. This atom occupancy
    will probably be on an arbitrary scale.
;
    _item.name                  '_phasing_MIR_der_site.occupancy_anom'
    _item.category_id           phasing_MIR_der_site
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der_site.ebi_occupancy_anom'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    _item_related.related_name  '_phasing_MIR_der_site.occupancy_anom_esd'
    _item_related.function_code associated_esd
    _item_type.code             float
    _item_type_conditions.code  esd
    save_

save__phasing_MIR_der_site.occupancy_anom_esd
    _item_description.description
;   The standard uncertainty (e.s.d.) of
    _phasing_MIR_der_site.occupancy_anom.
;
    _item.name                  '_phasing_MIR_der_site.occupancy_anom_esd'
    _item.category_id           phasing_MIR_der_site
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der_site.ebi_occupancy_anom_esd'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    _item_default.value         0.0
    _item_related.related_name  '_phasing_MIR_der_site.occupancy_anom'
    _item_related.function_code associated_value
    _item_type.code             float
    save_

save__phasing_MIR_der_site.occupancy_iso
    _item_description.description
;   The relative real isotropic occupancy of the atom type present
    at this heavy-atom site in a given derivative. This atom
    occupancy will probably be on an arbitrary scale.
;
    _item.name                  '_phasing_MIR_der_site.occupancy_iso'
    _item.category_id           phasing_MIR_der_site
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der_site.ebi_occupancy_iso'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    _item_related.related_name  '_phasing_MIR_der_site.occupancy_iso_esd'
    _item_related.function_code associated_esd
    _item_type.code             float
    _item_type_conditions.code  esd
    save_

save__phasing_MIR_der_site.occupancy_iso_esd
    _item_description.description
;   The standard uncertainty (e.s.d.) of
    _phasing_MIR_der_site.occupancy_iso.
;
    _item.name                  '_phasing_MIR_der_site.occupancy_iso_esd'
    _item.category_id           phasing_MIR_der_site
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_der_site.ebi_occupancy_iso_esd'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    _item_default.value         0.0
    _item_related.related_name  '_phasing_MIR_der_site.occupancy_iso'
    _item_related.function_code associated_value
    _item_type.code             float
    save_

#######################
## PHASING_MIR_SHELL ##
#######################

#
############################################################
## proposed additional data items in an existing category ##
############################################################
#
#########################################
## Submitted by Kim Henrick            ##
## Content review by Paula Fitzgerald  ##
## Editorial review by HB, JW and PMDF ##
#########################################

save__phasing_MIR_shell.fom_acentric
    _item_description.description
;   The mean value of the figure of merit m for acentric
    reflections in this shell.

        int P~alpha~ exp(i*alpha) dalpha
    m = --------------------------------
              int P~alpha~ dalpha

    P~a~ = the probability that phase angle a is correct

    int is taken over the range alpha = 0 to 2 pi.
;
    _item.name                  '_phasing_MIR_shell.fom_acentric'
    _item.category_id           phasing_MIR_shell
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_shell.ebi_fom_acentric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR_shell.fom_centric
    _item_description.description
;   The mean value of the figure of merit m for centric
    reflections in this shell.

        int P~alpha~ exp(i*alpha) dalpha
    m = --------------------------------
              int P~alpha~ dalpha

    P~a~ = the probability that phase angle a is correct

    int is taken over the range alpha = 0 to 2 pi.
;
    _item.name                  '_phasing_MIR_shell.fom_centric'
    _item.category_id           phasing_MIR_shell
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_shell.ebi_fom_centric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .    0.0
        0.0  0.0
    _item_type.code             float
    save_

save__phasing_MIR_shell.reflns_acentric
    _item_description.description
;   The number of acentric reflections in this shell.
;
    _item.name                  '_phasing_MIR_shell.reflns_acentric'
    _item.category_id           phasing_MIR_shell
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_shell.ebi_reflns_acentric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .  0
        0  0
    _item_type.code             int
    save_

save__phasing_MIR_shell.reflns_centric
    _item_description.description
;   The number of centric reflections in this shell.
;
    _item.name                  '_phasing_MIR_shell.reflns_centric'
    _item.category_id           phasing_MIR_shell
    _item.mandatory_code        no
    _item_aliases.alias_name    '_phasing_MIR_shell.ebi_reflns_centric'
    _item_aliases.dictionary    ebi_extensions
    _item_aliases.version       1.0
    loop_
    _item_range.maximum
    _item_range.minimum
        .  0
        0  0
    _item_type.code             int
    save_

###################
## REFLN_SYS_ABS ##
###################

#
###########################
## proposed new category ##
###########################
#
#########################################
## Submitted by Kim Henrick            ##
## Content review by Paula Fitzgerald  ##
## Editorial review by HB, JW and PMDF ##
#########################################

save_REFLN_SYS_ABS
    _category.description
;   Data items in the REFLN_SYS_ABS category record details about
    the reflection data that should be systematically absent,
    given the designated space group.
; _category.id refln_sys_abs _category.mandatory_code no loop_ _category_key.name '_refln_sys_abs.index_h' '_refln_sys_abs.index_k' '_refln_sys_abs.index_l' loop_ _category_group.id 'inclusive_group' 'refln_group' loop_ _category_examples.detail _category_examples.case # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ; Example 1 - completely arbitrary ; ; loop_ _refln_sys_abs.index_h _refln_sys_abs.index_k _refln_sys_abs.index_l _refln_sys_abs.I _refln_sys_abs.sigmaI _refln_sys_abs.I_over_sigmaI 0 3 0 28.32 22.95 1.23 0 5 0 14.11 16.38 0.86 0 7 0 114.81 20.22 5.67 0 9 0 32.99 24.51 1.35 ; # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - save_ save__refln_sys_abs.I _item_description.description ; The measured value of the intensity in arbitrary units. ; _item.name '_refln_sys_abs.I' _item.category_id refln_sys_abs _item.mandatory_code no _item_aliases.alias_name '_ebi_refln_sys_abs.I' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_related.related_name _item_related.function_code '_refln_sys_abs.sigmaI' associated_esd _item_type.code float _item_type_conditions.code esd _item_units.code arbitrary save_ save__refln_sys_abs.I_over_sigmaI _item_description.description ; The ratio of _refln_sys_abs.I to _refln_sys_abs.sigmaI. Used to evaluate whether a reflection that should be systematically absent according to the designated space group is in fact absent. ; _item.name '_refln_sys_abs.I_over_sigmaI' _item.category_id refln_sys_abs _item_aliases.alias_name '_ebi_refln_sys_abs.I_over_sigma' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item.mandatory_code no _item_type.code float save_ save__refln_sys_abs.index_h _item_description.description ; Miller index h of the reflection. The values of the Miller indices in the REFLN_SYS_ABS category must correspond to the cell defined by cell lengths and cell angles in the CELL category. 
; _item.name '_refln_sys_abs.index_h' _item.category_id refln_sys_abs _item.mandatory_code yes _item_aliases.alias_name '_ebi_refln_sys_abs.h' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_dependent.dependent_name '_refln_sys_abs.index_k' '_refln_sys_abs.index_l' _item_sub_category.id miller_index _item_type.code int save_ save__refln_sys_abs.index_k _item_description.description ; Miller index k of the reflection. The values of the Miller indices in the REFLN_SYS_ABS category must correspond to the cell defined by cell lengths and cell angles in the CELL category. ; _item.name '_refln_sys_abs.index_k' _item.category_id refln_sys_abs _item.mandatory_code yes _item_aliases.alias_name '_ebi_refln_sys_abs.k' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_dependent.dependent_name '_refln_sys_abs.index_h' '_refln_sys_abs.index_l' _item_sub_category.id miller_index _item_type.code int save_ save__refln_sys_abs.index_l _item_description.description ; Miller index l of the reflection. The values of the Miller indices in the REFLN_SYS_ABS category must correspond to the cell defined by cell lengths and cell angles in the CELL category. ; _item.name '_refln_sys_abs.index_l' _item.category_id refln_sys_abs _item.mandatory_code yes _item_aliases.alias_name '_ebi_refln_sys_abs.l' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_dependent.dependent_name '_refln_sys_abs.index_h' '_refln_sys_abs.index_k' _item_sub_category.id miller_index _item_type.code int save_ save__refln_sys_abs.sigmaI _item_description.description ; The standard uncertainty (e.s.d.) of _refln_sys_abs.I, in arbitrary units. 
; _item.name '_refln_sys_abs.sigmaI' _item.category_id refln_sys_abs _item.mandatory_code no _item_aliases.alias_name '_ebi_refln_sys_abs.sigmaI' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_related.related_name _item_related.function_code '_refln_sys_abs.I' associated_value _item_type.code float _item_units.code arbitrary save_ ############ ## REFINE ## ############ # ############################################################ ## proposed additional data items in an existing category ## ############################################################ # ################################################### ## Submitted by Kim Henrick ## ## Content review by Dale Tronrud - Mar 19, 1988 ## ## Editorial review by HB, JW and PMDF ## ################################################### save__refine.correlation_coeff_Fo_to_Fc _item_description.description ; The correlation coefficient between Fobs and Fcalc for reflections included in refinement. The correlation coefficient is scale-independent and gives an idea of the quality of the refined model. sum~i~(Fo~i~ Fc~i~ -<Fo> <Fc>) R_corr = ------------------------------------------------ SQRT{sum~i~(Fo~i~)^2-<Fo>^2} SQRT{sum~i~(Fc~i~)^2-<Fc>^2} Fo = observed structure factors Fc = calculated structure factors <> = denotes average value of data Summation is over reflections included in refinement ; _item.name '_refine.correlation_coeff_Fo_to_Fc' _item.category_id refine _item.mandatory_code no _item_aliases.alias_name '_refine.ebi_Correlation_coeff_Fo_to_Fc' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code float save_ save__refine.correlation_coeff_Fo_to_Fc_free _item_description.description ; The correlation coefficient between Fobs and Fcalc for reflections not included in refinement (free reflections). The correlation coefficient is scale-independent and gives an idea of the quality of the refined model. sum~i~(Fo~i~ Fc~i~ -<Fo> <Fc>) R_corr = ------------------------------------------------ 
SQRT{sum~i~(Fo~i~)^2-<Fo>^2} SQRT{sum~i~(Fc~i~)^2-<Fc>^2} Fo = observed structure factors Fc = calculated structure factors <> = denotes average value of data Summation is over reflections not included (free reflections) in refinement ; _item.name '_refine.correlation_coeff_Fo_to_Fc_free' _item.category_id refine _item.mandatory_code no _item_aliases.alias_name '_refine.ebi_Correlation_coeff_Fo_to_Fc_free' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code float save_ save__refine.overall_ESU_B _item_description.description ; Overall estimated standard uncertainties of thermal parameters based on Maximum likelihood residual. Overall ESU gives an idea about uncertainties of B-values of averagely defined atoms (atoms with B-values equal to average B-value) N_a (sigma_B)^2 = 8 ---------------------------------------------- sum~i~ {(1/Sigma - (E_o)^2 (1-m^2)(SUM_AS)s^4} SUM_AS = (sigma_A)^2/Sigma^2 N_a = number of atoms Sigma = (sigma_{E;exp})^2 + epsilon (1-(sigma_A)^2) E_o = normalized structure factors sigma_{E;exp} = experimental uncertainties of normalized structure factors sigma_A = <cos 2 pi s delta_x> SQRT(Sigma_P/Sigma_N) estimated using maximum likelihood Sigma_P = sum_{atoms in model} f^2 Sigma_N = sum_{atoms in crystal} f^2 f = form factor of atoms delta_x = expected error m = figure of merit of phases of reflections included in summation s = reciprocal space vector epsilon = multiplicity of diffracting plane summation is over all reflections included in refinement Reference for sigma_A estimation: "Refinement of Macromolecular Structures by the Maximum-Likelihood Method:" G.N. Murshudov, A.A.Vagin and E.J.Dodson,(1997) Acta Crystallogr. D53, 240-255 Reference for ESU_ML estimation: "Simplified error estimation a la Cruickshank in macromolecular crystallography", Murshudov G.N. & Dodson E.J. in the "CCP4 Newsletter on protein crystallography" Number 33 ed. 
M.Winn ; _item.name '_refine.overall_ESU_B' _item.category_id refine _item.mandatory_code no _item_aliases.alias_name '_refine.ebi_Overall_ESU_B' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code float save_ save__refine.overall_ESU_ML _item_description.description ; Overall estimated standard uncertainties of positional parameters based on Maximum likelihood residual. Overall ESU gives an idea about uncertainties in the position of averagely defined atoms (atoms with B-values equal to average B-value) 3 N_a (sigma_X)^2 = ----------------------------------------------------- 8 pi^2 sum~i~ {(1/Sigma - (E_o)^2 (1-m^2)(SUM_AS)s^2} SUM_AS = (sigma_A)^2/Sigma^2 N_a = number of atoms Sigma = (sigma_{E;exp})^2 + epsilon (1-(sigma_A)^2) E_o = normalized structure factors sigma_{E;exp} = experimental uncertainties of normalized structure factors sigma_A = <cos 2 pi s delta_x> SQRT(Sigma_P/Sigma_N) estimated using maximum likelihood Sigma_P = sum_{atoms in model} f^2 Sigma_N = sum_{atoms in crystal} f^2 f = form factor of atoms delta_x = expected error m = figure of merit of phases of reflections included in summation s = reciprocal space vector epsilon = multiplicity of diffracting plane summation is over all reflections included in refinement Reference for sigma_A estimation: "Refinement of Macromolecular Structures by the Maximum-Likelihood Method:" G.N. Murshudov, A.A.Vagin and E.J.Dodson,(1997) Acta Crystallogr. D53, 240-255 Reference for ESU_ML estimation: "Simplified error estimation a la Cruickshank in macromolecular crystallography", Murshudov G.N. & Dodson E.J. in the "CCP4 Newsletter on protein crystallography" Number 33 ed. 
M.Winn ; _item.name '_refine.overall_ESU_ML' _item.category_id refine _item.mandatory_code no _item_aliases.alias_name '_refine.ebi_Overall_ESU_ML' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code float save_ save__refine.overall_ESU_R_Cruickshanks_DPI _item_description.description ; Overall estimated standard uncertainties of thermal parameters based on crystallographic R-value Overall ESU gives an idea about uncertainties of B-values of averagely defined atoms (atoms with B-values equal to average B-value) N_a (sigma_B)^2 = 0.65 --------- (R_value)^2 (D_min)^2 C^(-2/3) (N_o-N_p) N_a = number of atoms N_o = number of reflections included in refinement N_p = number of refined parameters R_value = conventional crystallographic R-value D_min = maximum resolution C = completeness of data Reference for ESU_ML estimation: Cruickshank, in the "Refinement of macromolecular structures" CCP4 study weekend Simplified error estimation a la Cruickshank in macromolecular crystallography, Murshudov G.N. & Dodson E.J. in the "CCP4 Newsletter on protein crystallography" Number 33 ed. 
M.Winn ; _item.name '_refine.overall_ESU_R_Cruickshanks_DPI' _item.category_id refine _item.mandatory_code no _item_aliases.alias_name '_refine.ebi_Overall_ESU_R_Cruickshanks_DPI' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code float save_ save__refine.overall_ESU_Rfree _item_description.description ; Overall estimated standard uncertainties of thermal parameters based on free R-value Overall ESU gives an idea about uncertainties of B-values of averagely defined atoms (atoms with B-values equal to average B-value) N_a (sigma_B)^2 = 0.65 ----- (R_free)^2 (D_min)^2 C^(-2/3) N_o N_a = number of atoms N_o = number of reflections included in refinement N_p = number of refined parameters R_free = conventional free crystallographic R-value calculated using reflections not included in refinement D_min = maximum resolution C = completeness of data Reference for ESU_ML estimation: Cruickshank, in the "Refinement of macromolecular structures" CCP4 study weekend Simplified error estimation a la Cruickshank in macromolecular crystallography, Murshudov G.N. & Dodson E.J. in the "CCP4 Newsletter on protein crystallography" Number 33 ed. M.Winn ; _item.name '_refine.overall_ESU_Rfree' _item.category_id refine _item.mandatory_code no _item_aliases.alias_name '_refine.ebi_Overall_ESU_Rfree' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code float save_ save__refine.overall_FOM_free_Rset _item_description.description ; Average figure of merit of phases of reflections not included in refinement This value is derived from likelihood function. 
fom = I_1(X)/I_0(X) I_0, I_1 = zero- and first-order modified Bessel functions of the first kind X = sigma_A |E_o| |E_c|/SIGMA E_o, E_c = normalized observed and calculated structure factors sigma_A = <cos 2 pi s delta_x> SQRT(Sigma_P/Sigma_N) estimated using maximum likelihood Sigma_P = sum_{atoms in model} f^2 Sigma_N = sum_{atoms in crystal} f^2 f = form factor of atoms delta_x = expected error SIGMA = (sigma_{E;exp})^2 + epsilon (1-(sigma_A)^2) sigma_{E;exp} = uncertainties of normalized observed structure factors epsilon = multiplicity of diffracting plane Reference for sigma_A estimation: "Refinement of Macromolecular Structures by the Maximum-Likelihood Method:" G.N. Murshudov, A.A.Vagin and E.J.Dodson,(1997) Acta Crystallogr. D53, 240-255 ; _item.name '_refine.overall_FOM_free_Rset' _item.category_id refine _item.mandatory_code no _item_aliases.alias_name '_refine.ebi_overall_FOM_free_Rset' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code float save_ save__refine.overall_FOM_work_Rset _item_description.description ; Average figure of merit of phases of reflections included in refinement. This value is derived from the likelihood function. fom = I_1(X)/I_0(X) I_0, I_1 = zero- and first-order modified Bessel functions of the first kind X = sigma_A |E_o| |E_c|/SIGMA E_o, E_c = normalized observed and calculated structure factors sigma_A = <cos 2 pi s delta_x> SQRT(Sigma_P/Sigma_N) estimated using maximum likelihood Sigma_P = sum_{atoms in model} f^2 Sigma_N = sum_{atoms in crystal} f^2 f = form factor of atoms delta_x = expected error SIGMA = (sigma_{E;exp})^2 + epsilon (1-(sigma_A)^2) sigma_{E;exp} = uncertainties of normalized observed structure factors epsilon = multiplicity of diffracting plane Reference for sigma_A estimation: "Refinement of Macromolecular Structures by the Maximum-Likelihood Method:" G.N. Murshudov, A.A.Vagin and E.J.Dodson,(1997) Acta Crystallogr. 
D53, 240-255 ; _item.name '_refine.overall_FOM_work_Rset' _item.category_id refine _item.mandatory_code no _item_aliases.alias_name '_refine.ebi_overall_FOM_work_Rset' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code float save_ #################### ## REFINE_ANALYZE ## #################### # ############################################################ ## proposed additional data items in an existing category ## ############################################################ # ################################################### ## Submitted by Kim Henrick ## ## Content review by Dale Tronrud - Jan 13, 1998 ## ## Editorial review by HB, JW and PMDF ## ################################################### save__refine_analyze.RG_d_res_high _item_description.description ; The value of the high-resolution cutoff, in Angstroms, used in the calculation of the Hamilton generalized R factor (RG) stored in _refine_analyze.RG_work and _refine_analyze.RG_free. ; _item.name '_refine_analyze.RG_d_res_high' _item.category_id refine_analyze _item.mandatory_code no _item_aliases.alias_name '_refine_analyze.ebi_RG_d_res_high' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_range.maximum _item_range.minimum . 0.0 0.0 0.0 _item_type.code float _item_units.code angstroms save_ save__refine_analyze.RG_d_res_low _item_description.description ; The value of the low-resolution cutoff, in Angstroms, used in the calculation of the Hamilton generalized R factor (RG) stored in _refine_analyze.RG_work and _refine_analyze.RG_free. ; _item.name '_refine_analyze.RG_d_res_low' _item.category_id refine_analyze _item.mandatory_code no _item_aliases.alias_name '_refine_analyze.ebi_RG_d_res_low' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_range.maximum _item_range.minimum . 
0.0 0.0 0.0 _item_type.code float _item_units.code angstroms save_ save__refine_analyze.RG_free _item_description.description ; The Hamilton generalized R factor (see W.C. Hamilton (1965) Acta Cryst. 18, 502-510) for all reflections in the free R set (those excluded from the refinement) that satisfy the resolution limits established by _refine_analyze.RG_d_res_high and _refine_analyze.RG_d_res_low. Rg = Sqrt[ sum_i sum_j w_{i,j}(|Fobs|_i - G|Fcalc|_i)(|Fobs|_j - G|Fcalc|_j) / sum_i sum_j w_{i,j} |Fobs|_i |Fobs|_j ] where |Fobs| = the observed structure factor amplitudes |Fcalc| = the calculated structure factor amplitudes G = the scale factor which puts |Fcalc| on the same scale as |Fobs| w_{i,j} = the weight for the combination of the reflections i and j. sum_i and sum_j are taken over the specified reflections. When the covariance of the amplitude of reflection i and reflection j is zero (i.e. the reflections are independent) w_{i,i} can be redefined as w_i and the nested sums collapsed into one. Rg = Sqrt[ sum_i w_i(|Fobs|_i - G|Fcalc|_i)^2 / sum_i w_i |Fobs|_i^2 ] ; _item.name '_refine_analyze.RG_free' _item.category_id refine_analyze _item.mandatory_code no _item_aliases.alias_name '_refine_analyze.ebi_RG_free' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_range.maximum _item_range.minimum . 0.0 0.0 0.0 _item_type.code float save_ save__refine_analyze.RG_work _item_description.description ; The Hamilton generalized R factor (see W.C. Hamilton (1965) Acta Cryst. 18, 502-510) for all reflections that satisfy the resolution limits established by _refine_analyze.RG_d_res_high and _refine_analyze.RG_d_res_low and for those reflections included in the working set when a free R set of reflections is omitted from the refinement. 
Rg = Sqrt[ sum_i sum_j w_{i,j}(|Fobs|_i - G|Fcalc|_i)(|Fobs|_j - G|Fcalc|_j) / sum_i sum_j w_{i,j} |Fobs|_i |Fobs|_j ] where |Fobs| = the observed structure factor amplitudes |Fcalc| = the calculated structure factor amplitudes G = the scale factor which puts |Fcalc| on the same scale as |Fobs| w_{i,j} = the weight for the combination of the reflections i and j. sum_i and sum_j are taken over the specified reflections. When the covariance of the amplitude of reflection i and reflection j is zero (i.e. the reflections are independent) w_{i,i} can be redefined as w_i and the nested sums collapsed into one. Rg = Sqrt[ sum_i w_i(|Fobs|_i - G|Fcalc|_i)^2 / sum_i w_i |Fobs|_i^2 ] ; _item.name '_refine_analyze.RG_work' _item.category_id refine_analyze _item.mandatory_code no _item_aliases.alias_name '_refine_analyze.ebi_RG_work' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_range.maximum _item_range.minimum . 0.0 0.0 0.0 _item_type.code float save_ save__refine_analyze.RG_work_free_ratio _item_description.description ; The observed RGfree/RGwork ratio. The expected RG ratio is the value that should be achievable at the end of a structure refinement when only random uncorrelated errors exist in data and model, provided that the observations are properly weighted. When compared to the observed RG ratio it may indicate that a structure has not reached convergence or that a model has been over-refined with no corresponding improvement in the model. See: I.J. Tickle, R.A. Laskowski and D.S. Moss, (1998) "Rfree and the Rfree ratio: derivation of expected values of cross-validation residuals used in macromolecular least-squares refinement" Acta Cryst. 
D, in press. In an unrestrained refinement the ratio of RGfree/RGwork with only random uncorrelated errors at convergence would depend only on the number of reflections and the number of parameters as: sqrt[(f + m) / (f - m)] where f = number of included structure amplitudes and target distances, and m = number of parameters being refined. In the restrained case, RGfree is calculated from a random selection of residuals including both structure amplitudes and restraints. When restraints are included in refinement the RG ratio requires a term for the contribution to the minimized residual at convergence, Drest, due to those restraints: Drest = r - sum (w_i . (a_i)^t . (H)^-1 a_i) where r is the number of geometrical, temperature factor and other restraints H is the (m,m) normal matrix given by A^t.W.A W is the (n,n) symmetric weight matrix of the included observations A is the least-squares design matrix of derivatives of order (n,m) a_i is the ith row of A Then the expected RGratio becomes sqrt[(f + (m - r + Drest)) / (f - (m - r + Drest))] The expected RGfree/RGwork is not yet included in the mmCIF dictionary. ; _item.name '_refine_analyze.RG_work_free_ratio' _item.category_id refine_analyze _item.mandatory_code no _item_aliases.alias_name '_refine_analyze.ebi_RG_work_free_ratio' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_range.maximum _item_range.minimum . 
0.0 0.0 0.0 _item_type.code float save_ ############################### ## REFINE_FUNCT_MINIMIZED ## ############################### # ########################### ## proposed new category ## ########################### # ################################################### ## Submitted by Kim Henrick ## ## Content review by Dale Tronrud - Jan 13, 1998 ## ## Editorial review by HB, JW and PMDF ## ################################################### save_REFINE_FUNCT_MINIMIZED _category.description ; Data items in the REFINE_FUNCT_MINIMIZED category record details about the individual terms of the function minimized during refinement. ; _category.id refine_funct_minimized _category.mandatory_code no _category_key.name '_refine_funct_minimized.type' loop_ _category_group.id 'inclusive_group' 'refine_group' loop_ _category_examples.detail _category_examples.case # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ; Example 1 - based on RESTRAIN refinement for the CCP4 test data set toxd. ; ; loop_ _refine_funct_minimized.type _refine_funct_minimized.numterms _refine_funct_minimized.residual 'sum(W*Delta(Amplitude)^2' 3009 1621.3 'sum(W*Delta(Plane+Rigid)^2' 85 56.68 'sum(W*Delta(Distance)^2' 1219 163.59 'sum(W*Delta(U-tempfactors)^2' 1192 69.338 ; # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - save_ save__refine_funct_minimized.numterms _item_description.description ; The number of observations in this term. For example, if the term is the residual of the X-ray data, this item would contain the number of reflections used in refinement. ; _item.name '_refine_funct_minimized.numterms' _item.category_id refine_funct_minimized _item.mandatory_code no _item_aliases.alias_name '_ebi_refine_funct_minimized.NumTerms' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_range.maximum _item_range.minimum . 
0 0 0 _item_type.code int save_ save__refine_funct_minimized.residual _item_description.description ; The residual for this term of function which was minimized in refinement. ; _item.name '_refine_funct_minimized.residual' _item.category_id refine_funct_minimized _item.mandatory_code no _item_aliases.alias_name '_ebi_refine_funct_minimized.Residual' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 loop_ _item_range.maximum _item_range.minimum . 0.0 0.0 0.0 _item_type.code float save_ save__refine_funct_minimized.type _item_description.description ; The type of the function being minimized. ; _item.name '_refine_funct_minimized.type' _item.category_id refine_funct_minimized _item.mandatory_code yes _item_aliases.alias_name '_ebi_refine_funct_minimized.type' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code line save_ save__refine_funct_minimized.weight _item_description.description ; The weight applied to this term of the function which was minimized in refinement. ; _item.name '_refine_funct_minimized.weight' _item.category_id refine_funct_minimized _item.mandatory_code no _item_aliases.alias_name '_ebi_refine_funct_minimized.weight' _item_aliases.dictionary ebi_extensions _item_aliases.version 1.0 _item_type.code float save_ ##################### ## REFINE_LS_RESTR ## ##################### # ################################################### ## proposed elaboration of an existing data item ## ################################################### # #################################################### ## Submitted by Kim Henrick ## ## Content review by John Westbrook ## ## Kim Henrick approval of changes - Jan 22, 1998 ## ## Editorial review by HB, JW and PMDF ## #################################################### save__refine_ls_restr.type _item_description.description ; The type of the parameter being restrained. 
An explicit set of data values are provided for programs Protin/ Prolsq (beginning with p_) and X-plor (beginning with x_). As computer programs will evolve, these data values are given as examples, and not as an enumeration list. Computer programs converting a data block to a refinement table will expect the exact form of the data values given here to be used. ; loop_ _item.name _item.category_id _item.mandatory_code '_refine_ls_restr.type' refine_ls_restr yes '_refine_ls_restr_type.type' refine_ls_restr_type yes loop_ _item_linked.child_name _item_linked.parent_name '_refine_ls_restr_type.type' '_refine_ls_restr.type' _item_type.code line loop_ _item_examples.case _item_examples.detail 'p_bond_d' 'bond distance' 'p_angle_d' 'bond angle expressed as a distance' 'p_planar_d' 'planar 1,4 distance' 'p_xhbond_d' 'x-h bond distance' 'p_xhangle_d' 'x-h bond angle expressed as a distance' 'p_hydrog_d' 'hydrogen distance' 'p_special_d' 'special distance' 'p_planar' 'planes' 'p_chiral' 'chiral centers' 'p_singtor_nbd' 'single-torsion non-bonded contact' 'p_multtor_nbd' 'multiple-torsion non-bonded contact' 'p_xyhbond_nbd' 'possible (x...y) hydrogen-bond' 'p_xhyhbond_nbd' 'possible (x-h...y) hydrogen-bond' 'p_special_tor' 'special torsion angle' 'p_planar_tor' 'planar torsion angle' 'p_staggered_tor' 'staggered torsion angle' 'p_orthonormal_tor' 'orthonormal torsion angle' 'p_mcbond_it' 'main-chain bond isotropic thermal factor' 'p_mcangle_it' 'main-chain angle isotropic thermal factor' 'p_scbond_it' 'side-chain bond isotropic thermal factor' 'p_scangle_it' 'side-chain angle isotropic thermal factor' 'p_xhbond_it' 'x-h bond isotropic thermal factor' 'p_xhangle_it' 'x-h angle isotropic thermal factor' 'p_special_it' 'special isotropic thermal factor' 'RESTRAIN_Distances < 2.12' ; For the program RESTRAIN, the root-mean-square deviation of the difference between the values calculated from the structures used to compile the restraints dictionary parameters and the dictionary 
values themselves in the distance range less than 2.12 Angstroms. ; 'RESTRAIN_Distances 2.12 < D < 2.625' ; For the program RESTRAIN, the root-mean-square deviation of the difference between the values calculated from the structures used to compile the restraints dictionary parameters and the dictionary values themselves in the distance range 2.12 - 2.625 Angstroms. ; 'RESTRAIN_Distances > 2.625' ; For the program RESTRAIN, the root-mean-square deviation of the difference between the values calculated from the structures used to compile the restraints dictionary parameters and the dictionary values themselves in the distance range greater than 2.625 Angstroms. ; 'RESTRAIN_Peptide Planes' ; For the program RESTRAIN, the root-mean-square deviation of the difference between the values calculated from the structures used to compile the restraints dictionary parameters and the dictionary values themselves for peptide planes. ; 'RESTRAIN_Ring and other planes' ; For the program RESTRAIN, the root-mean-square deviation of the difference between the values calculated from the structures used to compile the restraints dictionary parameters and the dictionary values themselves for rings and planes other than peptide planes. ; 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 1.2-1.4' . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 1.4-1.6' . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 1.8-2.0' . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 2.0-2.2' . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 2.2-2.4' . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist >2.4' . 
save_ ########################### ## REFINE_LS_RESTR_TYPE ## ########################### # ########################### ## proposed new category ## ########################### # #################################################### ## Submitted by Kim Henrick ## ## Content review by John Westbrook ## ## Kim Henrick approval of changes - Jan 22, 1998 ## ## Editorial review by HB, JW and PMDF ## #################################################### save_REFINE_LS_RESTR_TYPE _category.description ; Data items in the REFINE_LS_RESTR_TYPE category record details about the restraint types used in least-squares refinement. ; _category.id refine_ls_restr_type _category.mandatory_code no _category_key.name '_refine_ls_restr_type.type' loop_ _category_group.id 'inclusive_group' 'refine_group' loop_ _category_examples.detail _category_examples.case # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ; Example 1 - based on RESTRAIN refinement for the CCP4 test data set toxd. ; ; loop_ _refine_ls_restr.type _refine_ls_restr.number _refine_ls_restr.dev_ideal _refine_ls_restr.dev_ideal_target 'RESTRAIN_Distances < 2.12' 509 0.005 0.022 'RESTRAIN_Distances 2.12 < D < 2.625' 671 0.016 0.037 'RESTRAIN_Distances > 2.625' 39 0.034 0.043 'RESTRAIN_Peptide Planes' 59 0.002 0.010 'RESTRAIN_Ring and other planes' 26 0.014 0.010 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 1.2-1.4' 212 0.106 . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 1.4-1.6' 288 0.101 . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 1.8-2.0' 6 0.077 . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 2.0-2.2' 10 0.114 . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 2.2-2.4' 215 0.119 . 'RESTRAIN_r.m.s. diffs for Uiso atoms at dist >2.4' 461 0.106 . loop_ _refine_ls_restr_type.type _refine_ls_restr_type.distance_cutoff_low _refine_ls_restr_type.distance_cutoff_high _refine_ls_restr_type.U_sigma_wghts 'RESTRAIN_Distances < 2.12' . 2.12 . 'RESTRAIN_Distances 2.12 < D < 2.625' 2.12 2.625 . 
    'RESTRAIN_Distances > 2.625'                             2.625 .      .
    'RESTRAIN_Peptide Planes'                                .     .      .
    'RESTRAIN_Ring and other planes'                         .     .      .
    'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 1.2-1.4'   1.2   1.4    1.800
    'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 1.4-1.6'   1.4   1.6    1.800
    'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 1.8-2.0'   1.8   2.0    1.800
    'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 2.0-2.2'   2.0   2.2    1.800
    'RESTRAIN_r.m.s. diffs for Uiso atoms at dist 2.2-2.4'   2.2   2.4    1.800
    'RESTRAIN_r.m.s. diffs for Uiso atoms at dist >2.4'      2.4   .      1.800
;
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     save_

save__refine_ls_restr_type.distance_cutoff_high
    _item_description.description
;              Upper limit of the distance range applied to the current
               restraint type.
;
    _item.name                  '_refine_ls_restr_type.distance_cutoff_high'
    _item.category_id             refine_ls_restr_type
    _item.mandatory_code          no
     loop_
    _item_range.maximum
    _item_range.minimum            .    0.0
                                  0.0   0.0
    _item_type.code               float
    _item_units.code              angstroms
     save_

save__refine_ls_restr_type.distance_cutoff_low
    _item_description.description
;              Lower limit of the distance range applied to the current
               restraint type.
;
    _item.name                  '_refine_ls_restr_type.distance_cutoff_low'
    _item.category_id             refine_ls_restr_type
    _item.mandatory_code          no
     loop_
    _item_range.maximum
    _item_range.minimum            .    0.0
                                  0.0   0.0
    _item_type.code               float
    _item_units.code              angstroms
     save_

save__refine_ls_restr_type.type
    _item_description.description
;              This data item is a pointer to _refine_ls_restr.type in the
               REFINE_LS_RESTR category.
;
    _item.name                  '_refine_ls_restr_type.type'
    _item.category_id             refine_ls_restr_type
    _item.mandatory_code          yes
    _item_type.code               line
     save_

save__refine_ls_restr_type.U_sigma_wghts
    _item_description.description
;              refine_ls_restr.ebi_U_sigma_wghts is a coefficient used in the
               calculation of the weight for thermal parameter restraints in
               the program RESTRAIN (see Driessen H, Haneef M I J, Harris G W,
               Howlin B, Khan G and Moss D S (1989) J Appl Cryst, 22, 510-516;
               and Howlin B, Butler S A, Moss D S, Harris G W and
               Driessen H P C (1993) J. Appl. Crystallogr. 26, 622-624.)

               The equation used to calculate the actual weight from this
               coefficient depends upon the value of _refine_ls_restr.type --
               either "r.m.s. diffs for Uiso atoms at distance *" or
               "r.m.s. diffs for Uaniso atoms at distance *".

               A similarity restraint is applied to the thermal parameters of
               any pair of atoms which are subject to an interatomic distance
               restraint (i.e. 1-2 and 1-3 bonded atoms). The thermal parameter
               restraints are categorized by the distance between the two
               affected atoms. The expected r.m.s. differences in thermal
               parameter, either Uiso or Uaniso, are listed for each shell in
               _refine_ls_restr.ebi_rmsdev_dictionary.

               The weight for the restraint on the thermal parameter difference
               between atoms i and j is

                    w_ij = 1/(s(i)^2 + s(j)^2),

               where the calculation of s(i) differs depending upon the value
               of _refine_ls_restr.type. If it is equal to
               "r.m.s. diffs for Uiso atoms at distance *" its calculation is

                    s_iso(i) = WU.U_iso(i)^2.

               When the type is "r.m.s. diffs for Uaniso atoms at distance *"
               its calculation is

                    s_aniso(i) = WU.d^2,

               where d is the interatomic distance and, in both cases, WU is
               the value stored in _refine_ls_restr.U_sigma_wghts.

               The residual of the restraint is also different in the two
               cases; in the isotropic case it is simply the difference between
               the U_iso's; in the anisotropic case it is the difference
               between the components of the anisotropic tensors along the
               line joining the atoms.

               See "r.m.s. diffs for Uiso atoms at distance bins" and
               "r.m.s. diffs for Uaniso atoms at distance bins" definitions
               within _refine_ls_restr.type.
;
    _item.name                  '_refine_ls_restr_type.U_sigma_wghts'
    _item.category_id             refine_ls_restr_type
    _item.mandatory_code          no
    _item_aliases.alias_name    '_refine_ls_restr_type.ebi_U_sigma_wghts'
    _item_aliases.dictionary      ebi_extensions
    _item_aliases.version         1.0
     loop_
    _item_range.maximum
    _item_range.minimum            .    0.0
                                  0.0   0.0
    _item_type.code               float
     save_
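
As an aid to reviewers, the weight calculation described in the
U_sigma_wghts definition can be sketched in a few lines of Python. This is
only an illustration and is not taken from RESTRAIN. The function names are
hypothetical, and the "." in "WU.U_iso(i)^2" and "WU.d^2" is assumed to mean
multiplication of WU by the squared quantity.

```python
# Illustrative sketch only; function names are hypothetical, not from RESTRAIN.
# Assumption: "WU.U_iso(i)^2" means WU * (U_iso(i) ** 2), likewise "WU.d^2".


def s_iso(wu: float, u_iso: float) -> float:
    """s(i) for 'r.m.s. diffs for Uiso atoms at distance *' restraints."""
    return wu * u_iso ** 2


def s_aniso(wu: float, d: float) -> float:
    """s(i) for 'r.m.s. diffs for Uaniso atoms at distance *' restraints.

    d is the interatomic distance between the two restrained atoms.
    """
    return wu * d ** 2


def restraint_weight(s_i: float, s_j: float) -> float:
    """w_ij = 1 / (s(i)^2 + s(j)^2) for the restraint between atoms i and j."""
    return 1.0 / (s_i ** 2 + s_j ** 2)


if __name__ == "__main__":
    wu = 1.800  # coefficient as listed in the example loop above
    # Isotropic case: the two U_iso values here are made-up inputs.
    w_iso = restraint_weight(s_iso(wu, 0.20), s_iso(wu, 0.25))
    # Anisotropic case: both atoms share the same interatomic distance d.
    w_aniso = restraint_weight(s_aniso(wu, 1.5), s_aniso(wu, 1.5))
    print(w_iso, w_aniso)
```

In the anisotropic case s(i) and s(j) come out equal under this reading,
because both depend only on WU and the shared distance d.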