Re: Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains
- To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Fri, 4 Mar 2011 09:20:43 -0500
- In-Reply-To: <AANLkTi=pQoaya+9eyChCzn5HnkGkcOcbZxL=rQEN=jDL@mail.gmail.com>
- References: <AANLkTikfLNd6mQB9hB9haGek_52ceO3GjXrtAR5tbsnj@mail.gmail.com><AANLkTin+DsXM58+gQ=H4vXGyuRS7xcDHcmAKKYMztvDL@mail.gmail.com><AANLkTimzgzLHrAg_pKHv82Qjzsz6ME1NPFsfZ87P2tQ8@mail.gmail.com><AANLkTi=pQoaya+9eyChCzn5HnkGkcOcbZxL=rQEN=jDL@mail.gmail.com>
Dear Colleagues,

I very much agree with the concept of clearly established principles to help guide our discussions. To summarize James's list as amended by Peter, what he has proposed as guiding principles are:

====================================================================
Principles guiding development of CIF syntax
-----------------------------------------------------------------
Preamble: The CIF syntax describes a human-readable, syntactic container for scientific data. CIF syntax aims to be as simple as possible. The domain dictionaries are the primary location of semantic information in the Crystallographic Information Framework.

1. A feature should only be added to CIF syntax if all of the following are satisfied:

(i) implementation or use of equivalent behaviour at dictionary level is either significantly more cumbersome or not possible;
(ii) the feature provides significant new functionality that is widely applicable to most scientific domains
(iii) reliable transfer and archiving of data is not compromised
(iv) there is no simpler way of achieving the desired behaviour
(v) a feature should only be added if it has been shown possible to implement it with "reasonable ease," i.e. provides a benefit worth the cost, and there is a "rough consensus and running code"

2. As long as the requirements in (1) are satisfied, the CIF framework should:

(i) behave in a way that is consistent with common usage
(ii) align with pre-existing standards where those standards provide the required behaviour. CIF1 can be considered a pre-existing standard for CIF2 in this context.

3. Non-technical issues should be dealt with in non-technical arenas.

(End)
====================================================================

Before turning to the lower numbered principles, I would suggest we discuss item 3, because I believe it conflicts with item 1, which discusses features in terms of being "cumbersome," "significant," being "simpler," "desired behavior," "reasonable ease," and "rough consensus," all of which have strong psychological and other human factors components that may be difficult to quantify and make into "technical issues."

In software engineering, my own field, there is a long history of failure in trying to make the systems design process into something purely technical. On purely technical grounds, we would never have accepted C or Python as suitable programming languages for serious work. Everything important from operating systems to dREL would be based on Pascal. That was the technical consensus of the Computer Science world of a few decades ago. Technically, Pascal is a much better language than C or Python. It just happens that real people produce better, more reliable code with C and with Python than with Pascal, because, for reasons that are still not clearly understood, they get less confused and make fewer mistakes when working with those languages.

Precisely because we don't fully understand the non-technical issues in the design of information systems (and, I suspect, in almost all systems), one of the accepted principles of software engineering is to clearly identify all stakeholders, bring them into the discussion and work to achieve their "buy-in".

Therefore I propose that we replace principle 3 with

======================================================================
3. The stakeholders impacted by any change should be clearly identified and the proposed changes should be fully and openly discussed with them in an effort to achieve their buy-in to the change, and the change should not go forward in the absence of such buy-in, absent pressing technical reasons for making the change over such objections.
======================================================================

In principle 2, we reference CIF1. I believe that should be CIF1.1.

Now I would like to turn to principle 1.(ii):

the feature provides significant new functionality that is widely applicable to most scientific domains

This principle would prevent CIF from having features which support any one scientific domain. Under this principle, we never would have had DDL2 and mmCIF, nor imgCIF. I would suggest changing this principle to:

======================================================================
1.(ii). the feature provides significant new functionality for some scientific application domain, and does not interfere with the use of CIF in other scientific application domains.
======================================================================

Finally, let us consider principle 1.(i):

implementation or use of equivalent behaviour at dictionary level is either significantly more cumbersome or not possible;

Depending on how we interpret the non-technical word "cumbersome", this may create the impression that all uses of CIF will require the use of dictionaries.
I would suggest instead:

======================================================================
1.(i): If it is feasible to implement the desired behavior by specification of changes to dictionaries rather than to CIF syntax, that alternative should be seriously considered and balanced against the human-readability of the resulting CIFs without reference to dictionaries;
======================================================================

Thus the revised principles I would suggest would be:

====================================================================
Principles guiding development of CIF syntax
-----------------------------------------------------------------
Preamble: The CIF syntax describes a human-readable, syntactic container for scientific data. CIF syntax aims to be as simple as possible. The domain dictionaries are the primary location of semantic information in the Crystallographic Information Framework.

1. A feature should only be added to CIF syntax if all of the following are satisfied:

(i) If it is feasible to implement the desired behavior by specification of changes to dictionaries rather than to CIF syntax, that alternative should be seriously considered and balanced against the human-readability of the resulting CIFs without reference to dictionaries;
(ii) the feature provides significant new functionality for some scientific application domain, and does not interfere with the use of CIF in other scientific application domains.
(iii) reliable transfer and archiving of data is not compromised
(iv) there is no simpler way of achieving the desired behaviour
(v) a feature should only be added if it has been shown possible to implement it with "reasonable ease," i.e. provides a benefit worth the cost, and there is a "rough consensus and running code"

2. As long as the requirements in (1) are satisfied, the CIF framework should:

(i) behave in a way that is consistent with common usage
(ii) align with pre-existing standards where those standards provide the required behaviour. CIF1.1 can be considered a pre-existing standard for CIF2 in this context.

3. The stakeholders impacted by any change should be clearly identified and the proposed changes should be fully and openly discussed with them in an effort to achieve their buy-in to the change, and the change should not go forward in the absence of such buy-in, absent pressing technical reasons for making the change over such objections.

(End)
====================================================================

Regards,
Herbert

At 10:47 PM +1100 3/4/11, James Hester wrote:
>Thanks Peter for your comments. While you may not be a voting member
>of COMCIFS, you and other COMCIFS members fulfill an important
>advisory role and I would encourage everybody to take the opportunity
>to provide their perspectives.
>
>I assume you have no particular disagreement with the principles that
>you haven't commented on explicitly?
>
>I've added some comments in response to your comments, inserted below:
>
>On Fri, Mar 4, 2011 at 7:25 PM, Peter Murray-Rust <pm286@cam.ac.uk> wrote:
>> I add some comments arising out of my own experience with XML/CML which may
>> be useful. I don't think I am a full member of COMCIFs so feel free to
>> ignore all or any. I comment after significant paragraphs.
>>
>> On Fri, Mar 4, 2011 at 6:03 AM, James Hester <jamesrhester@gmail.com> wrote:
>>>
>>> 1. A feature should only be added to CIF syntax if all of the
>>> following are satisfied:
>>>
>>> (i) implementation or use of equivalent behaviour at dictionary level
>>> is either significantly more cumbersome or not possible;
>>> (ii) the feature provides significant new functionality that is widely
>>> applicable to most scientific domains
>>> (iii) reliable transfer and archiving of data is not compromised
>>> (iv) there is no simpler way of achieving the desired behaviour
>>>
>> I would add:
>>
>> * a feature should only be added if it has been shown possible to implement
>> it with "reasonable ease". "Rough consensus and running code"
>
>I agree that this is a reasonable requirement. I would express it in
>terms of cost/benefit, so something with a significant benefit would
>justify extra effort.
>
>>>
>>> Example 2: Unicode support in CIF2. This is broadly useful, given the
>>> international nature of science and range of symbols used in
>>> scientific papers. It could have been implemented in dictionaries
>>> using ASCII escapes, but this would have been cumbersome to use, so it
>>> satisfies Principle 1. We have adopted Unicode (rather than created
>>> our own international character set) and copied the XML character
>>> ranges (Principle 2)
>>
>> I found the original ASCII escapes difficult/tedious for some code points
>> and would urge full unicode support (with numeric values).
>
>I perhaps wasn't clear that we have already taken this step. The
>current CIF2 draft envisions full Unicode support using UTF-8
>encoding. Some provision has been made for allowing other encodings
>in the future. The point of the example was to show how this decision
>to adopt Unicode was justifiable in terms of these principles.
>
>[rest edited out]
>--
>T +61 (02) 9717 9907
>F +61 (02) 9717 3145
>M +61 (04) 0249 4148
>_______________________________________________
>comcifs mailing list
>comcifs@iucr.org
>http://scripts.iucr.org/mailman/listinfo/comcifs

--
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya@dowling.edu
=====================================================