IUCr activities

Final report of the Inter-Union Bioinformatics Group

H. J. C. Berendsen

The Inter-Union Bioinformatics Group (IUBG) is a Joint Initiative of the Int’l Union for Pure and Applied Biophysics (IUPAB), the Int’l Union of Biochemistry and Molecular Biology (IUBMB), the Int’l Union of Crystallography (IUCr), the Int’l Union of Pure and Applied Chemistry (IUPAC) and the Committee on Data for Science and Technology (CODATA). The IUBG seeks overlap with the nomenclature committees of IUCr, IUBMB, IUPAC, and CODATA. The IUBG has received support from the ICSU (Int’l Council for Science) Grants Programme and UNESCO (United Nations Educational, Scientific and Cultural Organization).

Summary statements

The safeguarding of biological data: It is the obligation of the scientists and legislators of all nations to archive and support primary (i.e., fundamental experimental) scientific data, including, but not limited to, nucleotide sequences of biological organisms, amino acid sequences of proteins, three-dimensional structures of biological molecules, as well as other primary data produced by genomics and proteomics studies. These data must be validated, stored, made publicly accessible, and safeguarded for future availability and access. Access must be public and unrestricted and no organization should have a monopoly on these data. These primary scientific data are crucial for the development of science and its applications.

The obligations of data generators: It has always been the practice that those who claim scientific advances by publication of their work should support their claim by making openly available the objective data on which their claim is based. Thus it is the obligation of scientists who generate primary biological data in the course of publicly funded research to preserve these data for present and future reference and unrestricted access. Regardless of whether publication in journals is appropriate, such data must be deposited into the archival databases to guarantee their availability. Primary data producers in the private sector are also urged to conserve and eventually deposit their primary data.

Right to fair use of data: Scientific advances rely on full and open access to data. Primary data that are accessible through the archival databases should not be subjected to any restrictions that would limit fair use of those data. Fair use includes the use for teaching and research purposes.

Standardization issues: There are four different aspects associated with primary data for which standardization should be considered: content, nomenclature, data format, and data exchange protocol. Standardization is an ongoing activity requiring high-level agreement among scientists of various fields in order to ensure understanding and knowledge exchange across borders of scientific disciplines.

Education: Considering the skills required for archiving, validation and dissemination of data, educational institutions should recognize the need for specific education in (bio)molecular informatics.

Recommendations

To international unions and scientific societies:

(a) It is recommended that each Union, on a regular and ongoing basis, identify and publicize a list of key archival databases.

(b) It is recommended that Int’l Unions and other scientific societies actively encourage their membership to deposit primary data in recognized data repositories which provide unrestricted access to these data.

(c) It is recommended that journals of these Unions and societies ensure that these requirements are met before accepting papers for publication.

(d) It is recommended that Int’l Unions, specifically IUPAB, address the general issue of education in the field of (bio)molecular informatics.

To funding agencies:

(a) It is recommended that funding agencies insist that all primary data produced by grants that they fund be deposited in recognized data repositories which provide unrestricted access to these data. Int’l Unions and scientific societies should work with funding agencies to create guidelines for this purpose.

(b) It is recommended that funding agencies actively encourage and adequately support existing and newly funded primary data repositories, including their updating and annotation, to provide the mechanism to preserve in perpetuity the data deposited therein in a form which is fully recoverable by future generations of researchers.

To for-profit organizations: It is recommended that for-profit organizations deposit their data as early as possible in public archival databases.

To publishers and authors: It is recommended that journal publishers make the primary data on which a publication is based available under the same conditions as they make the printed article available, and, if applicable, require that such data be deposited in a recognized key archival database. Authors are encouraged not to publish in journals that do not conform to these rules.

To legislators: Following the recommendations of the ICSU/CODATA Ad Hoc Group on Data and Information, it is recommended that legislators take into account the impact of intellectual property laws on research and education, in order to allow fair use for scientific and educational purposes.

To scientific committees for nomenclature and standardization: It is recommended that Int’l Unions and scientific societies play an active role in the definition of standards in the fields they represent. This should be done through nomenclature and data standardization committees. They should be conversant with both the content and the technologies needed for a full definition of the field, in order to ensure the exchange of data without loss of information.

To educational institutions: It is recommended that bioinformatics curricula include specific education in the creation and curation of databases, as well as in their use. Life sciences curricula should include courses and training in bioinformatics.

Steering Committee: H.J.C. Berendsen (IUPAB), H.M. Berman (IUCr), R. Cammack (IUBMB), C. Cantor (IUPAB), J. Garnier (IUPAB, chairman), A. Lesk (CODATA), A. McNaught (IUPAC), R. Roberts (ICSU), M. Vijayan (IUPAB, IUCr).

The full report can be found at http://md.chem.rug.nl/~berends/IUBG-FinalReport.html.

H.J.C. Berendsen, Secretary