Multipurpose crystallochemical analysis with the program package TOPOS

Vladislav A. Blatov

Samara State University, Ac.Pavlov Street. 1, Samara 443011, Russia. E-mail: blatov@ssu.samara.ru ; WWW: http://www.topos.ssu.samara.ru/blatov.html ; TOPOS website: http://www.topos.ssu.samara.ru/

Abbreviation list

CIF	Crystallographic Information File
CN	Coordination number
CS	Coordination sequence
CSD	Cambridge Structural Database
DBMS	Database Management System
EPINET	Euclidean Patterns in Non-Euclidean Tilings
ES	Extended Schläfli symbol for circuits
ICSD	Inorganic Crystal Structure Database
MCN	Molecular coordination number
MOF	Metal-organic framework
MPT	Maximal proper tile
PDB	Protein Databank
RCSR	Reticular Chemistry Structure Resource
SBU	Secondary building unit
TTD	TOPOS Topological Database
VDP	Voronoi-Dirichlet polyhedron
VS	Vertex symbol

1. Introduction

At present, the data for more than 400,000 chemical compounds are collected in world-wide crystallographic databases CSD, ICSD, PDB and CrystMet. Processing of such a large amount of information is a great challenge for modern crystal chemistry. Traditional visual analysis of crystal structures becomes insufficient to reveal the common principles of the spatial organization of three-dimensional nets and packings in long series of chemical compounds of various composition and stoichiometry. Rapidly developing interdisciplinary branches of science, such as crystal engineering and supramolecular chemistry, require the development of new computer methods to process and classify the crystallographic information, and to search for general crystallochemical regularities.

When developing the program package TOPOS we pursued two main goals:

· to create a computer system that would enable one to perform comprehensive crystallochemical analysis of any crystal structure irrespective of its chemical nature and complexity;

· to implement new methods for crystallochemical analysis of large amounts of chemical compounds to find the regularities in their structure organization in an automated mode.

TOPOS has been under development since 1989, and several of its versions are still in use. The MS DOS versions 3.0, 3.1 and 3.2 were developed until 2003 and are no longer supported. The current Windows-based TOPOS 4.0 Professional was started in 2001 and is now the main program product in the TOPOS family. Its periodically updated beta version is available free of charge at the TOPOS website: http://www.topos.ssu.samara.ru/. This is the version considered in detail below, where it is called TOPOS for short.

TOPOS is written in the Borland Delphi 7.0 environment and runs under the Windows 95/98/Me/NT/2000/XP operating systems. Its current size is less than 3 MB (without topological databases), so it is easily distributed as a self-extracting zipped file. The system requirements are minimal; in practice TOPOS can run on any IBM PC compatible computer under Windows. The main file topos40.exe is an integrated interactive multiwindow system (Fig. 1) based on a DBMS intended to input, edit, search and retrieve the crystal structure information stored in TOPOS external databases. TOPOS includes a number of applied programs (Table 1), all of which (except StatPack) are integrated into the TOPOS system.

Table 1: A brief description of TOPOS applied programs

Program name	Purpose
ADS (Automatic Description of Structure)	Revealing structural groups, determining their composition, orientation, dimensionality and binding in various structure representations; calculating topological invariants (coordination sequences, Schläfli and vertex symbols) and performing topological classification; constructing molecular VDPs and calculating their geometric characteristics; constructing tilings for 3D nets; searching for and classifying entanglements of 1D, 2D or 3D extended structures
AutoCN	Identifying and classifying interatomic contacts; determining coordination numbers of atoms; calculating and storing the adjacency matrix
DiAn	Calculating interatomic distances and bond angles
Dirichlet	Constructing VDPs for atoms and voids; calculating geometric characteristics of atom and void domains; searching for void positions and channel systems
HSite	Generating positions of hydrogens
IsoCryst	Visualizing crystal structure; calculating geometric parameters of crystal structure
IsoTest	Arranging crystal structures into topological and structure types; comparative analysis of atomic nets and packings
StatPack	Statistical processing of the data files generated by the programs Dirichlet and ADS

All TOPOS constituents can exchange the data and should usually be applied in a certain sequence when performing a complicated crystallochemical analysis. Scheme 1 shows the logical interconnections within the TOPOS system when exchanging the data streams. The main data stream is directed from the top to the bottom of Scheme 1, since all TOPOS applied programs use the crystallographic information from DBMS. However, ADS, AutoCN, Dirichlet, HSite and IsoTest programs can produce new data that can be stored in TOPOS databases, so there is an inverse data stream.

Crystal structure database in the TOPOS VER 2.02 format includes five files:

File type	File destination
*.adm	contains adjacency matrices of crystal structures (optional file)
*.cmp	contains chemical formulae of compounds
*.cd	contains other data on crystal structures
*.its	contains the information on the topology of the graphs of crystal structures (optional file)
*.itl	contains the information on the topology of atomic sublattices (optional file)

DBMS identifies the database using the *.cmp file; it is the file that is loaded in the DBMS window. Any number of databases can be loaded at once. In addition a number of index files *.idx ('x' is a letter characterizing the content of the index file) can be created using the DBMS Distribution utility.


Figure 1: General view of the program package TOPOS.

Scheme 1: Interaction of constituents and main data stream paths within the TOPOS system.

In addition TOPOS forms and supports the following auxiliary databases:

· TTD collection, a set of *.ttd files in a special binary format containing the information on topological types of simple 2D or 3D nets. The TTD collection is used by the ADS program for automatic determination of the crystal structure topology. At present the TTD collection includes four databases:

Database	Number of records	Description
TOPOS&RCSR.ttd	1606	Data on idealized nets from RCSR,[1] framework zeolites,[2] sphere packings (see, e.g. Sowa & Koch, 2005), and two-dimensional nets
binary.ttd	1597	Data on binary framework compounds
polytypes.ttd	694	Data on topologies of polytypic close packings, SiC, NiAs, and other layered polymorphs up to 12-layered
epinet.ttd	14380	Data on all new topological types of the periodic nets generated within the EPINET project[3]

· library of combinatorial-topological types of finite polyhedra containing the information on the edge nets of polyhedra in the files *.edg, *.pdt, *.vec. This library is used by the Dirichlet and ADS programs to identify the combinatorial topology of VDPs and tiles.

The methods of inputting the information into the TOPOS databases are shown in Scheme 2. The main distinction of the TOPOS databases from other crystallographic databanks is that the 3D graph of interatomic bonds is completely stored in the *.adm file. Using this information TOPOS can produce other important data on the crystal structure topology. Thus, the main peculiarity of TOPOS is its orientation towards topological characteristics, which explains its name.

Scheme 2: Methods to produce data in TOPOS databases.

2. Topological information in TOPOS

2.1. Adjacency matrix

TOPOS uses the concept of a labelled quotient graph (Chung et al., 1984) to make the infinite 3D periodic graph of a crystal structure suitable for computer storage. The adjacency matrix of the labelled quotient graph contained in the *.adm file carries all necessary information about the system of interatomic contacts. The format of the data for each contact of the basic 'central' atom with a surrounding atom is given below.[4] The CSym and translation fields contain the encoded symmetry operation and translation vector that transform the jth basic atom into the surrounding atom connected with the ith central one. This information is sufficient to describe the labelled quotient graph and the topology of the whole net. The other parameters characterize the kind and strength of the contact.

record
  i, j: integer;                        {numbers of central and surrounding atom}
  CSym: integer;                        {symmetry code}
  translation: array[1..3] of integer;  {translation vector}
  m: integer;                           {type of the contact:
                                         m=0 - not a contact
                                         m=1 - valence bond
                                         m=2 - specific (secondary) interaction
                                         m=3 - van der Waals bonding
                                         m=4 - hydrogen bonding
                                         m=5 - agostic bonding}
  R, SA: float;                         {contact parameters (interatomic distance, VDP solid angle, etc.)}
end;
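To illustrate why such records are sufficient to describe the whole periodic net, the sketch below computes the first shells of a coordination sequence directly from a labelled quotient graph. It is a minimal Python illustration, not TOPOS code: it assumes the symmetry codes have already been expanded, so that each contact is reduced to a hypothetical triple (i, j, translation), and the function and variable names are invented for this example.

```python
def coordination_sequence(edges, start, shells):
    """Count the nodes in successive topological shells around `start`.

    edges: list of (i, j, (tx, ty, tz)); node i of the reference cell is
    linked to node j of the cell shifted by the translation vector.
    """
    # Every edge can be traversed in both directions.
    nbrs = {}
    for i, j, t in edges:
        nbrs.setdefault(i, set()).add((j, t))
        nbrs.setdefault(j, set()).add((i, (-t[0], -t[1], -t[2])))

    origin = (start, (0, 0, 0))
    seen = {origin}
    frontier = [origin]
    sequence = []
    for _ in range(shells):
        next_frontier = []
        for node, cell in frontier:
            for j, t in nbrs.get(node, ()):
                image = (j, (cell[0] + t[0], cell[1] + t[1], cell[2] + t[2]))
                if image not in seen:
                    seen.add(image)
                    next_frontier.append(image)
        sequence.append(len(next_frontier))
        frontier = next_frontier
    return sequence


# Primitive cubic net: one node, three loops labelled by the lattice vectors.
PCU = [(0, 0, (1, 0, 0)), (0, 0, (0, 1, 0)), (0, 0, (0, 0, 1))]
# Diamond net: two nodes joined by four edges with different labels.
DIA = [(0, 1, (0, 0, 0)), (0, 1, (1, 0, 0)), (0, 1, (0, 1, 0)), (0, 1, (0, 0, 1))]

print(coordination_sequence(PCU, 0, 3))
print(coordination_sequence(DIA, 0, 3))
```

The finite list of labelled edges is all that is stored; the infinite net is unfolded on demand by adding the translation labels along each path.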

The program AutoCN is intended for automated computation and storage of the adjacency matrix. Since TOPOS can work with periodic nets of various nature, including idealized or artificial nets, AutoCN uses several algorithms to determine contacts between nodes of the net.

Three main AutoCN algorithms, called Using Rsds, Sectors and Distances, are designed for crystal structures of real chemical compounds and are based on constructing Voronoi-Dirichlet polyhedra[5] (VDPs) for all atoms. For applications of VDPs in crystal chemistry see Blatov (2004). The VDP construction uses the very efficient 'gift wrapping' algorithm (Preparata & Shamos, 1985) for computing the convex hull of a set of image points with coordinates (2x_i/R_i², 2y_i/R_i², 2z_i/R_i²), where (x_i, y_i, z_i) are the Cartesian coordinates of the surrounding atoms, and R_i is the distance from the VDP atom to the ith neighbouring atom. In this algorithm, for each edge E of a face F belonging to the convex hull, the point (P_k) corresponding to the third vertex of the face adjacent to F and joined to it at the same edge is determined from the maximum dihedral angle φ (Fig. 2a). Cotangents of the φ angles are calculated with the formula

, (1)

where n is the unit normal vector to the face F in a half-space containing the VDP atom, and a is the unit vector normal to both E and n (Fig. 2b).
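The image points used by the convex hull step can be sketched as follows. This is a minimal Python illustration with a hypothetical function name; it assumes the neighbour coordinates are Cartesian and given relative to the VDP atom. The plane bisecting the segment to a neighbour p satisfies x·p = R²/2, that is x·(2p/R²) = 1, so each image point encodes one candidate face plane, and nearer neighbours map to more distant image points.

```python
def image_points(neighbours):
    """Map neighbour positions (relative to the VDP atom) to 2p/R^2."""
    points = []
    for x, y, z in neighbours:
        r2 = x * x + y * y + z * z  # squared distance R_i^2
        points.append((2 * x / r2, 2 * y / r2, 2 * z / r2))
    return points


# A neighbour at distance 2 maps to distance 1; one at distance 1 maps to 2.
print(image_points([(2.0, 0.0, 0.0), (0.0, 0.0, 1.0)]))
```

Only the neighbours whose image points are vertices of the convex hull contribute faces to the VDP; the hull construction itself is not reproduced here.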

As a result, the VDP of an atom in the crystal space is a convex polyhedron whose faces are perpendicular to the segments connecting the central atom of the VDP with the surrounding atoms (Fig. 3a). The VDPs of all atoms form the Voronoi-Dirichlet partition of the crystal space (Fig. 3b). The plane of each face bisects the corresponding segment, and ordinarily the face and the segment intersect each other. Otherwise (Fig. 3c) the surrounding atom is called an 'indirect neighbour' according to O’Keeffe (1979). All three AutoCN algorithms consider only the contacts with direct VDP neighbours as potential bonds. The algorithms differ in the subsequent sorting of the contacts.

Figure 2: (a) Determination of a point forming the VDP face (P₆) in the 'gift wrapping' algorithm. The P₁P₂P₆ half-plane forms the maximum angle with the P₁P₂P₃ (F) half-plane containing previously found points. (b) Calculation of cot φ according to formula (1). P₁P₂ (E) is the VDP edge.


Figure 3: (a) Voronoi–Dirichlet polyhedron (VDP) and surrounding atoms, (b) Voronoi-Dirichlet partition for a body-centred cubic lattice; (c) VDP and surrounding atoms of an oxygen atom in the crystal structure of ice VIII. Valence, H bond, and non-valence interatomic contacts are coloured red, green, and black, respectively. Indirect contacts are dotted.

The Using Rsds algorithm rests upon the so-called method of intersecting spheres (Serezhkin et al., 1997). In this method the interatomic contacts are determined by counting the overlapping pairs of internal and external spheres centred on the two atoms of the pair (Fig. 4). Normally, the internal and external spheres have the Slater atomic radius, r_s, and the radius of the spherical domain, R_sd, respectively. If more than one pair of such spheres intersect each other (overlaps P₂, P₃ or P₄), the contact is assumed to be a chemical bond and is added to the atomic CN. If only the external spheres overlap, the contact is assumed to be specific; otherwise it is considered van der Waals. With additional geometrical criteria the algorithm can separate hydrogen or agostic bonding from specific contacts. In fact, the method of intersecting spheres assumes the shape of the atomic domain in the crystal structure to be practically spherical. This assumption works well for many inorganic compounds, but for organic or coordination compounds the anisotropy of atomic domains has to be considered.


Figure 4: Schematic representation of the basic types of overlaps (P_n) for atoms within the method of intersecting spheres. The radii of the solid and dotted spheres are equal to r_s and R_sd, respectively. The intersections of the spheres that cause a given type of overlap are shaded. The n value is equal to the number of pair overlaps (Serezhkin et al., 1997).
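The counting of overlaps in the method of intersecting spheres can be sketched as below. This is a minimal Python illustration, not the AutoCN implementation: the function names are hypothetical, two spheres are assumed to overlap when the sum of their radii exceeds the interatomic distance, and the additional geometrical criteria for hydrogen and agostic bonds are omitted.

```python
def count_overlaps(dist, rs_a, rsd_a, rs_b, rsd_b):
    """Number n of overlapping sphere pairs (type P_n) for atoms A and B."""
    pairs = [
        (rs_a, rs_b),    # internal - internal
        (rs_a, rsd_b),   # internal - external
        (rsd_a, rs_b),   # external - internal
        (rsd_a, rsd_b),  # external - external
    ]
    return sum(1 for ra, rb in pairs if ra + rb > dist)


def contact_kind(n_overlaps):
    """Classify a contact from the number of pair overlaps."""
    if n_overlaps > 1:
        return "valence"
    if n_overlaps == 1:
        return "specific"
    return "van der Waals"


# Illustrative radii in angstroms (invented values, not taken from any table).
for d in (2.0, 2.5, 3.0):
    n = count_overlaps(d, rs_a=0.6, rsd_a=1.3, rs_b=0.8, rsd_b=1.5)
    print(d, n, contact_kind(n))
```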

The Sectors algorithm uses an improved method of intersecting spheres, designed by Peresypkina and Blatov (2000) for organic and metal-organic compounds and called the method of spherical sectors. In this method the sphere of radius R_sd is replaced with a set of spherical sectors corresponding to interatomic contacts (Fig. 5a). The radius (r_sec) of the ith sector is determined by the formula

$$r_{\mathrm{sec}} = \left(\frac{3V_i}{\Omega_i}\right)^{1/3}, \qquad (2)$$

where V_i and Ω_i are the volume and solid angle of the pyramid whose base is the VDP face corresponding to the interatomic contact and whose vertex is the VDP atom (Fig. 5b). The Sectors algorithm also allows the user to reveal non-valence bonding.
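A spherical sector of radius r and solid angle Ω (in steradians) has the volume Ωr³/3, so a sector with the same volume and solid angle as the VDP pyramid has the radius (3V/Ω)^(1/3). The sketch below evaluates this relation; it is a minimal Python illustration under that equal-volume assumption, with a hypothetical function name.

```python
import math


def sector_radius(volume, solid_angle):
    """Radius of a spherical sector with the given volume and solid angle (sr)."""
    return (3.0 * volume / solid_angle) ** (1.0 / 3.0)


# Sanity check: a full sphere (solid angle 4*pi) gives back its own radius.
r = 1.5
print(sector_radius(4.0 / 3.0 * math.pi * r ** 3, 4.0 * math.pi))
```

If all pyramids of a VDP were replaced by a single sector with the total volume and the full solid angle 4π, the result would reduce to the radius of a sphere of the same volume as the VDP.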

Figure 5: (a) An example of identification of interatomic contacts with the Sectors algorithm in a two-dimensional lattice. Bold lines confine VDPs; dashed lines show boundaries of pyramids (triangles in the 2D case) based on the VDP faces corresponding to the direct interatomic contacts. Dashed circles have r_s radius; solid arcs of r_sec radius confine spherical sectors and show atomic boundaries in a crystal field. The A and B atoms form a valence contact, to which the triple overlap r_sec(A)–r_s(B), r_s(A)–r_sec(B) and r_sec(A)–r_sec(B) corresponds; the contact between A and C atoms is non-valence because the only overlap r_sec(A)–r_s(B) corresponds to it. (b) VDP of an atom in a body-centred cubic lattice. The solid angle (Ω) of the VDP pyramid based on the shaded face is equal to the shaded segment of the unit sphere cut off by the pyramid with the VDP atom at the vertex and the face in the base.

The Distances algorithm is an attempt to combine the Voronoi-Dirichlet approach with traditional methods that use atomic radii and interatomic distances. The contact between the VDP atom and a surrounding atom is considered a valence bond if the distance between them is shorter than the sum of their Slater radii increased by a shift specified by the user (0.3 Å by default).

With these algorithms (Sectors by default) the user can compute adjacency matrices in an automated mode, which is very important for the analysis of large numbers of crystal structures. Their main advantage is independence of the nature of bonding and of the kind of interacting atoms; Slater's system of radii is used in all cases. They were tested for all compounds from CSD and ICSD and showed good agreement with chemical models.

To work with artificial nets TOPOS has two additional algorithms, which use neither atomic radii nor the concept of direct neighbour:

· Solid Angles, where the solid angle Ω_i of the VDP face is the only criterion to select connected net nodes from the surrounding ones;

· Ranges, where the nodes are considered connected if the distance between them falls into the specified range(s); no VDPs are constructed in this case.

The general AutoCN procedure with use of one of the VDP algorithms for a crystal structure with NAtoms atoms in asymmetric unit is given below. The procedure results in saving AdjMatr array containing adjacency matrix.

procedure AutoCN(output AdjMatr)
  {construct the VDP of every atom of the asymmetric unit}
  for i:=1 to NAtoms do
    call VDPConstruction(i, output NVDPFaces)
  k:=0
  {one adjacency matrix record per VDP face, i.e. per neighbouring atom}
  for i:=1 to NAtoms do for j:=1 to NVDPFaces[i] do
  begin
    k:=k+1
    call CalcContactParam(i, j, output Dist, Omega, Overlap, Direct, HBond, Agostic)
    AdjMatr[k].i:=i
    AdjMatr[k].j:=j
    AdjMatr[k].R:=Dist
    AdjMatr[k].SA:=Omega
    AdjMatr[k].m:=0    {not a contact unless one of the tests below succeeds}
    if Omega>OmegaMin then
    begin
      if Method=Solid_Angles then AdjMatr[k].m:=1 else
      if Direct then
      begin
        if (Method=Using_Rsds)or(Method=Sectors) then
        begin
          if Overlap=0 then AdjMatr[k].m:=3
          if Overlap=1 then
            if HBond then AdjMatr[k].m:=4 else
            if Agostic then AdjMatr[k].m:=5 else AdjMatr[k].m:=2
          if Overlap>1 then AdjMatr[k].m:=1
        end
        if Method=Distances then
          if Dist<r[i]+r[j]+Shift then AdjMatr[k].m:=1
      end
    end
  end
  call StoreInDatabase(AdjMatr)
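The decision logic of the procedure above can be restated as a single function that returns the contact type m. This is a Python sketch of the pseudocode only, not TOPOS source: the function name, the string constants for the methods and the keyword arguments are assumptions made for the example, and a contact that passes none of the tests is assumed to receive m=0.

```python
def contact_type(method, omega, omega_min, direct=True, overlap=0,
                 hbond=False, agostic=False,
                 dist=0.0, r_i=0.0, r_j=0.0, shift=0.3):
    """Return the contact type m (0..5) for one VDP face."""
    if omega <= omega_min:
        return 0                      # face too small: not a contact
    if method == "Solid_Angles":
        return 1                      # solid angle is the only criterion
    if not direct:
        return 0                      # indirect neighbours are never bonded
    if method in ("Using_Rsds", "Sectors"):
        if overlap > 1:
            return 1                  # valence bond
        if overlap == 1:
            if hbond:
                return 4              # hydrogen bonding
            if agostic:
                return 5              # agostic bonding
            return 2                  # specific interaction
        return 3                      # van der Waals bonding
    if method == "Distances":
        return 1 if dist < r_i + r_j + shift else 0
    raise ValueError("unknown method: " + method)


print(contact_type("Sectors", omega=12.0, omega_min=1.5, overlap=3))
print(contact_type("Distances", omega=12.0, omega_min=1.5,
                   dist=2.1, r_i=1.0, r_j=0.9))
```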

Adjacency matrix is used by all TOPOS applied programs; ADS and IsoTest produce other data for the database derived from the adjacency matrix.

2.2. Reference databases of topological types

The ADS program produces textual *.nnt (New Net Topology) files that contain important topological invariants of nets and can be converted to binary TTD databases. The format of an *.nnt file entry is given below. For detailed information on coordination sequences, total and extended Schläfli symbols (ES) and vertex symbols (VS) see Delgado-Friedrichs & O’Keeffe (2005). The CS+ES+VS combination of topological invariants unambiguously determines the topology of any net found in real crystal structures; for additional invariants see part 3.2.1. The binary *.ttd equivalents of *.nnt files are used as libraries of standard reference nets (topological types) to be compared with the nets in real crystal structures.

An *.nnt entry example

'$sqc691',
'{6^2;8}{6^4;8^2}{6^5;10}',
'3 8 18 40 65 100 140 184 234 294',
'[6(2).6(2).8(2)]',
'[6(2).6(2).8(2)]',
'4 10 24 44 74 104 144 190 240 296',
'[6.6.6.6.6(2).10(8)]',
'[6.6.6.6.6(2).10(6)]',
'4 12 24 46 72 106 144 190 240 298',
'[6(2).6(2).6(2).6(2).8(2).8(2)]',
'[6(2).6(2).6(2).6(2).8(2).*]',

Detailed description:

'$sqc691',

Name of the record with ‘$’ prefix

'{6^2;8}{6^4;8^2}{6^5;10}',

Total Schläfli symbol for the whole net: {6²8}{6⁴8²}{6⁵10}. In this case the numbers of the three non-equivalent nodes are the same: 1:1:1. Otherwise, indices will be given after each ‘}’ bracket.

'3 8 18 40 65 100 140 184 234 294',

Coordination sequence (CS)

'[6(2).6(2).8(2)]',

Extended Schläfli symbol for circuits (ES): [6₂.6₂.8₂]

'[6(2).6(2).8(2)]',

The same for rings (VS)

Similar triples for other non-equivalent nodes

'4 10 24 44 74 104 144 190 240 296',

'[6.6.6.6.6(2).10(8)]',

'[6.6.6.6.6(2).10(6)]',

'4 12 24 46 72 106 144 190 240 298',

'[6(2).6(2).6(2).6(2).8(2).8(2)]',

'[6(2).6(2).6(2).6(2).8(2).*]',

‘*’ means that there are no rings in this angle; it is equivalent to the ‘∞’ symbol: [6₂.6₂.6₂.6₂.8₂.∞]
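The layout described above (record name, total Schläfli symbol, then one CS/ES/VS triple per non-equivalent node) is simple enough to read with a few lines of code. The sketch below is a minimal Python illustration, not a TOPOS utility: the function name and the dictionary keys are hypothetical, and it assumes one quoted field per line, as in the example entry.

```python
ENTRY = """
'$sqc691',
'{6^2;8}{6^4;8^2}{6^5;10}',
'3 8 18 40 65 100 140 184 234 294',
'[6(2).6(2).8(2)]',
'[6(2).6(2).8(2)]',
'4 10 24 44 74 104 144 190 240 296',
'[6.6.6.6.6(2).10(8)]',
'[6.6.6.6.6(2).10(6)]',
'4 12 24 46 72 106 144 190 240 298',
'[6(2).6(2).6(2).6(2).8(2).8(2)]',
'[6(2).6(2).6(2).6(2).8(2).*]',
"""


def parse_nnt_entry(text):
    """Split one *.nnt entry into its name, total symbol and node triples."""
    # Drop blank lines, the trailing comma and the surrounding quotes.
    fields = [ln.strip().rstrip(",").strip("'")
              for ln in text.splitlines() if ln.strip()]
    name, total = fields[0].lstrip("$"), fields[1]
    nodes = []
    for k in range(2, len(fields), 3):
        cs, es, vs = fields[k:k + 3]  # fails loudly on an incomplete triple
        nodes.append({"cs": [int(x) for x in cs.split()], "es": es, "vs": vs})
    return {"name": name, "total": total, "nodes": nodes}


entry = parse_nnt_entry(ENTRY)
print(entry["name"], len(entry["nodes"]), entry["nodes"][0]["cs"][:3])
```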

2.3. Topological information on crystal structure representations

The IsoTest program forms two kinds of database files. The file *.its contains topological invariants (CS+ES+VS) for all possible net representations of a given crystal structure. A hierarchical sequence of the crystal structure representations is based on the complete representation, where all the contacts stored in the adjacency matrix are taken into account. Each contact (graph edge) has a colour corresponding to its type (the m field of the adjacency matrix), and a weight determined by the interatomic distance (R field) or solid angle (SA field). All other representations may be deduced as subsets of the complete representation by the following three-step algorithm.

(i) Graph edges of the same colour are taken into account, other edges are either ignored or considered irrespective of their weights. In most cases, the chemical interactions of only one type are of interest; as a rule, those are strong bonds. If two or more types of bonds are to be analyzed, the bonds of only one type are to be considered at a given pass of the procedure. Then an array of the weights is formed for all the one-coloured edges.

(ii) The entire array of weights is divided into several groups by a clustering algorithm. TOPOS uses a simple approach in which two weights belong to the same group if their difference is smaller than a given value. Thus, n distinct coordination spheres are separated in the atomic environment. Then different topologies are generated by successively rejecting the farthest coordination sphere. As a result, n–1 additional representations of the crystal structure are produced from the complete one. It is important that no 'best' representations are chosen at this step, but all levels of interatomic interaction are clearly distinguished for further analysis, depending on the matter in hand.

(iii) Each of the n representations is used to generate a set of subrepresentations according to the scheme proposed by Blatov (2006). Every subrepresentation is unambiguously determined by an arrangement of the set {NAtoms} of all atoms from asymmetric unit into four subsets: origin {OA}, removed {RA}, contracted {CA}, and target {TA} atoms. The two operations are defined on the subsets to derive a graph of the subrepresentation from the graph of an initial ith representation: contracting an atom to other atoms keeping the local connectivity, when the atom is suppressed, but all graph paths passing through it are retained (Figs. 6a,b), and removing an atom together with all its bonds (Figs. 6c,d). The four-subset arrangement is determined by the role of atoms in those operations. Namely, origin atoms form a new net that characterizes the subrepresentation topology; removed atoms are eliminated from the initial net by the removing operation; contracted atoms merge with target atoms, passing the bonds to them.
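Step (ii) of the algorithm above can be sketched as follows. This is a minimal Python illustration with hypothetical function names. It assumes that the weights are interatomic distances, so that the farthest sphere has the largest weights, and it reads the grouping rule as a comparison of each weight with the previous one in sorted order; the exact clustering rule used by IsoTest may differ.

```python
def cluster_weights(weights, tol):
    """Group sorted weights; a gap of at least `tol` starts a new group."""
    groups = []
    for w in sorted(weights):
        if groups and w - groups[-1][-1] < tol:
            groups[-1].append(w)
        else:
            groups.append([w])
    return groups


def representations(weights, tol):
    """Weights kept in each representation, from the complete one downwards."""
    groups = cluster_weights(weights, tol)
    return [[w for g in groups[:n] for w in g]
            for n in range(len(groups), 0, -1)]


# Invented distances (angstroms) forming three coordination spheres.
distances = [2.30, 2.32, 2.35, 2.90, 2.95, 3.60]
for rep in representations(distances, tol=0.2):
    print(rep)
```

With n groups the function returns n lists: the complete set of contacts and the n–1 reduced sets obtained by dropping the outermost sphere each time.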

All the sets {OA}, {RA}, {CA}, and {TA} form a collection ({OA}, {RA}, {CA}, {TA}) that, together with the initial representation, unambiguously determines the subrepresentation topology (Figs. 6a-d). With the concept of collection, the successful enumeration of the significant subrepresentations becomes easily formalizable as a computer algorithm implemented into IsoTest. Firstly, any collection has a number of properties reflecting the crystal structure relations that can be formulated in terms of set theory.

(i) {OA}∩{RA}=∅; {OA}∩{CA}=∅; {RA}∩{CA}=∅, because an atom cannot play more than one role in the crystal structure.

(ii) {OA}∪{RA}∪{CA}={NAtoms}, i.e. every atom must have a crystallochemical role.

(iii) {OA}≠∅, other sets may be empty. This property arises because only the origin atoms are nodes in the graph of the crystal structure subrepresentation; other atoms determine the graph topology. Obviously, the collection ({OA}, ∅, ∅, ∅) means that {OA}={NAtoms}; it describes the initial representation.

(iv) {TA}⊆{OA}, because the target atoms are always selected from the origin atoms; unlike other origin atoms they are the centres of complex structural groups.

(v) {TA}≠∅ ⇔ {CA}≠∅, because the target and contracted atoms together form the structural groups.
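Properties (i)-(v) translate directly into set operations, which is what makes the enumeration of collections easy to automate. The sketch below is a minimal Python check with a hypothetical function name; atoms are represented simply by their labels.

```python
def is_valid_collection(oa, ra, ca, ta, atoms):
    """Check properties (i)-(v) for the collection ({OA}, {RA}, {CA}, {TA})."""
    oa, ra, ca, ta, atoms = map(set, (oa, ra, ca, ta, atoms))
    return (
        not (oa & ra) and not (oa & ca) and not (ra & ca)  # (i) one role per atom
        and (oa | ra | ca) == atoms                        # (ii) every atom has a role
        and bool(oa)                                       # (iii) origin set is not empty
        and ta <= oa                                       # (iv) targets are origin atoms
        and bool(ta) == bool(ca)                           # (v) targets iff contracted atoms
    )


atoms = {"Ca", "S", "O"}
print(is_valid_collection({"Ca", "S"}, set(), {"O"}, {"S"}, atoms))
print(is_valid_collection({"Ca"}, {"S"}, {"O"}, set(), atoms))
```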

Secondly, the collections, together with the topological operations, map onto all the crystal structure transformations applied in crystallochemical analysis. Namely, origin atoms correspond to the centres of structural groups in a given structure consideration. If a structural group has no distinct central atom, a pseudo-atom (PA) coinciding with group's centroid should be added to the {NAtoms} set; this case is typical to the analysis of molecular packings. Removed atoms are atoms to be ignored in the current crystal structure representation, as atoms of interstitial ions and molecules in porous substances or, say, alkali metals in framework coordination compounds. Contracted atoms, together with target atoms, form complex structural groups, but the contracted atoms are not directly considered; they merely provide the structure connectivity whereas the target atoms coincide with the groups' centroids. The difference between origin and target atoms is that the target atoms always correspond to polyatomic structural groups whereas the origin atoms symbolize all structural units, both mono- and polyatomic.


Figure 6: γ-CaSO₄ crystal structure: (a) complete representation ({Ca, S, O}, ∅, ∅, ∅), and its subrepresentations (b) ({Ca, S}, ∅, {O}, {S}) with origin Ca and S atoms, contracted oxygen atoms, and target sulfur atoms (the sma[6] topology); (c) ({Ca, O}, {S}, ∅, ∅) with origin Ca and O atoms, and removed sulfur atoms; (d) ({Ca}, {S}, {O}, {Ca}) with origin and target Ca atoms, removed sulfur atoms, and contracted oxygen atoms (the qtz topology).

If, say, there are two atoms of different colours, A and B, {A, B}={NAtoms}, the following four subrepresentations are possible for the initial representation ({A, B}, ∅, ∅, ∅):

(i) ({A}, {B}, ∅, ∅), i.e. the subnet of A atoms;

(ii) ({A}, ∅, {B}, {A}), i.e. the net of A atoms with the A–B–A bridges (B atoms are spacers);

(iii) ({B}, {A}, ∅, ∅); (iv) ({B}, ∅, {A}, {B}) are the analogous nets of B atoms.
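The two operations behind these subrepresentations can be sketched on a finite fragment of a net. This is a minimal Python illustration, not the IsoTest algorithm: the graph is a plain dictionary of neighbour sets, periodic images are ignored, and the contraction rule (every target neighbour of the contracted atom inherits its other bonds) is one reading of the description above. Contracted atoms bonded to other contracted atoms would need a repeated application, which is not handled here.

```python
def remove_atom(graph, atom):
    """Delete an atom together with all its bonds."""
    return {v: nb - {atom} for v, nb in graph.items() if v != atom}


def contract_atom(graph, atom, targets):
    """Suppress an atom, passing its bonds to its target neighbours."""
    nbrs = graph[atom]
    new = remove_atom(graph, atom)
    for t in nbrs & targets:
        for u in nbrs - {t}:
            new[t].add(u)
            new[u].add(t)
    return new


# Fragment of a chain A1-B1-A2-B2-A3 (labels are invented).
chain = {
    "A1": {"B1"}, "B1": {"A1", "A2"}, "A2": {"B1", "B2"},
    "B2": {"A2", "A3"}, "A3": {"B2"},
}
targets = {"A1", "A2", "A3"}

# ({A}, {B}, -, -): removing the B atoms leaves isolated A atoms.
removed = remove_atom(remove_atom(chain, "B1"), "B2")
# ({A}, -, {B}, {A}): contracting the B spacers keeps the A-B-A bridges as A-A edges.
bridged = contract_atom(contract_atom(chain, "B1", targets), "B2", targets)
print(removed)
print(bridged)
```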

IsoTest enumerates all possible collections and successively writes them into the *.its file in the following format:

OA, RA, CA, TA: array of integer

atomic numbers for atoms in the {OA}, {RA}, {CA}, {TA} sets

CS, ES, VS: array of integer topological invariants for all OA atoms

...

Another IsoTest algorithm enables user to compute topological invariants for sublattices of, generally speaking, non-bonded atoms, and to save them in the *.itl file. Actually, the *.itl file contains the topological information on all possible packings of atoms. There are two principal distinctions in this algorithm in comparison with the analysis of nets:

(i) adjacency matrix is calculated using the Solid Angles algorithm because no real chemical bonds, but packing contacts, are analyzed;

(ii) all atoms in the collection are considered origin or removed, no contraction is used because of the same reason.

Thus, in the case of an AB compound three packing representations ({OA}, {RA}) will be considered: ({A}, {B}), ({B}, {A}) and ({A, B}, ∅). The formats of *.its and *.itl files are similar, but there are no CA and TA arrays in the *.itl file.
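Since every atom is either origin or removed and the origin set may not be empty, the packing representations are simply the non-empty subsets of the atom kinds. A minimal Python sketch of this enumeration follows; the function name is hypothetical.

```python
from itertools import combinations


def packing_representations(kinds):
    """List all ({OA}, {RA}) pairs with a non-empty origin set."""
    kinds = sorted(kinds)
    reps = []
    for n in range(1, len(kinds) + 1):
        for oa in combinations(kinds, n):
            ra = tuple(k for k in kinds if k not in oa)
            reps.append((oa, ra))
    return reps


for oa, ra in packing_representations({"A", "B"}):
    print(oa, ra)
```

For N kinds of atoms this gives 2^N – 1 representations, so the number of sublattices to analyse grows quickly with the complexity of the composition.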

2.4. Library of combinatorial types of polyhedra

Two TOPOS programs, Dirichlet and ADS, can store the data on polyhedral units in a library consisting of three files: *.pdt (polyhedron name and geometrical parameters); *.edg (data on polyhedron edges in the format: V1, V2: integer, where V1 and V2 are the numbers of polyhedron vertices); *.vec (Cartesian coordinates of vertices and face centroids). Using the polyhedron adjacency matrix from the *.edg file Dirichlet and ADS can unambiguously identify combinatorial topology of VDPs and tiles. A standard algorithm of searching for isomorphism of two finite ordinary graphs is used for this purpose.
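The identification of a combinatorial type from an edge list can be illustrated with a brute-force isomorphism test. This Python sketch is not the algorithm used by Dirichlet or ADS: it simply tries all vertex relabellings after cheap checks on vertex, edge and degree counts, which is practical only for small polyhedra. The function name and the (V1, V2) tuple representation are assumptions made for the example.

```python
from itertools import permutations


def same_combinatorial_type(edges_a, edges_b):
    """True if the two edge lists describe isomorphic graphs."""
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    va = sorted({v for e in ea for v in e})
    vb = sorted({v for e in eb for v in e})
    if len(va) != len(vb) or len(ea) != len(eb):
        return False

    def degrees(vertices, edges):
        return sorted(sum(v in e for e in edges) for v in vertices)

    if degrees(va, ea) != degrees(vb, eb):
        return False
    # Try every bijection between the two vertex sets.
    for perm in permutations(vb):
        mapping = dict(zip(va, perm))
        if all(frozenset(mapping[v] for v in e) in eb for e in ea):
            return True
    return False


# Trigonal prism, the same prism with renumbered vertices, and an octahedron.
PRISM = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4), (1, 4), (2, 5), (3, 6)]
PRISM2 = [(6, 5), (5, 4), (4, 6), (3, 2), (2, 1), (1, 3), (6, 3), (5, 2), (4, 1)]
OCTA = [(1, 2), (1, 3), (1, 4), (1, 5), (6, 2), (6, 3), (6, 4), (6, 5),
        (2, 3), (3, 4), (4, 5), (5, 2)]

print(same_combinatorial_type(PRISM, PRISM2))
print(same_combinatorial_type(PRISM, OCTA))
```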

3. Basic algorithms of crystallochemical analysis in TOPOS

In accordance with the content of the databases there are two principal ways of crystallochemical analysis in TOPOS. They can be conditionally called geometrical and topological, because the former rests upon the ordinary crystallographic data from the *.cd file (cell dimensions, space group, atomic coordinates), whereas the latter uses the topological information from the *.adm, *.its, *.itl, *.ttd and *.edg files. As is seen from the previous part, these two ways are not completely independent, because all the topological data are initially produced from crystallographic information. However, the two methods depend on different algorithms, and we need to describe them separately.

3.1. Geometrical analysis: general scheme

Here we consider in detail only the original TOPOS features that distinguish it from well known crystallochemical software such as Diamond, Platon, ICSD or CSD tools. In addition, the IsoCryst and DiAn programs let the user compute all standard geometrical parameters (interatomic distances, bond and torsion angles, RMS lines and planes, etc.) with standard algorithms. The general scheme of geometrical analysis of a crystal structure is shown in Scheme 3.

3.1.1. Computing atomic and molecular Voronoi-Dirichlet polyhedra

Geometrical analysis in TOPOS is based on VDP as an image of an atomic domain in the crystal field and on Voronoi-Dirichlet partition as an image of crystal space that is a good approach even in the case of complex compounds (Blatov, 2004). The main advantage of this approach over the traditional model of a spherical atom is its independence of any system of atomic radii and validity for describing chemical compounds of different nature, from elementary substances to proteins. The programs Dirichlet and IsoCryst compute the following geometrical and topological VDP parameters, each of which has a clear physical meaning (Blatov, 2004; Table 2):

· VDP volume (V_VDP) and R_sd.

· VDP dimensionless normalized second moment of inertia (G₃), generally defined as:

$$G_3 = \frac{1}{3}\,\frac{\int_{VDP} R^2\,dV}{V_{VDP}^{5/3}}, \qquad (3)$$

where the integral is taken over the VDP volume and R is the distance from a point of the VDP to its centre; however, Dirichlet uses a simpler formula for an arbitrary (not necessarily convex) solid that can be subjected to simplicial subdivision:

$$G_3 = \frac{1}{3}\,\frac{\sum_j V_j I_j}{V_{VDP}^{5/3}}, \qquad (4)$$

where summation is performed over all simplexes, V_j is the volume of the jth simplex, and I_j is the normalized second moment of inertia of a simplex with respect to the centre of the VDP:

$$I_j = \frac{4}{5}\,\|\bar{v}\|^2 + \frac{1}{20}\sum_{k=1}^{4}\|v_k\|^2. \qquad (5)$$

Scheme 3: Geometrical analysis of a crystal structure in TOPOS.

In (5), summation is performed over all simplex vertices, ║v_k║ is the norm of the radius vector of the kth vertex of the simplex, and ║v̄║ is the norm of the radius vector of the simplex centroid in the coordinate system with the origin in the centre of the VDP.
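The simplicial evaluation of G₃ can be checked on a cube, for which the value 1/12 is known. The sketch below is a minimal Python illustration, not the Dirichlet code. It assumes the Conway and Sloane expression for the normalized second moment of a simplex about the origin, I = (4/5)║v̄║² + (1/20)Σ║v_k║², and divides the cube into 12 tetrahedra that share its centre; all function names are hypothetical.

```python
def tetra_volume(a, b, c, d):
    """Volume of a tetrahedron from the determinant of its edge vectors."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return abs(det) / 6.0


def simplex_moment(verts):
    """Normalized second moment of a tetrahedron about the origin."""
    centroid = [sum(p[i] for p in verts) / 4.0 for i in range(3)]
    c2 = sum(x * x for x in centroid)
    s = sum(x * x for p in verts for x in p)
    return 0.8 * c2 + s / 20.0


def g3(simplices):
    """G3 of a solid given as tetrahedra, with the origin at its centre."""
    vols = [tetra_volume(*s) for s in simplices]
    total = sum(vols)
    moment = sum(v * simplex_moment(s) for v, s in zip(vols, simplices))
    return moment / (3.0 * total ** (5.0 / 3.0))


def cube_simplices(a=1.0):
    """Split a cube of edge a, centred at the origin, into 12 tetrahedra."""
    h = a / 2.0
    origin = (0.0, 0.0, 0.0)
    simplices = []
    for axis in range(3):
        for sign in (-h, h):
            corners = []
            for s1, s2 in ((-h, -h), (h, -h), (h, h), (-h, h)):
                p = [0.0, 0.0, 0.0]
                p[axis] = sign
                p[(axis + 1) % 3] = s1
                p[(axis + 2) % 3] = s2
                corners.append(tuple(p))
            # Two triangles per face, each joined to the cube centre.
            simplices.append((origin, corners[0], corners[1], corners[2]))
            simplices.append((origin, corners[0], corners[2], corners[3]))
    return simplices


print(g3(cube_simplices()), 1.0 / 12.0)
```

Because G₃ is dimensionless, the result does not depend on the edge length passed to cube_simplices.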

· Solid angles of VDP faces (Ω_i), computed according to Fig. 5.

· Displacement of an atom from the centroid of its VDP (D_A).

· Number of VDP faces (N_f).

A number of parameters of the Voronoi-Dirichlet partition computed with Dirichlet are crucial for crystallochemical analysis (Table 2):

· Standard deviation for 3D lattice quantizer (Conway and Sloane, 1988):

, (6)

or, with (4)

, (7)

i.e. <G₃> is averaged over G₃ values of all inequivalent atomic VDPs.

· Coordinates of all VDP vertices and lengths of VDP edges.

· Other geometrical parameters of VDP vertices and edges important at the analysis of voids and channels (see part 3.2.2).

Table 2: Physical meaning of atomic VDP, molecular VDP and Voronoi-Dirichlet partition parameters

Parameter	Dimensionality	Meaning
Atomic VDP parameters
V_VDP	Å³	Relative size of atom in the crystal field
R_sd	Å	Generalized crystallochemical atomic radius
G₃	Dimensionless	Sphericity degree for nearest environment of the atom; the smaller G₃, the closer the shape of the coordination polyhedron to a sphere
Ω_i	Percentage of 4π steradian	Strength of atomic interaction
D_A	Å	Distance between the centres of positive and negative charges in the atomic domain
N_f	Dimensionless	Number of atoms in the nearest environment of the VDP atom
Molecular VDP parameters
V_VDP(mol)	Å³	Relative size of secondary building unit in the crystal field
R_sd(mol)	Å	Effective radius of secondary building unit
G₃(mol)	Dimensionless	Sphericity degree of secondary building unit
MCN, Number of faces of smoothed molecular VDP	Dimensionless	Number of SBUs contacting with a given one
	Percentage of sum of	Strength of intermolecular interaction
Number of faces of lattice molecular VDP	Dimensionless	Number of SBUs surrounding a given one in idealized packing of spherical molecules
Voronoi-Dirichlet partition parameters
<G₃>	Dimensionless	Uniformity of crystal structure
Coordinates of VDP vertices	Fractions of unit cell dimensions	Coordinates of void centres
Lengths of VDP edges	Å	Lengths of channels between the voids

By uniting atomic VDPs, TOPOS constructs secondary building units in the form of molecular VDPs (Figs. 7a-c). A molecular VDP is always non-convex; however, the VDPs of all secondary building units (SBUs) in the crystal structure still form a Voronoi-Dirichlet partition of the crystal space. The program IsoCryst visualizes molecular VDPs, and the program ADS determines the following parameters (Table 2):

· Molecular VDP volume (as a sum of volumes of atomic VDPs), V_VDP(mol) and R_sd(mol).

· Normalized second moment of inertia of the molecular VDP, G₃(mol), computed according to (4), but with the summation performed over the simplexes of all atomic VDPs composing the molecular VDP and with the centroid of the molecule taken as the origin.

· Molecular coordination number (MCN) as a number of molecular VDP faces.

· Solid angles of molecular VDP faces () to be computed by the formula (8)

, (8)

where Ω_ij are solid angles of the molecular VDP facets composing the ith molecular VDP face; is the sum of solid angles of all nonbonded contacts formed by atoms of the molecule.

· Cumulative solid angles corresponding to different kinds of intermolecular contacts in MOFs:

Valence solid angles of a ligand () and a complex () to be calculated as

, (9)

where valence contacts between the complexing M atom and donor X_i atoms of a ligand L were only taken into consideration, and

, (10)

where all ligands connected with the M atom are included in the sum.

Total solid angles of a ligand () and a complex ():

, (11)

, (12)

where, unlike (9), the index i enumerates all (including non-valence) contacts VDP atom–ligand, even if the ligand is non-valence bonded with the complexing atom and only shields it, while the index I, as in (10), enumerates all the ligands in complex, which are valence bonded with the complexing atom.

Agostic solid angles of a ligand () and a complex (). These values are to be calculated by the formulae analogous to (11) and (12), but with merely the solid angles of atomic VDPs corresponding to agostic contacts M…H–X.

Residual solid angles of a ligand (δ=) and a complex (Δ=–).
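As a small numerical illustration of the first item of the list above, the following sketch sums atomic VDP volumes into V_VDP(mol) and converts the result into a radius. It assumes that R_sd(mol) is the radius of a sphere whose volume equals V_VDP(mol), by analogy with the atomic R_sd; the function names and the sample volumes are hypothetical and are not part of TOPOS.

```python
import math


def sphere_radius(volume):
    """Radius (in Å) of a sphere with the given volume (in Å^3)."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)


def molecular_vdp_size(atomic_volumes):
    """Return (V_VDP(mol), R_sd(mol)) from the volumes of the atomic VDPs.

    Assumption: R_sd(mol) is the radius of the sphere of volume V_VDP(mol).
    """
    v_mol = sum(atomic_volumes)
    return v_mol, sphere_radius(v_mol)


# Hypothetical atomic VDP volumes (Å^3) for a three-atom molecule.
v_mol, r_mol = molecular_vdp_size([12.0, 9.5, 9.5])
print(f"V_VDP(mol) = {v_mol:.1f} Å^3, R_sd(mol) = {r_mol:.3f} Å")
```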

In addition to molecular VDPs the ADS program constructs two types of VDPs for SBU centroids:

(i) The Smoothed molecular VDP is constructed by flattening the boundary surfaces of a molecular VDP (Fig. 7d). Smoothed molecular VDPs characterize the local topology of molecular packing and occasionally do not form a partition of space.

(ii) The Lattice molecular VDP is constructed by using molecular centroids only (Fig. 7e). Lattice molecular VDPs characterize the global topology of a packing as a whole and form a partition of space, but the number of faces of such a VDP is not always equal to MCN.

In both cases only one VDP parameter, the number of faces, has a clear crystallochemical meaning (Table 2).

Figure 7: (a) A molecule N₄S₄F₄; (b) VDP of a nitrogen atom; (c) molecular VDP (dotted lines confine boundary surfaces); (d) smoothed and (e) lattice molecular VDPs.

3.1.2. Generating hydrogen positions

Parameters of atomic VDPs are used in the program HSite intended for the calculation of the coordinates of H atoms connected to X atoms (X = B, C, N, O, Si, P, S, Ge, As, Se) depending on their nature, hybridization type and arrangement of other atoms directly non-bonded with the X atoms. In comparison with known similar programs HSite has some additional features:

(i) When the hybridization type of an atom X is determined, Me…X contacts of different types (σ or π) between metal (Me) and X atoms are taken into account.

(ii) During the generation of H atoms in groups with rotational degrees of freedom, an optimal orientation of the group is searched for depending on the arrangement and size of the surrounding atoms. The sizes of these atoms are approximated by their R_sd values. When the optimal orientation is determined, repulsion in H…H contacts is considered and the possible formation of hydrogen bonds O(N)–H…O(N) is taken into account.

The HSite algorithm includes the following steps:

(i) Searching for X atoms, which can be potentially linked with hydrogen atoms.

(ii) Determination of the hybridization (sp, sp² or sp³) of these atoms in accordance with the following criteria:

· B, Si and Ge atoms may have the sp³ hybridization only.

· O, P, S, As and Se atoms may have the sp² or sp³ hybridization only.

· C and N atoms may have any type of hybridization.

· Bonds with metal atoms are taken into account in the determination of hybridization only if they form σ-bonds with X atoms. HSite automatically determines the type of Me–X bonds (σ or π) using the following criterion: a pair of X atoms is involved in π bonding with a Me atom if they are also linked together, i.e. there is a triple .

· The types of hybridization are distinguished depending on the parameters of valence bonds formed by X atoms with other L atoms:

Total number of X–L bonds	Number of bonds with L=C, N, O, S, Se	Numerical criterion	Hybridization
any	0	none	sp³
1	1	R(X–L) ≤ R_max(sp)	sp
1	1	R(X–L) ≤ R_max(sp²)	sp²
1	1	R(X–L) > R_max(sp²)	sp³
2	1, 2	∠L–X–L ≥ ∠_min(sp)	sp
2	1, 2	R(X–L1)+R(X–L2) < R_Σ(sp³)	sp²
2	1, 2	R(X–L1)+R(X–L2) ≥ R_Σ(sp³)	sp³
3	1, 2, 3	∠L1–X–L2 + ∠L1–X–L3 + ∠L2–X–L3 > ∠_Σ(sp²)	sp²
3	1, 2, 3	∠L1–X–L2 + ∠L1–X–L3 + ∠L2–X–L3 ≤ ∠_Σ(sp²)	sp³

The criteria R_max(sp), R_max(sp²) and R_Σ(sp³) have the default values 1.30, 1.40 and 2.90 Å, respectively, for X, L = C, N and O. If the X or L atom is of the 3rd or the 4th period, the R_max(sp²) criterion is increased by 0.4 or 0.5 Å, respectively, and R_Σ(sp³) is increased by 0.8 or 1.0 Å. If a boron atom participates in the bond, all values increase by 0.11 Å.

(iii) The site symmetry of X and L positions is taken into account. If necessary, the type of hybridization of the X atom and the number of hydrogen atoms to be added are corrected. For example, if the above-mentioned criteria show that a carbon atom is in sp³ hybridization and should form a methyl group, but its site symmetry is C₂, then its true hybridization is assumed to be sp² and it actually forms a planar CH₂ group.

(iv) Positions of hydrogen atoms are determined with the following geometric criteria:

· Bond angles ∠H–X–H depend only on the type of hybridization of the X atom and are equal to 180, 120 and 109.47° for sp, sp² and sp³ hybridization, respectively.

· For the sp² hybridization in the group L₂XH the condition ∠L1–X–H = ∠L2–X–H must hold.

· For the sp³ hybridization in the group L₂XH₂ the two additional hydrogen atoms must lie in the plane perpendicular to the plane passing through the L1, L2 and X atoms. In the case of the group L₃XH the condition ∠L1–X–H = ∠L2–X–H = ∠L3–X–H must hold.

· Lengths of the bonds O–H, N–H and C–H are equal to 0.96, 1.01 and 1.09 Å by default and may be changed. If the X atom is of the 3rd or the 4th period, the bond length is additionally increased by 0.4 or 0.5 Å, respectively. For example, the length of the Se–H bond will be 1.46 Å.

· If the atomic group has rotational degrees of freedom, its optimal orientation is searched for in the following way: the group is rotated with a small step (5° by default), and for each orientation the minimum distance (R_min) from the hydrogen atoms of the group to other atoms, except the atom X itself, is found, normalized by the R_sd values of these atoms. The orientation with the maximum R_min is assumed to be optimal. For isolated groups (H₂O, NH₄⁺, CH₃^–, etc.) all possible orientations of the primary axis of inertia are additionally verified by scanning an independent region of the spherical coordinate system; the spherical coordinates φ and θ are also varied with the 5° step. If H bonds are considered, they take priority in the determination of the orientation. The conditions R(H…X) ≤ R_max(HBond) and ∠X–H…X > ∠_min(HBond) are used for distinguishing H bonds. A mandatory condition during the search for the orientation is that the distances between hydrogen atoms and other atoms, except the atoms participating in H bonds, must be more than 2 Å (by default). If this condition cannot be met, the program error 'Atom X is invalid' is generated. The orientation of bridge groups XH_n binding several metal atoms is a special case: the primary axis of inertia of the group is taken to pass through the centroid of the set of metal atoms and through the X atom itself. The exception is the planar CH₃⁺ cation, whose orientation may be different taking into account the aforesaid criteria.

· Boron atoms are assumed to belong to carborane or borohydride ions. The generation of hydrogens is not provided for boranes.

(v) If there are 'pseudo-bonds' Me–X, the parameter R_max(Me) (5 Å by default) may be useful; it sets the maximum allowable length of the Me–X bonds to be considered when the geometry and orientation of the XH_n group are determined. To avoid the 'pseudo-bonds', R_max(Me) may be decreased.

(vi) By default all groups are assumed to be electroneutral; the valence of the X atoms is assumed to be standard and equal to 8 minus the number of the corresponding group of the Periodic Table. If a group is an ion (for example, X-NH₃⁺ or OH^–), this may be taken into account by setting the corresponding HSite options ('Hydroxide/amide-anions' or 'Hydroxonium/ammonium-cations').
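The hybridization table of step (ii) and the bond-length rule of step (iv) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the HSite code: the rows are tested in the order in which they are printed; the angular thresholds ∠_min(sp) and ∠_Σ(sp²) are not quoted in the text, so the values used here (160° and 350°) are placeholders; the corrections for 3rd/4th-period atoms and boron in the length criteria, the element restrictions and the site-symmetry correction are omitted. The mapping of each X atom to the O–H, N–H or C–H base length by its group is inferred from the Se–H example, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criteria:
    r_max_sp: float = 1.30      # Å, default quoted in the text
    r_max_sp2: float = 1.40     # Å, default quoted in the text
    r_sum_sp3: float = 2.90     # Å, default quoted in the text
    ang_min_sp: float = 160.0   # degrees, PLACEHOLDER (not given in the text)
    ang_sum_sp2: float = 350.0  # degrees, PLACEHOLDER (not given in the text)


def classify_hybridization(n_bonds, n_cnos, lengths=(), angles=(),
                           crit=Criteria()):
    """Return 'sp', 'sp2' or 'sp3', testing the table rows from top to bottom.

    n_bonds -- total number of X–L bonds
    n_cnos  -- number of those bonds with L = C, N, O, S, Se
    lengths -- X–L bond lengths in Å
    angles  -- L–X–L angles in degrees (one value for 2 bonds, three for 3)
    """
    if n_cnos == 0:
        return "sp3"
    if n_bonds == 1:
        if lengths[0] <= crit.r_max_sp:
            return "sp"
        return "sp2" if lengths[0] <= crit.r_max_sp2 else "sp3"
    if n_bonds == 2:
        if angles[0] >= crit.ang_min_sp:
            return "sp"
        return "sp2" if lengths[0] + lengths[1] < crit.r_sum_sp3 else "sp3"
    if n_bonds == 3:
        return "sp2" if sum(angles) > crit.ang_sum_sp2 else "sp3"
    raise ValueError("case not covered by the table")


# Default X–H lengths (Å) for C–H, N–H and O–H, keyed by group number.
BASE_XH = {14: 1.09, 15: 1.01, 16: 0.96}
PERIOD_SHIFT = {2: 0.0, 3: 0.4, 4: 0.5}
# (group, period); boron is left out because no default B–H length is quoted.
ELEMENTS = {"C": (14, 2), "N": (15, 2), "O": (16, 2),
            "Si": (14, 3), "P": (15, 3), "S": (16, 3),
            "Ge": (14, 4), "As": (15, 4), "Se": (16, 4)}


def xh_bond_length(element):
    """Default X–H bond length in Å (group analogy is an inference)."""
    group, period = ELEMENTS[element]
    return BASE_XH[group] + PERIOD_SHIFT[period]


print(classify_hybridization(1, 1, lengths=[1.35]))   # one C–L bond of 1.35 Å
print(f"{xh_bond_length('Se'):.2f}")
```

Keeping the thresholds in one small immutable object mirrors the fact that the text presents them as user-adjustable defaults.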

3.2. Topological analysis: general scheme

Topological analysis is the main purpose of TOPOS; many modern methods have recently been implemented, and new features appear every year. The general scheme of the analysis (Scheme 4) and the basic algorithms are considered below.

Scheme 4: Topological analysis of a crystal structure in TOPOS.

As follows from Scheme 4 there are three representations of crystal structure in TOPOS: as an atomic net, as a net of voids and channels, and as an atomic packing. The main branch of the scheme begins with generating atomic net as a labelled quotient graph (part 2.1). The subsequent analysis should be performed with program ADS.

3.2.1. Analysis of atomic and molecular nets

To analyze the adjacency matrix of the labelled quotient graph, ADS uses the sets of origin {OA}, removed {RA}, contracted {CA}, and target {TA} atoms (part 2.3), which are specified by the user. There are two modes of the analysis: Atomic net ({OA}≠∅) and Molecular net ({OA}=∅). The algorithm of the first mode consists of the following steps:

(i) All {RA} are removed from the adjacency matrix.

procedure Remove_RA(output AdjMatr)
  for i:=1 to NAtoms do
  begin
    if Atoms[i] ∈ {RA} then                  // atom must be removed
      repeat
        looking for AdjMatr[k1].i=i or AdjMatr[k1].j=i
        AdjMatr[k1].m:=0                     // 'not a contact' flag
      until no AdjMatr[k1].i=i or AdjMatr[k1].j=i
  end

(ii) All {CA} form ligands.

procedure Form_Ligands(output Ligands)
  for i:=1 to NAtoms do
  begin
    if Atoms[i]∈{CA} and Atoms[i]∉{Ligands} then    // atom forms new ligand
    begin
      new Ligands[j]
      add Atoms[i] to Ligands[j]
      for Atoms[k] ∈ Ligands[j] do
        repeat
          looking for AdjMatr[k1].i=k
          if Atoms[AdjMatr[k1].j]∈{CA} then add Atoms[AdjMatr[k1].j] to Ligands[j]
        until no AdjMatr[k1].i=k
    end
  end

(iii) All {CA} are contracted to {TA}. A simplified net is obtained as a result.

procedure Contract_CA_to_TA(output AdjMatr)
  for i:=1 to NAtoms do
  begin
    if Atoms[i] ∈ {TA} then                         // target atom is found
      repeat
        looking for AdjMatr[k].i=i                  // the record corresponding to Atoms[i]
        if Atoms[AdjMatr[k].j] ∈ {CA} then          // surrounding atom must be contracted
        begin
          repeat
            looking for AdjMatr[k1].i=AdjMatr[k].j
            AdjMatr[k1].i:=AdjMatr[k].i
            looking for AdjMatr[k2].j=AdjMatr[k].j
            AdjMatr[k2].j:=AdjMatr[k].i
          until no AdjMatr[k1].i=AdjMatr[k].j
          delete AdjMatr[k]
        end
      until no AdjMatr[k].i=i
  end
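Steps (i)-(iii) can also be sketched as runnable Python. The sketch below is a simplified illustration rather than the ADS implementation: it treats the structure as a finite, non-periodic graph, so the translation labels of the quotient graph are ignored, and the function name and the sample atoms are hypothetical.

```python
def simplify_net(bonds, removed, contracted, targets):
    """Return (ligands, adjacency) of the simplified net.

    bonds      -- iterable of (atom_a, atom_b) pairs
    removed    -- {RA}: atoms deleted together with their bonds
    contracted -- {CA}: atoms merged into neighbouring target atoms
    targets    -- {TA}: atoms that absorb their contracted neighbours
    """
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # (i) Remove all {RA} atoms and every bond they take part in.
    for r in removed:
        for n in adj.pop(r, set()):
            if n in adj:
                adj[n].discard(r)

    # (ii) Group connected {CA} atoms into ligands.
    ligands, seen = [], set()
    for start in sorted(contracted):
        if start in seen or start not in adj:
            continue
        group, stack = set(), [start]
        while stack:
            atom = stack.pop()
            if atom in group:
                continue
            group.add(atom)
            stack.extend(n for n in adj[atom] if n in contracted)
        seen |= group
        ligands.append(group)

    # (iii) Contract {CA} atoms to {TA}: a target absorbs a contracted
    # neighbour and inherits its bonds until no such neighbour is left.
    for t in sorted(targets):
        while t in adj:
            c = next((n for n in adj[t] if n in contracted), None)
            if c is None:
                break
            for n in adj.pop(c):
                adj[n].discard(c)
                if n != t:
                    adj[t].add(n)
                    adj[n].add(t)
    return ligands, adj


# Two Cu atoms bridged by an N-C-N ligand; a terminal water O is removed.
bonds = [("Cu1", "N1"), ("N1", "C1"), ("C1", "N2"), ("N2", "Cu2"),
         ("Cu1", "O1w")]
ligands, net = simplify_net(bonds, removed={"O1w"},
                            contracted={"N1", "C1", "N2"},
                            targets={"Cu1", "Cu2"})
print(ligands, net)
```

In this toy case the bridging ligand collapses into a single Cu1–Cu2 edge, which is the kind of simplified net the text describes.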

The second mode differs from the first one by an additional procedure of determining molecular units, which is performed after the first step. In this case initially {OA}={CA}={TA}=∅, but there should be at least two different kinds of bond in the adjacency matrix: intramolecular and intermolecular. A typical situation is when the intramolecular bonds are valence (AdjMatr[k].m=1) and the intermolecular bonds are hydrogen, specific and/or van der Waals (AdjMatr[k].m=2,3,4). As a result of the additional step (ia), all atoms fall into the {CA} set, and molecular centroids ('pseudo-atoms', PA) are placed into the {OA} and {TA} sets. Subsequently passing through steps (ii) and (iii) results in the connected net of molecular centroids.

(ia) Searching for molecular units (Molecular net mode).

procedure Form_Molecules(output Molecules, AdjMatr)
  for i:=1 to NAtoms do
  begin
    if Atoms[i]∉{CA} and Atoms[i]∉{Molecules} then  // atom forms new molecule
    begin
      new Molecules[j]
      add Atoms[i] to Molecules[j]
      add Atoms[i] to {CA}
      for Atoms[k] ∈ Molecules[j] do
        repeat
          looking for AdjMatr[k1].i=k
          if AdjMatr[k1].m=1 then
          begin
            add Atoms[AdjMatr[k1].j] to Molecules[j]
            add Atoms[AdjMatr[k1].j] to {CA}
          end
        until no AdjMatr[k1].m=1
      call Calc_Centroid(Molecules[j], output PA[j])
      add PA[j] to {OA}
      add PA[j] to {TA}
      NAtoms:=NAtoms + 1
      Atoms[NAtoms]:=PA[j]
      for Atoms[k] ∈ Molecules[j] do
      begin
        new AdjMatr[k1]
        AdjMatr[k1].i:=NAtoms
        AdjMatr[k1].j:=k
        AdjMatr[k1].m:=1
        looking for AdjMatr[k2].i=k
        AdjMatr[k2].j:=NAtoms
      end
    end
  end
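The net effect of step (ia) followed by steps (ii) and (iii) can be illustrated with a compact Python sketch. It is an assumption-laden simplification, not the ADS code: atoms joined by bonds of type 1 are grouped with a union-find structure, each group is replaced by its centroid, and bonds of types 2-4 become links between centroids. Periodicity is ignored, so centroids are plain averages of the supplied coordinates; all names and the sample data are hypothetical.

```python
def molecular_net(coords, bonds):
    """Return (molecules, centroids, links) for a non-periodic model.

    coords -- dict: atom name -> (x, y, z)
    bonds  -- iterable of (atom_a, atom_b, m); m = 1 for valence bonds,
              m = 2, 3, 4 for intermolecular contacts
    """
    parent = {a: a for a in coords}

    def find(a):
        # Root of the set containing a, with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Atoms connected by valence bonds belong to the same molecule.
    for a, b, m in bonds:
        if m == 1:
            parent[find(a)] = find(b)

    roots = sorted({find(a) for a in coords})
    index = {r: k for k, r in enumerate(roots)}
    molecules = [[] for _ in roots]
    for a in sorted(coords):
        molecules[index[find(a)]].append(a)

    # Molecular centroids play the role of the pseudo-atoms (PA).
    centroids = [tuple(sum(coords[a][k] for a in mol) / len(mol)
                       for k in range(3)) for mol in molecules]

    # Intermolecular contacts become edges between centroids.
    links = set()
    for a, b, m in bonds:
        ia, ib = index[find(a)], index[find(b)]
        if m in (2, 3, 4) and ia != ib:
            links.add(frozenset((ia, ib)))
    return molecules, centroids, links


# Two water-like molecules joined by one hydrogen bond (type 2).
coords = {"O1": (0, 0, 0), "H1": (1, 0, 0), "H2": (0, 1, 0),
          "O2": (3, 0, 0), "H3": (4, 0, 0), "H4": (3, 1, 0)}
bonds = [("O1", "H1", 1), ("O1", "H2", 1), ("O2", "H3", 1),
         ("O2", "H4", 1), ("H1", "O2", 2)]
molecules, centroids, links = molecular_net(coords, bonds)
print(molecules, centroids, links)
```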

Both modes result in a simplified network array corresponding to the structural motif at a given crystal structure representation encoded by the collection ({OA}, {RA}, {CA}, {TA}) (cf. part 2.3). The net nodes are formed by the {OA} set; the resulting and initial nets are the same if {RA}={CA}={TA}=∅ in the Atomic net mode.

The network array may consist of several nets of the same or different dimensionality (0D–3D). Before topological classification ADS distinguishes all molecular (0D) groups, chain (1D), layer (2D) and framework (3D) nets in the array as is shown in the output for ODAHEG[7].

For each net ADS computes the basic topological indices CS+ES+VS and several additional ones using original algorithms based on successive analysis of coordination shells. The algorithms have a number of advantages over those described in the literature (Goetzke & Klein, 1991; O’Keeffe & Brese, 1992; Yuan & Cormack, 2002; Treacy et al., 2006):

· No distance matrix D×D is used, so the calculation is not memory-limited.

· There are no limits to the node degree (CN).

· Smallest circuits are computed along with smallest rings.

· All rings, not only smallest, can be found within a specified ring size.

· Strong rings can be computed.
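How a coordination sequence can be obtained shell by shell, without any distance matrix, is illustrated by the following Python sketch. It is not the ADS algorithm, only a plain breadth-first expansion over a labelled quotient graph in which each node is identified by (vertex, cell translation) and only the two most recent shells are kept in memory. The edge lists for the primitive cubic net and for the nbo net (cubic cell with an edge twice the bond length, vertices at (0,½,½), (½,0,½), (½,½,0), (½,0,0), (0,½,0), (0,0,½)) were written by hand for this example; the function and variable names are hypothetical.

```python
def coordination_sequence(edges, start, n_shells):
    """Numbers of nodes in the first n_shells coordination shells.

    edges -- list of (u, v, (dx, dy, dz)): vertex u of the reference cell
             is bonded to vertex v of the cell translated by (dx, dy, dz)
    """
    nbrs = {}
    for u, v, t in edges:
        nbrs.setdefault(u, []).append((v, t))
        nbrs.setdefault(v, []).append((u, (-t[0], -t[1], -t[2])))

    previous, current = set(), {(start, (0, 0, 0))}
    sequence = []
    for _ in range(n_shells):
        following = set()
        for u, cell in current:
            for v, t in nbrs[u]:
                node = (v, (cell[0] + t[0], cell[1] + t[1], cell[2] + t[2]))
                # A neighbour of shell k lies in shell k-1, k or k+1.
                if node not in current and node not in previous:
                    following.add(node)
        sequence.append(len(following))
        previous, current = current, following
    return sequence


# Primitive cubic net: one vertex bonded to its own translates.
PCU = [(0, 0, (1, 0, 0)), (0, 0, (0, 1, 0)), (0, 0, (0, 0, 1))]

# nbo net: vertices 0-2 at (0,1/2,1/2), (1/2,0,1/2), (1/2,1/2,0) and
# vertices 3-5 at (1/2,0,0), (0,1/2,0), (0,0,1/2).
NBO = [(0, 4, (0, 0, 0)), (0, 4, (0, 0, 1)), (0, 5, (0, 0, 0)), (0, 5, (0, 1, 0)),
       (1, 5, (0, 0, 0)), (1, 5, (1, 0, 0)), (1, 3, (0, 0, 0)), (1, 3, (0, 0, 1)),
       (2, 4, (0, 0, 0)), (2, 4, (1, 0, 0)), (2, 3, (0, 0, 0)), (2, 3, (0, 1, 0))]

print(coordination_sequence(PCU, 0, 4))
print(coordination_sequence(NBO, 0, 10))
```

For the nbo net the result can be compared with the "Num" row of the coordination sequence that TOPOS reports for nbo in this section.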

An example of TOPOS output with dimensionalities of structural groups in ODAHEG

########################################################################################

5;RefCode:ODAHEG:(C48 H60 CU1 N8 O6)N, 2N(C32 H42 CU1 N5 O4 +), 2N(N1 O3 -), N(C2 H6 O1)

Author(s): PLATER M.J.,FOREMAN M.R.ST.J.,GELBRICH T.,HURSTHOUSE M.B.

Journal: CRYSTAL ENGINEERING Year: 2001 Volume: 4 Number: Pages: 319

#########################################################################################

-------------------------

Structural group analysis

-------------------------

Structural group No 1

-------------------------

Structure consists of chains [ 0 1 0] with CuO6N8C48H56

2-c net

-------------------------

Structural group No 2

-------------------------

Structure consists of chains [ 0 0 1] with CuO4N5C32H42

2-c net

Elapsed time: 6.36 sec.

By computing all rings, a user can distinguish topologically different nets with the same CS+ES+VS combination. At present such examples have been found only among artificial nets. The output has the following format:

An example of TOPOS output with all-ring Vertex symbols for rutile

Vertex symbols for selected sublattice

--------------------------------------

O1 Schlafli symbol:{4;6^2}

With circuits:[4.6(2).6(2)]

Rings coincide with circuits

All rings (up to 10): [(4,6(2)).(6(2),8(6)).(6(2),8(6))]

--------------------------------------

Ti1 Schlafli symbol:{4^2;6^10;8^3}

With circuits:[4.4.6.6.6.6.6.6.6.6.6(2).6(2).8(2).8(4).8(4)]

With rings: [4.4.6.6.6.6.6.6.6.6.6(2).6(2).*.*.*]

All rings (up to 10): [4.4.(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).6(2).6(2).*.*.*]

ATTENTION! Some rings * are bigger than 10, so likely no rings are contained in that angle

--------------------------------------

Total Schlafli symbol: {4;6^2}2{4^2;6^10;8^3}

In this case all rings were constructed up to 10-ring. So possibly larger rings exist - TOPOS does not know this!

The notation

All rings (up to 10): [(4,6(2)).(6(2),8(6)).(6(2),8(6))]

means that not only 4- (or 6-) rings, but also longer 8-rings meet at the same angle of the first non-equivalent node (oxygen atom, cf. ES or VS). There is still no conventional notation; it might look as: [(4,6₂).(6₂,8₆).(6₂,8₆)].

Relying on the CS+ES+VS combination, ADS searches for the net topological type in the TTD collection (part 2.2). Besides these basic indices, all rings and strong rings can be used for a more detailed description of the net topology. A fragment of ADS output with the computed indices and the conclusion about the net topology is given below.

##################

63;RefCode:nbo:nbo

Author(s): Bowman A L,Wallace T C,Yarnell J L,Wenzel R G

Journal: Acta Crystallographica (1,1948-23,1967) Year: 1966 Volume: 21 Number: Pages: 843

##################

Topology for C1

--------------------

Atom C1 links by bridge ligands and has

Common vertex with R(A-A)

C 1 0.0000 0.5000 0.0000 (-1 0 0) 1.000A 1

C 1 0.0000 0.5000 1.0000 (-1 0 1) 1.000A 1

C 1 0.0000 0.0000 0.5000 ( 0-1 0) 1.000A 1

C 1 0.0000 1.0000 0.5000 ( 0 0 0) 1.000A 1

Coordination sequences

----------------------

C1: 1 2 3 4 5 6 7 8 9 10

Num 4 12 28 50 76 110 148 194 244 302

Cum 5 17 45 95 171 281 429 623 867 1169

----------------------

TD10=1169

Vertex symbols for selected sublattice

--------------------------------------

C1 Schlafli symbol:{6^4;8^2}

With circuits:[6(2).6(2).6(2).6(2).8(6).8(6)]

With rings: [6(2).6(2).6(2).6(2).8(2).8(2)]

All rings (up to 10): [(6(2),8).(6(2),8).(6(2),8).(6(2),8).8(2).8(2)]

All rings with types: [(6(2),8).(6(2),8).(6(2),8).(6(2),8).8(2).8(2)]

--------------------------------------

Total Schlafli symbol: {6^4;8^2}

4-c net; uninodal net

Topological type: nbo NbO; 4/6/c2 {6^4;8^2} - VS [6(2).6(2).6(2).6(2).8(2).8(2)] (18802 types in 6 databases)

Strong rings (MaxSum=6): 6

Non-strong ring: 8=6+6+6+6

Elapsed time: 1.00 sec.
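The line "Non-strong ring: 8=6+6+6+6" in the output above expresses a ring as a sum of smaller rings. The sum of rings is the symmetric difference of their edge sets, which the following Python sketch illustrates on a finite toy graph (the cube) instead of a periodic net. It is not the TOPOS procedure: the decomposition is found by brute force over combinations, and the names and the max_terms parameter are hypothetical.

```python
from itertools import combinations


def ring_edges(ring):
    """Edge set of a ring given as a cyclic sequence of vertices."""
    n = len(ring)
    return {frozenset((ring[i], ring[(i + 1) % n])) for i in range(n)}


def ring_sum(*rings):
    """Sum of rings: edges that occur in an odd number of the rings."""
    total = set()
    for ring in rings:
        total ^= ring_edges(ring)
    return total


def decompose(ring, smaller_rings, max_terms):
    """First combination of smaller rings whose sum equals the ring, or None."""
    target = ring_edges(ring)
    for n in range(2, max_terms + 1):
        for combo in combinations(smaller_rings, n):
            if ring_sum(*combo) == target:
                return combo
    return None


# Cube vertices are numbered 0-7 by the bits of their (x, y, z) coordinates.
FACES = [(0, 2, 6, 4), (0, 1, 5, 4), (0, 1, 3, 2),
         (1, 3, 7, 5), (2, 3, 7, 6), (4, 5, 7, 6)]
# 6-ring that avoids the opposite corners 0 and 7.
HEXAGON = (1, 3, 2, 6, 4, 5)

combo = decompose(HEXAGON, FACES, max_terms=3)
print("6 = " + "+".join(str(len(r)) for r in combo))
```

In the cube the hexagon is therefore a non-strong ring, 6=4+4+4, in the same sense as the 8-ring of the nbo net.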


Figure 8: (a) Intersecting 8-rings (Hopf link) in self-catenating coesite; one of the rings is triangulated. (b) Two orientations (positive and negative) of the same 4-ring in body-centred cubic lattice determined as cross-products A×B and B×A. The black ball is the ring centroid. The direction of the ring tracing (1234) coincides with the A direction. (c) Non-Hopf link between 6- and 10-ring in self-catenating ice II. (d) Double link between 8-rings in interpenetrating array of two quartz-like nets.

If there is more than one net in the array, ADS determines the type of their mutual entanglement (polythreading, polycatenation, interpenetration and self-catenation) according to the principles described by Carlucci et al. (2003) and Blatov et al. (2004). Analysis of 0D–2D (low-dimensional) entanglements is based on searching for the intersections of rings by bonds not belonging to these rings. Since, generally speaking, the rings are not flat, they are represented as a faceted surface obtained by a barycentric subdivision (triangulation, Fig. 8a). The ring surface has two opposite orientations (positive and negative), and the ring boundary has a distinct direction of tracing (Fig. 8b). Let us call a ring intersection positive if the bond making the intersection within the boundary of the ring is directed to the same half-space as the vector of positive ring orientation, and negative otherwise. If there is a single ring intersection (positive or negative), the link between the rings is always true (Hopf, Fig. 8a). If the numbers of positive and negative intersections are the same, the link can be unwoven (it is a false, non-Hopf link, Fig. 8c); if the difference between the numbers is greater than one, the link is multiple (Fig. 8d). ADS determines the link types; a real entanglement exists if there is at least one true (Hopf or multiple) link. Then ADS outputs the type of the entanglement (see Example 1). A special case is the entanglement of several 3D nets (3D interpenetration), for which the Class of interpenetration (Blatov et al., 2004) and the symmetry operations relating different 3D nets are output (Example 2).
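The counting of positive and negative ring intersections can be sketched in Python as follows. This is an illustration under assumptions, not the ADS code: the ring is triangulated as a fan around its centroid, each bond is tested against every triangle with a standard segment-triangle test, and the sign is taken from the bond direction relative to the triangle normal. Coordinates are Cartesian, bonds passing exactly through an edge shared by two triangles are not handled specially, and a difference of exactly one between the two counts is treated as a Hopf link even when there are several crossings (the text states this only for a single crossing). All names are hypothetical.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def crossing_sign(p0, p1, triangle, eps=1e-12):
    """+1 or -1 if the segment p0->p1 pierces the oriented triangle, else 0."""
    a, b, c = triangle
    normal = _cross(_sub(b, a), _sub(c, a))
    direction = _sub(p1, p0)
    denom = _dot(normal, direction)
    if abs(denom) < eps:                 # segment parallel to the triangle
        return 0
    t = _dot(normal, _sub(a, p0)) / denom
    if not 0.0 < t < 1.0:                # plane is hit outside the segment
        return 0
    x = tuple(p0[k] + t * direction[k] for k in range(3))
    for u, v in ((a, b), (b, c), (c, a)):
        if _dot(_cross(_sub(v, u), _sub(x, u)), normal) < 0.0:
            return 0                     # hit point lies outside the triangle
    return 1 if denom > 0.0 else -1


def count_crossings(ring, bonds):
    """Numbers of positive and negative crossings of a ring by bonds."""
    n = len(ring)
    centroid = tuple(sum(p[k] for p in ring) / n for k in range(3))
    n_pos = n_neg = 0
    for p0, p1 in bonds:
        for i in range(n):
            s = crossing_sign(p0, p1, (centroid, ring[i], ring[(i + 1) % n]))
            n_pos += s > 0
            n_neg += s < 0
    return n_pos, n_neg


def classify_link(n_pos, n_neg):
    if n_pos + n_neg == 0:
        return "no link"
    difference = abs(n_pos - n_neg)
    if difference == 0:
        return "non-Hopf"
    return "Hopf" if difference == 1 else "multiple"


# Square 4-ring in the xy plane, traced counter-clockwise (normal along +z).
ring = [(1, 1, 0), (-1, 1, 0), (-1, -1, 0), (1, -1, 0)]
up = ((0.5, 0.2, -1.0), (0.5, 0.2, 1.0))
down = ((0.5, 0.2, 1.0), (0.5, 0.2, -1.0))
print(classify_link(*count_crossings(ring, [up])))
print(classify_link(*count_crossings(ring, [up, down])))
```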

Example 1. 2D+2D, inclined polycatenation (Fig. 9a)

########################################

6;RefCode:LETWAI:C24 H24 Cu4 F12 N12 Si2

Author(s): Macgillivray L.R.,Subramanian S.,Zaworotko M.J.

Journal: CHEM.COMMUN. Year: 1994 Volume: Number: Pages: 1325

########################################

Topology for Cu1

--------------------

Atom Cu1 links by bridge ligands and has

Common vertex with R(A-A) f

Cu 1 0.7063 -0.2063 0.0000 ( 1 0 0) 6.937A 1

Cu 1 0.2063 0.2937 -0.5000 ( 0 0-1) 6.685A 1

Cu 1 0.2063 0.2937 0.5000 ( 0 0 0) 6.685A 1

-------------------------

Structural group analysis

-------------------------

Structural group No 1

-------------------------

Structure consists of layers ( 1 1 0); ( 1-1 0) with CuN3C6H6

Vertex symbols for selected sublattice

--------------------------------------

Cu1 Schlafli symbol:{6^3}

With circuits:[6.6.6]

--------------------------------------

Total Schlafli symbol: {6^3}

3-c net

-----------------------

Non-equivalent circuits

-----------------------

Circuit No 1; Type=6; Centroid: (0.500,0.000,0.500)

------------------------------

Atom x y z

------------------------------

Cu1 0.2937 0.2063 1.0000

Cu1 0.7063 -0.2063 1.0000

Cu1 0.7937 -0.2937 0.5000

Cu1 0.7063 -0.2063 0.0000

Cu1 0.2937 0.2063 0.0000

Cu1 0.2063 0.2937 0.5000

Crossed with bonds

------------------------------------------------------------------------------------------------

No | Atom x y z | Atom x y z | Dist. | N Cycles

------------------------------------------------------------------------------------------------

1 | Cu1 0.2937 -0.2063 0.5000 | Cu1 0.7063 0.2063 0.5000 | 6.937 | 6/inf 6/inf

------------------------------------------------------------------------------------------------

Ring links

------------------------------------------------------

------------------------------------------------------

6 | 6 | inf. | 1 | 1 | * | 2

------------------------------------------------------

Polycatenation

--------------

Groups

1: 2D, CuN3C6H6 (Zt=1); (1,1,0); (1,-1,0)

Types

----------------------------------------------------

Group 1 | Orient. | Group 2 | Orient. | Type

----------------------------------------------------

1 | 1,1,0 | 1 | 1,-1,0 | 2D+2D, inclined

----------------------------------------------------

Elapsed time: 2.14 sec.

Figure 9: (a) Entangled 2D layers in the crystal structure of LETWAI. The nets are simplified at {OA}={TA}={Cu}. (b) Interpenetrating 3D nets in the cuprite (Cu₂O) crystal structure.

Example 2. Interpenetration of two 3D nets in cuprite, Cu₂O (Fig. 9b)

####################

7;RefCode:63281:Cu2O

Author(s): Restori R,Schwarzenbach D

Journal: Acta Crystallographica B (39,1983-) Year: 1986 Volume: 42 Number: Pages: 201-208

####################

-------------------------

Structural group analysis

-------------------------

Structural group No 1

-------------------------

Structure consists of 3D framework with Cu2O

There are 2 interpenetrated nets

FIV: Full interpenetration vectors

----------------------------------

[0,1,0] (4.27A)

[0,0,1] (4.27A)

[1,0,0] (4.27A)

----------------------------------

PIC: [0,2,0][0,1,1][1,1,0] (PICVR=2)

Zt=2; Zn=1

Class Ia Z=2

Vertex symbols for selected sublattice

--------------------------------------

O1 Schlafli symbol:{12^6}

With circuits:[12(2).12(2).12(2).12(2).12(2).12(2)]

--------------------------------------

Cu1 Schlafli symbol:{12}

With circuits:[12(6)]

--------------------------------------

Total Schlafli symbol: {12^6}{12}2

2,4-c net with stoichiometry (2-c)2(4-c)

-----------------------

Non-equivalent circuits

-----------------------

Circuit No 1; Type=12; Centroid: (0.000,0.500,0.500)

------------------------------

Atom x y z

------------------------------

O1 0.2500 0.2500 1.2500

Cu1 0.5000 0.5000 1.0000

O1 0.7500 0.7500 0.7500

Cu1 0.5000 1.0000 0.5000

O1 0.2500 1.2500 0.2500

Cu1 0.0000 1.0000 0.0000

O1 -0.2500 0.7500 -0.2500

Cu1 -0.5000 0.5000 0.0000

O1 -0.7500 0.2500 0.2500

Cu1 -0.5000 0.0000 0.5000

O1 -0.2500 -0.2500 0.7500

Cu1 0.0000 0.0000 1.0000

Crossed with bonds

------------------------------------------------------------------------------------------------

No | Atom x y z | Atom x y z | Dist. | N Cycles

------------------------------------------------------------------------------------------------

1 | O1 -0.2500 0.7500 0.7500 | Cu1 0.0000 0.5000 0.5000 | 1.848 | 12/inf 12/inf 12/inf 12/inf 12/inf 12/inf

1 | O1 0.2500 0.2500 0.2500 | Cu1 0.0000 0.5000 0.5000 | 1.848 | 12/inf 12/inf 12/inf 12/inf 12/inf 12/inf

------------------------------------------------------------------------------------------------

Ring links

------------------------------------------------------

------------------------------------------------------

12 | 12 | inf. | 1 | 1 | * | 6

------------------------------------------------------

Elapsed time: 5.75 sec.

ADS uses the information about ring intersections to construct the natural tiling (Delgado-Friedrichs & O’Keeffe, 2005) that carries the net. Although the definition of natural tiling is well known, there was no strict algorithm for its construction. The main problem is that not all strong rings (Fig. 10a) are necessarily faces of the tiles, but only essential ones (Delgado-Friedrichs & O’Keeffe, 2005; Fig. 10b). At the same time no criteria were reported to distinguish essential strong rings, so they can be determined only after constructing the natural tiling.


Figure 10: (a) Closed sum of strong 5,6-rings (magenta) and non-strong 18-ring (yellow) in fullerene. (b) Two tiles, essential (green) and inessential (red) strong rings in the natural tiling of body-centred cubic net. (c) Two intersecting equivalent inessential rings (red and yellow) in the tile.

ADS uses the following definition of an essential strong ring: it is a strong ring that intersects no other essential strong rings. There are two types of such intersections, homocrossing and heterocrossing, in which the intersecting rings are equivalent (Fig. 10c) or inequivalent, respectively. The rings participating in a homocrossing are always inessential; the rings participating only in heterocrossings can be essential in an appropriate ring set; a ring with no crossings is always essential. Thus, the algorithm of searching for essential rings consists of the following steps:

(i) Compute all rings within a given range. Because even the rings of the same size are not always symmetrically equivalent, TOPOS can distinguish them by assigning types. The types are designated by one or more letters: a-z, aa-az, ba-bz, etc., for example, 4a, 12ab, 20xaz. As a result a typed all-ring Vertex symbol is calculated:

An example of TOPOS output with typed all-ring Vertex symbols for rutile

Vertex symbols for selected sublattice

--------------------------------------

O1 Schlafli symbol:{4;6^2}

With circuits:[4.6(2).6(2)]

Rings coincide with circuits

All rings (up to 10): [(4,6(2)).(6(2),8(6)).(6(2),8(6))]

All rings with types: [(4,6(2)).(6(2),8a(4),8b(2)).(6(2),8a(4),8b(2))]

--------------------------------------

Ti1 Schlafli symbol:{4^2;6^10;8^3}

With circuits:[4.4.6.6.6.6.6.6.6.6.6(2).6(2).8(2).8(4).8(4)]

With rings: [4.4.6.6.6.6.6.6.6.6.6(2).6(2).*.*.*]

All rings (up to 10): [4.4.(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).(6,8(3)).6(2).6(2).*.*.*]

All rings with types: [4.4.(6,8a(2),8b).(6,8a(2),8b).(6,8a(2),8b).(6,8a(2),8b).(6,8a(2),8b).(6,8a(2),8b).(6,8a(2),8b).(6,8a(2),

8b).6(2).6(2).*.*.*]

ATTENTION! Some rings * are bigger than 10, so likely no rings are contained in that angle

--------------------------------------

Total Schlafli symbol: {4;6^2}2{4^2;6^10;8^3}

For example, the first angle for the first node (oxygen atom) contains two non-equivalent 8-rings. There is no conventional notation for typed all-ring Vertex symbol. We propose the following one: [(4,6₂).(6₂,8a₄,8b₂).(6₂,8a₄,8b₂)].

(ii) Select strong rings. All non-strong rings are output as sums of smaller rings:

An example of TOPOS output with strong and non-strong rings for zeolite MTF

Strong rings (MaxSum=8): 4,5a,5b,5c,5d,6a,6b,6c,6d,8a,8b

Non-strong ring: 7=5d+6c+6d

Non-strong ring: 12=4+5a+5a+5b+5b+6a+8a+8b

(iii) Find all rings intersected by bonds (in entangled structures) and reject them. This condition is required because tile interior must be empty.

(iv) For all remaining rings find their intersections.

(v) Reject all rings participating in homocrossings.

(vi) Arrange all remaining rings into sets in which no intersecting rings exist. The sets are maximal, i.e. no other ring can be added to a set without introducing a heterocrossing.
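Steps (iii)-(vi) amount to filtering ring types and then enumerating maximal crossing-free sets, which can be sketched in Python as follows. The sketch works on ring types (classes of equivalent rings) rather than on individual rings and enumerates subsets by brute force, so it is practical only for a handful of types; it is an illustration, not the ADS implementation, and the function name and input format are hypothetical.

```python
from itertools import combinations


def maximal_ring_sets(ring_types, crossings, crossed_by_bonds=()):
    """Maximal sets of ring types without mutual crossings.

    ring_types       -- labels of strong-ring types, e.g. ['4', '6a', '6b']
    crossings        -- pairs (type1, type2) of intersecting rings; a pair
                        of equal labels denotes a homocrossing
    crossed_by_bonds -- types rejected in step (iii)
    """
    # (iii) and (v): reject rings crossed by bonds or taking part in
    # a homocrossing.
    rejected = set(crossed_by_bonds) | {a for a, b in crossings if a == b}
    kept = [t for t in ring_types if t not in rejected]

    # (iv): heterocrossings among the remaining types.
    hetero = {frozenset((a, b)) for a, b in crossings
              if a != b and a in kept and b in kept}

    def crossing_free(types):
        return all(frozenset(p) not in hetero for p in combinations(types, 2))

    # (vi): scan subsets from the largest to the smallest; a crossing-free
    # subset is maximal if it is not contained in one found earlier.
    result = []
    for size in range(len(kept), 0, -1):
        for combo in combinations(kept, size):
            candidate = set(combo)
            if crossing_free(candidate) and not any(candidate < s for s in result):
                result.append(candidate)
    return result


# Hypothetical input: 8-rings cross their own copies, 6a crosses 6b.
sets = maximal_ring_sets(["4", "6a", "6b", "8"],
                         crossings=[("8", "8"), ("6a", "6b")])
print(sets)
```

Each set returned in this way is a candidate set of essential rings that still has to be checked by the tile construction.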

Each of the sets obtained is then checked to see whether it produces a natural tiling. Starting from the first ring of the set and taking one of the two possible ring orientations (Fig. 8b), ADS adds another ring to an edge of the initial ring to get a ring sum. For instance, three pentagonal and three hexagonal rings can be added to the central hexagonal ring in Fig. 10a. In 3D nets several (at least three) rings are adjacent to any edge, so there is an ambiguity at this step. To overcome this problem and to speed up the calculation, the dihedral angles between each of the trial rings and the initial ring are computed. In fact, these are the angles between the normals to the ring facets (triangles) based on the edge of the initial ring (Fig. 11a). Since the facets are oriented, the angles vary in the range 0-360°. Let us consider two facets of two trial rings 1 and 2 (candidates to be the tile face) with different angles φ₁ and φ₂, φ₂>φ₁. Obviously, if we choose ring 2, the tile intersects another tile to which ring 1 belongs. So the target ring for the natural tile can be unambiguously chosen at each step as the ring with the minimal dihedral angle φ_min. Then the next ring is added to any free edge of the sum, i.e. an edge belonging to only one ring. The procedure is repeated until no free edges remain, i.e. the sum becomes closed (Fig. 10a). The closed ring sum is one of the natural tiles. Then the procedure starts again for the opposite orientation of the initial ring. As a result the initial ring becomes shared between two natural tiles (Fig. 11b). Then ADS considers all other inequivalent rings in the same way. Thus, all tiles forming the natural tiling are obtained with the following algorithm.

procedure Natural_Tiling(output Tiles)
  NumTiles:=0
  for i:=1 to NStrongRings do
  begin
    for j:=1 to 2 do                                // j is an orientation number for the first ring of the tile
    begin
      NumTiles:=NumTiles + 1
      add StrongRings[i] to Tiles[NumTiles]         // initialize new tile
      repeat
        call AddRing(j, output Tiles[NumTiles])     // add new ring to the tile
      until no new ring is added to Tiles[NumTiles]
    end
  end
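The choice of the next ring by the minimal dihedral angle, which is what an AddRing step has to do at every free edge, can be sketched in Python. The convention used here is an assumption made for the illustration and may differ from the one in ADS: each ring is represented by the facet spanned by the shared edge and the ring centroid, and the angle is the rotation about the edge, in the range 0-360°, that carries the facet of the initial ring onto the facet of the trial ring; the second orientation simply reverses the rotation axis. Coordinates are Cartesian and all names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotation_angle(p, q, c_init, c_trial, orientation=1):
    """Angle (degrees, 0 <= angle < 360) from the initial to the trial facet.

    p, q     -- end points of the shared edge
    c_init   -- centroid of the initial ring
    c_trial  -- centroid of the trial ring
    """
    axis = _sub(q, p) if orientation == 1 else _sub(p, q)
    length = math.sqrt(_dot(axis, axis))
    u = tuple(x / length for x in axis)

    def perpendicular(c):
        # Component of the vector p->c that is perpendicular to the edge.
        w = _sub(c, p)
        k = _dot(w, u)
        return tuple(w[i] - k * u[i] for i in range(3))

    a, b = perpendicular(c_init), perpendicular(c_trial)
    angle = math.degrees(math.atan2(_dot(u, _cross(a, b)), _dot(a, b)))
    return angle % 360.0


def choose_ring(p, q, c_init, trials, orientation=1):
    """Label of the trial ring with the minimal angle."""
    return min(trials, key=lambda name: rotation_angle(
        p, q, c_init, trials[name], orientation))


# Edge along z; the initial ring lies towards +x, trial rings towards
# +y (A), -x (B) and -y (C).
p, q = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
c_init = (1.0, 0.0, 0.5)
trials = {"A": (0.0, 1.0, 0.5), "B": (-1.0, 0.0, 0.5), "C": (0.0, -1.0, 0.5)}
print(choose_ring(p, q, c_init, trials, orientation=1))
print(choose_ring(p, q, c_init, trials, orientation=2))
```

Reversing the orientation selects the ring on the other side of the initial ring, which corresponds to building the second tile that shares this ring.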

Figure 11: (a) Some 4-rings sharing the same (red) edge in body-centred cubic net. The grey facet is the facet of the initial ring; the yellow one has smaller φ than the green one. The black balls are the ring centroids. (b) 4-ring (red) shared between two natural tiles. (c) Two natural tiles sharing a red face in an MPT of the idealized net bcw.

Then ADS determines a number of geometrical and topological characteristics of the tiles and the tiling (Delgado-Friedrichs & O’Keeffe, 2005). The resulting output looks as shown below (the sodalite net example).

The physical meaning of the tiles is that they correspond to minimal cages in the net. Using these ‘bricks’ ADS can construct larger tiles by summing natural tiles (merging them by faces). In this way, maximal proper tiles (MPT) and the corresponding tiling can be obtained, representing the maximal cages allowed by a given net symmetry (Fig. 11c).

Based on the tiling, ADS can construct the dual net, whose nodes, edges, rings and tiles map onto tiles, rings, edges and nodes of the initial net (Delgado-Friedrichs & O’Keeffe, 2005). In particular, nodes and edges of the dual net describe the topology of the system of cages and channels in the initial net (Fig. 12). The data on the dual net are stored in a TOPOS database, so the dual net can be studied as an ordinary net, including generation of its own dual net (‘dual dual net’).

Figure 12: Initial net (cyan balls) and dual net (yellow sticks) in sodalite.

An example of TOPOS output for natural tiling in sodalite net

#################

3;RefCode:sod:sod

#################

Topology for C1

--------------------

Atom C1 links by bridge ligands and has

Common vertex with R(A-A)

C 1 0.5000 0.0000 0.2500 ( 1 0 0) 0.707A 1

C 1 0.5000 0.0000 0.7500 ( 1 0 1) 0.707A 1

C 1 0.0000 0.2500 0.5000 ( 0 0 1) 0.707A 1

C 1 0.0000 -0.2500 0.5000 ( 0 0 1) 0.707A 1

Vertex symbols for selected sublattice

--------------------------------------

C1 Schlafli symbol:{4^2;6^4}

With circuits:[4.4.6.6.6.6]

Rings coincide with circuits

Rings with types: [4.4.6.6.6.6]

--------------------------------------

Total Schlafli symbol: {4^2;6^4}

4-c net

Essential rings by homocrossing: 4,6

Inessential rings by homocrossing: none

-----------------------------

Primitive proper tiling No 1

-----------------------------

Essential rings by heterocrossing: 4,6

Inessential rings by heterocrossing: none

Natural tiling

24/14:[4^6.6^8]; Centroid:(0.500,0.500,0.500); Volume=4.000; G3=0.078543

-------------------------------

Atom x y z

-------------------------------

C1 0.5000 0.2500 0.0000

C1 0.5000 0.0000 0.2500

C1 0.7500 0.0000 0.5000

C1 0.2500 0.0000 0.5000

C1 0.5000 0.0000 0.7500

C1 0.5000 0.2500 1.0000

C1 0.0000 0.2500 0.5000

C1 0.2500 0.5000 0.0000

C1 0.0000 0.5000 0.2500

C1 0.7500 0.5000 0.0000

C1 0.5000 0.7500 0.0000

C1 1.0000 0.2500 0.5000

C1 1.0000 0.5000 0.2500

C1 0.7500 0.5000 1.0000

C1 1.0000 0.5000 0.7500

C1 1.0000 0.7500 0.5000

C1 0.0000 0.7500 0.5000

C1 0.0000 0.5000 0.7500

C1 0.2500 0.5000 1.0000

C1 0.5000 0.7500 1.0000

C1 0.5000 1.0000 0.2500

C1 0.7500 1.0000 0.5000

C1 0.5000 1.0000 0.7500

C1 0.2500 1.0000 0.5000

Tiling: [4^6.6^8]

Transitivity: [1121]

Simple tiling

All proper tilings (S=simple; I=isohedral)

------------------------------------------------------------

Tiling | Essential rings | Transitivity | Comments | Tiles

------------------------------------------------------------

PPT 1/NT | 4,6 | [1121] | MPT SI | [4^6.6^8]

------------------------------------------------------------

Elapsed time: 2.55 sec.

3.2.2. Analysis of systems of cavities and channels

Quite another way to obtain the system of cages and channels is to consider the Voronoi-Dirichlet partition (part 2.1) and to analyze the net of VDP vertices and edges, the Voronoi-Dirichlet graph (Fischer, 1986). The principal difference between the tiling and Voronoi-Dirichlet approaches is that the former is purely topological and derives the cages and channels from the topological properties of the initial net, whereas the latter uses the geometrical properties of crystal space for the same purpose. Here the geometrical and topological parts of TOPOS are combined with each other.

The main notions of the Voronoi-Dirichlet approach are the elementary void and the elementary channel. Some of their important properties, which are used in the TOPOS algorithms, follow from the properties of the Voronoi-Dirichlet partition.

Elementary void properties

(i) The elementary void is equidistant from at least four noncoplanar atoms (tetrahedral void), since no fewer than four VDPs meet at the same vertex (Fig. 13a). There are two types of elementary voids: major, if the void centre is located inside the polyhedron whose vertices coincide with the atoms forming the elementary void (for instance, inside the tetrahedron for a tetrahedral void, Fig. 13a); and minor, if its centre lies outside or on the boundary of the polyhedron (Fig. 13b).

(ii) There are additional atoms at longer distances than the atoms of elementary void that can strongly influence the geometrical parameters of the elementary void. To find these parameters, one should construct the void VDP taking into account all atoms and other equivalent elementary voids (Fig.13c). Let us call the atoms and voids participating in the VDP formation environmental. Obviously, the atoms forming the elementary void are always environmental.

(iii) Radius of elementary void (R_sd) is the radius of a sphere, whose volume is equal to the volume of the void VDP constructed with consideration of all environmental atoms and voids.

(iv) Shape of elementary void is estimated by G₃ value for the void VDP constructed with all environmental atoms and voids (Fig.13c).
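Properties (iii) and (iv) can be made concrete with a minimal Python sketch. The function names are hypothetical, and G₃ is assumed here to be the dimensionless second moment used for lattice quantizers (Conway & Sloane, 1988), i.e. the integral of r² over the polyhedron, taken about its centroid and divided by 3V^(5/3); closed forms are given only for a rectangular box and a sphere.

```python
import math

def r_sd(volume):
    """Radius of the sphere whose volume equals the VDP volume."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)

def g3_box(a, b, c):
    """G3 of a rectangular box with edges a, b, c (assumed normalization)."""
    volume = a * b * c
    # Integral of r^2 over the box about its centre: V * (a^2 + b^2 + c^2) / 12.
    second_moment = volume * (a * a + b * b + c * c) / 12.0
    return second_moment / (3.0 * volume ** (5.0 / 3.0))

def g3_sphere():
    """G3 of a sphere; the radius cancels, so the result is a constant."""
    return (4.0 * math.pi / 5.0) / (3.0 * (4.0 * math.pi / 3.0) ** (5.0 / 3.0))

print(r_sd(4.0 * math.pi / 3.0))   # sphere of unit radius
print(g3_sphere(), g3_box(1.0, 1.0, 1.0))
```

Under this normalization the sphere gives the smallest possible value (about 0.07697) and the cube gives 1/12; for comparison, the known value for the truncated octahedron is 19/(192·2^(1/3)), about 0.078543.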


Figure 13: (a) Four VDPs meeting at the same vertex (red ball) in the body-centred cubic lattice. (b) A minor void (ZC2) located outside the tetrahedron of the three yellow oxygen atoms and the zirconium atom forming this void in the crystal structure of NASICON, Na₄Zr₂(SiO₄)₃. All distances from ZC2 to the oxygen and zirconium atoms are equal to 1.722 Å. (c) The form of an elementary void in the NaCl crystal structure. All environmental atoms and one void are yellow; R_sd = 1.38 Å, G₃ = 0.07854.

Elementary channel properties

(i) The elementary channel is formed by at least three noncollinear atoms since in the Voronoi-Dirichlet partition each VDP edge is shared by no less than three VDPs. The plane passing through these atoms is perpendicular to the line of the elementary channel (Fig. 14a).

(ii) Section of the elementary channel is a polygon whose vertices are the atoms forming the channel; the section always corresponds to the narrowest part of the channel. The line of the elementary channel is always perpendicular to its section; ordinarily, the channel section and channel itself are triangular (Figs. 14a, b). The elementary channel can be of two types: major, if its line intersects its section (Fig. 14a), and minor, if the line and section have no common points, or one of the line ends lies on the section (Fig. 14b).

(iii) Radius of the elementary channel section is estimated as the geometric mean of the distances from the inertia centre of the elementary channel section to the atoms forming the channel. An atom can freely pass through the channel if the sum of its radius and the averaged radius of the atoms forming the channel does not exceed the channel radius.

(iv) Length of the elementary channel is the distance between the elementary voids connected by the channel, i.e. the length of the corresponding VDP edge.
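Property (iii) can be sketched in Python as follows. The function names are hypothetical, and the inertia centre of the section is assumed to be the unweighted mean of the positions of the channel-forming atoms, which holds for a triangular section but is only an approximation for larger polygons.

```python
import math

def channel_radius(atoms):
    """Geometric mean of the distances from the section centre to the atoms.
    atoms: list of (x, y, z) positions of the atoms forming the channel."""
    n = len(atoms)
    centre = tuple(sum(a[i] for a in atoms) / n for i in range(3))
    distances = [math.dist(a, centre) for a in atoms]
    return math.exp(sum(math.log(d) for d in distances) / n)

def can_pass(mobile_radius, framework_radii, radius):
    """Passage criterion: the mobile-atom radius plus the averaged radius of
    the channel-forming atoms must not exceed the channel radius."""
    mean_framework = sum(framework_radii) / len(framework_radii)
    return mobile_radius + mean_framework <= radius

# Equilateral triangular section with all atoms 2 Å from its centre
# (illustrative numbers, not taken from a real structure).
s = math.sqrt(3.0)
section = [(2.0, 0.0, 0.0), (-1.0, s, 0.0), (-1.0, -s, 0.0)]
rad = channel_radius(section)
print(rad, can_pass(0.5, [1.4, 1.4, 1.4], rad))
```

The geometric mean penalizes strongly unequal distances more than the arithmetic mean does, so a distorted section yields a smaller effective radius.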


Figure 14: (a) Section of a triangular major elementary channel in the crystal structure of α-AgI. The channel line is red. The atoms forming the channel are at the vertices of the triangular section, which intersects the channel line at the black ball. (b) A fragment of the channel system in the crystal structure of NASICON. The line of a minor elementary channel is red; the oxygen atoms forming this channel are yellow. Other minor elementary channels are shown by dotted lines.

The Voronoi-Dirichlet approach is implemented in the program Dirichlet as the following general algorithm:

(i) constructing VDPs for all independent framework atoms, i.e. a Voronoi-Dirichlet partition of the crystal space (interstitial particles including mobile ions or solvate molecules are ignored);

(ii) determining the coordinates for all independent vertices of atomic VDPs, and, as a result, the coordinates of all elementary voids;

(iii) determining all independent VDP edges, and, hence, all elementary channels;

(iv) calculating the numerical parameters of elementary voids and channels.

The information on the resulting conduction pattern is stored as a three-level adjacency matrix of the Voronoi-Dirichlet graph (Fig. 15).

The first level contains the information on a (central) elementary void. A major elementary void is designated as ZA, a minor one is marked as ZB or ZC if it lies respectively on the boundary or outside the polyhedron of the atoms forming the void. The void radius (R_sd, Å) and second moment of inertia of its VDP (G₃) are also shown.

The second level includes the information on other elementary voids connected with a given (central) one by channels, and on the atoms of its near environment. Every elementary void is characterized by the length (R, Å) of the elementary channel connecting it with the central one, by the number of channel atoms (Chan), and by the channel radius (Rad, Å). If a channel is major, the text is marked bold; otherwise a normal font is applied. Every environmental atom of the central void is characterized by the distance to the void centre (R, Å) and by the solid angle of the corresponding VDP face (SA, as a percentage of 4π steradians); the greater the SA, the more significant the atom–void contact.

The third level contains the information on the atoms forming the channel; the distances between the atoms and the centre of the channel section are also given.
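One possible in-memory layout for such a three-level record is sketched below in Python. This is an assumption made for illustration, not the TOPOS storage format: all class and field names are hypothetical, and the numerical values are invented rather than taken from a real structure.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ChannelAtom:               # third level: an atom forming the channel
    label: str
    distance: float              # Å, from the centre of the channel section

@dataclass
class ChannelLink:               # second level: a void reached through a channel
    void: str
    length: float                # R, Å
    radius: float                # Rad, Å
    major: bool
    atoms: List[ChannelAtom] = field(default_factory=list)

    @property
    def chan(self) -> int:       # number of channel atoms (Chan)
        return len(self.atoms)

@dataclass
class EnvAtom:                   # second level: an environmental atom
    label: str
    distance: float              # R, Å, to the void centre
    solid_angle: float           # SA, % of 4π steradians

@dataclass
class VoidRecord:                # first level: the central elementary void
    name: str
    kind: str                    # "ZA" (major), "ZB" or "ZC" (minor)
    r_sd: float
    g3: float
    channels: List[ChannelLink] = field(default_factory=list)
    env_atoms: List[EnvAtom] = field(default_factory=list)

    def most_significant_atom(self) -> EnvAtom:
        """Environmental atom with the largest solid angle."""
        return max(self.env_atoms, key=lambda a: a.solid_angle)

# Invented example values, for illustration only.
link = ChannelLink("ZA2", 1.8, 2.1, True,
                   [ChannelAtom("X1", 2.1), ChannelAtom("X2", 2.1), ChannelAtom("X3", 2.1)])
void = VoidRecord("ZA1", "ZA", 1.5, 0.08, [link],
                  [EnvAtom("X1", 2.8, 20.0), EnvAtom("X2", 3.1, 12.5)])
print(void.channels[0].chan, void.most_significant_atom().label)
```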

Figure 15: A TOPOS window containing the information on the adjacency matrix of the Voronoi‑Dirichlet graph for α-AgI.

In contrast to the tiling approach, the adjacency matrix of the Voronoi-Dirichlet graph allows TOPOS to compute a number of geometrical parameters of cages and channels that are important for predicting some physical properties of the substance, in particular ionic conductivity and ion-exchange capacity. Note that, in general, the Voronoi-Dirichlet graph does not coincide with the dual net. The dual net always has the same topology for any spatial embedding of the initial net, whereas the topology of the Voronoi-Dirichlet graph depends on the geometrical properties of the crystal space. The physical meanings of the Voronoi-Dirichlet graph parameters are summarized in Table 3; recall that the graph nodes and edges correspond to elementary voids and channels.

Table 3: Physical meaning of Voronoi-Dirichlet graph parameters

Parameter	Meaning
R_sd of node	The radius of an atom that can be accommodated in the void under the influence of the crystal field distorting the spherical shape of the atom
G₃ of node	Sphericity degree for the nearest environment of the void; the smaller G₃, the closer the void shape to a sphere
Radius of edge	Effective radius of the channel between two voids
Length of edge	Length of the channel between two voids
Connected subgraph	Migration path, a set of the elementary voids and channels available for mobile particles
A set of all connected subgraphs	Conduction pattern of the substance. The dimensionality of the conduction pattern determines the dimensionality of conductivity (1D, 2D, or 3D)
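The last two rows of Table 3 can be illustrated with a small Python sketch that keeps only the channels wide enough for a mobile particle and then collects the connected subgraphs. The function name, the input layout and the numerical values are hypothetical; periodicity is ignored, so the sketch finds migration paths within a finite fragment and does not determine the dimensionality of the conduction pattern.

```python
from collections import defaultdict, deque

def migration_paths(voids, channels, min_radius):
    """voids: iterable of void labels; channels: (void_a, void_b, radius) triples.
    Returns the connected subgraphs (sets of voids) that remain after
    discarding channels whose radius is below min_radius."""
    adjacency = defaultdict(set)
    for a, b, radius in channels:
        if radius >= min_radius:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, paths = set(), []
    for start in voids:
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:                      # breadth-first search
            current = queue.popleft()
            component.add(current)
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        paths.append(component)
    return paths

voids = ["ZA1", "ZA2", "ZA3", "ZA4"]
channels = [("ZA1", "ZA2", 1.8), ("ZA2", "ZA3", 1.2), ("ZA3", "ZA4", 2.0)]
print(migration_paths(voids, channels, min_radius=1.5))
```

In this invented example the narrow channel between ZA2 and ZA3 is discarded, so the fragment splits into two separate migration paths.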

3.2.3. Analysis of packings

Besides the model of the atomic net, TOPOS can represent the structure as a packing of atoms or atomic groups. In this case the adjacency matrix of the crystal structure contains interatomic contacts, not bonds. This representation can be obtained by running AutoCN in the Solid Angles mode (part 2.1). The program IsoTest uses this mode internally to enumerate all possible atom packings that can be selected in the crystal structure. IsoTest forms all non-empty subsets of the set {NAtoms} of all kinds of atoms and generates the packing net (cf. part 2.3) by considering all faces of atomic VDPs constructed within a given subset. For instance, for NaCl three ion packings will be considered, corresponding to the subsets {Na}, {Cl} and {Na, Cl}. For each of them IsoTest computes topological indices and performs topological analysis according to Scheme 4.
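The enumeration of subsets is easy to reproduce; the following Python sketch (with a hypothetical function name, not IsoTest code) lists all non-empty subsets of the kinds of atoms, each of which would define one candidate packing.

```python
from itertools import combinations

def atom_subsets(kinds):
    """All non-empty subsets of the distinct kinds of atoms."""
    kinds = sorted(set(kinds))
    return [set(combo)
            for size in range(1, len(kinds) + 1)
            for combo in combinations(kinds, size)]

print(atom_subsets(["Na", "Cl"]))
```

For n kinds of atoms there are 2ⁿ − 1 such subsets, so the number of packings to be analyzed grows quickly with the chemical complexity of the compound.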

3.2.4. Hierarchical topological analysis

At the highest level of the topological analysis (Scheme 4) IsoTest analyzes all graph representations with the algorithm described in part 2.3. Simultaneously, IsoTest arranges the compounds by structure types using the definition of Lima-de-Faria et al. (1990). The results are output into a textual *.it2 file as shown below for comparing simple sulfates with binary compounds.

------------------

Isotypic compounds

-----------------------------

Topological type of 1:Li2(SO4)

-----------------------------

(SO4)+Li

3:CaF2: F+Ca

-----------------------------

Topological type of 6:CaSO4

-----------------------------

(SO4)+Ca

1:NaCl: Na+Cl

-----------------------------

Topological type of 7:ZnSO4

-----------------------------

Structure Type of 34:SiO2 S+Zn<->Si O<->O

Zn+(SO4)

5:ZnS: Zn+S

-----------------------------

Topological type of 17:MgSO4

-----------------------------

(SO4)+Mg

2:NiAs: Ni+As

In particular, the results show that Li₂SO₄ is related to fluorite, CaF₂, if the sulfate ion is considered as a whole, with the oxygen atoms contracted to the sulfur atom. Similar relations are observed for the pairs CaSO₄↔NaCl, ZnSO₄↔ZnS and MgSO₄↔NiAs. However, ZnSO₄ has one more relation, to cristobalite, SiO₂, if the Zn and S atoms correspond to Si atoms.

Thus, the general scheme starts with the analysis of a single net or packing consisting of atoms, ions, molecules or voids, and finishes with the consideration of all possible topological motifs.

4. Processing large amounts of crystal structure data

Most TOPOS procedures and applied programs can work in two modes, Manual or Continuous, corresponding to handling a single compound or large groups of crystal structures, respectively. The only exception is IsoCryst, where the Continuous mode is not available. The Continuous mode places no restriction on the number of entries; the largest world-wide databases, CSD, ICSD and CrystMet, may be processed in one computational cycle using the CIF interface. The data obtained in the Continuous mode are output to external files to be handled with TOPOS or other programs. The main Continuous operations available in TOPOS are listed below.

Operation	Output file format
*DBMS*
Copying, moving, deleting, undeleting, searching, retrieving, exporting, importing database entries	TOPOS database, textual files
Determining chemical composition, searching for errors in data and disordering, transforming adjacency matrix, generating crystal structure representations	TOPOS database
*ADS*
Simplifying atomic net	TOPOS database
Computing CS, ES, VS, determining net topology	textual *.nnt, Microsoft Excel-oriented *.txt
Determining net entanglements and structure group dimensionality	Microsoft Excel-oriented *.txt
Selecting molecular crystal structure groups, constructing molecular VDPs, determining methods of ligand coordination	binary StatPack *.bin
Constructing tiles, computing parameters of natural tiling	textual *.cgd
Determining combinatorial types of tile	binary *.edg, *.pdt, *.vec
Constructing dual net	TOPOS database
*AutoCN*
Computing adjacency matrix	TOPOS database
*DiAn*
Computing interatomic distances and bond angles	textual *.dia, *.ang
*Dirichlet*
Computing atomic VDP parameters	binary StatPack *.bin
Determining combinatorial types of VDP	binary *.edg, *.pdt, *.vec
Constructing Voronoi-Dirichlet graph	TOPOS database
*HSite*
Determining positions of hydrogen atoms	TOPOS database
*IsoTest*
Generating crystal structure representations, computing CS, ES, VS	TOPOS database
Comparing atomic and packing net topologies	textual *.it2
Determining structure types	textual *.ist

5. Outlook

Crystallochemical analysis is still mainly based on the analysis of local geometrical properties of crystal structures, such as interatomic distances and bond angles. In this respect, crystal chemistry remains, to a great extent, stereochemistry. Discovering genuine crystallochemical regularities that govern global properties of crystal structures, such as atomic and molecular net topologies, types of atomic and molecular packings, and sizes and architecture of cages and channels, requires new theoretical approaches, computer algorithms and programs. To find these regularities crystal chemists first need to systematize the huge amounts of crystal data collected in the world-wide electronic databases. This job can be done only with automated computer methods, and TOPOS is intended to implement them. The transformation of stereochemistry into crystal chemistry began in earnest 15-20 years ago and has already given rise to novel scientific branches, such as supramolecular chemistry, reticular chemistry, crystal engineering and crystal design. The TOPOS program package evolves together with crystal chemistry; new algorithms and procedures are implemented every year. Thus, the current state of TOPOS described above should not be considered as something completed, but as a foundation for further development.

Acknowledgements

The TOPOS applied programs Dirichlet, IsoCryst and StatPack were mainly written by Dr. A.P. Shevchenko. I am indebted to my former scientific advisor Prof. V.N. Serezhkin, who initiated the TOPOS project at the end of the 1980s and stimulated its development for a long time. I am grateful to Prof. D.M. Proserpio, who opened my eyes to many crystallochemical problems to which TOPOS can be applied. Discussions with Prof. M. O'Keeffe, Dr. S.T. Hyde and Dr. O. Delgado-Friedrichs have greatly promoted the development of the TOPOS topological algorithms. My PhD students I.A. Baburin, E.V. Peresypkina and M.V. Peskov spent a lot of time testing novel TOPOS features; their painstaking work enabled me to fix many bugs and made TOPOS highly stable.

References

Blatov, V. A. (2004). Cryst. Rev. 10, 249-318.

Blatov, V. A. (2006). Acta Cryst. A62, 356-364.

Blatov, V. A., Carlucci, L., Ciani, G. & Proserpio, D. M. (2004). CrystEngComm, 6, 377–395.

Carlucci, L., Ciani, G. & Proserpio, D. M. (2003). Coord. Chem. Rev. 246, 247–289.

Chung, S. J., Hahn, Th. & Klee, W. E. (1984). Acta Cryst. A40, 42-50.

Conway, J.H. & Sloane, N.J.A. (1988). Sphere Packings, Lattices and Groups. New York: Springer-Verlag.

Delgado-Friedrichs, O. & O’Keeffe, M. (2005). J. Solid State Chem. 178, 2480-2485.

Fischer, W. (1986). Cryst. Res. Technol. 21, 499-503.

Goetzke, K. & Klein, H.-J. (1991). J. Non-Cryst. Solids, 127, 215-220.

Lima-de-Faria, J., Hellner, E., Liebau, F., Makovicky, E. & Parthé, E. (1990). Acta Cryst. A46, 1-11.

O’Keeffe, M. (1979). Acta Cryst. A35, 772-775.

O’Keeffe, M. & Brese, N.E. (1992). Acta Cryst. A48, 663-669.

Peresypkina, E.V. & Blatov, V.A. (2000). Acta Cryst. B56, 1035-1045.

Preparata, F. P. & Shamos, M. I. (1985). Computational Geometry. New York: Springer-Verlag.

Serezhkin, V. N., Mikhailov, Yu. N. & Buslaev, Yu. A. (1997). Russ. J. Inorg. Chem. 42, 1871-1910.

Sowa, H. & Koch, E. (2005). Acta Cryst. A61, 331-342.

Treacy, M.M.J., Foster, M.D. & Randall, K.H. (2006). Micropor. Mesopor. Mater. 87, 255–260.

Yuan, X. & Cormack, A.N. (2002). Comput. Mater. Sci. 24, 343-360.

**Appendix - TOPOS Glossary**

Atomic domain is a region, which 'belongs' to an atom in crystal space. The notion 'belongs' may be understood differently, depending on the task to be solved.

Atomic Voronoi-Dirichlet polyhedron (VDP, Voronoi polyhedron, Dirichlet domain) is a convex polyhedron whose faces are perpendicular to the segments connecting the central atom of the VDP (VDP atom) and other (surrounding) atoms; each face bisects the corresponding segment. The VDPs of all atoms form the normal (face-to-face) Voronoi-Dirichlet partition of crystal space.

Circuit (cycle) is a closed chain of connected atoms.

Coordination sequence (CS) {N_k} is a set of sequential numbers N₁, N₂, … of atoms in the 1st, 2nd, etc. coordination spheres of an atom in the net. The first ten coordination spheres are usually considered in topological classification. The coordination number is equal to N₁, and the graph node is called N₁-connected or N₁-coordinated.

Elementary channel is a free space connecting a couple of elementary voids; the channel corresponds to a VDP edge for the atom forming either of the voids. Such an edge is called line of the elementary channel. Accordingly, the atoms forming the elementary channel are the atoms whose VDPs have common edge coinciding with the channel line.

Elementary void is a region of crystal space with the centre in a vertex of an atomic VDP. The atoms, whose VDPs meet in the centre of a given elementary void, are referred to as atoms forming the elementary void.

Extended Schläfli symbol (ES) contains a detailed description of all shortest circuits for each angle at each non-equivalent atom. Total Schläfli symbol summarizes all the Schläfli symbols for the non-equivalent atoms with stoichiometric coefficients.

Indirect neighbours are contacting atoms, the segment between which does not intersect the VDP face separating these atoms (minor VDP face). Otherwise atoms are called direct neighbours and the corresponding VDP face is called major.

Lattice quantizer is a multilattice embedded into space in such a way that any point of the space is rounded to the nearest node of the quantizer.

Molecular Voronoi-Dirichlet polyhedron is a union of the VDPs of the atoms composing a molecular (0D) structural group. Facet is a face of the VDP of an atom belonging to a molecular structural group. All facets corresponding to contacts between two molecules form faces (boundary surfaces) of adjacent molecular VDPs. Smoothed molecular VDP is a convex polyhedron derived from the molecular VDP by flattening its faces.

Natural tiling is a subdivision of space by tiles that (i) preserves the symmetry of the net, (ii) has strong rings as tile faces, and (iii) contains tiles that are as small as possible.

Radius of spherical domain (R_sd) is the radius of a sphere of VDP volume.

Ring is the circuit without shortcuts, i.e. chains between two ring nodes that are shorter than any chain between the nodes that belongs to the circuit.

Strong ring is the ring that is not a sum of smaller rings.

Tile is a 3D solid (generalized polyhedron) bounded by rings (faces) in such a way that any ring edge is shared between two rings.

Vertex symbol (VS) gives similar information as extended Schläfli symbol, but for rings.

Voronoi-Dirichlet graph is the graph consisting of all vertices and edges of all VDPs in the Voronoi-Dirichlet partition.

[1] http://okeeffe-ws1.la.asu.edu/RCSR/home.htm

[2] http://www.iza-structure.org/databases/

[3] http://epinet.anu.edu.au/

[4] Hereafter a Pascal-like pseudocode is used to describe TOPOS algorithms and data structure.

[5] Hereafter all bold italic terms are explained in TOPOS Glossary (Appendix).

[6] The bold three-letter codes indicate the net topology according to the RCSR nomenclature (http://okeeffe-ws1.la.asu.edu/RCSR/home.htm ).

[7] The CSD Reference Code.

These pages are maintained by the Commission Last updated: 15 Oct 2021

Commission on Crystallographic Computing
