Abstract [eng] |
This dissertation describes fully automated means of extracting geometric information (interatomic bond lengths, bond angles and dihedral angles) from small-molecule crystal structures and of using this information to validate novel crystal structures. The Crystallography Open Database (COD), a regularly updated open-access resource of small-molecule crystal structures, was chosen as the source of input data. Software has been developed to prefilter the records from the COD, transform them into a form appropriate for geometric analysis, and extract and organise the geometric parameters. The statistical models chosen to describe groups of chemically similar observations can be used for outlier detection based on Bayesian methods: geometric observations in the molecules under consideration that have not been seen before, or have been seen only rarely, are spotted and marked for further analysis. Software implementing this principle has been developed, and a web-based user interface has been presented. The structure validation method has been tested with novel, retracted and deliberately deformed small-molecule crystal structures. The main conclusions of this dissertation are that the COD is a suitable resource for small-molecule geometric information, and that the developed methods and software tools are sufficient to organise the data from the source database into a library of molecular geometry, which is in turn capable of spotting unusual geometric features in small-molecule crystal structures. |