K.S. Egorova, Ph.V. Toukach

N. D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Moscow, Russia

KEYWORDS: data quality, error, glycomics, carbohydrates, CCSD, CarbBank, database, BCSDB

Journal of Chemical Information and Modeling, 2012, v.52(11), p.2812-2814

DOI: 10.1021/ci3002815

Systematization and classification of carbohydrates contribute greatly to the development of modern biomedical sciences. CCSD (CarbBank) data constitute a significant part of nearly all existing carbohydrate databases. However, these data have not been verified since their original deposition. During the expansion of the Bacterial Carbohydrate Structure Database (BCSDB) project, we checked the quality of CCSD data and found that about 35% of records contained errors. CCSD data cannot be used without manual verification, and CCSD errors migrate from database to database.

