NEW FEATURES OF CSDB LINEAR, AS COMPARED TO OTHER CARBOHYDRATE NOTATIONS

Ph.V. Toukach1,2, K.S. Egorova1

1 N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Moscow, Russia
2 National Research University Higher School of Economics, Moscow, Russia

KEYWORDS: CSDB, Carbohydrate Structure Database, CSDB Linear, carbohydrate notation, glycoinformatics

Journal of Chemical Information and Modeling, 2020, v.60(3), pp. 1276-1289

DOI: 10.1021/acs.jcim.9b00744, PMID: 31790229


The CSDB Linear notation for carbohydrate sequences, used in the Carbohydrate Structure Database (CSDB), has been improved to meet modern requirements in glycoinformatics. The new features include the ability to combine repeating and non-repeating moieties in one structure, support for carbon-carbon bonds, and the use of SMILES encodings for unambiguous chemical description of glycan structures, including aglycons and atypical components. Together with the older capabilities of CSDB Linear, these features allow efficient detection of errors in CSDB while avoiding the informatics problems common to human-readable notations. The CSDB Linear implementation provides translation to other carbohydrate notations and multiple procedures for checking content for errors.

TOC graphic
