
Carbohydrate Structure Database

Carbohydrates are one of the major constituents of living cells. They provide mechanical stability to the cell wall and play an important role in signal transduction, cell-cell recognition, and the immunological properties of microorganisms. The importance of providing carbohydrate data to the scientific community for biomedical and immunological research can hardly be overestimated. However, in contrast to other disciplines that study the molecular basis of life, glycomics has yet to benefit fully from information technology. Universal integration standards and computer-assisted tools in glycomics are still in the making. Many existing carbohydrate databases focus on particular properties, use incompatible formats, and do not provide complete coverage, and most of them suffer from poor data quality.

The Carbohydrate Structure Database (CSDB) aims to close this gap through its curated content and cross-database integration, thus bringing glycomics to the same level of integrity as exists in genomics and proteomics. CSDB has been continuously developed and updated since 2004. It now provides data on bacterial, archaeal, plant, fungal, and protistal carbohydrates and glycoconjugates with published chemical sequences. Currently, it is the only free database with primary data on carbohydrate structures from these taxonomic domains published up to 2023.

The two key features of this project are coverage and data consistency. The database contains structures of ~33K carbohydrates and glycoconjugates (including glycoproteins and glycolipids) associated with ~17K microorganisms in ~15K publications. Coverage approaches all glycans of microorganisms and fungi reported up to 2023, and of plants up to 2000. On average, the database grows by ~1000 structures annually.

CSDB stores structural, taxonomic, bibliographic, assigned NMR-spectroscopic, and other data (elucidation methods, publication abstracts, conformational, biochemical, and genetic data, etc.) on carbohydrates with a known sequence. The data sources were the import and manual re-annotation of other databases (including CarbBank), manual and semi-automated retrospective processing of publications, and user data submissions. All data were checked for consistency by experts in carbohydrate biochemistry prior to upload and corrected when necessary. This makes CSDB one of the few primary glycoinformatic databases with fully curated content. A comparison of the consistency of freely available carbohydrate databases showed high data quality in CSDB.

The CSDB interface includes a web user part, an administrator part, and gateways for automated data interchange with other databases. Currently it is cross-linked with NCBI PubMed, NCBI Taxonomy, GlyTouCan, ICD-11, MonosaccharideDB, ImmuneEpitopeDB, and other resources. Users can search the database by fragments of structure, bibliography, taxonomic annotations, fragments of NMR spectra, composition data, trivial names, etc. Integration with certain glycomics projects has been achieved at the level of a programming interface and by bulk data export as a Resource Description Framework (RDF) feed. An unambiguous yet human-readable carbohydrate notation was developed for this project, and tools for translation to and from other known glycan representations are provided.

CSDB serves as a glycoinformatic platform and, in addition to the database itself, hosts a number of services.

The CSDB is freely available at

This project started as "Bacterial CSDB" at the beginning of the 21st century within the framework of the International Science and Technology Center Partner Project. Further funding came from the Russian Foundation for Basic Research, the Russian Federation President grant committee, Deutsches Krebsforschungszentrum, and the Russian Science Foundation. My personal role in this project covered general research and development, database ideology and architecture, data formats, carbohydrate encoding and notation, programming of the engine and services, web design, cross-database interfaces, coordination of literature annotation and database filling, general management, and funding acquisition.

For scholars and students: Invitation to collaboration (in Russian): text (PDF), presentation (PDF).

Supplementary materials:


  CSDB project website

  Merged CSDB poster, 2015 (18th European Carbohydrate Symposium) (JPG, 566Kb)

  Bacterial, plant and fungal CSDB poster, 2014 (6th Baltic Meeting on Bacterial Carbohydrates) (JPG, 637Kb)

  Bacterial CSDB poster, 2009 (4th Baltic Meeting on Bacterial Carbohydrates) (JPG, 876Kb)

  Carbohydrate databases: problems and solutions (lecture)

Selected publications:

Last update: 2023 Dec 22