Big data leads the way for structural chemistry
A huge milestone for structural chemistry has been achieved with the addition of the millionth structure into the Cambridge Structural Database (CSD). It was determined by a team at Shandong University in China.
The Cambridge Crystallographic Data Centre (CCDC), which masterminded the landmark, is a world-leading specialist in structural chemistry data, software and knowledge for materials and life science research and application.
The Cambridge Structural Database is a global repository of highly curated experimentally determined organic and metal-organic crystal structures. It is used by scientists in more than 70 countries to understand how molecules behave and interact in three dimensions in the solid form and ultimately how this affects physical properties.
With interest in Big Data continuing to grow at a time when machine learning and automation are changing the way pharmaceutical, agrochemical and many other industries work, the milestone could not be better timed.
Large volumes of data such as this enable scientists to draw fuller answers from a more complete and diverse body of information, giving them confidence in the insights drawn from the data.
CCDC’s focus on ensuring the integrity of the data within the CSD through stringent quality assurance and control steps adds further value, giving scientists confidence that they are obtaining the highest-quality information to inform their research.
This rich data resource, alongside advanced search, 3-D data mining, analysis and visualisation software from CCDC, enables scientists from both industry and academia to further their research and predict new outcomes.
Knowledge derived from the CSD also underpins computational chemistry and molecular modelling. It is relied on by industry for the development and manufacturing of new drugs, and within academia to teach chemistry.
Dr Jürgen Harter, CEO of CCDC, said: “This is truly an important milestone not only for CCDC but also for the wider scientific community.
“In addition to the value that lies in large sets of data like this to help scientists inform their research and decision making, we also pride ourselves on the high quality of the data, a result of the hard work of our expert in-house database team.
“Maintaining a policy of strict data interrogation ensures the value of the plentiful insights that can be drawn from the CSD, avoiding misinformation that can lead to wasted time, resources and ultimately cost.”
CCDC has announced that the 1,000,000th structure is an N-heterocycle produced by a chalcogen bonding catalyst activating multiple reaction steps sequentially. The structure was determined by Yao Wang and co-authors from Shandong University in China and published in the Journal of the American Chemical Society (JACS).
Dr Wang said: “We have used the CSD for over 10 years because it is an excellent platform to report new crystal structures and an outstanding database to find inspirable chemical structures.
“It is a valuable resource to us and to many other scientists around the world so we are very proud to be associated with this milestone for the community.”
This is an exciting time for life science and materials development research, with markets such as China leading the way in scientific discovery and functional materials design.