Abstracts of the Third School (25.10.2022)

The intersection of Optical Chemical Structure Recognition (OCSR) and object detection

by Martijn Oldenhof (9:30)

Chemical compounds usually appear in the literature as 2D images of their molecular structures. Translating these 2D images into machine-readable representations is usually referred to as Optical Chemical Structure Recognition (OCSR). Recognizing atomic-level entities in images of molecules is a challenge in OCSR, and object detection models can be used for this task. In this context, ProbKT is presented: a framework based on probabilistic logical reasoning that allows object detection models to be trained with arbitrary types of weak supervision. In addition, some practical guidelines will be presented on running experiments using the tool 'Weights & Biases'.

Inferring missing data with auto-associators

by Mark Embrechts (10:45)

Missing and faulty data are common issues in large data sets. This talk introduces the use of deep auto-associators for inferring missing data. Typically, the deep auto-associators have five hidden layers of neurons on each side of the bottleneck layer (i.e., 12 layers of neurons in total). The auto-associator networks are trained with a fixed policy (30 passes through the data with mini-batches of 30 samples, using Adam). The only modification from standard auto-associators is that the errors at outputs related to missing data are not backpropagated. This procedure is illustrated with two sample cases: the 2008 ICANN toxicity challenge and the olive oil data.
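
The masking idea described above can be illustrated with a minimal sketch. It is not the speaker's implementation. For brevity it assumes a single tanh bottleneck layer instead of the deep network, plain mini-batch gradient descent instead of Adam, and missing values marked as NaN. The function names (fit_masked_autoassociator, impute) and the learning rate are hypothetical. Only the 30 passes and the mini-batches of 30 samples come from the abstract.

```python
import numpy as np

def fit_masked_autoassociator(X, n_hidden=2, epochs=30, batch_size=30,
                              lr=0.05, seed=0):
    """Train a tiny auto-associator; missing entries of X are NaN."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)                # True where a value is known
    X_in = np.where(observed, X, 0.0)      # placeholder input for missing cells
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, size=(n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):                # 30 passes through the data
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            x, m = X_in[idx], observed[idx]
            h = np.tanh(x @ W1 + b1)       # bottleneck activations
            y = h @ W2 + b2                # reconstruction
            # Key step: the error is zeroed at outputs whose target is
            # missing, so nothing is backpropagated from those outputs.
            err = (y - x) * m / max(int(m.sum()), 1)
            grad_W2 = h.T @ err
            grad_b2 = err.sum(axis=0)
            dh = (err @ W2.T) * (1.0 - h ** 2)
            grad_W1 = x.T @ dh
            grad_b1 = dh.sum(axis=0)
            W1 -= lr * grad_W1
            b1 -= lr * grad_b1
            W2 -= lr * grad_W2
            b2 -= lr * grad_b2
    return W1, b1, W2, b2

def impute(X, params):
    """Keep observed values and fill NaN cells with the network output."""
    W1, b1, W2, b2 = params
    observed = ~np.isnan(X)
    X_in = np.where(observed, X, 0.0)
    y = np.tanh(X_in @ W1 + b1) @ W2 + b2
    return np.where(observed, X, y)
```

A typical call would be `filled = impute(X, fit_masked_autoassociator(X))`, where X is a float array with NaN in the missing cells. Observed cells are returned unchanged, and only the missing cells receive reconstructed values.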