Fig. 1

Workflow for bioactivity data selection and curation. Raw data extracted from ChEMBL were filtered for a specific ‘Standard Type’. Only records that were not flagged as potential duplicates, carried no ‘Data Validity Comment’, and had no concerning ‘Activity Comment’ were retained. Depending on which of the ‘Standard Value’ (SV), ‘Standard Relation’ (SR) and ‘Comment’ fields were populated, different criteria were used to classify records as active or not active.
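
The filtering stage of this workflow can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes records are dictionaries keyed by the ChEMBL activity column names (standard_type, potential_duplicate, data_validity_comment, activity_comment). The set of "concerning" comments and the example standard type (IC50) are hypothetical placeholders, since the caption does not specify them. The active/not active classification is omitted because its criteria are not given here.

```python
# Hypothetical placeholder: the caption does not list which activity
# comments count as "concerning", so these values are assumptions.
CONCERNING_COMMENTS = {"inconclusive", "unspecified"}


def keep_record(record, standard_type):
    """Return True if an activity record passes the three caption filters."""
    # 1. Keep only the requested standard type.
    if record.get("standard_type") != standard_type:
        return False
    # 2. Drop records flagged as potential duplicates (1 or True).
    if record.get("potential_duplicate"):
        return False
    # 3. Drop records carrying any data validity comment.
    if record.get("data_validity_comment"):
        return False
    # 4. Drop records whose activity comment is in the concerning set.
    comment = (record.get("activity_comment") or "").strip().lower()
    return comment not in CONCERNING_COMMENTS


# Toy records for illustration only; not real ChEMBL data.
records = [
    {"activity_id": 1, "standard_type": "IC50", "potential_duplicate": 0,
     "data_validity_comment": None, "activity_comment": None},
    {"activity_id": 2, "standard_type": "Ki", "potential_duplicate": 0,
     "data_validity_comment": None, "activity_comment": None},
    {"activity_id": 3, "standard_type": "IC50", "potential_duplicate": 1,
     "data_validity_comment": None, "activity_comment": None},
    {"activity_id": 4, "standard_type": "IC50", "potential_duplicate": 0,
     "data_validity_comment": "Outside typical range", "activity_comment": None},
    {"activity_id": 5, "standard_type": "IC50", "potential_duplicate": 0,
     "data_validity_comment": None, "activity_comment": "Inconclusive"},
    {"activity_id": 6, "standard_type": "IC50", "potential_duplicate": 0,
     "data_validity_comment": None, "activity_comment": "Active"},
]

curated = [r for r in records if keep_record(r, "IC50")]
```

In this toy input only records 1 and 6 pass all of the filters. The others fail on standard type, duplicate flag, validity comment, and activity comment, respectively.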