Fig. 3

Graphical representation of the modeling workflow. 1 Targets for modeling were identified from five AOPs covering liver, kidney, and developing brain adversities. 2 Bioactivity data for each target were extracted from ChEMBL and curated. 3 Three types of chemical features were evaluated as independent variables. Feature selection was performed using the VSURF method. 4 The datasets were partitioned into training and test sets. The synthetic minority oversampling technique (SMOTE) was applied to the training data only. 5 Six machine learning methods were evaluated for modeling (MLP multilayer perceptron, SVM support vector machine, KNN k-nearest neighbors, GB gradient boosting, XGB extreme gradient boosting, BRF balanced random forest), and their hyperparameters were tuned in cross-validation. The five best models were selected based on their balanced accuracy in external validation. 6 The stability of the selected models was checked by iteratively replicating the training-test split, and the final model was selected based on the average balanced accuracy. Confidence estimation and novelty detection methods were applied to identify reliable predictions.
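
The stability check in step 6 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of repeats, the 20% test fraction, and the one-feature threshold classifier are all assumptions made for illustration, and feature selection, SMOTE, and hyperparameter tuning are omitted. It shows only how balanced accuracy (the mean of per-class recall) might be averaged over repeated stratified training-test splits.

```python
import random
from statistics import mean


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes, (sensitivity + specificity) / 2."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return mean(recalls)


def stratified_split(y, test_fraction, rng):
    """Return (train_idx, test_idx) while preserving class proportions."""
    train, test = [], []
    for c in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == c]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test += idx[:n_test]
        train += idx[n_test:]
    return train, test


def threshold_model(x_train, y_train):
    """Hypothetical stand-in classifier: cut midway between the two class means."""
    m0 = mean(x for x, y in zip(x_train, y_train) if y == 0)
    m1 = mean(x for x, y in zip(x_train, y_train) if y == 1)
    cut = (m0 + m1) / 2
    if m1 > m0:
        return lambda x: int(x > cut)
    return lambda x: int(x < cut)


def stability_check(x, y, n_repeats=20, test_fraction=0.2, seed=0):
    """Repeat the training-test split and average the balanced accuracy."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train, test = stratified_split(y, test_fraction, rng)
        predict = threshold_model([x[i] for i in train], [y[i] for i in train])
        y_pred = [predict(x[i]) for i in test]
        scores.append(balanced_accuracy([y[i] for i in test], y_pred))
    return mean(scores), scores
```

Balanced accuracy is used here because the classes are imbalanced: a model that always predicts the majority class scores 0.5 rather than the inflated value plain accuracy would give. In a fuller version, any oversampling would be applied inside the loop to the training indices only, so that the test set stays untouched.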