Fig. 2

HOLIgraph outperforms ligand-based and interaction-based XGBoost methods when trained on docked poses of outward-facing OATP1B1. (A) ECFP and RDKit physicochemical descriptors were encoded as numerical vectors and tensors for use in classification (e.g., XGBoost) and Ligand-GNN models, respectively. (B) Protein-ligand interaction fingerprints (PLIFs) obtained from docked poses were encoded as numerical vectors and tensors for use in classifier models and HOLIgraph, respectively. Feature engineering and model optimization are further detailed in the Supplemental Information. (C) Box plots showing the distribution of scores for ligand-based (ECFP, RDKit) and interaction-based XGBoost, Ligand-GNN, and HOLIgraph (left to right). Interaction-based models for the inward- and outward-facing OATP1B1 conformers were evaluated independently (blue and pink, respectively). Mann–Whitney U p-values with Bonferroni correction, indicated by asterisks (*: p < 0.05, **: p < 5e-3), show that HOLIgraph (applied to the docked poses of the outward-facing OATP1B1 conformer) improves balanced accuracy scores compared to all XGBoost models. Comprehensive performance results are reported in Tables S11–S15.
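As a reading aid for panel C, the following minimal sketch illustrates balanced accuracy and how Bonferroni-corrected p-values could map to the asterisk annotations. It assumes binary labels and that the thresholds in the caption are applied to the corrected p-values. The function names and the example p-values are hypothetical and are not taken from the study. The Mann–Whitney U test itself is not reimplemented here; raw p-values are taken as given.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)


def bonferroni(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def stars(p_adj):
    """Map a corrected p-value to the caption's asterisk notation."""
    if p_adj < 5e-3:
        return "**"
    if p_adj < 0.05:
        return "*"
    return "ns"  # not significant (label chosen here for illustration)


def annotate(raw_p_values):
    """Apply Bonferroni correction, then assign asterisks."""
    return [stars(p) for p in bonferroni(raw_p_values)]


if __name__ == "__main__":
    # Hypothetical raw p-values from three pairwise comparisons.
    print(annotate([0.0004, 0.004, 0.03]))
    print(balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because the correction multiplies each p-value by the number of comparisons, a raw p-value that looks significant on its own can lose its asterisk as more model pairs are compared.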