Algorithm 3 | Journal of Cheminformatics

Algorithm 3

From: Publishing neural networks in drug discovery might compromise training data privacy

The Robust Membership Inference Attack (RMIA) tests whether a specific target data point m (in our case, a molecular structure x with the corresponding label y) was part of the training data of a target neural network model \(f_{\theta }\). In this attack, shadow models \(s_i, \; i = 1, \dots , N\) are trained on data drawn from a distribution similar to that of \(f_{\theta }\)’s training data (in our case, a similar chemical space). Some shadow models include m in their training data, while others do not. The probability of m is approximated by averaging the correct-class probability over all shadow models. Similarly, the probability of m given \(f_{\theta }\) is approximated by the probability that \(f_{\theta }\) assigns to the correct class. The ratio of these two probabilities (the probability of m given \(f_{\theta }\) divided by the probability of m) is then calculated and compared with the same ratio computed for other points z. The final score is the proportion of points z for which the ratio for m is at least \(\gamma\) times the ratio for z. This score is compared with a decision threshold t to determine whether m is predicted to have been part of \(f_{\theta }\)’s training data.
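The scoring step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `rmia_score` and `is_member`, the argument layout, and the choice of "score at least t" as the decision rule are assumptions made for the example. It also assumes the correct-class probabilities from \(f_{\theta }\) and from the shadow models have already been computed.

```python
from statistics import fmean
from typing import Sequence


def rmia_score(
    target_prob_m: float,
    shadow_probs_m: Sequence[float],
    target_probs_z: Sequence[float],
    shadow_probs_z: Sequence[Sequence[float]],
    gamma: float = 1.0,
) -> float:
    """Fraction of reference points z whose ratio m exceeds by a factor gamma.

    Hypothetical helper: every argument is a correct-class probability.
    shadow_probs_m holds one value per shadow model for the target point m;
    shadow_probs_z[j] holds one value per shadow model for reference point j.
    """
    # Pr(m) is approximated by the mean over all shadow models;
    # Pr(m | f_theta) is the target model's correct-class probability.
    ratio_m = target_prob_m / fmean(shadow_probs_m)

    dominated = 0
    for p_target_z, p_shadow_z in zip(target_probs_z, shadow_probs_z):
        ratio_z = p_target_z / fmean(p_shadow_z)
        # Count z when the ratio for m is at least gamma times the ratio for z.
        if ratio_m >= gamma * ratio_z:
            dominated += 1
    return dominated / len(target_probs_z)


def is_member(score: float, t: float) -> bool:
    """Assumed decision rule: predict membership when the score reaches t."""
    return score >= t


# Toy numbers, invented purely for illustration.
score = rmia_score(
    target_prob_m=0.9,
    shadow_probs_m=[0.8, 0.4, 0.7, 0.5],
    target_probs_z=[0.6, 0.9, 0.5, 0.3],
    shadow_probs_z=[[0.6, 0.6], [0.5, 0.4], [0.5, 0.5], [0.6, 0.6]],
    gamma=1.2,
)
print(score, is_member(score, t=0.5))
```

In this toy case the ratio for m is about 1.5 and the ratios for the four reference points are about 1.0, 2.0, 1.0 and 0.5, so m exceeds three of the four by the factor \(\gamma = 1.2\). Larger values of \(\gamma\) make the test stricter, and the threshold t trades off false positives against false negatives.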
