Background: LBC-Net (Local Balance with Calibration by Neural Networks) is a novel deep learning method for nonparametric propensity score (PS) estimation designed to achieve optimal covariate balance (Peng M., 2024). By leveraging local balance and calibration conditions, LBC-Net reduces the risk of model misspecification and has been shown to be among the best-performing PS methods in large samples. Evaluating its performance in small samples is particularly important because many real-world datasets supporting clinical oncology development are limited in size; understanding how well the method performs under these conditions can guide its application in practice.
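As a rough illustration only (a schematic paraphrase of the local balance idea, not the exact LBC-Net objective), local balance can be viewed as requiring the usual inverse-probability-weighted covariate balance condition to hold within kernel-weighted neighborhoods of candidate PS values c_k, with calibration additionally tying the fitted scores to the local treatment rate; the kernel weights omega_ik below are an assumption for illustration.

% Schematic local balance (left) and calibration (right) conditions; omega_{ik} = K_h(e(x_i) - c_k) is an assumed kernel weight,
% Z_i the treatment indicator, X_i the covariate vector, e(x_i) the fitted propensity score.
\[
\frac{1}{n}\sum_{i=1}^{n} \omega_{ik}\left[\frac{Z_i}{e(x_i)} - \frac{1 - Z_i}{1 - e(x_i)}\right] X_i = 0,
\qquad
\frac{1}{n}\sum_{i=1}^{n} \omega_{ik}\left[Z_i - e(x_i)\right] = 0,
\qquad k = 1, \dots, K.
\]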
Objectives: To evaluate the performance of the LBC-Net in small samples.
Methods: Case studies were conducted using simulations and several real-world datasets. For the simulations, cohorts of 100 to 5,000 subjects were created and analyzed with various PS methods: LBC-Net, logistic regression, covariate balancing PS (CBPS), binary cross-entropy (BCE), and Twang (GBM, XGBoost, and Legacy versions). Performance of the PS models was evaluated using Global and Local Standardized Mean Differences (GSD and LSD), bias, Root Mean Square Error (RMSE), and empirical variance of the estimated Average Treatment Effect (ATE). Models were implemented in PyTorch and R; the neural network was a 3-layer feedforward architecture, pretrained with a variational autoencoder and trained with Monte Carlo iterations over 20,000 epochs.
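As a minimal sketch only (not the authors' implementation), the Python/PyTorch fragment below illustrates two of the ingredients named above: a 3-layer feedforward propensity score network and the simulation metrics (bias, RMSE, and empirical variance of the estimated ATE). The layer widths, activations, and the Hajek-type IPW estimator of the ATE are illustrative assumptions.

# Minimal sketch (assumed details, not the LBC-Net implementation).
import numpy as np
import torch
import torch.nn as nn

class PSNet(nn.Module):
    """3-layer feedforward network mapping covariates X to a propensity score in (0, 1)."""
    def __init__(self, n_covariates: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_covariates, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def ipw_ate(y: np.ndarray, t: np.ndarray, ps: np.ndarray) -> float:
    """Hajek-type inverse-probability-weighted ATE estimate (illustrative estimator)."""
    w1, w0 = t / ps, (1 - t) / (1 - ps)
    return float(np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0))

def simulation_metrics(ate_estimates: np.ndarray, true_ate: float) -> dict:
    """Bias, RMSE, and empirical variance of ATE estimates across simulation replicates."""
    err = ate_estimates - true_ate
    return {
        "bias": float(np.mean(err)),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "empirical_variance": float(np.var(ate_estimates, ddof=1)),
    }

In a simulation study of this kind, simulation_metrics would be applied to the vector of ATE estimates collected across replicated cohorts of a given size.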
Results: LBC-Net performed best in large cohorts (5,000 and 1,000 subjects), achieving superior covariate balance. In small samples of 250 and 200 subjects it still performed well. With 250 subjects and a correctly specified model, LBC-Net performed best, with the lowest bias (0.0503%), RMSE (2.1483), and variance (4.6513); under a misspecified model, it performed comparably to CBPS and BCE. LBC-Net struggled with sample sizes below 200 subjects. Two real-world case studies with sample sizes of 996 and 209 patients demonstrated the efficiency of LBC-Net for ATE estimation, yielding the smallest standard errors from nonparametric bootstrap sampling (0.022 and 0.063, respectively) among all methods studied.
Conclusions: LBC-Net consistently outperformed traditional methods in covariate balance and bias reduction in large samples. Its performance, however, diminished in small samples, suggesting that at least 200 subjects are needed for optimal results. Overall, LBC-Net proved effective in both simulated and real-world settings, providing robust treatment effect estimates.