Molecular Medicine 1.
Németh, Flóra
Center for Molecular Fingerprinting
Németh Flóra¹, Dr. Kosmas Kepesidis¹
¹ Center for Molecular Fingerprinting
Introduction
Fourier-transform infrared (FTIR) spectroscopy enables rapid, label-free biochemical profiling of human blood and has strong potential for clinical applications. However, predictive models trained on data from one measurement device often fail to generalize to others due to domain shifts caused by instrument-specific variability.
Aims
We aimed to develop a machine learning approach that learns device-invariant representations from FTIR spectra, enabling robust cross-instrument generalization without requiring paired calibration data.
Methods
We used FTIR spectra of human blood plasma acquired on two instruments. A domain-adversarial neural network (DANN) was trained to learn latent representations that preserve biologically relevant information while suppressing device-specific variation. Training was supervised through auxiliary regression tasks that predict selected routine blood parameters. Performance was evaluated in both within-device and cross-device prediction settings, as well as on an independent downstream classification task.
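The adversarial training described above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a toy linear encoder with a scalar latent, a single regression target, and a binary device label, and every name in it (`dann_gradients`, `sgd_step`, `lam`, the parameter keys) is hypothetical. The step it is meant to show is the gradient reversal: the device classifier minimizes its own loss, while the encoder receives that same gradient with the opposite sign.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def dann_gradients(params, x, y, d, lam):
    """Per-sample gradients for a toy linear DANN (illustrative assumption).

    params: encoder weights "w" (list), regression head "a", "b",
            device-classifier head "c", "e" (hypothetical names).
    x: spectrum as a list of floats, y: blood-parameter target,
    d: device label (0 or 1), lam: gradient-reversal strength.
    """
    w, a, b, c, e = (params[k] for k in ("w", "a", "b", "c", "e"))
    z = sum(wi * xi for wi, xi in zip(w, x))  # latent representation (scalar)
    y_hat = a * z + b                         # auxiliary regression output
    p = sigmoid(c * z + e)                    # predicted probability of device 1

    reg_err = y_hat - y  # derivative of 0.5 * (y_hat - y)^2 w.r.t. y_hat
    dom_err = p - d      # derivative of binary cross-entropy w.r.t. the logit

    # Both heads receive ordinary gradients: each minimizes its own loss.
    grads = {"a": reg_err * z, "b": reg_err, "c": dom_err * z, "e": dom_err}

    # Gradient reversal: the encoder descends the regression loss but
    # ascends the device loss, so the device term enters with a minus sign.
    dz = reg_err * a - lam * dom_err * c
    grads["w"] = [dz * xi for xi in x]
    return grads


def sgd_step(params, grads, lr):
    """Plain gradient-descent update applied to all parameters."""
    new = {k: params[k] - lr * grads[k] for k in ("a", "b", "c", "e")}
    new["w"] = [wi - lr * gi for wi, gi in zip(params["w"], grads["w"])]
    return new
```

With `lam = 0` the sketch reduces to ordinary supervised regression; increasing `lam` pushes the encoder harder toward representations from which the device cannot be predicted.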
Results
Models trained on raw spectra degraded substantially when applied across devices. In contrast, representations learned by the DANN substantially reduced cross-device prediction error, reaching performance comparable to that of within-device models without loss of predictive accuracy. Furthermore, the learned features improved generalization in an independent classification task not used during training, indicating that the model captures domain-invariant biochemical structure.
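The within-device versus cross-device comparison can be organized as a small error table, sketched below. This is an assumed evaluation layout, not the study's protocol: the helper names (`rmse`, `device_error_table`), the `splits` structure, and the use of root-mean-square error as the metric are all hypothetical choices made for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def device_error_table(fit, splits):
    """Train on each device and test on every device (illustrative assumption).

    fit: callable (X, y) -> predict, where predict maps one spectrum to a value.
    splits: {device: {"train": (X, y), "test": (X, y)}}.
    Returns {(train_device, test_device): rmse}. Entries with equal devices
    are within-device errors; the remaining entries are cross-device errors.
    """
    table = {}
    for src, src_split in splits.items():
        predict = fit(*src_split["train"])
        for tgt, tgt_split in splits.items():
            X_test, y_test = tgt_split["test"]
            table[(src, tgt)] = rmse(y_test, [predict(x) for x in X_test])
    return table
```

Running the same table once on raw spectra and once on the learned representations makes the gap between the diagonal and off-diagonal entries directly comparable.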
Conclusion
Domain-adversarial learning enables effective harmonization of FTIR spectral data across measurement devices, improving the robustness and transferability of predictive models. More broadly, the approach offers a general strategy for mitigating domain shifts and supports pooling multi-instrument datasets in a unified modeling framework, facilitating reliable model deployment across diverse measurement settings in biomedical applications.
Funding
This work was supported by the National Research, Development and Innovation Fund of Hungary (project no. 2020-2.1.1-ED-2022-00213) and the EKÖP-KDP Scholarship Programme.