Dou Luyao, Wang Zehao, Qu Jingchen, Zhou Zhigang, Dai Longzheng
Available online: 2025-09-23
[Objective] This study addresses the challenges of data utilization efficiency and privacy protection in multi-institutional data fusion, with the goal of making data sharing among institutions both more practical and more secure. [Method] A multi-institutional data fusion model, CEFDFM-MI (Cloud-Edge Federated Data Fusion Model for Multi-Institution), is proposed that integrates cloud-edge collaboration with federated learning. Within the cloud-edge collaborative framework, a budget allocation mechanism and an information gain evaluation mechanism are designed to keep the data fusion process efficient. The distributed nature of federated learning, in turn, allows data to be integrated across institutions securely and with privacy preserved. The model is evaluated on the MNIST, CIFAR-10, and CIFAR-100 datasets under independent and identically distributed (IID) scenarios and under non-IID scenarios with low, medium, and high degrees of heterogeneity. [Results] Under IID conditions, CEFDFM-MI achieves a maximum accuracy of 94.52%. In non-IID scenarios with low, medium, and high heterogeneity, it attains peak F1 scores of 73.71%, 74.51%, and 73.45%, respectively. When the models at the edge level are heterogeneous, it improves accuracy by approximately 6%–8% over independent training on individual edge nodes. [Limitations] The study does not address scenarios in which the objectives of the cloud and edge models are misaligned, and the model's applicability in more complex environments remains to be explored. [Conclusions] CEFDFM-MI delivers better global performance than FedAvg and FedProx and handles model heterogeneity across multiple institutions robustly.
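For readers unfamiliar with the FedAvg baseline named above, the following is a minimal sketch of its aggregation step, in which a server averages client parameters weighted by each client's local sample count. It illustrates the baseline only and is not the CEFDFM-MI model; the function name `fedavg`, the dictionary-of-lists parameter layout, and the example values are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def fedavg(
    client_weights: Sequence[Dict[str, List[float]]],
    client_sizes: Sequence[int],
) -> Dict[str, List[float]]:
    """Sample-size-weighted average of client parameters (FedAvg aggregation).

    Hypothetical layout: each client reports a dict mapping a parameter
    name to a flat list of floats, plus its number of local samples.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Dict[str, List[float]] = {}
    for name, reference in client_weights[0].items():
        # Each coordinate is the weighted mean over all clients.
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return aggregated


# Illustrative values only: two institutions holding 1 and 3 samples.
global_params = fedavg(
    [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}],
    [1, 3],
)
print(global_params)
```

Only parameters leave each institution in this scheme; the raw data stays local, which is the property the abstract relies on for privacy-preserving fusion.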