ABSTRACT

With the development of IoT technology, medical and personal health data are now collected in diverse locations. This creates opportunities to use medical data for novel services, but preserving data privacy is a key consideration. Privacy-preserving medical data collection can be achieved by applying local differential privacy (LDP), which adds noise to the data on the user's side before it is collected. However, the performance of machine learning (ML) models degrades significantly when they are trained on noisy data. The purpose of this chapter is to train ML models, such as decision trees, on LDP data. Instead of applying the ML model to the noisy data as is, we use copulas to generate synthetic data from which the effects of the LDP noise have been removed. Simulation results using real and synthetic data confirm that the proposed algorithm is effective for decision trees, deep neural networks, and k-nearest neighbors.