Article Code : 139507081123143466 (DOI : 10.7508/jist.2016.03.004)

Article Title : Preserving Data Clustering with Expectation Maximization Algorithm

Journal Number : 15 Summer 2016

List of Authors

No.  Full Name             Grade                  Degree
1    Leila Jafar Tafreshi  Post Graduate Student  M.A
2    Farzin Yaghmaee       Assistant Professor    PhD


Data mining and knowledge discovery are important technologies for business and research. Despite their benefits in areas such as marketing, business and medical analysis, data mining techniques can also create new threats to privacy and information security. A new class of data mining methods, called privacy preserving data mining (PPDM), has therefore been developed. The aim of research in this field is to develop techniques that can be applied to databases without violating the privacy of individuals. In this work we introduce a new approach that uses fuzzy logic to preserve sensitive information in databases with both numerical and categorical attributes. We map a database into a new one that conceals private information while preserving the benefits of mining. In the proposed method, we apply fuzzy membership functions (MFs) such as Gaussian, P-shaped, Sigmoid, S-shaped and Z-shaped to the private data. We then cluster the modified datasets with the Expectation Maximization (EM) algorithm. Our experimental results show that using fuzzy logic to preserve data privacy yields valid clustering results while protecting sensitive information. The accuracy of the clustering algorithm on the fuzzy data is approximately equal to its accuracy on the original data and is better than that of state-of-the-art methods in this field.
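The pipeline described in the abstract (map a private numerical attribute through a fuzzy membership function, then cluster with EM) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the membership-function parameters, the datasets, or the treatment of categorical attributes, so the `ages` values, the sigmoid parameters `a` and `c`, and the helper names below are hypothetical. The membership-function formulas follow the common textbook definitions, the "P-shaped" function is omitted because the abstract does not define it, and the EM step is a plain one-dimensional Gaussian mixture written with the standard library only.

```python
import math

def gaussian_mf(x, mean, sigma):
    """Gaussian membership function: 1 at the mean, decaying on both sides."""
    return math.exp(-((x - mean) ** 2) / (2 * sigma ** 2))

def sigmoid_mf(x, a, c):
    """Sigmoid membership function with slope a and crossover point c."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))

def s_mf(x, a, b):
    """S-shaped membership function: 0 below a, 1 above b, smooth in between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= (a + b) / 2:
        return 2 * ((x - a) / (b - a)) ** 2
    return 1 - 2 * ((x - b) / (b - a)) ** 2

def z_mf(x, a, b):
    """Z-shaped membership function: the mirror image of s_mf."""
    return 1.0 - s_mf(x, a, b)

def _responsibilities(values, weights, means, variances):
    """E-step: posterior probability of each component for each value."""
    resp = []
    for x in values:
        dens = [w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)]
        total = sum(dens) or 1e-300  # guard against division by zero on underflow
        resp.append([d / total for d in dens])
    return resp

def em_cluster_1d(values, k=2, iters=50):
    """Fit a 1-D Gaussian mixture with EM and return a hard label per value."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation (an assumption): means spread over the range.
    means = [lo + (j + 0.5) * (hi - lo) / k for j in range(k)]
    variances = [max((hi - lo) ** 2 / (4 * k * k), 1e-6)] * k
    weights = [1.0 / k] * k
    n = len(values)
    for _ in range(iters):
        resp = _responsibilities(values, weights, means, variances)
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)  # variance floor keeps densities finite
    resp = _responsibilities(values, weights, means, variances)
    return [max(range(k), key=lambda j: r[j]) for r in resp]

def same_partition(a, b):
    """True if two labelings group the items identically, up to label renaming."""
    return len(set(zip(a, b))) == len(set(a)) == len(set(b))

# Hypothetical private attribute and hypothetical sigmoid parameters.
ages = [22, 25, 27, 24, 61, 65, 58, 63]
fuzzy_ages = [sigmoid_mf(x, a=0.2, c=45.0) for x in ages]

labels_original = em_cluster_1d(ages)
labels_fuzzy = em_cluster_1d(fuzzy_ages)
print("original:", labels_original)
print("fuzzy:   ", labels_fuzzy)
print("same grouping:", same_partition(labels_original, labels_fuzzy))
```

In this sketch the published values are membership degrees in [0, 1] rather than raw ages, and the comparison at the end checks whether clustering the transformed column recovers the same grouping as clustering the raw column. How well that holds on real data depends on the membership function and its parameters, which is what the paper's experiments evaluate.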