Representing a Content-based Link Prediction Algorithm in Scientific Social Networks
Subject Areas: Data Mining
Hosna Solaimannezhad 1, Omid Fatemi 2
1 - Tehran University
2 - Tehran University
Keywords: Link prediction, Social networks, Content-based, Interest
Abstract:
Predicting collaboration between two authors from their research interests is an important problem, and solving it could improve group research. The co-authorship network is one type of social network and one of the most widely used data sets for study. In recent research, considerable attention has been devoted to the computational analysis of these social networks. Their dynamic nature makes them challenging to study. Link prediction is one of the main problems in social network analysis. If a social network is represented as a graph, link prediction means predicting the edges that will be created between nodes in the future. The output of link prediction algorithms is used in various areas, such as recommender systems. Few link prediction studies use the content published by nodes to predict collaboration between them. In this study, a new link prediction algorithm is developed based on people's interests. By analyzing the papers that authors have published, the algorithm extracts the fields they have worked in and predicts their future collaboration. The results of tests on the SID dataset, used as a co-authorship dataset, show that the developed algorithm outperforms all the structure-based link prediction algorithms. Finally, the reasons for the algorithm's efficiency are analyzed and presented.
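The general idea of content-based link prediction described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes, purely for illustration, that an author's interests are approximated by a bag-of-words vector over their paper texts and that candidate links are scored by cosine similarity. The function names (`interest_vector`, `cosine`, `rank_candidate_links`) and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def interest_vector(papers):
    """Term counts over all of an author's paper texts (an assumed interest proxy)."""
    return Counter(word for text in papers for word in text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def rank_candidate_links(papers_by_author, existing_edges):
    """Score every author pair that is not yet connected, highest score first."""
    vectors = {a: interest_vector(p) for a, p in papers_by_author.items()}
    known = {frozenset(edge) for edge in existing_edges}
    scored = [
        (cosine(vectors[a], vectors[b]), a, b)
        for a, b in combinations(sorted(vectors), 2)
        if frozenset((a, b)) not in known
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy data: author -> texts of their papers.
papers_by_author = {
    "A": ["link prediction in social networks"],
    "B": ["topic models for link prediction"],
    "C": ["protein folding simulation"],
    "D": ["social networks analysis"],
}
existing_edges = [("A", "D")]  # A and D have already co-authored

ranking = rank_candidate_links(papers_by_author, existing_edges)
print(ranking[0])  # the most likely future collaboration under this toy model
```

In this toy model, pairs with overlapping vocabulary score higher, and pairs that already share an edge are excluded because link prediction targets edges that do not exist yet. A realistic system would likely replace raw word counts with stop-word filtering and a topic or field representation.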