List of subject articles: Semantic Web


    • Open Access Article

      1 - Referral Traffic Analysis: A Case Study of the Iranian Students' News Agency (ISNA)
      Roya Hassanian Esfahani Mohammad Javad Kargar
      Web traffic analysis is a well-known e-marketing activity. Today, most news agencies have moved onto the web, providing a variety of online services to their customers, and the number of online news consumers is increasing dramatically all over the world. A news website usually benefits from several acquisition channels, including organic search, paid search, referral links, direct hits, links from online social media, and e-mails. This article presents the results of an empirical study that analyzes the referral traffic of a news website through data mining techniques. The main methods are correlation analysis, outlier detection, clustering, and model performance evaluation. The results show no significant relationship between the amount of referral traffic coming from a referrer website and that website's popularity. Furthermore, the referrer websites in the study fit into three clusters under the K-means clustering algorithm with squared Euclidean distance. Performance evaluations confirm the significance of the model. Among the detected clusters, the most populated one was labeled "Automatic News Aggregator Websites" by experts. The findings help in understanding the different referring behaviors, which account for around 15% of the overall traffic of the Iranian Students' News Agency (ISNA) website. They are also helpful for developing more efficient online marketing plans, business alliances, and corporate strategies.
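The clustering step named in this abstract can be illustrated with a minimal sketch of K-means using squared Euclidean distance. This is not the authors' implementation: the function names, the caller-supplied initial centroids, and the idea of describing each referrer by a small numeric feature vector are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_euclidean(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], centroids: List[Point],
           max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Plain Lloyd iteration; initial centroids are supplied by the caller."""
    centroids = [list(c) for c in centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical two-feature descriptions of eight referrer websites.
referrers = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0),
             (5.1, 4.9), (9.0, 1.0), (9.2, 1.1), (8.8, 0.9)]
labels, centers = kmeans(referrers, [referrers[0], referrers[3], referrers[5]])
print(labels)
```

Supplying the initial centroids explicitly keeps the sketch deterministic; a practical run would normally try several random initializations.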
    • Open Access Article

      2 - Computing Semantic Similarity of Documents Based on Semantic Tensors
      Navid Bahrami Amir H. Jadidinejad Mojdeh Nazari
      Exploiting the semantic content of texts has always been an important and challenging issue in Natural Language Processing because of its wide range of applications, such as finding documents related to a query, document classification, and computing the semantic similarity of documents. In this paper, a novel corpus-based approach for computing the semantic similarity of texts is proposed, using the Wikipedia corpus organized in a three-dimensional tensor structure. First, the semantic vectors of the words in a document are obtained from the vector space derived from the words in Wikipedia articles; then the semantic vector of the document is formed from its word vectors. The semantic similarity of documents can consequently be measured by comparing their semantic vectors. The vector space of the Wikipedia corpus raises the curse-of-dimensionality challenge because its vectors are high-dimensional. Vectors in a high-dimensional space usually appear very similar to each other, so it becomes meaningless to identify the most appropriate semantic vector for a word. The proposed approach therefore tries to mitigate the curse of dimensionality by reducing the dimensions of the vector space through random indexing. This reduction also significantly improves the memory consumption of the approach. In addition, structured co-occurrence through random indexing makes it feasible to handle synonymous and polysemous words.
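The random indexing idea mentioned in this abstract can be sketched as follows. This is a generic, simplified illustration and not the paper's tensor-based method: it assumes each context (for example, one article) receives a sparse random index vector, and a word's semantic vector is the sum of the index vectors of the contexts it occurs in. All names, the dimension of 64, and the toy contexts are hypothetical.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List


def index_vector(dim: int, nonzeros: int, rng: random.Random) -> List[int]:
    """Sparse ternary vector: a few randomly placed +1/-1 entries, zeros elsewhere."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def random_indexing(contexts: List[List[str]], dim: int = 64,
                    nonzeros: int = 4, seed: int = 0) -> Dict[str, List[int]]:
    """Accumulate, for every word, the index vectors of the contexts it appears in."""
    rng = random.Random(seed)
    word_vectors: Dict[str, List[int]] = defaultdict(lambda: [0] * dim)
    for words in contexts:
        ivec = index_vector(dim, nonzeros, rng)
        for word in words:
            wv = word_vectors[word]
            for i, value in enumerate(ivec):
                wv[i] += value
    return dict(word_vectors)


def document_vector(words: List[str], word_vectors: Dict[str, List[int]],
                    dim: int = 64) -> List[int]:
    """Document vector formed as the sum of its known word vectors."""
    doc = [0] * dim
    for word in words:
        for i, value in enumerate(word_vectors.get(word, [])):
            doc[i] += value
    return doc


def cosine(a: List[int], b: List[int]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


contexts = [["car", "automobile", "road"],
            ["car", "automobile", "engine"],
            ["banana", "fruit"]]
vectors = random_indexing(contexts)
print(cosine(vectors["car"], vectors["automobile"]))
```

The vector length stays fixed at `dim` no matter how many contexts the corpus contains, which is the source of the dimensionality and memory savings the abstract describes.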
    • Open Access Article

      3 - Opinion Mining in Persian Language Using Supervised Algorithms
      Saeedeh Alimardani Abdollah Aghaei
      The rapid growth of the Internet has produced a large amount of user-generated content in social media, forums, blogs, and so on. Automatic analysis is needed to extract valuable information from this content. Opinion mining is the process of analyzing opinions, sentiments, and emotions to recognize people’s preferences about different subjects. One of its main tasks is classifying a text document as positive or negative. Most research in this field has applied opinion mining to English. Although Persian is spoken in several countries, there are few studies of opinion mining in Persian. In this article, a comprehensive study of opinion mining for Persian is conducted to examine its performance under different conditions. First, we create a Persian SentiWordNet using Persian WordNet. This lexicon is then used to weight features. The results of applying three machine learning algorithms, support vector machine (SVM), naive Bayes (NB), and logistic regression, are compared before and after weighting by the lexicon. Experiments show that SVM and logistic regression achieve better results in most cases and that applying semantic orientation (SO) improves the accuracy of logistic regression. Increasing the number of instances and using an unbalanced dataset both have a positive effect on performance. Overall, this research provides better results than other studies of opinion mining in Persian.
    • Open Access Article

      4 - Scalable Community Detection through Content and Link Analysis in Social Networks
      Zahra Arefian Mohammad Reza Khayyam Bashi
      Social network analysis is an important problem that has attracted a great deal of attention in recent years. Social networks offer users many different applications and features; as a result, they have been described as the most important development of recent decades. To exploit the features available in social networks, their communities must first be discovered completely and comprehensively. Many methods have been proposed for exploring communities, and they fall into two groups: community detection through link analysis and through node content. Most research on social networks focuses on only one of these, which leads to a confused and incomplete exploration. Community detection is generally associated with graph clustering; most clustering methods rely on analyzing links and pay no attention to node content, which could improve clustering quality. In this paper, an integral algorithm for scalable community detection is proposed that clusters graphs according to both link structure and node content, with the aim of finding clusters whose members share similar features. To implement the Integral Algorithm, the graph is first weighted according to node content, and the network graph is then analyzed using the Markov Clustering Algorithm; in other words, strong relationships are distinguished from weak ones. To be scalable, the Markov Clustering Algorithm is applied in a multi-level form. The proposed Integral Algorithm was tested on real datasets, and its effectiveness was evaluated.
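The two stages described in this abstract, weighting links by node content and then running Markov clustering, can be sketched roughly as below. This is a plain single-level Markov Clustering loop, not the multi-level Integral Algorithm of the paper. Using Jaccard similarity of keyword sets as the content weight is an assumption made for illustration, and all function names and the toy graph are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Matrix = List[List[float]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Content similarity of two nodes as keyword-set overlap (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_weighted_matrix(n: int, edges: List[Tuple[int, int]],
                            keywords: Dict[int, Set[str]]) -> Matrix:
    """Adjacency matrix with self-loops; each link is weighted by content similarity."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = 1.0  # self-loops keep every column sum positive
    for u, v in edges:
        m[u][v] = m[v][u] = jaccard(keywords[u], keywords[v])
    return m


def normalize_columns(m: Matrix) -> Matrix:
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            m[i][j] /= total
    return m


def expand(m: Matrix) -> Matrix:
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m: Matrix, r: float) -> Matrix:
    """Inflation step: element-wise power, then re-normalize the columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def markov_clustering(m: Matrix, inflation: float = 2.0,
                      iterations: int = 30, eps: float = 1e-6) -> List[List[int]]:
    """Alternate expansion and inflation, then read clusters off the non-empty rows."""
    m = normalize_columns([row[:] for row in m])
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    clusters = {frozenset(j for j, x in enumerate(row) if x > eps) for row in m}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)


# Two triangles joined by one link between nodes 2 and 3 that share no keywords.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
keywords = {0: {"football", "league"}, 1: {"football", "league"},
            2: {"football", "league"}, 3: {"jazz", "guitar"},
            4: {"jazz", "guitar"}, 5: {"jazz", "guitar"}}
print(markov_clustering(content_weighted_matrix(6, edges, keywords)))
```

In this toy graph the bridging link gets weight zero because its endpoints share no keywords, so the content weighting alone separates the two groups before the random-walk steps run.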
    • Open Access Article

      5 - Analysis of expert finding algorithms in social network in order to rank the top algorithms
      Ahmad Agha Kardan Behnam Bozorgi
      The ubiquity of the Internet and social networks has turned question-and-answer communities into environments where users can ask questions about anything or share their knowledge by answering other users’ questions. These knowledge-sharing communities aim to improve user knowledge, which makes it imperative to have a mechanism that can evaluate users’ knowledge level, in other words, “to find experts”. Expert-finding algorithms are therefore needed in social networks and in any other knowledge-sharing environment, such as question-and-answer communities. Various content analysis and link analysis methods exist for expert finding in social networks. This paper challenges four algorithms by applying them to our dataset and analyzing the results in order to compare them. The algorithms suitable for expert finding are identified and ranked. Based on the results and tests, it is concluded that the Z-score algorithm performs better than the others.
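The abstract does not define the Z-score measure. The sketch below assumes the definition commonly used in the expert-finding literature for question-and-answer communities, z = (a - q) / sqrt(a + q), where a is the number of answers a user has posted and q the number of questions asked. The function names and the sample users are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def z_score(answers: int, questions: int) -> float:
    """Assumed definition: (a - q) / sqrt(a + q); 0.0 for users with no activity."""
    total = answers + questions
    if total == 0:
        return 0.0
    return (answers - questions) / math.sqrt(total)


def rank_experts(activity: Dict[str, Tuple[int, int]]) -> List[str]:
    """Order users from highest to lowest z-score."""
    return sorted(activity, key=lambda user: z_score(*activity[user]), reverse=True)


# Hypothetical (answers, questions) counts per user.
activity = {"user_a": (9, 0), "user_b": (8, 8), "user_c": (2, 14)}
print(rank_experts(activity))
```

Under this definition a user who mostly answers gets a positive score, a user who mostly asks gets a negative one, and a balanced user sits near zero.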
    • Open Access Article

      6 - De-lurking in Online Communities Using Repost Behavior Prediction Method
      Omid Reza Bolouki Speily
      Nowadays, with the advent of social networks, a big change has occurred in the structure of web-based services. Online communities (OCs) enable their users to access different types of information through an Internet-based structure, anywhere and at any time. OC services are among the strategies used for the production and reposting of information by users interested in a specific area. Users join a particular domain at will and begin posting. Given the networked structure, one of the major challenges these groups face is a lack of reposting behavior: most users take up a lurking position toward the posts in the forum. De-lurking is a type of social media behavior in which a user breaks an "online silence", or a habit of passive thread viewing, to engage in a virtual conversation. One proposed way to improve de-lurking is to select and display influential posts for each individual. Influential posts are selected so that they are more likely to be reposted, based on each user's interests, knowledge, and characteristics. This article introduces a new method for selecting k influential posts in order to increase the reposting of information. In terms of participation in OCs, users are divided into two groups, posters and lurkers, and some solutions are proposed to encourage lurkers to participate in reposting content. Assessments based on actual repost data from Twitter and from blogs indicate the effectiveness of the proposed method.
    • Open Access Article

      7 - A Semantic Approach to Person Profile Extraction from Farsi Web Documents
      Hojjat Emami Hossein Shirazi Ahmad Abdolahzade
      Entity profiling (EP), an important task in Web mining and information extraction (IE), is the process of extracting the entities in question and their related information from given text resources. From a computational viewpoint, Farsi is one of the less-studied and less-resourced languages, and it suffers from a lack of high-quality language processing tools. This problem underlines the need to develop Farsi text processing systems. As a contribution to EP research, we present a semantic approach for extracting the profiles of person entities from Farsi Web documents. Our approach includes three major components: (i) pre-processing, (ii) semantic analysis, and (iii) attribute extraction. First, our system takes raw text as input and annotates it using existing pre-processing tools. In the semantic analysis stage, we analyze the pre-processed text syntactically and semantically and enrich the locally processed information with semantic information obtained from a distant knowledge base. We then use a semantic rule-based approach to extract the related information of the persons in question. We show the effectiveness of our approach by testing it on a small Farsi corpus. The experimental results are encouraging and show that the proposed method outperforms baseline methods.
    • Open Access Article

      8 - Coreference Resolution Using Verbs Knowledge
      Hasan Zafari Maryam Hourali Heshaam Faili
      Coreference resolution is the problem of determining which mentions in a text refer to the same entity, and it is a crucial and difficult step in every natural language processing task. Despite the efforts that have been made in the past to solve this problem, its performance still does not meet the requirements of today’s applications. Given the importance of verbs in sentences, in this work we incorporate three types of verb information into the coreference resolution problem: the selectional restrictions of verbs on their arguments, the semantic relations between verb pairs, and the fact that the arguments of a verb cannot be coreferent with each other. As a resource to support our model, we automatically generate a repository of semantic relations between verb pairs using Distributional Memory (DM), a state-of-the-art framework for distributional semantics. This resource consists of pairs of verbs associated with their probable arguments, their role mappings, and significance scores based on our measures. Our proposed model encodes verb knowledge as Markov logic network rules on top of the deterministic Stanford coreference resolution system. Experimental results show that this semantic layer improves the recall of the Stanford system while preserving its precision and even improving it slightly.
    • Open Access Article

      9 - Effective Query Recommendation with Medoid-based Clustering using a Combination of Query, Click and Result Features
      Elham Esmaeeli-Gohari Sajjad Zarifzadeh
      Query recommendation is now an inseparable part of web search engines. Its goal is to help users find their intended information by suggesting similar queries that better reflect their information needs. Existing approaches often consider the similarity between queries from only one aspect (e.g., similarity with respect to query text or search results) and do not take into account the different lexical, syntactic, and semantic templates that exist in relevant queries. In this paper, we propose a novel query recommendation method that uses a comprehensive set of features to find similar queries. We combine query text and search result features with bipartite graph modeling of user clicks to measure the similarity between queries. Our method is composed of two separate phases, offline (training) and online (test). In the offline phase, it employs an efficient k-medoids algorithm to cluster queries with tolerable processing and memory overhead. In the online phase, we devise a randomized nearest neighbor algorithm for identifying the most similar queries with a low response time. Our evaluation results on two separate datasets, from the AOL and Parsijoo search engines, show the superiority of the proposed method in improving the precision of query recommendation, e.g., by more than 20% in terms of p@10, compared with some well-known algorithms.
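The p@10 figure quoted in this abstract refers to precision at rank k. A minimal sketch of that metric follows, assuming the usual definition (the fraction of the top k recommendations that are relevant, with k as the denominator even when fewer than k items are returned). The function name and the sample queries are hypothetical and do not come from the paper's datasets.

```python
from typing import List, Set


def precision_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommended queries that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for query in recommended[:k] if query in relevant)
    return hits / k


# Hypothetical ranked recommendations and relevance judgments for one input query.
recommended = ["cheap flights", "flight deals", "hotel booking", "airline tickets"]
relevant = {"cheap flights", "airline tickets", "low cost airlines"}
print(precision_at_k(recommended, relevant, k=4))
```

Averaging this value over all evaluated input queries gives a single p@k score for comparing recommendation methods.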