List of articles by subject: Machine learning


    • Open Access Article

      1 - Confidence measure estimation for Open Information Extraction
      Vahideh Reshadat Maryam Hourali Heshaam Faili
      Prior relation extraction approaches were relation-specific and supervised, yielding new instances of relations known a priori. While effective, this model is not applicable when the number of relations is high or when the relations are not known a priori. Open Information Extraction (OIE) is a relation-independent extraction paradigm designed to extract relations directly from massive and heterogeneous corpora such as the Web. One of the main challenges for an Open IE system is estimating the probability that an extracted relation is correct. A confidence measure indicates how likely an extracted relation is to be a correct instance of a relation among entities. This paper proposes a new method of confidence estimation for OIE called Relation Confidence Estimator for Open Information Extraction (RCE-OIE). It investigates incorporating a set of proposed features into the assignment of a confidence metric using logistic regression. These features cover diverse lexical, syntactic, and semantic knowledge, as well as extraction properties such as the number of distinct documents from which extractions are drawn, the number of relation arguments, and their types. We applied the proposed confidence measure to the extractions of Open IE systems and examined how it affects the performance of the results. Evaluations show that incorporating the designed features is promising and that the accuracy of our method is higher than that of the base methods while keeping almost the same performance as them. We also demonstrate how semantic information such as coherence measures can be used in feature-based confidence estimation of Open Relation Extraction (ORE) to further improve performance.
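The idea of scoring an extraction with logistic regression over extraction properties can be sketched as follows. This is a minimal illustration and not the RCE-OIE implementation: the three features, the toy training rows, and the function names `train_logistic` and `confidence` are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def confidence(w, b, x):
    """Estimated probability that an extraction with features x is correct."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical features per extraction (assumed for illustration):
# [log(1 + distinct source documents), number of arguments, argument types recognized (0/1)]
X = [
    [math.log1p(12), 2, 1],
    [math.log1p(8), 2, 1],
    [math.log1p(5), 3, 1],
    [math.log1p(1), 2, 0],
    [math.log1p(1), 4, 0],
    [math.log1p(2), 3, 0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = extraction judged correct, 0 = incorrect (toy labels)

w, b = train_logistic(X, y)
well_supported = confidence(w, b, [math.log1p(10), 2, 1])
poorly_supported = confidence(w, b, [math.log1p(1), 3, 0])
print(f"well supported: {well_supported:.3f}, poorly supported: {poorly_supported:.3f}")
```

The output of `confidence` can be thresholded or used to rank extractions; a real system would add the lexical, syntactic, and semantic features the abstract mentions.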
    • Open Access Article

      2 - Information Bottleneck and its Applications in Deep Learning
      Hassan Hafez-Kolahi Shohreh Kasaei
      Information Theory (IT) has been used in Machine Learning (ML) since the early days of the field. In the last decade, advances in Deep Neural Networks (DNNs) have led to surprising improvements in many applications of ML. The result has been a paradigm shift in the community toward revisiting previous ideas and applications in this new framework. Ideas from IT are no exception. One idea being revisited by many researchers in this new era is the Information Bottleneck (IB), a formulation of information extraction based on IT. The IB is promising for both analyzing and improving DNNs. The goal of this survey is to review the IB concept and demonstrate its applications in deep learning. The information-theoretic nature of IB also makes it a good candidate for illustrating the more general concept of how IT can be used in ML. Two important concepts are highlighted in this narrative: i) the concise and universal view that IT provides on seemingly unrelated methods of ML, demonstrated by explaining how IB relates to minimal sufficient statistics, stochastic gradient descent, and variational auto-encoders, and ii) the common technical mistakes and problems caused by applying ideas from IT, which are discussed through a careful study of some recent methods that suffer from them.
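For reference, the IB objective is commonly written as the Lagrangian below. This is the standard form from the literature rather than notation taken from the abstract; the symbols are assumptions for the illustration: X is the input, Y the target, T the compressed representation, I(·;·) mutual information, and β > 0 a trade-off parameter.

```latex
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y),
\qquad \text{subject to the Markov chain } Y \leftrightarrow X \leftrightarrow T .
```

Small β favors compression of X, while large β favors keeping the information in T that is relevant to Y.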
    • Open Access Article

      3 - Social Groups Detection in Crowd by Using Automatic Fuzzy Clustering with PSO
      Ali Akbari Hassan Farsi Sajad Mohammadzadeh
      Detecting social groups is an important and complex problem that has received attention recently. This process, and the relations between group members, will be necessary for human-like robots in the near future. Moving in a group means being a subsystem of that group. In other words, a group of two or more persons can be considered to move in the same direction at the same speed. All datasets contain some information about the trajectories and labels of the members. The aim is to detect social groups containing two or more persons or to detect the individual motion of a person. To detect social groups, the proposed method uses automatic fuzzy clustering with Particle Swarm Optimization (PSO), which does not need to know the number of groups. First, the locations of all people in frequent frames are detected, and the average of the locations is given to the automatic fuzzy clustering with PSO. The proposed method provides reliable results on valid datasets. It is compared with a method that provides better results but needs training data for its training step, whereas the proposed method does not require training at all. This characteristic makes the proposed method easier to implement on robots. The indexing results show that the proposed method can automatically find social groups without knowing the number of groups and without requiring any training data.
    • Open Access Article

      4 - A Study of Fraud Types, Challenges and Detection Approaches in Telecommunication
      Kasra Babaei ZhiYuan Chen Tomas Maul
      Fraudulent activities have been rising globally, causing companies to lose billions of dollars and suffer severe financial damage. Researchers have proposed various approaches in different applications, and studying these approaches can help us obtain a better understanding of the problem. The aim of this paper is to investigate different aspects of fraud prevention and detection in telecommunication. This study presents a review of different fraud categories in telecommunication, the challenges that hinder the detection process, and some proposed solutions to overcome them. The performance of some state-of-the-art approaches is also reported, followed by our guidelines and recommendations for choosing the best metrics.
    • Open Access Article

      5 - AI based Computational Trust Model for Intelligent Virtual Assistant
      Babu Kumar Ajay Vikram Singh Parul Agarwal
      The Intelligent Virtual Assistant (IVA), also called an AI assistant or digital assistant, is software developed as a product by organizations such as Google, Apple, Microsoft, and Amazon. A virtual assistant is based on Artificial Intelligence and processes natural language commands given by humans, which makes it human-friendly. It helps the user work more efficiently and saves time. Voice-controlled IVAs have recently seen enormous growth on mobile phones and as standalone devices in people's homes. The IVA is very useful for illiterate and visually impaired people around the world. While research has analyzed the expected advantages and drawbacks of these devices for IVA users, few studies have empirically assessed the need for security and trust in an individual's decision to use IVAs. In this work, IVA users and non-users (N=1000) are surveyed to understand and analyze the barriers and motivations to adopting IVAs, how concerned users are about data privacy and trust with respect to organizational compliance and the social contract related to IVA data, and how these concerns have affected the acceptance and use of IVAs. We used a Naïve Bayes classifier to compute trust in IVA devices and to evaluate the probability of using different trusted IVA devices.
    • Open Access Article

      6 - An Effective Method of Feature Selection in Persian Text for Improving the Accuracy of Detecting Request in Persian Messages on Telegram
      Zahra Khalifeh Zadeh Mohammad Ali Zare Chahooki
      In recent years, data received from social media has increased exponentially. Social media have become valuable sources of information for many analysts and businesses seeking to expand their business. Automatic document classification is an essential step in extracting knowledge from these sources. In automatic text classification, words are assessed as a set of features. Selecting useful features from each text reduces the size of the feature vector and improves classification performance. Many algorithms have been applied to the automatic classification of text. Although all the methods proposed for other languages are applicable and comparable, studies on classification and feature selection for Persian text have been insufficient. The present research is conducted on Persian text, and the introduction of a Persian dataset is part of its contribution. In this article, an innovative approach is presented to improve the performance of Persian text classification. The authors extracted 85,000 Persian messages from the Idekav-system, which is a Telegram search engine. The new idea presented in this paper for processing and classifying this textual data is to expand the feature vector by adding selected features, using the most widely used feature selection methods based on Local and Global filters. The new feature vector is then filtered by applying a secondary feature selection. This secondary phase selects the more appropriate features among those added in the first step, to enhance the effect of applying wrapper methods on classification performance. In the third step, the combined filter-based methods and a combination of the results of different learning algorithms are used to achieve higher accuracy. At the end of the three selection stages, a method was proposed that increased accuracy up to 0.945 and reduced training time and computation on the Persian dataset.
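A filter-based feature selection step of the kind described above can be sketched as follows. The abstract does not name the specific filters, so the chi-square statistic is an assumption chosen for illustration; the toy English tokens, the class labels, and the function names `chi_square_scores` and `select_top_k` are also hypothetical and stand in for the Persian Telegram messages.

```python
from collections import Counter

def chi_square_scores(docs, labels, positive="request"):
    """Score each term by the chi-square statistic of its 2x2 term/class table."""
    n = len(docs)
    n_pos = sum(1 for label in labels if label == positive)
    df_all = Counter()  # documents containing the term
    df_pos = Counter()  # positive-class documents containing the term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            df_all[term] += 1
            if label == positive:
                df_pos[term] += 1
    scores = {}
    for term, df in df_all.items():
        a = df_pos[term]         # positive docs with the term
        b = df - a               # negative docs with the term
        c = n_pos - a            # positive docs without the term
        d = (n - n_pos) - b      # negative docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Toy tokenized messages (assumed data, not from the paper's dataset).
docs = [
    ["please", "send", "price", "list"],
    ["please", "help", "buy", "phone"],
    ["need", "price", "please"],
    ["news", "today", "weather"],
    ["match", "today", "result"],
    ["weather", "news", "price"],
]
labels = ["request", "request", "request", "other", "other", "other"]

scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, 4)
print(selected)
```

Terms that appear mostly in one class score highest, so the reduced vocabulary keeps the words that best separate request messages from the rest. A wrapper stage, as mentioned in the abstract, would then evaluate candidate subsets with an actual classifier.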