﻿<?xml version="1.0" encoding="utf-8"?>
<ArticleSet>
  <ARTICLE>
    <Journal>
      <PublisherName>Regional Information Center for Science and Technology</PublisherName>
      <JournalTitle>Journal of Information Systems and Telecommunication (JIST)</JournalTitle>
      <ISSN>2322-1437</ISSN>
      <Volume>4</Volume>
      <Issue>16</Issue>
      <PubDate PubStatus="epublish">
        <Year>2016</Year>
        <Month>12</Month>
        <Day>24</Day>
      </PubDate>
    </Journal>
    <ArticleTitle>A Semantic Approach to Person Profile Extraction from Farsi Web Documents</ArticleTitle>
    <VernacularTitle>A Semantic Approach to Person Profile Extraction from Farsi Web Documents</VernacularTitle>
    <FirstPage>1</FirstPage>
    <LastPage>10</LastPage>
    <ELocationID EIdType="doi">10.7508/jist.2016.04.004</ELocationID>
    <Language>en</Language>
    <AuthorList>
      <Author>
        <FirstName>Hojjat</FirstName>
        <LastName>Emami</LastName>
        <Affiliation>Malek-Ashtar University of Technology</Affiliation>
      </Author>
      <Author>
        <FirstName>Hossein</FirstName>
        <LastName>Shirazi</LastName>
        <Affiliation>Malek-Ashtar University of Technology</Affiliation>
      </Author>
      <Author>
        <FirstName>Ahmad</FirstName>
        <LastName>Abdolahzade</LastName>
        <Affiliation>Amirkabir University of Technology</Affiliation>
      </Author>
    </AuthorList>
    <History PubStatus="received">
      <Year>2017</Year>
      <Month>1</Month>
      <Day>14</Day>
    </History>
    <Abstract>Entity profiling (EP), an important task in Web mining and information extraction (IE), is the process of extracting the entities in question and their related information from given text resources. From a computational viewpoint, Farsi is one of the less-studied and less-resourced languages, and it lacks high-quality language processing tools, which underscores the need to develop Farsi text processing systems. As a contribution to EP research, we present a semantic approach to extracting profiles of person entities from Farsi Web documents. Our approach has three major components: (i) pre-processing, (ii) semantic analysis, and (iii) attribute extraction. First, the system takes raw text as input and annotates it using existing pre-processing tools. In the semantic analysis stage, we analyze the pre-processed text syntactically and semantically and enrich the locally processed information with semantic information obtained from a distant knowledge base. We then use a semantic rule-based approach to extract the information related to the persons in question. We demonstrate the effectiveness of our approach by testing it on a small Farsi corpus. The experimental results are encouraging and show that the proposed method outperforms baseline methods.</Abstract>
    <ObjectList>
      <Object Type="Keyword">
        <Param Name="Value">Web mining</Param>
      </Object>
      <Object Type="Keyword">
        <Param Name="Value">information extraction</Param>
      </Object>
      <Object Type="Keyword">
        <Param Name="Value">entity profiling</Param>
      </Object>
      <Object Type="Keyword">
        <Param Name="Value">Farsi language</Param>
      </Object>
    </ObjectList>
    <ArchiveCopySource DocType="Pdf">http://jist.ir/ar/Article/Download/14997</ArchiveCopySource>
  </ARTICLE>
</ArticleSet>