﻿<?xml version="1.0" encoding="utf-8"?>
<ArticleSet>
  <ARTICLE>
    <Journal>
      <PublisherName>Regional Information Center for Science and Technology</PublisherName>
      <JournalTitle>Journal of Information Systems and Telecommunication (JIST)</JournalTitle>
      <ISSN>2322-1437</ISSN>
      <Volume>13</Volume>
      <Issue>52</Issue>
      <PubDate PubStatus="epublish">
        <Year>2026</Year>
        <Month>2</Month>
        <Day>3</Day>
      </PubDate>
    </Journal>
    <ArticleTitle>Automatic Concept Extraction from Persian News Text Based On Deep Learning</ArticleTitle>
    <VernacularTitle>Automatic Concept Extraction from Persian News Text Based On Deep Learning</VernacularTitle>
    <FirstPage>278</FirstPage>
    <LastPage>288</LastPage>
    <ELocationID EIdType="doi">10.66224/jist.48902.13.52.278</ELocationID>
    <Language>en</Language>
    <AuthorList>
      <Author>
        <FirstName>ZahraSadat</FirstName>
        <LastName>Hosseini</LastName>
        <Affiliation>Department of Electrical and Computer Engineering, Malek-Ashtar University of Technology, Tehran, Iran</Affiliation>
      </Author>
      <Author>
        <FirstName>SayedGholamHassan</FirstName>
        <LastName>Tabatabaei</LastName>
        <Affiliation>Department of Electrical and Computer Engineering, Malek-Ashtar University of Technology, Tehran, Iran</Affiliation>
      </Author>
    </AuthorList>
    <History PubStatus="received">
      <Year>2024</Year>
      <Month>12</Month>
      <Day>18</Day>
    </History>
    <Abstract>&lt;p&gt;One of the most critical problems in natural-language understanding is extracting concepts from text. A concept expresses the essential information in a text. Concept extraction refers to the process of extracting and generating keyphrases, which may or may not appear in the text. Automatic concept extraction from Persian news text is challenging because of the complexity of the Persian language. In this paper, we first review traditional and deep learning-based models for keyphrase extraction and generation. We then present an automated concept extraction algorithm for Persian news that exploits encoder-decoder models. Specifically, the proposed models use the output vectors of the BERT-Base and ParsBERT language models as word embeddings. The evaluation results show that changing the word embedding layer improves recall, precision, and F1 by about 3.15%. Because encoder-decoder models process their inputs sequentially, training time increases; moreover, for long sentences they cannot retain much of the information. Therefore, for the first time, we use mT5-Base, which is based on the Transformer architecture and receives and processes data in parallel. The recall, precision, and F1 of the mT5-Base model on concept extraction are 55.66%, 55.47%, and 55.48%, respectively. The F1 score increases by 19.8% compared with the previous models. This model is therefore effective for extracting concepts from Persian news texts.&lt;/p&gt;</Abstract>
    <ObjectList>
      <Object Type="Keyword">
        <Param Name="Value">Concept Extraction</Param>
      </Object>
      <Object Type="Keyword">
        <Param Name="Value">Deep Learning</Param>
      </Object>
      <Object Type="Keyword">
        <Param Name="Value">Keyphrase</Param>
      </Object>
      <Object Type="Keyword">
        <Param Name="Value">BERT-BASE</Param>
      </Object>
      <Object Type="Keyword">
        <Param Name="Value">ParsBERT</Param>
      </Object>
      <Object Type="Keyword">
        <Param Name="Value">mT5</Param>
      </Object>
    </ObjectList>
    <ArchiveCopySource DocType="Pdf">http://jist.ir/en/Article/Download/48902</ArchiveCopySource>
  </ARTICLE>
</ArticleSet>