New approaches for extracting Arabic keyphrases

Mahmoud Nabil, Amir F. Atiya, Mohamed Aly

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Keyphrase extraction is important in many applications, such as search engine optimization, clustering, summarization, and sentiment analysis. Keyphrases matter because of the semantic meaning they carry: they can serve as descriptors for documents. In this paper, we compare four approaches for extracting keyphrases from Arabic documents. The first method uses the KP-Miner keyphrase extraction system. The second method uses Arabic natural language processing tools (a stemmer and a part-of-speech tagger) to filter candidate patterns, which are then weighted by the term frequency-inverse document frequency (TF-IDF) algorithm. The third method uses Google's Word2Vec library to weight the resulting patterns by measuring the similarity between each candidate pattern and the document title. The fourth method combines the weightings from the second and third methods.
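The weighting ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes candidate phrases have already been extracted, that a phrase or title vector is the mean of its word vectors, and that the two scores are combined by linear interpolation; the abstract specifies none of these details. The names `tf_idf`, `title_similarity`, `combined_score`, and the `alpha` parameter are hypothetical, and the `embeddings` dictionary stands in for vectors that would come from a trained Word2Vec model.

```python
import math
from collections import Counter


def tf_idf(candidates_per_doc):
    """Score each candidate phrase in each document by TF * IDF."""
    n_docs = len(candidates_per_doc)
    # Document frequency: number of documents containing each candidate.
    df = Counter()
    for cands in candidates_per_doc:
        df.update(set(cands))
    scores = []
    for cands in candidates_per_doc:
        tf = Counter(cands)
        total = len(cands)
        scores.append(
            {p: (c / total) * math.log(n_docs / df[p]) for p, c in tf.items()}
        )
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(tokens, embeddings):
    """Average the vectors of known tokens (assumed pooling strategy)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def title_similarity(phrase, title, embeddings):
    """Similarity between a candidate phrase and the document title."""
    pv = mean_vector(phrase.split(), embeddings)
    tv = mean_vector(title.split(), embeddings)
    if pv is None or tv is None:
        return 0.0
    return cosine(pv, tv)


def combined_score(tfidf_score, sim_score, alpha=0.5):
    """Hypothetical combination: linear interpolation of the two weights."""
    return alpha * tfidf_score + (1 - alpha) * sim_score
```

For example, `tf_idf([["a b", "a b", "c"], ["c", "d"]])` gives the phrase `"c"` a weight of zero in both documents because it appears in every document, which is the behavior that lets TF-IDF suppress corpus-wide terms. In practice the two scores are on different scales, so some normalization before combining them would likely be needed.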

Original language: English (US)
Title of host publication: Proceedings - 1st International Conference on Arabic Computational Linguistics
Subtitle of host publication: Advances in Arabic Computational Linguistics, ACLing 2015
Editors: Alexander Gelbukh, Khaled Shaalan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 133-137
Number of pages: 5
ISBN (Electronic): 9781467391559
DOIs
State: Published - Feb 29 2016
Event: 1st International Conference on Arabic Computational Linguistics, ACLing 2015 - Cairo, Egypt
Duration: Apr 17 2015 → Apr 20 2015

Publication series

Name: Proceedings - 1st International Conference on Arabic Computational Linguistics: Advances in Arabic Computational Linguistics, ACLing 2015

Other

Other: 1st International Conference on Arabic Computational Linguistics, ACLing 2015
Country/Territory: Egypt
City: Cairo
Period: 04/17/15 → 04/20/15

Keywords

  • Keyphrases extraction
  • POS tagging
  • Stemming

ASJC Scopus subject areas

  • Computer Science Applications
  • Signal Processing
  • Linguistics and Language
