Arabic Extractive Summarization Using Pre-Trained Models

Yasmin Einieh
Amal AlMansour
Amani Jamal

Abstract

Automatic Text Summarization (ATS) is a crucial area of study in Natural Language Processing (NLP) because of the vast amount of information available online. Extractive summarization is one approach to generating summaries: it selects important sentences from the original document without altering their wording. While many methods for Arabic text summarization exist, deep learning applications are still in their early stages, and there is a shortage of available datasets. Compared with English, fewer summarization experiments have been conducted on Arabic, owing to the language's unique characteristics. This study aims to fill this gap by experimenting with several pre-trained models for summarizing Arabic text, namely QARiB, AraELECTRA, and AraBERT-base, all trained using the KALIMA dataset. The AraBERT model performed exceptionally well, scoring 0.44, 0.26, and 0.44 on the ROUGE-1, ROUGE-2, and ROUGE-L measures, respectively.
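
For readers unfamiliar with the ROUGE-1 and ROUGE-2 measures mentioned above, the following is a minimal sketch of how an n-gram overlap score of this kind can be computed. It is illustrative only and is not the evaluation code used in the study. The function names `ngrams` and `rouge_n` are hypothetical, the whitespace tokenization is an assumption (Arabic text would normally need normalization and a proper tokenizer), and the choice of the F1 variant is also an assumption, since the abstract does not state whether precision, recall, or F1 is reported.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 between a candidate summary and one reference summary.

    Assumption: both strings are split on whitespace; a real evaluation
    of Arabic summaries would apply language-specific preprocessing.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "a b e f"
    candidate = "a b c d"
    print("ROUGE-1 F1:", rouge_n(candidate, reference, n=1))
    print("ROUGE-2 F1:", rouge_n(candidate, reference, n=2))
```

ROUGE-L, the third measure reported, is based on the longest common subsequence rather than fixed-length n-grams, so it is not covered by this sketch.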

Article Details

How to Cite
Einieh, Y., AlMansour, A., & Jamal, A. (2023). Arabic Extractive Summarization Using Pre-Trained Models. Journal of King Abdulaziz University: Computing and Information Technology Sciences, 12(1), 63 –. https://doi.org/10.4197/Comp.12-1.6