Collected Item: “Sentiment Analysis of Serbian Old Novels”

Publication type

Paper in proceedings

Document version

published

Language

English

Author(s) (e.g. Milan Marković, Nikola Nikolić)

Ranka Stanković, Miloš Košprdić, Milica Ikonić Nešić, Tijana Radović

Paper title (Title - subtitle)

Sentiment Analysis of Serbian Old Novels

Conference (proceedings) name, place and date

Proceedings of the 2nd Workshop on Sentiment Analysis and Linguistic Linked Data, June 2022, Marseille, France

Publisher (e.g. Beograd : Prosveta)

European Language Resources Association

Year of publication

2022

Abstract in English

In this paper we present the first study of Sentiment Analysis (SA) of Serbian novels from the 1840-1920 period. The sentiment lexicon was prepared from three existing lexicons, NRC, AFINN and Bing, with extensive additional corrections. The first phase of dataset refinement filtered out words not found in the Serbian morphological dictionary; in the second phase, the automatically assigned POS tags and lemmas were manually corrected. The polarity lexicon was extracted, transformed into OntoLex-Lemon and published as an initial version. The complex inflection system of the Serbian language required expanding the sentiment lexicon with inflected forms from Serbian morphological dictionaries. A set of sentences for SA was extracted from 120 novels of the Serbian part of the ELTeC collection, labelled for polarity and used to train several models. Several approaches to SA are compared, starting with variations of the lexicon-based approach, followed by Logistic Regression, Naive Bayes, Decision Tree, Random Forest, SVM and k-NN. The comparison with models trained on a labelled movie reviews dataset indicates that they cannot successfully be used for sentiment analysis of sentences in old novels.

First page

31

Last page

38

Keywords in English (separated by ", ")

sentiment lexicon, sentiment analysis, distant-reading, machine learning, old novels

Link

http://www.lrec-conf.org/proceedings/lrec2022/workshops/SALLD-2/pdf/2022.salld2-1.6.pdf

Broader category of the work according to the MPNT rulebook

M30

Narrower category of the work according to the MPNT rulebook

M33

Access level

Open access

License

Creative Commons – Attribution-NonCommercial-ShareAlike 4.0 International

File format

.pdf