Search
339 items
-
It-Sr-NER: Web Services for Recognizing and Linking Named Entities in Text and Displaying Them on a Web Map
The paper presents the results of the project “It-Sr-NER: Web services for named entities recognition, linking and mapping,” in which teams from the University of Turin and the Society for Language Resources and Technologies JeRTeh participated, and whose goal was to develop the It-Sr-NER web service for annotating named entities in text and displaying them on a map. Named entities in these services are names of persons, places, organizations, demonyms (ethnicities), events and works of art. Olja Perišić, Ranka Stanković, Milica Ikonić Nešić, Mihailo Škorić. "It-Sr-NER: Web Services for Recognizing and Linking Named Entities in Text and Displaying Them on a Web Map" in Infotheca, Belgrade : Faculty of Philology, University of Belgrade (2023). https://doi.org/10.18485/infotheca.2023.23.1.3
-
Keyword-Based Search on Bilingual Digital Libraries
This paper outlines the main features of Biblisha, a tool that offers various possibilities for enhancing queries submitted to large collections of aligned parallel text residing in a bilingual digital library. Biblisha supports keyword queries as an intuitive way of specifying information needs. Keyword queries, initiated in Serbian or English, can be expanded semantically, morphologically and into the other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ... Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
-
Bilingual lexical extraction based on word alignment for improving corpus search
Jelena Andonovski, Branislava Šandrih, Olivera Kitanović. "Bilingual lexical extraction based on word alignment for improving corpus search" in The Electronic Library, Emerald (2019). https://doi.org/10.1108/EL-03-2019-0056
-
Advancing Sentiment Analysis in Serbian Literature: A Zero and Few-Shot Learning Approach Using the Mistral Model
This study presents a sentiment analysis of old Serbian novels from the period 1840-1920, using the Mistral large language model (LLM) with zero-shot and few-shot learning techniques. The main approach innovates by designing research prompts that include text with classification instructions, either without examples or with a few examples, enabling the language model to classify sentiment into positive, negative or objective categories. This methodology aims to simplify sentiment analysis by constraining the responses, thereby increasing precision ... Milica Ikonić Nešić, Saša Petalinkar, Mihailo Škorić, Ranka Stanković, Biljana Rujević. "Advancing Sentiment Analysis in Serbian Literature: A Zero and Few-Shot Learning Approach Using the Mistral Model" in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, Sofia, Bulgaria, 9-10 September 2024, LREC | COLING (2024)
-
Rule-based Automatic Multi-word Term Extraction and Lemmatization
In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from ...... from Serbian texts we have chosen a rule-based approach, which relies on a system of language resources such as morphological e-dictionaries and grammars developed within the University of Belgrade Human Language Technology Group (Vitas et al., 2012). For our approach, production of lemmas for ...
... Preece, A., Li, H. (Eds.), Natural Language Processing and Information Systems. Berlin: Springer, pp. 248--255. Koeva, S. (2007). Multi-word term extraction for Bulgarian. In Proc. of the Workshop on BSNLP: Information Extraction and Enabling Technologies, pp. 59--66. Krstev, C., Obradović ...
... and Kupść, A. (2007). Lemmatization of Polish person names. In Proc. of the Workshop on Balto-Slavonic Natural Language Processing: Information Extraction and Enabling Technologies, Stroudsburg: Association for Computational Linguistics, pp. 27--34. Savary, A., Zaborowski, B., Krawczyk-Wieczorek ...Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac. "Rule-based Automatic Multi-word Term Extraction and Lemmatization" in Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia, 23--28 May 2016, European Language Resources Association (2016)
-
Open Educational Resources in Serbia
... learning platform. She published more than 100 papers in journals and proceedings of scientific conferences, most of them in the area of human language technologies and more than 15 related to TEL. Marija Blagojević University of Kragujevac, Faculty of Technical Sciences Čačak Svetog Save 65 ...
... incorporate knowledge from various language and lexical resources. She is head of Computer Centre for the Mining department, Chairman of Technical comity A037 Terminology in Institute for Standardisation of Serbia and vice president of Language Resources and Technologies Society (JERTEH). She actively ...
... Topics of content (titles and keywords) are visualised by a word cloud in Figure 3. It can be seen that computer science, modeling and language technologies are dominant. 12 Chapter # - will be assigend by editors Figure 3. Word cloud of ...Ivan Obradović, Ranka Stanković, Marija Blagojević, Danijela Milošević. "Open Educational Resources in Serbia" in Current State of Open Educational Resources in the “Belt and Road” Countries, Springer Singapore (2020). https://doi.org/10.1007/978-981-15-3040-1_10
-
Resource-based WordNet Augmentation and Enrichment
In this paper we present an approach to support production of synsets for Serbian WordNet (SerWN) by adjusting Princeton WordNet (PWN) synsets using several bilingual English-Serbian resources. PWN synset definitions were automatically translated and post-edited, if needed, while candidate literals for Serbian synsets were obtained automatically from a list of translational equivalents compiled from bilingual resources. Preliminary results obtained from a set of 1248 selected PWN synsets show that the produced Serbian synsets contain 4024 literals, out of which 2278 were offered by the system we present in this paper, whereas experts added the remaining 1746. Approximately one half of ... ... of this approach to wordnet enrichment. 1. Introduction Semantic networks, such as wordnets, are among the most important resources in Human Language Technologies. Thus, for example, the Princeton WordNet - PWN (Fellbaum, 1998), has been in use for more than two decades as the standard lexical database ...
... management and semantic web technologies compliant to W3C recommendations, as well as latest trends in thesaurus standards. For this research we used the bilingual en-sr version 4.7 in xls format, with 6,939 term entries, and 6,971 aligned pairs of terms. Microsoft language portal10 has published Microsoft ...
... form of a .tbx (ISO 30042:2008) file containing: Concept ID, Definition, Source term, Source language identifier, Target term, Target language identifier. The number of terms differ from language to language, due to varying levels of localization. The Microsoft Terminology Collection is a set of standard ...Ranka Stanković, Miljana Mladenović, Ivan Obradović, Marko Vitas, Cvetana Krstev. "Resource-based WordNet Augmentation and Enrichment" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018)
-
Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian
The training of new tagger models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of gender. The harmonization of resources that were manually annotated within different projects over a long period of time was an important task, enabled by the development of tools that support partial automation. The supporting tools take into account different taggers and tagsets. This paper focuses on TreeTagger and spaCy taggers, and the annotation schema alignment ... ... Computational Linguistics: Human Language Technologies, pages 271–281. Constant, M., Krstev, C., and Vitas, D. (2018). Lexical analysis of serbian with conditional random fields and large-coverage finite-state resources. In Zygmunt Vetulani, et al., editors, Human Language Technology. Challenges ...
... taggers as well as new tagging technologies will be taken into consideration and tested in order to find the best solution for Serbian, a highly-inflected language without fixed word order, for instance RNNTagger.9 Since CRF tagger for Serbian and Croatian language obtained the accuracy over 98% ...
... (2009). Coupling an annotated corpus and a morphosyntactic lexicon for state-of-the-art POS tagging with less human effort. In Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, PACLIC 23, Hong Kong, China, December 3-5, 2009, pages 110–119. Erjavec, T. (2012) ... Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić. "Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian" in Proceedings of the 12th Language Resources and Evaluation Conference, May 2020, Marseille, France, European Language Resources Association (2020)
-
A Data Driven Approach for Raw Material Terminology
Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja (2021) The research presented in this paper aims at creating a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach is based on linking dictionaries related to the raw material domain, both digitally born and printed, into a lexicon structure, aligning terminology from different dictionaries as much as possible. This paper presents the main features of this approach, data used for compilation of the terminological database, the procedure by which it has ... Keywords: raw materials, mining, terminology, dictionary, terminological application, mobile application, digitization, lexical data, corpora, open linked data ... documentation using human language technology. Electron. Libr. 2018, 36, 993–1009. [CrossRef] 32. Stanković, R.; Krstev, C.; Lazić, B.; Škorić, M. Electronic Dictionaries—from File System to lemon Based Lexical Database. In Proceedings of the Eleventh International Conference on Language Resources and ...
... and L1 in the CLIL Classroom. In Proceedings of the Second International Conference on Teaching English for Specific Purposes and New Language Learning Technologies, Niš, Serbia, 22–24 May 2015; Faculty of Electronic Engineering, University of Niš: Niš, Serbia, 2015. 24. Termi—Terminological Web A ...
... approach. A monolingual corpus from the mining domain was developed as part of a project related to managing mining project documentation using human language technology [31] and used within this research in the web and mobile applications. 2.3. General Purpose Morphological Dictionaries Serbian has ...Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja. "A Data Driven Approach for Raw Material Terminology" in Applied Sciences, MDPI AG (2021). https://doi.org/10.3390/app11072892
-
An aproach to Implementation of blended learning in a university setting
... systems, Information system design, and GIS (Geographic Information System) technologies. InfoTech course material is hence not organized on a weekly basis but rather by different topics. Namely, GIS technologies are, for example, one of the topics of the Information system design course ...
... staff of 143 professors and teaching assistants. The development environment for the production of the portal was based on the PHP scripting language, and the portal database was implemented on MS SQL Server 2008. For each of the several hundred courses available at FMG CMS two types of ...
... platform. From a technical point of view, it is a web application for creating Internet-based courses and web sites developed using the PHP scripting language (Hypertext Preprocessor), with a SQL type data base (for example MySQL, PostgreSQL, Microsoft SQL Server or Oracle). It can be run on Windows ...Ivan Obradović, Ranka Stanković, Olivera Kitanović, Jelena Prodanović . "An aproach to Implementation of blended learning in a university setting" in Proceedings of the Second International Conference on e-Learning, eLearning 2011, September 2011, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2011)
-
FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain
The paper gives a brief overview of the theory of frame semantics, on which the FrameNet lexical database is based. The concept of this network is presented, along with the possibilities for its application. The lexical analysis applied in the FrameNet project is also presented, and the differences between frame-based analysis and word-based analysis are pointed out. Several related frames evoked by words from the risk domain are then shown. The paper also presents the NLTK platform, which can be used to ... ... different language resources, as well as the Sketch Engine corpus analysis tool. We have shown that FrameNet offers a detailed and structured mapping, which can then be used in different ways for language processing, especially in text extraction and organizing, as well as in an effort to make human-computer ...
... Serbian.” In Proceedings of The 12th LREC – Language Resources and Evaluation Conference, 3954–3962. Tomašević, Aleksandra, Ranka Stanković, Miloš Utvić, Ivan Obradović, and Božo Kolonja. 2018. “Managing mining project documentation using human language technology.” The Electronic Library, https://doi ...
... domain of mining started as part of a mining project documentation management project using language technologies (Tomašević et al. 2018, 996). Back then, the corpus contained texts from the domain of ... Aleksandra Marković, Ranka Stanković, Natalija Tomić, Olivera Kitanović. "FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.1
-
A Description of Morphological Features of Serbian: a Revision using Feature System Declaration
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...... 33-40. Savary, A. (2008). Computational Inflection of Multi-Word Units – A Contrastive Study of Lexical Approach, In: Linguistic Issues in Language Technologies, Vol. 1, No. 2, CSLI Publications. 819 ...
... 07 Language resource management - Feature Structures – Part 2: Feature System Declaration, ISO/TC 37/SC 4. ISO. (2009) ISO 12620 Terminology and other language and content resources – Data Categories – Specification of data categories and management of a data category registry for language resources ...
... the satisfactory solution. 1. Motivation Description of morphological features of a language is a prerequisite for many NLP applications. This description can be simple or complex depending both on a language and application in question. Considerable efforts in standardizing such a description ... Cvetana Krstev, Ranka Stanković, Duško Vitas. "A Description of Morphological Features of Serbian: a Revision using Feature System Declaration" in Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valletta, Malta : European Language Resources Association (2010)
-
An Approach to Efficient Processing of Multi-Word Units
Efficient processing of Multi-Word Units in the course of development of morphological MWU dictionaries is not easy to achieve, especially when languages with complex morphological structures are concerned, such as Serbian. Manual development of this type of dictionaries is a tedious and extremely slow process. To alleviate this problem we turned to our multipurpose software tool, dubbed LeXimir, in the production of lemmas for e-dictionaries of multi-word units. In addition to that, we developed a procedure aimed at making ... ... A.: Slavonic information extraction and partial parsing. In: Proceedings of the Workshop on Balto-Slavonic Natural Language Processing: Information Extraction and Enabling Technologies, ACL ’07, pp. 1–10. Association for Computational Linguistics, Stroudsburg, PA, USA (2007). URL http://dl.acm.or ...
... ée (2000) 16. Savary, A.: Computational Inflection of Multi-Word Units — A Contrastive Study of Lexical Approaches. Linguistic Issues in Language Technologies 1(2) (2008) 17. Savary, A.: Multiflex: A Multilingual Finite-state Tool for Multi-Word Units. In: CIAA, pp. 237–240 (2009) 18. Savary, A. ...
... before, most of these conditions are satisfied for many languages. However, in order to apply this functionality to a new language it would be necessary to develop a new language-dependent strategy, that is, a new XML document. It is also worth mentioning that the system can be easily modified to ... Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas. "An Approach to Efficient Processing of Multi-Word Units" in Computational Linguistics - Applications, Studies in Computational Intelligence 458 no. 458, Berlin Heidelberg : Springer-Verlag (2013): 109-129. https://doi.org/10.1007/978-3-642-34399-5_6
-
Improvement of geodatabase queries within GeolISS
Ranka Stanković (2008)... Belgrade) IMPROVEMENT OF GEODATABASE QUERIES WITHIN GEOLISS Abstract: We present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for improvement of queries for the geodatabase within the Geological information ...
... languages, such as Serbian. The geological dictionary, developed within GeolISS, supports semantic and multilingual expansions of the query. The Human Language Technology group at the University of Belgrade (HLT) has been developing various lexical resources over a long period, the resources reaching ...
... adjustable tool, a workstation for language resources, labeled WS4LR, which greatly enhances the potential of manipulating each particular resource as well as several resources simultaneously [9]. This tool has already been successfully used for various language processing related tasks including ...Ranka Stanković. "Improvement of geodatabase queries within GeolISS" in Review of the National Center for Digitization, Beograd : Faculty of Mathematics, Belgrade (2008)
-
Combining Heterogeneous Lexical Resources
... (IDE), which allows them to share tools and facilitates the creation of mixed-language solutions. In addition, these languages leverage the functionality of the .NET Framework, which provides access to key technologies that simplify the development of ASP Web applications and XML Web services ...
... the other hand, the XML Schema definition language (XSD) enables the definition of the structure and data types of XML documents. Figure 1 shows the graphical representation of XSD schema of Serbian WN. The XML Path Language (XPath) provides a language for addressing parts of an XML document ...
... Resources | Cvetana Krstev, Duško Vitas, Ranka Stanković, Ivan Obradović, Gordana Pavlović-Lažetić | Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, May 2004, vol. 4 | 2004 | | http://dr.rgf.bg.ac.rs/s/repo/item/0004863 Дигитални репозиторијум Ру ... Cvetana Krstev, Duško Vitas, Ranka Stanković, Ivan Obradović, Gordana Pavlović-Lažetić. "Combining Heterogeneous Lexical Resources" in Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, May 2004, vol. 4, ELRA - European Language Resources Association (2004)
-
Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++
Branislava Šandrih, Ranka Stanković (2020) In science, industry and many research fields, terminology is developing rapidly. Most often, the language that serves as the "lingua franca" for most of these fields is English. As a consequence, for many fields the domain terms are coined in English and later translated into other languages. In this paper we present an approach to the automatic extraction of bilingual terminology for the English-Serbian language pair that relies on an aligned bilingual domain corpus, a terminology extractor for the target language and a tool for chunk alignment. We examine the performance of the method on the domain ... ... Texts”. Natural Language Engineering Vol. 22, no. 4 (2016): 517–548 Koehn, Philipp, Franz Josef Och and Daniel Marcu. “Statistical Phrase-based Translation”. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology ...
... “debugger”, the transcribed version is adopted for everyday use in Information Technologies domain. It is a challenge to produce and maintain up-to-date terminology resources, especially for an under-resourced language, such as Serbian. Today, Serbian terminology is transferred mainly from English ...
... domain terms for the source language (Input ii) is (a) the source language part of LIS-dict including SWTs; (b) the output of the extractor Eng-TE applied to the source language part of the aligned input corpus; 3. The extraction of the set of MWTs in the target language by Serb-TE (Input iii) was ...Branislava Šandrih, Ranka Stanković. "Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.6
-
An Italian-Serbian Sentence Aligned Parallel Literary Corpus
This article presents the construction and relevance of an Italian-Serbian sentence-aligned parallel corpus, delving into the aligned sentences in order to facilitate effective translation between the two languages. The parallel corpus serves as a valuable resource for language experts, researchers, and language enthusiasts, fostering a deeper understanding of linguistic nuances and cultural expressions. By bridging the gap between Serbian and Italian, this corpus opens new avenues for cross-cultural communication and collaboration, and ultimately contributes to the improvement of language-related ...Saša Moderc, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić. "An Italian-Serbian Sentence Aligned Parallel Literary Corpus" in Review of the National Center for Digitization, Belgrade : Faculty of Mathematics, University of Belgrade (2023). https://doi.org/10.5281/zenodo.11203388
-
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian
Abusive speech on social media, including profanity, derogatory speech and hate speech, has reached pandemic levels. A system able to detect such texts could help make the internet and social media a better and more respectful virtual space. Research and commercial applications in this area have so far focused mainly on English. This paper presents work on building AbCoSER, the first corpus of abusive speech in Serbian. The corpus consists of 6,436 manually annotated ... ... n for Computational Linguistics: Human Language Technologies, June 1–June 6, 2018, New Orleans, Louisiana, Vol. 1, 2018. 47 Michael Wiegand, Melanie Siegel, and Josef Ruppenhofer. Overview of the germeval 2018 shared task on the identification of offensive language. In Proceedings of GermEval 2018, ...
... information is a crucial component in human language technology, the FrAC module facilitates sharing and utilising this valued information [9], as presented in Listing 3. 4 Discussion and conclusion In this paper, we presented AbCoSER 1.0, the first corpus of abusive language in Serbian which consists of tweets ...
... as a language successfully, and thus the language column of a tweet could not be relied upon, the annotators were given one more task – to check the language of a tweet and whether it could be interpreted. They needed to mark tweets with meaningless content, tweets written in a foreign language or m ...Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13
-
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others (2020) Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages ... ... Bulgarian were partially funded by the Bulgarian National Interdisciplinary Research e-Infrastructure for Resources and Technologies in favor of the Bulgarian Language and Cultural Heritage, part of the EU infrastructures CLARIN and DARIAH – CLaDA-BG, Grant number DO1-272/16.12.2019. This work ...
... - The Repository is available at: www.dr.rgf.bg.ac.rs Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pages 3232–3242 Marseille, 11–16 May 2020 © European Language Resources Association (ELRA), licensed under CC-BY-NC 3232 A Multilingual Evaluation Dataset ...
... eu/MWSA. Keywords: lexical semantic resources, sense alignment, lexicography, language resource 1. Introduction Lexical semantic resources (LSRs) are knowledge repositories that provide the vocabulary of a language in a descriptive and structured way. One of the famous examples of LSRs are ... Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others. "A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment" in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, European Language Resources Association (ELRA) (2020)
-
Using Lexical Resources for Irony and Sarcasm Classification
The paper presents a language dependent model for classification of statements into ironic and non-ironic. The model uses various language resources: morphological dictionaries, sentiment lexicon, lexicon of markers and a WordNet based ontology. This approach uses various features: antonymous pairs obtained using the reasoning rules over the Serbian WordNet ontology (R), antonymous pairs in which one member has positive sentiment polarity (PPR), polarity of positive sentiment words (PSP), ordered sequence of sentiment tags (OSA), Part-of-Speech tags of words (POS) ...... the 49th Annual Meeting of the ACL: Human Language Technologies: short papers – Volume 2. Association for Computational Linguistics, 564–568. [10] Matthieu Constant, Cvetana Krstev, and Duško Vitas. 2015. Hybrid Lexical Tagging in Serbian. In Proc. of 7th Language & Technology Conference. Fundacja U ...
... 1145/3136273.3136298 1 INTRODUCTION There are many different theories on what irony is and what role it plays in language understanding. According to [33] “Irony is . . . a uniquely human mode of communication, curious in that the speaker says something other than what he or she intends”. Likewise ...
... annotators were asked to decide whether the language of the tweet was recognized and whether the tweet represents an ironic statement.13 The results of the language tagging were used to estimate a binary language classifier (BCMS or not_BCMS). After the language classification we obtained a subset of 1 ...Miljana Mladenović, Cvetana Krstev, Jelena Mitrović, Ranka Stanković. "Using Lexical Resources for Irony and Sarcasm Classification" in Proceedings of the 8th Balkan Conference in Informatics (BCI '17), New York, NY, USA, : ACM (2017). https://doi.org/