Search
101 items
-
Using Lexical Resources for Irony and Sarcasm Classification
The paper presents a language dependent model for classification of statements into ironic and non-ironic. The model uses various language resources: morphological dictionaries, sentiment lexicon, lexicon of markers and a WordNet based ontology. This approach uses various features: antonymous pairs obtained using the reasoning rules over the Serbian WordNet ontology (R), antonymous pairs in which one member has positive sentiment polarity (PPR), polarity of positive sentiment words (PSP), ordered sequence of sentiment tags (OSA), Part-of-Speech tags of words (POS) ...... following way (step 1 in Fig 1). First we manually marked each tweet with a (BCMS) or (not_BCMS) mark. After that we used Serbian Morphological Electronic Dictionaries [22] to automatically tag each word with a mark of belonging to a language _word or not belonging _not (resource A in Fig 1). We introduced ...
... (FigLanguages ’07). Association for Computational Linguistics, 1–4. [22] Cvetana Krstev. 2008. Processing of Serbian — Automata, Texts and Electronic Dictionaries. Faculty of Philology, University of Belgrade, Belgrade. [23] Mirjana Mišković. 2001. The particle baš in contemporary Serbian. Pragmatics ...
... a language dependent model for classification of statements into ironic and non-ironic. The model uses various language resources: morphological dictionaries, sentiment lexicon, lexicon of markers and a WordNet based ontology. This approach uses various features: antonymous pairs obtained using the rea- ... Miljana Mladenović, Cvetana Krstev, Jelena Mitrović, Ranka Stanković. "Using Lexical Resources for Irony and Sarcasm Classification" in Proceedings of the 8th Balkan Conference in Informatics (BCI '17), New York, NY, USA: ACM (2017). https://doi.org/
-
Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC
OntoLex, the dominant community standard for machine-readable lexical resources in the context of RDF, Linked Data and Semantic Web technologies, is currently being extended with a dedicated module for Frequency, Attestations and Corpus-based Information (OntoLex-FrAC). We propose a novel component for OntoLex-FrAC that addresses the incorporation of corpus queries for (a) linking dictionaries with corpus engines, (b) enabling RDF-based web services to exchange corpus queries and response data dynamically, and (c) using conventional query languages to formalize the internal structure of collocations, word sketches and ... standardization, digital lexicography, OntoLex, corpus queries, linked data, Linguistic Linked Open Data. Christian Chiarcos, Ranka Stanković, Maxim Ionov, Gilles Sérasset. "Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC" in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Turin, 20-25 May 2024, LREC (2024)
-
SrpELTeC: A Serbian Literary Corpus for Distant Reading
The article presents SrpELTeC, a corpus developed within the COST Action Distant Reading for European Literary History (CA16204). All novels in SrpELTeC were selected, prepared and annotated following the common principles established for all language collections in the European Literary Text Collection (ELTeC). The challenges and solutions in preparing SrpELTeC from scratch are described. All novels were manually encoded in TEI with rich metadata and structural annotation. Automatic annotation included POS tagging, lemmatization and named entities, relying on resources for processing ... digital humanities, Serbian literature, text corpora, distant reading, linked data, named entity recognition, text analytics. Ranka Stanković, Cvetana Krstev, Duško Vitas. "SrpELTeC: A Serbian Literary Corpus for Distant Reading" in Primerjalna književnost, Research Centre of the Slovenian Academy of Sciences and Arts (2024). https://doi.org/10.3986/pkn.v47.i2.03
-
Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names
In this paper we present a rule- and lexicon-based system for the recognition of Named Entities (NE) in Serbian newspaper texts that was used to prepare a gold standard annotated with personal names. It was further used to prepare training sets for four different levels of annotation, which were further used to train two Named Entity Recognition (NER) systems: Stanford and spaCy. All obtained models, together with a rule- and lexicon-based system were evaluated on ... ... In Proceedings of the Demonstrations Session at EACL 2012. Duško Vitas and Cvetana Krstev. 2012. Processing of Corpora of Serbian using Electronic Dictionaries. Prace Filologiczne LXIII:279–292. ...
... Maurel, 2004; Maurel et al., 2011). Each transducer relies in its work on the results of previous transducers and on e-dictionaries of Serbian (Vitas and Krstev, 2012). E-dictionaries play an important role specifically in the recognition of name expressions, since, beside general lexica, they contain ...
... NE classes (organization names) and new sub-classes (e.g. for geopolitical names: regions, super-regions and city counties). In addition, the e-dictionaries of Serbian were also continually improved and enhanced, and that by itself contributes to better performance of SRPNER. The new version of ... Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names" in Proceedings - Natural Language Processing in a Deep Learning World, Incoma Ltd., Shoumen, Bulgaria (2019). https://doi.org/10.26615/978-954-452-056-4_122
-
Microstructural and magnetic properties of electrospun hematite/cuprospinel composites
Phase composition, microstructural and magnetic properties of electrospun hematite/cuprospinel composites were investigated. Samples were synthesized starting with 0 to 10 mol% of copper relative to iron. The round shape of reference electrospun fibres was preserved upon their heating up to 600 °C in air, whereas at 700 °C hollow substructure was additionally formed. In these reference samples the presence of hematite phase was detected by XRPD. A small amount (traces) of Fe3O4/γ-Fe2O3 was also found, due to the ... Electrical and Electronic Engineering, Condensed Matter Physics, Atomic and Molecular Physics and Optics, Electronic, Optical and Magnetic Materials. Mira Ristić, Aleksandar Kremenović, Michael Reissner, Željka Petrović, Svetozar Musić. "Microstructural and magnetic properties of electrospun hematite/cuprospinel composites" in Journal of Materials Science: Materials in Electronics, Springer Science and Business Media LLC (2020). https://doi.org/10.1007/s10854-020-03526-0
-
Увођење доменских и семантичких маркера за област рударства у српске електронске речнике [Introducing Domain and Semantic Markers for the Field of Mining in Serbian Electronic Dictionaries]
... Tomašević, Ranka Stanković, Biljana Lazić INTRODUCING DOMAIN AND SEMANTIC MARKERS FOR THE FIELD OF MINING IN SERBIAN ELECTRONIC DICTIONARIES Summary Semantic markers in electronic dictionaries allow for complex queries for information extraction. When it comes to domain-specific queries, the available se ...
... Speech and Language Processing, Draft of November 7, 2016. Крстев 2008: Cvetana Krstev, Processing of Serbian – Automata, Texts and Electronic dictionaries Faculty of Philology, University of Belgrade, Belgrade. Крстев и др., 2008: Cvetana Krstev, Duško Vitas, Gordana Pavlović-Lažetić, “Resources ... Иван Обрадовић, Александра Томашевић, Ранка Станковић, Биљана Лазић. "Увођење доменских и семантичких маркера за област рударства у српске електронске речнике" in Научни састанак слависта у Вукове дане - Српски језик и његови ресурси: теорија, опис и примене, Београд : Међународни славистички центар на Филолошком факултету, Филолошки факултет (2017). https://doi.org/10.18485/msc.2017.46.3.ch10
-
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian
Abusive speech on social media, including profanities, derogatory speech and hate speech, has reached the level of a pandemic. A system able to detect such texts could help make the Internet and social media a better and more respectful virtual space. Research and commercial applications in this area have so far focused mainly on English. This paper presents the work on building AbCoSER, the first corpus of abusive speech in Serbian. The corpus consists of 6,436 manually annotated ... ... report on attacks and improper behaviour that are the result of national, racial, or religious hatred and intolerance. The system relied on electronic dictionaries of Serbian and local grammars that covered various patterns of hate speech and ways they were covered in newspaper articles. It should be noted ...
... different word suffixes to express different grammatical, syntactic, or semantic features, we also established the relation with the Serbian electronic dictionaries and the management platform Leximirka (Figure 6) [22], which enables the recognition of all inflected forms of trigger words. For the ranking ...
... Serbian. In Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, pages 74–84, 2020. 43 Julien Tissier, Christophe Gravier, and Amaury Habrard. Dict2vec: Learning word embeddings using lexical dictionaries. In Proceedings of the 2017 Conference on Empirical Methods in Natural ... Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13
-
A Tel Platform Blending Academic And Entrepreneurial Knowledge
... textual resources (Fig 2). One of the basic lexical resources is the system of morphological dictionaries of Serbian simple words and compounds in the so-called LADL format [8]. Morphological dictionaries in the same format exist for many other languages, including French, English, Greek, Portuguese ...
... services, and knowledge management. Wiley. com. [10] Stanković, R., Obradović, I., Krstev, C., & Vitas, D. (2011). Production of morphological dictionaries of multi-word units using a multipurpose tool. In Proceedings of the Computational Linguistics-Applications Conference, CLA '11 (pp. 77-84) ...
... called "big" languages. In order to offer support in expert terminology within the multilingual approach, the BAEKTEL platform provides electronic terminological resources, parallel (multilingual) corpora of lessons and texts in written form, and functionalities for searching and browsing ... Ivan Obradović, Ranka Stanković, Jelena Prodanović, Olivera Kitanović. "A Tel Platform Blending Academic And Entrepreneurial Knowledge" in Proceedings of the Fourth International Conference on e-Learning (eLearning-2013), September 2013, Belgrade, Serbia, Belgrade, Serbia : Belgrade Metropolitan University (2013)
-
Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface
Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos (2024) We present ongoing activities on defining a lexicon-corpus interface that will serve as a reference for representing polylexical units, i.e. multiword expressions (of various types: nominal, verbal, etc.), in dedicated lexicons and for linking these entries to their occurrences in corpora. The final aim is to use such resources for the automatic identification of multiword expressions in text. Involving several natural languages aims at the universality of a solution that is not centered on a particular language, as well as at accommodating idiosyncrasies. Discussed are the challenges in the lexicographic description of multiword ... Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos. "Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024)
-
Resource-based WordNet Augmentation and Enrichment
In this paper we present an approach to support production of synsets for Serbian WordNet (SerWN) by adjusting Princeton WordNet (PWN) synsets using several bilingual English-Serbian resources. PWN synset definitions were automatically translated and post-edited, if needed, while candidate literals for Serbian synsets were obtained automatically from a list of translational equivalents compiled from bilingual resources. Preliminary results obtained from a set of 1248 selected PWN synsets show that the produced Serbian synsets contain 4024 literals, out of which 2278 were offered by the system we present in this paper, whereas experts added the remaining 1746. Approximately one half of ... ... alignment with EWN. Bentivogli and Pianta (2003) proposed a method for extending MultiWordNet with phrases, by extracting them from bilingual dictionaries and corpora with techniques similar to those used for collocation extraction. In (Bhingardive et al., 2014) a method for Sanskrit WordNet extension ...
... application was also developed at the FMG, with the support of Tempus BAEKTEL project. It is intended to support the development of terminological dictionaries in various domains within BAEKTEL. It contains 870 concepts in Serbian and English, which were used to produce 1,290 aligned term pairs. The ...
... Column POS shows the PWN synset POS, while the last column SrpPOS shows the POS for the Serbian equivalent, obtained from Serbian morphological dictionaries (Krstev et al., 2010), if this term was found in the dictionary. Definition in English DefEn is intended to help the expert in correcting the ... Ranka Stanković, Miljana Mladenović, Ivan Obradović, Marko Vitas, Cvetana Krstev. "Resource-based WordNet Augmentation and Enrichment" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018)
-
Serbian NER&Beyond: The Archaic and the Modern Intertwinned
In this work we present a Serbian literary corpus that is being developed under the umbrella of the COST Action “Distant Reading for European Literary History” CA16204. Using this corpus of novels written more than a century ago, we have developed and made publicly available a Named Entity Recognition (NER) system trained to recognize 7 different named entity types, with a convolutional neural network (CNN), which has an F1 score of ≈91% on the test dataset. This model was further assessed on a separate evaluation dataset. We conclude with a comparison ... ... Systems. In Proceedings of the 6th Named Entity Workshop, pages 21–27. Cvetana Krstev. 2008. Processing of Serbian. Automata, Texts and Electronic Dictionaries. Faculty of Philology of the University of Belgrade. Cvetana Krstev, Jelena Jaćimović, Branislava Šandrih, and Ranka Stanković. 2019. Ana- ...
... conclusions and plans for the future work were stated in Section 6. 2 Related Work The existence of large-scale lexical resources for Serbian, e-dictionaries in particular (Krstev, 2008), coupled with local grammars in the form of finite-state transducers (Vitas and Krstev, 2012), enabled the development ...
... XML-TEI tags used to preserve the format of original editions. Authors’ solution was based on the cascades of finite-state automata and both general dictionaries and those built specifically for the project. The evaluation showed that the slot error rate of name tagging was 6.1%. A dataset of literary entities ... Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić. "Serbian NER&Beyond: The Archaic and the Modern Intertwinned" in Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications, INCOMA Ltd. Shoumen, BULGARIA (2021). https://doi.org/10.26615/978-954-452-072-4_141
-
Дигиталне библиотеке у рударству и геологији са посебним освртом на представљање сиве литературе [Digital Libraries in Mining and Geology with Special Emphasis on the Presentation of Grey Literature]
Given the need to find information stored in the various forms of documentation generated in the fields of mining and geology at the Faculty of Mining and Geology of the University of Belgrade, development of the digital library ROmeka@RGF was started on Omeka, a platform for displaying digital collections. A significant part of the documentation is so-called grey literature, which mostly takes the form of multi-volume documentation. The first challenge overcome was linking the different multi-volume parts of project reports into a single whole that would be easily accessible and searchable. ... given to relational dictionaries which are designed to define document relations. We will also present some language resources for Serbian language which are used to improve information retrieval. Keywords: digital libraries, grey literature, Omeka, language resources, dictionaries. ...
... Aleksandra, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. „Managing mining project documentation using human language technology“. The Electronic Library Vol. 36 Issue: 6 (2018): 993-1009. Ћирковић, Сњежана. „Сива литература – камелеон информационих ресурса“. Инфотека Год. 18, бр. 1 (2018): ... Биљана Лазић, Александра Томашевић, Михаило Шкорић. "Дигиталне библиотеке у рударству и геологији са посебним освртом на представљање сиве литературе" in Научна конференција Библиоинфо – 55 година од покретања наставе библиотекарства на високошколском нивоу, Београд 18. мај 2017., Филолошки факултет Универзитета у Београду (2019). https://doi.org/10.18485/biblioinfo.2017.ch13
-
Developing Termbases for Expert Terminology under the TBX Standard
... the like. This is especially important in the case of domain specific texts as in the fields of geology or mining. Thus, appropriate electronic morphological dictionaries are needed [18]. [Fig. 4: Web application for RudOnto browse and search] The system ...
... about the lemma and the inflection class is supplied by a web service developed by University of Belgrade HLT Group based on Serbian electronic morphological dictionaries. The export of information on all inflective forms is currently under development. Finally, hyphenation of each word is specified as ...
... al information using Serbian electronic morphological dictionaries and a web service developed by HLT Group from University of Belgrade. There is still much work to be done in this area, in the first place an enhancement of domain specific morphological dictionaries of terminology related ... Ranka Stanković, Ivan Obradović, and Miloš Utvić. "Developing Termbases for Expert Terminology under the TBX Standard" in Natural Language Processing for Serbian - Resources and Applications, Belgrade : University of Belgrade, Faculty of Mathematics (2014)
-
Novel cerium and praseodymium doped phosphate tungsten bronzes: Synthesis, characterization, the behavior in the Briggs-Rauscher reaction and photoluminescence properties
Tijana Maksimović, Pavle Tančić, Jelena Maksimović, Dimitrije Mara, Marija Ilić, Rik Van Deun, Ljubinka Joksović, Maja Pagnacco (2023) Due to their interesting and potentially useful properties, phosphate tungsten bronzes are constantly being studied and attract a lot of attention. In the present work, two different metallic elements belonging to the group of rare-earth metals, cerium and praseodymium, were used as dopants for phosphate tungsten bronzes. Novel cerium and praseodymium doped phosphate tungsten bronzes were successfully synthesized and further characterized by thermal analyses, Fourier-transform infrared spectroscopy, X-ray powder diffraction, scanning electron microscopy with energy-dispersive X-ray spectrometer and ... Electrical and Electronic Engineering, Atomic and Molecular Physics and Optics, Electronic, Optical and Magnetic Materials, Inorganic Chemistry, Organic Chemistry, Physical and Theoretical Chemistry, Spectroscopy. Tijana Maksimović, Pavle Tančić, Jelena Maksimović, Dimitrije Mara, Marija Ilić, Rik Van Deun, Ljubinka Joksović, Maja Pagnacco. "Novel cerium and praseodymium doped phosphate tungsten bronzes: Synthesis, characterization, the behavior in the Briggs-Rauscher reaction and photoluminescence properties" in Optical Materials, Elsevier BV (2023). https://doi.org/10.1016/j.optmat.2023.114125
-
Proširivanje upita zasnovano na leksičkim resursima [Query Expansion Based on Lexical Resources]
The paper describes how lexical resources for Serbian and software tools developed within the Human Language Technology Group of the University of Belgrade can be used to improve querying. Search results can be substantially improved by using various lexical resources, such as morphological dictionaries and semantic networks. The outlined approach can also be used within the System of scientific, technological and business information, because efficient searching of this valuable resource, given its heterogeneity and size, as well as its predominantly textual content, ... ... can be used for improvement of queries. Search results can be substantially improved by using various lexical resources, such as morphological dictionaries and semantic networks. The outlined approach may be used within the System of scientific, technical and business information. Efficient exploration ...
... langues avec l'ordinateur: De INTEX à NooJ, Presses Universitaires de Franche Compte, Paris, 2007. [4] Fellbaum C. (ed.) (1998) WordNet: An Electronic Lexical Database, The MIT Press. [5] Maurel D., Vitas D., Krstev S., Koeva S., (2007) „Prolex: a lexical model for translation of proper names ... Ranka Stanković, Ivan Obradović, Cvetana Krstev. "Proširivanje upita zasnovano na leksičkim resursima" in SNTPI 09 - Naučno-stručni skup Sistem naučnih, tehnoloških i poslovnih informacija, Beograd 19. i 20. jun 2009, Beograd : Fakultet informacionih tehnologija (2009)
-
Keyword Extraction from Parallel Abstracts of Scientific Publications
... Serbian lemmatizer. For lemmatization, we use Serbian morphological electronic dictionaries and grammars developed within the University of Belgrade Human Language Technology Group [17]. Morphological electronic dictionaries of Serbian for NLP have been developing for many years now. In the dictionary ...
... forms containing all the necessary grammatical information (DELAF) can be generated from it, and subsequently used for various NLP tasks. Serbian e-dictionaries of simple forms have reached a considerable size: they have more than 140,000 lemmas generating more than 5 million forms and 18,000 multi-word ... Slobodan Beliga, Olivera Kitanović, Ranka Stanković, Sanda Martinčić-Ipšić. "Semantic Keyword-Based Search on Structured Data Sources - Third International KEYSTONE Conference, IKC 2017 Gdańsk, Poland, September 11–12, 2017 Revised Selected Papers and COST Action IC1302 Reports, Springer (2017)
-
Глаголи у кухињи и за столом [Verbs in the Kitchen and at the Table]
Цветана Крстев, Биљана Лазић (2015) The paper presents research into the Serbian lexicon of the culinary domain, based on the use of a domain corpus, electronic lexical resources (primarily WordNet and morphological dictionaries) and local grammars. It describes the domain-specific features of these resources, how they are used, and how they complement one another. In particular, it shows how a domain corpus can be used to extract verbs specific to the culinary domain and to describe the ways they are used. A list of verbs with basic data, obtained by applying the presented methods, is given. automatic processing, finite-state transducers, electronic dictionaries, semantic networks, local grammars, culinary domain ... центар, Београд. 3. ВУЈИЧИЋ СТАНКОВИЋ И ДР. 2014: Staša Vujičić Stanković, Cvetana Krstev, Duško Vitas, “Enriching Serbian WordNet and Electronic Dictionaries with Terms from the Culinary Domain”, In The Proceedings of Seventh Global WordNet Conference 2014, eds. Heili Orav, Christiane Fellbaume ...
... Reasoning Applications-2. Springer Berlin Heidelberg, 121-162. 8. КРСТЕВ 2008: Cvetana Krstev, Processing of Serbian – Automata, Texts and Electronic dictionaries. Belgrade: Faculty of Philology, University of Belgrade. 9. КРСТЕВ И ДР. 2014: Cvetana Krstev, Staša Vujičić Stanković, Duško Vitas, “A ...
... paper we present a research of the lexica of the culinary domain in Serbian based on the use of the domain corpus, electronic lexical resources – WordNet and morphological dictionaries – and local grammars. We presented the domain characteristics of these resources, how they can be used for research ... Цветана Крстев, Биљана Лазић. "Глаголи у кухињи и за столом" in Научни састанак слависта у Вукове дане - Српски језик и његови ресурси: теорија, опис и примене, Вол. 44/3, Београд : Међународни славистички центар (2015)
-
Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking
The paper presents the results of research on the preparation of parallel corpora, focusing on their transformation into RDF graphs using the NLP Interchange Format (NIF) for linguistic annotation. We give an overview of the parallel corpus used in this case study, as well as the process of part-of-speech tagging, lemmatization and named entity recognition (NER). We then describe named entity linking (NEL), the conversion of data to RDF, and the incorporation of NIF annotations. The produced NIF files were evaluated by exploring the triplestore with SPARQL queries. Finally, we discuss the linking of Linked ... parallel corpora, named entity linking, named entity recognition, NER, NEL, linked data, NIF, Wikidata. Ranka Stanković, Milica Ikonić Nešić, Olja Perisic, Mihailo Škorić, Olivera Kitanović. "Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
-
Towards a Mining Equipment Ontology
... of terminological resource in corresponding sub-fields, often in the form of controlled dictionaries, which are consistent collections of terms selected for a specific purpose. For example, controlled dictionaries can be derived from RudOnto for the area of Geostatistics, Mine safety, Mineral resource ...
... standardized definition of terms used in a specific area is needed. This goal can be achieved by developing relevant terminological resources in electronic format, preferably including relations of a semantic nature between terms. The simplest semantic relations are those between general and specific ... Ranka Stanković, Ivan Obradović, Olivera Kitanović, Ljiljana Kolonja. "Towards a Mining Equipment Ontology" in Proceedings of the 12th International Conference Research and Development in Mechanical Industry, RaDMI 2012, September 2012, Vrnjačka Banja, Serbia no. 1, Vrnjačka Banja, Serbia : SaTCIP (Scientific and Technical Center for Intellectual Property) Ltd. (2012)
-
Knowledge and Rule-Based Diacritic Restoration in Serbian
In this paper we present a procedure for the restoration of diacritics in Serbian texts written using the degraded Latin alphabet. The procedure relies on the comprehensive lexical resources for Serbian: the morphological electronic dictionaries, the Corpus of Contemporary Serbian and local grammars. Dictionaries are used to identify possible candidates for the restoration, while the data obtained from SrpKor and local grammars assists in making a decision between several candidates in cases of ambiguity. The evaluation results reveal that, depending on the text, accuracy ranges from 95.03% to 99.36%, while the precision (average 98.93%) is always higher than the recall (average 94.94%). ... sciences and technology. In Proceedings of Linguistic Resources and Evaluation Conference, pages 1077–1082. Fellbaum, C., Ed. (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press. Gelfenbeyn, I., Goncharuk, A., Lehelt, V., Lipatov, A., and Shilo, V. (2003). Automatic translation ... Cvetana Krstev, Ranka Stanković, Duško Vitas. "Knowledge and Rule-Based Diacritic Restoration in Serbian" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018): 41-51