Search
106 items
-
An Approach to Efficient Processing of Multi-Word Units
Efficient processing of Multi-Word Units in the course of development of morphological MWU dictionaries is not easy to achieve, especially when languages with complex morphological structures are concerned, such as Serbian. Manual development of this type of dictionary is a tedious and extremely slow process. To alleviate this problem we turned to our multipurpose software tool, dubbed LeXimir, in the production of lemmas for e-dictionaries of multi-word units. In addition to that, we developed a procedure aimed at making ...
Дигитални репозиторијум Рударско-геолошког факултета Универзитета у Београду [ДР РГФ] (Digital Repository of the Faculty of Mining and Geology, University of Belgrade)
... possible applications of both the procedure and LeXimir in various language processing tasks. Cvetana Krstev University of Belgrade — Faculty of Philology, Studentski trg 3, 11000 Belgrade, Serbia e-mail: cvetana@matf.bg.ac.rs Ivan Obradović University of Belgrade — Faculty of Mining and Geology, Djušina ...
... the employees' publications. - The Repository is available at: www.dr.rgf.bg.ac.rs An Approach to Efficient Processing of Multi-Word Units Cvetana Krstev, Ivan Obradović, Ranka Stanković, and Duško Vitas Abstract Efficient processing of MWUs in the course of development of morphological MWU ...
Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas. "An Approach to Efficient Processing of Multi-Word Units" in Computational Linguistics - Applications, Studies in Computational Intelligence 458 no. 458, Berlin Heidelberg : Springer-Verlag (2013): 109-129. https://doi.org/10.1007/978-3-642-34399-5_6
-
E-Connecting Balkan Languages
In this paper we present a versatile language processing tool that can be successfully used for many Balkan languages. This tool relies for its work on several sophisticated textual and lexical resources that were developed for most of Balkan languages. These resources are based on several de facto standards in natural language processing. ...
... E-Connecting Balkan Languages Cvetana Krstev Faculty of Philology University of Belgrade cvetana@matf.bg.ac.rs Ranka Stanković Faculty of Mining and Geology University of Belgrade ranka@rgf.bg.ac.rs ...
... language, Hejzal, Sofia, 2004, 111-157, 2004. [8] C. Krstev, et al. Combining Heterogeneous Lexical Resources, in Proc. of the Fourth International Conference LREC, Lisbon, Portugal, May 2004, vol. 4, pp. 1103-1106, 2004. [9] C. Krstev, R. Stanković, D. Vitas, I. Obradović. WS4LR: A Workstation ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Svetla Koeva. "E-Connecting Balkan Languages" in Proceedings of the Workshop on Multilingual resources, technologies and evaluation for Central and Eastern European Languages, 17 September 2009, eds. C. Vertan, S. Piperidis, E. Paskaleva and Milena Slavcheva, Borovets, Bulgaria : Association for Computational Linguistics Stroudsburg, PA, USA (2009)
-
The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines
In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ...
LR web services, MultiWord Expressions & Collocations, Information Extraction, Information Retrieval
... Engines Cvetana Krstev 1 , Ranka Stanković2 , Duško Vitas 3 , Ivan Obradović4 1 professor, Faculty of Philology, Belgrade, 2 assistant, Faculty of Mining and Geology, Belgrade 3 professor, Faculty of Mathematics, Belgrade, 4 professor, Faculty of Mining and Geology, Belgrade E-mail: cvetana@matf ...
... fr/~unitex/ Krstev, C., et al., (2008). Resources and Methods in the Morphosyntactic Processing of Serbo-Croatian, In Formal Description of Slavic Languages: The Fifth Conference, Leipzig 2003, Zybatow, Gerhild et al. (eds.), Peter Lang: Frankfurt am Main, pp. 3-17... Krstev, C., Stanković ...
Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008)
-
Annotation of the Serbian ELTeC Collection
This paper presents the so-called level-2 edition of the SrpELTeC text collection, developed within the activities of Working Group 2 (Methods and Tools) of COST Action CA 16204 (Distant Reading for European Literary History), together with its schema specification. The level-2 edition is a continuation of the level-1 edition, which is used as input for the morphosyntactic and NER annotation of the novels. The Serbian level-2 processing is described through the required steps, including the methods and tools used in the process. Some statistical data from the Serbian collection of level ...
distant reading, literary corpus, tagging, named entity recognition, lemmatization, ELTeC
Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Mihailo Škorić. "Annotation of the Serbian ELTeC Collection" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.2.3
-
Distribution of canonical syllable types in Serbian
Obradović Ivan, Obuljen Aljoša, Vitas Duško, Krstev Cvetana, Radulović Vanja. "Distribution of canonical syllable types in Serbian" in Text and Language, Structures · Functions · Interrelations. Quantitative Perspectives, P. Grzybek, E. Kelih, J. Mačutek (eds.), Wien:Praesens Verlag (2010): 145-157
-
A bilingual digital library for academic and entrepreneurial knowledge management
A generic knowledge management process of organization, storage and retrieval of knowledge can suitably be fitted in a digital library. In the digital and knowledge age digital libraries can be used in knowledge management to handle intellectual assets and support knowledge creation. A multilingual digital library either stores content in more than one language or provides multilingual query access to monolingual content. In Serbia 18 of 308 scientific journals regularly published are bi-lingual, with papers simultaneously being in English ...
... of Belgrade Đušina 7, 11000 Belgrade, Serbia E-mail: ranka.stankovic@rgf.bg.ac.rs Cvetana Krstev Faculty of Philology University of Belgrade Studentski trg 3, 11000 Belgrade, Serbia E-mail: cvetana@matf.bg.ac.rs Biljana Lazić Faculty of Mining and Geology University of Belgrade ...
... international and national journals and proceedings from scientific conferences and. She has developed several tools for various HLT tasks. Cvetana Krstev is a full-time professor of Librarianship and Informatics, University of Belgrade, Faculty of Philology, Her scientific field is Human Language ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić. "A bilingual digital library for academic and entrepreneurial knowledge management" in Proceeding of 10th International Forum on Knowledge Asset Dynamics - IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10-12 June 2015, Bari : IFKAD (2015)
-
Corpus-based bilingual terminology extraction in the power engineering domain
This paper presents the resources and tools used for the extraction and evaluation of bilingual English-Serbian terminology in the power engineering domain. The resources consist of existing general and domain lexica and a domain parallel corpus; the tools include term extractors for both languages and a tool for aligning segments that belong to corpus sentences. The system was tested by varying the matching function that establishes the presence of an extracted term in an aligned segment (chunk), ranging from very loose to strict. Evaluation of the results showed that the precision of term extraction ...
Tanja Ivanović, Ranka Stanković, Branislava Šandrih Todorović, Cvetana Krstev. "Corpus-based bilingual terminology extraction in the power engineering domain" in Terminology, John Benjamins Publishing Company (2022). https://doi.org/10.1075/term.20038.iva
-
Нове технологије за оживљавање старих текстова [New Technologies for Reviving Old Texts]
distant reading, literary corpus, Serbian language processing, part-of-speech annotation, lemmatization, named entities
Цветана Крстев, Ранка Станковић, Бранислава Шандрих Тодоровић, Милица Иконић Нешић. "Нове технологије за оживљавање старих текстова" in Зборник радова Међународне научне конференције Дигитална хуманистика и словенско културно наслеђе II, Београд, 28-29 јуни 2021., Београд : Савез славистичких друштава Србије (2023)
-
Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian
The training of new tagger models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of a gender. The harmonization of resources that were manually annotated within different projects over a long period of time was an important task, enabled by the development of tools that support partial automation. The supporting tools take into account different taggers and tagsets. This paper focuses on TreeTagger and spaCy taggers, and the annotation schema alignment ...
... the production of the new tagger model for Serbian are: (a) Serbian morphological dictionaries (Cvetana Krstev, Duško Vitas, 2015) (SMD); (b) pre-annotated texts (Duško Vitas, Cvetana Krstev, Ranka Stanković, Miloš Utvić, 2019). 2.1. Serbian morphological dictionaries Serbian morphological ...
... 12(2):36a–47a, December. 8. Language Resource References Cvetana Krstev, Duško Vitas. (2015). Serbian Morphological Dictionary - SMD. University of Belgrade, HLT Group and Jerteh, Lexical resource, 2.0. Duško Vitas, Cvetana Krstev, Ranka Stanković, Miloš Utvić. (2019). Sr-Basic: Annotated ...
Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić. "Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian" in Proceedings of the 12th Language Resources and Evaluation Conference, May 2020, Marseille, France, European Language Resources Association (2020)
-
A WordNet Ontology in Improving Searches of Digital Dialect Dictionary
In this paper, we present a method for automatic generation of a digital resource, which connects all indirect synonyms of a dialect term to all indirect synonyms of a corresponding term in the standard language, aiming to improve the search of a digital dialect dictionary. The method uses SWRL rules defined in the Serbian WordNet ontology to identify sets of synonymous words. It also uses e-dictionaries to produce correct lemmas in standard language that users usually employ in searches. ...
... Stanković2, and Cvetana Krstev3 1 eVox Solutions, Belgrade, Serbia, ml.miljana@gmail.com, 2 University of Belgrade, Faculty of Mining and Geology, Djusina 7, Belgrade, Serbia ranka@rgf.bg.ac.rs 3 University of Belgrade, Faculty of Philology, Studentski trg 3, Belgrade, Serbia cvetana@matf.bg.ac.rs Abstract ...
Miljana Mladenović, Ranka Stanković, Cvetana Krstev. "A WordNet Ontology in Improving Searches of Digital Dialect Dictionary" in New Trends in Databases and Information Systems: ADBIS 2017 Short Papers and Workshops - SW4CH (Semantic Web for Cultural Heritage) 767, Springer International Publishing (2017). https://doi.org/10.1007/978-3-319-67162-8_37
-
Serbian NER&Beyond: The Archaic and the Modern Intertwinned
In this paper we present a Serbian literary corpus that is being developed under the umbrella of the COST Action "Distant Reading for European Literary History" (CA16204). Using this corpus of novels written more than a century ago, we have developed and made publicly available a Named Entity Recognition (NER) system trained to recognize 7 different named entity types, with a convolutional neural network (CNN), which achieves an F1 score of ≈91% on the test dataset. This model was further assessed on a separate evaluation dataset. We conclude by comparing ...
... Systems. In Proceedings of the 6th Named Entity Workshop, pages 21–27. Cvetana Krstev. 2008. Processing of Serbian. Automata, Texts and Electronic Dictionaries. Faculty of Philology of the University of Belgrade. Cvetana Krstev, Jelena Jaćimović, Branislava Šandrih, and Ranka Stanković. 2019. Ana- ...
... uploads/2019/09/DH_BP_2019-Abstract-Booklet.pdf. Cvetana Krstev, Ivan Obradović, Miloš Utvić, and Duško Vitas. 2014. A System for Named Entity Recognition Based on Local Grammars. Journal of Logic and Computation, 24(2):473–489. Cvetana Krstev and Ranka Stanković. 2020. Old or New, we Repair, Adjust ...
Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić. "Serbian NER&Beyond: The Archaic and the Modern Intertwinned" in Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications, INCOMA Ltd. Shoumen, BULGARIA (2021). https://doi.org/10.26615/978-954-452-072-4_141
-
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian
Abusive speech on social media, including profanity, derogatory speech and hate speech, has reached the level of a pandemic. A system able to detect such texts could help make the Internet and social media a better and more respectful virtual space. Research and commercial applications in this area have so far focused mainly on English. This paper presents work on building AbCoSER, the first corpus of abusive speech in Serbian. The corpus consists of 6,436 manually annotated ...
... Detection in Serbian Danka Jokić University of Belgrade, Serbia Ranka Stanković Faculty of Mining and Geology, University of Belgrade, Serbia Cvetana Krstev Faculty of Philology, University of Belgrade, Serbia Branislava Šandrih Faculty of Philology, University of Belgrade, Serbia Abstract ...
... such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other characteristic”. © Danka Jokić, Ranka Stanković, Cvetana Krstev, and Branislava Šandrih; licensed under Creative Commons License CC-BY 4.0 3rd Conference on Language, Data and Knowledge (LDK 2021). Editors: ...
Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13
-
WS4LR - a Workstation for Lexical Resources
... Resources Cvetana Krstev 1 , Ranka Stanković2 , Duško Vitas 3 and Ivan Obradović2 1Faculty of Philology, Studentski trg 3, CS-11000 Belgrade, 2Faculty of Mining and Geology, Đušina 7, CS-11000 Belgrade, 3Faculty of Mathematics, Studentski trg 16, CS-11000 Belgrade E-mail: cvetana@matf.bg.ac ...
... are all offered to the user to choose the appropriate one. Conversely, semantic marks of synset literals can be assigned to dictionary entries (Krstev & al., 2004). For instance, the mark +Comm can be added to all communicative verbs, that is, all literals belonging to the synsets that are hyponyms ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan Obradović. "WS4LR - a Workstation for Lexical Resources" in Proceedings of the Fifth International Conference on Language Resources and Evaluation, Genoa, Italy, May 2006, ELRA - European Language Resources Association (2006)
-
Towards Automatic Definition Extraction for Serbian
The paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured texts in Serbian, with the aim of speeding up dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SANU) were used to model different types of definitions (descriptive, grammatical, referential and synonymous), which have different syntactic and lexical characteristics. The research corpus consists of 61,213 noun definitions, which were analyzed using morphological e-dictionaries and local grammars implemented as finite-state transducers in a corpus processing suite with open ...
... Towards Automatic Definition Extraction for Serbian Stanković Ranka1, Krstev Cvetana1, Stijović Rada2, Gočanin Mirjana2, Škorić Mihailo1 1 University of Belgrade, Serbia 2 Institute for the Serbian Language of SASA, Serbia ...
... dictionary production by automatically associating some grammatical information to lemmas, namely, word class, word forms and different types of markers (Krstev 2008). Since the research related to extraction of dictionary examples has shown that information extraction from a corpus can be used to speed up ...
Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
-
Production of morphological dictionaries of multi-word units using a multipurpose tool
The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation ...
electronic dictionary, Serbian, morphology, inflection, multi-word units, noun phrases, query expansion
... Mining and Geology, Djušina 7, 11000 Belgrade, Serbia Email: {ranka,ivano}@rgf.bg.ac.rs Cvetana Krstev University of Belgrade — Faculty of Philology, Studentski trg 3, 11000 Belgrade, Serbia Email: cvetana@matf.bg.ac.rs Duško Vitas University of Belgrade — Faculty of Mathematics, Studentski trg ...
... vol. 5070. Springer, 2009, pp. 111–141. [10] C. Krstev, R. Stanković, D. Vitas, and I. Obradović, “The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines,” in 6th LREC, Marrakech, Marocco, 2008. [11] C. Krstev, R. Stanković, D. Vitas, and S. Koeva, “E-Connecting ...
Ranka Stanković, Ivan Obradović, Cvetana Krstev, Duško Vitas. "Production of morphological dictionaries of multi-word units using a multipurpose tool" in Proceedings of the Computational Linguistics-Applications Conference, October 2011, Jachranka, Poland, Jachranka, Poland : PTI - Polish Information Processing Society (2011)
-
Rule-based Automatic Multi-word Term Extraction and Lemmatization
In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from ...
... Stanković1, Cvetana Krstev2, Ivan Obradović1, Biljana Lazić1, Aleksandra Trtovac3 1University of Belgrade, Faculty of Mining and Geology 2 University of Belgrade, Faculty of Philology 3 University Library “Svetozar Marković”, Belgrade E-mail: ranka.stankovic@rgf.bg.ac.rs, cvetana@poincare.matf ...
... , pp. 59--66. Krstev, C., Obradović, I., Stanković, R., and Vitas, D. (2013). An Approach to Efficient Processing of Multi-Word Units. In: Przepiórkowski, A., Piasecki, M., Jassem, K., Fuglewicz, P. (Eds.) Computational Linguistics. Berlin: Springer, pp. 109--129. Krstev, C., Stanković R. ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac. "Rule-based Automatic Multi-word Term Extraction and Lemmatization" in Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia, 23--28 May 2016, European Language Resources Association (2016)
-
Indexing of textual databases based on lexical resources: A case study for Serbian
In this paper we describe an approach to improvement of information retrieval results for large textual databases by pre-indexing documents using bag-of-words and Named Entity Recognition. The approach was applied on a database of geological projects financed by the Republic of Serbia in the last half century. Each document within this database is described by metadata, consisting of several fields such as title, domain, keywords, abstract, geographical location and the like. A bag of words was produced from these ...
... Serbian Ranka Stanković1, Cvetana Krstev2, Ivan Obradović1, and Olivera Kitanović1 1 University of Belgrade, Faculty of Mining and Geology, ranka@rgf.bg.ac.rs, ivan.obradovic@rgf.bg.ac.rs, olivera.kitanovic@rgf.bg.ac.rs 2 University of Belgrade, Faculty of Philology, cvetana@matf.bg.ac.rs Abstract ...
... Languages with Sparse Resources. INFOtheca 9(1–2), 23a–33a (May 2008) 6. Krstev, C.: Processing of Serbian - Automata, Texts and Electronic Dictionaries. Faculty of Philology, University of Belgrade, Belgrade (2008) 7. Krstev, C., Obradović, I., Utvić, M., Vitas, D.: A System for Named Entity Recog- ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Indexing of textual databases based on lexical resources: A case study for Serbian" in Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers, Springer (2015). https://doi.org/10.1007/978-3-319-27932-9_15
-
Keyword-Based Search on Bilingual Digital Libraries
This paper outlines the main features of Biblisha, a tool that offers various possibilities of enhancing queries submitted to large collections of aligned parallel text residing in bilingual digital library. Biblisha supports keyword queries as an intuitive way of specifying information needs. The keyword queries initiated, in Serbian or English, can be expanded, both semantically, morphologically and in other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ...
Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
-
Development of Open Educational Resources (OER) for Natural Language Processing
In this paper we present the development of an online course at the edX BAEKTEL platform named “Lexical Recognition in the Natural Language Processing (NLP)”. It is based on the course of the same name for PhD studies at the University of Belgrade, Faculty of Philology. There are not many courses in Computational Linguistics (CL) on OER platforms, and there is none in Serbian either for CL or NLP. We have developed this course in order to improve this ...
... September 2015, Belgrade, Serbia DEVELOPMENT OF OPEN EDUCATIONAL RESOURCE (OER) FOR NATURAL LANGUAGE PROCESSING CVETANA KRSTEV University of Belgrade, Faculty of Philology, cvetana@poincare.matf.bg.ac.rs BILJANA LAZIĆ University of Belgrade, Faculty of Mining and Geology, biljana.lazic@rgf.bg ...
... Lažetić, et al. 2014, Belgrade: Faculty of Mathematics. [6], 135. [8] Krstev, C. and A. Trtovac, Teaching Multimedia Documents to LIS Students. The Journal of Academic Librarianship, 2014. 40(2): p. 152-162. [9] Krstev, C., Information Science Curriculum at the Undergraduate Studies of Library ...
Cvetana Krstev, Biljana Lazić, Ranka Stanković, Giovanni Schiuma, Miladin Kotorčević. "Development of Open Educational Resources (OER) for Natural Language Processing" in The Sixth International Conference on e-Learning (eLearning-2015), September 2015, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2015)
-
Knowledge and Rule-Based Diacritic Restoration in Serbian
In this paper we present a procedure for the restoration of diacritics in Serbian texts written using the degraded Latin alphabet. The procedure relies on the comprehensive lexical resources for Serbian: the morphological electronic dictionaries, the Corpus of Contemporary Serbian and local grammars. Dictionaries are used to identify possible candidates for the restoration, while the data obtained from SrpKor and local grammars assists in making a decision between several candidates in cases of ambiguity. The evaluation results reveal that, depending on the text, accuracy ranges from 95.03% to 99.36%, while the precision (average 98.93%) is always higher than the recall (average 94.94%).
Cvetana Krstev, Ranka Stanković, Duško Vitas. "Knowledge and Rule-Based Diacritic Restoration in Serbian" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018): 41-51