Search
81 items
-
New Technologies for Reviving Old Texts
Keywords: distant reading, literary corpus, Serbian language processing, part-of-speech annotation, lemmatization, named entities
Cvetana Krstev, Ranka Stanković, Branislava Šandrih Todorović, Milica Ikonić Nešić. "Нове технологије за оживљавање старих текстова" [New Technologies for Reviving Old Texts] in Proceedings of the International Scientific Conference Digital Humanities and Slavic Cultural Heritage II, Belgrade, 28-29 June 2021, Belgrade : Савез славистичких друштава Србије (2023)
-
A Lexical Approach to Acronyms and their Definitions
In this paper we present a comprehensive approach to acronyms for Natural-Language Processing (NLP) of Serbian texts. The proposed procedure includes extraction of acronyms and their definitions, which are usually Multi-Word Units (MWUs); shallow parsing of MWUs, which enables MWU lemmatization; and production of entries in morphological electronic dictionaries, for both MWUs and acronyms, that are provided with grammatical, syntactic, semantic and domain information. This approach enables a representation that reflects the complex relations between acronyms and their definitions.
Digital Repository of the Faculty of Mining and Geology, University of Belgrade [ДР РГФ]
The Repository is available at: www.dr.rgf.bg.ac.rs
Cvetana Krstev, Duško Vitas, Ranka Stanković. "A Lexical Approach to Acronyms and their Definitions" in Proceedings of the 7th Language & Technology Conference, November 27-29, 2015, Poznań, Poland, Springer (2015)
-
An Approach to Efficient Processing of Multi-Word Units
Efficient processing of Multi-Word Units in the course of development of morphological MWU dictionaries is not easy to achieve, especially when languages with complex morphological structures are concerned, such as Serbian. Manual development of this type of dictionary is a tedious and extremely slow process. To alleviate this problem we turned to our multipurpose software tool, dubbed LeXimir, in the production of lemmas for e-dictionaries of multi-word units. In addition to that, we developed a procedure aimed at making ...
... possible applications of both the procedure and LeXimir in various language processing tasks.
Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas. "An Approach to Efficient Processing of Multi-Word Units" in Computational Linguistics - Applications, Studies in Computational Intelligence 458 no. 458, Berlin Heidelberg : Springer-Verlag (2013): 109-129. https://doi.org/10.1007/978-3-642-34399-5_6
-
Serbian NER&Beyond: The Archaic and the Modern Intertwinned
In this paper we present the Serbian literary corpus that is being developed under the umbrella of the COST Action "Distant Reading for European Literary History" (CA16204). Using this corpus of novels written more than a century ago, we have developed and made publicly available a Named Entity Recognition (NER) system trained to recognize 7 different types of named entities, based on a convolutional neural network (CNN), which achieves an F1 score of ≈91% on the test dataset. This model was further assessed on a separate evaluation dataset. We conclude with a comparison ...
Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić. "Serbian NER&Beyond: The Archaic and the Modern Intertwinned" in Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications, INCOMA Ltd. Shoumen, BULGARIA (2021). https://doi.org/10.26615/978-954-452-072-4_141
-
WS4LR - a Workstation for Lexical Resources
... are all offered to the user to choose the appropriate one. Conversely, semantic marks of synset literals can be assigned to dictionary entries (Krstev & al., 2004). For instance, the mark +Comm can be added to all communicative verbs, that is, all literals belonging to the synsets that are hyponyms ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan Obradović. "WS4LR - a Workstation for Lexical Resources" in Proceedings of the Fifth International Conference on Language Resources and Evaluation, Genoa, Italy, May 2006, ELRA - European Language Resources Association (2006)
-
A bilingual digital library for academic and entrepreneurial knowledge management
A generic knowledge management process of organization, storage and retrieval of knowledge can suitably be fitted in a digital library. In the digital and knowledge age digital libraries can be used in knowledge management to handle intellectual assets and support knowledge creation. A multilingual digital library either stores content in more than one language or provides multilingual query access to monolingual content. In Serbia 18 of 308 scientific journals regularly published are bi-lingual, with papers simultaneously being in English ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić. "A bilingual digital library for academic and entrepreneurial knowledge management" in Proceeding of 10th International Forum on Knowledge Asset Dynamics - IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10-12 June 2015, Bari : IFKAD (2015)
-
The Nooj System as Module within an Integrated Language Processing Environment
... In this paper we describe the main structure and possible applications of one integrated environment ...
Ranka Stanković, Duško Vitas, Cvetana Krstev. "The Nooj System as Module within an Integrated Language Processing Environment" in Proceedings of the 2007 International Nooj Conference, Cambridge Scholars Publishing (2008)
-
Electronic Dictionaries - from File System to lemon Based Lexical Database
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of support for some types of categories. At the same time, various descriptions often describe exactly the same ...
... An application dubbed WS4LR (Krstev et al., 2006), subsequently upgraded and renamed LeXimir (Stanković, Ranka and Krstev, Cvetana, 2016), was designed and implemented for the purpose of further development and management of ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
-
E-Connecting Balkan Languages
In this paper we present a versatile language processing tool that can be successfully used for many Balkan languages. This tool relies for its work on several sophisticated textual and lexical resources that were developed for most Balkan languages. These resources are based on several de facto standards in natural language processing.
Cvetana Krstev, Ranka Stanković, Duško Vitas, Svetla Koeva. "E-Connecting Balkan Languages" in Proceedings of the Workshop on Multilingual resources, technologies and evaluation for Central and Eastern European Languages, 17 September 2009, eds. C. Vertan, S. Piperidis, E. Paskaleva and Milena Slavcheva, Borovets, Bulgaria : Association for Computational Linguistics Stroudsburg, PA, USA (2009)
-
The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines
In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ...
Keywords: LR web services, MultiWord Expressions & Collocations, Information Extraction, Information Retrieval
Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008)
-
Knowledge and Rule-Based Diacritic Restoration in Serbian
In this paper we present a procedure for the restoration of diacritics in Serbian texts written using the degraded Latin alphabet. The procedure relies on the comprehensive lexical resources for Serbian: the morphological electronic dictionaries, the Corpus of Contemporary Serbian and local grammars. Dictionaries are used to identify possible candidates for the restoration, while the data obtained from SrpKor and local grammars assists in making a decision between several candidates in cases of ambiguity. The evaluation results reveal that, depending on the text, accuracy ranges from 95.03% to 99.36%, while the precision (average 98.93%) is always higher than the recall (average 94.94%).
Cvetana Krstev, Ranka Stanković, Duško Vitas. "Knowledge and Rule-Based Diacritic Restoration in Serbian" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018): 41-51
-
Development of Open Educational Resources (OER) for Natural Language Processing
In this paper we present the development of an online course at the edX BAEKTEL platform named “Lexical Recognition in the Natural Language Processing (NLP)”. It is based on the course of the same name for PhD studies at the University of Belgrade, Faculty of Philology. There are not many courses in Computational Linguistics (CL) on OER platforms, and there is none in Serbian either for CL or NLP. We have developed this course in order to improve this ...
Cvetana Krstev, Biljana Lazić, Ranka Stanković, Giovanni Schiuma, Miladin Kotorčević. "Development of Open Educational Resources (OER) for Natural Language Processing" in The Sixth International Conference on e-Learning (eLearning-2015), September 2015, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2015)
-
Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection
Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Duško Vitas, Mihailo Škorić, Milica Ikonić Nešić (2022)
In this paper we present the Serbian part of the ELTeC multilingual corpus of novels written in the time period 1840-1920. The corpus is being built in order to test various distant reading methods and tools with the aim of re-thinking the European literary history. We present the various steps that led to the production of the Serbian sub-collection: the novel selection and retrieval, text preparation, structural annotation, POS-tagging, lemmatization and named entity recognition. The Serbian sub-collection was published ...
Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Duško Vitas, Mihailo Škorić, Milica Ikonić Nešić. "Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection" in Proceedings of the Language Resources and Evaluation Conference, June 2022, Marseille, France, European Language Resources Association (2022)
-
Indexing of textual databases based on lexical resources: A case study for Serbian
In this paper we describe an approach to improving information retrieval results for large textual databases by pre-indexing documents using bag-of-words and Named Entity Recognition. The approach was applied to a database of geological projects financed by the Republic of Serbia in the last half century. Each document within this database is described by metadata, consisting of several fields such as title, domain, keywords, abstract, geographical location and the like. A bag of words was produced from these ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Indexing of textual databases based on lexical resources: A case study for Serbian" in Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers, Springer (2015). https://doi.org/10.1007/978-3-319-27932-9_15
-
Using Lexical Resources for Irony and Sarcasm Classification
The paper presents a language dependent model for classification of statements into ironic and non-ironic. The model uses various language resources: morphological dictionaries, sentiment lexicon, lexicon of markers and a WordNet based ontology. This approach uses various features: antonymous pairs obtained using the reasoning rules over the Serbian WordNet ontology (R), antonymous pairs in which one member has positive sentiment polarity (PPR), polarity of positive sentiment words (PSP), ordered sequence of sentiment tags (OSA), Part-of-Speech tags of words (POS) ...
Keywords: Computational irony, Verbal irony, Verbal Sarcasm, WordNet
Miljana Mladenović, Cvetana Krstev, Jelena Mitrović, Ranka Stanković. "Using Lexical Resources for Irony and Sarcasm Classification" in Proceedings of the 8th Balkan Conference in Informatics (BCI '17), New York, NY, USA : ACM (2017). https://doi.org/
-
Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian
The training of new tagger models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of gender. The harmonization of resources that were manually annotated within different projects over a long period of time was an important task, enabled by the development of tools that support partial automation. The supporting tools take into account different taggers and tagsets. This paper focuses on TreeTagger and spaCy taggers, and the annotation schema alignment ...
... the production of the new tagger model for Serbian are: (a) Serbian morphological dictionaries (Cvetana Krstev, Duško Vitas, 2015) (SMD); (b) pre-annotated texts (Duško Vitas, Cvetana Krstev, Ranka Stanković, Miloš Utvić, 2019). ...
Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić. "Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian" in Proceedings of the 12th Language Resources and Evaluation Conference, May 2020, Marseille, France, European Language Resources Association (2020)
-
Terminology Acquisition and Description Using Lexical Resources and Local Grammars
Acquisition of new terminology from specific domains and its adequate description within terminological dictionaries is a complex task, especially for languages that are morphologically complex such as Serbian. In this paper we present an approach to solving this task semi-automatically on the basis of lexical resources and local grammars developed for Serbian. Special attention is given to automatic inflectional class prediction for simple adjectives and nouns and the use of syntactic graphs for extraction of Multi-Word Unit (MWU) candidates for ...
... a considerable size: they have about 135,000 lemmas generating more than 5 million forms and 13,000 compound lemmas, that is, multi-word units (Krstev, 2008). The number of simple lemmas by Part-Of-Speech (POS) is depicted in Figure 2 (left). ...
Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić. "Terminology Acquisition and Description Using Lexical Resources and Local Grammars" in Proceedings of the 11th Conference on Terminology and Artificial Intelligence, Granada, Spain, 2015, Granada : LexiCon (Universidad de Granada) (2015)
-
Keyword-Based Search on Bilingual Digital Libraries
This paper outlines the main features of Bibliša, a tool that offers various possibilities of enhancing queries submitted to large collections of aligned parallel text residing in a bilingual digital library. Bibliša supports keyword queries as an intuitive way of specifying information needs. The keyword queries initiated, in Serbian or English, can be expanded semantically, morphologically and into the other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ...
Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
-
A Tool for Enhanced Search of Multilingual Digital Libraries of E-journals
This paper outlines the main features of Bibliša, a tool that offers various possibilities of enhancing queries submitted to large collections of TMX documents generated from aligned parallel articles residing in multilingual digital libraries of e-journals. The queries initiated by a simple or multiword keyword, in Serbian or English, can be expanded by Bibliša, both semantically and morphologically, using different supporting monolingual and multilingual resources, such as wordnets and electronic dictionaries. The tool operates within a complex system composed ...
... submitted to our collection of documents. The most important resources are Serbian morphological dictionaries of simple words and multi-word units [Krstev, 2008]. These comprehensive resources were developed and are being mainly used within two corpus processing systems: Unitex and Nooj. However ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Aleksandra Trtovac, Miloš Utvić. "A Tool for Enhanced Search of Multilingual Digital Libraries of E-journals" in Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012, May 2012, Istanbul, Turkey, Istanbul, Turkey : European Language Resources Association (2012)
-
The Dictionary of the Serbian Academy: from the Text to the Lexical Database
In this paper we discuss the project of digitization of the Dictionary of the Serbo-Croatian Standard and Vernacular Language. Scanning and character recognition were a particular challenge, since various non-standard character set encodings were used in the course of the almost 60-year long production of the dictionary. The first aim of the project was to formalize the micro-structure of the dictionary articles in order to parse the digitized text and transform it into structured data stored in a relational lexical database. This approach ...
... in recent years were these ideas revitalized, and various possibilities of updating the work on this vocabulary have since been considered (Vitas & Krstev, 2015; Ivanović et al. 2016). The digitization (which is also the topic of the present paper) of the published volumes and raw materials (lexicographic ...
Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo. "The Dictionary of the Serbian Academy: from the Text to the Lexical Database" in Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, Ljubljana : Ljubljana University Press, Faculty of Arts (2018)