An Approach to Efficient Processing of Multi-Word Units
Efficient processing of Multi-Word Units in the course of development of morphological MWU dictionaries is not easy to achieve, especially when languages with complex morphological structures are concerned, such as Serbian. Manual development of this type of dictionaries is a tedious and extremely slow process. To alleviate this problem we turned to our multipurpose software tool, dubbed LeXimir, in the production of lemmas for e-dictionaries of multi-word units. In addition to that, we developed a procedure aimed at making ...
... third step, the selection of the correct lemma, FST code and grammatical categories is supported by possible combinations offered in auxiliary tables (in the bottom right corner of Figure 3). In the final step, the user has to fill manually the code of the inflectional transducer for the newly produced ...
... useful function is the extraction of subsets of lemmas based on different criteria: lemmas’ beginning, their part of speech (PoS), inflectional class code, syntactic and/or semantic markers or a Boolean combination of these criteria. Figure 3 shows the table for manual production of a DELAC entry having ...
Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas. "An Approach to Efficient Processing of Multi-Word Units" in Computational Linguistics - Applications, Studies in Computational Intelligence 458 no. 458, Berlin Heidelberg : Springer-Verlag (2013): 109-129. https://doi.org/10.1007/978-3-642-34399-5_6
Polymorphism and photoluminescence properties of K3ErSi2O7
alkali rare-earth silicates, lanthanide silicates, polymorphism, photoluminescence, crystal structure
Predrag Dabić, Marko G. Nikolić, Sabina Kovač, Aleksandar Kremenović. "Polymorphism and photoluminescence properties of K3ErSi2O7" in Acta Crystallographica Section C Structural Chemistry, International Union of Crystallography (IUCr) (2019). https://doi.org/10.1107/S2053229619011926
INVENTS: A Hybrid Mine Ventilation Planning and Design System
Ventilation system analysis is a complex process based on the calculation and analysis of numerous parameters. These problems can be successfully solved by the SimVent numerical package, but a full understanding and use of the obtained results require the involvement of an experienced specialist in the ventilation field. The solution was found in the creation of a hybrid system INVENTS, whose knowledge base represents a formalization of the expert knowledge in the mine ventilation field. In this paper we ...
... which provides a wide range of tools for constructing and using applications by means of a high-level graphical environment which generates standard C code. In the KAPPA-PC system, the components of the domain are represented by objects that can be either classes or instances within classes (Fig.9). The ...
Lilić Nikola, Stanković Ranka, Obradović Ivan. "INVENTS: A Hybrid Mine Ventilation Planning and Design System" in Proceedings of International Scientific Conference of FME Session 4: Automation Control and Applied Informatics, Hong Kong : iConcept Press (2013)
EUROLAN 2021: Introduction to Linked Data for Linguistics Online Training School
The first training school organized by the COST Action NexusLinguarum was held from 8 to 12 February 2021, with the aim of teaching students, researchers and practitioners the basics of linguistic data science. During the training, participants were introduced to a broad range of topics: from the Semantic Web, RDF and ontologies, to modelling and querying language data with state-of-the-art ontology models and tools. The school was held as part of the EUROLAN series of summer schools and was organized virtually (online) by several institutes; ...
linguistic data science, linked data in linguistics, language data, EUROLAN, NexusLinguarum, COST Action, training school
... from Serbia. Various types of materials were generated for the training school, including presentations (slides) [18] and exercises [19] accompanied by code and data. [Footnotes: 13. School Program; 14. Jerteh; 15. VocBench installation; 16. Intelligent Systems PhD Program; 17. Slack, The School channel; 18. Presentations ...]
Milan Dojchinovski, Julia Bosque Gil, Jorge Gracia, Ranka Stanković. "EUROLAN 2021: Introduction to Linked Data for Linguistics Online Training School" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.7
INVENTS: a hybrid system for subsurface ventilation analysis
Ventilation system analysis is a complex process based on the calculation and analysis of numerous parameters. These problems can be successfully solved by the SimVent numerical package, but a full understanding and use of the obtained results require the involvement of an experienced specialist in the ventilation field. The solution was found in the creation of a hybrid system INVENTS, whose knowledge base represents a formalization of the expert knowledge in the mine ventilation field. In this paper we ...
... which provides a wide range of tools for constructing and using applications by means of a high-level graphical environment which generates standard C code. In the KAPPA-PC system, the components of the domain are represented by objects that can be either classes or instances within classes (Fig.9). The ...
Nikola Lilić, Ranka Stanković, Ivan Obradović. "INVENTS: a hybrid system for subsurface ventilation analysis" in Proc. of International Scientific Conference of FME, September 2000, Ostrava, FME (2000)
Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking
The paper presents the results of research on the preparation of parallel corpora, focusing on their transformation into RDF graphs using the NLP Interchange Format (NIF) for linguistic annotation. We give an overview of the parallel corpus used in this case study, as well as the process of part-of-speech tagging, lemmatization and named entity recognition (NER). We then describe named entity linking (NEL), the conversion of the data into RDF, and the incorporation of NIF annotations. The produced NIF files were evaluated by exploring a triplestore with SPARQL queries. Finally, we consider the linking of Linked ...
parallel corpora, named entity linking, named entity recognition, NER, NEL, linked data, NIF, Wikidata
Ranka Stanković, Milica Ikonić Nešić, Olja Perisic, Mihailo Škorić, Olivera Kitanović. "Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
Low-temperature phase transition and magnetic properties of K3YbSi2O7
Predrag Dabić, Volker Kahlenberg, Biljana Krüger, Marko Rodić, Sabina Kovač, Jovan Blanuša, Zvonko Jagličić, Ljiljana Karanović, Václav Petříček, Aleksandar Kremenović (2021)
alkali rare-earth silicates, phase transitions, magnetic properties, crystal field splitting, lanthanide silicates
... (3) ×6 Si1—O1 1.615 (2) ×3 K2—O2 3.3067 (3) ×3 Si1—O2 1.6411 (17) O1—Si1—O1iii 110.65 (11) ×3 Si1iv—O2—Si1 180.0 O1—Si1—O2 108.27 (11) ×3 Symmetry code: (i) x − y, x, −z; (ii) −y, x − y, z; (iii) −x + y + 1, −x + 1, z; (iv) x, y, −z + 1/2. Table 4 Bond valence sums (v.u.) for the cations and anions ...
... [OCR residue of the bond length (Å) and bond angle (°) tables: K2—O, Yb1—O and Si1—O distances and O—Si1—O and Si1—O—Si1 angles, with symmetry codes] ...
... form sorosilicate layers sharing common vertices and edges with the Si2O7 groups, respectively. Atoms are shown as spheres using the following colour code: Yb: turquoise, Si: blue, K: violet, O: red. ... close to the value of 2.268 Å (0.868 Å + 1.4 Å) calculated from the ...
Predrag Dabić, Volker Kahlenberg, Biljana Krüger, Marko Rodić, Sabina Kovač, Jovan Blanuša, Zvonko Jagličić, Ljiljana Karanović, Václav Petříček, Aleksandar Kremenović. "Low-temperature phase transition and magnetic properties of K3YbSi2O7" in Acta Crystallographica Section B Structural Science, Crystal Engineering and Materials, International Union of Crystallography (IUCr) (2021). https://doi.org/10.1107/S2052520621006077
XRD determination of hydrothermal phases related to epithermal mineralization in the Čukaru Peki deposit
Dragana Bosić, Vladica Cvetković, Miodrag Banješević, Kristina Šarić. "XRD determination of hydrothermal phases related to epithermal mineralization in the Čukaru Peki deposit" in V Congress of Geologists of Republic of Macedonia: “Geology in a changing world”, Makedonsko geološko društvo (2024)
Electronic Dictionaries - from File System to lemon Based Lexical Database
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...
... mathematics (e.g. diedar,N3+DOM=Math). Values assigned to a certain attribute can belong to a closed set (e.g. +CC2=RS is a two character country code marker assigned, for instance, to geopolitical names), or to an open set (e.g. +Val=Vaughn is assigned to a surname Von, Serbian transcription of ...
... by its DELAS entry form, and :categories are the possible grammatical categories of the word form, each category represented by a single character code (Krstev and Vitas, 2007). LeXimir, a tool for development and maintenance of e-dictionaries enabled development of Serbian morphological dictionaries ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
Razvoj ARCGIS geobaze površinskog kopa korišćenjem UML CASE alata
... format for data interchange) file or repository; 3. Import of the UML model schema from XMI format into ArcCatalog and optional generation of code that defines specific behavior of objects. Figure 1 depicts the general schema of the use of UML CASE tools for design and development of geodatabases ...
Aleksandra Tomašević, Ljiljana Kolonja, Ivan Obradović, Ranka Stanković, Olivera Kitanović. "Razvoj ARCGIS geobaze površinskog kopa korišćenjem UML CASE alata" in Podzemni radovi, Beograd : Univerzitet u Beogradu - Rudarsko-geološki fakultet (2012)
SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian
Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković (2019)
In this paper we present a model for selecting good examples for a dictionary of Serbian, together with the development of the model's initial components. The method is based on a detailed analysis of various lexical and syntactic features in a corpus compiled from the examples in five digitized volumes of the SASA dictionary. The initial set of features was inspired by similar approaches for other languages. The distribution of features of the examples from this corpus is compared with the feature distribution of sentence samples excerpted from corpora containing various texts. The analysis showed that ...
Serbian, good dictionary examples, automation of dictionary compilation, feature extraction, machine learning
... from the control corpus (Section 4). The implemented set of features is described by metadata, i.e. several attributes are assigned to each feature: code, description, processing level (char, word, and sentence), headword dependency (yes/no), weight (for weighted sum and use in our future model ...
... linguistic labels (some of which are mentioned in Section 2.2), type of editorial intervention (if any) on the example (shortening or insertion) and a code for the bibliographical source. The size of the gold corpus is 133,904 examples, comprising 1,711,231 words or 10,577,723 characters. Within the gold ...
Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković. "SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian" in Electronic lexicography in the 21st century. Proceedings of the eLex 2019 conference, Lexical Computing CZ, s.r.o. (2019)
Improvement of education through the cooperation between CEEPUS EURO Geo-Sci network and scientific projects: examples from UB-FMG
Kristina Šarić, Dejan Prelević, Miloš Marjanović, Uroš Stojadinović, Vladimir Simić. "Improvement of education through the cooperation between CEEPUS EURO Geo-Sci network and scientific projects: examples from UB-FMG" in V Congress of Geologists of Republic of Macedonia: “Geology in a changing world”, Makedonsko geološko društvo (2024)
Development of Open Educational Resources (OER) for Natural Language Processing
In this paper we present the development of an online course at the edX BAEKTEL platform named “Lexical Recognition in the Natural Language Processing (NLP)”. It is based on the course of the same name for PhD studies at the University of Belgrade, Faculty of Philology. There are not many courses in Computational Linguistics (CL) on OER platforms, and there is none in Serbian either for CL or NLP. We have developed this course in order to improve this ...
... General Public License (LGPL). This means that everyone can redistribute Unitex freely within the terms of the LGPL license, access the source code of all Unitex modules and reuse it. 5. LEXICAL RECOGNITION IN NLP Course content The main topic of the presented course 12 is natural ...
Cvetana Krstev, Biljana Lazić, Ranka Stanković, Giovanni Schiuma, Miladin Kotorčević. "Development of Open Educational Resources (OER) for Natural Language Processing" in The Sixth International Conference on e-Learning (eLearning-2015), September 2015, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2015)
Towards translation of educational resources using GIZA++
... The document consists of, , (paragraph), (Translation Unit) and (Translation unit variant) elements. [15] Metadata code (element ) is attached to each aligned sentence (element ) in order to establish a direct relation to metadata and the original (pdf, edX ...
Ivan Obradović, Dalibor Vorkapić, Ranka Stanković, Nikola Vulović, Miladin Kotorčević. "Towards translation of educational resources using GIZA++" in The Seventh International Conference on e-Learning (eLearning-2016), September 2016, Belgrade : Metropolitan University (2016)
Part of Speech Tagging for Serbian language using Natural Language Toolkit
Ranka Stanković, Boro Milovanović (2020)
While complex algorithms for NLP (natural language processing) are being developed, basic tasks such as tagging remain very important and still challenging. NLTK (Natural Language Toolkit) is a powerful Python library for developing NLP-based programs. We try to use this library to create a PoS (part-of-speech) tagger for contemporary Serbian. Eleven different models were created using the NLTK tagging API. The best models are transformed with the Brill tagger to improve accuracy. We trained the models on the annotated ...
... is possible to define a back off tagger which will take over when the current tagger is not able to determine a tag for a token (returning tag None). This is the ...
[Footnote 1: Code for training and evaluation, example dataset and results are available at: https://github.com/bmilovanovic/pos-tagging-serbian.]
Ranka Stanković, Boro Milovanović. "Part of Speech Tagging for Serbian language using Natural Language Toolkit" in 7th International Conference on Electrical, Electronic and Computing Engineering IcETRAN 2020, Academic Mind, Belgrade (2020)
Процена емисије гасова са ефектом стаклене баште као последице енергетске активности
Горана Остојић (2024)
The paper analyses and estimates greenhouse gas emissions, with particular emphasis on emissions arising from energy activities. The greenhouse effect is a natural process that is necessary for sustaining life on Earth, but anthropogenic activities, such as industrial production and the combustion of fossil fuels, significantly increase the concentration of these gases in the atmosphere, which leads to climate change. The paper examines the causes, consequences and options for reducing gas emissions, as well as the development of strategies for mitigating the negative effects on the environment ...
gas emissions, greenhouse effect, energy activities, climate change, mobile combustion, stationary combustion
... should be reported separately under 1 B2 a, ... (Code number and name) Manufacture of Solid Fuels and Other Energy Industries. Definition: Combustion emissions from fuel use during the manufacture ...
... equipment, ISIC Divisions 28, 29, 30, 31 ... (Code number and name) Mining (excluding fuels) and Quarrying, ISIC Divisions 13 and 14; Wood and Wood ...; Textile and Leather ...
... countries. Appendix 2. Transport sector. Code and Name, Explanation: 1A3 TRANSPORT Emissions from the combustion and evaporation of fuel for all transport activity (excluding military transport) ...
Горана Остојић. Процена емисије гасова са ефектом стаклене баште као последице енергетске активности, 2024
E-Connecting Balkan Languages
In this paper we present a versatile language processing tool that can be successfully used for many Balkan languages. This tool relies for its work on several sophisticated textual and lexical resources that were developed for most of Balkan languages. These resources are based on several de facto standards in natural language processing.
... environment one finite-state transducer responsible for generation of all inflectional forms of each DELAS lemma corresponds to each inflectional class code. The Serbian morphological dictionary of simple words contains 121,000 lemmas which yield the production of approximately 1,450,000 different ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Svetla Koeva. "E-Connecting Balkan Languages" in Proceedings of the Workshop on Multilingual resources, technologies and evaluation for Central and Eastern European Languages, 17 September 2009, eds. C. Vertan, S. Piperidis, E. Paskaleva and Milena Slavcheva, Borovets, Bulgaria : Association for Computational Linguistics Stroudsburg, PA, USA (2009)
Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology
Mihailo Škorić, Mauro Dragoni (2019)
This paper is a result of a task that was presented to attendants of Keyword Search in Big Linked Data summer school, that was organized by Vienna University of Technology, under the Keystone COST action in the summer of 2017. It presents a specific approach to the classification via creation of minimal document surrogates based on the US National medical library’s MeSH ontology, which is derived from the Medical Subject Headings thesaurus. In a series of previously classified medically ...
... were pasted onto the beginning and the end of each row respectively (Figure 7) [9]. For replacement in all the documents to be classified, a second C# code has been prepared. It loads the classification documents one at a time and applies the script generated in the previous step so that the concepts are ...
Mihailo Škorić, Mauro Dragoni. "Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology" in Infotheca, Faculty of Philology, University of Belgrade (2019). https://doi.org/10.18485/infotheca.2019.19.1.3
The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines
In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ...
LR web services, MultiWord Expressions & Collocations, Information Extraction, Information Retrieval
... expansion was chosen, the appropriate synset was retrieved and two other synonyms for beli luk, namely češnjak (as ‘cyesxnxak’ in the Aurora2 code) and Allium sativum appeared in the list of words that can be used for composing the query. However, given that one of the synonyms is a Latin word ...
Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008)
Debris-flow Susceptibility Assessment in Flow-R: Ribnica River Case Study
Debris flows are among the most dangerous erosional geohazards due to the fast rate of movement and long runout zones. Even though the initiation can be triggered in mountainous areas, inhabited and with steep slopes, their propagation and deposition can endanger not only buildings and infrastructure in the urbanized areas, but also threaten human lives. As these initiation areas usually represent unattainable terrains with rapid vegetation cover development, field observations and aerial photo analysis become high-demanding tasks. Consequently, medium-to-regional ...
Ksenija Micić, Miloš Marjanović, Biljana Abolmasov. "Debris-flow Susceptibility Assessment in Flow-R: Ribnica River Case Study" in Proceeding of the 6th Regional Symposium on Landslides in the Adriatic-Balkan Region, ReSyLAB 2024, University of Belgrade, Faculty of Mining and Geology (2024). https://doi.org/10.18485/resylab.2024.6