Search ⚒ Papers ⚒ DR RGF - RGF Repository

Search

98 items

Possibilities of retro-digitalized German-Serbian Mining Dictionary

Biljana Lazić, Olivera Kitanović, Ivan Obradović (2019)

This paper describes the retro-digitization of the bilingual German-Serbian Mining Dictionary from 1923, authored by the mining engineer Dragutin Stepanović (Степановић, 1923). The dictionary is based on almost 4,000 lexical entries that are either translation equivalents or cross-references. In place of a preface, the author reproduces his letter to the “Minister of Forests and Mines”, in which he writes of his intention to record words used among the people in order to avoid the use of German words. Although the number of headwords is not very large, the dictionary ...

electronic lexicography

Biljana Lazić, Olivera Kitanović, Ivan Obradović. "Possibilities of retro-digitalized German-Serbian Mining Dictionary" in E-dictionaries and E-lexicography, Zagreb, 10-11 May 2019, Zagreb : Institut za hrvatski jezik i jezikoslovlje (2019)
The Nooj System as Module within an Integrated Language Processing Environment

Ranka Stanković, Duško Vitas, Cvetana Krstev (2008)

NooJ, electronic dictionary, lexical resources

... numbers of different individual electronic resources to form large global electronic resources, so conversion of NooJ resources to LMF format (Lexical markup framework) (ISO LMF 2006) is also included in this environment. 3. Lexical resources management 3.1. Dictionary Management This module enables ...
... and using external Perl, Awk, and XSLT scripts. pkg WS4LR moduls WSLR moduls + CONVERSION + DICTIONARY MANAGMENT + WORDNET DEVELOPMENT + EXPLOITATION OF ALIGNED TEXTS (from Use Case View) DICTIONARY MANAGMENT + Simple words manipulation + Compound words management + Nooj dictionaries management ...
... and multilingual dictionaries in wordnet development, we will now briefly describe the basic features of the dictionary management module. The lemma in a morphological dictionary of simple words has the following format: lemma.Knnn [+SinSem]*, where lemma is the word form usually used in ...
Ranka Stanković, Duško Vitas, Cvetana Krstev. "The Nooj System as Module within an Integrated Language Processing Environment" in Proceedings of the 2007 International Nooj Conference, Cambridge Scholars Publishing (2008)
Automatic construction of a morphological dictionary of multi-word units

Cvetana Krstev, Ranka Stanković, Ivan Obradović, Duško Vitas, Miloš Utvić (2010)

The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation ...

electronic dictionary, Serbian, morphology, inflection, multi-word units, noun phrases, query expansion

... present how the same procedure is used for other languages. Key words: electronic dictionary, Serbian, morphology, inflection, multi-word units, noun phrases, query expansion 1 Introduction We have been developing morphological electronic dictionaries of Serbian for natural language processing for many ...
... (its regular or dictionary form). For the given example, the entry in DELAC dictionary is: civilni(civilni.A2:adms1g) vojni(vojni.A2:adms1g) rok(rok.N81u:ms1q),nc axaxn1 The information given in this entry allows automatic production of all 26 inflected forms for the DELACF dictionary as, for example ...
Cvetana Krstev, Ranka Stanković, Ivan Obradović, Duško Vitas, Miloš Utvić. "Automatic construction of a morphological dictionary of multi-word units" in Lecture Notes in Computer Science 6233, Advances in Natural Language Processing, Proceedings of the 7th International Conference on NLP, IceTAL 2010, Reykjavik, Iceland, August 2010, Springer (2010): 226-237. https://doi.org/10.1007/978-3-642-14770-8_26
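
The DELAC entry quoted in the excerpt for this item can be taken apart mechanically. The Python sketch below is illustrative only: the regular expression, the function name `parse_delac_entry` and the field names are assumptions made for this example, not the authors' tooling, and the part after the final comma is kept verbatim as it appears in the excerpt.

```python
import re

# Assumed shape of one component: form(lemma.code:tags),
# e.g. civilni(civilni.A2:adms1g). This pattern is a guess based on the excerpt.
COMPONENT_RE = re.compile(r"([^\s()]+)\(([^.()]+)\.([^:()]+):([^()]+)\)")


def parse_delac_entry(line):
    """Split a DELAC-style line into its components and its trailing label."""
    if "," not in line:
        raise ValueError("expected a comma before the trailing label")
    # Everything after the last comma is treated as an opaque label.
    body, _, tail = line.rpartition(",")
    components = [
        {"form": form, "lemma": lemma, "code": code, "tags": tags}
        for form, lemma, code, tags in COMPONENT_RE.findall(body)
    ]
    return components, tail.strip()


components, tail = parse_delac_entry(
    "civilni(civilni.A2:adms1g) vojni(vojni.A2:adms1g) rok(rok.N81u:ms1q),nc axaxn1"
)
for component in components:
    print(component["form"], component["lemma"], component["code"], component["tags"])
print(tail)
```

The sketch only separates the fields; generating the inflected forms mentioned in the excerpt would additionally need the inflection rules, which are not shown here.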
Using English Baits to Catch Serbian Multi-Word Terminology

Cvetana Krstev, Branislava Šandrih, Ranka Stanković (2018)

In this paper we present the first results in bilingual terminology extraction. The hypothesis of our approach is that if for a source language domain terminology exists as well as a domain aligned corpus for a source and a target language, then it is possible to extract the terminology for a target language. Our approach relies on several resources and tools: aligned domain texts, domain terminology for a source language, a terminology extractor for a target language, and a ...

aligned texts, word alignment, terminology extraction, electronic dictionaries, morphological inflection

... languages producing many different forms for each lemma. 4.2. Dictionary of Library and Information Science The development of the Dictionary of Librarianship: English-Serbian and Serbian-English (in this text referred to as ‘Dictionary’) (Ljiljana Kovačević, 2014) has started in 2001 at the National ...
... was first used on aligned texts in query expansion (Stanković et al., 2012); the Excel format of the dictionary was at that time transformed into a relational database. The version of the Dictionary that we used for our experiment has 12,592 different Serbian terms (9,376, 74% MWT), 11,857 different ...
... MWTs were both matched with the Dictionary and extracted by the tool. 5. The aligned chunks from step 3 were filtered with the additional condition (T(align.chunk) ≁ T(term.list)). 1,935 Serbian MWTs were extracted by our term extractor (they were not in the Dictionary; they, however, may be synonymous ...
Cvetana Krstev, Branislava Šandrih, Ranka Stanković. "Using English Baits to Catch Serbian Multi-Word Terminology" in Proceedings of the 11th International Conference on Language Resources and Evaluation, LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
Production of morphological dictionaries of multi-word units using a multipurpose tool

Ranka Stanković, Ivan Obradović, Cvetana Krstev, Duško Vitas (2011)

The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation ...

electronic dictionary, Serbian, morphology, inflection, multi-word units, noun phrases, query expansion

... can be briefly described in the following way: in a dictionary of lemmas (DELAS) every lemma is described in full detail so that a dictionary of forms containing all necessary grammatical information (DELAF) can be generated from it. The dictionary of forms is used in NLP tasks. Two corpus processing ...
... featuring complex morphological structures. After realizing that the development of such a dictionary manually is an extremely slow process, we endeavored towards a procedure aimed at automated production of MWU dictionary lemmas, which is also outlined in this paper. The procedure was subsequently implemented ...
... but it was also successfully used for Polish proper names in another environment [9]. By analogy with entries in a dictionary of simple word lemmas, an entry in a DELAC dictionary consists of a MWU lemma to which a name of an inflectional transducer (similar to the one represented in Fig. 1) is ...
Ranka Stanković, Ivan Obradović, Cvetana Krstev, Duško Vitas. "Production of morphological dictionaries of multi-word units using a multipurpose tool" in Proceedings of the Computational Linguistics-Applications Conference, October 2011, Jachranka, Poland, Jachranka, Poland : PTI - Polish Information Processing Society (2011)
Речници у дигиталном добу - информатичка подршка за српски језик [Dictionaries in the Digital Age: IT Support for the Serbian Language]

Биљана Рујевић (2022)

Morphological dictionaries of Serbian are an electronic language resource with a long history of development and use in natural language processing. Because they were kept as files whose number kept growing, which made the dictionaries hard to manage, a need arose to store the dictionary information in a lexicographic database. To allow several users to work on dictionary development simultaneously, a need arose for a web application based on that lexicographic database. In order to consider ...

electronic dictionaries, lexicographic database, lexical resources, Serbian language

Биљана Рујевић. Речници у дигиталном добу - информатичка подршка за српски језик, Београд : [Б. Рујевић], 2022
Old or New, We Repair, Adjust and Alter (Texts)

Cvetana Krstev, Ranka Stanković (2020)

In this paper we present how e-dictionaries and cascades of finite-state transducers implemented in the Unitex tool can be used to solve three text transformation problems: correcting texts after OCR, restoring diacritics, and switching between different language variants.

text correction, OCR errors, diacritic restoration, language variants, electronic dictionary, finite-state transducers

... diacritics and switching between different language variants. KEYWORDS: text correction, OCR errors, diacritic restoration, language variants, electronic dictionary, finite-state transducers. PAPER SUBMITTED: 13 October 2019 PAPER ACCEPTED: 08 December 2019 Cvetana Krstev University of Belgrade, Faculty ...
... multiple candidates for znaci (a form of znak ‘sign’ and značiti ‘to mean’); – A dictionary of multi-word units (MWU) (nouns, adjectives, adverbs, pronouns, conjunctions and interjections) obtained from a dictionary of more than 18,000 MWU lemmas; for instance, Dobro vece ⇒ Dobro veče ‘Good evening’ ...
... that contains letters c, s, z or digraphs dj, dz, a list of zero or more candidates obtained from the dictionary SMD_DR, or one candidate obtained from lists of trigrams or bigrams, or a dictionary of MWUs for a sequence of words. The result of the application of the procedure to a sample text is given ...
Cvetana Krstev, Ranka Stanković. "Old or New, We Repair, Adjust and Alter (Texts)" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.3
SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian

Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković (2019)

In this paper we present a model for selecting good examples for a dictionary of Serbian, together with the development of the model's initial components. The method is based on a detailed analysis of various lexical and syntactic features in a corpus compiled from examples taken from five digitized volumes of the SASA dictionary. The initial feature set was inspired by a similar approach used for other languages. The distribution of features of the examples in this corpus is compared with the feature distribution of sentence samples excerpted from corpora containing various texts. The analysis showed that ...

Serbian, good dictionary examples, automation of dictionary-making, feature extraction, machine learning

... Gold Standard for Good Dictionary Examples for Serbian | Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković | Electronic lexicography in the 21st century. Proceedings of the eLex 2019 conference | 2019 | | http://dr.rgf.bg.ac.rs/s/repo/item/0004956 Digital ...
... C. (2019). Identification and Automatic Extraction of Good Dictionary Examples: the Case(s) of GDEX. International Journal of Lexicography, 32(2), pp. 119–137. Krstev, C. (2008). Processing of Serbian – Automata, Texts and Electronic dictionaries. Belgrade: Faculty of Philology, University of ...
... Blueprint for the computerized dictionary of the Serbian language [Nacrt za informatizovani rečnik srpskog jezika]. Naučni sastanak slavista u Vukove dane, 44(3), pp. 105–116. (In Serbian, Cyrillic.) Vitas, D. & Krstev, C. (2012). Processing of Corpora of Serbian Using Electronic Dictionaries. Prace F ...
Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković. "SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian" in Electronic lexicography in the 21st century. Proceedings of the eLex 2019 conference, Lexical Computing CZ, s.r.o. (2019)
The Dictionary of the Serbian Academy: from the Text to the Lexical Database

Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo (2018)

In this paper we discuss the project of digitization of the Dictionary of the Serbo-Croatian Standard and Vernacular Language. Scanning and character recognition were a particular challenge, since various non-standard character set encodings were used in the course of the almost 60-year long production of the dictionary. The first aim of the project was to formalize the micro-structure of the dictionary articles in order to parse the digitized text and transform it into structured data stored in a relational lexical database. This approach ...

computer lexicography, lexical database, language resources, dictionary, Serbian language

... similar approach was used as in Stanković et al. (2018) for the Serbian morphological electronic dictionary. The main class, in the core of this dictionary model, is LexicalEntry, representing a headword of the dictionary article, which encompasses the set of senses that are associated with this headword ...
... [Possibility for modernizing the development of the dictionary on the example of the Dictionary of the Serbo-Croatian literary and vernacular language SASA and the Institute for Serbo-Croatian] Stanković, R., Krstev, C., Lazić, B., Škorić, M. (2018) Electronic Dictionaries – from File System to lemon Based ...
... Figure 1: The microstructure of dictionary articles. 4 The transformation from the dictionary article text form to the lexical database The guidelines for dictionary writing were used to define the rules for the segmentation of the dictionary articles, the pattern recognition, and the ...
Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo. "The Dictionary of the Serbian Academy: from the Text to the Lexical Database" in Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, Ljubljana : Ljubljana University Press, Faculty of Arts (2018)
A WordNet Ontology in Improving Searches of Digital Dialect Dictionary

Miljana Mladenović, Ranka Stanković, Cvetana Krstev (2017)

In this paper, we present a method for automatic generation of a digital resource, which connects all indirect synonyms of a dialect term to all indirect synonyms of a corresponding term in the standard language, aiming to improve the search of a digital dialect dictionary. The method uses SWRL rules defined in the Serbian WordNet ontology to identify sets of synonymous words. It also uses e-dictionaries to produce correct lemmas in standard language that users usually employ in searches. ...

... task we used Serbian morphological electronic dictionaries and grammars developed within the University of Belgrade Human Language Technology Group [14]. Morphological electronic dictionaries of Serbian for NLP are being developed for many years now. In the dictionary of lemmas (DELAS) each lemma is de- ...
... carried out for the terms in the dictionary which: start with typed word, contain it or are equal to it. Search results offer information on the number of terms found in the dictionary that satisfy the given query. This kind of search is standard for on-line dictionary look-up, but it is based on the ...
... “English Dialect Dictionary” (EED) [8], MS SQLServer for storing a South Serbian dialect dictionary data [7], or it can be one of a machine-readable and interoperable Semantic Web standards, such as RDF, SKOS and SKOS-XL which are used, for example, in the case of the dialect dictionary of the German language ...
Miljana Mladenović, Ranka Stanković, Cvetana Krstev. "A WordNet Ontology in Improving Searches of Digital Dialect Dictionary" in New Trends in Databases and Information Systems: ADBIS 2017 Short Papers and Workshops - SW4CH (Semantic Web for Cultural Heritage) 767, Springer International Publishing (2017). https://doi.org/10.1007/978-3-319-67162-8_37
Towards Automatic Definition Extraction for Serbian

Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić (2021)

This paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured Serbian texts, with the aim of speeding up dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SASA) were used to model different types of definitions (descriptive, grammatical, referential and synonymous), which have different syntactic and lexical characteristics. The research corpus consists of 61,213 noun definitions, which were analyzed using morphological e-dictionaries and local grammars implemented as finite-state transducers in a corpus processing suite of open ...

... ion of dictionary-making, local grammar 1 Introduction In the age of electronic lexicography, in which the goal is fast production of dictionaries and the products derived from them, special attention is paid to automatic or semi-automatic performance of some tasks. The use of electronic dictionaries ...
... pp. 941–949. Stanković, R., Šandrih, B., Stijović, R., Krstev, C., Vitas, D. & Marković, A. (2019). SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian. In: Electronic lexicography in the 21st century. Proceedings of the eLex 2019 conference, 1–3 October 2019, Sintra, Portugal ...
... automatic extraction are analyzed and formalized in our paper is the support of dictionary drafting (Kilgarriff & Rychlý 2010), which implies the development based on corpora. The results were achieved by using electronic dictionaries of the Serbian language and local grammars developed based on the ...
Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
From DELA Based Dictionary to Leximirka Lexical Database

Biljana Lazić, Mihailo Škorić (2020)

In this paper, we present an approach to transforming Serbian morphological dictionaries from the DELA text format to a lexical database dubbed Leximirka. Considering the benefits of storing data in a database rather than in text documents, we outline some of the functionality that the database has made possible. We also show how hand-made rules that use the category labels attached to lexical entries can be used to link those entries. ...

Morphological dictionaries, language resources, Leximirka

... work on application development will be presented in Section 5. 2 Electronic dictionaries 2.1 The DELA text format Serbian morphological dictionaries are electronic dictionaries primarily intended for machine use. This type of dictionary was first developed for the French language under the influence ...
... information have been considered: Guidelines for Electronic Text Encoding and Interchange, Text Encoding Initiative (TEI), Lexical Markup Framework (LMF) and the Lemon model. Although Chapter 9 of the TEI Guidelines addresses the issue of dictionary encoding, they only recently address the specificities ...
... References Bański, Piotr, Jack Bowers and Tomaž Erjavec. “TEI-Lex0 Guidelines for the Encoding of Dictionary Information on Written and Spoken Forms”. In Proceedings of eLex 2017 conference: Electronic lexicography in the 21st century, 485–94. Brno: Lexical Computing CZ s.r.o., 2017, accessed September ...
Biljana Lazić, Mihailo Škorić. "From DELA Based Dictionary to Leximirka Lexical Database" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.4
Развој геолошког терминолошког речника ГеолИССТерм [Development of the GeolISSTerm Geological Terminological Dictionary]

Ranka Stanković, Branislav Trivić, Olivera Kitanović, Branislav Blagojević, Velizar Nikolić (2011)

... GeoSciML initiative are planned to be provided. The electronic edition of the dictionary is complemented by the printed version. Keywords. Terminological resources, geology, GIS, Geologic Information System, geologic vocabulary, electronic dictionary. INFOtheca, № 1, vol XII, August 2011 Ranka ...
... purpose of data processing, the electronic dictionary is complemented by a print edition (Trivić, 2011). Special attention has been paid to the structure of concepts and the manner of their organization in the terminological resources and the GeolISSTerm dictionary, as well as to their position ...
... GeolISSTerm Terminological Dictionary 1. Introduction The physical implementation of the Geologic Information System of Serbia (GeolISS) in mid-2006 marked the beginning of development of the geologic terminology and nomenclature. The main aim of the development of this electronic resource was the creation ...
Ranka Stanković, Branislav Trivić, Olivera Kitanović, Branislav Blagojević, Velizar Nikolić. "Развој геолошког терминолошког речника ГеолИССТерм" in INFOteka: časopis za informatiku i bibliotekarstvo, Beograd : Zajednica biblioteka univerziteta u Srbiji (2011)
A Data Driven Approach for Raw Material Terminology

Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja (2021)

The research presented in this paper aims at creating a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach is based on linking dictionaries related to the raw material domain, both digitally born and printed, into a lexicon structure, aligning terminology from different dictionaries as much as possible. This paper presents the main features of this approach, data used for compilation of the terminological database, the procedure by which it has ...

raw materials, mining, terminology, dictionary, terminological application, mobile application, digitization, lexical data, corpora, linked open data

... lexicographically relevant data (lemma lists, example sentences, collocations) as complementary resources in electronic dictionaries is known as the one-click dictionary or push-pull dictionary model, which is used, for example, in the Sketch-engine [9] for several languages, but has not yet been used ...
... 2020; pp. 1–9. 54. Stanković, R.; Šandrih, B.; Stijović, R.; Krstev, C.; Vitas, D.; Marković, A. SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian. In Electronic lexicography in the 21st Century, Proceedings of the eLex 2019 Conference, Sintra, Portugal, 1–3 October 2019; ...
... paper presents a data driven approach aimed at using opportunities offered by electronic lexicography, as well as various available techniques of Natural Language Processing (NLP), to develop a semi-automatic pipeline for dictionary production. The approach is focused on raw material terminology, with an ...
Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja. "A Data Driven Approach for Raw Material Terminology" in Applied Sciences, MDPI AG (2021). https://doi.org/10.3390/app11072892
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment

Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others (2020)

Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages ...

lexical semantic resources, sense alignment, lexicography, language resource

... paraphrases broader The sense in the first dictionary completely covers the meaning of the sense in the second dictionary and is applicable to further meanings narrower The sense in the first dictionary is entirely covered by the sense of the second dictionary, which is applicable to further meanings ...
... version of Webster’s dictionary from 1913. Estonian We used the EKS Dictionary of Estonian and the PSV Basic Estonian Dictionary (Kallas et al., 2014). German We used the German versions of OmegaWiki and Wiktionary. Hungarian We linked the Explanatory Dictionary of Hungarian (1959-1962) ...
... 4,500 DDO lemmas (of 97,500 in the dictionary). The lemma intersection (86%) with ODS was selected for our task. Dutch We used the Woordenboek der Nederlandsche Taal (Dictionary of the Dutch Language, WNT) and the Algemeen Nederlands Woordenboek (Dictionary of Contemporary Dutch, ANW). The Dutch ...
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others. "A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment" in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, European Language Resources Association (ELRA) (2020)
Electronic Dictionaries - from File System to lemon Based Lexical Database

Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić (2018)

In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...

... J. (1998). Electronic dictionary encoding: Customizing the TEI guidelines. In Proc. Euralex. Villegas, M. and Bel, N. (2015). PAROLE/SIMPLE ’lemon’ ontology and lexicons. Semantic Web, 6:363–369. Vitas, D., Pavlović-Lažetić, G., and Krstev, C. (1993). Electronic dictionary and text processing ...
... automatic generation of dictionary candidates, with their lexical and derivation variants. This automatic procedure enabled migration of all 26 simple word and 15 multi-word unit Serbian dictionary files with more than 150,000 lexical entries. Keywords: lexical database, lemon, electronic dictionaries, lexical ...
... implemented for the purpose of further development and management of morphological electronic dictionaries of Serbian (SMD), presented in more detail in Section 3. However, with the growing number of dictionary developers, and given the variety of dictionaries and information stored in them (proper ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
Terminological and lexical resources used to provide open multilingual educational resources

Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić (2016)

Open educational resources (OER) within BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) network will be available in different languages, mostly in the languages of Western Balkans, Russian and English. University of Belgrade (UB) hosts a central repository based on: BAEKTEL Metadata Portal (BMP), terminological web application for management, browse and search of terminological resources, web services for linguistic support (query expansion, information retrieval, OER indexing, etc.), annotation of selected resources and OER repository on local edX ...

open educational resources, lexical resources, natural language processing, terminology

... The last one ensured the infrastructure for online terminological resources, such as electronic dictionaries and term bases, which can be monolingual, bilingual or multilingual. Additionally, it strengthened the ...
... org/rest/dc/4024 http://www.macmillandictionary.com/dictionary/british/terminology http://www.isocat.org/rest/dc/4024 ...
... Linguistique) format. There are two types of dictionaries: dictionary of simple words and dictionary of compounds. Two main components of dictionary of simple words are DELAS and DELAF. Here we have an entry found in Serbian dictionary of simple words: učiteljica, N651+Hum+GM:fs4v. The first part ...
Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić. "Terminological and lexical resources used to provide open multilingual educational resources" in The Seventh International Conference on eLearning (eLearning-2016), 29-30 September 2016, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2016)
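
The simple-word entry quoted in the excerpt for this item (`učiteljica, N651+Hum+GM:fs4v`) can be split into its parts with a few lines of Python. This is a minimal sketch under stated assumptions: the pattern, the `SimpleEntry` type and the field names (`code`, `markers`, `tags`) are hypothetical labels chosen for this example, and the interpretation of each part is inferred from the excerpt rather than taken from the paper.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SimpleEntry:
    form: str                                     # text before the comma
    code: str                                     # e.g. "N651"
    markers: list = field(default_factory=list)   # "+"-prefixed items, e.g. ["Hum", "GM"]
    tags: list = field(default_factory=list)      # ":"-prefixed items, e.g. ["fs4v"]


# Assumed layout: form, code(+marker)*(:tags)*
ENTRY_RE = re.compile(
    r"^(?P<form>[^,]+),\s*(?P<code>[^+:\s]+)"
    r"(?P<markers>(?:\+[^+:\s]+)*)(?P<tags>(?::[^:\s]+)*)$"
)


def parse_simple_entry(line):
    """Parse one entry line; raise ValueError if it does not fit the assumed layout."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    markers = [m for m in match.group("markers").split("+") if m]
    tags = [t for t in match.group("tags").split(":") if t]
    return SimpleEntry(match.group("form").strip(), match.group("code"), markers, tags)


entry = parse_simple_entry("učiteljica, N651+Hum+GM:fs4v")
print(entry)
```

Keeping markers and tags as separate lists makes it straightforward to filter entries, for instance by selecting every entry that carries a given marker.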
An Approach to Development of Bilingual Lexical Resources

Stanković Ranka, Obradović Ivan, Trtovac Aleksandra (2012)

... l equivalents, and this pair was also entered into the new dictionary. 4.4 Absence of Terms In some cases the concordances revealed an absence of adequate terms for a concept in both available lexical resources. Terms electronic learning and e-learning and their Serbian translational equivalents ...
... of the resources. Hence the English synset {electronic learning, e-learning} and its Serbian counterpart {elektronsko učenje, e-učenje} were entered into Biblimir. Serbian term semantički veb does not exist in available resources. The Dictionary of Librarianship uses English orthography {semantički ...
... glish Dictionary of Library and Information Science technology (further referred to as Dictionary of Librarianship) [Kovačević et al., 2004]. An analysis of results obtained by Bibliša revealed that in some cases the available bilingual resources, namely the wordnets and the Dictionary of ...
Stanković Ranka, Obradović Ivan, Trtovac Aleksandra. "An Approach to Development of Bilingual Lexical Resources" in Proceedings of the Fifth Balkan Conference in Informatics BCI 2012, Workshop on Computational Linguistics and Natural Language Processing of Balkan Languages – CLoBL 2012, September 2012, Novi Sad : BCI (2012)
An Approach to Efficient Processing of Multi-Word Units

Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas (2013)

Efficient processing of Multi-Word Units in the course of development of morphological MWU dictionaries is not easy to achieve, especially when languages with complex morphological structures are concerned, such as Serbian. Manual development of this type of dictionaries is a tedious and extremely slow process. To alleviate this problem we turned to our multipurpose software tool, dubbed LeXimir, in the production of lemmas for e-dictionaries of multi-word units. In addition to that, we developed a procedure aimed at making ...

Natural Language Processing, Grammatical Category, Lexical Representation, MWU, multi-word unit

... Lingvisticae Investigationes (2002) 12. Mota, C., Carvalho, P., Ranchhod, E.: Multiword lexical acquisition and dictionary formalization. In: Proceedings of the Workshop Enhancing and Using Electronic Dictionaries, Coling’2004, pp. 73–77. Geneva, Switzerland (2004) 13. Paumier, S.: Unitex 2.1 User Manual ...
... been produced for many other languages. This format can be briefly described in the following way: in a dictionary of lemmas (DELAS) every lemma is described in full detail so that a dictionary of forms containing all necessary grammatical information (DELAF) can be generated from it, and subsequently ...
... Serbian MWUs 104 such transducers were developed — 18 for adjectives and 86 for nouns. By analogy with entries in a dictionary of simple word lemmas, an entry in a DELAC dictionary consists of a MWU lemma to which a name of an inflectional transducer (similar to the one represented in Figure 1) is ...
Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas. "An Approach to Efficient Processing of Multi-Word Units" in Computational Linguistics - Applications, Studies in Computational Intelligence 458 no. 458, Berlin Heidelberg : Springer-Verlag (2013): 109-129. https://doi.org/10.1007/978-3-642-34399-5_6
Combining Heterogeneous Lexical Resources

Cvetana Krstev, Duško Vitas, Ranka Stanković, Ivan Obradović, Gordana Pavlović-Lažetić (2004)

development of lexical resources, morphological dictionaries, WordNet

... used for the production of electronic resources, and almost none exist in electronic form, the Serbian resources presented in this paper have been manually produced, checked and double checked. Our standpoint is that only when reliable lexical resources in electronic form are fully developed it ...
... ones are: • The system of morphological dictionaries of Serbian (SMD) in Intex format (Silberztein, 2000), that consists of a dictionary of simple lemmas, a dictionary of compounds (under construction), the corresponding dictionaries of word forms, and morphological finite-state automata that ...
... classes of lemmas. The current size of SMD of simple lemmas is around 65.000, and they produce a dictionary of word forms with more than 930.000 entries. An example of an entry in the dictionary of simple lemmas (DELAS) is: (1) devojcyin,A1+Pos+Ek The information that has to be assigned to ...
Cvetana Krstev, Duško Vitas, Ranka Stanković, Ivan Obradović, Gordana Pavlović-Lažetić. "Combining Heterogeneous Lexical Resources" in Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, May 2004, vol. 4, ELRA - European Language Resources Association (2004)
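
The snippets above quote DELAS entries of the form lemma,Code+Marker+Marker, for example devojcyin,A1+Pos+Ek. As a rough illustration only, the following Python sketch splits such a line into its parts. The names DelasEntry and parse_delas are hypothetical and do not come from any of the listed papers or tools, and the sketch assumes a simplified format with no escaped commas and no inflectional codes after a colon.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DelasEntry:
    """Hypothetical container for one simplified DELAS-style entry."""
    lemma: str                      # text before the first comma
    inflection_class: str           # first '+'-separated item, e.g. "A1"
    markers: List[str] = field(default_factory=list)  # remaining items


def parse_delas(line: str) -> DelasEntry:
    """Split 'lemma,Code+Marker+...' into its parts (simplified assumption)."""
    lemma, sep, info = line.partition(",")
    lemma, info = lemma.strip(), info.strip()
    if not sep or not lemma or not info:
        raise ValueError(f"not a lemma,Code+... entry: {line!r}")
    code, *markers = info.split("+")
    return DelasEntry(lemma, code, markers)


if __name__ == "__main__":
    # Entry quoted in the "Combining Heterogeneous Lexical Resources" snippet.
    print(parse_delas("devojcyin,A1+Pos+Ek"))
```

Real DELA dictionaries allow more than this, so the sketch is meant only to make the quoted entry layout concrete.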
