Search ⚒ Papers ⚒ DR RGF - RGF Repository

92 items

A Note on Dictionary Digitization

Duško M. Vitas, Cvetana J. Krstev, Ranka M. Stanković (2019)

This paper analyzes the limitations that arise from the linear process of traditional dictionary production, using the SASA Dictionary as an example. These limitations can be overcome by building an electronic lexicographic database that is more than a mere digital transcription of the paper edition. The paper stresses that the text of a dictionary can itself serve as a corpus, and presents selected examples of analysis of such a corpus compiled from the texts of volumes 1 and 19 of the SASA Dictionary.

lexicography, computational lexicography, informatics, information system

... Vitas, Cvetana Krstev, Ranka M. Stanković A NOTE ON A DICTIONARY DIGITIZATION Summary The paper analyses the limitations of the linear process of the traditional dictionary production and illustrates them on the example of the SASA Dictionary. These limitations can be overcome by the establishment ...
... n of a paper dictionary edition. It is additionally stressed that a dictionary text itself represents a valuable corpus for various research purposes, which is illustrated by a few examples of the analysis of such a corpus compiled from the first and 19th volumes of the SASA Dictionary. ...
... (Pavlović-Lažetić 1996), the choice of how the corpus is organized and exploited, as well as the choice of a dictionary writing system (DWS), which must be tightly coupled with the structure of the database and the corpus. Decisions about these two components are therefore most closely tied to the conception ...
Duško M. Vitas, Cvetana J. Krstev, Ranka M. Stanković. "Белешка о дигитализацији речника" (A Note on Dictionary Digitization) in Српски језик и његови ресурси (The Serbian Language and Its Resources), International Slavic Center, Faculty of Philology, University of Belgrade (2019). https://doi.org/10.18485/msc.2019.48.3.ch3
Whose Example Is It? An Analysis of Lexical Features in the Examples of the SASA Dictionary

Branislava B. Šandrih, Ranka M. Stanković, Mirjana S. Gočanin (2019)

This paper asks whether the author of a text can be identified by analyzing only its lexical features. To try to answer this question, we examined the examples given within the dictionary entries of individual lexemes of the SASA Dictionary, as recorded in five volumes (I, II, XVIII, XIX and XX). Each example is taken from a source, which is indicated by an abbreviation given in parentheses. Of the more than 5,000 sources offered, we opted for ...

authorship identification, lexical features, feature analysis, examples of the SASA Dictionary

... GDEX: Automatically Finding Good Dictionary Examples in a Corpus, In E. Bernal & J. DeCesaris (eds.). Proceedings of the XIII EURALEX International Congress, Barcelona: Universitat Pompeu Fabra, 425–432. Kosem 2017: Iztok Kosem, Dictionary examples, In Dictionary of Modern Slovene: Problems and ...
... get an answer, we observed examples that support lexical entries listed in five of the total of twenty volumes of the Dictionary of Serbian Academy of Science and Arts. Each dictionary example is documented with its author, so we decided to examine only examples that originate from twelve great names in ...
... et al. 2018: Iztok Kosem, Kristina Koppel, Tanara Zingano Kuhn, Jan Michelfeit & Carole Tiberius, Identification and Automatic Extraction of Good Dictionary Examples: the Case(s) of GDEX, International Journal of Lexicography. ...
Branislava B. Šandrih, Ranka M. Stanković, Mirjana S. Gočanin. "Чији је пример? Анализа лексичких обележја на примерима Речника САНУ" (Whose Example Is It? An Analysis of Lexical Features in the Examples of the SASA Dictionary) in Српски језик и његови ресурси (The Serbian Language and Its Resources), International Slavic Center, Faculty of Philology, University of Belgrade (2019). https://doi.org/10.18485/msc.2019.48.3.ch13
The SASA Dictionary as a Base for Terminological Dictionaries (on the Example of a Culinary Dictionary)

Rada Stijović, Olga Sabo, Ranka Stanković (2017)

... obtained list was compared to a list of the lexical entry of the SASA dictionary; extracted the entries from the Dictionary that have no information about the culinary domain (should be amended), and words that do not exist in the Dictionary (should be entered). In this research are also used syntactic ...
... and work on the Dictionary is sped up. SASA DICTIONARY AS A BASE FOR TERMINOLOGICAL DICTIONARIES (ON THE EXAMPLE OF CULINARY VOCABULARY) This paper discusses the possibility of creating a culinary vocabulary based on the culinary lexicon contained in the Dictionary of Serbo-Croatian Literary and folk ...
... lexicon contained in the SASA Dictionary and show how traditional vocabulary during the digitization process becomes a base for terminology dictionaries that allows different uses. In addition, the process of digitalization and modernization of work on the SASA Dictionary provide its enrichment with ...
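Rada Stijović, Olga Sabo, Ranka Stanković. "Речник САНУ као база терминолошких речника (на примеру речника кулинарства)" (The SASA Dictionary as a Base for Terminological Dictionaries, on the Example of a Culinary Dictionary) in Словенска терминологија данас (Slavic Terminology Today), Belgrade: Serbian Academy of Sciences and Arts (2017)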
Development Of The Serbian Geological Resources Portal

Ranka Stanković, Jelena Prodanović, Olivera Kitanović, Velizar Nikolić (2011)

... ences related to the selected dictionary entry are displayed, as well as terms of hyponym and hypernym concepts. The dictionary can also be searched with the use of key words. After entering a string of characters (word or part of a word), the user is offered a list of dictionary entries where the given ...
... of web services and web applications which consume them. Further steps encompass the creation of a lexicon of mapped units, and integration of the dictionary and cartographic representation of spatial objects in which they appear. Further publication of results of both recent, as well as older projects ...
... Apatin. (In Serbian). STANKOVIĆ, R., TRIVIĆ, B., KITANOVIĆ, O., BLAGOJEVIĆ, B., NIKOLIĆ, V., 2011. “The Development of the GeolISSTerm Terminological Dictionary”, INFOteka: časopis za informatiku i bibliotekarstvo, 12/1, Belgrade. ESRI: GIS and mapping software, http://www.esri.com, ESRI Developer network ...
Ranka Stanković, Jelena Prodanović, Olivera Kitanović, Velizar Nikolić. "Development Of The Serbian Geological Resources Portal" in Proceedings of the 17th Meeting of the Association of European Geological Societies, Belgrade, Serbia : The Serbian Geological Society (2011)
Bilingual lexical extraction based on word alignment for improving corpus search

Jelena Andonovski, Branislava Šandrih, Olivera Kitanović (2019)

Library and Information Sciences, Computer Science Applications

Jelena Andonovski, Branislava Šandrih, Olivera Kitanović. "Bilingual lexical extraction based on word alignment for improving corpus search" in The Electronic Library, Emerald (2019). https://doi.org/10.1108/EL-03-2019-0056
GIS Application Improvement with Multilingual Lexical and Terminological Resources

Ranka Stanković, Ivan Obradović, Olivera Kitanović (2010)

... dictionary of simple words contains 122,000 lemmas, which can generate approximately 1,400,000 different lexical words. The Serbian morphological dictionary of compounds contains about 4,300 lemmas (generating more than 70,000 different forms) and it is being constantly upgraded. Inflectional finite ...
... comprehensive dictionary of Serbian compounds is a tedious task. In the attempt to alleviate this problem, we have developed a procedure for automatic creation of lemmas for a given list of compounds (Stanković, 2008b). This procedure is based on rules and relies on data from morphological dictionaries ...
... technical applications, as is the case here. Namely, it often happens that a technical term, which is frequently a compound, is not in the morphological e-dictionary of compounds. For example, in order to determine the third inflectional transducer for kvarcna stena ‘quartz rock’, the following ...
Ranka Stanković, Ivan Obradović, Olivera Kitanović. "GIS Application Improvement with Multilingual Lexical and Terminological Resources" in Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valetta, Malta, May 2010, Valetta, Malta : European Language Resources Association (2010)
A Lexical Approach to Acronyms and their Definitions

Cvetana Krstev, Duško Vitas, Ranka Stanković (2015)

In this paper we present a comprehensive approach to acronyms for Natural-Language Processing (NLP) of Serbian texts. The proposed procedure includes extraction of acronyms and their definitions, which are usually Multi-Word Units (MWUs), shallow parsing of MWUs that enables MWU lemmatization, and production of entries in morphological electronic dictionaries, both for MWUs and acronyms, that are provided with grammatical, syntactic, semantic and domain information. This approach enables a representation that reflects the complex relations between acronyms and their definitions.

... Party, not that DS is an acronym for Zoran Djindjić. 3. Lemmatizing the MWU names from the list obtained in Step 2 in order to obtain names in a dictionary form, normally in the singular, nominative case, sometimes in the plural. (3) KFOR - Medjunarodna mirovna snaga na Kosovu (nominative, singular) ...
... (A:fp1). This becomes a value of a variable $a$ (upper part of the graph in Fig. 2), and its lemma ($a.LEMMA$) is retrieved from the following e-dictionary lines (lower part of the same graph): (8) mirovne,mirovan.A:aefs2g mirovne,mirovan.A:aefp1g Footnote 4: In Unitex complex grammars can be modelled by using ...
... input is different and the used e-dictionaries as well. For the same example as before and the form (simple word lemma) mirovan the following e-dictionary lines are used: (10) mirovan,mirovne.A:aefs2g mirovan,mirovne.A:aefp1g This form of e-dictionaries is obtained from the previous form by exchanging ...
Cvetana Krstev, Duško Vitas, Ranka Stanković. "A Lexical Approach to Acronyms and their Definitions" in Proceedings of the 7th Language & Technology Conference, November 27-29, 2015, Poznań, Poland, Springer (2015)
Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities

Cvetana Krstev, Ranka Stanković, Aleksandra Marković, Teodora Mihajlov (2024)

This paper presents the work on developing the ELEXIS-sr corpus, the Serbian addition to the ELEXIS multilingual annotated corpus, which consists of semantic annotations and a word sense repository. ELEXIS is a parallel multilingual annotated corpus in ten European languages that can be used as a multilingual benchmark for evaluating low- and medium-resourced European languages. The focus of this paper is on multiword expressions and named entities, their recognition in the ELEXIS-sr sentence set, and comparison with annotations in other languages. The paper discusses the first steps ...

multiword units, named entity, word sense ambiguity, sense repository, LLOD

Cvetana Krstev, Ranka Stanković, Aleksandra Marković, Teodora Mihajlov. "Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024)
Managing mining project documentation using human language technology

Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja (2018)

Purpose: This paper aims to develop a system, which would enable efficient management and exploitation of documentation in electronic form, related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ...

Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation

Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239
A Description of Morphological Features of Serbian: a Revision using Feature System Declaration

Cvetana Krstev, Ranka Stanković, Vitas Duško (2010)

In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...

Morphology, Lexicon, lexical database, Standards for LRs

... languages (Przepiórkowski, 2003). On the other hand, several applications developed in the frame of LADL e-dictionary format use their own morphological descriptions, most notably ELAG for morphological disambiguation (Laporte & Monceaux, 1999) and Multiflex for compound inflection (Savary, 2008). When ...
... description of morphological features MULTEXT-East (Erjavec, 2004) was used in several projects (Kešelj et al., 2004), (Popović, 2009). Serbian morphological dictionaries of simple words and compounds developed in the LADL format (Courtois & Silberztein, 1990) use different morphological description ...
... A Description of Morphological Features of Serbian: a Revision using Feature System Declaration, Cvetana Krstev, Ranka Stanković, Vitas Duško, Digital Repository of the Faculty of Mining and Geology, University of Belgrade [DR RGF] ...
Cvetana Krstev, Ranka Stanković, Vitas Duško. "A Description of Morphological Features of Serbian: a Revision using Feature System Declaration" in Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valetta, Malta : European Language Resources Association (2010)
An approach to Implementation of blended learning in a university setting

Ivan Obradović, Ranka Stanković, Olivera Kitanović, Jelena Prodanović (2011)

... we have developed an electronic dictionary of basic GIS terms. Besides the English and the Serbian term, each dictionary entry contains a short definition of the term in both languages, but without any relations between equivalents. An example of a dictionary entry, in English, and then in ...
... or in a multi-user relational database ... http://www.esri.com/ http://edn.esri.com/ There is also a rather developed dictionary of statistical terms, organized in a somewhat different manner, containing both the Serbian and the English equivalent within the same entry ...
... Given that a Serbian thesaurus of geological terms is already developed (http://geoliss.ekoplan.gov.rs/term) and that it contains more than 3000 dictionary entries and the same number of English equivalents, and that the development of an ontology related to mining is underway, we now plan to connect ...
Ivan Obradović, Ranka Stanković, Olivera Kitanović, Jelena Prodanović. "An approach to Implementation of blended learning in a university setting" in Proceedings of the Second International Conference on e-Learning, eLearning 2011, September 2011, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2011)
Improvement of geodatabase queries within GeolISS

Ranka Stanković (2008)

... substantially improved by using various lexical resources, such as morphological dictionaries and a geological dictionary. These lexical resources used within WS4QE (Workstation for query expansion) enable semantic and morphological expansion of the query, the latter being very important in highly ...
... can be substantially improved by using various lexical resources. Morphological dictionaries enable morphological expansion of the query, very important in highly inflective languages, such as Serbian. The geological dictionary, developed within GeolISS, supports semantic and multilingual expansions ...
... retrieved result. WS4LR handles simultaneously several types of resources, one of them being the system of morphological dictionaries of Serbian simple words and compounds in LADL format. Morphological dictionaries in the same format exist for many other languages, including French, English, Greek, Portuguese ...
Ranka Stanković. "Improvement of geodatabase queries within GeolISS" in Review of the National Center for Digitization, Beograd : Faculty of Mathematics, Belgrade (2008)
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian

Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih (2021)

Abusive speech on social media, including profanity, derogatory speech and hate speech, has reached pandemic levels. A system able to detect such texts could help make the internet and social media a better, more respectful virtual space. Research and commercial applications in this area have so far focused mainly on English. This paper presents work on building AbCoSER, the first corpus of abusive speech in Serbian. The corpus consists of 6,436 manually annotated ...

abusive language, hate speech, Serbian, Twitter, lexicon, corpus

... and lexicons from other languages, lexicons of sentiment words and expressions, rhetorical figures, etc. To expand the dictionary, synsets from the Serbian WordNet and the dictionary of synonyms will be used for linking with Twitter examples. Regarding the categorization of terms in the lexicon, the ...
... and it is calculated based on the number of different meanings in the comprehensive explanatory dictionary of Serbian, and need to match neither corpus nor probability of use. An excerpt from the dictionary for the word lopov (thief) is presented in Listing 1. It can be seen that this word can be used ...
... xuslinguarum.eu/ ... Listing 1: An excerpt from the XML version of the dictionary. lopov
Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13
Sentiment Analysis of Serbian Old Novels

Ranka Stanković, Miloš Košprdić, Milica Ikonić Nešić, Tijana Radović (2022)

In this paper we present the first study of Sentiment Analysis (SA) of Serbian novels from the 1840-1920 period. The preparation of the sentiment lexicon was based on three existing lexicons: NRC, AFINN and Bing, with additional extensive corrections. The first phase of dataset refinement included filtering out words that are not found in the Serbian morphological dictionary; in the second phase, the automatic POS tags and lemmas were manually corrected. The polarity lexicon was extracted and transformed into OntoLex-Lemon and published as initial ...

sentiment lexicon, sentiment analysis, distant-reading, machine learning, old novels

Ranka Stanković, Miloš Košprdić, Milica Ikonić Nešić, Tijana Radović. "Sentiment Analysis of Serbian Old Novels" in Proceedings of the 2nd Workshop on Sentiment Analysis and Linguistic Linked Data, June 2022, Marseille, France, European Language Resources Association (2022)
Indexing of textual databases based on lexical resources: A case study for Serbian

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović (2015)

In this paper we describe an approach to improving information retrieval results for large textual databases by pre-indexing documents using bag-of-words and Named Entity Recognition. The approach was applied to a database of geological projects financed by the Republic of Serbia in the last half century. Each document within this database is described by metadata, consisting of several fields such as title, domain, keywords, abstract, geographical location and the like. A bag of words was produced from these ...

... described in [1], [2]. The role of electronic dictionaries, covering both simple words and multi-word units, and dictionary finite-state transducers (FSTs) is text tagging. Each e-dictionary of forms consists of a list of entries supplied with their lemmas, morphosyntactic, semantic and other information ...
... several categories: cartographic content, multimedia, dictionaries and textual databases. The “core” is the whole information system of the Geological Dictionary (Thesaurus) containing about 4,000 geological terms described by definitions, of which about 3,000 have a translation into English. The most important ...
... integration of created indexes will enable the realization of a query expansion by adding synonyms from available resources, such as the geologic dictionary [15] for terminological query terms and WordNet for more general terms. Acknowledgement. This research was supported by the Serbian Ministry of ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Indexing of textual databases based on lexical resources: A case study for Serbian" in Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers, Springer (2015). https://doi.org/10.1007/978-3-319-27932-9_15
Keyword-Based Search on Bilingual Digital Libraries

Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović (2017)

This paper outlines the main features of Biblisha, a tool that offers various possibilities for enhancing queries submitted to large collections of aligned parallel text residing in a bilingual digital library. Biblisha supports keyword queries as an intuitive way of specifying information needs. The keyword queries initiated, in Serbian or English, can be expanded semantically, morphologically and into the other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ...

Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines

Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan (2008)

In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ...

LR web services, MultiWord Expressions & Collocations, Information Extraction, Information Retrieval

... 1. Morphological dictionaries of simple words and compounds in the so-called LADL format (Courtois et al., 1990) basically consist of lemmas accompanied with inflectional class codes, which enable a precise production of all inflectional forms. The Serbian morphological dictionary of simple ...
... simple lemmas belong to general lexica, while the remaining 32,000 lemmas represent various kinds of simple proper names. The Serbian morphological dictionary of compounds contains approximately 2,700 lemmas (yielding more than 60,000 different forms) and it is being constantly upgraded. ...
... results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic and morphological expansion of the query, the latter being very important in highly inflective languages, such as Serbian ...
Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008)
FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain

Aleksandra Marković, Ranka Stanković, Natalija Tomić, Olivera Kitanović (2021)

The paper gives a brief overview of the theory of frame semantics, on which the FrameNet lexical database is based. The conception of this network is presented, as well as the possibilities of its application. The lexical analysis applied in the FrameNet project is also presented, and the differences between frame-based analysis and word-based analysis are pointed out. Several related frames evoked by words from the risk domain are then shown. The paper also presents the NLTK platform, which can be used to ...

Serbian language, frame semantics, FrameNet, risk scenario, mining corpus, natural language processing

... also downloaded and used locally. As the website states, it can be used for different purposes: as a dictionary for language learning (since it contains more than 13,000 LUs); as a valence dictionary; as a training dataset for semantic role labeling, which makes it a rich digital language resource (with ...
... used. This was the motivation for creating an online dictionary whose entries are frames rather than lexemes, as found in paper dictionaries, providing a notation better suited to such a complex system. Conceived in such a manner, an online dictionary allows for representation of individual frame elements ...
... frequency lists, collocations, concordances with a narrower and broader context. Figure 5 shows the concordances extracted from the Leximirka digital dictionary management web app (Stanković et al. 2018) of the adjective-noun pattern containing the noun ризик (risk), while in Figure 6 there is a histogram ...
Aleksandra Marković, Ranka Stanković, Natalija Tomić, Olivera Kitanović. "FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.1
Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović (2017)

Large collections of textual documents represent an example of big data that requires the solution of three basic problems: the representation of documents, the representation of information needs and the matching of the two representations. This paper outlines the introduction of document indexing as a possible solution to document representation. Documents within a large textual database developed for geological projects in the Republic of Serbia for many years were indexed using methods developed within digital humanities: bag-of-words and named ...

... the Geological Dictionary (Thesaurus) containing 5,152 geological terms described by definitions, of which 4,839 have a translation into English. The cartographic content includes a general geological map, maps of national parks, map of endangered groundwater bodies, geo-morphological map, map of ex ...
... described in [3,7]. The role of electronic dictionaries, covering both simple words and multi-word units, and dictionary finite-state transducers (FSTs) is text tagging. Each e-dictionary of forms consists of a list of entries supplied with their lemmas, morphosyntactic, semantic and other information ...
... length, k1 = 1.2, k2 = 0.75 length normalisation; 5. Creating a dictionary of the whole document collection from all words selected in Step 4. For each term Tk in the document collection, k = 1, . . . M , where M is the size of the dictionary of document collection: (a) calculating document frequency dfk ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources" in Trans. Computational Collective Intelligence - Lecture Notes in Computer Science 26, Springer (2017). https://doi.org/10.1007/978-3-319-59268-8_8
Part of Speech Tagging for Serbian language using Natural Language Toolkit

Ranka Stanković, Boro Milovanović (2020)

While complex algorithms for NLP (natural language processing) are being developed, basic tasks such as tagging remain very important and still challenging. NLTK (Natural Language Toolkit) is a powerful Python library for developing NLP-based programs. We try to use this library to create a PoS (part-of-speech) tagger for contemporary Serbian. Eleven different models were created using the NLTK tagging API. The best models are transformed with the Brill tagger to improve accuracy. We trained the models on an annotated ...

natural language processing, machine learning, neural networks

... International Conference on Language Resources and Evaluation (LREC'14), Reykjavik, Iceland, May 2014 [14] C. Krstev and D. Vitas, “Serbian Morphological Dictionary – SMD,” University of Belgrade, HLT Group and Jerteh, Lexical resource, 2.0, 2015 [15] A. Balvet, D. Stošić, and A. Miletić, (2014). TALC-Sef ...
... of low-resource languages so there is only modest research on this topic. First attempts to create an automatic PoS tagger for Serbian relied on a dictionary. Delić et al. used custom transformations and rules [5]. Utvić created a parameter file TT11 for a TreeTagger Boro Milovanović is a PhD student ...
... two different tagsets. Tagset is a collection of tags. UD_POS is a Universal Dependency tagset [13]. N_POS is a tagset used in Serbian Morphology Dictionary [14] expanded with a gender category. From the given data we extracted token, N_POS and UD_POS tag. We stripped gender from the N_POS and got ...
Ranka Stanković, Boro Milovanović. "Part of Speech Tagging for Serbian language using Natural Language Toolkit" in 7th International Conference on Electrical, Electronic and Computing Engineering IcETRAN 2020, Academic Mind, Belgrade (2020)
