Search
49 items
-
From DELA Based Dictionary to Leximirka Lexical Database
Biljana Lazić, Mihailo Škorić (2020)
In this paper, we will present an approach in transforming Serbian language Morphological dictionaries from a DELA text format to a lexical database dubbed Leximirka. Considering the benefits of storing data within a database when compared to storing them in textual documents, we will outline some of the functionality that the database has made possible. We will also show how hand-made rules that use category labels lexical entries are marked with can be used to link lexical entries. ...
... Polish, Greek, Russian etc. The system of morphological dictionaries is based on the theory of finite-state automata, namely on morphological and local grammars in the form of finite-state transducers that generate all morphological forms of words in the dictionary (Krstev, 2008). 2 Laboratoire ...
... researchgate.net/publication/265297624_A_SKOS-based_Schema_for_TEI-encoded_Dictionaries Gross, Maurice. “The construction of local grammars”. In Finite State Language Processing eds. Emmanuel Roche and Yves Schabes (1997): 329–354, accessed September 1, 2015. https://halshs.archives-ouvertes.fr/ ha ...
... cookbook.pdf Paumier, Sébastien. Unitex User Manual, Université Paris-Est Marne-la-Vallée, 2016 Savary, Agata. “Multiflex: A Multilingual Finite-State Tool for Multi-Word Units”. In Implementation and Application of Automata, 14th International Conference, CIAA 2009, Sydney, Australia, July 14-17 ...
Biljana Lazić, Mihailo Škorić. "From DELA Based Dictionary to Leximirka Lexical Database" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.4
-
Serbian NER&Beyond: The Archaic and the Modern Intertwinned
In this paper we present the Serbian literary corpus being developed under the umbrella of COST Action “Distant Reading for European Literary History” CA16204. Using this corpus of novels written more than a century ago, we developed and made publicly available a Named Entity Recognition (NER) system trained to recognize 7 different named entity types, with a convolutional neural network (CNN), which has an F1 score of ≈91% on the test dataset. This model was further evaluated on a separate evaluation dataset. We conclude with a comparison ...
... existence of large-scale lexical resources for Serbian, e-dictionaries in particular (Krstev, 2008), coupled with local grammars in the form of finite-state transducers (Vitas and Krstev, 2012), enabled the development of a comprehensive rule-based system for NER SrpNER. This system presented by Krstev ...
... words, and need to deal with numerous XML-TEI tags used to preserve the format of original editions. Authors’ solution was based on the cascades of finite-state automata and both general dictionaries and those built specifically for the project. The evaluation showed that the slot error rate of name tagging ...
Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić. "Serbian NER&Beyond: The Archaic and the Modern Intertwinned" in Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications, INCOMA Ltd. Shoumen, BULGARIA (2021). https://doi.org/10.26615/978-954-452-072-4_141
-
Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis
This paper presents a model that enables the collection, preparation, metadata description, management, and exploitation, including full-text search, of documents from the domain of criminalistics written in Serbian. The proposed approach is applied in a web portal that gathers various texts originating from the journals of the Academy of Criminalistic and Police Studies, the Criminal Code of Serbia, the “Tara” and “Reiss” conferences, and some doctoral dissertations related to this field of research. After text processing, a corpus containing over 5500 pages of plain text was created and ...
... English WordNets, terminological databases: Termi, GeolISSTerm, RudOnto and Librarian dictionary. Apart from the grammars in the form of finite state automata and transducers, the system uses rules for inflection of multiword units. Among textual resources, the most important are digital libraries, Unitex corpora ...
Dalibor Vorkapić, Aleksandra Tomašević, Miljana Mladenović, Ranka Stanković, Nikola Vulović. "Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis" in International Scientific Conference “Archibald Reiss Days” Thematic Conference Proceedings Of International Significance, Belgrade, 7-9 November 2017, Academy Of Criminalistic And Police Studies Belgrade (2017)
-
Building learning capacity by blending different sources of knowledge
... for full functionality of the language support system grammars are also needed, and they are implemented by the so called finite state automata, finite state transducers and compound inflection rules (Krstev, 2008). Another important lexical resource offering support for multilingual terminology ...
... languages such as English and Russian, are also envisaged. Given this variety of languages within the network, a language support system, based on state of the art language technology, is put in place to support multilinguality, but also terminology issues and query handling. Finally, besides ...
... enterprises. On the other hand, once they graduate and become employees, they will have an opportunity for life-long learning, through access to state of the art high quality academic courses. They could thus continue with their professional development in a way more in line with their professional ...
Ivan Obradović, Ranka Stanković, Olivera Kitanović, Dalibor Vorkapić. "Building learning capacity by blending different sources of knowledge" in International Journal of Learning and Intellectual Capital (2016). https://doi.org/10.1504/IJLIC.2016.075698
-
An Integrated Environment for Management and Exploitation of Linguistic Resources
Ranka Stanković, Ivan Obradović (2009)
... International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy, pp. 1692-1697, 2006. [7] M. Silberztein, “INTEX: a Finite State Transducer toolbox”, in Theoretical Computer Science, vol. 231, no. 1, pp. 33-46, Jan 2000. [8] S. Paumier, “Manuel d’utilisation du logiciel ...
... categories, etc). Finally, the module for management of dictionaries allows access to an editor of regular expressions, namely, transducers that describe the inflectional characteristics of a chosen lemma or class. This completes the set of tools the user needs in order to be able ...
Ranka Stanković, Ivan Obradović. "An Integrated Environment for Management and Exploitation of Linguistic Resources" in Proceedings of the International Multiconference on Computer Science and Information Technology, Computational Linguistics – Applications Workshop (CLA09), Mrągowo, Poland, October 2009, Piscataway : IEEE (2009)
-
Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++
Branislava Šandrih, Ranka Stanković (2020)
In science, industry, and many research fields, terminology develops rapidly. Most often, the language that serves as the “lingua franca” for most of these fields is English. As a consequence, in many fields domain terms are coined in English and later translated into other languages. In this paper we present an approach to the automatic extraction of bilingual terminology for the English-Serbian language pair that relies on an aligned bilingual domain corpus, a terminology extractor for the target language, and a tool for chunk alignment. We examine the performance of the method on the domain ...
... extraction. The first module is a rule-based system relying on e-dictionaries and local grammars developed in Unitex that are implemented as finite-state transducers (FST). The second module implements various statistical measures used for ranking of term candidates. In this research the system was tuned ...
... Vol. 19, No. 2, December 2019 119 Šandrih B., Stanković R., “Extraction of Bilingual ...”, pp. 119–138 ekrana (namely, a photo of a current state of the screen) or as a “skrinšot” (i.e., the word is transcribed). It is not uncommon that even experts from a certain field have difficulties while ...
Branislava Šandrih, Ranka Stanković. "Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.6
-
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources
In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being ...
Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Two approaches to compilation of bilingual multi-word terminology lists from lexical resources" in Natural Language Engineering, Cambridge University Press (CUP) (2020). https://doi.org/10.1017/S1351324919000615
-
Developing Termbases for Expert Terminology under the TBX Standard
... al dictionaries [13]. Some examples from the simple word DELAS dictionary of terms related to mining and geology, represented by their lemmas, transducers for their respective inflection classes and semantic markers are: elektrovod,Ni+RudOntot+Elektro aerozagadenje,N300+RudOntotEkolog hidrogeoloski ...
... challenged by machine translation (MT), especially statistical machine translation (SMT), an approach developed at IBM in the late 1980s, now the state-of-the-art paradigm in MT. The exponential growth of aligned multilingual corpora greatly improved the efficiency and accuracy of SMT in general, and ...
... VIII, 2001. Alan K. Melby. Terminology in the Age of Multilingual Corpora. The Journal of Specialized Translation, 18:7-29, 2012. Uwe Reinke. State of the Art in Translation Memory Technology. Translation: Computation, Corpora, Cognition, 3(1), 2013. Laurent Romary. TBX Goes TEI - Implementing ...
Ranka Stanković, Ivan Obradović, and Miloš Utvić. "Developing Termbases for Expert Terminology under the TBX Standard" in Natural Language Processing for Serbian - Resources and Applications, Belgrade : University of Belgrade, Faculty of Mathematics (2014)
-
Multiple-Criteria Decision-Making in Mine Development Planning
Sanja Bajić (2023)
The Borska Reka ore deposit is an experimental location where developed methodologies have been applied. It is the largest ore body within the Bor mining complex, which has been the subject of numerous studies and analyses for more than three decades. The paper focuses on the application of FAHP and the VIKOR method to address ranking of alternatives and select the optimal mining method by means of fuzzy multicriteria optimization.
Sanja Bajić. "Multiple-Criteria Decision-Making in Mine Development Planning" in Proceedings of the 5th International Underground Excavations Symposium, 5-6-7 June 2023, Istanbul, Topkapi : Dinç Ofset Mat. Rek. San. ve Tic. Ltd. Şti. (2023)