Search
454 items
-
A WordNet Ontology in Improving Searches of Digital Dialect Dictionary
In this paper, we present a method for automatic generation of a digital resource, which connects all indirect synonyms of a dialect term to all indirect synonyms of a corresponding term in the standard language, aiming to improve the search of a digital dialect dictionary. The method uses SWRL rules defined in the Serbian WordNet ontology to identify sets of synonymous words. It also uses e-dictionaries to produce correct lemmas in standard language that users usually employ in searches. ...... distribution of a language and geographical information relevant for linguistic research [9], using of Semantic Web-based techniques for representing digital resources as knowledge based resources and as Linked Open Data (LOD) on the Web [6], [15]. Digital dictionary of the South Serbian dialect*, ...
... terms in order to improve search. 3 Resources 3.1 Use of morphological e-dictionaries The first problem with search of verbs in dialect dictionary is the grammatical form of the headword of the lexical entry. Namely, grammatical form of the headword of the verb lexical entry in the dialect dictionary is ...
... given context. Synsets respect the syntactic categories noun, verb, adjective, and adverb and can be interconnected by semantic relations, while word forms can be connected by lexical relations. In SWN ontology there are currently 2,243 verb synsets defined as ontology individuals belonging to the VerbSynset ...Miljana Mladenović, Ranka Stanković, Cvetana Krstev. "A WordNet Ontology in Improving Searches of Digital Dialect Dictionary" in New Trends in Databases and Information Systems: ADBIS 2017 Short Papers and Workshops - SW4CH (Semantic Web for Cultural Heritage) 767, Springer International Publishing (2017). https://doi.org/10.1007/978-3-319-67162-8_37
-
An aproach to Implementation of blended learning in a university setting
... These dictionaries are not proper lexical resources, since the current format implemented in Moodle is not very flexible. However, we plan to work on this issue and enable the use of lexical resources and ontologies in Moodle, as multilingual language resources and ontologies are gradually becoming ...
... knowledge – the Semantic Web [5]. As the main Semantic Web tool, ontologies can be integrated into repositories of learning objects to organize the different concepts covered by the resources stored in a so-called “knowledge domain ontology” [6]. A nice example of the use of Semantic Web services ...
... in a specific domain. In order to achieve these goals a Semantic Web based infrastructure has been designed, implemented and tested and an educational ontology has been developed. Another example of integration of semantic resources in education is the M-OBLIGE model for building multitutor ...Ivan Obradović, Ranka Stanković, Olivera Kitanović, Jelena Prodanović . "An aproach to Implementation of blended learning in a university setting" in Proceedings of the Second International Conference on e-Learning, eLearning 2011, September 2011, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2011)
-
Using technology for knowledge transfer between academia and enterprises
Ivan Obradović, Ranka Stanković (2014)... structure is outlined in Figure 3, is based on electronic language resources, namely, lexical resources, textual resources and grammars. Bilingual dictionaries in electronic form are one of the simplest multilingual lexical resources. However, for their full functionality in languages with complex ...
... tool for lexical resources management and query expansion developed at FMG (Stanković et al., 2011). Besides specific tools, the TEL platform has corresponding resources, which have already been briefly described at the beginning of this section. An important place among the resources is occupied ...
... et al., 2010) are thus also part of the lexical resources used by LSS. Besides Serbian, such resources exist for many other languages, including English and Russian, which are also envisaged as OER languages within our TEL platform. Another important lexical resource offering support for multilingual ...Ivan Obradović, Ranka Stanković. "Using technology for knowledge transfer between academia and enterprises" in Knowledge and Management Models for Sustainable Growth, Proc. of IFKAD 2014, 9th International Forum on Knowledge Asset Dynamics, 11-13 June 2014, Matera, Italy, Bari : IFKAD (2014)
-
Building learning capacity by blending different sources of knowledge
... of Belgrade. She is interested in semantic web, information systems, database modelling, geoinformation management and artificial intelligence. Her current research is focused on building custom components that incorporate knowledge from various lexical resources. ...
... support system, whose structure is outlined in Figure 3, is based on electronic language resources, namely, lexical resources, textual resources and grammars. The simplest multilingual lexical resources in general are bilingual dictionaries in electronic form. However, for their full functionality ...
... 2008). Another important lexical resource offering support for multilingual terminology is the Serbian wordnet. In brief, a wordnet consists of sets of synonymous words representing specific concepts, called synsets, with a semantic network formed on basis of semantic relations between them. ...Ivan Obradović, Ranka Stanković, Olivera Kitanović, Dalibor Vorkapić. "Building learning capacity by blending different sources of knowledge" in International Journal of Learning and Intellectual Capital (2016). https://doi.org/10.1504/IJLIC.2016.075698
-
OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian
This paper presents a new language resource for searching and exploring verbal aspectual pairs in BCS (Bosnian, Croatian and Serbian), created using Linguistic Linked Open Data (LLOD) principles. Since no resource exists to help learners of Bosnian, Croatian and Serbian as foreign languages recognize the aspect of a verb or its pairs, we created a new resource that gives users information about aspect as well as a link to the verb's aspectual pairs. The resource also contains external links to monolingual dictionaries, WordNet and BabelNet. ...Ranka Stanković, Maxim Ionov, Medina Bajtarević, Lorena Ninčević. "OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
-
Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC
OntoLex, the dominant community standard for machine-readable lexical resources in the context of RDF, Linked Data and Semantic Web technologies, is currently being extended with a dedicated module for Frequency, Attestations and Corpus-Based Information (OntoLex-FrAC). We propose a novel component for OntoLex-FrAC that addresses the incorporation of corpus queries for (a) linking dictionaries with corpus engines, (b) enabling RDF-based web services to dynamically exchange corpus queries and response data, and (c) using conventional query languages to formalize the internal structure of collocations, word sketches and ...standardization, digital lexicography, OntoLex, corpus queries, linked data, Linguistic Linked Open Data. Christian Chiarcos, Ranka Stanković, Maxim Ionov, Gilles Sérasset. "Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC" in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Turin, 20-25 May 2024, LREC (2024)
-
A bilingual digital library for academic and entrepreneurial knowledge management
A generic knowledge management process of organization, storage and retrieval of knowledge can suitably be fitted in a digital library. In the digital and knowledge age digital libraries can be used in knowledge management to handle intellectual assets and support knowledge creation. A multilingual digital library either stores content in more than one language or provides multilingual query access to monolingual content. In Serbia 18 of 308 scientific journals regularly published are bi-lingual, with papers simultaneously being in English ...... Figure 1: Bibliša: the system’s components 4.1 Lexical resources Lexical Resources are used to enhance and refine users’ queries. The query expansion is supported by e-dictionaries (Serbian morphological e-dictionaries), general purpose semantic networks (English and Serbian WordNets) and domain ...
... keywords from the lexical resources of a query language and then finds their equivalents in another language based on inter-lingual relations established in the lexical resources. After refinement of a query (e.g. deleting or adding terms manually), the system performs semantic and multilingual ...
... on the concept of a “semantic digital library”, and the main management and technical challenges derived from such an idea/ideas (Lytras, 2005). The overall requirements for a semantic approach to digital libraries are use of ontologies, lexical and terminological resources. 3.2. Introducing Bibliša ...Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić. "A bilingual digital library for academic and entrepreneurial knowledge management" in Proceeding of 10th International Forum on Knowledge Asset Dynamics — IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10-12 June 2015, Bari : IFKAD (2015)
-
An Approach to Development of Bilingual Lexical Resources
... 102 language resources such as grammars in the form of finite automata and transducers, as well as various lexical resources. Bibliša is able to expand search queries both morphologically and semantically, as well as to another language. One type of lexical resources, morphological e-d ...
... overall system architecture of this tool aimed at using and developing bilingual lexical resources. The third section describes in more detail the components aimed specifically for developing bilingual lexical resources. System modeling was realized using the Unified Modeling Language (UML) and ...Stanković Ranka, Obradović Ivan, Trtovac Aleksandra. "An Approach to Development of Bilingual Lexical Resources" in Proceedings of the Fifth Balkan Conference in Informatics BCI 2012, Workshop on Computational Linguistics and Natural Language Processing of Balkan Languages – CLoBL 2012, September 2012, Novi Sad : BCI (2012)
-
Using Lexical Resources for Irony and Sarcasm Classification
The paper presents a language dependent model for classification of statements into ironic and non-ironic. The model uses various language resources: morphological dictionaries, sentiment lexicon, lexicon of markers and a WordNet based ontology. This approach uses various features: antonymous pairs obtained using the reasoning rules over the Serbian WordNet ontology (R), antonymous pairs in which one member has positive sentiment polarity (PPR), polarity of positive sentiment words (PSP), ordered sequence of sentiment tags (OSA), Part-of-Speech tags of words (POS) ...... on and reasoning; Semantic networks; Natural language processing; Lexical semantics; KEYWORDS Computational irony, Verbal irony, Verbal Sarcasm, WordNet ACM Reference format: Miljana Mladenović, Cvetana Krstev, Jelena Mitrović, and Ranka Stanković. 2017. Using Lexical Resources for Irony and Sarcasm ...
... restriction allowed us to also find tweets mostly written in the BCMS languages. We developed a language tweet classifier that relies on lexical resources. Although resources we are using were developed for Serbian primarily, their development was based on traditional resources and texts covering to certain ...
... and understanding of ironic constructs. There are not many direct antonyms in a natural language, therefore, their number is also small in the lexical-semantic network WordNet, compared to other relations. Also, indirect antonyms are often used in natural language, that is to say, synonyms of direct antonyms ...Miljana Mladenović, Cvetana Krstev, Jelena Mitrović, Ranka Stanković. "Using Lexical Resources for Irony and Sarcasm Classification" in Proceedings of the 8th Balkan Conference in Informatics (BCI '17), New York, NY, USA : ACM (2017). https://doi.org/
-
Advantages and challenges in presenting mathematical content using EDX platform
... comprehensive research related to awareness of importance of OER materials in Serbian learning environment. In parallel, improvement of lexical resources for mathematical content in Serbian will be continued. REFERENCES [1] United Nations Educational, Scientific and Cultural Organization ...
... is an adjective but within mathematical terms in Serbian it is a noun. Thus, there is a need for developing a Semantic, Multilingual Termbase for Mathematics (SMGIoM) [11], a semantic term base with strong terminological relations and an explicit and expressive domain ontology. Such a resource ...
... how different resources can be combined in creating mathematical learning content, such as using the Termi application for mathematical terms. Some challenges in creating mathematical courses within the edX-BAEKTEL platform were pointed out. The lack of engines and resources for deeper analysis ...Marija Radojičić, Ivan Obradović, Ranka Stanković, Olivera Kitanović, Roberto Linzalone. "Advantages and challenges in presenting mathematical content using EDX platform" in The Seventh International Conference on e-Learning (eLearning-2016), Belgrade : Metropolitan University (2016)
-
Knowledge and Rule-Based Diacritic Restoration in Serbian
In this paper we present a procedure for the restoration of diacritics in Serbian texts written using the degraded Latin alphabet. The procedure relies on the comprehensive lexical resources for Serbian: the morphological electronic dictionaries, the Corpus of Contemporary Serbian and local grammars. Dictionaries are used to identify possible candidates for the restoration, while the data obtained from SrpKor and local grammars assists in making a decision between several candidates in cases of ambiguity. The evaluation results reveal that, depending on the text, accuracy ranges from 95.03% to 99.36%, while the precision (average 98.93%) is always higher than the recall (average 94.94%).... s of Linguistic Resources and Evaluation Conference, pages 1077–1082. Fellbaum, C., Ed. (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press. Gelfenbeyn, I., Goncharuk, A., Lehelt, V., Lipatov, A., and Shilo, V. (2003). Automatic translation of wordnet semantic network to russian ...
... 120–132. Kunze, C. and Lemnitzer, L. (2010). Lexical-semantic and conceptual relations in germanet. Lexical-semantic relations: Theoretical and practical perspectives, (28):163–183. Kupriyanov, V., Kossilov, A., Maximov, N., and Kupriyanova, I. (2016). A Semantic-Based Approach for Preserving Operational ...
... models of linguistic ontologies for natural language processing on the scale from more lexical to more conceptual resources. In this paper, we consider the approach to developing Russian ontological resources having the format of the RuThes thesaurus (Loukachevitch and Dobrov, 2014) and created for ...Cvetana Krstev, Ranka Stanković, Duško Vitas. "Knowledge and Rule-Based Diacritic Restoration in Serbian" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018): 41-51
-
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian
Abusive speech on social media, including profanities, derogatory speech and hate speech, has reached the level of a pandemic. A system able to detect such texts could help make the Internet and social media a better and more respectful virtual space. Research and commercial application in this area have so far focused mainly on English. This paper presents work on building AbCoSER, the first corpus of abusive speech in Serbian. The corpus consists of 6,436 manually annotated ...... Thierry Declerck, Asunción Gómez-Pérez, Jorge Gracia, Laura Hollink, Elena Montiel-Ponsoda, Dennis Spohr, et al. Interchanging lexical resources on the Semantic Web. Language Resources and Evaluation, 46(4):701–719, 2012. doi:10.1007/s10579-012-9182-3. 25 Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar ...
... usage of a hybrid approach that combines machine learning and lexical resources. Finally, a user-friendly interface that will enable the use of these resources on the Web is under development. As for the development of the lexical resources, we plan to prepare an ontology for the classification of abusive ...
... the Linked (Open) Data (LOD) paradigm that is used for publishing lexical resources by using URIs to unambiguously identify lexical entries, their components and their relations in the web of data. Moreover, it is used to make lexical data sets accessible via http(s), to publish them in accordance with ...Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13
-
Indexing of textual databases based on lexical resources: A case study for Serbian
In this paper we describe an approach to improvement of information retrieval results for large textual databases by pre-indexing documents using bag-of-words and Named Entity Recognition. The approach was applied on a database of geological projects financed by the Republic of Serbia in the last half century. Each document within this database is described by metadata, consisting of several fields such as title, domain, keywords, abstract, geographical location and the like. A bag of words was produced from these ...... morphological electronic dictionaries and finite state transducers for Serbian [6]. 4.1 Used Resources Lexical Resources. The resources for natural language processing of Serbian consisting of lexical resources and local grammars are being developed using the finite-state methodology as described in [1] ...
... Serbian. INFOtheca – Journal of Informatics & Librarianship 12(2), 36a–47a (2011) 17. Vossen, P.: EuroWordNet: a multilingual database with lexical semantic networks. Kluwer Academic Boston (1998) ...Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Indexing of textual databases based on lexical resources: A case study for Serbian" in Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers, Springer (2015). https://doi.org/10.1007/978-3-319-27932-9_15
-
Using Query Expansion for Cross-Lingual Mathematical Terminology Extraction
Velislava Stoykova, Ranka Stanković (2018)Velislava Stoykova, Ranka Stanković. "Using Query Expansion for Cross-Lingual Mathematical Terminology Extraction" in Advances in Intelligent Systems and Computing, Springer International Publishing (2018). https://doi.org/10.1007/978-3-319-91189-2_16
-
Development of Open Educational Resources (OER) for Natural Language Processing
In this paper we present the development of an online course at the edX BAEKTEL platform named “Lexical Recognition in the Natural Language Processing (NLP)”. It is based on the course of the same name for PhD studies at the University of Belgrade, Faculty of Philology. There are not many courses in Computational Linguistics (CL) on OER platforms, and there is none in Serbian either for CL or NLP. We have developed this course in order to improve this ...... the necessary knowledge to use the existing resources for NLP for Serbian and to develop new ones. Keywords: E-Learning, Open Educational Resources, Computational Linguistics, Lexical Resources, edX 1. INTRODUCTION Open educational resources (OER) publicly available on the web are growing ...
... consider complex rules for MWU inflection in Serbian. 10. The use of powerful morphological mode is presented that enables the use of lexical resources at sub-word level, as well as the use of information from e-dictionaries for output transformations by transducers. More types of variables ...
... CONCLUSION We hope that the developed OER for lexical recognition in NLP will be used in order to reduce the lack of similar courses. We hope that participants will easily acquire the necessary knowledge to use the existing resources for NLP for Serbian and that the number of resource ...Cvetana Krstev, Biljana Lazić, Ranka Stanković, Giovanni Schiuma, Miladin Kotorčević. "Development of Open Educational Resources (OER) for Natural Language Processing" in The Sixth International Conference on e-Learning (eLearning-2015), September 2015, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2015)
-
A business intelligence approach to mine safety management
Ljiljana Kolonja, Ranka Stanković, Ivan Obradović, Olivera Kitanović, Dejan Stevanović, Marija Radojičić (2016)... vocabulary. This means that all terms used within a domain need to be standardized, with a clear and unambiguous definition, accompanied by lexical and semantic relations with other terms. Examples are relations established between general and more specific terms, such as "coal mine", and "open pit" ...
... developed at FMG [8]. Recognizing the importance of ontologies as key resources for knowledge management, as well as most complex terminological resources, a methodological approach to upgrading the terminological resources developed at FMG to a system of ontologies for Serbian mining industry ...
... [6]. One of the first terminological resources in the mining domain was developed at the University of Belgrade Faculty of Mining and Geology (FMG) within the Technological coal mine information system [7]. Further growth and variety of terminological resources for specific subdomains developed ...Ljiljana Kolonja, Ranka Stanković, Ivan Obradović, Olivera Kitanović, Dejan Stevanović, Marija Radojičić . "A business intelligence approach to mine safety management" in 13th International Symposium Continuous Surface Mining, Beograd : Yugoslav Opencast Mining Committee (2016)
-
Srbija u OneGeology Europe
The Geological Survey of Serbia, as the lead institution for the OneGeology-Europe project, together with the Faculty of Mining and Geology and the Ministry of Natural Resources, Mining and Spatial Planning, joined the international OneGeology-Europe project in May 2013, when the project was already at an advanced stage. By the end of 2013 they had completed the activities needed for full inclusion in the project, giving the Republic of Serbia its place on the 1:1M Geological Map of Europe. The 1:1M Geological Map of Serbia is a compilation, that is, a simplified version of the OGK 1:500 ...... Stanković, R., Obradović, I., Kitanović, O. (2010): GIS Application Improvement with Multilingual Lexical and Terminological Resources, Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valetta, Malta, 2283-2287. Stephen M. R. and CGI Interoperability Working ...
... , Belgrade. OneGeology-Europe, Scientific/Semantic Data Specification and Dictionaries - Generic Specification for Spatial Geological Data in Europe, http://arkisto.gtk.fi/metatieto/onegeologywp3-dataspecv5.pdf, ECP-2007-GEO-317001. GeoSciML Resources repository http://www.geosciml.org (accessed ...Danka Blagojević, Ranka Stanković, Petar Stejić, Velizar Nikolić. "Srbija u OneGeology Europe" in Zapisnici Srpskog geološkog društva za 2013. godinu, Beograd : Srpsko geološko društvo (2014)
-
Towards Automatic Definition Extraction for Serbian
The paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured texts in Serbian, with the aim of speeding up dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SANU) were used to model different types of definitions (descriptive, grammatical, referential and synonymous) that have different syntactic and lexical characteristics. The research corpus consists of 61,213 noun definitions, which were analyzed using morphological e-dictionaries and local grammars implemented as finite-state transducers in a corpus processing package of open ...... lexicographic definition is the identification of the semantic content of a certain lexeme, with relevant elements of realization. The semantic content consists of an archiseme, which carries information about the lexeme belonging to a wider lexical-semantic group, and a lower-ranking seme, which carries ...
... carries information about individual characteristics of a lexeme, based on which one lexeme differs from another in the same lexical-semantic group. The basic types of lexicographic definition are descriptive, synonymous, and combined - descriptive and synonymous. A descriptive definition without synonyms ...
... light. Local grammars that model definitions of nouns (and other types of words) will contribute to the creation of dictionaries and other lexical resources in various ways. For instance, when creating a dictionary, they will be used to check the compliance of definitions with the adopted forms. In ...Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
-
Keyword Extraction from Parallel Abstracts of Scientific Publications
... previous research [14] for terminology extraction in the Serbian language used the rule-based method for multi-word term extraction that relies on lexical resources for modeling various syntactic structures of multi-word terms. It is applied in several domains, also among them is the corpus of Serbian texts ...
... other words, more sophisticated keyword extraction methods in the text preprocessing step usually use some heuristics to gain in performance by using semantic or syntactic knowledge. As the source of syntactic knowledge, methods usually use part-of-speech tags (POS) in order to restrict access to certain ...
... 19,20] or suffix sequences which denote the sequence of morphological suffixes of its words [27,29]. Wikipedia is one of the most commonly used semantic sources: using n-grams that appear in Wikipedia article titles as candidates for keywords [22], utilizing Wikipedia as a thesaurus for candidate selection ...Slobodan Beliga, Olivera Kitanović, Ranka Stanković, Sanda Martinčić-Ipšić. "Keyword Extraction from Parallel Abstracts of Scientific Publications" in Semantic Keyword-Based Search on Structured Data Sources - Third International KEYSTONE Conference, IKC 2017 Gdańsk, Poland, September 11–12, 2017 Revised Selected Papers and COST Action IC1302 Reports, Springer (2017)
-
A Lexical Approach to Acronyms and their Definitions
In this paper we present a comprehensive approach to acronyms for Natural-Language Processing (NLP) of Serbian texts. The proposed procedure includes extraction of acronyms and their definitions that are usual Multi-Word Units (MWUs), shallow parsing of MWUs that enables MWU lemmatization and production of entries in morphological electronic dictionaries, both for MWU and acronyms, that are provided with grammatical, syntactic, semantic and domain information. This approach enables representation that reflects complex relations between acronyms and their definitions.... associate them with a MWU lemma, grammatical categories, semantic and domain information, etc. (7) KFOR-u, Medjunarodne mirovne snage na Kosovu.N +NProp+Org+DOM=Mil+ACR=KFOR:ms3:ms7 KFOR-u,KFOR.ABB+NProp+Org+DOM=Mil +ACR=KFOR:ms3:ms7 3. Used Resources and Tools Corpus: As a corpus we have used an excerpt ...
... unknown words for them: applications based on machine learning techniques have not encountered them in training corpora, while those based on lexical resources do not have them listed in lexicons. However, their adequate treatment is crucial for many applications, e.g. text-to-speech systems (Taylor ...
... are looking locally for acronyms, their definitions and their variations, with a final goal to incorporate collected information into lexical resources for Serbian. In order to achieve these goals we have to deal with complex inflection of both Serbian MWUs and acronyms. We have followed these ...Cvetana Krstev, Duško Vitas, Ranka Stanković. "A Lexical Approach to Acronyms and their Definitions" in Proceedings of the 7th Language & Technology Conference, November 27-29, 2015, Poznań, Poland, Springer (2015)