Large collections of textual documents are an example of big data, and retrieving information from them requires solving three basic problems: the representation of documents, the representation of information needs, and the matching of the two representations. This paper introduces document indexing as a possible solution to document representation. Documents in a large textual database, developed over many years for geological projects in the Republic of Serbia, were indexed using methods developed within digital humanities: bag-of-words and named ...
... basen” — indexed query: istraživanje; kolubarski; basen — ‘exploration in Kolubara basin’
35 information need: rudno telo — faceted query: “rudno telo” — indexed query: rudni; Rudno; telo — ‘ore body’
1 information need: zlato Au Bor Borski okrug — faceted query: zlato; Bor; “Borski okrug” — indexed ...
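The indexed queries in the examples above are semicolon-separated term lists, which suggests a simple way to match them against a bag-of-words document representation. The following Python sketch is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy documents (built from words that appear in the examples), and the scoring rule (sum of term counts) are all hypothetical, and no lemmatization or lexical resource is applied.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count the tokens."""
    tokens = (t.strip('.,;:!?"()') for t in text.lower().split())
    return Counter(t for t in tokens if t)


def parse_indexed_query(query):
    """Split a semicolon-separated indexed query into lowercase terms."""
    return [term.strip().lower() for term in query.split(";") if term.strip()]


def rank(query, docs):
    """Rank documents by the summed counts of query terms (assumed scoring)."""
    terms = parse_indexed_query(query)
    scored = []
    for doc_id, text in docs.items():
        bag = bag_of_words(text)
        score = sum(bag[t] for t in terms)  # Counter returns 0 for missing terms
        if score > 0:
            scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: -pair[1])


# Hypothetical toy collection; real documents would be full project reports.
docs = {
    "d1": "rudno telo",
    "d2": "istraživanje kolubarski basen",
    "d3": "zlato Bor Borski okrug",
}

print(rank("rudni ; Rudno; telo", docs))
print(rank("istraživanje; kolubarski;basen", docs))
```

In this sketch the query term "rudni" matches nothing because the toy index stores surface forms only; an index built on lemmas or expanded with inflected forms would be needed to close that gap.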
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources" in Trans. Computational Collective Intelligence - Lecture Notes in Computer Science 26, Springer (2017). https://doi.org/10.1007/978-3-319-59268-8_8