Creating controlled vocabularies for smart search at WUR

J.L. Top, B. Öztürk, J.J. Hoekstra, R.J. Vlek

Research output: Book/Report › Report › Professional

Abstract

Searching text or documents in large unstructured and semi-structured data sources is not trivial. A search engine is supposed to make search more efficient and effective. It helps the user build a query that can be applied automatically to extract the information that matches the user’s intention. Controlled vocabularies and ontologies help improve the search and make it domain-aware. In this document, we explain the notion of a controlled vocabulary, its construction methods and its use in smart search engines. Manual construction of controlled vocabularies and ontologies can be achieved using several existing tools, which require specific technical skills. Therefore, we refer to the ROC+ tool, developed within WFBR, which helps domain researchers build a controlled vocabulary faster and more easily. Another application, the TALK tool, was developed to start a discussion on a specific term in multidisciplinary teams. It proposes automatically generated associated terms, which can then be exported in a machine-processable format as input for ROC+. We also briefly mention the use of NLP technology in text mining, where domain-related concepts can be automatically extracted from PDF documents. Finally, some examples of controlled vocabularies developed within WFBR are listed for further reference.
Original language: English
Place of Publication: Wageningen
Publisher: Wageningen Food & Biobased Research
Number of pages: 14
Publication status: Published - 2023

Publication series

Name: Report / Wageningen Food & Biobased Research
No.: 2383
