Ingrid Tchilibou:
A WordNet based backend for the Interactive Concept Validation (ICV)
Abstract
Idea synthesis is the process of summarizing and categorizing a large number of ideas in order to generate new and interesting ones. Collected idea sets are usually very large, however, so categorizing them manually takes a lot of time. Software could help analysts keep an overview of large idea sets, categorize them faster, and synthesize new ideas more effectively. This thesis investigates whether WordNet hypernyms are helpful in categorizing a group of ideas. Using WordNet hypernyms first requires annotating each idea with the meaning (synset) of every word it contains, a task known as word-sense disambiguation. For this purpose, a WordNet connection for the existing ICV (Interactive Concept Validation) tool was implemented in this thesis. In addition, several categorization approaches were implemented and evaluated in an expert interview. The interview showed that hypernyms cannot be treated as categories on the basis of the number of ideas they group. On the other hand, it showed that hypernyms can help give an orientation about the main context of the ideas. These observations provide a basis for the further development and improvement of WordNet-based categorization approaches.
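The annotation step described in the abstract can be illustrated with a simplified Lesk heuristic: each candidate sense of a word is scored by how many words its gloss shares with the surrounding idea text. The following is a minimal sketch, not the method implemented in the thesis. The sense inventory `TOY_SENSES`, its identifiers, and its glosses are hypothetical stand-ins for WordNet data, and the helper names are chosen for illustration only.

```python
import re

# Hypothetical sense inventory standing in for WordNet (NOT real WordNet data).
TOY_SENSES = {
    "bank": {
        "bank.n.01": "sloping land beside a body of water such as a river",
        "bank.n.02": "financial institution that accepts deposits and lends money",
    },
    "light": {
        "light.n.01": "electromagnetic radiation that makes things visible",
        "light.a.01": "of little weight and easy to carry",
    },
}

STOPWORDS = {"a", "an", "the", "of", "to", "and", "that", "such", "as", "is"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def simplified_lesk(word, context):
    """Return the sense id whose gloss overlaps most with the context.

    Returns None for words missing from the toy inventory. Ties go to the
    sense listed first.
    """
    senses = TOY_SENSES.get(word.lower())
    if not senses:
        return None
    context_words = tokenize(context) - {word.lower()}
    return max(senses, key=lambda s: len(tokenize(senses[s]) & context_words))


def annotate(idea):
    """Map every word of the idea that has a known sense to its chosen sense id."""
    annotations = {}
    for token in re.findall(r"[a-z]+", idea.lower()):
        sense = simplified_lesk(token, idea)
        if sense is not None:
            annotations[token] = sense
    return annotations


if __name__ == "__main__":
    print(annotate("An app that lets you deposit money at the bank without waiting"))
    print(annotate("A light backpack that is easy to carry"))
```

With a real WordNet installation, the toy inventory would be replaced by actual synsets and glosses, for example via NLTK's `nltk.wsd.lesk`. That substitution is an assumption about a possible setup, not a description of the ICV backend.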
Requirements
- General-purpose programming language (Python, Java, etc.)
- Basic knowledge of data analysis
Contents
- Understanding Ideas in Large Scale Ideation
- Ideas to Market
Wikidata and DBpedia provide noisy concept mappings because they are created by crowdsourcing. WordNet, in contrast, is created by experts. This could lead to better concept hierarchies, which may be useful for the algorithmic analysis of idea texts.
Objectives
- Implement a WordNet backend for the ICV approach.
- Annotate ideas with WordNet synsets.
- Analyse the extracted hypernyms as a mechanism for idea analysis.
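To make the last objective more concrete, the sketch below groups annotated ideas under a shared hypernym. It is an illustration under stated assumptions, not the categorization implemented in the thesis. The table `TOY_HYPERNYMS` is a hypothetical child-to-parent mapping that stands in for WordNet, and the idea ids and function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical child -> parent hypernym table standing in for WordNet.
TOY_HYPERNYMS = {
    "bicycle.n.01": "wheeled_vehicle.n.01",
    "car.n.01": "wheeled_vehicle.n.01",
    "wheeled_vehicle.n.01": "vehicle.n.01",
    "boat.n.01": "vehicle.n.01",
    "vehicle.n.01": "artifact.n.01",
    "lamp.n.01": "artifact.n.01",
}


def hypernym_chain(synset):
    """Return the ancestors of a synset, from the closest to the most general."""
    chain = []
    while synset in TOY_HYPERNYMS:
        synset = TOY_HYPERNYMS[synset]
        chain.append(synset)
    return chain


def group_by_hypernym(idea_synsets, level=1):
    """Group idea ids under the hypernym `level` steps above each annotated synset.

    Synsets with fewer than `level` ancestors are skipped.
    """
    groups = defaultdict(set)
    for idea_id, synsets in idea_synsets.items():
        for synset in synsets:
            chain = hypernym_chain(synset)
            if len(chain) >= level:
                groups[chain[level - 1]].add(idea_id)
    return {hypernym: sorted(ids) for hypernym, ids in groups.items()}


if __name__ == "__main__":
    # Hypothetical annotated ideas: idea id -> synsets found in the idea text.
    ideas = {
        "idea-1": ["bicycle.n.01"],
        "idea-2": ["car.n.01", "lamp.n.01"],
        "idea-3": ["boat.n.01"],
    }
    print(group_by_hypernym(ideas, level=1))
    print(group_by_hypernym(ideas, level=2))
```

Changing `level` changes how general the resulting groups are. In a setup based on NLTK, the toy table could be replaced by calls to `Synset.hypernyms()`. Here that is only an assumption about one possible backend.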
Mackeprang, Maximilian, Claudia Müller-Birn, and Maximilian Timo Stauss. "Discovering the Sweet Spot of Human-Computer Configurations: A Case Study in Information Extraction." Proceedings of the ACM on Human-Computer Interaction 3.CSCW (2019): 1-30.
Ringler, Daniel, and Heiko Paulheim. "One Knowledge Graph to Rule Them All? Analyzing the Differences Between DBpedia, YAGO, Wikidata & co." Joint German/Austrian Conference on Artificial Intelligence (Künstliche Intelligenz), pages 366–372. Springer, 2017.
Miller, G. A. (1995). WordNet: a lexical database for English. Communications of the ACM, 38(11), 39-41.