TopicExplorer is a web-based topic model browser that helps non-technical users analyze data. The data is typically a collection of texts such as blog posts, book chapters, Wikipedia pages, or journal and newspaper articles. Without any further input, a topic model learns a number of word lists that can often be interpreted as topics. TopicExplorer helps users explore the semantics of the learned topics through several visual and interactive features. The ecosystem around the TopicExplorer browser includes web applications to filter text corpora, tune the vocabulary used in the analysis, and create new topic models.
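To make the idea of "word lists" concrete, here is a minimal Python sketch of how a learned topic can be turned into the list of words a browser would display. The weights, the topic numbers, and the helper `top_words` are invented for illustration; they are not taken from TopicExplorer or from any real model.

```python
from typing import Dict, List


def top_words(weights: Dict[str, float], n: int = 3) -> List[str]:
    """Return the n highest-weighted words of one topic.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


# Hypothetical output of a topic model: one word-weight table per topic.
topics = {
    0: {"election": 0.09, "party": 0.07, "vote": 0.05, "goal": 0.001},
    1: {"goal": 0.08, "match": 0.06, "team": 0.06, "vote": 0.002},
}

# The word lists a user would read and try to interpret as topics.
word_lists = {topic_id: top_words(w) for topic_id, w in topics.items()}
```

A reader might label the first list "politics" and the second "football"; the model itself only provides the words and their weights.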
The current use cases of TopicExplorer comprise media analyses of blogs, linguistic discourse analysis, analysis of literary translations, discourse reconstruction from the historical Halle journals of the Enlightenment period, and teaching activities in seminars.
TopicExplorer is designed as middleware that connects machine learning and topic inference with databases and visual web-based user interfaces. It can be easily adapted to very different application domains through a novel workflow-based plug-in mechanism. The system stores the training data of the topic model, the inference output, and additional application-dependent data. It is designed to scale to very large data sets. Different data stores, e.g. different types of SQL and NoSQL databases, can be mixed to give optimal performance. We are currently developing an ecosystem of microservices around the TopicExplorer core application. This includes services for data import (such as a blog crawler), corpus configuration, and topic model tuning. The goal is to provide a functionally complete, sustainable, web-based self-service for topic modeling aimed at non-technical end users such as researchers from the humanities and social sciences.
In this part, I explore the option of refactoring the model of the elm-TopicExplorer-client such that identifiers for topics link out to all data relevant to viewing and interacting with topics. Starting with the topic hierarchy, this strategy leads to an overall implementation that is decomposed into several small reusable modules. This allows to easily …
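The idea of letting topic identifiers link out to all topic-related data can be sketched as a model in which every table is keyed by topic ID. The sketch below is in Python rather than Elm, and the names `Model`, `add_topic`, and `path_to_root` are hypothetical; it illustrates the general pattern under these assumptions, not the actual client code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

TopicId = int


@dataclass
class Model:
    """Each table is keyed by topic ID, so an ID is enough to reach all topic data."""
    parent: Dict[TopicId, Optional[TopicId]] = field(default_factory=dict)
    children: Dict[TopicId, List[TopicId]] = field(default_factory=dict)
    top_words: Dict[TopicId, List[str]] = field(default_factory=dict)


def add_topic(model: Model, topic_id: TopicId, words: List[str],
              parent: Optional[TopicId] = None) -> None:
    """Register a topic and link it into the hierarchy."""
    model.parent[topic_id] = parent
    model.children.setdefault(topic_id, [])
    model.top_words[topic_id] = list(words)
    if parent is not None:
        model.children.setdefault(parent, []).append(topic_id)


def path_to_root(model: Model, topic_id: TopicId) -> List[TopicId]:
    """Follow parent links from a topic up to the root of the hierarchy."""
    path = [topic_id]
    while model.parent.get(path[-1]) is not None:
        path.append(model.parent[path[-1]])
    return path
```

Because views only pass IDs around, a module that renders the hierarchy and a module that renders word lists can be written and reused independently; each looks up what it needs in its own table.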
Demos of TopicExplorer are available that include many new features, e.g. hierarchical topics: Japanese Blogs