TopicExplorer is a web-based topic model browser that helps non-technical users analyze text data. The data is typically a collection of texts such as blog posts, book chapters, Wikipedia pages, and journal or newspaper articles. Without any further input, a topic model learns a number of word lists that can often be interpreted as topics. TopicExplorer helps users explore the semantics of the learned topics through several visual and interactive features. The ecosystem around the TopicExplorer browser includes web applications to filter text corpora, tune the vocabulary used in the analysis, and create new topic models.
The current use cases of TopicExplorer comprise media analyses of blogs, linguistic discourse analysis, analysis of translations of literature, discourse reconstruction from the historic Halle journals of the Enlightenment period, and teaching activities in seminars.
TopicExplorer is designed as middleware that connects machine learning and topic inference with databases and visual web-based user interfaces. It can be easily adapted to very different application domains through a novel workflow-based plug-in mechanism. The system stores the training data of the topic model, the inference output, and additional data depending on the application. It is designed to scale to very large data sets. Different data stores, e.g. different types of SQL and NoSQL databases, can be mixed to give optimal performance. We are currently developing an ecosystem of microservices around the TopicExplorer core application. This includes services for data import, such as a blog crawler, as well as for corpus configuration and topic model tuning. The goal is to provide a functionally complete, sustainable, web-based self-service for topic modeling, aimed at non-technical end users such as researchers from the humanities and social sciences.
Demos of TopicExplorer that include many new features, e.g. hierarchical topics, are available: Japanese Blogs
We demonstrate TopicExplorer on the following corpora: English Wikipedia (subset of 10,000 articles), German Fairy Tales, and Japanese Blogs. If a demo does not start because of a slow internet connection, please reload the demo page.
We are happy to announce the first release, TopicExplorer 1.0, Japanese Blog-Analysis-Distribution, which includes general features for browsing topics and related documents, temporal analysis, and topic frames.