TopicExplorer is a web-based topic model browser that helps non-technical users analyze data. The data is typically a collection of texts such as blog posts, book chapters, Wikipedia pages, or articles in journals and newspapers. Without any further input, a topic model learns a number of word lists that can often be interpreted as topics. TopicExplorer helps users explore the semantics of the learned topics with several visual and interactive features. The ecosystem around the TopicExplorer browser includes web applications to filter text corpora, tune the vocabulary used in the analysis, and create new topic models.
Features
TopicExplorer helps users explore the semantics of the learned topics with several visual and interactive features.
Use Cases
The current use cases of TopicExplorer comprise media analyses of blogs, linguistic discourse analysis, analysis of translations of literature, discourse reconstruction from the historical Halle journals of the Enlightenment period, and teaching activities in seminars.
Development
TopicExplorer is designed as middleware that connects machine learning and topic inference with databases and visual web-based user interfaces. It can be easily adapted to very different application domains through a novel workflow-based plug-in mechanism. The system stores the training data of the topic model, the inference output, and additional data depending on the application. It is designed to scale to very large data sets. Different data stores, e.g. different types of SQL and NoSQL databases, can be combined for best performance. We are currently developing an ecosystem of microservices around the TopicExplorer core application. This includes services for data import such as a blog crawler, corpus configuration, and topic model tuning. The goal is to provide a functionally complete, sustainable, web-based self-service for topic modeling aimed at non-technical end users such as researchers from the humanities and social sciences.
Blog
[in German] The workflow of topic analysis in a corpus of Japanese blog articles
In my doctoral thesis (Institute of Political Science & Japanese Studies / Martin-Luther-Universität Halle-Wittenberg), I use TopicExplorer as a tool to identify different discourses in which the Japanese term jikosekinin ("self-responsibility") occurs. In the following, the working procedure for identifying and interpreting a topic is described by way of example. The description of the work process makes clear which additional information …
Towards a TopicExplorer implementation in Elm – Part two
In this part, I explore the option to refactor the model of the elm-TopicExplorer-client such that identifiers for topics link out to all data relevant to viewing and interacting with topics. Starting with the topic hierarchy, this strategy leads to an overall implementation that is decomposed into several small reusable modules. This allows to easily …
Towards a TopicExplorer implementation in Elm
Due to the benefits of the programming language Elm, which compiles to JavaScript, we plan to implement the TopicExplorer web interface in Elm. We discuss the re-implementation in several blog posts. In this first part, we show a demo and discuss the code of the topic navigator sub-interface.