Topic mining and categorization in online discussion forums
Dey, Jishnu
Permalink
https://hdl.handle.net/2142/108348
Description
Title
Topic mining and categorization in online discussion forums
Author(s)
Dey, Jishnu
Issue Date
2020-05-12
Director of Research (if dissertation) or Advisor (if thesis)
Zhai, ChengXiang
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
discussion forums
topic modeling
text categorization
hierarchical categorization
Abstract
Online forums provide a useful way to discuss a wide variety of topics and to gather custom information for which an exact source may not be available, drawing on a combination of knowledge and human interpretation. Forums usually have categories, each catering to a particular topic of interest, which allows information seekers and topic experts to meet. It is thus imperative to organize forum data into a coherent structure. In this work we look at methods for assigning forum posts to appropriate categories when the number of such categories is large. We compare several baseline methods with state-of-the-art deep learning methods and analyze their performance. We observe that, given the highly keyword-centric nature of our data, deep learning methods only slightly outperform the baseline methods. Following this, we perform topic modeling on the forum data to find latent topics, which yields a hierarchy across forum categories and clusters similar categories. In this process we observe that some recent topic modeling approaches that utilize word embeddings lead to better topics. Finally, we use this hierarchy to perform hierarchical classification of the forum posts, which makes the classification task more manageable, and we analyze the benefits of this method.