Combining hidden Markov models and latent semantic analysis for topic segmentation and labeling: Method and clinical application

posted in: Updates in Informatics | 0

Abstract: Motivation: Topic segmentation and labeling systems enable fine-grained information search. However, previously proposed methods require annotated data to adapt to different information needs and have limited applicability to texts with short segment length.Methods: We introduce an unsupervised method based on a combination of hidden Markov models and latent semantic analysis which allows the topics of interest to be defined freely, without the need for data annotation, and can identify short segments.Results: The method is evaluated on intensive care nursing narratives and motivated by information needs in this domain. The method is shown to considerably outperform a keyword-based heuristic baseline and to achieve a level of performance comparable to that of a related supervised….