Date of Award
9-2010
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science
Supervisor
Norm Archer
Language
English
Abstract
Classification, or categorization, is a text mining technique in which text documents are assigned to predefined categories. Techniques for classifying messages range from simple k-nearest neighbours to more complex support vector machines. These classifiers have proven effective when the documents in each category do not overlap heavily with documents in other categories. Designing a classifier that remains effective where this overlap cannot be avoided, as with emails, text messages, or user opinions and comments, remains a continuing challenge. This work proposes a system that classifies such documents based on their content so that they can be sorted by semantic significance. Real-world applications include triaging patient messages to physicians in healthcare and sorting user opinions on a product webpage. We combined and tailored different classifiers to build a high-performance classifier that supports this type of classification. The system was tested on real-world messages exchanged between patients and physicians during a hypertension prevention study and performed well.
Recommended Citation
Tavasoli, Amir, "Automated Message Triage - A Proposal for Supervised Semantic Classification of Messages" (2010). Open Access Dissertations and Theses. Paper 4630.
http://digitalcommons.mcmaster.ca/opendissertations/4630
McMaster University Library
