Date of Award

Fall 2012

Degree Type


Degree Name

Master of Applied Science (MASc)


Computing and Software


Tom Maibaum



Committee Member

George Karakostas, Alan Wassyng


Large software systems tend to have complex architectures and large volumes of source code. As software systems have grown in size and complexity, it has become increasingly difficult to deliver bug-free software to end users. Most system failures that occur at run-time are directly caused by system defects; diagnosing software defects is therefore an important but challenging task in software development and maintenance.

A system log is one available source of information about a software system. Software developers use system logs to record program variable values, trace execution, report run-time statistics and print full-sentence messages, which makes system logs a helpful resource for diagnosing software defects. Conventional log analysis relies on human analysts, who examine the run-time information in system logs and apply their knowledge of the software system to determine the root cause of a defect and work out a concrete solution. Complex software systems can generate thousands of system logs in a relatively short time frame, and analyzing such large amounts of information manually is extremely time-consuming. Automated techniques are needed to improve the efficiency and quality of the diagnostic process.

This thesis presents an automated approach to the diagnosis of software defects. The approach combines source code analysis, log analysis and sequential pattern mining to detect anomalies among system logs, diagnose reported system errors and narrow down the range of source code lines that must be examined to determine the root cause. We demonstrate, through an implementation, that the methodology provides a feasible solution to the diagnostic problem.

McMaster University Library