Machine learning techniques, in particular induction algorithms, have been applied to expert systems development in an effort to overcome the knowledge acquisition bottleneck. Many different induction algorithms have been developed. These utilise a number of different knowledge representations, e.g. decision trees and rules. The rule-based representation includes systems that utilise both propositional and predicate logic languages. Decision tree and rule-based induction employ different knowledge acquisition mechanisms; however, both strategies tend to induce complex and inaccurate knowledge from problem domains that contain noise or representational complexity. Previous techniques have concentrated on changing either the knowledge acquisition mechanism or the final ruleset or decision tree to reduce complexity. This thesis presents a new approach that focuses induction on those members of a training set that are likely to provide reliable knowledge. This is achieved by a new measure that identifies the most representative examples in a training set. The results of experiments show that this approach produces much simpler rulesets, which in some circumstances perform with greater accuracy on unseen data.
|Date of Award||Jun 1998|