Reducing Overfitting in Process Model Induction

Will Bridewell, Narges Bani Asadi, Pat Langley, Ljupčo Todorovski

August 2005

Abstract

In this paper, we review the paradigm of inductive process modeling, which uses background knowledge about possible component processes to construct quantitative models of dynamical systems. We note that previous methods for this task tend to overfit the training data, which suggests ensemble learning as a likely response. However, such techniques combine models in ways that reduce comprehensibility, making their output much less accessible to domain scientists. As an alternative, we introduce a new approach that induces a set of process models from different samples of the training data and uses them to guide a final search through the space of model structures. Experiments with synthetic and natural data suggest this method reduces error and decreases the chance of including unnecessary processes in the model. We conclude by discussing related work and suggesting directions for additional research.
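The resampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the induced models guide the final search, so the sketch assumes one plausible reading, in which processes that recur across models induced from bootstrap samples become the candidates for the final structure search. The names `bootstrap_sample`, `process_frequencies`, `guided_candidates`, the `induce` callback, and the 0.5 threshold are all hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def process_frequencies(data, induce, n_samples=10, seed=0):
    """Induce one model per bootstrap sample and return, for each process,
    the fraction of induced models that include it.

    `induce` is a hypothetical stand-in for a process model inducer: it takes
    a training sample and returns the collection of process names in the
    model it selects.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        model = induce(bootstrap_sample(data, rng))
        counts.update(set(model))  # count each process once per model
    return {process: n / n_samples for process, n in counts.items()}


def guided_candidates(frequencies, threshold=0.5):
    """Assumed guidance rule: keep processes that appear in at least
    `threshold` of the induced models, and pass only those to the
    final search over model structures."""
    return sorted(p for p, f in frequencies.items() if f >= threshold)
```

Under this assumption, a process that shows up in only a few of the resampled models is treated as a likely artifact of overfitting and is left out of the final search, which keeps the result a single readable model rather than a combined ensemble.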

Type

Conference paper

Publication

In Proceedings of the 22nd International Conference on Machine Learning