Overfitting is a concept from machine learning that also applies to human learning and pedagogy, though it is not a standard term there. It refers to a situation where a learner masters the examples used in instructional material or exercises, but in a manner that relies on the peculiarities of those specific examples, so that the learner cannot perform well on new examples from the same reference class.
Overfitting could be exposed in either of two ways:
- The learner gets stuck on new examples.
- The learner seems to be able to solve new examples, but makes systematic errors or has very low accuracy on them.
Teaching to the test may be viewed as an example of overfitting where the instructor is complicit in the process.
Examples of overfitting
Overfitting where learners effectively memorize the examples and don't develop a model at all
In some cases, learners repeatedly practice the specific examples, to the point where they internalize the details of each example and can reproduce it quickly. They have thus effectively memorized the examples. Note that they may have done this through an explicit rote memorization effort, or they may have done this subconsciously as a result of repeated practice with the examples.
This does not mean that every instance of rote memorization is an instance of overfitting. Rather, memorization can lead to overfitting if learners are also mistakenly confident that they can easily use the (implicitly or explicitly) memorized examples to solve new examples in the reference class.
Learners who overfit in this way are more likely to get stuck on new examples than to solve them incorrectly.
Overfitting due to the creation of false problem-solving strategies
In some cases, learners find patterns or heuristics that happen to apply well to the specific examples they are given, but do not work for the general class of problems. In other words, learners develop a (possibly explicit) problem-solving strategy that works on the examples they try but does not work in general. This usually happens in one of these cases:
- The set of examples the learner learns from is small and not sufficiently diverse or representative of the reference class.
- The learner is using a highly convoluted problem-solving strategy that makes little aesthetic sense.
False problem-solving strategies might relate to misconceptions about the underlying model.
Learners who overfit by creating false problem-solving strategies are more likely to solve new examples incorrectly than to get stuck on them entirely.
Remedies for overfitting
Separate training examples and test examples
One of the standard practices for detecting and correcting overfitting in machine learning is to keep separate sets of training examples and test examples. The training examples are used to learn and form the internal model for how to solve the problem, and the test examples are then used to check whether the model is correct. The key idea is that test examples are withheld during the training process, so that they can serve as an independent assessment of the learned model. If performance on the test examples is significantly worse than performance on the training examples, this suggests that the learned model is overfit to the training examples. If this happens, the model needs to be adjusted.
Note that the term test here does not (necessarily) refer to the formal assessments given by instructors. Rather, it may refer to a set of practice examples that the student earmarks for later use.
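The held-out check can be illustrated with a small Python sketch. Everything in it is a hypothetical toy setup rather than a description of real learners: the reference class is two-digit addition, and `MemorizingLearner` stands in for a learner who has memorized the practice problems without developing a model.

```python
import random


def make_problems(n, seed=0):
    """Draw n distinct two-digit addition problems (a toy reference class)."""
    rng = random.Random(seed)
    all_pairs = [(a, b) for a in range(10, 100) for b in range(10, 100)]
    return rng.sample(all_pairs, n)


def split(problems, test_fraction=0.25, seed=0):
    """Shuffle the problems and withhold a fraction as test examples."""
    rng = random.Random(seed)
    shuffled = problems[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training, test)


class MemorizingLearner:
    """Hypothetical learner that stores answers instead of learning a rule."""

    def __init__(self):
        self.answers = {}

    def study(self, problems):
        for a, b in problems:
            self.answers[(a, b)] = a + b

    def solve(self, problem):
        # Returns None when the problem was never seen: the learner is stuck.
        return self.answers.get(problem)


def accuracy(learner, problems):
    """Fraction of problems the learner answers correctly."""
    correct = sum(1 for (a, b) in problems if learner.solve((a, b)) == a + b)
    return correct / len(problems)


problems = make_problems(40)
train, test = split(problems)

learner = MemorizingLearner()
learner.study(train)  # the test examples are withheld during study

train_accuracy = accuracy(learner, train)
test_accuracy = accuracy(learner, test)
print(f"training accuracy: {train_accuracy:.2f}, test accuracy: {test_accuracy:.2f}")
```

The gap between the two accuracies is the signal of overfitting. Here it is extreme because the toy learner gets stuck on every unseen problem, which matches the memorization case described earlier.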
Randomly vary training examples
This approach considers slight, somewhat artificial, variations of the training examples and checks whether the strategies learned for solving those examples generalize to the variations. It is best suited to situations where the set of externally provided examples is limited, but it is possible to tweak an example to create a new one.
This is an adaptation to human learning of the methods used for artificial data synthesis in machine learning.
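As a rough sketch of this idea, the Python snippet below perturbs a small set of practice problems and checks a strategy against the variants. The scenario is an assumption chosen for illustration: the task is multiplying a two-digit number by 11, and `insert_digit_sum` is a hypothetical false strategy ("write the digit sum between the two digits") that works only when the digits sum to less than 10.

```python
import random


def insert_digit_sum(n):
    """Hypothetical false strategy for n * 11: put the digit sum between the digits."""
    tens, ones = divmod(n, 10)
    return int(f"{tens}{tens + ones}{ones}")


def vary(n, rng, max_shift=4):
    """Create a variant by shifting each digit by a small random amount.

    The result is clamped so that it is still a two-digit number.
    """
    tens, ones = divmod(n, 10)
    tens = min(9, max(1, tens + rng.randint(-max_shift, max_shift)))
    ones = min(9, max(0, ones + rng.randint(-max_shift, max_shift)))
    return tens * 10 + ones


def failures(strategy, examples):
    """Return the examples on which the strategy gives the wrong product."""
    return [n for n in examples if strategy(n) != n * 11]


rng = random.Random(0)

# Practice set in which every number happens to have a digit sum below 10.
training = [12, 23, 34, 45, 52, 61]
variants = [vary(n, rng) for n in training for _ in range(5)]

training_failures = failures(insert_digit_sum, training)
variant_failures = failures(insert_digit_sum, variants)

print("failures on training examples:", training_failures)
print("failures on varied examples:", sorted(set(variant_failures)))
```

The strategy passes every practice problem, so the practice set alone cannot reveal the flaw. Any variant whose digits sum to 10 or more (for example 57, where the strategy yields 5127 instead of 627) exposes it, which is the kind of systematic error associated with false problem-solving strategies.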