Learning frameworks in machine learning
The Probably Approximately Correct (PAC) framework is about learning efficiently. From a set of candidate functions, the learner selects a hypothesis that, with high probability, has a low chance of making errors. The framework helps analyse whether, and under what conditions, a learner will probably output an approximately correct classifier.
First, we define "approximate." A hypothesis h ∈ H is approximately correct if its error over the distribution of inputs is bounded by some ε, with 0 ≤ ε ≤ 1/2. That means error_D(h) < ε, where D is the distribution over inputs.
Next, we define "probably." If the learner L outputs such a classifier with probability at least 1 − δ, with 0 ≤ δ ≤ 1/2, we call that classifier probably approximately correct.
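Putting the two parts together: L PAC-learns the target concept if, with probability at least 1 − δ over the training sample it sees, it outputs a hypothesis h with error_D(h) < ε; in symbols, Pr[error_D(h) < ε] ≥ 1 − δ.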
Knowing that a target concept is PAC-learnable allows you to bound the sample size m necessary to probably learn an approximately correct classifier, as shown in the formula below:
m ≥ (1/ε)(ln|H| + ln(1/δ))
To get some intuition about the formula, think of it this way: when the error tolerance ε goes down, the sample size goes up. When the size of H increases, the sample size goes up. When the required probability of a correct prediction (1 − δ) increases, the sample size goes up.
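As a concrete illustration, here is a minimal Python sketch of the bound (the function name sample_size_bound and the example values for n, ε, and δ are illustrative assumptions, not from any library):

```python
import math

def sample_size_bound(hypothesis_space_size, epsilon, delta):
    # m >= (1/epsilon) * (ln|H| + ln(1/delta)), rounded up to a whole sample.
    return math.ceil((1.0 / epsilon)
                     * (math.log(hypothesis_space_size) + math.log(1.0 / delta)))

# Example: conjunctions of n boolean literals. Each attribute can appear
# positively, negatively, or not at all, so |H| = 3^n.
n = 10
print(sample_size_bound(3 ** n, epsilon=0.1, delta=0.05))  # -> 140
```

Note that the dependence on |H| is only logarithmic, which is why even an exponentially large hypothesis space such as the 3^n conjunctions needs only a modest sample.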
The mistake bound model is another learning framework, characterised by a bound on the number of classification mistakes the learner makes.
Let us consider a learning scenario:
for t = 1 … n:
- the learner receives an unlabeled instance x(t)
- the learner predicts h(x(t))
- the learner receives the label c(x(t)) and updates its model h
(That is, the learner is told the correct answer after each prediction.)
In the mistake bound model, the learner is evaluated by the total number of mistakes it makes before it converges to the correct hypothesis. The model h is said to learn c in the mistake bound model if the total number of mistakes it makes, over any sequence of instances, is bounded by some fixed value.
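To make the protocol concrete, here is a minimal Python sketch of the loop with mistake counting (the learner interface with predict and update methods is an assumption introduced here for illustration):

```python
def run_online(learner, instances, target):
    # Mistake bound protocol: predict, receive the true label, update.
    # Returns the total number of mistakes made over the whole sequence.
    mistakes = 0
    for x in instances:
        prediction = learner.predict(x)  # learner predicts h(x(t))
        label = target(x)                # learner receives the label c(x(t))
        if prediction != label:
            mistakes += 1
        learner.update(x, label)         # learner updates its model h
    return mistakes
```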
We now look at the Find-S algorithm in the mistake bound model. Consider Find-S when H is the set of conjunctions of boolean literals and X is the instance space.
- Initialise h to the most specific hypothesis
- For each positive training instance x, remove from h any literal that is not satisfied by x
- Output hypothesis h.
How many mistakes does Find-S make before it converges to the correct h? At most n + 1, where n is the number of boolean attributes. The most specific hypothesis starts with all 2n literals (x_i and ¬x_i for every attribute); Find-S only errs on positive examples, the first mistake eliminates n literals, and each later mistake eliminates at least one more, giving at most n + 1 mistakes in total.
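Here is a hedged Python sketch of Find-S as an online learner in this protocol (the FindS class, its literal encoding, and the demo target concept are all illustrative choices made here):

```python
import itertools

class FindS:
    # A hypothesis is a set of literals: (i, True) means x_i,
    # (i, False) means NOT x_i. Instances are tuples of booleans.

    def __init__(self, n):
        # Most specific hypothesis: all 2n literals.
        self.h = {(i, v) for i in range(n) for v in (True, False)}

    def predict(self, x):
        # Predict positive iff every remaining literal is satisfied by x.
        return all(x[i] == v for (i, v) in self.h)

    def update(self, x, label):
        # Only positive examples change h: drop any unsatisfied literal.
        if label:
            self.h = {(i, v) for (i, v) in self.h if x[i] == v}

# Demo with n = 4 attributes and target concept c(x) = x_0 AND (NOT x_2).
n = 4

def target(x):
    return x[0] and not x[2]

learner = FindS(n)
mistakes = 0
for x in itertools.product([False, True], repeat=n):
    if learner.predict(x) != target(x):
        mistakes += 1
    learner.update(x, target(x))
print(mistakes)  # 3 on this sequence, and never more than n + 1 = 5
```

On this enumeration the learner makes three mistakes and ends with exactly the target conjunction, consistent with the n + 1 bound.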