Summary of Computational Learning Theory

Kui Tang December 2012

WRITE THIS ESSAY FOR ME

Tell us about your assignment and we will find the best writer for your paper.

Prepared from notes and homeworks from Professor Rocco Servedio’s Fall 2012 course in preparation for the ﬁnal. No guarantees of completeness or correctness. Contents 1 Online Learning 2 1.1 Elementary Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Perceptron . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Winnow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.4 (Randomized) Weighted Majority . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.5 General Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2 PAC Learning 5 2.1 Probability Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2 OLMB to PAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.3 Consistent Hypothesis Finder; Occam’s Razor . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.4 Hardness of 3-Term DNF Learning; Proper and Improper Learning . . . . . . . . . . . . . . . 7 2.5 VC Dimension Lower-Bounds PAC Sample Complexity . . . . . . . . . . . . . . . . . . . . . . 8 3 Bounds for Inﬁnite Hypothesis Classes 8 3.1 Growth Functions and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.2 Sauer’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 3.3 Inﬁnite CHF (Method of Double Sampling) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 4 Boosting 10 4.1 Three-Stage Boosting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 4.2 Recursive Boosting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 4.3 AdaBoost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 5 Noisy PAC 12 5.1 Statistical Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 5.2 Example: Monotone Disjunctions Noisy PAC . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 5.3 Lower Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 5.3.1 Sample Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 5.3.2 Noise Rate and Adversarial Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 6 Representation-Independent (Cryptographic) Hardness 13 6.1 Pseudorandom Functions are Hard to Learn . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 6.2 Discrete Cube Roots are Hard to Learn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 1
Background image of page 1
Info iconThis preview has intentionally blurred sections. Sign up to view the full version.
View Full DocumentRight Arrow Icon
1 Online Learning 1.1 Elementary Algorithms To learn a one-decision list, use extended decision (multiple rules; if multiple antecedents are satisﬁed, arbitrarily choose one) lists are your hypotheses. Begin with all 4 n + 2 rules. Each time a rule ﬁres and returns a wrong answer, such that that rule is responsible for the error, we push the rule down to the next level. A rule is pushed iﬀ a mistake is made (show the converse side), so we have at most O ( n ) mistakes. Decision lists can be encoded as LTFs with exponentially decaying weights. Recall that a matching rule short-circuits the rest of the decision list evaluation. In an LTF with exponentially-decaying weights, the ﬁrst matching rule overwhelm all subsequent coeﬃcients [HW1.1b]. [HW1.2] Any OLMB algorithm can be made conservative: Suppose algorithm A has mistake bound M . Algorithm A0 predicts with its current hypothesis. If it turned out to be a mistake, A0 gives the current example to A and takes A0 s output hypothesis. The two algorithms make the same number of mistakes on the subsequence that causes mistakes. This is because A only sees this sequence of mistake examples. Therefore, A0 can make no more mistakes than A itself would have. Randomized halving algorithm: if we remove the oblivious adversary, a more eﬃcient algorithm (in expectation) can be obtained: maintain the set of consistent-so-far hypothesis, and on each mistake, pick h uniformly at random from the new set of consistent hypotheses. Then for any ﬁxed target c and training sequence ± x i ² (this is the oblivious adversary assumption), we have E [ M ] ≤ ln |C| + O (1) . Proof idea: order the concepts in CONSIST by the order in which the ﬁxed training sequence will eliminate them

Introducing our Online Essay Writing Services Agency, where you can confidently place orders for a wide range of academic assignments. Our reputable homework writing company specializes in crafting essays, term papers, research papers, capstone projects, movie reviews, presentations, annotated bibliographies, reaction papers, research proposals, discussions, and various other assignments. Rest assured, our content is guaranteed to be 100% original, as every piece is meticulously written from scratch. Say goodbye to concerns about plagiarism and trust us to deliver authentic and high-quality work.

WRITE THIS ESSAY FOR ME

WRITE MY ESSAY NOW

LATEST ASSIGNMENTS

EDL-827 Module 5 DQ 1

BUS 311 Week 2 Quiz

EDL 505 Week 5 Learning Team Assignment Improved Lesson Plan