MLCourse:Mid-term-exam-points
From Dahuawiki
Basic concepts
- samples, labels
- classifier: a mapping of samples to labels
- training set, testing set
- training error
- the goal of learning: to find a prediction rule that generalizes well
- cross-validation
- leave-one-out cross-validation
- relation with generalization (a low cross-validation error suggests the classifier is likely to generalize well, but this is not guaranteed)
- maximum likelihood estimation
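As an illustration of leave-one-out cross-validation, here is a minimal sketch in plain Python. The nearest-mean classifier and the toy 1-D data are assumptions made only for this example; any training procedure could be passed in instead.

```python
def nearest_mean_train(xs, ys):
    """Hypothetical 1-D classifier: predict the label whose class mean is closer."""
    pos = [x for x, y in zip(xs, ys) if y == +1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)
    return lambda x: +1 if abs(x - mu_pos) < abs(x - mu_neg) else -1


def leave_one_out_error(xs, ys, train):
    """Train n times, each time holding out one sample and testing on it."""
    mistakes = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        classifier = train(rest_x, rest_y)
        if classifier(xs[i]) != ys[i]:
            mistakes += 1
    return mistakes / len(xs)


# Toy data (assumed): two well-separated groups.
xs = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
ys = [-1, -1, -1, +1, +1, +1]
loo_clean = leave_one_out_error(xs, ys, nearest_mean_train)

# Same data plus one sample that sits among the negatives but is labeled +1.
loo_noisy = leave_one_out_error(xs + [1.5], ys + [+1], nearest_mean_train)
```

On the clean data every held-out sample is classified correctly; the extra sample in the second run is the only one misclassified when held out, so the estimate rises to 1/7.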
Linear classifiers
- the formulation of linear classifier
- decision boundary
- zero-one loss
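A linear classifier and its zero-one loss can be sketched as follows. The names `theta` and `theta0` (weight vector and offset) and the toy samples are assumptions for illustration; the decision boundary is the set of points where the score `theta . x + theta0` equals zero.

```python
def predict(theta, theta0, x):
    """Linear classifier: the sign of theta . x + theta0."""
    score = sum(t * xi for t, xi in zip(theta, x)) + theta0
    return +1 if score > 0 else -1


def zero_one_loss(theta, theta0, samples, labels):
    """Fraction of samples whose predicted label differs from the true label."""
    errors = sum(1 for x, y in zip(samples, labels)
                 if predict(theta, theta0, x) != y)
    return errors / len(samples)


# Toy data (assumed): the last sample lies on the wrong side of the boundary.
samples = [(2, 0), (0, 2), (3, 1), (1, 3)]
labels = [+1, -1, +1, +1]
training_error = zero_one_loss([1.0, -1.0], 0.0, samples, labels)
```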
Perceptron
- update rule
- convergence (bound and conditions)
- the generalization guarantees (with feedback)
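The update rule can be sketched in a few lines. This version assumes a classifier through the origin (no offset) and toy data chosen for illustration. The usual convergence statement is that if every sample satisfies ||x|| <= R and some parameter vector separates the data with margin gamma, the number of mistakes is at most (R / gamma)^2.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def perceptron(samples, labels, max_epochs=100):
    """Mistake-driven perceptron through the origin.

    On a mistake (y * theta . x <= 0) the update is theta <- theta + y * x.
    Returns the final parameters and the total number of mistakes.
    """
    theta = [0.0] * len(samples[0])
    mistakes = 0
    for _ in range(max_epochs):
        clean_pass = True
        for x, y in zip(samples, labels):
            if y * dot(theta, x) <= 0:
                theta = [t + y * xi for t, xi in zip(theta, x)]
                mistakes += 1
                clean_pass = False
        if clean_pass:  # a full pass without mistakes: converged
            break
    return theta, mistakes


# Toy linearly separable data (assumed).
samples = [(2, 1), (1, 2), (-1, -2), (-2, -1)]
labels = [+1, +1, -1, -1]
theta, mistakes = perceptron(samples, labels)
```

The `max_epochs` cap is a safeguard added for this sketch; on data that is not linearly separable the plain perceptron would otherwise cycle forever.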
Linear SVM
- strict formulation (without slack variables)
- support vectors
- leave-one-out error and the number of SVs
- relaxed formulation (with slack variables)
- trade-off between a large margin and margin violations (slack)
- what are support vectors in this case?
- regularization
- desired objective and regularization penalty
- hinge loss
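The link between the relaxed formulation and regularization can be made concrete: at the optimum each slack variable equals the hinge loss of its sample, so the relaxed SVM minimizes a regularization penalty plus the average hinge loss. The sketch below assumes a classifier without offset and a weight `lam` on the penalty; the course may parameterize the trade-off differently (for example with C on the slack term).

```python
def hinge_loss(z):
    """Hinge loss of the agreement z = y * theta . x; zero once z >= 1."""
    return max(0.0, 1.0 - z)


def svm_objective(theta, samples, labels, lam):
    """Regularization penalty (lam / 2) * ||theta||^2 plus average hinge loss."""
    agreements = [y * sum(t * xi for t, xi in zip(theta, x))
                  for x, y in zip(samples, labels)]
    avg_hinge = sum(hinge_loss(z) for z in agreements) / len(samples)
    penalty = 0.5 * lam * sum(t * t for t in theta)
    return penalty + avg_hinge


# Toy data (assumed): agreements are 2.0, 0.5 and -1.0 for theta = [1, 0].
samples = [(2.0, 0.0), (0.5, 0.0), (-1.0, 0.0)]
labels = [+1, +1, +1]
objective = svm_objective([1.0, 0.0], samples, labels, lam=0.1)
```

The first sample is beyond the margin and contributes nothing, the second is correctly classified but inside the margin, and the third is misclassified and penalized most.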
Logistic Regression
- the discriminative formulation
- log-odds of the class probabilities is a linear function of the input
- MLE estimates
- log-loss (-log p)
- regularization is needed (the MLE is unbounded when the samples are linearly separable)
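A small sketch of why regularization matters for logistic regression. With labels in {-1, +1} the log-loss of a sample is log(1 + exp(-y * theta . x)). On linearly separable data, scaling a separating `theta` up always lowers the loss, so the unregularized MLE has no finite solution. The data, the direction, and the penalty weight `lam` below are assumptions for illustration.

```python
import math


def log_loss(z):
    """-log p(y | x) = log(1 + exp(-z)) with z = y * theta . x, computed stably."""
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


def avg_log_loss(theta, samples, labels):
    total = 0.0
    for x, y in zip(samples, labels):
        z = y * sum(t * xi for t, xi in zip(theta, x))
        total += log_loss(z)
    return total / len(samples)


# Toy separable data (assumed) and a separating direction.
samples = [(1.0, 0.0), (2.0, 0.0), (-1.0, 0.0), (-2.0, 0.0)]
labels = [+1, +1, -1, -1]
direction = [1.0, 0.0]
lam = 0.01  # hypothetical regularization weight

scales = [1.0, 10.0, 100.0]
plain = []        # average log-loss only
regularized = []  # average log-loss + (lam / 2) * ||theta||^2
for c in scales:
    theta = [c * d for d in direction]
    loss = avg_log_loss(theta, samples, labels)
    plain.append(loss)
    regularized.append(loss + 0.5 * lam * sum(t * t for t in theta))
```

The plain loss keeps shrinking as the scale grows, while the penalized objective grows for these scales, which is what keeps the regularized estimate finite.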
Linear Regression
- formulation
- probabilistic formulation
- prediction rule
- the optimal solution
- bias and variance of the estimates
- mean squared error
- ridge regression
- regularization
- trade-off between bias and variance reduction
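The effect of ridge regularization is easiest to see with a single input feature and no offset. Assuming the objective sum_i (y_i - theta * x_i)^2 + lam * theta^2 (one common convention, not necessarily the course's), the minimizer has a closed form, and increasing `lam` shrinks the estimate toward zero: more bias, less variance.

```python
def ridge_1d(xs, ys, lam):
    """Minimizer of sum (y - theta * x)^2 + lam * theta^2 for scalar inputs."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Toy noise-free data (assumed) generated by y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

theta_ls = ridge_1d(xs, ys, lam=0.0)      # ordinary least squares
theta_ridge = ridge_1d(xs, ys, lam=14.0)  # shrunk toward zero
```

With `lam = 0` the estimate recovers the slope 2; with `lam = 14` (equal to the sum of squared inputs here) it is halved to 1.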
Active Learning
- what is active learning
- active learning for linear regression
- minimizing MSE of parameter estimates
- selecting most uncertain input
- a convex quadratic function of x
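For linear regression with noise variance sigma^2, the covariance of the least-squares parameter estimate is sigma^2 (X^T X)^{-1}, so the variance of the prediction at an input x is proportional to x^T (X^T X)^{-1} x, a convex quadratic function of x. The sketch below picks the most uncertain input from a finite candidate set. The 2-D inputs, the candidate list, and the helper names are assumptions for illustration.

```python
def xtx_2d(X):
    """Entries (a, b, d) of the symmetric 2x2 matrix X^T X = [[a, b], [b, d]]."""
    a = sum(x[0] * x[0] for x in X)
    b = sum(x[0] * x[1] for x in X)
    d = sum(x[1] * x[1] for x in X)
    return a, b, d


def inverse_sym_2x2(a, b, d):
    """Inverse of [[a, b], [b, d]], returned in the same (a, b, d) layout."""
    det = a * d - b * b
    return d / det, -b / det, a / det


def uncertainty(x, inv):
    """Quadratic form x^T (X^T X)^{-1} x, proportional to predictive variance."""
    p, q, r = inv
    return p * x[0] * x[0] + 2 * q * x[0] * x[1] + r * x[1] * x[1]


# Inputs queried so far (assumed): mostly along the first coordinate.
X = [(1.0, 0.0), (2.0, 0.0), (0.0, 0.5)]
inv = inverse_sym_2x2(*xtx_2d(X))

# Candidate inputs of comparable length (assumed).
candidates = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
chosen = max(candidates, key=lambda x: uncertainty(x, inv))
```

Because the quadratic is convex, its maximum over an unbounded domain does not exist, so the selection only makes sense over a bounded or finite set of candidates, as here.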
Kernels
what is a kernel
- definitions
- inner product of features
- gram matrix is always positive semi-definite
- kernel construction rules
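The positive semi-definiteness of the Gram matrix and one construction rule (the sum of two kernels is a kernel) can be checked numerically. The sketch below assumes a linear kernel, a radial basis kernel exp(-gamma * ||x - z||^2), toy points, and a handful of test vectors; passing such a check is only a sanity test, not a proof.

```python
import math


def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def sum_kernel(x, z):
    """Construction rule: the sum of two valid kernels is a valid kernel."""
    return linear_kernel(x, z) + rbf_kernel(x, z)


def gram_matrix(kernel, points):
    return [[kernel(p, q) for q in points] for p in points]


def quadratic_form(K, z):
    """z^T K z; non-negative for every z when K is positive semi-definite."""
    n = len(z)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (-1.0, 1.0)]  # assumed toy points
K = gram_matrix(sum_kernel, points)

test_vectors = [
    [1.0, -1.0, 0.0, 0.0],
    [1.0, 1.0, -1.0, -1.0],
    [0.5, -2.0, 3.0, -1.5],
]
forms = [quadratic_form(K, z) for z in test_vectors]
```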
kernel regression
- parameters lie in the span of the training feature vectors (a consequence of regularization)
- kernelized prediction
kernel perceptron
- solution
- algorithm
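A minimal sketch of the kernel perceptron. The solution has the form theta = sum_i alpha_i y_i phi(x_i), where alpha_i counts the mistakes made on sample i, so both training and prediction need only kernel evaluations. The radial basis kernel and the XOR-style data (not linearly separable in the input space) are assumptions for illustration.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def discriminant(alphas, samples, labels, kernel, x):
    """sum_i alpha_i * y_i * K(x_i, x), the kernelized form of theta . phi(x)."""
    return sum(a * y * kernel(xi, x)
               for a, xi, y in zip(alphas, samples, labels))


def kernel_perceptron(samples, labels, kernel, max_epochs=100):
    alphas = [0] * len(samples)  # alpha_i = number of mistakes on sample i
    for _ in range(max_epochs):
        clean_pass = True
        for j, (x, y) in enumerate(zip(samples, labels)):
            if y * discriminant(alphas, samples, labels, kernel, x) <= 0:
                alphas[j] += 1  # same as theta <- theta + y * phi(x)
                clean_pass = False
        if clean_pass:
            break
    return alphas


# XOR-style labels (assumed toy data).
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [-1, +1, +1, -1]
alphas = kernel_perceptron(samples, labels, rbf_kernel)
predictions = [
    +1 if discriminant(alphas, samples, labels, rbf_kernel, x) > 0 else -1
    for x in samples
]
```

On this data the algorithm makes one mistake on each sample during the first pass and none afterwards.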
kernel SVM
- primal form and dual form
- kernelized prediction rule
- constraints on α
- geometric margin
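For reference, one standard way to write the primal and dual forms is sketched below. It assumes the relaxed formulation without an offset parameter, a slack penalty C, and a feature map \phi with K(x, x') = \phi(x)^\top \phi(x'); the course notes may use different symbols or conventions.

```latex
% Primal (relaxed, no offset)
\min_{\theta,\,\xi}\;\; \tfrac{1}{2}\|\theta\|^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i\, \theta^\top \phi(x_i) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \;\; i = 1, \dots, n

% Dual (depends on the samples only through the kernel)
\max_{\alpha}\;\; \sum_{i=1}^{n} \alpha_i
 - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
   \alpha_i \alpha_j\, y_i y_j\, K(x_i, x_j)
\quad \text{s.t.} \quad 0 \le \alpha_i \le C

% Kernelized prediction rule
\hat{y}(x) = \operatorname{sign}\Big( \sum_{i=1}^{n} \alpha_i\, y_i\, K(x_i, x) \Big)

% Geometric margin, using \theta = \sum_i \alpha_i y_i \phi(x_i)
\gamma = \frac{1}{\|\theta\|},
\qquad
\|\theta\|^2 = \sum_{i=1}^{n} \sum_{j=1}^{n}
   \alpha_i \alpha_j\, y_i y_j\, K(x_i, x_j)
```

In the strict formulation (no slack) the upper bound disappears and the constraint is only \alpha_i \ge 0. If an offset parameter is included, the dual gains the additional constraint \sum_i \alpha_i y_i = 0.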
kernel optimization
- kernel parameterization
- optimization criterion
- surrogate measure of generalization error
- cross-validation, margin
- kernel alignment
- surrogate measure of generalization error
- kernel normalization
- margin depends on scale
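Kernel alignment and kernel normalization can both be computed directly from the Gram matrix. The sketch assumes a common definition of alignment, the normalized Frobenius inner product between K and the ideal target matrix y y^T, and the normalization K(x, x') / sqrt(K(x, x) K(x', x')); the course may define these slightly differently. The 1-D data and linear kernel are toy assumptions.

```python
import math


def frobenius_inner(A, B):
    n = len(A)
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))


def alignment(K, labels):
    """<K, yy^T> / (||K|| * ||yy^T||), with Frobenius inner product and norms."""
    target = [[yi * yj for yj in labels] for yi in labels]
    num = frobenius_inner(K, target)
    den = math.sqrt(frobenius_inner(K, K) * frobenius_inner(target, target))
    return num / den


def normalize(K):
    """Rescale so every sample has unit norm in feature space (K[i][i] == 1)."""
    n = len(K)
    return [[K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(n)]
            for i in range(n)]


# Toy 1-D inputs with a linear kernel K(x, z) = x * z (assumed).
xs = [1.0, 2.0, -1.0, -3.0]
labels = [+1, +1, -1, -1]
K = [[a * b for b in xs] for a in xs]

raw_alignment = alignment(K, labels)
K_norm = normalize(K)
normalized_alignment = alignment(K_norm, labels)
```

Here the labels are the signs of the inputs, so once the differing input scales are removed by normalization the kernel matches the target matrix exactly.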
Model selection
- goal: pursue good generalization
- model -> class of functions
- nested model (sub-class)
- empirical risk, (expected) risk
- minimum probability of error classifier
- over-fitting
- relation between training error/test error and model complexity
Structural Risk Minimization
- complexity penalty
- depends on model, training set size, and confidence
- upper bound guarantee of generalization error
- it is a probabilistic guarantee
- SRM -> choose the model with the best guarantee (the lowest upper bound)
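To make the selection rule concrete, the sketch below uses one commonly quoted form of the VC complexity penalty, sqrt((d * (ln(2n / d) + 1) + ln(4 / delta)) / n), where d is the VC dimension of the model, n the training set size, and 1 - delta the confidence. The exact constants used in the course may differ, and the VC dimensions and training errors listed are hypothetical numbers chosen only for illustration.

```python
import math


def complexity_penalty(d, n, delta):
    """VC-style penalty; grows with d, shrinks with n, grows as delta -> 0."""
    return math.sqrt((d * (math.log(2.0 * n / d) + 1.0)
                      + math.log(4.0 / delta)) / n)


def srm_select(models, n, delta):
    """Pick the model whose bound (training error + penalty) is smallest.

    `models` maps a VC dimension to the training error of the best
    classifier found in that model class.
    """
    bounds = {d: err + complexity_penalty(d, n, delta)
              for d, err in models.items()}
    best_d = min(bounds, key=bounds.get)
    return best_d, bounds


# Hypothetical nested models: richer classes fit the training set better.
models = {1: 0.30, 3: 0.12, 10: 0.10, 50: 0.09}
best_d, bounds = srm_select(models, n=1000, delta=0.05)
```

The richest model has the lowest training error but also the largest penalty; the bound is minimized by an intermediate model. Each bound holds only with probability at least 1 - delta over the choice of the training set.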
