What is SL?

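Statistical learning is a set of approaches for estimating $f$ in

$Y = f(X) + \epsilon$

  • $X = (X_1, X_2, \ldots, X_p)$: predictors, also called independent variables or features
  • $Y$: response, also called the dependent variable
  • $f$: fixed but unknown function of $X$; it represents the systematic information that $X$ provides about $Y$
  • $\epsilon$: random error term, independent of $X$ and with mean zero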

Why estimate $f$?

Prediction

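$\hat{Y} = \hat{f}(X)$

The accuracy of $\hat{Y}$ as a prediction of $Y$ depends on two quantities:

reducible error

the error from $\hat{f}$ being an imperfect estimate of $f$, which can be reduced by using a more appropriate statistical learning method

irreducible error

the error due to $\epsilon$, which cannot be predicted from $X$. It arises because of

  • unmeasured variables that influence $Y$
  • unmeasurable variation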

This book focuses on the reducible error.
The irreducible error could be reduced (but not to zero) by better theory, for example theory that identifies additional variables influencing $Y$.

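Understanding by Formulas

$E(Y - \hat{Y})^2 = E[f(X) + \epsilon - \hat{f}(X)]^2 = [f(X) - \hat{f}(X)]^2 + \mathrm{Var}(\epsilon)$

(treating $\hat{f}$ and $X$ as fixed)

  • $[f(X) - \hat{f}(X)]^2$: reducible error
  • $\mathrm{Var}(\epsilon)$: irreducible error
  • $f(X)$: the most appropriate model (the true function)
  • $\hat{f}(X)$: the model we estimated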
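
The split of the expected squared prediction error into a reducible part and an irreducible part can be checked numerically. The sketch below is only an illustration: the true function `f`, the imperfect estimate `f_hat`, the noise level `SIGMA`, and the point `x0` are all made-up assumptions, chosen so that the reducible error is 0.25 and the irreducible error is 1.0.

```python
import random

random.seed(0)

def f(x):
    """Hypothetical true function (unknown in practice)."""
    return 2.0 + 3.0 * x

def f_hat(x):
    """Hypothetical imperfect estimate of f."""
    return 2.5 + 2.5 * x

SIGMA = 1.0        # standard deviation of epsilon, so Var(epsilon) = 1.0
x0 = 2.0           # a fixed value of the predictor
n_draws = 200_000

# Draw Y = f(x0) + epsilon many times and average the squared prediction error.
squared_errors = []
for _ in range(n_draws):
    y = f(x0) + random.gauss(0.0, SIGMA)
    squared_errors.append((y - f_hat(x0)) ** 2)
simulated_mse = sum(squared_errors) / n_draws

reducible = (f(x0) - f_hat(x0)) ** 2   # [f(x0) - f_hat(x0)]^2
irreducible = SIGMA ** 2               # Var(epsilon)

print(f"simulated E(Y - Y_hat)^2 : {simulated_mse:.3f}")
print(f"reducible + irreducible  : {reducible + irreducible:.3f}")
```

Improving `f_hat` can shrink the first term toward zero, but the simulated error cannot fall below `SIGMA ** 2` no matter how good the estimate is.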

Inference

  • Identify the important predictors among a large set of variables.

  • Understand the relationship between the response and each predictor (positive or negative; main effects and interaction effects).

  • Determine whether the relationship between the predictors and the response is linear or more complex (non-linear).

The choice between Prediction and Inference

  • Prediction

    • A complex (flexible) model can often predict more accurately.
    • But a complex model is hard to interpret.
  • Inference

    • A simple model is easy to interpret.
    • But a simple model may not predict as accurately as a complex one.

How to Estimate $f$

parametric methods

Steps

  1. Select a model

    • make an assumption about the functional form of $f$
    • e.g. the linear model $f(X) = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$
  2. Fit the model

    • Ordinary least squares

      • Chapter 03
    • Other estimation methods

      • Chapter 06
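
The two steps above can be sketched for the simplest case of a single predictor. This is a minimal illustration, not the book's code: the simulated data, the coefficients 1 and 2, and the helper `fit_simple_ols` are assumptions made up for the example. Step 1 assumes the linear form $f(X) = \beta_0 + \beta_1 X$; step 2 estimates the two parameters by ordinary least squares using the closed-form solution.

```python
import random

random.seed(1)

# Simulated training data from a hypothetical linear truth: y = 1 + 2x + noise.
xs = [random.uniform(0.0, 10.0) for _ in range(500)]
ys = [1.0 + 2.0 * x + random.gauss(0.0, 1.0) for x in xs]

def fit_simple_ols(xs, ys):
    """Least-squares estimates (beta0, beta1) for y = beta0 + beta1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta1 = sxy / sxx                 # slope minimizing the residual sum of squares
    beta0 = y_bar - beta1 * x_bar     # intercept: the fitted line passes through the means
    return beta0, beta1

beta0_hat, beta1_hat = fit_simple_ols(xs, ys)

def f_hat(x):
    """Fitted model: the estimate of f under the linear assumption."""
    return beta0_hat + beta1_hat * x

print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
print(f"prediction at x = 4: {f_hat(4.0):.3f}")
```

Estimating $f$ has been reduced to estimating two numbers, which is what makes the parametric approach easy; the price is that the result is only as good as the assumed linear form.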

Advantages and Disadvantages

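  • Advantage: the problem of estimating $f$ reduces to estimating a set of parameters, which is much easier
  • Disadvantage: the choice of model form matters a great deal; if it is far from the true $f$, the estimate will be poor
  • Disadvantage: a more flexible model needs more parameters and can lead to overfitting (following the noise too closely)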

non-parametric methods

Definition

  • seek an estimate of $f$ that gets as close to the data points as possible without being too rough or wiggly

Advantages and Disadvantages

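  • Advantage: no assumption is made about the functional form of $f$, so a wider range of shapes can be fit accurately
  • Disadvantage: a very large number of observations is needed to get an accurate estimate
  • Disadvantage: prone to overfitting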
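
A small sketch can make the non-parametric idea and the overfitting risk concrete. K-nearest-neighbors regression is used here purely as an illustrative non-parametric method (it is not named in these notes), and the sine-shaped truth, noise level, and sample sizes are assumptions. With `k = 1` the estimate passes through every training point, so the training error is zero while the error on new data is not.

```python
import math
import random

random.seed(2)

def true_f(x):
    """Hypothetical non-linear truth, unknown to the method."""
    return math.sin(x)

def simulate(n):
    xs = [random.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [true_f(x) + random.gauss(0.0, 0.3) for x in xs]
    return xs, ys

train_x, train_y = simulate(300)
test_x, test_y = simulate(300)

def knn_predict(x0, k):
    """Average the responses of the k training points closest to x0."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

def mse(xs, ys, k):
    """Mean squared error of the k-NN estimate on the given points."""
    return sum((y - knn_predict(x, k)) ** 2 for x, y in zip(xs, ys)) / len(xs)

train_mse_1, test_mse_1 = mse(train_x, train_y, 1), mse(test_x, test_y, 1)
train_mse_9, test_mse_9 = mse(train_x, train_y, 9), mse(test_x, test_y, 9)

print(f"k=1: train MSE = {train_mse_1:.3f}, test MSE = {test_mse_1:.3f}")
print(f"k=9: train MSE = {train_mse_9:.3f}, test MSE = {test_mse_9:.3f}")
```

No functional form was assumed, yet the fit can follow the curve; the gap between training and test error at `k = 1` is the overfitting listed above.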

The trade-off between Prediction Accuracy and Model Interpretability

  • Comparison of methods according to accuracy (flexibility) and interpretability: in general, the more flexible a method is, the less interpretable it is

Supervised vs. Unsupervised Learning

Supervised Learning

Understand the relationship between the predictors and response.

Unsupervised Learning

The situation in which we lack a response variable that can supervise our analysis.

The goal is to understand the relationships between the variables or between the observations.

Cluster Analysis
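
As a rough illustration of cluster analysis, the sketch below groups unlabeled one-dimensional observations into two clusters. The two-center k-means procedure is one possible clustering method chosen for the example, and the data (two groups centered near 0 and 10) and the helper `two_means` are made-up assumptions. No response variable is used at any point.

```python
import random

random.seed(3)

# Unlabeled observations drawn from two hypothetical groups.
data = ([random.gauss(0.0, 1.0) for _ in range(100)]
        + [random.gauss(10.0, 1.0) for _ in range(100)])

def two_means(values, n_iter=20):
    """Alternate between assigning points to the nearer center and re-averaging."""
    centers = [min(values), max(values)]   # simple deterministic starting centers
    groups = [[], []]
    for _ in range(n_iter):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        centers = [sum(g) / len(g) for g in groups]
    return centers, groups

centers, groups = two_means(data)

print(f"cluster centers: {centers[0]:.2f}, {centers[1]:.2f}")
print(f"cluster sizes  : {len(groups[0])}, {len(groups[1])}")
```

The method recovers the two groups from the predictor values alone, which is the sense in which the analysis is unsupervised.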

Semi-supervised Learning Problem

We have a set of $n$ observations, but only $m < n$ of them have both predictors and a response; the remaining $n - m$ observations have predictors but no response.

Regression vs. Classification Problems

Regression

problems with a quantitative response

Classification

problems with a qualitative (categorical) response
