NTU Machine Learning - Lec11 · Stay hungry. Stay foolish

14 Apr 2018

Machine Learning / NTU

A summary of the three linear models covered so far (linear classification, linear regression, and logistic regression).

Linear Models for Binary Classification

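The error measures of the regression models covered earlier (after scaling by a constant where needed) are all $\geq$ the 0-1 error. So one way to do classification is to fit a regression model with an efficient algorithm, then apply a step function to its output to obtain a binary classifier.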

Remark: the resulting error bound is relatively loose; think of it as trading some accuracy for efficiency.
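The bound can be checked numerically. The sketch below assumes the usual definitions in terms of the score $s = w^T x$ and label $y \in \{-1, +1\}$: squared error $(ys - 1)^2$ and cross-entropy error scaled to base 2, $\log_2(1 + e^{-ys})$. The function names are made up for illustration, and a score of exactly 0 is counted as a mistake.

```python
import math

def err_01(s, y):
    """0-1 error: 1 when the sign of the score disagrees with the label."""
    return 1.0 if y * s <= 0 else 0.0

def err_sqr(s, y):
    """Squared error; equals (s - y)^2 when y is -1 or +1."""
    return (y * s - 1) ** 2

def err_sce(s, y):
    """Cross-entropy error scaled by 1/ln(2), i.e. measured in base 2."""
    return math.log2(1 + math.exp(-y * s))

# Compare the three errors on a grid of scores for both labels.
grid = [k / 4 for k in range(-20, 21)]          # scores from -5 to 5
bounds_hold = all(
    err_01(s, y) <= err_sqr(s, y) and err_01(s, y) <= err_sce(s, y)
    for s in grid
    for y in (-1, +1)
)
print(bounds_hold)
```

The looseness shows up for confidently correct points: at $ys = 3$ the 0-1 error is 0, yet the squared error is 4.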

Acceleration on Gradient Descent - Stochastic GD

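Previously, finding the downhill direction of $E_{\text{in}}$ meant computing $\nabla_w E_{\text{in}}$, which requires iterating over all data points. Instead, we can sample just a few data points and use their gradient, which has the same expected value; in practice, though, the updates are less stable.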

Remark: PLA can be viewed as a special case of SGD (sample only one data point, with $\eta = 1$).
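A minimal sketch of the idea, assuming logistic regression as the model being trained, so that one sampled point $(x_n, y_n)$ gives the update $w \leftarrow w + \eta\,\theta(-y_n w^T x_n)\,y_n x_n$. Swapping the smooth weight $\theta(\cdot)$ for a 0/1 mistake indicator and setting $\eta = 1$ gives the PLA update. The function names and the toy data are illustrative, not from the lecture.

```python
import math
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def logistic_weight(y, s):
    """theta(-y*s): large when the point is badly misclassified."""
    return 1.0 / (1.0 + math.exp(y * s))

def pla_weight(y, s):
    """1 on a mistake (score 0 counts as a mistake), 0 otherwise."""
    return 1.0 if y * s <= 0 else 0.0

def sgd(X, Y, weight_fn, eta, steps, seed=0):
    """Stochastic updates: each step uses one randomly sampled point."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for _ in range(steps):
        n = rng.randrange(len(X))              # sample a single data point
        x, y = X[n], Y[n]
        g = weight_fn(y, dot(w, x))            # how strongly to correct
        w = [wi + eta * g * y * xi for wi, xi in zip(w, x)]
    return w

# Toy separable data; the first feature is the constant 1 (bias term).
X = [(1, -2), (1, -1), (1, 1), (1, 2)]
Y = [-1, -1, 1, 1]

w_logistic = sgd(X, Y, logistic_weight, eta=0.1, steps=1000)
w_pla = sgd(X, Y, pla_weight, eta=1.0, steps=1000)
```

Both runs share the same loop; only the per-point weight and the step size differ.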

Binary Classification to Multiclass

Either the one-versus-all (OVA) or the one-versus-one (OVO) approach can be used.

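OVA: with $M$ classes, train $M$ binary classifiers, each on the full data set ($|\mathcal{D}| = N$). Each of these binary problems is unbalanced, however, so the results can be unstable.
OVO: train $\binom{M}{2}$ classifiers (assuming the classes are evenly distributed, each is trained on $\frac{2N}{M}$ examples). The results are more stable, but prediction has to run all $\mathcal{O}(M^2)$ classifiers, which makes it less efficient.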
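The two decompositions can be sketched as wrappers around any binary learner. Everything here is an assumption made for illustration: `train_centroid` is a hypothetical stand-in learner (a linear score built from the two class centroids), not a method from the lecture, and the data is a toy set with three well-separated classes.

```python
from collections import Counter
from itertools import combinations

def train_centroid(X, y):
    """Hypothetical stand-in binary learner: linear score from class centroids."""
    def mean(pts):
        return [sum(c) / len(pts) for c in zip(*pts)]
    mu_p = mean([x for x, t in zip(X, y) if t == +1])
    mu_n = mean([x for x, t in zip(X, y) if t == -1])
    w = [p - n for p, n in zip(mu_p, mu_n)]
    b = -0.5 * (sum(p * p for p in mu_p) - sum(n * n for n in mu_n))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b

def train_ova(X, labels, train_binary):
    """One classifier per class k: class k versus everything else, on all N points."""
    return {k: train_binary(X, [+1 if t == k else -1 for t in labels])
            for k in sorted(set(labels))}

def predict_ova(models, x):
    """Pick the class whose classifier gives the highest score."""
    return max(models, key=lambda k: models[k](x))

def train_ovo(X, labels, train_binary):
    """One classifier per pair (a, b), trained only on points of those two classes."""
    models = {}
    for a, b in combinations(sorted(set(labels)), 2):
        subset = [(x, +1 if t == a else -1) for x, t in zip(X, labels) if t in (a, b)]
        X_ab, y_ab = zip(*subset)
        models[(a, b)] = train_binary(X_ab, y_ab)
    return models

def predict_ovo(models, x):
    """Run every pairwise classifier and return the class with the most votes."""
    votes = Counter(a if score(x) > 0 else b for (a, b), score in models.items())
    return votes.most_common(1)[0][0]

# Toy data: three well-separated clusters, three points each.
X = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0), (10, 1), (0, 10), (1, 10), (0, 11)]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

ova = train_ova(X, labels, train_centroid)
ovo = train_ovo(X, labels, train_centroid)
```

With $M = 3$ both approaches happen to train three classifiers; the counts diverge as $M$ grows, since OVO needs $\binom{M}{2}$ of them.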

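Remark: at training time, OVO is actually the more efficient of the two, because each of its classifiers sees only about $\frac{2N}{M}$ examples.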
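
To see why, here is a rough cost comparison. It assumes (this is not stated in the notes above) that training one binary classifier on $n$ examples costs about $n^p$ for some $p \geq 1$, and that the classes are evenly distributed:

$$
\text{OVA: } M \cdot N^p, \qquad \text{OVO: } \binom{M}{2} \left(\frac{2N}{M}\right)^p = \frac{M(M-1)}{2} \cdot \frac{2^p N^p}{M^p}
$$

For $p = 1$ this gives $(M-1)N$ for OVO against $MN$ for OVA, a small saving. For $p = 2$ it gives $\frac{2(M-1)}{M} N^2$ against $M N^2$, so the gap widens as $M$ grows.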

Reference

Lecture notes by Prof. Hsuan-Tien Lin (林軒田)
