Andrew Ng Machine Learning Notes (code snippets)

未来可期-2018  2022-12-12

Error analysis

Methods to address overfitting (high variance)

Get more training examples
Try a smaller set of features
Try increasing $\lambda$

Methods to address underfitting (high bias)

Get additional features
Try adding polynomial features
Try decreasing $\lambda$

Recommended approach

Start with a simple algorithm that you can implement quickly. Implement it and test it on your cross-validation data.
Plot learning curves to decide whether more data, more features, etc. are likely to help.
Error analysis: manually inspect the cross-validation examples the algorithm gets wrong and look for a systematic trend in the types of examples it misclassifies.
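The learning-curve step can be sketched as follows. This is a minimal illustration rather than code from the course: it assumes regularized linear regression fitted with the normal equation, and the synthetic data and the names `fit_ridge`, `squared_error`, and `learning_curve` are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Regularized linear regression via the normal equation.
    X already contains the bias column; theta_0 is not regularized."""
    n = X.shape[1]
    reg = lam * np.eye(n)
    reg[0, 0] = 0.0
    # pinv tolerates the singular case of fewer examples than parameters
    return np.linalg.pinv(X.T @ X + reg) @ X.T @ y

def squared_error(X, y, theta):
    """Unregularized cost J = 1/(2m) * sum((X theta - y)^2)."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

def learning_curve(X_train, y_train, X_cv, y_cv, lam=0.0):
    """Train on the first 1..m examples; record training and CV error."""
    sizes, train_err, cv_err = [], [], []
    for m in range(1, len(y_train) + 1):
        theta = fit_ridge(X_train[:m], y_train[:m], lam)
        sizes.append(m)
        train_err.append(squared_error(X_train[:m], y_train[:m], theta))
        cv_err.append(squared_error(X_cv, y_cv, theta))
    return sizes, train_err, cv_err

# Hypothetical synthetic data: y = 1 + 2x + Gaussian noise
rng = np.random.default_rng(0)

def make_data(m):
    x = rng.uniform(-1, 1, size=m)
    X = np.column_stack([np.ones(m), x])
    y = 1 + 2 * x + rng.normal(0, 0.1, size=m)
    return X, y

X_train, y_train = make_data(30)
X_cv, y_cv = make_data(30)
sizes, train_err, cv_err = learning_curve(X_train, y_train, X_cv, y_cv)
```

Plotting `train_err` and `cv_err` against `sizes` gives the learning curve. If both errors flatten at a high value, the model likely has high bias and more data alone will not help much. If a large gap remains between them, the model likely has high variance and more data may help.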

Error metrics for skewed classes

Predicted \ Actual	1	0
1	True Positive	False Positive
0	False Negative	True Negative

$precision=\frac{TP}{TP+FP}$

$recall=\frac{TP}{TP+FN}$

$\displaystyle F_1score=2\frac{PR}{P+R}$ ($F$ score, where $P$ is precision and $R$ is recall)
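As a small worked example of these metrics, the sketch below counts the confusion-matrix cells from label lists and applies the three formulas. The function names and the sample labels are hypothetical, and returning 0.0 when a denominator is zero is a convention assumed here, not something the notes specify.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision_recall_f1(tp, fp, fn):
    """Apply the formulas above; 0.0 on a zero denominator (assumed convention)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical skewed data: 3 positives among 10 examples
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)   # 2, 2, 1, 5
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

Here precision is 2/4 and recall is 2/3, so the $F_1$ score is 4/7, lower than the plain average of the two because it penalizes the weaker metric.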

Data for machine learning

More data is likely to help when the following conditions hold.

The features $x\in R^{n+1}$ carry enough information to predict $y$ accurately.
The learning algorithm has many parameters, such as logistic regression or linear regression with many features, or a neural network with many hidden units.

Support Vector Machine

Logistic regression cost function
$\min_\theta\frac{1}{m}\left[\sum_{i=1}^{m}y^{(i)}\left(-\log h_\theta(x^{(i)})\right)+(1-y^{(i)})\left(-\log(1-h_\theta(x^{(i)}))\right)\right]+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$
SVM optimization objective
$\min_\theta C\sum_{i=1}^{m}\left[y^{(i)}cost_1(\theta^Tx^{(i)})+(1-y^{(i)})cost_0(\theta^Tx^{(i)})\right]+\frac{1}{2}\sum_{j=1}^{n}\theta_j^2$

SVM hypothesis
$h_\theta(x)=\begin{cases}0, & \theta^Tx<0\\ 1, & \theta^Tx\ge 0\end{cases}$
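To make the SVM objective concrete, here is a minimal NumPy sketch. The notes do not define $cost_1$ and $cost_0$ explicitly, so the hinge forms $cost_1(z)=\max(0,1-z)$ and $cost_0(z)=\max(0,1+z)$ used below are an assumption, and the function names are hypothetical.

```python
import numpy as np

def cost_1(z):
    """Cost for y = 1 (assumed hinge form): zero once z >= 1."""
    return np.maximum(0.0, 1.0 - z)

def cost_0(z):
    """Cost for y = 0 (assumed hinge form): zero once z <= -1."""
    return np.maximum(0.0, 1.0 + z)

def svm_objective(theta, X, y, C):
    """C * sum of per-example costs + 1/2 * sum_{j>=1} theta_j^2.
    X includes the bias column, so theta[0] is left out of the penalty."""
    z = X @ theta
    data_term = C * np.sum(y * cost_1(z) + (1 - y) * cost_0(z))
    reg_term = 0.5 * np.sum(theta[1:] ** 2)
    return float(data_term + reg_term)

def svm_predict(theta, X):
    """Hypothesis: predict 1 when theta^T x >= 0, otherwise 0."""
    return (X @ theta >= 0).astype(int)

# Hypothetical toy data: one positive and one negative example
X = np.array([[1.0, 2.0], [1.0, -2.0]])
y = np.array([1, 0])
theta = np.array([0.0, 1.0])

objective = svm_objective(theta, X, y, C=1.0)   # both margins satisfied
predictions = svm_predict(theta, X)
```

With this `theta`, both examples sit outside the margin, so only the regularization term contributes to the objective.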

SVM parameters

$C=\frac{1}{\lambda}$

Large $C$: lower bias, higher variance
Small $C$: higher bias, lower variance

$\sigma^2$

Large $\sigma^2$: features $f_i$ vary more smoothly; higher bias, lower variance
Small $\sigma^2$: features $f_i$ vary less smoothly; lower bias, higher variance

Kernel functions

No kernel (linear kernel)
Gaussian kernel
$f=\exp\left(-\frac{\|x_1-x_2\|^2}{2\sigma^2}\right)$
Polynomial kernel $k(x,l)=(x^Tl+constant)^{degree}$
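The kernels listed above can be written directly in NumPy. This is a minimal sketch with hypothetical function names; the default `constant` and `degree` values are arbitrary choices for illustration.

```python
import numpy as np

def linear_kernel(x, l):
    """'No kernel': the plain inner product x^T l."""
    return float(np.dot(x, l))

def gaussian_kernel(x, l, sigma):
    """exp(-||x - l||^2 / (2 sigma^2)); equals 1 when x == l."""
    diff = np.asarray(x, dtype=float) - np.asarray(l, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def polynomial_kernel(x, l, constant=1.0, degree=2):
    """(x^T l + constant)^degree; the defaults are illustrative only."""
    return float((np.dot(x, l) + constant) ** degree)

x1 = np.array([1.0, 2.0, 1.0])
x2 = np.array([0.0, 4.0, -1.0])

sim_narrow = gaussian_kernel(x1, x2, sigma=2.0)   # exp(-9/8)
sim_wide = gaussian_kernel(x1, x2, sigma=4.0)     # exp(-9/32)
poly = polynomial_kernel([1.0, 2.0], [3.0, 4.0])  # (11 + 1)^2
```

For the same pair of points, the larger $\sigma$ gives a similarity closer to 1, which is the "varies more smoothly" behaviour of large $\sigma^2$.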

Multi-class classification

Train $K$ SVMs (one-vs-all), each distinguishing $y=i$ from the rest, then predict the class $i$ with the largest $(\theta^{(i)})^Tx$.
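The prediction step of the one-vs-all scheme can be sketched as below, assuming the $K$ classifiers have already been trained and their parameter vectors are stacked as rows of a matrix. The name `predict_one_vs_all` and the toy parameters are hypothetical, and classes are indexed from 0 here.

```python
import numpy as np

def predict_one_vs_all(Theta, X):
    """Theta: (K, n+1), one row of parameters per class.
    X: (m, n+1), with the bias column included.
    Returns, per example, the class index with the largest theta^T x."""
    scores = X @ Theta.T          # shape (m, K)
    return np.argmax(scores, axis=1)

# Hypothetical parameters for K = 3 classes and n = 2 features
Theta = np.array([
    [0.0,  1.0, 0.0],
    [0.0, -1.0, 0.0],
    [0.0,  0.0, 1.0],
])
X = np.array([
    [1.0,  2.0, 0.0],
    [1.0, -3.0, 1.0],
    [1.0,  0.0, 5.0],
])
labels = predict_one_vs_all(Theta, X)
```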

Logistic regression vs SVMs

$n$ = number of features ($x\in R^{n+1}$), $m$ = number of training examples

$n$ is large relative to $m$: use logistic regression or an SVM without a kernel
$n$ is small and $m$ is intermediate: use an SVM with a Gaussian kernel
$n$ is small and $m$ is large: create or add more features, then use logistic regression or an SVM without a kernel
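The rule of thumb above can be encoded as a small helper. The notes give no numeric cut-offs, so the thresholds below (`n >= m` for "large relative to m", and `large_m=50_000` for "m is large") are assumptions made for illustration only, and `suggest_model` is a hypothetical name.

```python
def suggest_model(n, m, large_m=50_000):
    """Map (number of features n, number of examples m) to a suggestion.
    The cut-offs are illustrative assumptions, not values from the notes."""
    if n >= m:
        # n large relative to m: too little data to fit a complex boundary
        return "logistic regression or SVM without a kernel"
    if m < large_m:
        # n small, m intermediate
        return "SVM with Gaussian kernel"
    # n small, m large: a kernel SVM may be slow to train at this scale
    return "add features, then logistic regression or SVM without a kernel"

few_examples = suggest_model(n=10_000, m=1_000)
mid_examples = suggest_model(n=100, m=5_000)
many_examples = suggest_model(n=100, m=100_000)
```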

K-means

Input
