review for data mining

An outline I threw together while reviewing _(:3JZ)_

lecture2

PCA

Variance is

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(w^\top x_i - w^\top \bar{x}\right)^2 = w^\top \Sigma w$$

where $\Sigma = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x})^\top$.
Maximize the variance $w^\top \Sigma w$
s.t. $w^\top w = 1$.
Lagrangian: $L(w, \lambda) = w^\top \Sigma w - \lambda\,(w^\top w - 1)$.
Let the gradient of $L$ be $0$: $\Sigma w = \lambda w$.
$w$ is the largest eigenvector of $\Sigma$, $\lambda$ is the largest eigenvalue.

Maximize

$$\operatorname{tr}\left(W^\top \Sigma W\right)$$

s.t. $W^\top W = I$.
Return the largest $k$ eigenvectors of $\Sigma$.
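
A minimal NumPy sketch of this eigendecomposition route (the function and variable names are mine, not from the lecture):

```python
import numpy as np

def pca(X, k):
    """Project rows of X onto the top-k principal components.

    X: (n, d) data matrix, one sample per row.
    """
    X_centered = X - X.mean(axis=0)             # subtract the mean
    Sigma = X_centered.T @ X_centered / len(X)  # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(Sigma)    # eigenvalues in ascending order
    W = eigvecs[:, ::-1][:, :k]                 # largest k eigenvectors
    return X_centered @ W                       # (n, k) projection
```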

SVD

$X = U \Sigma V^\top$, with $U^\top U = I$, $V^\top V = I$, and $\Sigma$ diagonal with singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge 0$.

rank-$k$ minimization:

Minimize $\|X - B\|_F$
s.t. $\operatorname{rank}(B) \le k$.

Return the largest $k$ left singular vectors of $X$.
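
A matching sketch for the rank-$k$ approximation via truncated SVD (again with names of my own):

```python
import numpy as np

def rank_k_approx(X, k):
    """Best rank-k approximation of X in Frobenius norm via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k]   # U_k Sigma_k V_k^T
```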

lecture3 Distance

  • $L_p$-norm, Minkowski distance
  • Match-based similarity computation
  • Mahalanobis distance, Geodesic distance
  • Inverse Occurrence Frequency
  • Cosine, TF-IDF
  • Dynamic time warping (see the sketch after this list)
  • Edit distance, longest common subsequence
  • Shortest-path, random walk
  • Supervised similarity functions
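
A minimal DP sketch of dynamic time warping, assuming 1-D sequences and absolute-difference cost (the cost choice and names are my own):

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between 1-D sequences a and b."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of match / insertion / deletion
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]
```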

lecture4

Support Monotonicity Property

The support of every subset $J$ of an itemset $I$ is at least equal to the support of $I$: $\operatorname{sup}(J) \ge \operatorname{sup}(I)$ for all $J \subseteq I$.

Downward Closure Property

Every subset of a frequent itemset is also frequent.
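
For example, if {milk, bread} is frequent, then {milk} and {bread} must each be frequent. A minimal sketch of the candidate-pruning step this property enables in Apriori (names are my own; `frequent_prev` is assumed to be a set of frozensets holding the frequent $(k-1)$-itemsets):

```python
from itertools import combinations

def prune_candidates(candidates, frequent_prev):
    """Keep a k-itemset only if every (k-1)-subset is frequent
    (downward closure); candidates are tuples of items."""
    kept = []
    for c in candidates:
        if all(frozenset(s) in frequent_prev
               for s in combinations(c, len(c) - 1)):
            kept.append(c)
    return kept
```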

lecture5

k-means

Minimize $\sum_{j=1}^{k}\sum_{x_i \in C_j} \|x_i - \mu_j\|_2^2$ over assignments and centroids $\mu_j$.

Mahalanobis k-means

Minimize $\sum_{j=1}^{k}\sum_{x_i \in C_j} (x_i - \mu_j)^\top \Sigma_j^{-1} (x_i - \mu_j)$,
where $\Sigma_j$ is the covariance matrix of cluster $C_j$.
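
A minimal sketch of Lloyd's algorithm for the plain Euclidean objective (names and the init scheme are my own choices):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's algorithm for k-means."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assignment step: nearest center in squared Euclidean distance
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        # update step: mean of each cluster (keep old center if empty)
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers
```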

Spectral Clustering

Point set $\{x_1, \dots, x_n\}$.
Edge $(i, j)$ exists if $\|x_i - x_j\| \le \epsilon$.
Edge weight $w_{ij}$:

  • $w_{ij} = 1$ (unweighted), or
  • $w_{ij} = e^{-\|x_i - x_j\|^2 / (2\sigma^2)}$.

Minimize $\operatorname{tr}(H^\top L H)$ (s.t. $H^\top H = I$)

  • $L = D - W$ is the graph Laplacian.
  • $W$ is the similarity matrix.
  • $D$ is a diagonal matrix, $D_{ii} = \sum_j w_{ij}$.

Solution

$L h = \lambda h$.
The smallest eigenvector is the constant vector $\mathbf{1}$ (eigenvalue $0$), useless. Return the $[2, k+1]$ smallest eigenvectors.
Remember the normalization: normalize each row of the eigenvector matrix before clustering.
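
A minimal sketch of the unnormalized spectral embedding described above (names are mine; running k-means on the returned rows is left to the caller):

```python
import numpy as np

def spectral_embedding(W, k):
    """Rows of the returned matrix are the spectral coordinates to
    feed into k-means.  W is a symmetric similarity matrix."""
    D = np.diag(W.sum(axis=1))
    L = D - W                          # graph Laplacian L = D - W
    eigvals, eigvecs = np.linalg.eigh(L)
    # skip the constant eigenvector (eigenvalue 0), keep the next k
    H = eigvecs[:, 1:k + 1]
    # row normalization before clustering
    return H / np.linalg.norm(H, axis=1, keepdims=True)
```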

NMF

Let $X \in \mathbb{R}^{n \times d}_{\ge 0}$ be the non-negative data matrix.
NMF factorization:

$$X \approx W H$$

where $W \in \mathbb{R}^{n \times k}_{\ge 0}$, $H \in \mathbb{R}^{k \times d}_{\ge 0}$.
Minimize
$\|X - W H\|_F^2$, s.t. $W \ge 0$, $H \ge 0$.
Iteration (multiplicative updates):

$$H_{aj} \leftarrow H_{aj}\,\frac{(W^\top X)_{aj}}{(W^\top W H)_{aj}}, \qquad W_{ia} \leftarrow W_{ia}\,\frac{(X H^\top)_{ia}}{(W H H^\top)_{ia}}$$

Result:
Row $h_a$ of $H$ can be considered as the $a$-th cluster. $W_{ia}$ can be considered as the association between $x_i$ and cluster $a$.
$x_i$ is labeled to cluster $a$ s.t. $a = \arg\max_{a'} W_{ia'}$.
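
A minimal sketch of these multiplicative (Lee-Seung) updates (names are mine; `eps` guards against division by zero):

```python
import numpy as np

def nmf(X, k, iters=200, eps=1e-9, seed=0):
    """Multiplicative updates for min ||X - WH||_F^2 with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.random((n, k))
    H = rng.random((k, d))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update H
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update W
    labels = W.argmax(axis=1)                  # cluster assignment
    return W, H, labels
```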

lecture8,9

Naive Bayes

For a given item to be classified, compute the probability of each class conditioned on the item's appearance; the item is assigned to whichever class has the largest probability.
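
In symbols, this is the standard MAP decision rule, with the naive assumption that features are conditionally independent given the class:

$$\hat{y} = \arg\max_{c}\; P(c)\prod_{j=1}^{d} P(x_j \mid c)$$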

SVM (hinge loss)

hard margin

Maximize

$$\frac{2}{\|w\|}$$

s.t. $y_i (w^\top x_i + b) \ge 1$ for all $i$ (equivalently, minimize $\frac{1}{2}\|w\|^2$).

soft margin and hinge loss

Minimize $\frac{1}{2}\|w\|^2 + C \sum_i \max\left(0,\, 1 - y_i (w^\top x_i + b)\right)$.

soft margin and logistic loss

Minimize $\frac{1}{2}\|w\|^2 + C \sum_i \log\left(1 + e^{-y_i (w^\top x_i + b)}\right)$.
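
A minimal sketch of batch sub-gradient descent on the soft-margin hinge objective (the function name, learning rate, and iteration count are my own choices, not from the lecture):

```python
import numpy as np

def svm_hinge_gd(X, y, C=1.0, lr=1e-3, iters=1000):
    """Batch sub-gradient descent on
    (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (w.x_i + b)), y_i in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(iters):
        margins = y * (X @ w + b)
        viol = margins < 1                  # only margin violators contribute
        grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```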

lecture10

minimize

$$f(x)$$

s.t. $g_i(x) \le 0,\ i = 1, \dots, m$.

Solution $x^*$, optimal value $p^*$.

dual problem

$L(x, \lambda) = f(x) + \sum_{i} \lambda_i g_i(x)$; dual function $q(\lambda) = \inf_x L(x, \lambda)$.

Maximize $q(\lambda)$, s.t. $\lambda \ge 0$.
Solution $\lambda^*$, optimal value $d^*$.
Weak duality: $d^* \le p^*$ (always holds).
Strong duality: $d^* = p^*$ (holds when the KKT conditions are satisfied).

KKT

A necessary and sufficient condition for a nonlinear programming problem to have an optimal solution.

The optimal point satisfies

  1. Dual constraints: $\lambda_i \ge 0$.
  2. Complementary slackness: $\lambda_i g_i(x) = 0$ for all $i$.
  3. Lagrangian gradient: $\nabla f(x) + \sum_i \lambda_i \nabla g_i(x) = 0$.
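
A tiny worked instance (my own example, not from the lecture): minimize $f(x) = x^2$ s.t. $g(x) = 1 - x \le 0$.

$$L(x, \lambda) = x^2 + \lambda(1 - x)$$

Stationarity gives $2x - \lambda = 0$; complementary slackness gives $\lambda(1 - x) = 0$. Taking $\lambda > 0$ forces $x^* = 1$, hence $\lambda^* = 2 \ge 0$, and all three conditions hold. The dual function is $q(\lambda) = \lambda - \lambda^2/4$, maximized at $\lambda = 2$ with $d^* = 1 = p^*$, so strong duality holds here.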

lecture11

semisupervised learning

Labels are expensive; the goal is to label only a small number of samples and still achieve good performance.
Utilize the unlabeled data

active learning

Label the most informative data

ensemble learning

  • Different classifiers may make different predictions on test instances
  • Increase the prediction accuracy by combining the results from multiple classifiers

lecture12

The linear model is $f(X) = \beta_0 + \sum_{j=1}^{p} X_j \beta_j$, where the $\beta_j$'s are unknown coefficients.
$X_j$ could be

  • quantitative inputs
  • functions of quantitative inputs
  • basis expansions (e.g. $X_2 = X_1^2$, $X_3 = X_1^3$)
  • numeric coding of qualitative inputs

Least Squares

Given a set of training data $(x_1, y_1), \dots, (x_N, y_N)$.
Minimize

$$\operatorname{RSS}(\beta) = \sum_{i=1}^{N}\left(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\right)^2$$

valid if the $y_i$'s are conditionally independent given the inputs $x_i$.

Matrix tools: $\operatorname{RSS}(\beta) = (\mathbf{y} - \mathbf{X}\beta)^\top(\mathbf{y} - \mathbf{X}\beta)$ ($\mathbf{X} \in \mathbb{R}^{N \times (p+1)}$, first column is $\mathbf{1}$.)

Differentiate: $\frac{\partial \operatorname{RSS}}{\partial \beta} = -2\,\mathbf{X}^\top(\mathbf{y} - \mathbf{X}\beta) = 0$.
Assume $\mathbf{X}^\top\mathbf{X}$ is invertible: $\hat{\beta} = (\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{y}$.
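
A minimal sketch of the closed-form fit (names are mine; `lstsq` is used rather than forming $(\mathbf{X}^\top\mathbf{X})^{-1}$ explicitly, for numerical stability):

```python
import numpy as np

def ols_fit(X, y):
    """Least squares: solves for beta_hat = (X^T X)^{-1} X^T y.
    Prepends an all-ones column for the intercept beta_0."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return beta
```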

Ridge Regression

$$\hat{\beta}^{\text{ridge}} = \arg\min_{\beta}\left\{\sum_{i=1}^{N}\left(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\right)^2 + \lambda\sum_{j=1}^{p}\beta_j^2\right\}$$

where $\lambda \ge 0$ is a complexity parameter.

Matrix tools: $\operatorname{RSS}(\lambda) = (\mathbf{y} - \mathbf{X}\beta)^\top(\mathbf{y} - \mathbf{X}\beta) + \lambda\,\beta^\top\beta$ (inputs centered; the intercept is excluded from the penalty).

Differentiate: $\hat{\beta}^{\text{ridge}} = (\mathbf{X}^\top\mathbf{X} + \lambda I)^{-1}\mathbf{X}^\top\mathbf{y}$.
Finally let $\beta_0 = \bar{y} = \frac{1}{N}\sum_i y_i$, and fit the remaining $\beta$ on the centered inputs.
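
A matching sketch for ridge with centered inputs (names are mine):

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge: center inputs, set beta_0 = mean(y),
    then beta = (X^T X + lam*I)^{-1} X^T y on the centered data."""
    Xc = X - X.mean(axis=0)
    beta0 = y.mean()
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p),
                           Xc.T @ (y - beta0))
    return beta0, beta
```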

lecture13 Mining web data

ranking

recommendation

item-based recommendation

  • Describe each item with a vector.
  • Item similarity: cosine between the vectors, sim(i,j) = cos(i,j).
  • Estimate user u's preference for item i by a similarity-weighted sum (see the sketch after this list); R(u,j) denotes u's rating of item j.
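
A minimal sketch of that weighted sum, in the standard form $\hat{R}(u,i) = \frac{\sum_{j} \operatorname{sim}(i,j)\, R(u,j)}{\sum_{j} |\operatorname{sim}(i,j)|}$ (the function and argument names are my own):

```python
def predict_rating(u, i, R, sim, rated):
    """Item-based CF: weighted average of u's ratings on items similar to i.

    R[u][j]   : rating matrix (0 = unrated)
    sim[i][j] : precomputed item-item cosine similarity
    rated     : indices of items user u has rated
    """
    num = sum(sim[i][j] * R[u][j] for j in rated)
    den = sum(abs(sim[i][j]) for j in rated)
    return num / den if den else 0.0
```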

CC BY-SA 4.0 review for data mining by himemeizhi is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

  • You studied machine learning too! (rapidly closing in)

    • It was a required course; I learned it and promptly forgot it. You probably understand far more than I do by now. I'm not very interested in things that can't be characterized concisely and elegantly...