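Introduction to t-SNE
t-SNE (t-distributed stochastic neighbor embedding) is a dimensionality reduction method. It maps high-dimensional data to two or three dimensions while trying to preserve the similarity structure of the original data, so that points that are close in the original space stay close in the embedding.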
How it works
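As a rough sketch of the standard formulation: t-SNE turns pairwise distances in the original space into similarities P using Gaussian kernels, defines similarities Q in the low-dimensional map with a Student-t kernel (one degree of freedom), and moves the map points to minimize the KL divergence KL(P‖Q). The snippet below only computes these quantities with NumPy for a random, unoptimized map. It assumes a single shared bandwidth `sigma`, whereas real t-SNE tunes one bandwidth per point from the perplexity, and all helper names are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distance between every pair of rows."""
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=-1)

def high_dim_affinities(X, sigma=1.0):
    """Symmetric Gaussian similarities P (simplified: one shared sigma)."""
    d2 = pairwise_sq_dists(X)
    P = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)                # a point is not its own neighbor
    P = P / P.sum(axis=1, keepdims=True)    # conditional p(j|i): rows sum to 1
    return (P + P.T) / (2.0 * len(X))       # symmetrize: all entries sum to 1

def low_dim_affinities(Y):
    """Student-t (1 degree of freedom) similarities Q in the map."""
    inv = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(inv, 0.0)
    return inv / inv.sum()

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q), the cost that t-SNE minimizes over the map points."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / (Q[mask] + eps))))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))   # 6 points in 5 dimensions
Y = rng.normal(size=(6, 2))   # a random (unoptimized) 2-D map
P = high_dim_affinities(X)
Q = low_dim_affinities(Y)
cost = kl_divergence(P, Q)
print(f"KL(P || Q) for the random map: {cost:.4f}")
```

The heavy tail of the Student-t kernel lets moderately distant points sit far apart in the map, which is what keeps clusters from crowding together in 2-D.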
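Purpose of t-SNE
- Mainly to visualize high-dimensional data (commonly used in cluster analysis)
- To simplify the training and prediction of machine learning models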
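To illustrate the first use case, here is a minimal sketch that clusters a dataset in its original feature space and then uses t-SNE only to draw the result in 2-D. The Iris dataset, `KMeans` with three clusters, and the parameter values are choices made for this illustration.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X = load_iris().data  # 150 samples, 4 features

# Cluster in the original 4-D space (3 clusters is an assumption for Iris).
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Project to 2-D purely for plotting; the clustering did not use the embedding.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

fig, ax = plt.subplots()
ax.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="viridis", s=15)
ax.set_title("KMeans clusters shown in a t-SNE embedding")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the figure. Distances between well-separated groups in a t-SNE plot should be read with caution, since the method mainly preserves local neighborhoods.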
Using t-SNE
The examples below use `sklearn.manifold`.
A simple demo
```python
import pandas as pd

data = pd.DataFrame({
    'A': [2, 2.1, 2.5, 6],
    'B': [3, 10.1, 12.5, 8],
    'C': [3.5, 7.1, 6.5, 9],
    '类别': [1, 2, 3, 1],  # class label of each row
})
data
```
![image.png](https://cdn.nlark.com/yuque/0/2020/png/613759/1603250565677-52f92332-c759-40d6-8264-20a74da4c78d.png#align=left&display=inline&height=134&margin=%5Bobject%20Object%5D&name=image.png&originHeight=134&originWidth=153&size=5000&status=done&style=none&width=153)
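```python
from sklearn.manifold import TSNE

# Embed the feature columns only; the class label is kept aside for plotting.
# perplexity must be smaller than the number of samples (4 here), so the
# default of 30 is replaced; random_state makes the run repeatable.
model = TSNE(n_components=2, perplexity=2, random_state=0)
embedding = model.fit_transform(data.drop(columns='类别'))
tsne = pd.DataFrame(embedding, index=data.index)
tsne
```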
![image.png](https://cdn.nlark.com/yuque/0/2020/png/613759/1603250529801-9f17b022-d091-4e99-b370-8dbffb5bdd29.png#align=left&display=inline&height=133&margin=%5Bobject%20Object%5D&name=image.png&originHeight=133&originWidth=186&size=7948&status=done&style=none&width=186)
```python
import matplotlib.pyplot as plt

# Plot each class with its own marker: column 0 is x, column 1 is y.
d = tsne[data['类别'] == 3]
plt.plot(d[0], d[1], 'r.')

d1 = tsne[data['类别'] == 1]
plt.plot(d1[0], d1[1], 'go')

d2 = tsne[data['类别'] == 2]
plt.plot(d2[0], d2[1], 'b*')
plt.show()
```
![output_3_0.png](https://cdn.nlark.com/yuque/0/2020/png/613759/1603250472132-37a9b707-2a31-464e-aa61-c5093b1d827c.png#align=left&display=inline&height=248&margin=%5Bobject%20Object%5D&name=output_3_0.png&originHeight=248&originWidth=383&size=4351&status=done&style=none&width=383)
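The per-class plotting above repeats the same two lines for every label. One possible way to generalize it to any number of classes is to group the embedding by label and add a legend. The sketch below rebuilds the demo data so it runs on its own; `perplexity=2`, the column names `x` and `y`, and the legend text are assumptions made for this illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.manifold import TSNE

data = pd.DataFrame({
    'A': [2, 2.1, 2.5, 6],
    'B': [3, 10.1, 12.5, 8],
    'C': [3.5, 7.1, 6.5, 9],
    '类别': [1, 2, 3, 1],  # class label of each row
})

features = data.drop(columns='类别')
# perplexity=2 is an assumption: it must stay below the 4 available samples.
coords = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(features)
tsne = pd.DataFrame(coords, index=data.index, columns=['x', 'y'])

fig, ax = plt.subplots()
# Group the embedded points by class label and draw one series per class.
for label, group in tsne.groupby(data['类别']):
    ax.scatter(group['x'], group['y'], label=f'class {label}')
ax.legend()
```

Call `plt.show()` to display the figure. With only four samples the layout carries little meaning; the point of the sketch is the plotting pattern.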