- Random under-sampling
- CLUE
- EditedNearestNeighbours under-sampling technique
- NearMiss 3 under-sampling technique (E3_NM)
CLUE
Cluster-based Under-Sampling
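- Under-sampling mainly targets the majority class
- In general, we want the under-sampled data to retain the main characteristics of the original data
- A natural idea: first use a clustering algorithm to partition the data into several clusters, then sample from each cluster separately
- Thinking one step further...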
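The cluster-then-sample idea can be sketched as follows. This is only a minimal illustration, not the CLUE algorithm itself: the helper names `kmeans` and `cluster_undersample`, the choice of k-means as the clustering algorithm, and the proportional per-cluster quota are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Tiny Lloyd's k-means (hypothetical helper); returns a cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def cluster_undersample(majority, n_keep, k=3, seed=0):
    """Keep roughly n_keep majority points, drawn from each cluster
    in proportion to the cluster's size (an assumed sampling rule)."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, seed=seed)
    kept = []
    for c in range(k):
        members = [p for p, l in zip(majority, labels) if l == c]
        if not members:
            continue
        # At least one point per non-empty cluster so no region disappears.
        quota = max(1, round(n_keep * len(members) / len(majority)))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept
```

Because of rounding and the one-point minimum per cluster, the number of points returned is close to `n_keep` but not guaranteed to equal it.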
EditedNearestNeighbours under-sampling technique
- a majority instance is removed if its class label does not agree with the labels of its K nearest neighbors
- removes noisy and borderline instances
- improves the accuracy of the decision boundary
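The removal rule can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the function name `edited_nearest_neighbours` is hypothetical, distances are Euclidean, and "agreement" is taken to mean matching the most common label among the K neighbors (a stricter variant would require all neighbors to agree).

```python
import math
from collections import Counter


def edited_nearest_neighbours(X, y, majority_label, k=3):
    """Return the indices of samples kept after ENN-style cleaning
    of the majority class (hypothetical helper)."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            keep.append(i)  # minority samples are never removed
            continue
        # Indices of the k nearest other samples, by Euclidean distance.
        neighbours = sorted((j for j in range(len(X)) if j != i),
                            key=lambda j: math.dist(xi, X[j]))[:k]
        # Assumption: agreement = matching the most common neighbour label.
        vote = Counter(y[j] for j in neighbours).most_common(1)[0][0]
        if vote == yi:
            keep.append(i)
    return keep


# A majority point (label 0) sitting inside the minority cluster gets removed.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (5.2, 5.2)]
y = [0, 0, 0, 0, 1, 1, 1, 0]
kept_indices = edited_nearest_neighbours(X, y, majority_label=0, k=3)
```

Using an odd K with two classes avoids ties in the vote.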
NearMiss 3 under-sampling technique
- belongs to the NearMiss family
- under-samples the majority class according to the distances between majority and minority samples
- removes the majority samples that lie farthest from the minority samples, keeping those among each minority sample's K nearest neighbors
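One reading of this rule can be sketched as below: for every minority sample, keep its K nearest majority samples and drop all other majority samples. The function name `near_miss_3` and the Euclidean distance are assumptions for illustration, and some library implementations add a further selection step, so treat this as a sketch of the idea rather than a reference implementation.

```python
import math


def near_miss_3(majority, minority, k=2):
    """Keep the majority samples that are among the k nearest majority
    neighbours of at least one minority sample (hypothetical helper)."""
    kept = set()
    for m in minority:
        # Indices of the k majority samples closest to this minority sample.
        nearest = sorted(range(len(majority)),
                         key=lambda j: math.dist(m, majority[j]))[:k]
        kept.update(nearest)
    # Majority samples far from every minority sample are dropped.
    return [majority[j] for j in sorted(kept)]


majority = [(0, 0), (1, 0), (2, 0), (9, 9), (10, 10)]
minority = [(0.5, 0.5), (1.5, 0.5)]
selected = near_miss_3(majority, minority, k=2)
```

In this toy data the two distant majority points at (9, 9) and (10, 10) are discarded, while every minority sample stays surrounded by nearby majority samples.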
Machine Learning
Applications and practices