Under-Sampling

  • Random under-sampling
  • CLUE
  • EditedNearestNeighbours under-sampling technique
  • NearMiss 3 under-sampling technique (E3_NM)

CLUE

Cluster-based Under-Sampling

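  • Under-sampling mainly targets the majority class
  • In general, we want the data to keep its main characteristics after under-sampling
  • A natural idea: first use a clustering algorithm to partition the data into several clusters, then sample from each cluster separately
  • Thinking one step further...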
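
The "cluster first, then sample from each cluster" idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CLUE algorithm itself: the tiny k-means, the helper names `kmeans` and `cluster_undersample`, and the proportional per-cluster quota are all hypothetical choices made for the example (the slide does not say how many points to draw from each cluster).

```python
import math
import random


def kmeans(points, k, n_iter=20, seed=0):
    """Tiny Lloyd's k-means (illustrative); returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def cluster_undersample(majority, n_keep, k=3, seed=0):
    """Keep roughly n_keep majority points, drawn from every cluster.

    Assumption: each cluster contributes in proportion to its size,
    with at least one point per non-empty cluster.
    """
    rng = random.Random(seed)
    labels = kmeans(majority, k, seed=seed)
    kept = []
    for c in range(k):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        if not members:
            continue
        quota = max(1, round(n_keep * len(members) / len(majority)))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept


# Toy majority class: three Gaussian blobs of 40 points each.
data_rng = random.Random(42)
majority = [(data_rng.gauss(cx, 0.3), data_rng.gauss(cy, 0.3))
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(40)]
kept = cluster_undersample(majority, n_keep=30, k=3)
```

Because every non-empty cluster contributes points, the reduced set still covers each region of the majority class. Plain random under-sampling could, by chance, leave a small cluster empty. The final size is only approximately `n_keep` because of per-cluster rounding.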

NearMiss 3 under-sampling technique

  • belongs to the NearMiss family
  • under-samples the majority class according to the distance from each majority sample to the minority samples
  • removes the majority samples that lie farthest from the minority samples, judged by the minority samples’ K nearest neighbors
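
One reading of the bullets above, sketched in plain Python: for every minority sample, keep its K nearest majority samples and drop every majority sample that no minority sample selects. The function name `nearmiss3_sketch` and the toy points are hypothetical. This is a simplified interpretation, and library implementations of NearMiss-3 may add further selection steps and give different results.

```python
import math


def nearmiss3_sketch(majority, minority, k=3):
    """Keep majority points that are among the k nearest majority
    neighbors of at least one minority point; drop the rest."""
    keep = set()
    for m in minority:
        # Indices of majority points, closest to this minority point first.
        order = sorted(range(len(majority)),
                       key=lambda i: math.dist(majority[i], m))
        keep.update(order[:k])
    # Return survivors in their original order.
    return [majority[i] for i in sorted(keep)]


minority = [(0.0, 0.0), (1.0, 0.0)]
majority = [(0.1, 0.0), (0.0, 0.2), (0.9, 0.1),
            (5.0, 5.0), (6.0, 5.0), (-4.0, 3.0)]
kept = nearmiss3_sketch(majority, minority, k=2)
# kept == [(0.1, 0.0), (0.0, 0.2), (0.9, 0.1)]
```

The three far-away majority points are removed, while every minority sample stays surrounded by at least K majority samples.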
Machine Learning Applications and practices