Outlier: Difference between revisions

From Rice Wiki
m (Rice moved page Outliers to Outlier)
Line 1: Line 1:
Outliers are samples that show abnormal distance from other samples. They impact the accuracy of the model.
Outliers are samples in a [[dataset]] that show abnormal distance from other samples. They impact the accuracy of the model.


= Detection =
Outliers are detected during [[Exploratory data analysis]]. Several detection methods are listed.


Line 7: Line 8:
*Data Analysis such as box plot
*ML algorithms such as One-Class-SVM
 
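Numerically, a common rule flags a sample as an outlier if it lies more than 1.5×IQR below the first quartile (Q1) or above the third quartile (Q3), where IQR = Q3 - Q1 is the interquartile range.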
[[Category:Machine Learning]]

Latest revision as of 06:48, 26 April 2024

Outliers are samples in a dataset that lie abnormally far from the other samples. They can distort a model fitted to the data and reduce its accuracy.

Detection

Outliers are usually detected during Exploratory data analysis. Common detection methods include:

  • Background knowledge, such as impossible values (e.g., a negative age)
  • Visualization, such as a scatter plot
  • Data analysis, such as a box plot
  • ML algorithms, such as One-Class SVM
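
As a minimal sketch of the background-knowledge approach, the check below flags records whose age falls outside a plausible range. The function name `impossible_ages`, the record layout, and the bounds 0 and 120 are illustrative assumptions, not something the page prescribes.

```python
def impossible_ages(records, min_age=0, max_age=120):
    """Return the records whose 'age' lies outside [min_age, max_age].

    The bounds are assumed for illustration; choose them from domain knowledge.
    """
    return [r for r in records if not (min_age <= r["age"] <= max_age)]


# Hypothetical sample data: the second record has an impossible (negative) age.
records = [
    {"name": "a", "age": 34},
    {"name": "b", "age": -2},
    {"name": "c", "age": 57},
]

print(impossible_ages(records))
```

A rule like this only catches values that are impossible by definition. Unusual but valid values still need one of the other methods in the list.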

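Numerically, a common rule flags a sample as an outlier if it lies more than 1.5×IQR below the first quartile (Q1) or above the third quartile (Q3), where IQR = Q3 - Q1 is the interquartile range.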
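
The interquartile-range rule can be sketched with Python's standard library. The function name `iqr_outliers`, the sample data, and the choice of the "inclusive" quartile method are assumptions made for illustration. Other quartile conventions give slightly different fences.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: Q1 = 24, Q3 = 29, IQR = 5, so the fences are 16.5 and 36.5.
ages = [21, 23, 24, 25, 26, 27, 29, 30, 95]
print(iqr_outliers(ages))  # [95]
```

These are the same fences a box plot conventionally uses for its whiskers, so points drawn individually beyond the whiskers correspond to the values this function returns.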