Outlier
From Rice Wiki
Latest revision as of 06:48, 26 April 2024
Outliers are samples in a dataset that lie at an abnormal distance from the other samples. They can reduce the accuracy of a model trained on that data.
Detection
Outliers are usually detected during exploratory data analysis. Common detection methods include:
- Background knowledge, such as flagging impossible values (for example, a negative age)
- Visualization, such as a scatter plot
- Data analysis, such as a box plot
- ML algorithms, such as One-Class SVM
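Numerically, a common rule (the one used by box plots) treats a sample as an outlier if it lies more than 1.5 × IQR below the first quartile (Q1) or above the third quartile (Q3), where IQR = Q3 − Q1 is the interquartile range.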
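The 1.5 × IQR rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `iqr_outliers`, the parameter `k`, and the sample `ages` list are all made up for this example, and it assumes the usual box plot convention of measuring the distance from the first and third quartiles. Quartiles can be computed in several ways; the "inclusive" method of the standard library's `statistics.quantiles` is used here, and other methods may give slightly different fences.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3.

    Hypothetical helper for illustration; k=1.5 is the conventional factor.
    """
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Made-up sample: seven plausible ages and one suspicious entry.
ages = [23, 25, 27, 29, 31, 33, 35, 120]
print(iqr_outliers(ages))  # prints [120]
```

For this sample, Q1 = 26.5 and Q3 = 33.5, so IQR = 7 and the fences are 16 and 44. Only 120 falls outside them.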