Outlier
From Rice Wiki
Latest revision as of 06:48, 26 April 2024
Outliers are samples in a dataset that lie an abnormal distance from the other samples. They can reduce the accuracy of a model trained on that data.
Detection
Outliers are usually detected during exploratory data analysis. Common detection methods include:
- Background knowledge, such as spotting impossible values like a negative age
- Visualization, such as a scatter plot
- Data analysis, such as a box plot
- ML algorithms, such as One-Class SVM
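The background-knowledge check can be illustrated with a minimal sketch. The function name `impossible_ages`, the `age` key, and the plausible range of 0 to 120 are all assumptions made for this example and should be replaced with rules that fit the actual dataset.

```python
def impossible_ages(records, min_age=0, max_age=120):
    """Return records whose 'age' falls outside the assumed plausible range.

    The bounds are illustrative assumptions, not values from the article.
    """
    return [r for r in records if not (min_age <= r["age"] <= max_age)]


# Hypothetical sample data: one record has a negative age.
people = [
    {"name": "a", "age": 34},
    {"name": "b", "age": -2},
    {"name": "c", "age": 51},
]
print(impossible_ages(people))
```

A rule like this only catches values that are known to be impossible; it does not replace the statistical methods in the list.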
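Numerically, a common rule defines outliers as values more than 1.5 × IQR below the first quartile (Q1) or above the third quartile (Q3), where the interquartile range is IQR = Q3 - Q1. These are the same limits that the whiskers of a box plot usually mark.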
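As a rough sketch of the 1.5 × IQR rule, the fences can be placed at Q1 - 1.5 × IQR and Q3 + 1.5 × IQR, and anything outside them flagged. The function name `iqr_outliers`, the parameter `k`, and the sample data are assumptions for illustration. Quartiles are computed with Python's standard `statistics.quantiles`, and other quartile conventions may give slightly different fences.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying below Q1 - k*IQR or above Q3 + k*IQR."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: nine ordinary values and one extreme value.
ages = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(iqr_outliers(ages))  # [102]
```

The multiplier `k` is left as a parameter because 1.5 is a convention rather than a fixed law; a larger value flags only more extreme points.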