Exploratory data analysis

From Rice Wiki
Revision as of 18:16, 3 April 2024 by Rice

Exploratory data analysis (EDA) is typically the first step in a machine learning pipeline. It lets us make informed decisions about which tools to use to analyze the data. Typical tasks include:

  • Inspect the features of the data (types, ranges, distributions)
  • Identify correlated features
  • Find trends and unusual characteristics
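A first look at the features can be sketched with pandas. This is a minimal illustration, not a prescribed workflow: the toy dataset, its column names, and the `summarize` helper are all made up for the example.

```python
import pandas as pd

# Hypothetical toy dataset; the column names are illustrative only.
sample = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "weight_kg": [50.0, 58.0, 66.0, 74.0, 82.0],
    "group": ["a", "b", "a", "b", "a"],
})


def summarize(frame: pd.DataFrame) -> dict:
    """Collect a few first-look facts about a DataFrame."""
    non_numeric = frame.select_dtypes(exclude="number").columns
    return {
        # Number of rows and columns.
        "shape": frame.shape,
        # Count, mean, std, min, quartiles, and max of numeric columns.
        "numeric_summary": frame.describe(),
        # Frequency of each value in every non-numeric column.
        "category_counts": {
            col: frame[col].value_counts().to_dict() for col in non_numeric
        },
    }


overview = summarize(sample)
print(overview["shape"])
print(overview["numeric_summary"])
print(overview["category_counts"])
```

On a real dataset the same calls (`describe`, `value_counts`) are usually run interactively, one column at a time, before any modeling decision is made.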

Dataset description

EDA also helps detect unwanted values and noise, such as missing values or outliers, that can lead to inaccurate predictions.
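As a rough sketch of how unwanted values might be flagged, the snippet below counts missing entries and marks outliers with the common 1.5 × IQR rule. The rule is one convention among several and is an assumption here, as are the sample readings and the `iqr_outliers` helper.

```python
import numpy as np
import pandas as pd

# Hypothetical readings with one missing value and one obvious outlier.
readings = pd.Series([10.0, 11.0, 9.5, np.nan, 10.5, 55.0, 10.2], name="reading")


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Boolean mask: True where a value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)  # quantile() skips NaN by default
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    # Comparisons against NaN are False, so missing values are never flagged.
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


n_missing = int(readings.isna().sum())
outlier_mask = iqr_outliers(readings)

print(n_missing)
print(readings[outlier_mask].tolist())
```

Whether a flagged value is noise or a genuine observation is a judgment call that depends on the dataset; the mask only points at candidates for inspection.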

EDA also reveals relationships between the attributes of a dataset, such as whether two attributes are linearly correlated.
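One way to check for linear relationships is the Pearson correlation coefficient. The example below is a minimal sketch on made-up columns (`x`, `linear`, `curved` are hypothetical names). It also shows a caveat: a coefficient near zero rules out a linear trend, not a relationship in general.

```python
import pandas as pd

# Hypothetical data: one column depends linearly on x, the other quadratically.
points = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "linear": [2.0, 4.0, 6.0, 8.0, 10.0],  # y = 2x
    "curved": [4.0, 1.0, 0.0, 1.0, 4.0],   # y = (x - 3)^2
})

# Pairwise Pearson correlation between all numeric columns.
corr = points.corr(method="pearson")

r_linear = corr.loc["x", "linear"]  # close to 1: strong linear relationship
r_curved = corr.loc["x", "curved"]  # close to 0: no linear trend, yet y depends on x

print(corr.round(3))
```

Because of this caveat, a correlation matrix is usually read alongside scatter plots rather than on its own.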