Real-world data often has missing values, for reasons such as observations that were never recorded or data that was corrupted.
Handling missing data is important because many machine learning algorithms do not support data with missing values.
In this tutorial, you will discover how to handle missing data for machine learning with Python.
Specifically, after completing this tutorial you will know:
- How to mark invalid or corrupt values as missing in your dataset.
- How to confirm that marked missing values cause problems for learning algorithms.
- How to remove rows with missing data from your dataset and evaluate a learning algorithm on the transformed dataset.
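
As a rough preview of the marking and row-removal steps listed above, here is a minimal sketch using pandas and NumPy. The toy DataFrame, its column names, and the rule that a zero means "not recorded" are all assumptions made for illustration, not data from this tutorial. The sketch does not evaluate a learning algorithm.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a 0 in "glucose" or "bmi" is physically impossible,
# so it is assumed here to stand for "not recorded".
df = pd.DataFrame({
    "glucose": [148, 0, 183, 89, 137],
    "bmi": [33.6, 26.6, 0.0, 28.1, 43.1],
    "label": [1, 0, 1, 0, 1],
})

# 1. Mark the invalid zeros as missing (NaN), in the affected columns only.
#    The "label" column is left alone because 0 is a valid class there.
cols = ["glucose", "bmi"]
df[cols] = df[cols].replace(0, np.nan)

# 2. Count missing values per column to confirm they are now visible.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# 3. Drop every row that contains at least one missing value.
cleaned = df.dropna()
print(cleaned.shape)
```

Restricting the replacement to specific columns matters: a blanket replacement of zeros across the whole table would also wipe out legitimate zeros, such as the class label in this sketch.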