20/10/2021
Non-Maximum Suppression: Theory and Implementation in PyTorch
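The greedy algorithm the post covers can be sketched in a few lines of NumPy (a simplified stand-in for the PyTorch implementation; the boxes, scores, and threshold below are illustrative):

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression on [x1, y1, x2, y2] boxes."""
    order = scores.argsort()[::-1]  # detection indices by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the current top-scoring box with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # keep only boxes that do not overlap the current box too much
        order = order[1:][iou <= iou_threshold]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30.]])
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # the two near-duplicate boxes collapse to one
```

The same logic maps directly onto tensors, which is essentially what `torchvision.ops.nms` provides.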
17/10/2021
Object Detection Metrics With Worked Example
Average Precision (AP) and mean Average Precision (mAP) are the most popular metrics used to evaluate object detection models such as Faster R-CNN, Mask R-CNN, and YOLO, among others. The same metrics have also been used to evaluate submissions in competitions such as the COCO and PASCAL VOC challenges.
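AP for a single class is the area under the precision-recall curve built by sweeping detections in descending confidence order; mAP averages AP over classes. A minimal sketch of all-point-interpolated AP (the detections, match flags, and ground-truth count below are a toy example):

```python
import numpy as np

def average_precision(scores, matches, n_gt):
    """All-point interpolated AP for one class.

    scores  : confidence of each detection
    matches : 1 if the detection matched a ground-truth box at the
              chosen IoU threshold, 0 if it is a false positive
    n_gt    : total number of ground-truth boxes
    """
    order = np.argsort(scores)[::-1]
    tp = np.cumsum(np.array(matches)[order])
    fp = np.cumsum(1 - np.array(matches)[order])
    recall = tp / n_gt
    precision = tp / (tp + fp)
    # make precision monotonically decreasing, then integrate over recall
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    return np.sum(np.diff(np.concatenate(([0.0], recall))) * precision)

# three detections, two of which hit the two ground-truth boxes
ap = average_precision([0.9, 0.8, 0.7], [1, 0, 1], n_gt=2)
print(ap)
```

mAP would simply be the mean of this quantity over all classes (and, in COCO style, over several IoU thresholds).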
13/10/2021
40 Must-Know Questions to Test a Data Scientist on Dimensionality Reduction Techniques
Q1) Imagine you have 1000 input features and 1 target feature in a machine learning problem. You have to select the 100 most important features based on the relationship between the input features and the target feature.
Do you think this is an example of dimensionality reduction?
A. Yes
B. No
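Selecting a feature subset by its relationship with the target can be sketched with scikit-learn's `SelectKBest` (the synthetic dataset and `f_classif` scoring here are illustrative choices, not part of the question):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# 1000 input features, 1 target; keep the 100 most related to the target
X, y = make_classification(n_samples=200, n_features=1000,
                           n_informative=20, random_state=0)
selector = SelectKBest(score_func=f_classif, k=100)
X_reduced = selector.fit_transform(X, y)
print(X_reduced.shape)  # (200, 100)
```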
Dimensionality Reduction - Part 5 - Practical Approach to Dimensionality Reduction Using PCA, LDA and Kernel PCA
Dimensionality reduction is an important approach in machine learning: a large number of features in a dataset may cause the learning model to overfit. To identify the set of significant features and reduce the dimension of the dataset, three popular dimensionality reduction techniques are commonly used.
In this article, we will discuss the practical implementation of these three dimensionality reduction techniques:
- Principal Component Analysis (PCA)
- Linear Discriminant Analysis (LDA), and
- Kernel PCA (KPCA)
- Comparison of PCA, LDA and Kernel PCA
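The practical difference between the three can be seen side by side on a small dataset (the iris data and two components are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# PCA: unsupervised, finds directions of maximum variance
X_pca = PCA(n_components=2).fit_transform(X)
# LDA: supervised, finds directions that best separate the classes
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
# Kernel PCA: PCA in an implicit feature space, captures nonlinear structure
X_kpca = KernelPCA(n_components=2, kernel="rbf").fit_transform(X)

print(X_pca.shape, X_lda.shape, X_kpca.shape)  # each (150, 2)
```

Note that only LDA uses the labels `y`; PCA and Kernel PCA ignore the target entirely.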
12/10/2021
Dimensionality Reduction - Part 4 - How to Perform SVD Dimensionality Reduction
One of the most popular techniques for dimensionality reduction in machine learning is Singular Value Decomposition (SVD). This is a technique that comes from the field of linear algebra and can be used as a data preparation technique to create a projection of a sparse dataset prior to fitting a model.
In this tutorial, you will discover how to use SVD for dimensionality reduction when developing predictive models. After completing this tutorial, you will know:
- SVD is a technique from linear algebra that can be used to automatically perform dimensionality reduction.
- How to evaluate predictive models that use an SVD projection as input and make predictions with new raw data.
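A minimal sketch of the pipeline approach, assuming scikit-learn's `TruncatedSVD` (which also works on sparse input) and a synthetic dataset; the component count and classifier are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=500, n_features=30,
                           n_informative=10, random_state=1)
model = Pipeline([
    ("svd", TruncatedSVD(n_components=10)),   # project 30 features down to 10
    ("clf", LogisticRegression(max_iter=1000)),
])
scores = cross_val_score(model, X, y, cv=5)  # accuracy per fold
print(scores.mean())
```

Wrapping the projection in a `Pipeline` means new raw data is transformed with the fitted SVD automatically at prediction time.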
Dimensionality Reduction - Part 3 - How to Perform PCA Dimensionality Reduction
The most popular technique for dimensionality reduction in machine learning is Principal Component Analysis (PCA). This is a technique that comes from the field of linear algebra and can be used as a data preparation technique to create a projection of a dataset prior to fitting a model.
In this tutorial, you will discover how to use PCA for dimensionality reduction when developing predictive models. After completing this tutorial, you will know:
- PCA is a technique from linear algebra that can be used to automatically perform dimensionality reduction.
- How to evaluate predictive models that use a PCA projection as input and make predictions with new raw data.
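The same pattern applies to PCA: fit the projection and the model together so new raw data passes through the identical transform. A sketch with an illustrative synthetic dataset and component count:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=8, random_state=2)
model = Pipeline([
    ("pca", PCA(n_components=8)),             # 20 features -> 8 components
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)
# a new raw row goes through the same fitted projection automatically
pred = model.predict(X[:1])
print(pred)
```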
11/10/2021
Dimensionality Reduction - Part 2 - How to Perform LDA Dimensionality Reduction
Linear Discriminant Analysis (LDA) is a predictive modeling algorithm for multiclass classification.
It can also be used as a dimensionality reduction technique, providing a projection of a training dataset that best separates the examples by their assigned class.
The ability to use Linear Discriminant Analysis for dimensionality reduction often surprises practitioners.
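As a sketch, the `transform` method of scikit-learn's LDA estimator provides the class-separating projection; the wine dataset here is an illustrative choice. One constraint worth knowing: LDA can produce at most n_classes - 1 components.

```python
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_wine(return_X_y=True)          # 13 features, 3 classes
# LDA yields at most n_classes - 1 components: here 2
lda = LinearDiscriminantAnalysis(n_components=2)
X_reduced = lda.fit_transform(X, y)
print(X_reduced.shape)  # (178, 2)
```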
10/10/2021
Dimensionality Reduction - Part 1 - What is Dimensionality Reduction?
The number of input variables or features for a dataset is referred to as its dimensionality. Dimensionality reduction refers to techniques that reduce the number of input variables in a dataset. More input features often make a predictive modeling task more challenging, a problem generally referred to as the curse of dimensionality.
High-dimensionality statistics and dimensionality reduction techniques are often used for data visualization. Nevertheless, these techniques can be used in applied machine learning to simplify a classification or regression dataset in order to better fit a predictive model.
In this tutorial, you will discover a gentle introduction to dimensionality reduction for machine learning.
Advanced Transform - Part 3 - How to Save and Load Data Transforms
It is critical that any data preparation performed on a training dataset is also performed on a new dataset in the future. This may include a test dataset when evaluating a model or new data from the domain when using a model to make predictions.
Typically, the model fit on the training dataset is saved for later use. The correct solution to preparing new data for the model in the future is to also save any data preparation objects, like data scaling methods, to file along with the model.
In this tutorial, you will discover how to save a model and data preparation object to file for later use. After completing this tutorial, you will know:
- The challenge of correctly preparing test data and new data for a machine learning model.
- The solution of saving the model and data preparation objects to file for later use.
- How to save and later load and use a machine learning model and data preparation model on new data.
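A minimal sketch of the save-and-load workflow using `pickle` (the file names, scaler, and model below are illustrative; `joblib` is a common alternative for large arrays):

```python
import pickle
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=100, n_features=5, random_state=3)
scaler = MinMaxScaler().fit(X)                         # fit prep on training data
model = LogisticRegression().fit(scaler.transform(X), y)

# save BOTH the model and the fitted data preparation object
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
with open("scaler.pkl", "wb") as f:
    pickle.dump(scaler, f)

# later: load both and apply the same preparation to new raw data
with open("model.pkl", "rb") as f:
    loaded_model = pickle.load(f)
with open("scaler.pkl", "rb") as f:
    loaded_scaler = pickle.load(f)
pred = loaded_model.predict(loaded_scaler.transform(X[:1]))
print(pred)
```

Saving only the model and re-fitting the scaler on new data is the classic mistake: the new data would be scaled with different statistics than the training data.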
08/10/2021
Advanced Transform - Part 2 - How to Transform the Target in Regression
On regression predictive modeling problems where a numerical value must be predicted, it can also be critical to scale and perform other data transformations on the target variable. This can be achieved in Python using the TransformedTargetRegressor class.
In this tutorial, you will discover how to use the TransformedTargetRegressor to scale and transform target variables for regression using the scikit-learn Python machine learning library.
After completing this tutorial, you will know:
- The importance of scaling input and target data for machine learning.
- The two approaches to applying data transforms to target variables.
- How to use the TransformedTargetRegressor on a real regression dataset.
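The basic usage can be sketched as follows (the synthetic dataset, regressor, and transformer are illustrative choices):

```python
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=4)
# the wrapper scales y before fitting and inverts the transform at predict time
model = TransformedTargetRegressor(regressor=LinearRegression(),
                                   transformer=StandardScaler())
model.fit(X, y)
pred = model.predict(X[:3])  # predictions come back on the original y scale
print(pred.shape)
```

The second approach the tutorial mentions is passing `func`/`inverse_func` (e.g. `np.log1p`/`np.expm1`) instead of a fitted transformer object.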
07/10/2021
Advanced Transform - Part 1 - How to Transform Numerical and Categorical Data
Applying data transforms like scaling or encoding categorical variables is straightforward when all input variables are the same type. It can be challenging when you have a dataset with mixed types and you want to selectively apply data transforms to some, but not all, input features.
The scikit-learn Python machine learning library provides the ColumnTransformer that allows you to selectively apply data transforms to different columns in your dataset. In this tutorial, you will discover how to use the ColumnTransformer to selectively apply data transforms to columns in a dataset with mixed data types.
After completing this tutorial, you will know:
- The challenge of using data transformations with datasets that have mixed data types
- How to define, fit, and use the ColumnTransformer to selectively apply data transforms to columns
- How to work through a real dataset with mixed data types and use the ColumnTransformer to apply different transforms to categorical and numerical data columns
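A minimal sketch of the `ColumnTransformer` pattern on a tiny mixed-type frame (the column names and transforms are illustrative):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# one numerical and one categorical column
df = pd.DataFrame({"age": [25, 32, 47, 51],
                   "colour": ["red", "blue", "red", "green"]})
ct = ColumnTransformer([
    ("num", MinMaxScaler(), ["age"]),      # scale numerical columns
    ("cat", OneHotEncoder(), ["colour"]),  # one-hot encode categorical columns
])
X = ct.fit_transform(df)
print(X.shape)  # 1 scaled column + 3 one-hot columns -> (4, 4)
```

Columns not named in any transformer are dropped by default; pass `remainder="passthrough"` to keep them unchanged.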