
Credit Card Fraud Transaction Model

Explored how XGBoost (XGB) can be used to detect fraudulent credit card transactions.

Credit Card Fraud Detection - Jupyter Notebook

Process

01

Data Preprocessing

The dataset contained numerous irrelevant and redundant columns, as well as missing values. Irrelevant columns were removed, and missing values were imputed: numerical variables with the mean, median, or mode, and categorical variables with the mode.
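The imputation step can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the column names are hypothetical, and because the write-up does not say which statistic was used for which numerical column, the median is assumed here.

```python
import numpy as np
import pandas as pd


def impute_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Fill numeric gaps with the column median and other gaps with the mode."""
    out = df.copy()
    numeric_cols = out.select_dtypes(include="number").columns
    for col in out.columns:
        if not out[col].isna().any():
            continue  # nothing to impute in this column
        if col in numeric_cols:
            # Assumption: median for numeric columns (robust to outliers).
            out[col] = out[col].fillna(out[col].median())
        else:
            # Most frequent value for categorical columns.
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


# Hypothetical columns, for illustration only.
transactions = pd.DataFrame({
    "amount": [12.5, np.nan, 980.0, 40.0],
    "merchant_category": ["grocery", "travel", None, "grocery"],
})
clean = impute_missing(transactions)
```

In a real pipeline the fill values would normally be computed on the training split only and then reused for validation and test data, so that no information leaks across splits.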

02

Multicollinearity

The dataset exhibited high multicollinearity, which can lead to unstable model estimates. The Variance Inflation Factor (VIF) was calculated for each variable, and variables with a high VIF were removed.
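One way to sketch VIF-based pruning is shown below. It relies on the identity that the VIFs equal the diagonal of the inverse correlation matrix. The cutoff of 10 and the drop-one-at-a-time loop are assumptions, since the write-up does not state the threshold or the removal order that was used.

```python
import numpy as np


def vif(X):
    """VIF per column: the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def drop_high_vif(X, names, threshold=10.0):
    """Repeatedly drop the column with the largest VIF until all are <= threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        scores = vif(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names


# Synthetic data: column "c" is almost exactly a + b, so it is collinear.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + b + rng.normal(scale=0.01, size=500)
X_kept, kept = drop_high_vif(np.column_stack([a, b, c]), ["a", "b", "c"])
```

Recomputing the VIFs after each removal matters, because dropping one variable changes the VIF of every variable that was correlated with it. Perfectly collinear columns make the correlation matrix singular, in which case `np.linalg.inv` raises an error and exact duplicates have to be removed first.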

03

Dimensionality Reduction

Principal Component Analysis (PCA) was applied to reduce the dimensionality of the data, helping to speed up the training process and potentially improving model performance by removing noise and redundant information.
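A minimal PCA sketch using NumPy's SVD is given below. Standardizing the features and keeping enough components to explain 95% of the variance are both assumptions, as the write-up does not say how many components were retained.

```python
import numpy as np


def pca_reduce(X, variance_to_keep=0.95):
    """Project standardized data onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # zero mean, unit variance
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = S**2 / np.sum(S**2)  # variance share per component
    # Smallest k whose cumulative share reaches the target.
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    k = min(k, len(S))
    return Xs @ Vt[:k].T, explained[:k]


# Synthetic data: five features driven by two latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))
mixing = np.array([[1.0, 0.0, 1.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0, -1.0, 1.0]])
X = latent @ mixing + rng.normal(scale=0.05, size=(1000, 5))
Z, explained = pca_reduce(X)
```

Here the five correlated features collapse to two components, which is the kind of reduction that shortens training. As with imputation, the mean, scale, and components would normally be fitted on the training split and then applied unchanged to held-out data.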

04

Hyperparameter Tuning

Initially, XGBoost was used to classify transactions, but its performance was subpar. Hyperparameter tuning with grid search and early stopping was then performed to improve model performance.

