By integrating diverse datasets
(employee demographics, job history, monthly earnings, and deductions)
spanning a six-year period, I developed and evaluated multiple
classification models, including Decision Trees, Random Forests,
Gradient Boosting, and Logistic Regression.
Key accomplishments include:
• Comprehensive Data Engineering: Consolidated, cleaned, and feature-engineered
complex relational data from multiple sources, handling missing values and
outliers to create a robust dataset.
• Advanced Predictive Modeling: Trained and optimized models using
techniques like SMOTE for class imbalance and GridSearchCV for
hyperparameter tuning, achieving strong predictive performance in
identifying employees at risk of attrition.
• Explainable AI (XAI) Integration: Applied LIME and SHAP to explain
why the models flagged specific employees as likely to leave.
• Impactful Analysis: Demonstrated that financial factors and
tenure are critical drivers of attrition, offering actionable
insights for targeted retention strategies.
Technologies Used:
Python, Pandas, NumPy, Scikit-learn, Imbalanced-learn, Matplotlib, Seaborn, LIME, SHAP.