Protein data classification model

Python

Developed a classification model for Cyclica that predicts drug binding from a protein dataset, and handled class imbalance by oversampling the minority class with SMOTE
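The SMOTE idea mentioned above can be illustrated with a minimal pure-Python sketch. This is not the project's code and not the imbalanced-learn implementation: the function name `smote_sketch`, the toy `minority` points, and the parameters `k` and `seed` are all hypothetical, chosen only to show how synthetic minority samples are interpolated between neighbouring minority samples.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base
        others = [p for p in minority if p is not base]
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        # Pick a random position on the segment from base to mate
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, mate)))
    return synthetic


# Hypothetical 2-D feature vectors for the under-represented (binding) class
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
```

Because each synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies. In practice, resampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic samples.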

Trained the model with XGBoost, tuned its hyperparameters with grid search, and applied decision thresholding to optimize the AUC score
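One way to read the thresholding step is sketched below, under a stated assumption: the AUC is computed on binarised (hard-label) predictions, in which case it equals the mean of the true-positive and true-negative rates and does depend on the cut-off. The function names and the toy labels and probabilities are hypothetical, and the probabilities stand in for the output of a trained classifier rather than real model scores.

```python
def hard_label_auc(y_true, y_prob, threshold):
    """AUC of binarised predictions, which equals (TPR + TNR) / 2."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def best_threshold(y_true, y_prob):
    """Sweep every distinct predicted probability and keep the best cut-off."""
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda t: hard_label_auc(y_true, y_prob, t))


# Hypothetical imbalanced validation set: 6 negatives, 2 positives
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.40, 0.80]
threshold = best_threshold(y_true, y_prob)
```

On this toy data the default cut-off of 0.5 misses the positive scored at 0.40, while the swept threshold recovers it. Choosing the threshold on a validation split, separate from the final test data, avoids an optimistic score.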