UK2007 Spam Detection Analysis
A machine learning school project on detecting web spam in the UK2007 data.

Tech Stack
R
RStudio
Markdown
Project Details
Analyzed three feature sets (Direct, transformed Link-based, and Content-based) to determine which best predict web spam
Trained Logistic Regression, Random Forest, and SVM models in R and evaluated them with cross-validation
Combined feature sets to test whether their performance gains are additive, and ranked the resulting models with AUC as the primary metric
Automated full report generation with RMarkdown, including plots, tables, and ROC curves for each classifier-feature combination
Provided domain-specific discussion on spam detection strategies and summarized insights into feature importance and classifier behavior
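The evaluation loop described above (cross-validated AUC for each combination of feature sets) can be sketched roughly as follows. This is not the project's actual code: it uses synthetic data, hypothetical column names (`direct_1`, `link_1`, `content_1`), one placeholder feature per set, and only base-R logistic regression in place of the full set of classifiers.

```r
# Minimal sketch on synthetic data; column names and feature groups are
# hypothetical and stand in for the real UK2007 features.
set.seed(42)
n <- 400
dat <- data.frame(
  direct_1  = rnorm(n),
  link_1    = rnorm(n),
  content_1 = rnorm(n)
)
# Synthetic label: spam probability depends on the link and content features
dat$spam <- rbinom(n, 1, plogis(-0.5 + 1.2 * dat$link_1 + 0.8 * dat$content_1))

# Hypothetical grouping of columns into the three feature sets
feature_sets <- list(
  direct  = "direct_1",
  link    = "link_1",
  content = "content_1"
)

# AUC via the rank-sum (Mann-Whitney) formulation; ties get average ranks
auc <- function(scores, labels) {
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Mean k-fold cross-validated AUC of a logistic regression on `features`
cv_auc <- function(data, features, k = 5) {
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  form <- reformulate(features, response = "spam")
  fold_aucs <- vapply(seq_len(k), function(i) {
    train <- data[folds != i, ]
    test  <- data[folds == i, ]
    fit <- glm(form, data = train, family = binomial)
    auc(predict(fit, newdata = test, type = "response"), test$spam)
  }, numeric(1))
  mean(fold_aucs)
}

# Every non-empty combination of the feature sets (7 for three sets)
set_names <- names(feature_sets)
combos <- unlist(
  lapply(seq_along(set_names),
         function(m) combn(set_names, m, simplify = FALSE)),
  recursive = FALSE
)

# Score each combination and rank by AUC, best first
results <- data.frame(
  feature_sets = vapply(combos, paste, character(1), collapse = " + "),
  auc = vapply(
    combos,
    function(cmb) cv_auc(dat, unlist(feature_sets[cmb], use.names = FALSE)),
    numeric(1)
  )
)
results <- results[order(-results$auc), ]
print(results)
```

Comparing the AUC of a combined row (for example `link + content`) against its single-set rows gives a rough read on whether the gains add up. Swapping `glm` for a Random Forest or SVM fit inside `cv_auc` would extend the same loop to the other classifiers.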