Resolved: Adjusted R-squared Dropped with Fewer Features
It was mentioned in the video that adjusted R-squared penalizes the model for extra variables added. However, that didn't seem to be the case with the data I'm working on (NOT THE SAT-GPA DATA). I ran the regression summary with more features and here is my result:
Seeing that two of the features, good_friends_at_work and opportunities_for_CPD_at_work, are insignificant based on α = 0.05, I decided to remove them from my dataset to see how my model would perform, and I got the following summary result:
It turns out that I got a lower adjusted R-squared even after dropping the two features.
What does that tell me about my data? Does it mean I should retain those features since I have a better adjusted R-squared with them?
Hello,
Adjusted R-squared is a handy statistic that tells us how well the model explains the data while accounting for the number of predictors used. It is computed as 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of predictors, so adding a variable only raises it when the improvement in fit outweighs the penalty for the extra parameter. The flip side is exactly what you observed: removing a variable can lower adjusted R-squared even when that variable looks statistically insignificant. As a rule of thumb, a predictor raises adjusted R-squared whenever its t-statistic exceeds 1 in absolute value, which is a much lower bar than the roughly |t| > 2 needed for significance at α = 0.05. In other words, good_friends_at_work and opportunities_for_CPD_at_work don't stand out on their own, but they still carry a little explanatory information, and the model loses that when you drop them. Whether to keep them is a judgment call: if interpretability and parsimony matter, the small drop in adjusted R-squared is usually an acceptable price for a simpler model.
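If you want to see this directly in your own data, a quick side-by-side comparison of the two models can help. The sketch below is only illustrative and makes a few assumptions: the DataFrame, CSV file name, and target column (job_satisfaction) are placeholders you would replace with your own, the two dropped columns are the ones from your summary, and it assumes you are using statsmodels' OLS for the summaries you posted.

```python
# Minimal sketch: compare adjusted R-squared with and without the two
# insignificant features. File name, DataFrame, and target column are
# placeholders; substitute your actual dataset and dependent variable.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("survey_data.csv")  # placeholder file name

y = df["job_satisfaction"]                      # hypothetical target column
X_full = df.drop(columns=["job_satisfaction"])  # all candidate predictors
X_reduced = X_full.drop(columns=["good_friends_at_work",
                                 "opportunities_for_CPD_at_work"])

# statsmodels does not add an intercept automatically, so add it explicitly.
model_full = sm.OLS(y, sm.add_constant(X_full)).fit()
model_reduced = sm.OLS(y, sm.add_constant(X_reduced)).fit()

print("Full model adjusted R-squared:   ", round(model_full.rsquared_adj, 4))
print("Reduced model adjusted R-squared:", round(model_reduced.rsquared_adj, 4))

# Rule of thumb: dropping a predictor whose |t| is above 1 lowers adjusted
# R-squared, even if its p-value is above 0.05.
print(model_full.tvalues[["good_friends_at_work",
                          "opportunities_for_CPD_at_work"]])
```

If those two t-statistics are between 1 and about 2 in absolute value, that explains why adjusted R-squared fell when you removed the features even though their p-values were above 0.05.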
Best,
Ned