Techniques and intuition for binning continuous variables (fine classing)
- How was it decided to use 50 bins for cutting?
- What sort of impact would a bin size of 43 have compared to a bin size of 50? I assume it doesn't matter much, but I was trying to get an intuition for how the bin size would affect the accuracy of the model.
Thanks for reaching out! Choosing the appropriate number of bins for a histogram is a complex topic. Even though there are established techniques, such as Sturges' rule, they rarely work well in practice, because real data is discrete and usually noisy.
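To make this a bit more concrete, here is a minimal sketch of Sturges' rule, k = ceil(log2(n)) + 1, next to two fixed choices of 43 and 50 equal-width bins. The data is synthetic and the names `sturges_bins` and `values` are made up for illustration; they are not from the course notebook.

```python
import math
import numpy as np

def sturges_bins(n: int) -> int:
    """Sturges' rule: k = ceil(log2(n)) + 1 bins for n observations."""
    return math.ceil(math.log2(n)) + 1

# Synthetic, right-skewed data standing in for a continuous feature (an assumption).
rng = np.random.default_rng(0)
values = rng.lognormal(mean=10.0, sigma=0.5, size=10_000)

# Bin count suggested by Sturges' rule, and NumPy's built-in version of it.
k_sturges = sturges_bins(values.size)
edges_sturges = np.histogram_bin_edges(values, bins="sturges")

# Equal-width bin edges for two nearby manual choices.
edges_43 = np.histogram_bin_edges(values, bins=43)
edges_50 = np.histogram_bin_edges(values, bins=50)
width_43 = edges_43[1] - edges_43[0]
width_50 = edges_50[1] - edges_50[0]

print("Sturges bins:", k_sturges)
print("Width ratio 43 vs 50 bins:", width_43 / width_50)
```

With equal-width bins over the same range, each of the 43 bins is 50/43 times as wide as each of the 50 bins, so the two groupings are fairly close. Whether that difference changes model accuracy is something to check on your own data, for example by refitting with both settings and comparing the results.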
If you're looking to develop a bit more intuition on the matter, you can check out the chapter on Histogram in the Data Visualization Course.
There is a lecture specifically dedicated to choosing the appropriate number of bins. Hope you'll find it instructive: https://learn.365datascience.com/courses/the-complete-data-visualization-course-with-python-r-tableau-and-excel/histogram-how-to-choose-the-right-number-of-bins