Resolved: example in granularity?

Question

In the video, when talking about whether to trust the R² value, one example cited is "granularity". That part is a bit unclear to me.

Can you please provide an example of "case-level" data as well as "aggregated" data?

Thank you.

Answer 1

Hi Ramzi,

Thank you for reaching out. This is indeed an important question. Here is an example:

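Imagine that you want to model the relationship between temperature and daylight (as shown in the video). This time, instead of using monthly data, you use daily data. Daily data is more granular than monthly data, right? In this example, the daily readings are the more granular (closer to case-level) data, and the monthly averages are the aggregated data.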

As you can imagine, there is much more variability in daily temperature data than in monthly data, where the day-to-day fluctuations are averaged out. This variability leads to larger distances between the data points and the regression line in the scatter plot. As a result, you get a lower R².

So typically, you get a higher R² when you use averaged data (e.g., monthly) than when you use the underlying data (e.g., daily).
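
If you would like to see the effect yourself, here is a minimal sketch in Python. Please note that the data is synthetic and made up for illustration only (it is not the data from the video): I assume 12 "months" of 30 days each, a sinusoidal daylight curve, and a temperature that depends linearly on daylight plus random day-to-day noise. The helper `r_squared` is my own name; it relies on the fact that, for a simple linear regression, R² equals the squared correlation coefficient.

```python
import numpy as np

def r_squared(x, y):
    """R² of a simple linear regression of y on x (squared Pearson correlation)."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Synthetic data (an assumption for illustration, not real measurements):
# 360 days, treated as 12 months of 30 days each.
days = np.arange(360)
daylight = 12 + 4 * np.sin(2 * np.pi * (days - 80) / 360)  # hours of daylight
noise = rng.normal(0, 4, size=days.size)                   # day-to-day variability
temperature = 10 + 2.5 * (daylight - 12) + noise           # degrees Celsius

# Granular (daily) data: one point per day
r2_daily = r_squared(daylight, temperature)

# Aggregated (monthly) data: average each block of 30 days into one point
daylight_monthly = daylight.reshape(12, 30).mean(axis=1)
temperature_monthly = temperature.reshape(12, 30).mean(axis=1)
r2_monthly = r_squared(daylight_monthly, temperature_monthly)

print(f"R² daily:   {r2_daily:.3f}")
print(f"R² monthly: {r2_monthly:.3f}")
```

The underlying relationship is identical in both cases. Only the averaging changes, and with it the amount of noise around the regression line, so the monthly R² should come out higher than the daily one.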

Does this help?