# Explain how to interpret the matrices output by histogram functions such as hist2d and histdd

I couldn't really understand the density array part of the matrices in hist2d and histdd. Please clarify.

You aren't the only one. I wouldn't worry too much about it. You are unlikely to use this in a real-world scenario. Stakeholders have difficulty understanding normal histogram plots, so as a data analyst you would hardly present them with a chart that resembles something like this.

For the sake of understanding, consider that we are using only the first two rows of our matrix_A array, where X is the first row and Y is the second.

Now take the first 1-D array given in the result of the 2-D histogram and read its bins as the rows of the 2-D result array, and read the bins of the second 1-D array as its columns. This means that:

0-0.75 is row 1 and 0.75-1.5 is row 2

2-3.75 is column 1 and 3.75-5.5 is column 2

So the entry at row 1, column 1 of the 2-D result array tells you how many (X, Y) pairs from matrix_A fall within both of those ranges at once.

For example, row 1 (0-0.75) and column 3 (5.5-7.25) contain 2 points (X=0, Y=6, which occurs twice in matrix_A), which is why we have 2 in row 1, column 3 of our 2-D result array.
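To make the row/column reading concrete, here is a minimal sketch. It assumes the first two rows of matrix_A are `[1, 0, 0, 3, 1]` and `[3, 6, 6, 2, 9]` (the values quoted elsewhere in this thread) and that the histogram was built with `bins=4`, which reproduces the edges shown here; the variable names are my own.

```python
import numpy as np

# Assumed data: the first two rows of matrix_A as quoted in this thread.
x = np.array([1, 0, 0, 3, 1])  # matrix_A[0]
y = np.array([3, 6, 6, 2, 9])  # matrix_A[1]

# bins=4 is an assumption; it yields four equal-width bins per axis.
counts, x_edges, y_edges = np.histogram2d(x, y, bins=4)

print(x_edges)       # X bin edges: consecutive pairs label the rows of `counts`
print(y_edges)       # Y bin edges: consecutive pairs label the columns of `counts`
print(counts[0, 2])  # row 1 (first X bin), column 3 (third Y bin)
```

Note that NumPy indexes from 0, so "row 1, column 3" in the explanation above is `counts[0, 2]` in code.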

I hope this helps

What I think is:

There are "TWO" coordinates, (0, 6) and (0, 6), in matrix_A that fall in the bins of row (0-0.75) and column (5.5-7.25),

so we get the number "2" in the density array,

and there are 5 coordinates in total (2+1+1+1 = 5), which is the sum of the element values in the density array.

Am I right?

Thanks!
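One way to check that reading is to sum the counts and compare the total with the number of points. This is a small sketch under the same assumptions as the rest of the thread (first two rows of matrix_A are `[1, 0, 0, 3, 1]` and `[3, 6, 6, 2, 9]`, `bins=4`):

```python
import numpy as np

# Assumed data: the first two rows of matrix_A as quoted in this thread.
x = np.array([1, 0, 0, 3, 1])
y = np.array([3, 6, 6, 2, 9])

counts, x_edges, y_edges = np.histogram2d(x, y, bins=4)

total = counts.sum()      # sum over all 16 cells
n_points = len(x)         # number of (x, y) pairs
print(total, n_points)    # the two numbers should agree
```

These are raw counts because `density` is left at its default. With `density=True` the entries are normalized and no longer sum to the number of points.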

```
array([[0., 0., 2., 0.],
       [1., 0., 0., 1.],
       [0., 0., 0., 0.],
       [1., 0., 0., 0.]])
X = array([0. , 0.75, 1.5 , 2.25, 3. ])
Y = array([2. , 3.75, 5.5 , 7.25, 9. ])
```

**Number 2 in the density array is located at the 1st row and 3rd column. Use this information to find the bin number in the bin edges:** find the **1st** bin from the **X** array which is *0 - 0.75*, find the **3rd** bin from the **Y** array which is *5.5 - 7.25*.

What do they all mean? Well, it just tells you there are **2** numbers (I am referring to number 2 in the density array here) that are between *0 and 0.75* in **matrix_A[0]**, and they are 0 and 0.

At those same two positions, **matrix_A[1]** holds 2 numbers between *5.5 and 7.25*, and they are 6 and 6. Together they form the two points (0, 6) and (0, 6).

Another example:

Number 1 at the 4th row, 1st column of the density array:

4 ---> 4th bin in the X array: 2.25-3

1 ---> 1st bin in the Y array: 2-3.75

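How many numbers in matrix_A[0] fall between 2.25 and 3? Only 1, the value 3.

Does the matrix_A[1] value at that same position fall between 2 and 3.75? Yes, it is 2, so the cell holds 1. (On its own, matrix_A[1] has two values in that range, 3 and 2, but the 3 is paired with an X of 1, which belongs to a different row.)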
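The value in a cell counts points, meaning positions where the X condition and the Y condition hold at the same time. A minimal sketch of that check for the 4th row, 1st column, assuming the same two rows of matrix_A as quoted in this thread (the mask names are my own):

```python
import numpy as np

# Assumed data: the first two rows of matrix_A as quoted in this thread.
x = np.array([1, 0, 0, 3, 1])
y = np.array([3, 6, 6, 2, 9])

in_x_bin = (x >= 2.25) & (x <= 3)  # 4th X bin (last bin, closed on both sides)
in_y_bin = (y >= 2) & (y < 3.75)   # 1st Y bin (closed-open)

n_x = int(in_x_bin.sum())                  # x values alone in the X bin
n_y = int(in_y_bin.sum())                  # y values alone in the Y bin
n_joint = int((in_x_bin & in_y_bin).sum()) # points inside both bins at once

print(n_x, n_y, n_joint)  # → 1 2 1
```

Only `n_joint` is what the 2-D array reports; the per-axis counts can be larger.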

Just remember the closed-open interval rule for the bin edges: each bin includes its left edge but not its right edge. As a reminder, the last bin is the exception, since it is actually closed-closed.
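The edge rule is easy to see with a 1-D histogram on the X edges. In this sketch the two test values are hypothetical, picked only because they sit exactly on bin edges:

```python
import numpy as np

x_edges = np.array([0, 0.75, 1.5, 2.25, 3])

# Hypothetical values chosen to land exactly on edges.
values = np.array([0.75, 3.0])

edge_counts, _ = np.histogram(values, bins=x_edges)
print(edge_counts)  # → [0 1 0 1]
```

0.75 is counted in the second bin (0.75-1.5), not the first, while 3.0 is still counted in the last bin (2.25-3) because that bin includes its right edge.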

I hope this helps.

I think it could be easier to understand this lesson if you write down the points of the matrix_A in the format (x,y):

(1,3) (0,6) (0,6) (3,2) (1,9)

Then you compare the x of each point above with the intervals in array([0. , 0.75, 1.5 , 2.25, 3. ]). The intervals here are (0-0.75), (0.75-1.5), (1.5-2.25), (2.25-3).

Similarly, compare the y with the intervals in array([2. , 3.75, 5.5 , 7.25, 9. ]). The intervals here are (2.0-3.75), (3.75-5.5), (5.5-7.25), (7.25-9.0).

For example, point (1,3) fits into these intervals: (0.75-1.5, 2.0-3.75), point (0,6) -> (0-0.75, 5.5-7.25), point (3,2) -> (2.25-3.0, 2.0-3.75), point (1,9) -> (0.75-1.5, 7.25-9.0)

So, the 2D array you see shows the frequency/density, that is, the number of points that fall into each possible pair of intervals. In total you have 4x4 = 16 possible pairs. For example, for the pair (0-0.75, 5.5-7.25) you've got 2 points -> (0,6), (0,6). As shown on the graph of the 2D histogram, (0-0.75, 5.5-7.25) has a darker shade, so it has a higher density (2) than the other occupied intervals (1).
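The same point-by-point assignment can be written out by hand. This is only a sketch of the idea, not how NumPy implements it internally; `bin_index` is a hypothetical helper of my own, and the edges are the ones quoted above.

```python
import numpy as np

points = [(1, 3), (0, 6), (0, 6), (3, 2), (1, 9)]
x_edges = np.array([0, 0.75, 1.5, 2.25, 3])
y_edges = np.array([2, 3.75, 5.5, 7.25, 9])

def bin_index(value, edges):
    """Hypothetical helper: 0-based bin index, treating the last bin as closed."""
    idx = np.searchsorted(edges, value, side="right") - 1
    return int(min(idx, len(edges) - 2))  # a value on the last edge stays in the last bin

manual_counts = np.zeros((4, 4))
for px, py in points:
    i, j = bin_index(px, x_edges), bin_index(py, y_edges)
    manual_counts[i, j] += 1
    print((px, py), "-> row", i + 1, ", column", j + 1)

print(manual_counts)
```

The resulting `manual_counts` should match what `np.histogram2d` returns for the same points with `bins=4`.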

It was pretty confusing to me at first too, but I hope this helps