# Resolved: Cannot understand why resetting indexes makes NaNs disappear

Can you better explain why initially np.exp(y_test) gives NaN values and resetting the indexes solves this issue? Thanks

Hey Alessandro,

Thank you for reaching out!

Let's try to understand what each of the variables stores.

The variable `targets`

stores the target values, with each of these values corresponding to a unique index. In the `targets`

variable, the indices are arranged in an ascending order.

The `train_test_split`

method reserves some of the observations for training and some - for testing. It does that by shuffling the samples, but preserving their original indices.

Now, `y_hat_test`

is an array of numbers representing the predictions from `x_test`

. Most importantly, `y_hat_test`

doesn't know anything about the indexing of `x_test`

. Therefore, once we add `np.exp(y_hat_test)`

as a column of a `DataFrame`

, the indexing naturally starts from 0 and goes down to 773 in an ascending fashion.

Now, imagine what happens when we add `np.exp(y_test)`

as a second column to the same `DataFrame`

object. `pandas`

will try to match the indices of `y_hat_test`

(ranging from 0 to 773) to those of `y_test`

(randomly drawn from the `targets`

variable). However, `y_test`

contains indices that are larger than 773. Additionally, some indices between 0 and 773 will not be included. Let's show that this is indeed the case.

In the code below, right after the definition of the `df_pf`

`DataFrame`

, I have created another one called `data_test`

that stores only the (exponential of the) values of `y_test`

together with their original indices.

From the output of the `df_pf`

`DataFrame`

, we can see that index 1 gives a non-null value in the **Target** column, while index 2 corresponds to null target. By typing

```
data_test.loc[1]
```

we see that the output is indeed 7900.0, as in the `DataFrame`

below. Typing

```
data_test.loc[2]
```

on the other hand, returns in an error. The reason is that there is no value in `data_test`

with an index of 2.

To resolve this issue, we reset the indices of the `y_test`

variable, such that they start from 0 and go down to 773 in an ascending order. In that way, each prediction will have a corresponding target with the same index.

Hope this helps! Let me know if anything remains unclear.

Kind regards,

365 Hristina

I have to play a bit around with it, but I think it's quite clear. Thank you!