What are different ways to make sure no errors are involved?

Question

Hi, is there any automated way or tool that can validate our work and make sure we do not make any of these errors? What happens when we detect an error very late in our analysis? How do data scientists solve the problem when they encounter multiple errors while working under very tight deadlines? Thanks in advance!

Answer 1

Hi Victor.

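There is no way to eliminate the two errors (Type I and Type II) simultaneously, because they are connected: for a fixed sample size, lowering the chance of one raises the chance of the other. The only sure way to reduce both is to increase the sample size (but that's also the hardest).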
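To make that connection concrete, here is a minimal sketch. It assumes a one-sided z-test with known unit variance, and the effect size and sample sizes are made-up numbers for illustration only. The function `type_ii_error` is a hypothetical helper, not part of any library.

```python
from statistics import NormalDist


def type_ii_error(alpha, effect_size, n):
    """Probability of a Type II error (beta) for a one-sided z-test.

    Assumes known unit variance; effect_size is the true mean shift
    measured in standard deviations (an illustrative assumption).
    """
    z = NormalDist()  # standard normal distribution
    critical = z.inv_cdf(1 - alpha)  # rejection cutoff for the chosen alpha
    return z.cdf(critical - effect_size * n ** 0.5)


# Same sample size: a stricter alpha (fewer Type I errors) gives a larger beta.
for alpha in (0.05, 0.01):
    print(f"n=25,  alpha={alpha}: beta={type_ii_error(alpha, 0.5, 25):.3f}")

# Same alpha: a larger sample shrinks beta.
print(f"n=100, alpha=0.05: beta={type_ii_error(0.05, 0.5, 100):.4f}")
```

With the sample size held fixed, tightening alpha pushes beta up. Only the larger sample lowers beta without loosening alpha.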

Usually, one of those errors is more important to you.

One example is fraud detection in banks.

There, a false positive is a transaction flagged as fraudulent when it isn't.

A false negative is a transaction that is judged to be fine when it is in fact fraudulent.

You can quickly realize that a false positive is easy-peasy. You double-check the transaction, lose some time but realize everything is alright with it, and nobody lost money.

A false negative, however, is the big problem. You designed the system specifically to catch fraudulent activity and you missed it.

In this case, you will be trying hard to minimize false negatives (Type II errors). This implies that you will be making more errors of the first type (false positives, or Type I errors). But you are okay with that, aren't you? A bit more work, but at least you are safe.
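Here is a rough sketch of that trade-off. It assumes the fraud system gives each transaction a suspicion score and flags anything at or above a threshold. The scores, labels, and the `count_errors` helper are all invented for illustration.

```python
# Hypothetical suspicion scores (higher = more suspicious) and true labels (1 = fraud).
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.80, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]


def count_errors(scores, labels, threshold):
    """Return (false_positives, false_negatives) when flagging scores >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn


# Lowering the threshold flags more transactions as suspicious.
for threshold in (0.7, 0.5, 0.3):
    fp, fn = count_errors(scores, labels, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data, lowering the threshold from 0.7 to 0.3 takes the missed frauds from 2 down to 0, while the transactions you need to double-check go from 1 up to 3.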

That's how the reasoning usually goes.

Best,
The 365 Team
