Applying mathematical manipulation after data pre-processing
In the video it is said that no mathematical manipulation can be applied to an observation. Can you explain more about it? Does it mean that we cannot apply mathematical manipulation after data pre-processing, or something else? I couldn't get that part clearly.
No, you cannot apply (mathematical) manipulation during or before observation, because:
(observation => quantification => measure => metric)
KPI: We want to know whether adding shoes to our assortment was a good choice.
Raw data: First, we look at our raw data. We pick out last year's sales data as well as the list of names of all customers who made a purchase last year: this is our data collection.
Pre-processing: We remove any mistakes in the data collection, if there are any: data cleansing. We remove all sales that do not contain shoe sales. After that, our observation is a list of last year's shoe sales, as well as the new customers. The reason we cannot do mathematical manipulation here is that we do not want to alter this data. It would make the sales data incorrect. The number of pairs of shoes sold and the number of new customers are facts and should not be touched here.
We label the observations as categorical or numerical if it is traditional data, or as numbers, text, etc. if it is big data (check the course sheet). In this situation we are working with big data, because we have a lot of other sales and other business data etc. in our raw data. We determine that all the data we need from our observation are numbers: the number of shoes sold is a number, and the same goes for the number of customers. This is called class labeling.
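As a rough illustration of the pre-processing step, here is a small Python sketch. The records, the field names (product, pairs, price_eur) and the helper functions are made up for this example and do not come from the course. It only removes rows (faulty ones and non-shoe sales) and never changes the values that remain, then labels each field as a number or text.

```python
# Hypothetical raw sales records; field names and values are invented for illustration.
raw_sales = [
    {"product": "shoes", "pairs": 2, "price_eur": 50},
    {"product": "socks", "pairs": 3, "price_eur": 5},
    {"product": "shoes", "pairs": None, "price_eur": 50},  # faulty record
    {"product": "shoes", "pairs": 1, "price_eur": 50},
]


def preprocess(records):
    """Data cleansing and filtering: rows are removed, values are never changed."""
    cleaned = [r for r in records if r["pairs"] is not None]
    return [r for r in cleaned if r["product"] == "shoes"]


def label_fields(record):
    """Class labeling: note whether each field holds a number or text."""
    return {
        key: "number" if isinstance(value, (int, float)) else "text"
        for key, value in record.items()
    }


observations = preprocess(raw_sales)
labels = label_fields(observations[0])
print(observations)
print(labels)
```

Every record that survives is identical to its entry in the raw data, which is the point of not manipulating observations.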
Processing: We add up all shoe sales as well as the new customers to get our quantification: representing observations as numbers. Next, we measure (a measure is the accumulation of observations to show some information) how much revenue we made from shoes, of which 100 pairs were sold (all at 50 euro per pair). If we calculate this, 100*50, the revenue from shoes is 5000 euro, and we had 200 new customers over the course of last year. The metric, which is aimed at business performance or progress, is 5000/200 = 25 euro per new customer. Then we compare the money spent by new customers to that of the past 3 years and we have our KPI. (Out of chars...)
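The chain (observation => quantification => measure => metric) can also be sketched in a few lines of Python, using 100 pairs at 50 euro each and 200 new customers as inputs. The variable names are my own, and treating every sale as exactly one pair is a simplifying assumption.

```python
# Observations: assumed here to be 100 individual sales of one pair each at 50 euro.
shoe_sales = [{"pairs": 1, "price_eur": 50} for _ in range(100)]
new_customers = 200  # number of new customers in the example

# Quantification: represent the observations as numbers.
pairs_sold = sum(sale["pairs"] for sale in shoe_sales)

# Measure: accumulate observations into a piece of information (revenue).
revenue_eur = sum(sale["pairs"] * sale["price_eur"] for sale in shoe_sales)

# Metric: relate the measure to the business (revenue per new customer).
revenue_per_new_customer = revenue_eur / new_customers

print(pairs_sold, revenue_eur, revenue_per_new_customer)
```

Note that the arithmetic only starts at the quantification step; the list of observations itself is left exactly as it was recorded.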