question on pd.to_datetime
How does pandas know which part of the string is the month and which is the day? E.g. for 04/07/2018, how does pandas know whether it is July or April?
Great to have you in the course and thanks for reaching out!
Pandas can recognize a datetime value automatically by analyzing the string, provided it follows a pattern the parser knows. A string that starts with the year is read as YEAR - MONTH - DAY. A string that ends with the year is read as MONTH - DAY - YEAR by default, falling back to DAY - MONTH - YEAR when the first number cannot be a month.
In each case pandas interprets the string as a date and displays it in ISO 8601 form (YYYY-MM-DD).
So 04/14/2018 is parsed as April 14, 2018 without an error, whereas for 14/04/2018 pandas will "think" that 14 cannot be a month value and treat it as the day instead.
Hope this helps.
Thanks for the reply, but I'm still a bit confused about how pandas determines whether 07 or 04 is the month or the day.
E.g. in the video, in the column StartDate, it seems like 07 in 04/07/2018 is the month, based on the 5th row, 28/10/2017.
However, after running pd.to_datetime(lending_co_data['StartDate']), it shows that 07 is now the day, as per the screenshot below.
Thank you for sharing this information with the Community! Your observation is a valuable remark!
When string dates don't start with the year value (2018/07/04) but end with it instead (04/07/2018),
to_datetime() treats the first value as the month (MM/DD/YYYY) by default.
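As a minimal sketch of this default, here are the two sample strings from this thread parsed as single values (not the course's actual StartDate column):

```python
import pandas as pd

# Year-first string: read as YEAR/MONTH/DAY.
year_first = pd.to_datetime("2018/07/04")

# Ambiguous year-last string: the default (dayfirst=False) reads MONTH/DAY/YEAR.
month_first = pd.to_datetime("04/07/2018")

print(year_first.date())   # 2018-07-04
print(month_first.date())  # 2018-04-07
```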
If we want pandas to consider the day value first instead of the month, we can set the dayfirst argument to True, like this:
pd.to_datetime(lending_co_data['StartDate'], dayfirst = True)
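Here is a self-contained sketch of the same call. It assumes a small hypothetical Series in place of lending_co_data['StartDate'], filled with the two values quoted earlier in this thread:

```python
import pandas as pd

# Hypothetical stand-in for lending_co_data['StartDate'].
start_dates = pd.Series(["04/07/2018", "28/10/2017"])

# dayfirst=True tells pandas to read ambiguous strings as DAY/MONTH/YEAR.
parsed = pd.to_datetime(start_dates, dayfirst=True)

print(parsed.dt.month.tolist())  # [7, 10]
print(parsed.dt.day.tolist())    # [4, 28]
```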
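However, if the string date is formatted like this: 23/07/2018, pandas will 'see' that 23 > 12 (which means it cannot be a month) and will automatically fall back to the other order (DD/MM/YYYY), interpreting the first value as the day. This is why a column that mixes values like 04/07/2018 and 28/10/2017 can be parsed inconsistently (or, in newer pandas versions, rejected) unless dayfirst=True is passed.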
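To illustrate this fallback, and one way to avoid relying on it, here is a small sketch. The format argument is an alternative not mentioned above: it is an existing to_datetime parameter, but using it here is only a suggestion. Some pandas versions emit a warning for the first call, so the sketch silences it.

```python
import warnings

import pandas as pd

# 23 cannot be a month, so pandas falls back to DAY/MONTH/YEAR.
# Some pandas versions warn about this fallback; the warning is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    fallback = pd.to_datetime("23/07/2018")

# Spelling out the format removes the guesswork for ambiguous strings.
explicit = pd.to_datetime("04/07/2018", format="%d/%m/%Y")

print(fallback.date())  # 2018-07-23
print(explicit.date())  # 2018-07-04
```

With an explicit format, a string that does not match it is reported as an error instead of being silently reinterpreted.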
Thank you for pointing this out!