Adjusting the dates where "mths_since_earliest_cr_line" is negative
In case you want to restore the original date instead of replacing the negative values with the highest positive value, use the following:
import pandas as pd

def months_since(dates, ref_date):
    # Whole calendar months from each date to ref_date (avoids np.timedelta64(1, 'M'), which newer pandas rejects)
    return (ref_date.year - dates.dt.year) * 12 + (ref_date.month - dates.dt.month)

def earliest_cr_line_date(df):
    ref_date = pd.to_datetime("2017-12-01")
    df["earliest_cr_line_date"] = pd.to_datetime(df["earliest_cr_line"], format="%b-%y")
    df["mths_since_earliest_cr_line"] = months_since(df["earliest_cr_line_date"], ref_date)
    # A negative value means "%y" put the year in the wrong century (e.g. 2062 instead of 1962)
    negative = df["mths_since_earliest_cr_line"] < 0
    df.loc[negative, "earliest_cr_line_date"] = df.loc[negative, "earliest_cr_line_date"] - pd.DateOffset(years=100)
    # Recompute the month count only for the rows whose date was shifted
    df.loc[negative, "mths_since_earliest_cr_line"] = months_since(df.loc[negative, "earliest_cr_line_date"], ref_date)
    return df

df = earliest_cr_line_date(df)

This took me a while to work out, but it was worth it. Here is my df["mths_since_earliest_cr_line"].describe() after running this code: [output not included]
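For anyone who wants to see why the negative values appear in the first place, here is a minimal sketch on a tiny made-up sample. The three dates below are hypothetical and not taken from the course dataset, and the variable names are only for illustration. It shows how "%y" reads a two-digit year such as 62 as 2062, and how subtracting 100 years brings it back.

```python
import pandas as pd

# Hypothetical sample values, not taken from the original dataset.
sample = pd.DataFrame({"earliest_cr_line": ["Jan-85", "Sep-62", "Dec-17"]})
ref_date = pd.Timestamp("2017-12-01")

parsed = pd.to_datetime(sample["earliest_cr_line"], format="%b-%y")
# "%y" maps 69-99 to 1969-1999 and 00-68 to 2000-2068,
# so "Sep-62" is parsed as 2062-09-01, which lies after ref_date.
in_future = parsed > ref_date

# Shift only the impossible (future) dates back by one century.
fixed = parsed.where(~in_future, parsed - pd.DateOffset(years=100))

# Whole calendar months between each corrected date and ref_date.
months = (ref_date.year - fixed.dt.year) * 12 + (ref_date.month - fixed.dt.month)
print(months.tolist())  # [395, 663, 0]
```

Since every parsed date falls on the first of a month, counting calendar months this way should agree with the rounded day-based calculation in the original post.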
Hope this helps someone.
1 answer (0 marked as helpful)
 Hey Jonathan, thanks so much for your contribution! Much appreciated.