Apply a function on a rolling basis across a pandas DataFrame
Hi, I need to apply a function, HurstEXP (the Hurst exponent), to every column of a DataFrame on a rolling-window basis and store the results in a new DataFrame.
The HE function I use is below:
# The Hurst exponent is essentially a measure of the memory in a particular time series,
# and this memory can be both mean-reverting and trending at the same time,
# depending on the time scale (number of lags).
import numpy as np

num_lags = 20

def HurstEXP(ts):
    # Accept either a NumPy array or a plain list
    if isinstance(ts, np.ndarray):
        ts = ts.tolist()
    lags = range(2, num_lags)  # Lag values 2 .. num_lags - 1
    # Standard deviation of the lagged differences for each lag
    tau = [np.std(np.subtract(ts[lag:], ts[:-lag])) for lag in lags]
    # The slope of the log-log fit is the Hurst exponent estimate
    m = np.polyfit(np.log(lags), np.log(tau), 1)
    return m[0]
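As a sanity check of the estimator itself, here is a minimal sketch on synthetic data. The name `hurst_sketch`, the seed, and the series length are illustrative choices, not part of my real code. For a random walk, the standard deviation of the lagged differences grows roughly like the square root of the lag, so the fitted slope should come out near 0.5.

```python
import numpy as np

NUM_LAGS = 20  # same value as num_lags in the function above


def hurst_sketch(ts, num_lags=NUM_LAGS):
    """Slope of log(std of lagged differences) against log(lag)."""
    ts = np.asarray(ts, dtype=float)
    lags = np.arange(2, num_lags)
    tau = [np.std(ts[lag:] - ts[:-lag]) for lag in lags]
    slope, _intercept = np.polyfit(np.log(lags), np.log(tau), 1)
    return slope


# Synthetic random walk (hypothetical data, seeded for repeatability).
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.standard_normal(10_000))

h = hurst_sketch(random_walk)
print(f"Estimated exponent for a random walk: {h:.3f}")
```

On a long series like this, the function behaves as expected, which suggests the estimator is not the problem on its own.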
What I am struggling with is how to apply this to every column of a DataFrame holding my time series data, for example five years of daily prices for Tesla and Google:
import pandas as pd
import yfinance as yf
import numpy as np
from datetime import datetime
from dateutil.relativedelta import relativedelta
from numpy import log, polyfit, sqrt, std, subtract
years = 5
today = datetime.today().strftime('%Y-%m-%d')
lastyeartoday = (datetime.today() - relativedelta(years=years)).strftime('%Y-%m-%d')
df = yf.download(['TSLA', 'GOOG'],
                 start=lastyeartoday,
                 end=today,
                 progress=False)
df = df.dropna()
df = df[['Close']]
df
So far so good. However, when I try to apply my function, it returns NaN values across the whole DataFrame:
df.rolling(20).apply(lambda x: HurstEXP(ts = x), raw=True)
I think the issue is related to the dtype, so I tried converting the DataFrame to an ndarray with df.to_numpy(), but then I cannot use the rolling functions:
AttributeError: 'numpy.ndarray' object has no attribute 'rolling'
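One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the rolling window (20) equals `num_lags`, so the largest lag (19) leaves only a single lagged difference, whose standard deviation is 0 and whose log is `-inf`. The sketch below illustrates that case and then tries a longer window on synthetic data. `hurst_sketch`, the random-walk prices, and the window of 100 are hypothetical stand-ins, not my real data or a recommended setting.

```python
import numpy as np
import pandas as pd

num_lags = 20


def hurst_sketch(ts):
    """Same log-log slope estimate as HurstEXP, written for ndarray input."""
    ts = np.asarray(ts, dtype=float)
    lags = np.arange(2, num_lags)
    tau = [np.std(ts[lag:] - ts[:-lag]) for lag in lags]
    return np.polyfit(np.log(lags), np.log(tau), 1)[0]


# With a 20-row window, the largest lag (19) leaves a single difference.
window = np.arange(20, dtype=float)
last_diff = window[19:] - window[:-19]
single_std = np.std(last_diff)        # std of one value is 0.0
with np.errstate(divide="ignore"):
    log_of_zero = np.log(single_std)  # -inf, not a usable regression target

# Synthetic stand-in for the price table (hypothetical data, two columns).
rng = np.random.default_rng(1)
prices = pd.DataFrame(
    100 + np.cumsum(rng.standard_normal((500, 2)), axis=0),
    columns=["TSLA", "GOOG"],
)

# A window much longer than num_lags gives every lag many differences.
rolling_window = 100  # assumed value for illustration only
hurst_df = prices.rolling(rolling_window).apply(hurst_sketch, raw=True)
print(hurst_df.dropna().head())
```

With `raw=True`, each window already arrives as a NumPy array, so no manual `to_numpy()` conversion is needed before calling `rolling`. The first `rolling_window - 1` rows stay NaN because those windows are incomplete.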
Any help would be appreciated. I have been struggling with this for a few days and am still very new to Python. Many thanks!