pyg.timeseries¶
Given pandas, why do we need this timeseries library? pandas is excellent, but there are three issues with it that pyg.timeseries is designed to address:
- pandas works on pandas objects (obviously) but not on numpy arrays. 
- pandas handles timeseries containing nan inconsistently across its functions. This makes your results sensitive to reindexing/resampling (see the short snippet after this list). E.g.:
- a.expanding() & a.ewm() ignore nans in the calculation and then ffill the result. 
- a.diff() and a.rolling() include any nans in the calculation, leading to nan propagation. 
 
 
- pandas is great if you have the full timeseries. However, if you now want to run the same calculation in a live environment, on newly arriving data, pandas cannot help you: you have to append the new data to the end of the DataFrame and rerun the full history. 
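A minimal pandas-only snippet illustrating the inconsistency above (plain pandas, nothing pyg-specific assumed):

>>> import pandas as pd; import numpy as np
>>> a = pd.Series([1., np.nan, 3.])
>>> a.diff().values                 # the nan is included in the calculation and propagates
array([nan, nan, nan])
>>> a.expanding().mean().values     # the nan is skipped and the previous result is carried forward
array([1., 1., 2.])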
pyg.timeseries tries to address this:
- pyg.timeseries agrees with pandas 100% on DataFrames (with no nan) while running at comparable (if not faster) speed. 
- pyg.timeseries works seamlessly on pandas objects and on numpy arrays, with no code change. 
- pyg.timeseries handles nan consistently across all its functions, 'ignoring' all nans, making your results consistent regardless of resampling. 
- pyg.timeseries exposes the state of the internal calculation. This exposure allows us to calculate the output on additional data without re-running the full history. This speeds up two very common problems in finance: 
- risk calculations and Monte Carlo scenarios: we can run a trading strategy up to today and then generate multiple what-if scenarios without having to rerun the full history. 
- live versus history: pandas is designed to run a full historical simulation. However, once we reach "today", speed is of the essence, and running a full historical simulation every time we ingest a new price is just too slow. That is why most fast trading is built around fast state machines. Of course, making sure the research and live versions do the same thing is tricky. pyg gives you the ability to run two systems in parallel with almost the same code base: run the full history overnight and then run today's code instantly, instantiated with the output of the historical simulation (see the sketch after this list). 
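A minimal sketch of that pattern, using the ewma_/ewma pair documented below (the underscore version returns both the data and the internal state):

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> history, today = a.iloc[:9990], a.iloc[9990:]
>>> overnight = ewma_(history, 10)            # full run: returns data + state
>>> live = ewma(today, 10, **overnight)       # instantiated with the historical state
>>> assert eq(live, ewma(a, 10).iloc[9990:])  # identical to rerunning the full history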
 
 
simple functions¶
diff¶
- 
pyg.timeseries._rolling.diff(a, n=1, axis=0, data=None, state=None)¶
- equivalent to a.diff(n) in pandas if there are no nans. If there are, we SKIP nans rather than propagate them. - Parameters
- a: array/timeseries
- array/timeseries 
- n: int, optional, default = 1
- number of periods to difference over 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- : matching pandas no nan’s 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> assert eq(timer(diff, 1000)(a), timer(lambda a, n=1: a.diff(n), 1000)(a)) - Example
- : nan skipping 
 - >>> a = np.array([1., np.nan, 3., 9.]) >>> assert eq(diff(a), np.array([np.nan, np.nan, 2.0, 6.0])) >>> assert eq(pd.Series(a).diff().values, np.array([np.nan, np.nan, np.nan,6.0])) 
shift¶
- 
pyg.timeseries._rolling.shift(a, n=1, axis=0, data=None, state=None)¶
- Equivalent to a.shift(n) with support for numpy arrays. - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- number of periods to shift by 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series([1.,2,3,4,5], drange(-4)) >>> assert eq(shift(a), pd.Series([np.nan,1,2,3,4], drange(-4))) >>> assert eq(shift(a,2), pd.Series([np.nan,np.nan,1,2,3], drange(-4))) >>> assert eq(shift(a,-1), pd.Series([2,3,4,5,np.nan], drange(-4))) - Example
- np.ndarrays 
 - >>> assert eq(shift(a.values), shift(a).values) - Example
- nan skipping 
 - >>> a = pd.Series([1.,2,np.nan,3,4], drange(-4)) >>> assert eq(shift(a), pd.Series([np.nan,1,np.nan, 2,3], drange(-4))) >>> assert eq(a.shift(), pd.Series([np.nan,1,2,np.nan,3], drange(-4))) # the location of the nan changes - Example
- state management 
 - >>> old = a.iloc[:3] >>> new = a.iloc[3:] >>> old_ts = shift_(old) >>> new_ts = shift(new, **old_ts) >>> assert eq(new_ts, shift(a).iloc[3:]) 
ratio¶
- 
pyg.timeseries._rolling.ratio(a, n=1, data=None, state=None)¶
- The ratio analogue of a.diff(n): returns a divided by its value n observations earlier (i.e. diff in log-space, exponentiated). - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- number of periods to lag by 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series([1.,2,3,4,5], drange(-4)) >>> assert eq(ratio(a), pd.Series([np.nan, 2, 1.5, 4/3,1.25], drange(-4))) >>> assert eq(ratio(a,2), pd.Series([np.nan, np.nan, 3, 2, 5/3], drange(-4))) 
ts_count¶
- 
pyg.timeseries._ts.ts_count(a) is equivalent to a.count() (though slightly slower)¶
- supports numpy arrays 
- skips nan 
- supports state management 
 - Example
- pandas matching 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan >>> assert ts_count(a) == a.count() - Example
- numpy 
 - >>> assert ts_count(a.values) == ts_count(a) - Example
- state management 
 - >>> old = ts_count_(a.iloc[:2000]) >>> new = ts_count(a.iloc[2000:], state = old.state) >>> assert new == ts_count(a) 
ts_sum¶
- 
pyg.timeseries._ts.ts_sum(a) is equivalent to a.sum()¶
- supports numpy arrays 
- handles nan 
- supports state management 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- pandas matching 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan >>> assert ts_sum(a) == a.sum() - Example
- numpy 
 - >>> assert ts_sum(a.values) == ts_sum(a) - Example
- state management 
 - >>> old = ts_sum_(a.iloc[:2000]) >>> new = ts_sum(a.iloc[2000:], vec = old.vec) >>> assert new == ts_sum(a) 
ts_mean¶
- 
pyg.timeseries._ts.ts_mean(a) is equivalent to a.mean()¶
- supports numpy arrays 
- handles nan 
- supports state management 
- note: pandas is actually faster for this calculation 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- pandas matching 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan >>> assert ts_mean(a) == a.mean() - Example
- numpy 
 - >>> assert ts_mean(a.values) == ts_mean(a) - Example
- state management 
 - >>> old = ts_mean_(a.iloc[:2000]) >>> new = ts_mean(a.iloc[2000:], vec = old.vec) >>> assert new == ts_mean(a) 
ts_rms¶
- 
pyg.timeseries._ts.ts_rms(a, axis=0, data=None, state=None)¶
- ts_rms(a) is equivalent to (a**2).mean()**0.5 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - supports numpy arrays 
- handles nan 
- supports state management 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan - Example
- pandas matching 
- >>> assert abs(ts_rms(a) - (a**2).mean()**0.5)<1e-13 - Example
- numpy 
- >>> assert ts_rms(a.values) == ts_rms(a) - Example
- state management 
 - >>> old = ts_rms_(a.iloc[:2000]) >>> new = ts_rms(a.iloc[2000:], vec = old.vec) >>> assert new == ts_rms(a) 
ts_std¶
- 
pyg.timeseries._ts.ts_std(a) is equivalent to a.std()¶
- supports numpy arrays 
- handles nan 
- supports state management 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan - Example
- pandas matching 
 - >>> assert abs(ts_std(a) - a.std())<1e-13 - Example
- numpy 
 - >>> assert ts_std(a.values) == ts_std(a) - Example
- state management 
 - >>> old = ts_std_(a.iloc[:2000]) >>> new = ts_std(a.iloc[2000:], vec = old.vec) >>> assert new == ts_std(a) 
ts_skew¶
- 
pyg.timeseries._ts.ts_skew(a, 0) is equivalent to a.skew()¶
- supports numpy arrays 
- handles nan 
- faster than pandas 
- supports state management 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- min_sample: float, optional
- This refers to the denominator used when we calculate the skew. Over time the denominator converges to 1, but initially it is small. Also, if there is a gap in the data, the weight of older data points may have decayed while there are not yet enough new points. min_sample ensures that in both cases, if the denominator is below 0.25 (the default value), we return nan. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- pandas matching 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan >>> assert abs(ts_skew(a, 0) - a.skew())<1e-13 - Example
- numpy 
 - >>> assert ts_skew(a.values) == ts_skew(a) - Example
- state management 
 - >>> old = ts_skew_(a.iloc[:2000]) >>> new = ts_skew(a.iloc[2000:], vec = old.vec) >>> assert new == ts_skew(a) 
fnna¶
- 
pyg.timeseries._rolling.fnna(a, n=1, axis=0)¶
- returns the index in a of the nth first non-nan. - Parameters
- a: array/timeseries - n: int, optional, default = 1 - Example
 - >>> a = np.array([np.nan,np.nan,1,np.nan,np.nan,2,np.nan,np.nan,np.nan]) >>> fnna(a,n=-2) 
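The example above does not show the returned indices. A hedged sketch of what to expect, assuming negative n counts the non-nan values from the end of the array:

>>> from pyg import *; import numpy as np
>>> a = np.array([np.nan,np.nan,1,np.nan,np.nan,2,np.nan,np.nan,np.nan])
>>> fnna(a)          # index of the first non-nan value, i.e. 2
>>> fnna(a, n=2)     # index of the second non-nan value, i.e. 5
>>> fnna(a, n=-1)    # presumably counted from the end: index of the last non-nan, i.e. 5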
v2na/na2v¶
- 
pyg.timeseries._rolling.v2na(a, old=0.0, new=nan)¶
- replaces an old value with a new value (default is nan) - Examples
 - >>> from pyg import * >>> a = np.array([1., np.nan, 1., 0.]) >>> assert eq(v2na(a), np.array([1., np.nan, 1., np.nan])) >>> assert eq(v2na(a,1), np.array([np.nan, np.nan, np.nan, 0])) >>> assert eq(v2na(a,1,0), np.array([0., np.nan, 0., 0.])) - Parameters
- a: array/timeseries - old: float - value to be replaced - new: float, optional
- new value to be used. The default is np.nan. 
 - Returns
 - array/timeseries 
- 
pyg.timeseries._rolling.na2v(a, new=0.0)¶
- replaces a nan with a new value - Example
 - >>> from pyg import * >>> a = np.array([1., np.nan, 1.]) >>> assert eq(na2v(a), np.array([1., 0.0, 1.])) >>> assert eq(na2v(a,1), np.array([1., 1., 1.])) - Parameters
- a: array/timeseries - new: float, optional - the value used to replace nan. The default is 0.0. - Returns
 - array/timeseries 
ffill/bfill¶
- 
pyg.timeseries._rolling.ffill(a, n=0, axis=0, data=None, state=None)¶
- returns a forward-filled array, filling up to n values forward. Supports state management. - Parameters
- a: array/timeseries
- array/timeseries 
- n: int, optional
- maximal number of values to fill forward 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- >>> a = np.array([np.nan,np.nan,1,np.nan,np.nan,2,np.nan,np.nan,np.nan]) >>> assert eq(ffill(a), np.array([np.nan,np.nan,1.,1.,1.,2.,2.,2.,2.])) 
- 
pyg.timeseries._rolling.bfill(a, n=- 1, axis=0)¶
- equivalent to a.fillna('bfill'). There is no state-aware version of this function, as it is forward looking. - Example
 - >>> from pyg import * >>> a = np.array([np.nan, 1., np.nan]) >>> b = np.array([1., 1., np.nan]) >>> assert eq(bfill(a), b) - Example
- pd.Series 
 - >>> ts = pd.Series(a, drange(-2)) >>> assert eq(bfill(ts).values, b) 
nona¶
- 
pyg.timeseries._ts.nona(a, value=nan)¶
- removes rows that are entirely nan (or a specific other value) - Parameters
- a: dataframe/ndarray - value: float, optional
- value to be removed. The default is np.nan. 
 - Example
 - >>> from pyg import * >>> a = np.array([1,np.nan,2,3]) >>> assert eq(nona(a), np.array([1,2,3])) - Example
- multiple columns 
 - >>> a = np.array([[1,np.nan,2,np.nan], [np.nan, np.nan, np.nan, 3]]).T >>> b = np.array([[1,2,np.nan], [np.nan, np.nan, 3]]).T ## 2nd row has nans across >>> assert eq(nona(a), b) 
expanding window functions¶
expanding_mean¶
- 
pyg.timeseries._expanding.expanding_mean(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().mean(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().mean(); ts = expanding_mean(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().mean(); ts = expanding_mean(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 1.562960 1.562960
>>> 1993-09-24 0.908910 0.908910
>>> 1993-09-25 0.846817 0.846817
>>> 1993-09-26 0.821423 0.821423
>>> 1993-09-27 0.821423 NaN
>>> ... ...
>>> 2021-02-03 0.870358 0.870358
>>> 2021-02-04 0.870358 NaN
>>> 2021-02-05 0.870358 NaN
>>> 2021-02-06 0.870358 NaN
>>> 2021-02-07 0.870353 0.870353
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_mean(a) >>> old_ts = expanding_mean_(old) >>> new_ts = expanding_mean(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_mean(dict(x = a, y = a**2)), dict(x = expanding_mean(a), y = expanding_mean(a**2))) >>> assert eq(expanding_mean([a,a**2]), [expanding_mean(a), expanding_mean(a**2)]) 
expanding_rms¶
- 
pyg.timeseries._expanding.expanding_rms(a, axis=0, data=None, state=None)¶
- equivalent to pandas (a**2).expanding().mean()**0.5). - works with np.arrays - handles nan without forward filling. - supports state parameters - Parameters
- a: array, pd.Series, pd.DataFrame, list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = (a**2).expanding().mean()**0.5; ts = expanding_rms(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = (a**2).expanding().mean()**0.5; ts = expanding_rms(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 0.160462 0.160462
>>> 1993-09-24 0.160462 NaN
>>> 1993-09-25 0.160462 NaN
>>> 1993-09-26 0.160462 NaN
>>> 1993-09-27 0.160462 NaN
>>> ... ...
>>> 2021-02-03 1.040346 1.040346
>>> 2021-02-04 1.040346 NaN
>>> 2021-02-05 1.040338 1.040338
>>> 2021-02-06 1.040337 1.040337
>>> 2021-02-07 1.040473 1.040473
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_rms(a) >>> old_ts = expanding_rms_(old) >>> new_ts = expanding_rms(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_rms(dict(x = a, y = a**2)), dict(x = expanding_rms(a), y = expanding_rms(a**2))) >>> assert eq(expanding_rms([a,a**2]), [expanding_rms(a), expanding_rms(a**2)]) 
expanding_std¶
- 
pyg.timeseries._expanding.expanding_std(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().std(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().std(); ts = expanding_std(a) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().std(); ts = expanding_std(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 NaN NaN
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 NaN NaN
>>> 1993-09-26 NaN NaN
>>> 1993-09-27 NaN NaN
>>> ... ...
>>> 2021-02-03 0.590448 0.590448
>>> 2021-02-04 0.590448 NaN
>>> 2021-02-05 0.590475 0.590475
>>> 2021-02-06 0.590475 NaN
>>> 2021-02-07 0.590411 0.590411
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_std(a) >>> old_ts = expanding_std_(old) >>> new_ts = expanding_std(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_std(dict(x = a, y = a**2)), dict(x = expanding_std(a), y = expanding_std(a**2))) >>> assert eq(expanding_std([a,a**2]), [expanding_std(a), expanding_std(a**2)]) 
expanding_sum¶
- 
pyg.timeseries._expanding.expanding_sum(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().sum(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().sum(); ts = expanding_sum(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().sum(); ts = expanding_sum(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 NaN NaN
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 0.645944 0.645944
>>> 1993-09-26 2.816321 2.816321
>>> 1993-09-27 2.816321 NaN
>>> ... ...
>>> 2021-02-03 3976.911348 3976.911348
>>> 2021-02-04 3976.911348 NaN
>>> 2021-02-05 3976.911348 NaN
>>> 2021-02-06 3976.911348 NaN
>>> 2021-02-07 3976.911348 NaN
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_sum(a) >>> old_ts = expanding_sum_(old) >>> new_ts = expanding_sum(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_sum(dict(x = a, y = a**2)), dict(x = expanding_sum(a), y = expanding_sum(a**2))) >>> assert eq(expanding_sum([a,a**2]), [expanding_sum(a), expanding_sum(a**2)]) 
expanding_skew¶
- 
pyg.timeseries._expanding.expanding_skew(a, bias=False, axis=0, data=None, state=None)¶
- the expanding version of skew; equivalent to a.expanding().skew(), which pandas does not implement - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_skew(a) >>> old_ts = expanding_skew_(old) >>> new_ts = expanding_skew(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_skew(dict(x = a, y = a**2)), dict(x = expanding_skew(a), y = expanding_skew(a**2))) >>> assert eq(expanding_skew([a,a**2]), [expanding_skew(a), expanding_skew(a**2)]) 
expanding_min¶
- 
pyg.timeseries._min.expanding_min(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().min(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().min(); ts = expanding_min(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().min(); ts = expanding_min(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 NaN NaN
>>> 1993-09-26 0.775176 0.775176
>>> 1993-09-27 0.691942 0.691942
>>> 1993-09-28 0.691942 NaN
>>> ... ...
>>> 2021-02-04 0.100099 0.100099
>>> 2021-02-05 0.100099 NaN
>>> 2021-02-06 0.100099 NaN
>>> 2021-02-07 0.100099 0.100099
>>> 2021-02-08 0.100099 0.100099
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_min(a) >>> old_ts = expanding_min_(old) >>> new_ts = expanding_min(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_min(dict(x = a, y = a**2)), dict(x = expanding_min(a), y = expanding_min(a**2))) >>> assert eq(expanding_min([a,a**2]), [expanding_min(a), expanding_min(a**2)]) 
expanding_max¶
- 
pyg.timeseries._max.expanding_max(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().max(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().max(); ts = expanding_max(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().max(); ts = expanding_max(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 NaN NaN
>>> 1993-09-26 0.875409 0.875409
>>> 1993-09-27 0.875409 NaN
>>> 1993-09-28 0.875409 NaN
>>> ... ...
>>> 2021-02-04 3.625858 3.625858
>>> 2021-02-05 3.625858 NaN
>>> 2021-02-06 3.625858 3.625858
>>> 2021-02-07 3.625858 NaN
>>> 2021-02-08 3.625858 NaN
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_max(a) >>> old_ts = expanding_max_(old) >>> new_ts = expanding_max(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_max(dict(x = a, y = a**2)), dict(x = expanding_max(a), y = expanding_max(a**2))) >>> assert eq(expanding_max([a,a**2]), [expanding_max(a), expanding_max(a**2)]) 
expanding_median¶
- 
pyg.timeseries._median.expanding_median(a, axis=0)¶
- equivalent to pandas a.expanding().median(). - works with np.arrays 
- handles nan without forward filling. 
- There is no state-aware version since this requires essentially the whole history to be stored. 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().median(); ts = expanding_median(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().median(); ts = expanding_median(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 1.562960 1.562960
>>> 1993-09-24 0.908910 0.908910
>>> 1993-09-25 0.846817 0.846817
>>> 1993-09-26 0.821423 0.821423
>>> 1993-09-27 0.821423 NaN
>>> ... ...
>>> 2021-02-03 0.870358 0.870358
>>> 2021-02-04 0.870358 NaN
>>> 2021-02-05 0.870358 NaN
>>> 2021-02-06 0.870358 NaN
>>> 2021-02-07 0.870353 0.870353
- Example
- dict/list inputs 
 - >>> assert eq(expanding_median(dict(x = a, y = a**2)), dict(x = expanding_median(a), y = expanding_median(a**2))) >>> assert eq(expanding_median([a,a**2]), [expanding_median(a), expanding_median(a**2)]) 
expanding_rank¶
- 
pyg.timeseries._rank.expanding_rank(a, axis=0)¶
- returns a rank of the current value within history, scaled to be -1 if it is the smallest and +1 if it is the largest - works on numpy arrays too - skips nan, no ffill - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
 - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series([1.,2., np.nan, 0.,4.,2.], drange(-5)) >>> rank = expanding_rank(a) >>> assert eq(rank, pd.Series([0, 1, np.nan, -1, 1, 0.25], drange(-5))) >>> # >>> # 2 is largest in [1,2] so goes to 1; >>> # 0 is smallest in [1,2,0] so goes to -1 etc. - Example
- numpy equivalent 
 - >>> assert eq(expanding_rank(a.values), expanding_rank(a).values) 
cumsum¶
- 
pyg.timeseries._expanding.cumsum(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().sum(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().sum(); ts = expanding_sum(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().sum(); ts = expanding_sum(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 NaN NaN
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 0.645944 0.645944
>>> 1993-09-26 2.816321 2.816321
>>> 1993-09-27 2.816321 NaN
>>> ... ...
>>> 2021-02-03 3976.911348 3976.911348
>>> 2021-02-04 3976.911348 NaN
>>> 2021-02-05 3976.911348 NaN
>>> 2021-02-06 3976.911348 NaN
>>> 2021-02-07 3976.911348 NaN
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_sum(a) >>> old_ts = expanding_sum_(old) >>> new_ts = expanding_sum(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_sum(dict(x = a, y = a**2)), dict(x = expanding_sum(a), y = expanding_sum(a**2))) >>> assert eq(expanding_sum([a,a**2]), [expanding_sum(a), expanding_sum(a**2)]) 
cumprod¶
- 
pyg.timeseries._expanding.cumprod(a, axis=0, data=None, state=None)¶
- equivalent to pandas np.exp(np.log(a).expanding().sum()). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = 1 + pd.Series(np.random.normal(0.001,0.05,10000), drange(-9999)) >>> panda = np.exp(np.log(a).expanding().sum()); ts = cumprod(a) >>> assert abs(ts-panda).max() < 1e-10 - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a = 1 + pd.Series(np.random.normal(-0.01,0.05,100), drange(-99, 2020))
>>> a[a<0.975] = np.nan
>>> panda = np.exp(np.log(a).expanding().sum()); ts = cumprod(a)
>>> pd.concat([panda,ts], axis=1)
>>> 2019-09-24 1.037161 1.037161
>>> 2019-09-25 1.050378 1.050378
>>> 2019-09-26 1.158734 1.158734
>>> 2019-09-27 1.158734 NaN
>>> 2019-09-28 1.219402 1.219402
>>> ... ...
>>> 2019-12-28 4.032919 4.032919
>>> 2019-12-29 4.032919 NaN
>>> 2019-12-30 4.180120 4.180120
>>> 2019-12-31 4.180120 NaN
>>> 2020-01-01 4.244261 4.244261
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:50] >>> new = a.iloc[50:] >>> ts = cumprod(a) >>> old_ts = cumprod_(old) >>> new_ts = cumprod(new, **old_ts) >>> assert eq(new_ts, ts.iloc[50:]) - Example
- dict/list inputs 
 - >>> assert eq(cumprod(dict(x = a, y = a**2)), dict(x = cumprod(a), y = cumprod(a**2))) >>> assert eq(cumprod([a,a**2]), [cumprod(a), cumprod(a**2)]) 
rolling window functions¶
rolling_mean¶
- 
pyg.timeseries._rolling.rolling_mean(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).mean(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).mean(); ts = rolling_mean(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).mean(); ts = rolling_mean(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 6 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_mean(a,10) >>> old_ts = rolling_mean_(old,10) >>> new_ts = rolling_mean(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_mean(dict(x = a, y = a**2),10), dict(x = rolling_mean(a,10), y = rolling_mean(a**2,10))) >>> assert eq(rolling_mean([a,a**2],10), [rolling_mean(a,10), rolling_mean(a**2,10)]) 
rolling_rms¶
- 
pyg.timeseries._rolling.rolling_rms(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas (a**2).rolling(n).mean()**0.5. - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = (a**2).rolling(10).mean()**0.5; ts = rolling_rms(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = (a**2).rolling(10).mean()**0.5; ts = rolling_rms(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 6 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_rms(a,10) >>> old_ts = rolling_rms_(old,10) >>> new_ts = rolling_rms(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_rms(dict(x = a, y = a**2),10), dict(x = rolling_rms(a,10), y = rolling_rms(a**2,10))) >>> assert eq(rolling_rms([a,a**2],10), [rolling_rms(a,10), rolling_rms(a**2,10)]) 
rolling_std¶
- 
pyg.timeseries._rolling.rolling_std(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).std(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).std(); ts = rolling_std(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99.9% nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).std(); ts = rolling_std(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 2 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_std(a,10) >>> old_ts = rolling_std_(old,10) >>> new_ts = rolling_std(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_std(dict(x = a, y = a**2),10), dict(x = rolling_std(a,10), y = rolling_std(a**2,10))) >>> assert eq(rolling_std([a,a**2],10), [rolling_std(a,10), rolling_std(a**2,10)]) 
rolling_sum¶
- 
pyg.timeseries._rolling.rolling_sum(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).sum(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).sum(); ts = rolling_sum(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99.9% nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).sum(); ts = rolling_sum(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 2 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_sum(a,10) >>> old_ts = rolling_sum_(old,10) >>> new_ts = rolling_sum(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_sum(dict(x = a, y = a**2),10), dict(x = rolling_sum(a,10), y = rolling_sum(a**2,10))) >>> assert eq(rolling_sum([a,a**2],10), [rolling_sum(a,10), rolling_sum(a**2,10)]) 
rolling_skew¶
- 
pyg.timeseries._rolling.rolling_skew(a, n, bias=False, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).skew(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- bias:
- affects the skew calculation definition, see scipy documentation for details. 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).skew(); ts = rolling_skew(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99.9% nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).skew(); ts = rolling_skew(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 2 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_skew(a,10) >>> old_ts = rolling_skew_(old,10) >>> new_ts = rolling_skew(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_skew(dict(x = a, y = a**2),10), dict(x = rolling_skew(a,10), y = rolling_skew(a**2,10))) >>> assert eq(rolling_skew([a,a**2],10), [rolling_skew(a,10), rolling_skew(a**2,10)]) 
rolling_min¶
- 
pyg.timeseries._min.rolling_min(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).min(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
- >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).min(); ts = rolling_min(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).min(); ts = rolling_min(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 6 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_min(a,10) >>> old_ts = rolling_min_(old,10) >>> new_ts = rolling_min(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_min(dict(x = a, y = a**2),10), dict(x = rolling_min(a,10), y = rolling_min(a**2,10))) >>> assert eq(rolling_min([a,a**2],10), [rolling_min(a,10), rolling_min(a**2,10)]) 
rolling_max¶
- 
pyg.timeseries._max.rolling_max(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).max(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).max(); ts = rolling_max(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).max(); ts = rolling_max(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 6 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_max(a,10) >>> old_ts = rolling_max_(old,10) >>> new_ts = rolling_max(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_max(dict(x = a, y = a**2),10), dict(x = rolling_max(a,10), y = rolling_max(a**2,10))) >>> assert eq(rolling_max([a,a**2],10), [rolling_max(a,10), rolling_max(a**2,10)]) 
rolling_median¶
- 
pyg.timeseries._median.rolling_median(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).median(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).median(); ts = rolling_median(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).median(); ts = rolling_median(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') #original: 4634 timeseries: 4625 panda: 4 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_median(a,10) >>> old_ts = rolling_median_(old,10) >>> new_ts = rolling_median(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_median(dict(x = a, y = a**2),10), dict(x = rolling_median(a,10), y = rolling_median(a**2,10))) >>> assert eq(rolling_median([a,a**2],10), [rolling_median(a,10), rolling_median(a**2,10)]) 
rolling_quantile¶
- 
pyg.timeseries._stride.rolling_quantile(a, n, quantile=0.5, axis=0, data=None, state=None)¶
- equivalent to a.rolling(n).quantile(q) except… - supports numpy arrays - supports multiple q values - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> res = rolling_quantile(a, 100, 0.3) >>> assert sub_(res, a.rolling(100).quantile(0.3)).max() < 1e-13 - Example
- multiple quantiles 
 - >>> res = rolling_quantile(a, 100, [0.3, 0.5, 0.75]) >>> assert abs(res[0.3] - a.rolling(100).quantile(0.3)).max() < 1e-13 - Example
- state management 
 - >>> res = rolling_quantile(a, 100, 0.3) >>> old = rolling_quantile_(a.iloc[:2000], 100, 0.3) >>> new = rolling_quantile(a.iloc[2000:], 100, 0.3, **old) >>> both = pd.concat([old.data, new]) >>> assert eq(both, res) - Parameters
- a: array/timeseries - n: integer - window size. - q: float or list of floats in [0,1]
- quantile(s). 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Returns
 - timeseries/array of quantile(s) 
rolling_rank¶
- 
pyg.timeseries._rank.rolling_rank(a, n, axis=0, data=None, state=None)¶
- returns a rank of the current value within a given window, scaled to be -1 if it is the smallest and +1 if it is the largest - works on numpy arrays too - skips nan, no ffill - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- window size 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series([1.,2., np.nan, 0., 4., 2., 3., 1., 2.], drange(-8)) >>> rank = rolling_rank(a, 3) >>> assert eq(rank.values, np.array([np.nan, np.nan, np.nan, -1, 1, 0, 0, -1, 0])) >>> # 0 is smallest in [1,2,0] so goes to -1 >>> # 4 is largest in [2,0,4] so goes to +1 >>> # 2 is middle of [0,4,2] so goes to 0 - Example
- numpy equivalent 
 - >>> assert eq(rolling_rank(a.values, 10), rolling_rank(a, 10).values) - Example
- state management 
 - >>> a = np.random.normal(0,1,10000) >>> old = rolling_rank_(a[:5000], 10) # grab both data and state >>> new = rolling_rank(a[5000:], 10, **old) >>> assert eq(np.concatenate([old.data,new]), rolling_rank(a, 10)) 
exponentially weighted moving functions¶
ewma¶
- 
pyg.timeseries._ewm.ewma(a, n, time=None, axis=0, data=None, state=None)¶
- ewma is equivalent to a.ewm(n).mean() but with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a: array/timeseries - n: int/fraction - the number of days (or a ratio) to scale the history - time: Calendar, 'b/d/y/m' or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit.
- if we have intraday data and set time = 'd', then 
- the ewm calculation retained is the one run on the last observation of each day; 
- the ewm calculation on each intraday observation is the same as ewm(past EOD history + current intraday observation). 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> ts = ewma(a,10); df = a.ewm(10).mean() >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewma(a.values, 10), ewma(a,10).values) - Example
- nan handling 
- >>> a[a.values<0.1] = np.nan
>>> ts = ewma(a,10, time = 'i'); df = a.ewm(10).mean()  # note: pandas assumes time passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
>>> pd.concat([ts,df], axis=1)
>>> 0 1
>>> 1993-09-24 0.263875 0.263875
>>> 1993-09-25 NaN 0.263875
>>> 1993-09-26 NaN 0.263875
>>> 1993-09-27 NaN 0.263875
>>> 1993-09-28 NaN 0.263875
>>> ... ...
>>> 2021-02-04 NaN 0.786506
>>> 2021-02-05 0.928817 0.928817
>>> 2021-02-06 NaN 0.928817
>>> 2021-02-07 0.839168 0.839168
>>> 2021-02-08 0.831109 0.831109
- Example
- state management 
 - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> old_ts = ewma_(old, 10) >>> new_ts = ewma(new, 10, **old_ts) # instantiation with previous ewma >>> ts = ewma(a,10) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- Support for time & clock 
 - >>> daily = a >>> monthly = daily.resample('M').last() >>> m_ts = ewma(monthly, 3) ## 3-month ewma run on monthly data >>> d_ts = ewma(daily, 3, 'm') ## 3-month ewma run on daily data >>> daily_resampled_to_month = d_ts.resample('M').last() >>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10 - So you can run a 3-monthly ewma on daily, where within month, most recent value is used with the EOM history. - Example
- Support for dict/list of arrays 
 - >>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> a = dict(x = x, y = y) >>> assert eq(ewma(dict(x=x, y=y),10), dict(x=ewma(x,10), y=ewma(y,10))) >>> assert eq(ewma([x,y],10), [ewma(x,10), ewma(y,10)]) - Returns
 - an array/timeseries of ewma 
ewmrms¶
- 
pyg.timeseries._ewm.ewmrms(a, n, time=None, axis=0, data=None, state=None)¶
- ewmrms is equivalent to (a**2).ewm(n).mean()**0.5 but with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a: array/timeseries - n: int/fraction - the number of days (or a ratio) to scale the history - time: Calendar, 'b/d/y/m' or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit.
- if we have intraday data and set time = 'd', then 
- the ewm calculation retained is the one run on the last observation of each day; 
- the ewm calculation on each intraday observation is the same as ewm(past EOD history + current intraday observation). 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> ts = ewmrms(a,10); df = (a**2).ewm(10).mean()**0.5 >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewmrms(a.values, 10), ewmrms(a,10).values) - Example
- nan handling 
- >>> a[a.values<0.1] = np.nan
>>> ts = ewmrms(a,10, time = 'i'); df = (a**2).ewm(10).mean()**0.5  # note: pandas assumes time passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
>>> pd.concat([ts,df], axis=1)
>>> 0 1
>>> 1993-09-24 0.263875 0.263875
>>> 1993-09-25 NaN 0.263875
>>> 1993-09-26 NaN 0.263875
>>> 1993-09-27 NaN 0.263875
>>> 1993-09-28 NaN 0.263875
>>> ... ...
>>> 2021-02-04 NaN 0.786506
>>> 2021-02-05 0.928817 0.928817
>>> 2021-02-06 NaN 0.928817
>>> 2021-02-07 0.839168 0.839168
>>> 2021-02-08 0.831109 0.831109
- Example
- state management 
 - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> old_ts = ewmrms_(old, 10) >>> new_ts = ewmrms(new, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmrms(a,10) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- Support for time & clock 
 - >>> daily = a >>> monthly = daily.resample('M').last() >>> m_ts = ewmrms(monthly, 3) ## 3-month ewma run on monthly data >>> d_ts = ewmrms(daily, 3, 'm') ## 3-month ewma run on daily data >>> daily_resampled_to_month = d_ts.resample('M').last() >>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10 - So you can run a 3-monthly ewma on daily, where within month, most recent value is used with the EOM history. - Example
- Support for dict/list of arrays 
 - >>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> a = dict(x = x, y = y) >>> assert eq(ewmrms(dict(x=x, y=y),10), dict(x=ewmrms(x,10), y=ewmrms(y,10))) >>> assert eq(ewmrms([x,y],10), [ewmrms(x,10), ewmrms(y,10)]) - Returns
- an array/timeseries of ewmrms 
ewmstd¶
- 
pyg.timeseries._ewm.ewmstd(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, state=None)¶
- ewmstd is equivalent to a.ewm(n).std() but with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a: array/timeseries - n: int/fraction - the number of days (or a ratio) to scale the history - time: Calendar, 'b/d/y/m' or a timeseries of time (use clock(a) to see output)
- If time parameter is provided, we allow multiple observations per unit of time. i.e., converging to the last observation in time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> ts = ewmstd(a,10); df = a.ewm(10).std() >>> assert abs(ts-df).max()<1e-10 >>> ts = ewmstd(a,10, bias = True); df = a.ewm(10).std(bias = True) >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewmstd(a.values, 10), ewmstd(a,10).values) - Example
- nan handling 
 - >>> a[a.values<-0.1] = np.nan >>> ts = ewmstd(a,10, time = 'i'); df = a.ewm(10).std() # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 >>> ts = ewmstd(a,10, time = 'i', bias = True); df = a.ewm(10).std(bias = True) # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 - Example
- state management 
 - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> old_ts = ewmstd_(old, 10) >>> new_ts = ewmstd(new, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmstd(a,10) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- Support for time & clock 
 - >>> daily = a >>> monthly = daily.resample('M').last() >>> m_ts = ewmstd(monthly, 3) ## 3-month ewmstd run on monthly data >>> d_ts = ewmstd(daily, 3, 'm') ## 3-month ewmstd run on daily data >>> daily_resampled_to_month = d_ts.resample('M').last() >>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10 - So you can run a 3-month ewmstd on daily data: within each month, the most recent value is combined with the end-of-month (EOM) history. - Example
- Support for dict/list of arrays 
 - >>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> a = dict(x = x, y = y) >>> assert eq(ewmstd(dict(x=x, y=y),10), dict(x=ewmstd(x,10), y=ewmstd(y,10))) >>> assert eq(ewmstd([x,y],10), [ewmstd(x,10), ewmstd(y,10)]) - Returns
 - an array/timeseries of ewmstd values 
ewmvar¶
- 
pyg.timeseries._ewm.ewmvar(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, state=None)¶
- ewmvar is equivalent to a.ewm(n).var() but with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a : array/timeseries 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> ts = ewmvar(a,10); df = a.ewm(10).var() >>> assert abs(ts-df).max()<1e-10 >>> ts = ewmvar(a,10, bias = True); df = a.ewm(10).var(bias = True) >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewmvar(a.values, 10), ewmvar(a,10).values) - Example
- nan handling 
 - >>> a[a.values<-0.1] = np.nan >>> ts = ewmvar(a,10, time = 'i'); df = a.ewm(10).var() # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 >>> ts = ewmvar(a,10, time = 'i', bias = True); df = a.ewm(10).var(bias = True) # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 - Example
- state management 
 - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> old_ts = ewmvar_(old, 10) >>> new_ts = ewmvar(new, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmvar(a,10) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- Support for time & clock 
 - >>> daily = a >>> monthly = daily.resample('M').last() >>> m_ts = ewmvar(monthly, 3) ## 3-month ewmvar run on monthly data >>> d_ts = ewmvar(daily, 3, 'm') ## 3-month ewmvar run on daily data >>> daily_resampled_to_month = d_ts.resample('M').last() >>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10 - So you can run a 3-month ewmvar on daily data: within each month, the most recent value is combined with the end-of-month (EOM) history. - Example
- Support for dict/list of arrays 
 - >>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> a = dict(x = x, y = y) >>> assert eq(ewmvar(dict(x=x, y=y),10), dict(x=ewmvar(x,10), y=ewmvar(y,10))) >>> assert eq(ewmvar([x,y],10), [ewmvar(x,10), ewmvar(y,10)]) - Returns
 - an array/timeseries of ewmvar values 
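- As a sanity sketch (assuming, as described above, that ewmvar is the squared counterpart of ewmstd with the same bias default), the two should agree up to a square root:
 - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> assert abs(ewmstd(a, 10) - ewmvar(a, 10)**0.5).max() < 1e-10 ## sketch: std should be the square root of var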
ewmcor¶
- 
pyg.timeseries._ewm.ewmcor(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, data=None, state=None)¶
- calculates pair-wise correlation between a and b. - Parameters
- a : array/timeseries 
- b : array/timeseries 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- min_sample: float, optional
- minimum weight of observations before we return a reading. The default is 0.25. This ensures that we don’t get silly numbers due to a small population. 
- bias: bool, optional
- vol estimation for a and b should really be unbiased. Nevertheless, we track pandas and set bias = True as a default. 
- axis: int, optional
- axis of calculation. The default is 0. 
- data: placeholder, ignore, optional
- ignore. The default is None. 
- state: dict, optional
- Output from a previous run of ewmcor. The default is None. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = pd.Series(np.random.normal(0,1,9000), drange(-8999)) >>> ts = ewmcor(a, b, n = 10); df = a.ewm(10).corr(b) >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewmcor(a.values, b.values, 10), ewmcor(a, b, 10).values) - Example
- nan handling 
 - >>> a[a.values<-0.1] = np.nan >>> ts = ewmcor(a, b, 10, time = 'i'); df = a.ewm(10).corr(b) # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 - Example
- state management 
 - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> old_a = a.iloc[:5000]; old_b = b.iloc[:5000] >>> new_a = a.iloc[5000:]; new_b = b.iloc[5000:] >>> old_ts = ewmcor_(old_a, old_b, 10) >>> new_ts = ewmcor(new_a, new_b, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmcor(a,b,10) >>> assert eq(new_ts, ts.iloc[5000:]) 
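- As a rough sanity sketch of the definition (the correlation of a series with itself should be 1, and with its negative -1):
 - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> assert abs(ewmcor(a, a, 10).iloc[-1] - 1) < 1e-10 ## sanity sketch >>> assert abs(ewmcor(a, -a, 10).iloc[-1] + 1) < 1e-10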
ewmLR¶
- 
pyg.timeseries._ewm.ewmLR(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, c=None, m=None, state=None)¶
- calculates pair-wise linear regression between a and b. We have a and b for which we want to fit: - >>> b_i = c + m a_i >>> LSE(c,m) = \sum w_i (c + m a_i - b_i)^2 >>> dLSE/dc = 0 <==> \sum w_i (c + m a_i - b_i) = 0 [1] >>> dLSE/dm = 0 <==> \sum w_i a_i (c + m a_i - b_i) = 0 [2] - >>> c + m E(a) = E(b) [1] >>> c E(a) + m E(a^2) = E(ab) [2] - >>> c E(a) + m E(a)^2 = E(a)E(b) [1] * E(a) >>> m (E(a^2) - E(a)^2) = E(ab) - E(a)E(b) >>> m = covar(a,b)/var(a) >>> c = E(b) - m E(a) - Parameters
- a : array/timeseries 
- b : array/timeseries 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- min_sample: float, optional
- minimum weight of observations before we return a reading. The default is 0.25. This ensures that we don’t get silly numbers due to a small population. 
- bias: bool, optional
- vol estimation for a and b should really be unbiased. Nevertheless, we track pandas and set bias = True as a default. 
- axis: int, optional
- axis of calculation. The default is 0. 
- c, m: placeholders, ignore, optional
- ignore. The default is None. 
- state: dict, optional
- Output from a previous run of ewmLR. The default is None. 
 - Example
- numpy arrays support 
 - >>> assert eq(ewmLR(a.values, b.values, 10), ewmLR(a, b, 10).values) - Example
- nan handling 
 - >>> a[a.values<-0.1] = np.nan >>> ts = ewmcor(a, b, 10, time = 'i'); df = a.ewm(10).corr(b) # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 - Example
- state management 
 - >>> from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> old_a = a.iloc[:5000]; old_b = b.iloc[:5000] >>> new_a = a.iloc[5000:]; new_b = b.iloc[5000:] >>> old_ts = ewmLR_(old_a, old_b, 10) >>> new_ts = ewmLR(new_a, new_b, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmLR(a,b,10) >>> assert eq(new_ts.c, ts.c.iloc[5000:]) >>> assert eq(new_ts.m, ts.m.iloc[5000:]) - Example
 - >>> from pyg import * >>> a0 = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> a1 = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = (a0 - a1) + pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> a = pd.concat([a0,a1], axis=1) >>> LR = ewmLR(a,b,50) >>> assert abs(LR.m.mean()[0]-1)<0.5 >>> assert abs(LR.m.mean()[1]+1)<0.5 
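- A single-regressor sketch of the closed form above (m = covar(a,b)/var(a), c = E(b) - m E(a)): fitting b = 2a + 1 + noise should recover the slope and intercept on average (the 0.5 tolerance is illustrative, not part of the original docstring):
 - >>> from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = 2 * a + 1 + pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> LR = ewmLR(a, b, 50) >>> assert abs(LR.m.mean() - 2) < 0.5 ## illustrative tolerance >>> assert abs(LR.c.mean() - 1) < 0.5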
ewmGLM¶
- 
pyg.timeseries._ewm.ewmGLM(a, b, n, time=None, min_sample=0.25, bias=True, data=None, state=None)¶
- Calculates a General Linear Model fitting b to a. - Parameters
- a : a 2-d array/pd.DataFrame of values used to fit b 
- b : a 1-d array/pd.Series 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- min_sample: float, optional
- minimum weight of observations before we return the fitting. The default is 0.25. This ensures that we don’t get silly numbers due to a small population. 
- data: placeholder, ignore, optional
- ignore. The default is None. 
- state: dict, optional
- Output from a previous run of ewmGLM. The default is None. 
 - Theory
 - See https://en.wikipedia.org/wiki/Generalized_linear_model for full details. Briefly, we assume b is a single column while a is multi-column. We minimize the least square error (LSE) fitting: - >>> b[i] = \sum_j m_j a_j[i] >>> LSE(m) = \sum_i w_i (b[i] - \sum_j m_j * a_j[i])^2 - >>> dLSE/dm_k = 0 >>> <==> \sum_i w_i (b[i] - \sum_j m_j * a_j[i]) a_k[i] = 0 >>> <==> E(b*a_k) = m_k E(a_k^2) + \sum_{j<>k} m_j E(a_j a_k) - E is the expectation under the weights w, and we can rewrite this as: - >>> a2 x m = ab ## matrix multiplication >>> a2[i,j] = E(a_i * a_j) >>> ab[j] = E(a_j * b) >>> m = a2.inverse x ab ## matrix multiplication - Example
- simple fit 
 - >>> from pyg import * >>> a = pd.DataFrame(np.random.normal(0,1,(10000,10)), drange(-9999)) >>> true_m = np.random.normal(1,1,10) >>> noise = np.random.normal(0,1,10000) >>> b = (a * true_m).sum(axis = 1) + noise - >>> fitted_m = ewmGLM(a, b, 50) 
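- As a rough check (a sketch, assuming the result is a timeseries of the fitted coefficients, one column per column of a), the time-average of the fit should sit close to true_m:
 - >>> assert abs(fitted_m.mean().values - true_m).max() < 0.5 ## illustrative tolerance, assuming a DataFrame of coefficients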
ewmskew¶
- 
pyg.timeseries._ewm.ewmskew(a, n, time=None, bias=False, min_sample=0.25, axis=0, data=None, state=None)¶
- ewmskew calculates an exponentially weighted moving skew of a (pandas offers no a.ewm(n).skew() equivalent), with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a : array/timeseries 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas & state management 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) - >>> old = a.iloc[:10] >>> new = a.iloc[10:] >>> for f in [ewma_, ewmstd_, ewmrms_, ewmskew_]: >>>     both = f(a, 3) >>>     o = f(old, 3) >>>     n = f(new, 3, **o) >>>     assert eq(o.data, both.data.iloc[:10]) >>>     assert eq(n.data, both.data.iloc[10:]) >>>     assert both - 'data' == n - 'data' - >>> assert abs(a.ewm(10).mean() - ewma(a,10)).max() < 1e-14 >>> assert abs(a.ewm(10).std() - ewmstd(a,10)).max() < 1e-14 - Example
- numpy arrays support 
 - >>> assert eq(ewmskew(a.values, 10), ewmskew(a,10).values) - Example
- nan handling 
 - while pandas ffills values, pyg.timeseries skips nans: - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> a[a.values>0.1] = np.nan >>> ts = ewma(a,10) >>> assert eq(ts[~np.isnan(ts)], ewma(a[~np.isnan(a)], 10)) - Example
- initiating the ewma with past state 
 - >>> old = np.random.normal(0,1,100) >>> new = np.random.normal(0,1,100) >>> old_ = ewma_(old, 10) >>> new_ = ewma(new, 10, **old_) # instantiation with the state of the previous ewma run >>> new_2 = ewma(np.concatenate([old,new]), 10)[-100:] >>> assert eq(new_, new_2) - Example
- Support for time & clock 
 - >>> daily = pd.Series(np.random.normal(0,1,10000), drange(-9999)).cumsum() >>> monthly = daily.resample('M').last() >>> m = ewma(monthly, 3) ## 3-month ewma run on monthly data >>> d = ewma(daily, 3, 'm') ## 3-month ewma run on daily data >>> daily_resampled_to_month = d.resample('M').last() >>> assert abs(daily_resampled_to_month - m).max() < 1e-10 - So you can run a 3-month ewma on daily data: within each month, the most recent value is combined with the end-of-month (EOM) history. - Returns
 - an array/timeseries of ewmskew values 
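- As a rough sanity sketch of the output: on symmetric (normal) data the skew should hover near zero, while on right-skewed (lognormal) data it should be positive (the thresholds are illustrative):
 - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> assert abs(ewmskew(a, 100).iloc[-1]) < 1 ## illustrative threshold >>> b = pd.Series(np.random.lognormal(0,1,10000), drange(-9999)) >>> assert ewmskew(b, 100).iloc[-1] > 0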
functions exposing their state¶
simple functions¶
- 
pyg.timeseries._rolling.diff_(a, n=1, axis=0, data=None, instate=None)¶
- Equivalent to diff(a, n) but returns the full state. See diff for full details 
- 
pyg.timeseries._rolling.shift_(a, n=1, axis=0, instate=None)¶
- Equivalent to shift(a,n) but returns the full state. See shift for full details 
- 
pyg.timeseries._rolling.ratio_(a, n=1, data=None, instate=None)¶
- Equivalent to ratio(a, n) but returns the full state. See ratio for full documentation 
- 
pyg.timeseries._ts.ts_count_(a, axis=0, data=None, instate=None)¶
- ts_count_(a) is equivalent to ts_count(a) except vec is also returned. See ts_count for full documentation 
- 
pyg.timeseries._ts.ts_sum_(a, axis=0, data=None, instate=None)¶
- ts_sum_(a) is equivalent to ts_sum(a) except vec is also returned. See ts_sum for full documentation 
- 
pyg.timeseries._ts.ts_mean_(a, axis=0, data=None, instate=None)¶
- ts_mean_(a) is equivalent to ts_mean(a) except vec is also returned. See ts_mean for full documentation 
- 
pyg.timeseries._ts.ts_rms_(a, axis=0, data=None, instate=None)¶
- ts_rms_(a) is equivalent to ts_rms(a) except it also returns vec see ts_rms for full documentation 
- 
pyg.timeseries._ts.ts_std_(a, axis=0, data=None, instate=None)¶
- ts_std_(a) is equivalent to ts_std(a) except vec is also returned. See ts_std for full documentation 
- 
pyg.timeseries._ts.ts_skew_(a, bias=False, min_sample=0.25, axis=0, data=None, instate=None)¶
- ts_skew_(a) is equivalent to ts_skew except vec is also returned. See ts_skew for full details 
- 
pyg.timeseries._ts.ts_min_(a, axis=0, data=None, instate=None)¶
- ts_min_(a) is equivalent to ts_min(a) except vec is also returned; ts_min(a) is equivalent to pandas a.min() 
- 
pyg.timeseries._ts.ts_max_(a, axis=0, data=None, instate=None)¶
- ts_max_(a) is equivalent to ts_max(a) except vec is also returned; ts_max(a) is equivalent to pandas a.max() 
- 
pyg.timeseries._rolling.ffill_(a, n=0, axis=0, instate=None)¶
- Returns a forward filled array, up to n values forward. Supports state management. 
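- These underscored functions all follow the same pattern: run func_ on history to capture the state, then feed that state into func for the live portion. A minimal sketch with diff_/diff:
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> old = a.iloc[:900]; new = a.iloc[900:] >>> old_ = diff_(old) ## full-state run on history >>> live = diff(new, **old_) ## the documented func(live, **func_(history)) pattern >>> assert eq(live, diff(a).iloc[900:])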
expanding window functions¶
- 
pyg.timeseries._expanding.expanding_mean_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_mean(a) but returns also the state variables. For full documentation, look at expanding_mean.__doc__ 
- 
pyg.timeseries._expanding.expanding_rms_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_rms(a) but returns also the state variables. For full documentation, look at expanding_rms.__doc__ 
- 
pyg.timeseries._expanding.expanding_std_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_std(a) but returns also the state variables. For full documentation, look at expanding_std.__doc__ 
- 
pyg.timeseries._expanding.expanding_sum_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_sum(a) but returns also the state variables. For full documentation, look at expanding_sum.__doc__ 
- 
pyg.timeseries._expanding.expanding_skew_(a, bias=False, axis=0, data=None, instate=None)¶
- Equivalent to expanding_skew(a) but returns also the state variables. For full documentation, look at expanding_skew.__doc__ 
- 
pyg.timeseries._min.expanding_min_(a, axis=0, data=None, instate=None)¶
- Equivalent to a.expanding().min() but returns the full state, i.e. both data (the expanding().min()) and m (the current minimum) 
- 
pyg.timeseries._max.expanding_max_(a, axis=0, data=None, instate=None)¶
- Equivalent to a.expanding().max() but returns the full state, i.e. both data (the expanding().max()) and m (the current maximum) 
- 
pyg.timeseries._expanding.cumsum_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_sum(a) but returns also the state variables. For full documentation, look at expanding_sum.__doc__ 
- 
pyg.timeseries._expanding.cumprod_(a, axis=0, data=None, instate=None)¶
- Equivalent to cumprod(a) but returns also the state variable. For full documentation, look at cumprod.__doc__ 
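- A sketch of what the exposed state looks like for expanding_max_ (assuming, as described above, that the result carries both data and the running maximum m):
 - >>> from pyg import * >>> a = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> res = expanding_max_(a) >>> assert eq(res.data, a.expanding().max()) ## assumes res exposes .data and the running max .m >>> assert res.m == a.max()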
rolling window functions¶
- 
pyg.timeseries._rolling.rolling_mean_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_mean(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_mean.__doc__ 
- 
pyg.timeseries._rolling.rolling_rms_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_rms(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_rms.__doc__ 
- 
pyg.timeseries._rolling.rolling_std_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_std(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_std.__doc__ 
- 
pyg.timeseries._rolling.rolling_sum_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_sum(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_sum.__doc__ 
- 
pyg.timeseries._rolling.rolling_skew_(a, n, bias=False, axis=0, data=None, instate=None)¶
- Equivalent to rolling_skew(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_skew.__doc__ 
- 
pyg.timeseries._min.rolling_min_(a, n, vec=None, axis=0, data=None, instate=None)¶
- Equivalent to rolling_min(a) but returns also the state. For full documentation, look at rolling_min.__doc__ 
- 
pyg.timeseries._max.rolling_max_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_max(a) but returns also the state. For full documentation, look at rolling_max.__doc__ 
- 
pyg.timeseries._median.rolling_median_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_median(a) but returns also the state. For full documentation, look at rolling_median.__doc__ 
- 
pyg.timeseries._rank.rolling_rank_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_rank(a) but returns also the state variables. For full documentation, look at rolling_rank.__doc__ 
- 
pyg.timeseries._stride.rolling_quantile_(a, n, quantile=0.5, axis=0, data=None, instate=None)¶
- Equivalent to rolling_quantile(a) but returns also the state. For full documentation, look at rolling_quantile.__doc__ 
exponentially weighted moving functions¶
- 
pyg.timeseries._ewm.ewma_(a, n, time=None, data=None, instate=None)¶
- Equivalent to ewma but returns a state parameter for instantiation of later calculations. See ewma documentation for more details 
- 
pyg.timeseries._ewm.ewmrms_(a, n, time=None, axis=0, data=None, instate=None)¶
- Equivalent to ewmrms but returns a state parameter for instantiation of later calculations. See ewmrms documentation for more details 
- 
pyg.timeseries._ewm.ewmstd_(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, instate=None)¶
- Equivalent to ewmstd but returns a state parameter for instantiation of later calculations. See ewmstd documentation for more details 
- 
pyg.timeseries._ewm.ewmvar_(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, instate=None)¶
- Equivalent to ewmvar but returns a state parameter for instantiation of later calculations. See ewmvar documentation for more details 
- 
pyg.timeseries._ewm.ewmcor_(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, data=None, instate=None)¶
- Equivalent to ewmcor but returns a state parameter for instantiation of later calculations. See ewmcor documentation for more details 
- 
pyg.timeseries._ewm.ewmLR_(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, c=None, m=None, instate=None)¶
- Equivalent to ewmLR but returns a state parameter for instantiation of later calculations. See ewmLR documentation for more details 
- 
pyg.timeseries._ewm.ewmGLM_(a, b, n, time=None, min_sample=0.25, bias=True, data=None, instate=None)¶
- Equivalent to ewmGLM but returns a state parameter for instantiation of later calculations. See ewmGLM documentation for more details 
- 
pyg.timeseries._ewm.ewmskew_(a, n, time=None, bias=False, min_sample=0.25, axis=0, data=None, instate=None)¶
- Equivalent to ewmskew but returns a state parameter for instantiation of later calculations. See ewmskew documentation for more details 
Index handling¶
df_fillna¶
- 
pyg.timeseries._index.df_fillna(df, method=None, axis=0, limit=None)¶
- Equivalent to df.fillna() except: - supports np.ndarray as well as dataframes 
- supports multiple methods of filling/interpolation 
- supports removal of nans from the start of, or from all of, the timeseries 
- supports action on multiple timeseries 
 - Parameters
 - df : dataframe/numpy array - method: string, list of strings or None, optional
- Either a fill method (bfill, ffill, pad), or an interpolation method: ‘linear’, ‘time’, ‘index’, ‘values’, ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘barycentric’, ‘krogh’, ‘spline’, ‘polynomial’, ‘from_derivatives’, ‘piecewise_polynomial’, ‘pchip’, ‘akima’, ‘cubicspline’, or ‘fnna’: removes all values up to the first non-nan, or ‘nona’: removes all nans 
- axisint, optional
- axis. The default is 0. 
- limit: int, optional
- when filling, how many nans get filled. The default is None (indefinite) 
 - Example
- method ffill or bfill 
 - >>> from pyg import *; import numpy as np >>> df = np.array([np.nan, 1., np.nan, 9, np.nan, 25]) >>> assert eq(df_fillna(df, 'ffill'), np.array([ np.nan, 1., 1., 9., 9., 25.])) >>> assert eq(df_fillna(df, ['ffill','bfill']), np.array([ 1., 1., 1., 9., 9., 25.])) - >>> df = np.array([np.nan, 1., np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 9, np.nan, 25]) >>> assert eq(df_fillna(df, 'ffill', limit = 2), np.array([np.nan, 1., 1., 1., np.nan, np.nan, np.nan, np.nan, 9., 9., 25.])) - df_fillna does not maintain state of latest ‘prev’ value: use ffill_ for that. - Example
- interpolation methods 
 - >>> from pyg import *; import numpy as np >>> df = np.array([np.nan, 1., np.nan, 9, np.nan, 25]) >>> assert eq(df_fillna(df, 'linear'), np.array([ np.nan, 1., 5., 9., 17., 25.])) >>> assert eq(df_fillna(df, 'quadratic'), np.array([ np.nan, 1., 4., 9., 16., 25.])) - Example
- method = fnna and nona 
 - >>> from pyg import *; import numpy as np >>> ts = np.array([np.nan] * 10 + [1.] * 10 + [np.nan]) >>> assert eq(df_fillna(ts, 'fnna'), np.array([1.]*10 + [np.nan])) >>> assert eq(df_fillna(ts, 'nona'), np.array([1.]*10)) - >>> assert len(df_fillna(np.array([np.nan]), 'nona')) == 0 >>> assert len(df_fillna(np.array([np.nan]), 'fnna')) == 0 - Returns
 - array/dataframe with nans removed/filled 
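- df_fillna also accepts pandas objects; a quick sketch (assuming the same ffill semantics as pandas when no interpolation method is involved):
 - >>> s = pd.Series([np.nan, 1., np.nan, 9., np.nan, 25.], drange(-5)) >>> assert eq(df_fillna(s, 'ffill'), s.ffill()) ## sketch: should match pandas ffill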
df_index¶
- 
pyg.timeseries._index.df_index(seq, index='inner')¶
- Determines a joint index of multiple timeseries objects. - Parameters
- seq: sequence whose index needs to be determined
- a (possibly nested) sequence of timeseries/non-timeseries objects within lists/dicts 
- index: str, optional
- method to determine the index. The default is ‘inner’. 
 - Returns
 - pd.Index
- The joint index. 
 - Example
 - >>> tss = [pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)] >>> more_tss_as_dict = dict(zip('abcde',[pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)])) >>> res = df_index(tss + [more_tss_as_dict], 'inner') >>> assert len(res) == 6 >>> res = df_index(more_tss_as_dict, 'outer') >>> assert len(res) == 14 
df_reindex¶
- 
pyg.timeseries._index.df_reindex(ts, index=None, method=None, limit=None)¶
- A slightly more general version of df.reindex(index) - Parameters
- ts: dataframe or numpy array (or list/dict of these)
- timeseries to be reindexed 
- index: str, timeseries or pd.Index
- The new index 
- method: str, list of str, float, optional
- various methods of handling nans are available. The default is None. See df_fillna for a full list. 
 - Returns
- timeseries/np.ndarray (or list/dict of these)
- the reindexed timeseries. 
 - Example
- index = inner/outer 
 - >>> tss = [pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)] >>> res = df_reindex(tss, 'inner') >>> assert len(res[0]) == 6 >>> res = df_reindex(tss, 'outer') >>> assert len(res[0]) == 14 - Example
- index provided 
 - >>> tss = [pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)] >>> res = df_reindex(tss, tss[0]) >>> assert eq(res[0], tss[0]) >>> res = df_reindex(tss, tss[0].index) >>> assert eq(res[0], tss[0]) 
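- A sketch combining reindexing with a fill method (see df_fillna for the available methods):
 - >>> full = pd.Series(np.arange(5.), drange(-4)) >>> ts = full.iloc[:3] >>> res = df_reindex(ts, full, method = 'ffill') ## sketch: reindex then forward fill >>> assert eq(res.values, np.array([0., 1., 2., 2., 2.]))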
presync¶
- 
pyg.timeseries._index.presync()¶
- Much of timeseries analysis in pandas is spent aligning multiple timeseries before feeding them into a function. presync allows easy presynching of all parameters of a function. - Parameters
- function: callable, optional
- function to be presynched. The default is None. 
- index: str, optional
- index join policy. The default is ‘inner’. 
- method: str/int/list of these, optional
- method of nan handling. The default is None. 
- columns: str, optional
- columns join policy. The default is ‘inner’. 
- default: float, optional
- value when no data is available. The default is np.nan. 
 - Returns
 - presynch-decorated function - Example
 - >>> from pyg import * >>> x = pd.Series([1,2,3,4], drange(-3)) >>> y = pd.Series([1,2,3,4], drange(-4,-1)) >>> z = pd.DataFrame([[1,2],[3,4]], drange(-3,-2), ['a','b']) >>> addition = lambda a, b: a+b - #We get some nonsensical results: - >>> assert list(addition(x,z).columns) == list(x.index) + ['a', 'b'] - #But: - >>> assert list(presync(addition)(x,z).columns) == ['a', 'b'] >>> res = presync(addition, index='outer', method = 'ffill')(x,z) >>> assert eq(res.a.values, np.array([2,5,6,7])) - Example 2
- alignment works for parameters ‘buried’ within… 
 - >>> function = lambda a, b: a['x'] + a['y'] + b >>> f = presync(function, 'outer', method = 'ffill') >>> res = f(dict(x = x, y = y), b = z) >>> assert eq(res, pd.DataFrame(dict(a = [np.nan, 4, 8, 10, 11], b = [np.nan, 5, 9, 11, 12]), index = drange(-4))) - Example 3
- alignment of numpy arrays 
 - >>> addition = lambda a, b: a+b >>> a = presync(addition) >>> assert eq(a(pd.Series([1,2,3,4], drange(-3)), np.array([[1,2,3,4]]).T), pd.Series([2,4,6,8], drange(-3))) >>> assert eq(a(pd.Series([1,2,3,4], drange(-3)), np.array([1,2,3,4])), pd.Series([2,4,6,8], drange(-3))) >>> assert eq(a(pd.Series([1,2,3,4], drange(-3)), np.array([[1,2,3,4],[5,6,7,8]]).T), pd.DataFrame({0:[2,4,6,8], 1:[6,8,10,12]}, drange(-3))) >>> assert eq(a(np.array([1,2,3,4]), np.array([[1,2,3,4]]).T), np.array([2,4,6,8])) - Example 4
- inner join alignment of columns in dataframes by default 
 - >>> x = pd.DataFrame({'a':[2,4,6,8], 'b':[6,8,10,12.]}, drange(-3)) >>> y = pd.DataFrame({'wrong':[2,4,6,8], 'columns':[6,8,10,12]}, drange(-3)) >>> assert len(a(x,y)) == 0 >>> y = pd.DataFrame({'a':[2,4,6,8], 'other':[6,8,10,12.]}, drange(-3)) >>> assert eq(a(x,y),x[['a']]*2) >>> y = pd.DataFrame({'a':[2,4,6,8], 'b':[6,8,10,12.]}, drange(-3)) >>> assert eq(a(x,y),x*2) >>> y = pd.DataFrame({'column name for a single column dataframe is ignored':[1,1,1,1]}, drange(-3)) >>> assert eq(a(x,y),x+1) - >>> a = presync(addition, columns = 'outer') >>> y = pd.DataFrame({'other':[2,4,6,8], 'a':[6,8,10,12]}, drange(-3)) >>> assert sorted(a(x,y).columns) == ['a','b','other'] - Example 4
- ffilling, bfilling 
 - >>> x = pd.Series([1.,np.nan,3.,4.], drange(-3)) >>> y = pd.Series([1.,np.nan,3.,4.], drange(-4,-1)) >>> assert eq(a(x,y), pd.Series([np.nan, np.nan,7], drange(-3,-1))) - but, we provide easy conversion of internal parameters of presync: - >>> assert eq(a.ffill(x,y), pd.Series([2,4,7], drange(-3,-1))) >>> assert eq(a.bfill(x,y), pd.Series([4,6,7], drange(-3,-1))) >>> assert eq(a.oj(x,y), pd.Series([np.nan, np.nan, np.nan, 7, np.nan], drange(-4))) >>> assert eq(a.oj.ffill(x,y), pd.Series([np.nan, 2, 4, 7, 8], drange(-4))) - Example 5
- indexing to a specific index 
 - >>> index = pd.Index([dt(-3), dt(-1)]) >>> a = presync(addition, index = index) >>> x = pd.Series([1.,np.nan,3.,4.], drange(-3)) >>> y = pd.Series([1.,np.nan,3.,4.], drange(-4,-1)) >>> assert eq(a(x,y), pd.Series([np.nan, 7], index)) - Example 6
- returning complicated stuff 
 - >>> from pyg import * >>> a = pd.DataFrame(np.random.normal(0,1,(100,10)), drange(-99)) >>> b = pd.DataFrame(np.random.normal(0,1,(100,10)), drange(-99)) - >>> def f(a, b): >>> return (a*b, ts_sum(a), ts_sum(b)) - >>> old = f(a,b) >>> self = presync(f) >>> args = (); kwargs = dict(a = a, b = b) >>> new = self(*args, **kwargs) >>> assert eq(new, old) 
add/sub/mul/div/pow operators¶
- 
pyg.timeseries._index.add_(a, b)¶
- addition of a and b supporting presynching (inner join) of timeseries 
- 
pyg.timeseries._index.mul_(a, b)¶
- multiplication of a and b supporting presynching (inner join) of timeseries 
- 
pyg.timeseries._index.div_(a, b)¶
- division of a by b supporting presynching (inner join) of timeseries 
- 
pyg.timeseries._index.sub_(a, b)¶
- subtraction of b from a supporting presynching (inner join) of timeseries 
- 
pyg.timeseries._index.pow_(a, b)¶
- equivalent to a**b supporting presynching (inner join) of timeseries
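- A sketch of the inner-join behaviour shared by these operators:
 - >>> from pyg import * >>> x = pd.Series([1,2,3,4], drange(-3)) >>> y = pd.Series([1,2,3,4], drange(-4,-1)) >>> assert eq(add_(x, y), pd.Series([3,5,7], drange(-3,-1))) ## sketch of inner-join arithmetic >>> assert eq(sub_(x, y), pd.Series([-1,-1,-1], drange(-3,-1)))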