pyg.timeseries

Given pandas, why do we need this timeseries library? pandas is amazing, but there are three issues with it that pyg.timeseries is designed to address:

  • pandas works on pandas objects (obviously) but not on numpy arrays.

  • pandas handles timeseries containing nans inconsistently across its functions. This makes your results sensitive to reindexing/resampling (see the snippet after this list). E.g.:
    • a.expanding() & a.ewm() ignore nans in the calculation and then ffill the result.

    • a.diff(), a.rolling() include any nans in the calculation, leading to nan propagation.

  • pandas is great if you have the full timeseries. However, if you now want to run the same calculations in a live environment, on recent data, pandas cannot help you: you have to stick the new data at the end of the DataFrame and rerun.
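A quick pandas-only illustration of that inconsistency (nothing pyg-specific here): the expanding mean carries its last result over the nan, while diff and rolling propagate it.

>>> import pandas as pd; import numpy as np
>>> a = pd.Series([1., np.nan, 3.])
>>> a.expanding().mean().values    ## array([1., 1., 2.]): the nan slot is filled forward
>>> a.diff().values                ## array([nan, nan, nan]): the nan propagates
>>> a.rolling(2).mean().values     ## array([nan, nan, nan]): the nan propagates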

pyg.timeseries tries to address this:

  • pyg.timeseries agrees with pandas 100% on DataFrames with no nans, while being of comparable (if not faster) speed.

  • pyg.timeseries works seamlessly on pandas objects and on numpy arrays, with no code change.

  • pyg.timeseries handles nan consistently across all its functions, ‘ignoring’ all nans, making your results consistent regardless of resampling.

  • pyg.timeseries exposes the state of the internal function calculation. The exposure of internal states allows us to calculate the output on additional data without re-running the full history. This speeds up two very common problems in finance:
    • risk calculations, Monte Carlo scenarios: We can run a trading strategy up to today and then generate multiple scenarios and see what-if, without having to rerun the full history.

    • live versus history: pandas is designed to run a full historical simulation. However, once we reach “today”, speed is of the essence: running a full historical simulation every time we ingest a new price is just too slow. That is why most fast trading is built around fast state-machines. Of course, making sure research & live versions do the same thing is tricky. pyg gives you the ability to run two systems in parallel with almost the same code base: run the full history overnight and then run the same code on today’s data instantly, instantiated with the output of the historical simulation, as sketched below.
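A minimal sketch of that history/live split, using the ewma_/ewma state-management pattern documented later in this section (the same pattern applies to the other functions):

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> history, live = a.iloc[:-1], a.iloc[-1:]     ## overnight data vs. today's new observation
>>> overnight = ewma_(history, 10)               ## full historical run, returning both data and state
>>> today = ewma(live, 10, **overnight)          ## instant update, instantiated with the overnight output
>>> assert eq(today, ewma(a, 10).iloc[-1:])      ## matches a full rerun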

simple functions

diff

pyg.timeseries._rolling.diff(a, n=1, axis=0, data=None, state=None)

equivalent to a.diff(n) in pandas if there are no nans. If there are, we SKIP nans rather than propagate them.

Parameters

a: array/timeseries

array/timeseries

n: int, optional, default = 1

window size

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

matching pandas, no nans

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> assert eq(timer(diff, 1000)(a), timer(lambda a, n=1: a.diff(n), 1000)(a))
Example

nan skipping

>>> a = np.array([1., np.nan, 3., 9.])
>>> assert eq(diff(a),                      np.array([np.nan, np.nan, 2.0,   6.0]))
>>> assert eq(pd.Series(a).diff().values,   np.array([np.nan, np.nan, np.nan,6.0]))
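Example

state management

Like shift below, diff exposes its internal state; a minimal hedged sketch, assuming diff_ follows the same data-plus-state convention as shift_:

>>> old, new = a[:2], a[2:]              ## a = np.array([1., np.nan, 3., 9.]) from above
>>> old_ts = diff_(old)                  ## assumed: returns the diff of old together with its state
>>> new_ts = diff(new, **old_ts)
>>> assert eq(new_ts, diff(a)[2:])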

shift

pyg.timeseries._rolling.shift(a, n=1, axis=0, data=None, state=None)

Equivalent to a.shift(n), with support for numpy arrays.

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

number of periods to shift by

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series([1.,2,3,4,5], drange(-4))
>>> assert eq(shift(a), pd.Series([np.nan,1,2,3,4], drange(-4)))
>>> assert eq(shift(a,2), pd.Series([np.nan,np.nan,1,2,3], drange(-4)))
>>> assert eq(shift(a,-1), pd.Series([2,3,4,5,np.nan], drange(-4)))
Example

np.ndarrays

>>> assert eq(shift(a.values), shift(a).values)
Example

nan skipping

>>> a = pd.Series([1.,2,np.nan,3,4], drange(-4))
>>> assert eq(shift(a), pd.Series([np.nan,1,np.nan, 2,3], drange(-4)))
>>> assert eq(a.shift(), pd.Series([np.nan,1,2,np.nan,3], drange(-4))) # the location of the nan changes
Example

state management

>>> old = a.iloc[:3]
>>> new = a.iloc[3:]
>>> old_ts = shift_(old)
>>> new_ts = shift(new, **old_ts)
>>> assert eq(new_ts, shift(a).iloc[3:])

ratio

pyg.timeseries._rolling.ratio(a, n=1, data=None, state=None)

Equivalent to a.diff(n) but in log-space, i.e. the ratio a / shift(a, n).

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

the lag used for the ratio

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series([1.,2,3,4,5], drange(-4))
>>> assert eq(ratio(a), pd.Series([np.nan, 2, 1.5, 4/3,1.25], drange(-4)))
>>> assert eq(ratio(a,2), pd.Series([np.nan, np.nan, 3, 2, 5/3], drange(-4)))
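Since ratio is declared on arrays as well as timeseries, the usual numpy equivalence should hold; a hedged sketch using a from the example above:

>>> assert eq(ratio(a.values), ratio(a).values)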

ts_count

pyg.timeseries._ts.ts_count(a) is equivalent to a.count() (though slightly slower)
  • supports numpy arrays

  • skips nan

  • supports state management

Example

pandas matching

>>> # create sample data:
>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan
>>> assert ts_count(a) == a.count()
Example

numpy

>>> assert ts_count(a.values) == ts_count(a)
Example

state management

>>> old = ts_count_(a.iloc[:2000])
>>> new = ts_count(a.iloc[2000:], state = old.state)
>>> assert new == ts_count(a)

ts_sum

pyg.timeseries._ts.ts_sum(a) is equivalent to a.sum()
  • supports numpy arrays

  • handles nan

  • supports state management

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

pandas matching

>>> # create sample data:
>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan
>>> assert ts_sum(a) == a.sum()
Example

numpy

>>> assert ts_sum(a.values) == ts_sum(a)
Example

state management

>>> old = ts_sum_(a.iloc[:2000])
>>> new = ts_sum(a.iloc[2000:], vec = old.vec)
>>> assert new == ts_sum(a)

ts_mean

pyg.timeseries._ts.ts_mean(a) is equivalent to a.mean()
  • supports numpy arrays

  • handles nan

  • supports state management

  • pandas is actually faster on count

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

pandas matching

>>> # create sample data:
>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan
>>> assert ts_mean(a) == a.mean()
Example

numpy

>>> assert ts_mean(a.values) == ts_mean(a)
Example

state management

>>> old = ts_mean_(a.iloc[:2000])
>>> new = ts_mean(a.iloc[2000:], vec = old.vec)
>>> assert new == ts_mean(a)

ts_rms

pyg.timeseries._ts.ts_rms(a, axis=0, data=None, state=None)

ts_rms(a) is equivalent to (a**2).mean()**0.5

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

  • supports numpy arrays

  • handles nan

  • supports state management

>>> # create sample data:
>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan
Example

pandas matching

>>> assert abs(ts_rms(a) - (a**2).mean()**0.5) < 1e-13
Example

numpy

>>> assert ts_rms(a.values) == ts_rms(a)
Example

state management

>>> old = ts_rms_(a.iloc[:2000])
>>> new = ts_rms(a.iloc[2000:], vec = old.vec)
>>> assert new == ts_rms(a)

ts_std

pyg.timeseries._ts.ts_std(a) is equivalent to a.std()
  • supports numpy arrays

  • handles nan

  • supports state management

>>> # create sample data:
>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan
Example

pandas matching

>>> assert abs(ts_std(a) - a.std())<1e-13
Example

numpy

>>> assert ts_std(a.values) == ts_std(a)
Example

state management

>>> old = ts_std_(a.iloc[:2000])
>>> new = ts_std(a.iloc[2000:], vec = old.vec)
>>> assert new == ts_std(a)

ts_skew

pyg.timeseries._ts.ts_skew(a, 0) is equivalent to a.skew()
  • supports numpy arrays

  • handles nan

  • faster than pandas

  • supports state management

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

min_sample: float, optional

This refers to the denominator used when we calculate the skew. Over time the denominator converges to 1, but initially it is small. Also, if there is a gap in the data, the weight of older data points may have decayed while there are not yet enough new points. min_sample ensures that in both cases we return nan if the denominator is below 0.25 (the default value).

data: None

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

pandas matching

>>> # create sample data:
>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan
>>> assert abs(ts_skew(a, 0) - a.skew())<1e-13
Example

numpy

>>> assert ts_skew(a.values) == ts_skew(a)
Example

state management

>>> old = ts_skew_(a.iloc[:2000])
>>> new = ts_skew(a.iloc[2000:], vec = old.vec)
>>> assert new == ts_skew(a)

ts_min

pyg.timeseries._ts.ts_min(a) is equivalent to pandas a.min()

ts_max

pyg.timeseries._ts.ts_max(a) is equivalent to pandas a.max()
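A hedged sketch of both functions, assuming ts_min/ts_max follow the same conventions (nan skipping, numpy support) as the other ts_ functions above:

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan
>>> assert ts_min(a) == a.min() and ts_max(a) == a.max()
>>> assert ts_min(a.values) == ts_min(a)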

ts_median

pyg.timeseries._ts.ts_median(a, axis=0)
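Again a hedged sketch, assuming ts_median matches pandas in the same way as its siblings:

>>> assert ts_median(a) == a.median()    ## a as defined in the ts_min/ts_max example above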

fnna

pyg.timeseries._rolling.fnna(a, n=1, axis=0)

returns the index in a of the n’th non-nan value; a negative n counts from the end.

Parameters

a: array/timeseries

n: int, optional, default = 1

Example

>>> a = np.array([np.nan,np.nan,1,np.nan,np.nan,2,np.nan,np.nan,np.nan])
>>> fnna(a)        ## 2: the index of the first non-nan value
>>> fnna(a, n=-2)  ## the index of the 2nd non-nan value, counting from the end

v2na/na2v

pyg.timeseries._rolling.v2na(a, old=0.0, new=nan)

replaces occurrences of a given old value (default 0.0) with a new value (default nan)

Examples

>>> from pyg import *
>>> a = np.array([1., np.nan, 1., 0.])
>>> assert eq(v2na(a), np.array([1., np.nan, 1., np.nan]))
>>> assert eq(v2na(a,1), np.array([np.nan, np.nan, np.nan, 0]))
>>> assert eq(v2na(a,1,0), np.array([0., np.nan, 0., 0.]))
Parameters

a: array/timeseries

old: float

value to be replaced

new: float, optional

new value to be used. The default is np.nan.

Returns

array/timeseries

pyg.timeseries._rolling.na2v(a, new=0.0)

replaces a nan with a new value

Example

>>> from pyg import *
>>> a = np.array([1., np.nan, 1.])
>>> assert eq(na2v(a), np.array([1., 0.0, 1.]))
>>> assert eq(na2v(a,1), np.array([1., 1., 1.]))
Parameters

a: array/timeseries

new: float, optional

value used to replace the nans. The default is 0.0.

Returns

array/timeseries

ffill/bfill

pyg.timeseries._rolling.ffill(a, n=0, axis=0, data=None, state=None)

returns a forward-filled array, filling up to n values forward. Supports state management, which is needed so that values from prior history can be filled forward into new data.

Parameters

a: array/timeseries

array/timeseries

n: int, optional, default = 0

maximum number of values to fill forward

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

>>> from pyg import *
>>> a = np.array([np.nan, np.nan, 1., np.nan, np.nan, 2., np.nan, np.nan, np.nan])
>>> assert eq(ffill(a), np.array([np.nan, np.nan, 1., 1., 1., 2., 2., 2., 2.]))
pyg.timeseries._rolling.bfill(a, n=-1, axis=0)

equivalent to a.fillna(‘bfill’). There is no state-aware version, as this function is forward-looking.

Example

>>> from pyg import *
>>> a = np.array([np.nan, 1., np.nan])
>>> b = np.array([1., 1., np.nan])
>>> assert eq(bfill(a),  b)
Example

pd.Series

>>> ts = pd.Series(a, drange(-2))
>>> assert eq(bfill(ts).values, b)

nona

pyg.timeseries._ts.nona(a, value=nan)

removes rows that are entirely nan (or a specific other value)

Parameters

a : dataframe/ndarray

value: float, optional

value to be removed. The default is np.nan.

Example

>>> from pyg import *
>>> a = np.array([1,np.nan,2,3])
>>> assert eq(nona(a), np.array([1,2,3]))
Example

multiple columns

>>> a = np.array([[1,np.nan,2,np.nan], [np.nan, np.nan, np.nan, 3]]).T 
>>> b = np.array([[1,2,np.nan], [np.nan, np.nan, 3]]).T ## 2nd row has nans across
>>> assert eq(nona(a), b)

expanding window functions

expanding_mean

pyg.timeseries._expanding.expanding_mean(a, axis=0, data=None, state=None)

equivalent to pandas a.expanding().mean().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.expanding().mean(); ts = expanding_mean(a)
>>> assert eq(ts,panda)    
Example

nan handling

Unlike pandas, timeseries does not forward fill the nans.

>>> a[a<0.1] = np.nan
>>> panda = a.expanding().mean(); ts = expanding_mean(a)
>>> pd.concat([panda,ts], axis=1)
>>>                    0         1
>>> 1993-09-23  1.562960  1.562960
>>> 1993-09-24  0.908910  0.908910
>>> 1993-09-25  0.846817  0.846817
>>> 1993-09-26  0.821423  0.821423
>>> 1993-09-27  0.821423       NaN
>>>              ...       ...
>>> 2021-02-03  0.870358  0.870358
>>> 2021-02-04  0.870358       NaN
>>> 2021-02-05  0.870358       NaN
>>> 2021-02-06  0.870358       NaN
>>> 2021-02-07  0.870353  0.870353
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = expanding_mean(a)
>>> old_ts = expanding_mean_(old)
>>> new_ts = expanding_mean(new, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(expanding_mean(dict(x = a, y = a**2)), dict(x = expanding_mean(a), y = expanding_mean(a**2)))
>>> assert eq(expanding_mean([a,a**2]), [expanding_mean(a), expanding_mean(a**2)])

expanding_rms

pyg.timeseries._expanding.expanding_rms(a, axis=0, data=None, state=None)

equivalent to pandas (a**2).expanding().mean()**0.5.

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = (a**2).expanding().mean()**0.5; ts = expanding_rms(a)
>>> assert eq(ts,panda)    
Example

nan handling

Unlike pandas, timeseries does not forward fill the nans.

>>> a[a<0.1] = np.nan
>>> panda = (a**2).expanding().mean()**0.5; ts = expanding_rms(a)
>>> pd.concat([panda,ts], axis=1)
>>>                    0         1
>>> 1993-09-23  0.160462  0.160462
>>> 1993-09-24  0.160462       NaN
>>> 1993-09-25  0.160462       NaN
>>> 1993-09-26  0.160462       NaN
>>> 1993-09-27  0.160462       NaN
>>>                  ...       ...
>>> 2021-02-03  1.040346  1.040346
>>> 2021-02-04  1.040346       NaN
>>> 2021-02-05  1.040338  1.040338
>>> 2021-02-06  1.040337  1.040337
>>> 2021-02-07  1.040473  1.040473
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = expanding_rms(a)
>>> old_ts = expanding_rms_(old)
>>> new_ts = expanding_rms(new, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(expanding_rms(dict(x = a, y = a**2)), dict(x = expanding_rms(a), y = expanding_rms(a**2)))
>>> assert eq(expanding_rms([a,a**2]), [expanding_rms(a), expanding_rms(a**2)])

expanding_std

pyg.timeseries._expanding.expanding_std(a, axis=0, data=None, state=None)

equivalent to pandas a.expanding().std().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.expanding().std(); ts = expanding_std(a)
>>> assert abs(ts-panda).max()<1e-10   
Example

nan handling

Unlike pandas, timeseries does not forward fill the nans.

>>> a[a<0.1] = np.nan
>>> panda = a.expanding().std(); ts = expanding_std(a)
>>> pd.concat([panda,ts], axis=1)
>>>                    0         1
>>> 1993-09-23       NaN       NaN
>>> 1993-09-24       NaN       NaN
>>> 1993-09-25       NaN       NaN
>>> 1993-09-26       NaN       NaN
>>> 1993-09-27       NaN       NaN
>>>              ...       ...
>>> 2021-02-03  0.590448  0.590448
>>> 2021-02-04  0.590448       NaN
>>> 2021-02-05  0.590475  0.590475
>>> 2021-02-06  0.590475       NaN
>>> 2021-02-07  0.590411  0.590411
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = expanding_std(a)
>>> old_ts = expanding_std_(old)
>>> new_ts = expanding_std(new, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(expanding_std(dict(x = a, y = a**2)), dict(x = expanding_std(a), y = expanding_std(a**2)))
>>> assert eq(expanding_std([a,a**2]), [expanding_std(a), expanding_std(a**2)])

expanding_sum

pyg.timeseries._expanding.expanding_sum(a, axis=0, data=None, state=None)

equivalent to pandas a.expanding().sum().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.expanding().sum(); ts = expanding_sum(a)
>>> assert eq(ts,panda)    
Example

nan handling

Unlike pandas, timeseries does not forward fill the nans.

>>> a[a<0.1] = np.nan
>>> panda = a.expanding().sum(); ts = expanding_sum(a)
>>> pd.concat([panda,ts], axis=1)
>>>                    0         1
>>> 1993-09-23          NaN          NaN
>>> 1993-09-24          NaN          NaN
>>> 1993-09-25     0.645944     0.645944
>>> 1993-09-26     2.816321     2.816321
>>> 1993-09-27     2.816321          NaN
>>>                 ...          ...
>>> 2021-02-03  3976.911348  3976.911348
>>> 2021-02-04  3976.911348          NaN
>>> 2021-02-05  3976.911348          NaN
>>> 2021-02-06  3976.911348          NaN
>>> 2021-02-07  3976.911348          NaN
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = expanding_sum(a)
>>> old_ts = expanding_sum_(old)
>>> new_ts = expanding_sum(new, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(expanding_sum(dict(x = a, y = a**2)), dict(x = expanding_sum(a), y = expanding_sum(a**2)))
>>> assert eq(expanding_sum([a,a**2]), [expanding_sum(a), expanding_sum(a**2)])

expanding_skew

pyg.timeseries._expanding.expanding_skew(a, bias=False, axis=0, data=None, state=None)

equivalent to what pandas a.expanding().skew() would be (pandas does not implement it)

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

state management

One can split the calculation and run old and new data separately.

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = expanding_skew(a)
>>> old_ts = expanding_skew_(old)
>>> new_ts = expanding_skew(new, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(expanding_skew(dict(x = a, y = a**2)), dict(x = expanding_skew(a), y = expanding_skew(a**2)))
>>> assert eq(expanding_skew([a,a**2]), [expanding_skew(a), expanding_skew(a**2)])
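Example

numpy equivalence

Since expanding_skew works with np.arrays, the usual equivalence should hold; a hedged sketch using a from the example above:

>>> assert eq(expanding_skew(a.values), expanding_skew(a).values)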

expanding_min

pyg.timeseries._min.expanding_min(a, axis=0, data=None, state=None)

equivalent to pandas a.expanding().min().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.expanding().min(); ts = expanding_min(a)
>>> assert eq(ts,panda)    
Example

nan handling

Unlike pandas, timeseries does not forward fill the nans.

>>> a[a<0.1] = np.nan
>>> panda = a.expanding().min(); ts = expanding_min(a)
>>> pd.concat([panda,ts], axis=1)
>>>                    0         1
>>> 1993-09-24       NaN       NaN
>>> 1993-09-25       NaN       NaN
>>> 1993-09-26  0.775176  0.775176
>>> 1993-09-27  0.691942  0.691942
>>> 1993-09-28  0.691942       NaN
>>>              ...       ...
>>> 2021-02-04  0.100099  0.100099
>>> 2021-02-05  0.100099       NaN
>>> 2021-02-06  0.100099       NaN
>>> 2021-02-07  0.100099  0.100099
>>> 2021-02-08  0.100099  0.100099
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = expanding_min(a)
>>> old_ts = expanding_min_(old)
>>> new_ts = expanding_min(new, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(expanding_min(dict(x = a, y = a**2)), dict(x = expanding_min(a), y = expanding_min(a**2)))
>>> assert eq(expanding_min([a,a**2]), [expanding_min(a), expanding_min(a**2)])

expanding_max

pyg.timeseries._max.expanding_max(a, axis=0, data=None, state=None)

equivalent to pandas a.expanding().max().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.expanding().max(); ts = expanding_max(a)
>>> assert eq(ts,panda)    
Example

nan handling

Unlike pandas, timeseries does not forward fill the nans.

>>> a[a<0.1] = np.nan
>>> panda = a.expanding().max(); ts = expanding_max(a)
>>> pd.concat([panda,ts], axis=1)
>>>                    0         1
>>> 1993-09-24       NaN       NaN
>>> 1993-09-25       NaN       NaN
>>> 1993-09-26  0.875409  0.875409
>>> 1993-09-27  0.875409       NaN
>>> 1993-09-28  0.875409       NaN
>>>              ...       ...
>>> 2021-02-04  3.625858  3.625858
>>> 2021-02-05  3.625858       NaN
>>> 2021-02-06  3.625858  3.625858
>>> 2021-02-07  3.625858       NaN
>>> 2021-02-08  3.625858       NaN
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = expanding_max(a)
>>> old_ts = expanding_max_(old)
>>> new_ts = expanding_max(new, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(expanding_max(dict(x = a, y = a**2)), dict(x = expanding_max(a), y = expanding_max(a**2)))
>>> assert eq(expanding_max([a,a**2]), [expanding_max(a), expanding_max(a**2)])

expanding_median

pyg.timeseries._median.expanding_median(a, axis=0)

equivalent to pandas a.expanding().median().

  • works with np.arrays

  • handles nan without forward filling.

  • There is no state-aware version since this requires essentially the whole history to be stored.

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.expanding().median(); ts = expanding_median(a)
>>> assert eq(ts,panda)    
Example

nan handling

Unlike pandas, timeseries does not forward fill the nans.

>>> a[a<0.1] = np.nan
>>> panda = a.expanding().median(); ts = expanding_median(a)
>>> pd.concat([panda,ts], axis=1)
>>>                    0         1
>>> 1993-09-23  1.562960  1.562960
>>> 1993-09-24  0.908910  0.908910
>>> 1993-09-25  0.846817  0.846817
>>> 1993-09-26  0.821423  0.821423
>>> 1993-09-27  0.821423       NaN
>>>              ...       ...
>>> 2021-02-03  0.870358  0.870358
>>> 2021-02-04  0.870358       NaN
>>> 2021-02-05  0.870358       NaN
>>> 2021-02-06  0.870358       NaN
>>> 2021-02-07  0.870353  0.870353
Example

dict/list inputs

>>> assert eq(expanding_median(dict(x = a, y = a**2)), dict(x = expanding_median(a), y = expanding_median(a**2)))
>>> assert eq(expanding_median([a,a**2]), [expanding_median(a), expanding_median(a**2)])

expanding_rank

pyg.timeseries._rank.expanding_rank(a, axis=0)

returns the rank of the current value within its history to date, scaled to be -1 if it is the smallest and +1 if it is the largest.

  • works on numpy arrays too

  • skips nan, no ffill

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

Example

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series([1.,2., np.nan, 0.,4.,2.], drange(-5))
>>> rank = expanding_rank(a)
>>> assert eq(rank, pd.Series([0, 1, np.nan, -1, 1, 0.25], drange(-5)))
>>> #
>>> # 2 is largest in [1,2] so goes to 1; 
>>> # 0 is smallest in [1,2,0] so goes to -1 etc.
Example

numpy equivalent

>>> assert eq(expanding_rank(a.values), expanding_rank(a).values)  

cumsum

pyg.timeseries._expanding.cumsum(a, axis=0, data=None, state=None)

equivalent to pandas a.expanding().sum().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.expanding().sum(); ts = cumsum(a)
>>> assert eq(ts,panda)    
Example

nan handling

Unlike pandas, timeseries does not forward fill the nans.

>>> a[a<0.1] = np.nan
>>> panda = a.expanding().sum(); ts = cumsum(a)
>>> pd.concat([panda,ts], axis=1)
>>>                    0         1
>>> 1993-09-23          NaN          NaN
>>> 1993-09-24          NaN          NaN
>>> 1993-09-25     0.645944     0.645944
>>> 1993-09-26     2.816321     2.816321
>>> 1993-09-27     2.816321          NaN
>>>                 ...          ...
>>> 2021-02-03  3976.911348  3976.911348
>>> 2021-02-04  3976.911348          NaN
>>> 2021-02-05  3976.911348          NaN
>>> 2021-02-06  3976.911348          NaN
>>> 2021-02-07  3976.911348          NaN
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = cumsum(a)
>>> old_ts = cumsum_(old)
>>> new_ts = cumsum(new, **old_ts)
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(cumsum(dict(x = a, y = a**2)), dict(x = cumsum(a), y = cumsum(a**2)))
>>> assert eq(cumsum([a,a**2]), [cumsum(a), cumsum(a**2)])

cumprod

pyg.timeseries._expanding.cumprod(a, axis=0, data=None, state=None)

equivalent to pandas np.exp(np.log(a).expanding().sum()).

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

axis: int, optional

0/1/-1. The default is 0.

data: None

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = 1 + pd.Series(np.random.normal(0.001,0.05,10000), drange(-9999))
>>> panda = np.exp(np.log(a).expanding().sum()); ts = cumprod(a)
>>> assert abs(ts-panda).max() < 1e-10
Example

nan handling

Unlike pandas, timeseries does not forward fill the nans.

>>> a = 1 + pd.Series(np.random.normal(-0.01,0.05,100), drange(-99, 2020))
>>> a[a<0.975] = np.nan
>>> panda = np.exp(np.log(a).expanding().sum()); ts = cumprod(a)
>>> pd.concat([panda,ts], axis=1)
>>> 2019-09-24  1.037161  1.037161
>>> 2019-09-25  1.050378  1.050378
>>> 2019-09-26  1.158734  1.158734
>>> 2019-09-27  1.158734       NaN
>>> 2019-09-28  1.219402  1.219402
>>>              ...       ...
>>> 2019-12-28  4.032919  4.032919
>>> 2019-12-29  4.032919       NaN
>>> 2019-12-30  4.180120  4.180120
>>> 2019-12-31  4.180120       NaN
>>> 2020-01-01  4.244261  4.244261
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:50]        
>>> new = a.iloc[50:]    
>>> ts = cumprod(a)
>>> old_ts = cumprod_(old)
>>> new_ts = cumprod(new, **old_ts)    
>>> assert eq(new_ts, ts.iloc[50:])
Example

dict/list inputs

>>> assert eq(cumprod(dict(x = a, y = a**2)), dict(x = cumprod(a), y = cumprod(a**2)))
>>> assert eq(cumprod([a,a**2]), [cumprod(a), cumprod(a**2)])

rolling window functions

rolling_mean

pyg.timeseries._rolling.rolling_mean(a, n, axis=0, data=None, state=None)

equivalent to pandas a.rolling(n).mean().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

size of rolling window

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.rolling(10).mean(); ts = rolling_mean(a,10)
>>> assert abs(ts-panda).max()<1e-10   
Example

nan handling

Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans

>>> a[a<0.1] = np.nan
>>> panda = a.rolling(10).mean(); ts = rolling_mean(a,10)
>>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points')
>>> #original: 4534 timeseries: 4525 panda: 6 data points
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = rolling_mean(a,10)
>>> old_ts = rolling_mean_(old,10)
>>> new_ts = rolling_mean(new, 10, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(rolling_mean(dict(x = a, y = a**2),10), dict(x = rolling_mean(a,10), y = rolling_mean(a**2,10)))
>>> assert eq(rolling_mean([a,a**2],10), [rolling_mean(a,10), rolling_mean(a**2,10)])

rolling_rms

pyg.timeseries._rolling.rolling_rms(a, n, axis=0, data=None, state=None)

equivalent to pandas (a**2).rolling(n).mean()**0.5.

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

size of rolling window

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = (a**2).rolling(10).mean()**0.5; ts = rolling_rms(a,10)
>>> assert abs(ts-panda).max()<1e-10   
Example

nan handling

Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans

>>> a[a<0.1] = np.nan
>>> panda = (a**2).rolling(10).mean()**0.5; ts = rolling_rms(a,10)
>>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points')
>>> #original: 4534 timeseries: 4525 panda: 6 data points
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = rolling_rms(a,10)
>>> old_ts = rolling_rms_(old,10)
>>> new_ts = rolling_rms(new, 10, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(rolling_rms(dict(x = a, y = a**2),10), dict(x = rolling_rms(a,10), y = rolling_rms(a**2,10)))
>>> assert eq(rolling_rms([a,a**2],10), [rolling_rms(a,10), rolling_rms(a**2,10)])

rolling_std

pyg.timeseries._rolling.rolling_std(a, n, axis=0, data=None, state=None)

equivalent to pandas a.rolling(n).std().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

size of rolling window

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.rolling(10).std(); ts = rolling_std(a,10)
>>> assert abs(ts-panda).max()<1e-10   
Example

nan handling

Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99.9% nans

>>> a[a<0.1] = np.nan
>>> panda = a.rolling(10).std(); ts = rolling_std(a,10)
>>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points')
>>> #original: 4534 timeseries: 4525 panda: 2 data points
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = rolling_std(a,10)
>>> old_ts = rolling_std_(old,10)
>>> new_ts = rolling_std(new, 10, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(rolling_std(dict(x = a, y = a**2),10), dict(x = rolling_std(a,10), y = rolling_std(a**2,10)))
>>> assert eq(rolling_std([a,a**2],10), [rolling_std(a,10), rolling_std(a**2,10)])

rolling_sum

pyg.timeseries._rolling.rolling_sum(a, n, axis=0, data=None, state=None)

equivalent to pandas a.rolling(n).sum().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

size of rolling window

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.rolling(10).sum(); ts = rolling_sum(a,10)
>>> assert abs(ts-panda).max()<1e-10   
Example

nan handling

Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99.9% nans

>>> a[a<0.1] = np.nan
>>> panda = a.rolling(10).sum(); ts = rolling_sum(a,10)
>>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points')
>>> #original: 4534 timeseries: 4525 panda: 2 data points
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = rolling_sum(a,10)
>>> old_ts = rolling_sum_(old,10)
>>> new_ts = rolling_sum(new, 10, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(rolling_sum(dict(x = a, y = a**2),10), dict(x = rolling_sum(a,10), y = rolling_sum(a**2,10)))
>>> assert eq(rolling_sum([a,a**2],10), [rolling_sum(a,10), rolling_sum(a**2,10)])

rolling_skew

pyg.timeseries._rolling.rolling_skew(a, n, bias=False, axis=0, data=None, state=None)

equivalent to pandas a.rolling(n).skew().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

size of rolling window

bias:

affects the skew calculation definition, see scipy documentation for details.

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.rolling(10).skew(); ts = rolling_skew(a,10)
>>> assert abs(ts-panda).max()<1e-10   
Example

nan handling

Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99.9% nans

>>> a[a<0.1] = np.nan
>>> panda = a.rolling(10).skew(); ts = rolling_skew(a,10)
>>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points')
>>> #original: 4534 timeseries: 4525 panda: 2 data points
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = rolling_skew(a,10)
>>> old_ts = rolling_skew_(old,10)
>>> new_ts = rolling_skew(new, 10, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(rolling_skew(dict(x = a, y = a**2),10), dict(x = rolling_skew(a,10), y = rolling_skew(a**2,10)))
>>> assert eq(rolling_skew([a,a**2],10), [rolling_skew(a,10), rolling_skew(a**2,10)])

rolling_min

pyg.timeseries._min.rolling_min(a, n, axis=0, data=None, state=None)

equivalent to pandas a.rolling(n).min().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

size of rolling window

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.rolling(10).min(); ts = rolling_min(a,10)
>>> assert abs(ts-panda).max()<1e-10
Example

nan handling

Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans

>>> a[a<0.1] = np.nan
>>> panda = a.rolling(10).min(); ts = rolling_min(a,10)
>>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points')
>>> #original: 4534 timeseries: 4525 panda: 6 data points
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = rolling_min(a,10)
>>> old_ts = rolling_min_(old,10)
>>> new_ts = rolling_min(new, 10, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(rolling_min(dict(x = a, y = a**2),10), dict(x = rolling_min(a,10), y = rolling_min(a**2,10)))
>>> assert eq(rolling_min([a,a**2],10), [rolling_min(a,10), rolling_min(a**2,10)])

rolling_max

pyg.timeseries._max.rolling_max(a, n, axis=0, data=None, state=None)

equivalent to pandas a.rolling(n).max().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

size of rolling window

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.rolling(10).max(); ts = rolling_max(a,10)
>>> assert abs(ts-panda).max()<1e-10   
Example

nan handling

Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans

>>> a[a<0.1] = np.nan
>>> panda = a.rolling(10).max(); ts = rolling_max(a,10)
>>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points')
>>> #original: 4534 timeseries: 4525 panda: 6 data points
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = rolling_max(a,10)
>>> old_ts = rolling_max_(old,10)
>>> new_ts = rolling_max(new, 10, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(rolling_max(dict(x = a, y = a**2),10), dict(x = rolling_max(a,10), y = rolling_max(a**2,10)))
>>> assert eq(rolling_max([a,a**2],10), [rolling_max(a,10), rolling_max(a**2,10)])

rolling_median

pyg.timeseries._median.rolling_median(a, n, axis=0, data=None, state=None)

equivalent to pandas a.rolling(n).median().

  • works with np.arrays

  • handles nan without forward filling.

  • supports state parameters

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

size of rolling window

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

agreement with pandas

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> panda = a.rolling(10).median(); ts = rolling_median(a,10)
>>> assert abs(ts-panda).max()<1e-10   
Example

nan handling

Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans

>>> a[a<0.1] = np.nan
>>> panda = a.rolling(10).median(); ts = rolling_median(a,10)
>>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points')
>>> #original: 4634 timeseries: 4625 panda: 4 data points
Example

state management

One can split the calculation and run old and new data separately.

>>> old = a.iloc[:5000]        
>>> new = a.iloc[5000:]    
>>> ts = rolling_median(a,10)
>>> old_ts = rolling_median_(old,10)
>>> new_ts = rolling_median(new, 10, **old_ts)    
>>> assert eq(new_ts, ts.iloc[5000:])
Example

dict/list inputs

>>> assert eq(rolling_median(dict(x = a, y = a**2),10), dict(x = rolling_median(a,10), y = rolling_median(a**2,10)))
>>> assert eq(rolling_median([a,a**2],10), [rolling_median(a,10), rolling_median(a**2,10)])

rolling_quantile

pyg.timeseries._stride.rolling_quantile(a, n, quantile=0.5, axis=0, data=None, state=None)

equivalent to a.rolling(n).quantile(q), except that it:

  • supports numpy arrays

  • supports multiple q values

Example

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> res = rolling_quantile(a, 100, 0.3)
>>> assert sub_(res, a.rolling(100).quantile(0.3)).max() < 1e-13
Example

multiple quantiles

>>> res = rolling_quantile(a, 100, [0.3, 0.5, 0.75])
>>> assert abs(res[0.3] - a.rolling(100).quantile(0.3)).max() < 1e-13
Example

state management

>>> res = rolling_quantile(a, 100, 0.3)
>>> old = rolling_quantile_(a.iloc[:2000], 100, 0.3)
>>> new = rolling_quantile(a.iloc[2000:], 100, 0.3, **old)
>>> both = pd.concat([old.data, new])
>>> assert eq(both, res)
Parameters

a: array/timeseries

n: integer

window size.

q: float or list of floats in [0,1]

quantile(s).

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Returns

timeseries/array of quantile(s)

rolling_rank

pyg.timeseries._rank.rolling_rank(a, n, axis=0, data=None, state=None)

returns the rank of the current value within the given window, scaled to be -1 if it is the smallest and +1 if it is the largest.

  • works on numpy arrays too

  • skips nan, no ffill

Parameters

a: array, pd.Series, pd.DataFrame or list/dict of these

timeseries

n: int

window size

axis: int, optional

0/1/-1. The default is 0.

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series([1.,2., np.nan, 0., 4., 2., 3., 1., 2.], drange(-8))
>>> rank = rolling_rank(a, 3)
>>> assert eq(rank.values, np.array([np.nan, np.nan, np.nan, -1, 1, 0, 0, -1, 0]))
>>> # 0 is smallest in [1,2,0] so goes to -1
>>> # 4 is largest in [2,0,4] so goes to +1
>>> # 2 is middle of [0,4,2] so goes to 0
Example

numpy equivalent

>>> assert eq(rolling_rank(a.values, 10), rolling_rank(a, 10).values)  
Example

state management

>>> a = np.random.normal(0,1,10000)
>>> old = rolling_rank_(a[:5000], 10) # grab both data and state
>>> new = rolling_rank(a[5000:], 10, **old)
>>> assert eq(np.concatenate([old.data,new]), rolling_rank(a, 10))

exponentially weighted moving functions

ewma

pyg.timeseries._ewm.ewma(a, n, time=None, axis=0, data=None, state=None)

ewma is equivalent to a.ewm(n).mean() but with:

  • supports np.ndarrays as well as timeseries

  • handles nan by skipping them

  • allows state-management

  • the ability to supply a ‘clock’ to the calculation

Parameters

a: array/timeseries

n: int/fraction

The number of days (or a ratio) to scale the history

time: Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)

If the time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit. For example, if we have intraday data and set time = ‘d’, then:

  • the ewm calculation on the last observation of each day is what is retained.

  • the ewm calculation on each intraday observation is the same as an ewm of (past end-of-day values + the current intraday observation).

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

matching pandas

>>> import pandas as pd; import numpy as np; from pyg import *
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> ts = ewma(a,10); df = a.ewm(10).mean()
>>> assert abs(ts-df).max()<1e-10
Example

numpy arrays support

>>> assert eq(ewma(a.values, 10), ewma(a,10).values)
Example

nan handling

>>> a[a.values<0.1] = np.nan
>>> ts = ewma(a,10, time = 'i'); df = a.ewm(10).mean() # note: pandas assumes time passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
>>> pd.concat([ts,df], axis=1)
>>>                        0         1
>>> 1993-09-24  0.263875  0.263875
>>> 1993-09-25       NaN  0.263875
>>> 1993-09-26       NaN  0.263875
>>> 1993-09-27       NaN  0.263875
>>> 1993-09-28       NaN  0.263875
>>>                  ...       ...
>>> 2021-02-04       NaN  0.786506
>>> 2021-02-05  0.928817  0.928817
>>> 2021-02-06       NaN  0.928817
>>> 2021-02-07  0.839168  0.839168
>>> 2021-02-08  0.831109  0.831109
Example

state management

>>> old = a.iloc[:5000]
>>> new = a.iloc[5000:]
>>> old_ts = ewma_(old, 10)
>>> new_ts = ewma(new, 10, **old_ts) # instantiation with previous ewma
>>> ts = ewma(a,10)
>>> assert eq(new_ts, ts.iloc[5000:])
Example

Support for time & clock

>>> daily = a
>>> monthly = daily.resample('M').last()
>>> m_ts = ewma(monthly, 3) ## 3-month ewma run on monthly data
>>> d_ts = ewma(daily, 3, 'm') ## 3-month ewma run on daily data
>>> daily_resampled_to_month = d_ts.resample('M').last()
>>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10

So you can run a 3-month ewma on daily data, where within each month the most recent value is used together with the end-of-month history.

Example

Support for dict/list of arrays

>>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999))
>>> a = dict(x = x, y = y)    
>>> assert eq(ewma(dict(x=x, y=y),10), dict(x=ewma(x,10), y=ewma(y,10)))
>>> assert eq(ewma([x,y],10), [ewma(x,10), ewma(y,10)])
Returns

an array/timeseries of ewma

ewmrms

pyg.timeseries._ewm.ewmrms(a, n, time=None, axis=0, data=None, state=None)

ewmrms is equivalent to (a**2).ewm(n).mean()**0.5 but with:

  • supports np.ndarrays as well as timeseries

  • handles nan by skipping them

  • allows state-management

  • the ability to supply a ‘clock’ to the calculation

Parameters

a: array/timeseries

n: int/fraction

The number of days (or a ratio) to scale the history

time: Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)

If the time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit. For example, if we have intraday data and set time = ‘d’, then:

  • the ewm calculation on the last observation of each day is what is retained.

  • the ewm calculation on each intraday observation is the same as an ewm of (past end-of-day values + the current intraday observation).

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

matching pandas

>>> import pandas as pd; import numpy as np; from pyg import *
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> ts = ewmrms(a,10); df = (a**2).ewm(10).mean()**0.5
>>> assert abs(ts-df).max()<1e-10
Example

numpy arrays support

>>> assert eq(ewmrms(a.values, 10), ewmrms(a,10).values)
Example

nan handling

>>> a[a.values<0.1] = np.nan
>>> ts = ewmrms(a,10, time = 'i'); df = (a**2).ewm(10).mean()**0.5 # note: pandas assumes time passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
>>> pd.concat([ts,df], axis=1)
>>>                        0         1
>>> 1993-09-24  0.263875  0.263875
>>> 1993-09-25       NaN  0.263875
>>> 1993-09-26       NaN  0.263875
>>> 1993-09-27       NaN  0.263875
>>> 1993-09-28       NaN  0.263875
>>>                  ...       ...
>>> 2021-02-04       NaN  0.786506
>>> 2021-02-05  0.928817  0.928817
>>> 2021-02-06       NaN  0.928817
>>> 2021-02-07  0.839168  0.839168
>>> 2021-02-08  0.831109  0.831109
Example

state management

>>> old = a.iloc[:5000]
>>> new = a.iloc[5000:]
>>> old_ts = ewmrms_(old, 10)
>>> new_ts = ewmrms(new, 10, **old_ts) # instantiation with previous ewma
>>> ts = ewmrms(a,10)
>>> assert eq(new_ts, ts.iloc[5000:])
Example

Support for time & clock

>>> daily = a
>>> monthly = daily.resample('M').last()
>>> m_ts = ewmrms(monthly, 3) ## 3-month ewma run on monthly data
>>> d_ts = ewmrms(daily, 3, 'm') ## 3-month ewma run on daily data
>>> daily_resampled_to_month = d_ts.resample('M').last()
>>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10

So you can run a 3-month ewmrms on daily data, where within each month the most recent value is used together with the end-of-month history.

Example

Support for dict/list of arrays

>>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999))
>>> a = dict(x = x, y = y)    
>>> assert eq(ewmrms(dict(x=x, y=y),10), dict(x=ewmrms(x,10), y=ewmrms(y,10)))
>>> assert eq(ewmrms([x,y],10), [ewmrms(x,10), ewmrms(y,10)])
Returns

an array/timeseries of ewmrms

ewmstd

pyg.timeseries._ewm.ewmstd(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, state=None)

ewmstd is equivalent to a.ewm(n).std() but:
  • supports np.ndarrays as well as timeseries
  • handles nan by skipping them
  • allows state management
  • allows the user to supply a ‘clock’ to the calculation

Parameters

a : array/timeseries

n : int/fraction

The number of days (or a ratio) to scale the history

time: Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)

If a time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit. E.g. if we have intraday data and set time = ‘d’, then:

  • the ewm calculation on the last observation of each day is what is retained

  • the ewm calculation on each intraday observation is the same as an ewm of (past EOD history + the current intraday observation)

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

matching pandas

>>> import pandas as pd; import numpy as np; from pyg import *
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> ts = ewmstd(a,10); df = a.ewm(10).std()
>>> assert abs(ts-df).max()<1e-10
>>> ts = ewmstd(a,10, bias = True); df = a.ewm(10).std(bias = True)
>>> assert abs(ts-df).max()<1e-10
Example

numpy arrays support

>>> assert eq(ewmstd(a.values, 10), ewmstd(a,10).values)
Example

nan handling

>>> a[a.values<-0.1] = np.nan
>>> ts = ewmstd(a,10, time = 'i'); df = a.ewm(10).std() # note: pandas assumes 'time' passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
>>> ts = ewmstd(a,10, time = 'i', bias = True); df = a.ewm(10).std(bias = True) # note: pandas assumes 'time' passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
Example

state management

>>> old = a.iloc[:5000]
>>> new = a.iloc[5000:]
>>> old_ts = ewmstd_(old, 10)
>>> new_ts = ewmstd(new, 10, **old_ts) # instantiation with previous ewma
>>> ts = ewmstd(a,10)
>>> assert eq(new_ts, ts.iloc[5000:])
Example

Support for time & clock

>>> daily = a
>>> monthly = daily.resample('M').last()
>>> m_ts = ewmstd(monthly, 3) ## 3-month ewma run on monthly data
>>> d_ts = ewmstd(daily, 3, 'm') ## 3-month ewma run on daily data
>>> daily_resampled_to_month = d_ts.resample('M').last()
>>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10

So you can run a 3-month ewma on daily data, where, within each month, the most recent value is used together with the end-of-month (EOM) history.

Example

Support for dict/list of arrays

>>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999))
>>> a = dict(x = x, y = y)    
>>> assert eq(ewmstd(dict(x=x, y=y),10), dict(x=ewmstd(x,10), y=ewmstd(y,10)))
>>> assert eq(ewmstd([x,y],10), [ewmstd(x,10), ewmstd(y,10)])
Returns

an array/timeseries of ewmstd

ewmvar

pyg.timeseries._ewm.ewmvar(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, state=None)

ewmvar is equivalent to a.ewm(n).var() but:
  • supports np.ndarrays as well as timeseries
  • handles nan by skipping them
  • allows state management
  • allows the user to supply a ‘clock’ to the calculation

Parameters

a : array/timeseries

n : int/fraction

The number of days (or a ratio) to scale the history

time: Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)

If a time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit. E.g. if we have intraday data and set time = ‘d’, then:

  • the ewm calculation on the last observation of each day is what is retained

  • the ewm calculation on each intraday observation is the same as an ewm of (past EOD history + the current intraday observation)

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

matching pandas

>>> import pandas as pd; import numpy as np; from pyg import *
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> ts = ewmvar(a,10); df = a.ewm(10).var()
>>> assert abs(ts-df).max()<1e-10
>>> ts = ewmvar(a,10, bias = True); df = a.ewm(10).var(bias = True)
>>> assert abs(ts-df).max()<1e-10
Example

numpy arrays support

>>> assert eq(ewmvar(a.values, 10), ewmvar(a,10).values)
Example

nan handling

>>> a[a.values<-0.1] = np.nan
>>> ts = ewmvar(a,10, time = 'i'); df = a.ewm(10).var() # note: pandas assumes 'time' passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
>>> ts = ewmvar(a,10, time = 'i', bias = True); df = a.ewm(10).var(bias = True) # note: pandas assumes 'time' passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
Example

state management

>>> old = a.iloc[:5000]
>>> new = a.iloc[5000:]
>>> old_ts = ewmvar_(old, 10)
>>> new_ts = ewmvar(new, 10, **old_ts) # instantiation with previous ewma
>>> ts = ewmvar(a,10)
>>> assert eq(new_ts, ts.iloc[5000:])
Example

Support for time & clock

>>> daily = a
>>> monthly = daily.resample('M').last()
>>> m_ts = ewmvar(monthly, 3) ## 3-month ewma run on monthly data
>>> d_ts = ewmvar(daily, 3, 'm') ## 3-month ewma run on daily data
>>> daily_resampled_to_month = d_ts.resample('M').last()
>>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10

So you can run a 3-month ewma on daily data, where, within each month, the most recent value is used together with the end-of-month (EOM) history.

Example

Support for dict/list of arrays

>>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999))
>>> a = dict(x = x, y = y)    
>>> assert eq(ewmvar(dict(x=x, y=y),10), dict(x=ewmvar(x,10), y=ewmvar(y,10)))
>>> assert eq(ewmvar([x,y],10), [ewmvar(x,10), ewmvar(y,10)])
Returns

an array/timeseries of ewmvar

ewmcor

pyg.timeseries._ewm.ewmcor(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, data=None, state=None)

calculates pair-wise correlation between a and b.

Parameters

a : array/timeseries

b : array/timeseries

n : int/fraction

The number of days (or a ratio) to scale the history

time: Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)

If a time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit. E.g. if we have intraday data and set time = ‘d’, then:

  • the ewm calculation on the last observation of each day is what is retained

  • the ewm calculation on each intraday observation is the same as an ewm of (past EOD history + the current intraday observation)

min_sample: float, optional

minimum weight of observations before we return a reading. The default is 0.25. This ensures that we don’t get silly numbers due to a small population.

bias: bool, optional

vol estimation for a and b should really be unbiased. Nevertheless, we track pandas and set bias = True as a default.

axis: int, optional

axis of calculation. The default is 0.

data: placeholder, optional

ignored. The default is None.

state: dict, optional

Output from a previous run of ewmcor. The default is None.

Example

matching pandas

>>> import pandas as pd; import numpy as np; from pyg import *
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> b = pd.Series(np.random.normal(0,1,9000), drange(-8999))
>>> ts = ewmcor(a, b, n = 10); df = a.ewm(10).corr(b)
>>> assert abs(ts-df).max()<1e-10
Example

numpy arrays support

>>> assert eq(ewmcor(a.values, b.values, 10), ewmcor(a, b, 10).values)
Example

nan handling

>>> a[a.values<-0.1] = np.nan
>>> ts = ewmcor(a, b, 10, time = 'i'); df = a.ewm(10).corr(b) # note: pandas assumes 'time' passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
Example

state management

>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> b = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> old_a = a.iloc[:5000]; old_b = b.iloc[:5000]
>>> new_a = a.iloc[5000:]; new_b = b.iloc[5000:]
>>> old_ts = ewmcor_(old_a, old_b, 10)
>>> new_ts = ewmcor(new_a, new_b, 10, **old_ts) # instantiation with previous ewma
>>> ts = ewmcor(a,b,10)
>>> assert eq(new_ts, ts.iloc[5000:])

ewmLR

pyg.timeseries._ewm.ewmLR(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, c=None, m=None, state=None)

calculates pair-wise linear regression between a and b. We have a and b for which we want to fit:

>>> b_i = c + m a_i 
>>> LSE(c,m) = \sum w_i (c + m a_i - b_i)^2
>>> dLSE/dc  = 0  <==> \sum w_i  (c + m a_i - b_i) = 0    [1]
>>> dLSE/dm  = 0 <==> \sum w_i  a_i (c + m a_i - b_i) = 0 [2]
>>> c     + mE(a)    = E(b)     [1]
>>> cE(a) + mE(a^2)  = E(ab)    [2]
>>> cE(a) + mE(a)^2  = E(a)E(b)  [1] * E(a) 
>>> m(E(a^2) - E(a)^2) = E(ab) - E(a)E(b)
>>> m = covar(a,b)/var(a)
>>> c = E(b) - mE(a)
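
For intuition, here is a small numpy sketch of the unweighted (equal-weight) version of the closed form above; the data and tolerances are illustrative only, whereas ewmLR itself applies the exponential weights w_i recursively:

>>> import numpy as np
>>> a = np.random.normal(0, 1, 10000)
>>> b = 2 * a + 3 + np.random.normal(0, 0.1, 10000)    # true m = 2, c = 3
>>> m = np.cov(a, b)[0, 1] / a.var()                   # m = covar(a,b)/var(a)
>>> c = b.mean() - m * a.mean()                        # c = E(b) - m E(a)
>>> assert abs(m - 2) < 0.05 and abs(c - 3) < 0.05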

Parameters

a : array/timeseries

b : array/timeseries

n : int/fraction

The number of days (or a ratio) to scale the history

time: Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)

If a time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit. E.g. if we have intraday data and set time = ‘d’, then:

  • the ewm calculation on the last observation of each day is what is retained

  • the ewm calculation on each intraday observation is the same as an ewm of (past EOD history + the current intraday observation)

min_sample: float, optional

minimum weight of observations before we return a reading. The default is 0.25. This ensures that we don’t get silly numbers due to a small population.

bias: bool, optional

vol estimation for a and b should really be unbiased. Nevertheless, we track pandas and set bias = True as a default.

axis: int, optional

axis of calculation. The default is 0.

c, m: placeholders, optional

ignored. The default is None.

state: dict, optional

Output from a previous run of ewmLR. The default is None.

Example

numpy arrays support

>>> assert eq(ewmLR(a.values, b.values, 10), ewmLR(a, b, 10).values)
Example

nan handling

>>> a[a.values<-0.1] = np.nan
>>> ts = ewmcor(a, b, 10, time = 'i'); df = a.ewm(10).corr(b) # note: pandas assumes 'time' passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
Example

state management

>>> from pyg import *
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> b = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> old_a = a.iloc[:5000]; old_b = b.iloc[:5000]
>>> new_a = a.iloc[5000:]; new_b = b.iloc[5000:]
>>> old_ts = ewmLR_(old_a, old_b, 10)
>>> new_ts = ewmLR(new_a, new_b, 10, **old_ts) # instantiation with previous ewma
>>> ts = ewmLR(a,b,10)
>>> assert eq(new_ts.c, ts.c.iloc[5000:])
>>> assert eq(new_ts.m, ts.m.iloc[5000:])
Example

>>> from pyg import *
>>> a0 = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> a1 = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> b = (a0 - a1) + pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> a = pd.concat([a0,a1], axis=1)
>>> LR = ewmLR(a,b,50)
>>> assert abs(LR.m.mean()[0]-1)<0.5
>>> assert abs(LR.m.mean()[1]+1)<0.5

ewmGLM

pyg.timeseries._ewm.ewmGLM(a, b, n, time=None, min_sample=0.25, bias=True, data=None, state=None)

Calculates a General Linear Model fitting b to a.

Parameters

a : a 2-d array/pd.DataFrame of values fitting b

b : a 1-d array/pd.Series

n : int/fraction

The number of days (or a ratio) to scale the history

time: Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)

If a time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit. E.g. if we have intraday data and set time = ‘d’, then:

  • the ewm calculation on the last observation of each day is what is retained

  • the ewm calculation on each intraday observation is the same as an ewm of (past EOD history + the current intraday observation)

min_sample: float, optional

minimum weight of observations before we return the fitting. The default is 0.25. This ensures that we don’t get silly numbers due to a small population.

data: placeholder, optional

ignored. The default is None.

state: dict, optional

Output from a previous run of ewmGLM. The default is None.

Theory

See https://en.wikipedia.org/wiki/Generalized_linear_model for full details. Briefly, we assume b is a single column while a is multi-column. We minimize the least square error (LSE) of the fit:

>>> b[i] = \sum_j m_j a_j[i]
>>> LSE(m) = \sum_i w_i (b[i] - \sum_j m_j * a_j[i])^2
>>> dLSE/dm_k = 0
>>> <==>  \sum_i w_i (b[i] - \sum_j m_j * a_j[i]) a_k[i] = 0
>>> <==>  E(b*a_k) = m_k E(a_k^2) + \sum_{j<>k} m_j E(a_j a_k)

E is the expectation under the weights w. We can rewrite this in matrix form:

>>> a2 x m = ab ## matrix multiplication
>>> a2[i,j] = E(a_i * a_j)
>>> ab[j] = E(a_j * b)
>>> m = a2.inverse x ab ## matrix multiplication    
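
For intuition, here is a small numpy sketch of the unweighted version of these normal equations; the data and tolerance are illustrative only, whereas ewmGLM itself applies the exponential weights w recursively:

>>> import numpy as np
>>> A = np.random.normal(0, 1, (10000, 3)); m_true = np.array([1., -2., 0.5])
>>> b = A @ m_true + np.random.normal(0, 0.1, 10000)
>>> a2 = A.T @ A / len(A)                     # a2[i,j] = E(a_i * a_j)
>>> ab = A.T @ b / len(A)                     # ab[j]   = E(a_j * b)
>>> m = np.linalg.solve(a2, ab)               # m = a2.inverse x ab
>>> assert abs(m - m_true).max() < 0.05
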
Example

simple fit

>>> from pyg import *
>>> a = pd.DataFrame(np.random.normal(0,1,(10000,10)), drange(-9999))
>>> true_m = np.random.normal(1,1,10)
>>> noise = np.random.normal(0,1,10000)
>>> b = (a * true_m).sum(axis = 1) + noise
>>> fitted_m = ewmGLM(a, b, 50)    
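
As a rough sanity check (a sketch rather than part of the original example; it assumes ewmGLM returns the fitted coefficients as a timeseries with one column per column of a, and uses a deliberately loose tolerance in the spirit of the ewmLR example above):

>>> assert abs(fitted_m.mean() - true_m).max() < 0.5   # fitted coefficients hover near true_m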

ewmskew

pyg.timeseries._ewm.ewmskew(a, n, time=None, bias=False, min_sample=0.25, axis=0, data=None, state=None)

Calculates an exponentially weighted moving skew (what a.ewm(n).skew() would be, had pandas implemented it) and:
  • supports np.ndarrays as well as timeseries
  • handles nan by skipping them
  • allows state management
  • allows the user to supply a ‘clock’ to the calculation

Parameters

a : array/timeseries

n : int/fraction

The number of days (or a ratio) to scale the history

time: Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)

If a time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit. E.g. if we have intraday data and set time = ‘d’, then:

  • the ewm calculation on the last observation of each day is what is retained

  • the ewm calculation on each intraday observation is the same as an ewm of (past EOD history + the current intraday observation)

data: None.

unused at the moment. Allow code such as func(live, **func_(history)) to work

state: dict, optional

state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided.

Example

matching pandas

>>> import pandas as pd; import numpy as np; from pyg import *
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> old = a.iloc[:10]
>>> new = a.iloc[10:]
>>> for f in [ewma_, ewmstd_, ewmrms_, ewmskew_]:
>>>     both = f(a, 3)
>>>     o = f(old, 3)
>>>     n = f(new, 3, **o)
>>>     assert eq(o.data, both.data.iloc[:10])
>>>     assert eq(n.data, both.data.iloc[10:])
>>>     assert both - 'data' == n - 'data'
>>> assert abs(a.ewm(10).mean() - ewma(a,10)).max() < 1e-14
>>> assert abs(a.ewm(10).std() - ewmstd(a,10)).max() < 1e-14
Example

numpy arrays support

>>> assert eq(ewma(a.values, 10), ewma(a,10).values)
Example

nan handling

While pandas ffills values, pyg.timeseries skips nans:

>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> a[a.values>0.1] = np.nan
>>> ts = ewma(a,10)
>>> assert eq(ts[~np.isnan(ts)], ewma(a[~np.isnan(a)], 10))
Example

initiating the ewma with past state

>>> old = np.random.normal(0,1,100)
>>> new = np.random.normal(0,1,100)
>>> old_ewma = ewma_(old, 10)
>>> new_ewma = ewma(new, 10, t0 = old_ewma.t0, t1 = old_ewma.t1) # instantiation with the previous ewma state
>>> new_ewma2 = ewma(np.concatenate([old,new]), 10)[-100:]
>>> assert eq(new_ewma, new_ewma2)
Example

Support for time & clock

>>> daily = pd.Series(np.random.normal(0,1,10000), drange(-9999)).cumsum()
>>> monthly = daily.resample('M').last()
>>> m = ewma(monthly, 3) ## 3-month ewma run on monthly data
>>> d = ewma(daily, 3, 'm') ## 3-month ewma run on daily data
>>> daily_resampled_to_month = d.resample('M').last()
>>> assert abs(daily_resampled_to_month - m).max() < 1e-10

So you can run a 3-month ewma on daily data, where, within each month, the most recent value is used together with the end-of-month (EOM) history.

Returns

an array/timeseries of ewmskew

functions exposing their state

simple functions

pyg.timeseries._rolling.diff_(a, n=1, axis=0, data=None, instate=None)

Equivalent to diff(a, n) but returns the full state. See diff for full details

pyg.timeseries._rolling.shift_(a, n=1, axis=0, instate=None)

Equivalent to shift(a,n) but returns the full state. See shift for full details

pyg.timeseries._rolling.ratio_(a, n=1, data=None, instate=None)
pyg.timeseries._ts.ts_count_(a, axis=0, data=None, instate=None)

ts_count_(a) is equivalent to ts_count(a) except vec is also returned. See ts_count for full documentation

pyg.timeseries._ts.ts_sum_(a, axis=0, data=None, instate=None)

ts_sum_(a) is equivalent to ts_sum(a) except vec is also returned. See ts_sum for full documentation

pyg.timeseries._ts.ts_mean_(a, axis=0, data=None, instate=None)

ts_mean_(a) is equivalent to ts_mean(a) except vec is also returned. See ts_mean for full documentation

pyg.timeseries._ts.ts_rms_(a, axis=0, data=None, instate=None)

ts_rms_(a) is equivalent to ts_rms(a) except vec is also returned. See ts_rms for full documentation

pyg.timeseries._ts.ts_std_(a, axis=0, data=None, instate=None)

ts_std_(a) is equivalent to ts_std(a) except vec is also returned. See ts_std for full documentation

pyg.timeseries._ts.ts_skew_(a, bias=False, min_sample=0.25, axis=0, data=None, instate=None)

ts_skew_(a) is equivalent to ts_skew except vec is also returned. See ts_skew for full details

pyg.timeseries._ts.ts_min_(a, axis=0, data=None, instate=None)

ts_min_(a) is equivalent to ts_min(a) (pandas a.min()) except vec is also returned. See ts_min for full documentation

pyg.timeseries._ts.ts_max_(a, axis=0, data=None, instate=None)

ts_max_(a) is equivalent to ts_max(a) (pandas a.max()) except vec is also returned. See ts_max for full documentation

pyg.timeseries._rolling.ffill_(a, n=0, axis=0, instate=None)

returns a forward filled array, up to n values forward. Supports state management.
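
All the underscore functions listed above follow the same pattern: func_(history) returns the same data as func(history) together with the state needed to continue the calculation, so that func(live, **func_(history)) picks up where the history left off. A minimal sketch using diff_ (following the pattern already shown for ewmrms_ above):

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,1000), drange(-999))
>>> old = a.iloc[:500]; new = a.iloc[500:]
>>> old_ = diff_(old)                       # same values as diff(old), plus the internal state
>>> assert eq(old_.data, diff(old))
>>> assert eq(diff(new, **old_), diff(a).iloc[500:])   # continue the calculation on the new data only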

expanding window functions

pyg.timeseries._expanding.expanding_mean_(a, axis=0, data=None, instate=None)

Equivalent to expanding_mean(a) but returns also the state variables. For full documentation, look at expanding_mean.__doc__

pyg.timeseries._expanding.expanding_rms_(a, axis=0, data=None, instate=None)

Equivalent to expanding_rms(a) but returns also the state variables. For full documentation, look at expanding_rms.__doc__

pyg.timeseries._expanding.expanding_std_(a, axis=0, data=None, instate=None)

Equivalent to expanding_std(a) but also returns the state variables. For full documentation, look at expanding_std.__doc__

pyg.timeseries._expanding.expanding_sum_(a, axis=0, data=None, instate=None)

Equivalent to expanding_sum(a) but returns also the state variables. For full documentation, look at expanding_sum.__doc__

pyg.timeseries._expanding.expanding_skew_(a, bias=False, axis=0, data=None, instate=None)

Equivalent to expanding_skew(a) but also returns the state variables. For full documentation, look at expanding_skew.__doc__

pyg.timeseries._min.expanding_min_(a, axis=0, data=None, instate=None)

Equivalent to a.expanding().min() but returns the full state, i.e. both data (the expanding().min()) and m (the current minimum)

pyg.timeseries._max.expanding_max_(a, axis=0, data=None, instate=None)

Equivalent to a.expanding().max() but returns the full state, i.e. both data (the expanding().max()) and m (the current maximum)

pyg.timeseries._expanding.cumsum_(a, axis=0, data=None, instate=None)

Equivalent to expanding_sum(a) but returns also the state variables. For full documentation, look at expanding_sum.__doc__

pyg.timeseries._expanding.cumprod_(a, axis=0, data=None, instate=None)

Equivalent to cumprod(a) but returns also the state variable. For full documentation, look at cumprod.__doc__
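
The expanding functions follow the same state convention. A minimal sketch using cumsum_, where the state is essentially the running total (values are illustrative):

>>> from pyg import *; import pandas as pd
>>> a = pd.Series([1., 2., 3., 4.], drange(-3))
>>> old = a.iloc[:2]; new = a.iloc[2:]
>>> old_ = cumsum_(old)                        # data = [1, 3] plus the running-total state
>>> assert eq(cumsum(new, **old_), cumsum(a).iloc[2:])   # [6, 10]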

rolling window functions

pyg.timeseries._rolling.rolling_mean_(a, n, axis=0, data=None, instate=None)

Equivalent to rolling_mean(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_mean.__doc__

pyg.timeseries._rolling.rolling_rms_(a, n, axis=0, data=None, instate=None)

Equivalent to rolling_rms(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_rms.__doc__

pyg.timeseries._rolling.rolling_std_(a, n, axis=0, data=None, instate=None)

Equivalent to rolling_std(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_std.__doc__

pyg.timeseries._rolling.rolling_sum_(a, n, axis=0, data=None, instate=None)

Equivalent to rolling_sum(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_sum.__doc__

pyg.timeseries._rolling.rolling_skew_(a, n, bias=False, axis=0, data=None, instate=None)

Equivalent to rolling_skew(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_skew.__doc__

pyg.timeseries._min.rolling_min_(a, n, vec=None, axis=0, data=None, instate=None)

Equivalent to rolling_min(a) but returns also the state. For full documentation, look at rolling_min.__doc__

pyg.timeseries._max.rolling_max_(a, n, axis=0, data=None, instate=None)

Equivalent to rolling_max(a) but returns also the state. For full documentation, look at rolling_max.__doc__

pyg.timeseries._median.rolling_median_(a, n, axis=0, data=None, instate=None)

Equivalent to rolling_median(a) but returns also the state. For full documentation, look at rolling_median.__doc__

pyg.timeseries._rank.rolling_rank_(a, n, axis=0, data=None, instate=None)

Equivalent to rolling_rank(a) but returns also the state variables. For full documentation, look at rolling_rank.__doc__

pyg.timeseries._stride.rolling_quantile_(a, n, quantile=0.5, axis=0, data=None, instate=None)

Equivalent to rolling_quantile(a) but returns also the state. For full documentation, look at rolling_quantile.__doc__
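
The rolling functions follow the same convention; their state must also carry enough of the trailing window to continue the calculation. A minimal sketch using rolling_mean_ under the same func(live, **func_(history)) pattern:

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,1000), drange(-999))
>>> old = a.iloc[:800]; new = a.iloc[800:]
>>> old_ = rolling_mean_(old, 10)
>>> assert eq(rolling_mean(new, 10, **old_), rolling_mean(a, 10).iloc[800:])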

exponentially weighted moving functions

pyg.timeseries._ewm.ewma_(a, n, time=None, data=None, instate=None)

Equivalent to ewma but returns a state parameter for instantiation of later calculations. See ewma documentation for more details

pyg.timeseries._ewm.ewmrms_(a, n, time=None, axis=0, data=None, instate=None)

Equivalent to ewmrms but returns a state parameter for instantiation of later calculations. See ewmrms documentation for more details

pyg.timeseries._ewm.ewmstd_(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, instate=None)

Equivalent to ewmstd but returns a state parameter for instantiation of later calculations. See ewmstd documentation for more details

pyg.timeseries._ewm.ewmvar_(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, instate=None)

Equivalent to ewmvar but returns a state parameter for instantiation of later calculations. See ewmvar documentation for more details

pyg.timeseries._ewm.ewmcor_(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, data=None, instate=None)

Equivalent to ewmcor but returns a state parameter for instantiation of later calculations. See ewmcor documentation for more details

pyg.timeseries._ewm.ewmLR_(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, c=None, m=None, instate=None)

Equivalent to ewmLR but returns a state parameter for instantiation of later calculations. See ewmLR documentation for more details

pyg.timeseries._ewm.ewmGLM_(a, b, n, time=None, min_sample=0.25, bias=True, data=None, instate=None)

Equivalent to ewmGLM but returns a state parameter for instantiation of later calculations. See ewmGLM documentation for more details

pyg.timeseries._ewm.ewmskew_(a, n, time=None, bias=False, min_sample=0.25, axis=0, data=None, instate=None)

Equivalent to ewmskew but returns a state parameter for instantiation of later calculations. See ewmskew documentation for more details

Index handling

df_fillna

pyg.timeseries._index.df_fillna(df, method=None, axis=0, limit=None)

Equivalent to df.fillna() except it:

  • supports np.ndarray as well as dataframes

  • supports multiple methods of filling/interpolation

  • supports removal of nans from the start/all of the timeseries

  • supports action on multiple timeseries

Parameters

df : dataframe/numpy array

method: string, list of strings or None, optional

One of:
  • a fill method: ‘bfill’, ‘ffill’, ‘pad’
  • an interpolation method: ‘linear’, ‘time’, ‘index’, ‘values’, ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘barycentric’, ‘krogh’, ‘spline’, ‘polynomial’, ‘from_derivatives’, ‘piecewise_polynomial’, ‘pchip’, ‘akima’, ‘cubicspline’
  • ‘fnna’: removes all values up to the first non-nan
  • ‘nona’: removes all nans

axis: int, optional

axis. The default is 0.

limit: int, optional

when filling, how many nans get filled. The default is None (no limit)

Example

method ffill or bfill

>>> from pyg import *; import numpy as np
>>> df = np.array([np.nan, 1., np.nan, 9, np.nan, 25])    
>>> assert eq(df_fillna(df, 'ffill'), np.array([ np.nan, 1.,  1.,  9.,  9., 25.]))
>>> assert eq(df_fillna(df, ['ffill','bfill']), np.array([ 1., 1.,  1.,  9.,  9., 25.]))
>>> df = np.array([np.nan, 1., np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 9, np.nan, 25])    
>>> assert eq(df_fillna(df, 'ffill', limit = 2), np.array([np.nan,  1.,  1.,  1., np.nan, np.nan, np.nan, np.nan,  9.,  9., 25.]))

df_fillna does not maintain state of latest ‘prev’ value: use ffill_ for that.

Example

interpolation methods

>>> from pyg import *; import numpy as np
>>> df = np.array([np.nan, 1., np.nan, 9, np.nan, 25])    
>>> assert eq(df_fillna(df, 'linear'), np.array([ np.nan, 1.,  5.,  9.,  17., 25.]))
>>> assert eq(df_fillna(df, 'quadratic'), np.array([ np.nan, 1.,  4.,  9.,  16., 25.]))
Example

method = fnna and nona

>>> from pyg import *; import numpy as np
>>> ts = np.array([np.nan] * 10 + [1.] * 10 + [np.nan])
>>> assert eq(df_fillna(ts, 'fnna'), np.array([1.]*10 + [np.nan]))
>>> assert eq(df_fillna(ts, 'nona'), np.array([1.]*10))
>>> assert len(df_fillna(np.array([np.nan]), 'nona')) == 0
>>> assert len(df_fillna(np.array([np.nan]), 'fnna')) == 0
Returns

array/dataframe with nans removed/filled

df_index

pyg.timeseries._index.df_index(seq, index='inner')

Determines a joint index of multiple timeseries objects.

Parameters

seq: sequence whose index needs to be determined

a (possibly nested) sequence of timeseries/non-timeseries objects within lists/dicts

index: str, optional

method to determine the index. The default is ‘inner’.

Returns

pd.Index

The joint index.

Example

>>> tss = [pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)]
>>> more_tss_as_dict = dict(zip('abcde',[pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)]))
>>> res = df_index(tss + [more_tss_as_dict], 'inner')
>>> assert len(res) == 6
>>> res = df_index(more_tss_as_dict, 'outer')
>>> assert len(res) == 14

df_reindex

pyg.timeseries._index.df_reindex(ts, index=None, method=None, limit=None)

A slightly more general version of df.reindex(index)

Parameters

ts: dataframe or numpy array (or list/dict of these)

timeseries to be reindexed

index: str, timeseries, pd.Index

The new index

method: str, list of str, float, optional

various methods of handling nans are available. The default is None. See df_fillna for a full list.

Returns

timeseries/np.ndarray (or list/dict of these)

the reindexed timeseries.

Example

index = inner/outer

>>> tss = [pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)]
>>> res = df_reindex(tss, 'inner')
>>> assert len(res[0]) == 6
>>> res = df_reindex(tss, 'outer')
>>> assert len(res[0]) == 14
Example

index provided

>>> tss = [pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)]
>>> res = df_reindex(tss, tss[0])
>>> assert eq(res[0], tss[0])
>>> res = df_reindex(tss, tss[0].index)
>>> assert eq(res[0], tss[0])

presync

pyg.timeseries._index.presync()

Much of timeseries analysis in pandas is spent aligning multiple timeseries before feeding them into a function. presync allows easy presynching of all the parameters of a function.

Parameters

function: callable, optional

function to be presynched. The default is None.

index: str, optional

index join policy. The default is ‘inner’.

method: str/int/list of these, optional

method of nan handling. The default is None.

columns: str, optional

columns join policy. The default is ‘inner’.

default: float, optional

value when no data is available. The default is np.nan.

Returns

presynch-decorated function

Example

>>> from pyg import *
>>> x = pd.Series([1,2,3,4], drange(-3))
>>> y = pd.Series([1,2,3,4], drange(-4,-1))    
>>> z = pd.DataFrame([[1,2],[3,4]], drange(-3,-2), ['a','b'])
>>> addition = lambda a, b: a+b    

We get some nonsensical results:

>>> assert list(addition(x,z).columns) ==  list(x.index) + ['a', 'b']

But:

>>> assert list(presync(addition)(x,z).columns) == ['a', 'b']
>>> res = presync(addition, index='outer', method = 'ffill')(x,z)
>>> assert eq(res.a.values, np.array([2,5,6,7]))
Example 2

alignment works for parameters ‘buried’ within…

>>> function = lambda a, b: a['x'] + a['y'] + b    
>>> f = presync(function, 'outer', method = 'ffill')
>>> res = f(dict(x = x, y = y), b = z)
>>> assert eq(res, pd.DataFrame(dict(a = [np.nan, 4, 8, 10, 11], b = [np.nan, 5, 9, 11, 12]), index = drange(-4)))
Example 3

alignment of numpy arrays

>>> addition = lambda a, b: a+b
>>> a = presync(addition)
>>> assert eq(a(pd.Series([1,2,3,4], drange(-3)), np.array([[1,2,3,4]]).T),  pd.Series([2,4,6,8], drange(-3)))
>>> assert eq(a(pd.Series([1,2,3,4], drange(-3)), np.array([1,2,3,4])),  pd.Series([2,4,6,8], drange(-3)))
>>> assert eq(a(pd.Series([1,2,3,4], drange(-3)), np.array([[1,2,3,4],[5,6,7,8]]).T),  pd.DataFrame({0:[2,4,6,8], 1:[6,8,10,12]}, drange(-3)))
>>> assert eq(a(np.array([1,2,3,4]), np.array([[1,2,3,4]]).T),  np.array([2,4,6,8]))
Example 4

inner join alignment of columns in dataframes by default

>>> x = pd.DataFrame({'a':[2,4,6,8], 'b':[6,8,10,12.]}, drange(-3))
>>> y = pd.DataFrame({'wrong':[2,4,6,8], 'columns':[6,8,10,12]}, drange(-3))
>>> assert len(a(x,y)) == 0    
>>> y = pd.DataFrame({'a':[2,4,6,8], 'other':[6,8,10,12.]}, drange(-3))
>>> assert eq(a(x,y),x[['a']]*2)
>>> y = pd.DataFrame({'a':[2,4,6,8], 'b':[6,8,10,12.]}, drange(-3))
>>> assert eq(a(x,y),x*2)
>>> y = pd.DataFrame({'column name for a single column dataframe is ignored':[1,1,1,1]}, drange(-3)) 
>>> assert eq(a(x,y),x+1)
>>> a = presync(addition, columns = 'outer')
>>> y = pd.DataFrame({'other':[2,4,6,8], 'a':[6,8,10,12]}, drange(-3))
>>> assert sorted(a(x,y).columns) == ['a','b','other']    
Example 5

ffilling, bfilling

>>> x = pd.Series([1.,np.nan,3.,4.], drange(-3))    
>>> y = pd.Series([1.,np.nan,3.,4.], drange(-4,-1))    
>>> assert eq(a(x,y), pd.Series([np.nan, np.nan,7], drange(-3,-1)))

but we provide easy conversion of the internal parameters of presync:

>>> assert eq(a.ffill(x,y), pd.Series([2,4,7], drange(-3,-1)))
>>> assert eq(a.bfill(x,y), pd.Series([4,6,7], drange(-3,-1)))
>>> assert eq(a.oj(x,y), pd.Series([np.nan, np.nan, np.nan, 7, np.nan], drange(-4)))
>>> assert eq(a.oj.ffill(x,y), pd.Series([np.nan, 2, 4, 7, 8], drange(-4)))
Example 6

indexing to a specific index

>>> index = pd.Index([dt(-3), dt(-1)])
>>> a = presync(addition, index = index)
>>> x = pd.Series([1.,np.nan,3.,4.], drange(-3))    
>>> y = pd.Series([1.,np.nan,3.,4.], drange(-4,-1))    
>>> assert eq(a(x,y), pd.Series([np.nan, 7], index))
Example 7

returning complicated stuff

>>> from pyg import * 
>>> a = pd.DataFrame(np.random.normal(0,1,(100,10)), drange(-99))
>>> b = pd.DataFrame(np.random.normal(0,1,(100,10)), drange(-99))
>>> def f(a, b):
>>>     return (a*b, ts_sum(a), ts_sum(b))
>>> old = f(a,b)    
>>> self = presync(f)
>>> args = (); kwargs = dict(a = a, b = b)
>>> new = self(*args, **kwargs)
>>> assert eq(new, old)

add/sub/mul/div/pow operators

pyg.timeseries._index.add_(a, b)

addition of a and b supporting presynching (inner join) of timeseries

pyg.timeseries._index.mul_(a, b)

multiplication of a and b supporting presynching (inner join) of timeseries

pyg.timeseries._index.div_(a, b)

division of a by b supporting presynching (inner join) of timeseries

pyg.timeseries._index.sub_(a, b)

subtraction of b from a supporting presynching (inner join) of timeseries

pyg.timeseries._index.pow_(a, b)

equivalent to a**b supporting presynching (inner join) of timeseries
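
A minimal sketch of the inner-join behaviour described above (the values are illustrative only):

>>> from pyg import *; import pandas as pd
>>> x = pd.Series([1., 2., 3., 4.], drange(-3))
>>> y = pd.Series([10., 20., 30.], drange(-3, -1))
>>> assert eq(add_(x, y), pd.Series([11., 22., 33.], drange(-3, -1)))   # only the shared dates survive
>>> assert eq(mul_(x, y), pd.Series([10., 40., 90.], drange(-3, -1)))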