pyg.timeseries¶
Given pandas, why do we need this timeseries library? pandas is excellent, but there are three issues with it that pyg.timeseries is designed to address:
- pandas works on pandas objects (obviously) but not on numpy arrays. 
- pandas handles timeseries containing nan inconsistently across its functions. This makes your results sensitive to reindexing/resampling (see the short snippet after this list). E.g.:
- a.expanding() & a.ewm() ignore nans in the calculation and then ffill the result. 
- a.diff() and a.rolling() include any nans in the calculation, leading to nan propagation. 
 
 
- pandas is great if you have the full timeseries. However, if you now want to run the same calculation in a live environment, on newly arriving data, pandas cannot help you: you have to append the new data to the end of the DataFrame and rerun the full history. 
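A minimal pandas-only snippet illustrating the inconsistency above (plain pandas, nothing pyg-specific assumed):

>>> import pandas as pd; import numpy as np
>>> a = pd.Series([1., np.nan, 3.])
>>> a.diff().values                 # the nan is included in the calculation and propagates
array([nan, nan, nan])
>>> a.expanding().mean().values     # the nan is skipped and the previous result is carried forward
array([1., 1., 2.])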
pyg.timeseries tries to address this:
- pyg.timeseries agrees with pandas 100% on DataFrames (with no nan) while running at comparable (if not faster) speed. 
- pyg.timeseries works seamlessly on pandas objects and on numpy arrays, with no code change. 
- pyg.timeseries handles nan consistently across all its functions, 'ignoring' all nans, making your results consistent regardless of resampling. 
- pyg.timeseries exposes the state of the internal calculation. This exposure allows us to calculate the output on additional data without re-running the full history. This speeds up two very common problems in finance: 
- risk calculations and Monte Carlo scenarios: we can run a trading strategy up to today and then generate multiple what-if scenarios without having to rerun the full history. 
- live versus history: pandas is designed to run a full historical simulation. However, once we reach "today", speed is of the essence, and running a full historical simulation every time we ingest a new price is just too slow. That is why most fast trading is built around fast state machines. Of course, making sure the research and live versions do the same thing is tricky. pyg gives you the ability to run two systems in parallel with almost the same code base: run the full history overnight and then run today's code instantly, instantiated with the output of the historical simulation (see the sketch after this list). 
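A minimal sketch of that pattern, using the ewma_/ewma pair documented below (the underscore version returns both the data and the internal state):

>>> from pyg import *; import pandas as pd; import numpy as np
>>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999))
>>> history, today = a.iloc[:9990], a.iloc[9990:]
>>> overnight = ewma_(history, 10)            # full run: returns data + state
>>> live = ewma(today, 10, **overnight)       # instantiated with the historical state
>>> assert eq(live, ewma(a, 10).iloc[9990:])  # identical to rerunning the full history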
 
 
simple functions¶
diff¶
- 
pyg.timeseries._rolling.diff(a, n=1, axis=0, data=None, state=None)¶
- equivalent to a.diff(n) in pandas if there are no nans. If there are, we SKIP nans rather than propagate them. - Parameters
- a: array/timeseries
- array/timeseries 
- n: int, optional, default = 1
- number of periods to difference over 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- : matching pandas no nan’s 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> assert eq(timer(diff, 1000)(a), timer(lambda a, n=1: a.diff(n), 1000)(a)) - Example
- : nan skipping 
 - >>> a = np.array([1., np.nan, 3., 9.]) >>> assert eq(diff(a), np.array([np.nan, np.nan, 2.0, 6.0])) >>> assert eq(pd.Series(a).diff().values, np.array([np.nan, np.nan, np.nan,6.0])) 
shift¶
- 
pyg.timeseries._rolling.shift(a, n=1, axis=0, data=None, state=None)¶
- Equivalent to a.shift(n) with support for numpy arrays. - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- number of periods to shift by 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series([1.,2,3,4,5], drange(-4)) >>> assert eq(shift(a), pd.Series([np.nan,1,2,3,4], drange(-4))) >>> assert eq(shift(a,2), pd.Series([np.nan,np.nan,1,2,3], drange(-4))) >>> assert eq(shift(a,-1), pd.Series([2,3,4,5,np.nan], drange(-4))) - Example
- np.ndarrays 
 - >>> assert eq(shift(a.values), shift(a).values) - Example
- nan skipping 
 - >>> a = pd.Series([1.,2,np.nan,3,4], drange(-4)) >>> assert eq(shift(a), pd.Series([np.nan,1,np.nan, 2,3], drange(-4))) >>> assert eq(a.shift(), pd.Series([np.nan,1,2,np.nan,3], drange(-4))) # the location of the nan changes - Example
- state management 
 - >>> old = a.iloc[:3] >>> new = a.iloc[3:] >>> old_ts = shift_(old) >>> new_ts = shift(new, **old_ts) >>> assert eq(new_ts, shift(a).iloc[3:]) 
ratio¶
- 
pyg.timeseries._rolling.ratio(a, n=1, data=None, state=None)¶
- The ratio analogue of a.diff(n): returns a divided by its value n observations earlier (i.e. diff in log-space, exponentiated). - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- number of periods to lag by 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series([1.,2,3,4,5], drange(-4)) >>> assert eq(ratio(a), pd.Series([np.nan, 2, 1.5, 4/3,1.25], drange(-4))) >>> assert eq(ratio(a,2), pd.Series([np.nan, np.nan, 3, 2, 5/3], drange(-4))) 
ts_count¶
- 
pyg.timeseries._ts.ts_count(a) is equivalent to a.count() (though slightly slower)¶
- supports numpy arrays 
- skips nan 
- supports state management 
 - Example
- pandas matching 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan >>> assert ts_count(a) == a.count() - Example
- numpy 
 - >>> assert ts_count(a.values) == ts_count(a) - Example
- state management 
 - >>> old = ts_count_(a.iloc[:2000]) >>> new = ts_count(a.iloc[2000:], state = old.state) >>> assert new == ts_count(a) 
ts_sum¶
- 
pyg.timeseries._ts.ts_sum(a) is equivalent to a.sum()¶
- supports numpy arrays 
- handles nan 
- supports state management 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- pandas matching 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan >>> assert ts_sum(a) == a.sum() - Example
- numpy 
 - >>> assert ts_sum(a.values) == ts_sum(a) - Example
- state management 
 - >>> old = ts_sum_(a.iloc[:2000]) >>> new = ts_sum(a.iloc[2000:], vec = old.vec) >>> assert new == ts_sum(a) 
ts_mean¶
- 
pyg.timeseries._ts.ts_mean(a) is equivalent to a.mean()¶
- supports numpy arrays 
- handles nan 
- supports state management 
- note: pandas is actually faster for this calculation 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- pandas matching 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan >>> assert ts_mean(a) == a.mean() - Example
- numpy 
 - >>> assert ts_mean(a.values) == ts_mean(a) - Example
- state management 
 - >>> old = ts_mean_(a.iloc[:2000]) >>> new = ts_mean(a.iloc[2000:], vec = old.vec) >>> assert new == ts_mean(a) 
ts_rms¶
- 
pyg.timeseries._ts.ts_rms(a, axis=0, data=None, state=None)¶
- ts_rms(a) is equivalent to (a**2).mean()**0.5 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - supports numpy arrays 
- handles nan 
- supports state management 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan - Example
- pandas matching 
- >>> assert abs(ts_rms(a) - (a**2).mean()**0.5)<1e-13 - Example
- numpy 
- >>> assert ts_rms(a.values) == ts_rms(a) - Example
- state management 
 - >>> old = ts_rms_(a.iloc[:2000]) >>> new = ts_rms(a.iloc[2000:], vec = old.vec) >>> assert new == ts_rms(a) 
ts_std¶
- 
pyg.timeseries._ts.ts_std(a) is equivalent to a.std()¶
- supports numpy arrays 
- handles nan 
- supports state management 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan - Example
- pandas matching 
 - >>> assert abs(ts_std(a) - a.std())<1e-13 - Example
- numpy 
 - >>> assert ts_std(a.values) == ts_std(a) - Example
- state management 
 - >>> old = ts_std_(a.iloc[:2000]) >>> new = ts_std(a.iloc[2000:], vec = old.vec) >>> assert new == ts_std(a) 
ts_skew¶
- 
pyg.timeseries._ts.ts_skew(a, 0) is equivalent to a.skew()¶
- supports numpy arrays 
- handles nan 
- faster than pandas 
- supports state management 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- min_sample: float, optional
- This refers to the denominator used when we calculate the skew. Over time the denominator converges to 1, but initially it is small. Also, if there is a gap in the data, the weight of older data points may have decayed while there are not yet enough new points. min_sample ensures that in both cases, if the denominator is below 0.25 (the default value), we return nan. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- pandas matching 
 - >>> # create sample data: >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)); a[a>0] = np.nan >>> assert abs(ts_skew(a, 0) - a.skew())<1e-13 - Example
- numpy 
 - >>> assert ts_skew(a.values) == ts_skew(a) - Example
- state management 
 - >>> old = ts_skew_(a.iloc[:2000]) >>> new = ts_skew(a.iloc[2000:], vec = old.vec) >>> assert new == ts_skew(a) 
fnna¶
- 
pyg.timeseries._rolling.fnna(a, n=1, axis=0)¶
- returns the index in a of the nth first non-nan. - Parameters
- a: array/timeseries - n: int, optional, default = 1 - Example
 - >>> a = np.array([np.nan,np.nan,1,np.nan,np.nan,2,np.nan,np.nan,np.nan]) >>> fnna(a,n=-2) 
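The example above does not show the returned indices. A hedged sketch of what to expect, assuming negative n counts the non-nan values from the end of the array:

>>> from pyg import *; import numpy as np
>>> a = np.array([np.nan,np.nan,1,np.nan,np.nan,2,np.nan,np.nan,np.nan])
>>> fnna(a)          # index of the first non-nan value, i.e. 2
>>> fnna(a, n=2)     # index of the second non-nan value, i.e. 5
>>> fnna(a, n=-1)    # presumably counted from the end: index of the last non-nan, i.e. 5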
v2na/na2v¶
- 
pyg.timeseries._rolling.v2na(a, old=0.0, new=nan)¶
- replaces an old value with a new value (default is nan) - Examples
 - >>> from pyg import * >>> a = np.array([1., np.nan, 1., 0.]) >>> assert eq(v2na(a), np.array([1., np.nan, 1., np.nan])) >>> assert eq(v2na(a,1), np.array([np.nan, np.nan, np.nan, 0])) >>> assert eq(v2na(a,1,0), np.array([0., np.nan, 0., 0.])) - Parameters
- a: array/timeseries - old: float - value to be replaced - new: float, optional
- new value to be used. The default is np.nan. 
 - Returns
 - array/timeseries 
- 
pyg.timeseries._rolling.na2v(a, new=0.0)¶
- replaces a nan with a new value - Example
 - >>> from pyg import * >>> a = np.array([1., np.nan, 1.]) >>> assert eq(na2v(a), np.array([1., 0.0, 1.])) >>> assert eq(na2v(a,1), np.array([1., 1., 1.])) - Parameters
- a: array/timeseries - new: float, optional - the value used to replace nan. The default is 0.0. - Returns
 - array/timeseries 
ffill/bfill¶
- 
pyg.timeseries._rolling.ffill(a, n=0, axis=0, data=None, state=None)¶
- returns a forward-filled array, filling up to n values forward. Supports state management. - Parameters
- a: array/timeseries
- array/timeseries 
- n: int, optional
- maximal number of values to fill forward 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- >>> a = np.array([np.nan,np.nan,1,np.nan,np.nan,2,np.nan,np.nan,np.nan]) >>> assert eq(ffill(a), np.array([np.nan,np.nan,1.,1.,1.,2.,2.,2.,2.])) 
- 
pyg.timeseries._rolling.bfill(a, n=- 1, axis=0)¶
- equivalent to a.fillna('bfill'). There is no state-aware version of this function, as it is forward looking. - Example
 - >>> from pyg import * >>> a = np.array([np.nan, 1., np.nan]) >>> b = np.array([1., 1., np.nan]) >>> assert eq(bfill(a), b) - Example
- pd.Series 
 - >>> ts = pd.Series(a, drange(-2)) >>> assert eq(bfill(ts).values, b) 
nona¶
- 
pyg.timeseries._ts.nona(a, value=nan)¶
- removes rows that are entirely nan (or a specific other value) - Parameters
- a: dataframe/ndarray - value: float, optional
- value to be removed. The default is np.nan. 
 - Example
 - >>> from pyg import * >>> a = np.array([1,np.nan,2,3]) >>> assert eq(nona(a), np.array([1,2,3])) - Example
- multiple columns 
 - >>> a = np.array([[1,np.nan,2,np.nan], [np.nan, np.nan, np.nan, 3]]).T >>> b = np.array([[1,2,np.nan], [np.nan, np.nan, 3]]).T ## 2nd row has nans across >>> assert eq(nona(a), b) 
expanding window functions¶
expanding_mean¶
- 
pyg.timeseries._expanding.expanding_mean(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().mean(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().mean(); ts = expanding_mean(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().mean(); ts = expanding_mean(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 1.562960 1.562960
>>> 1993-09-24 0.908910 0.908910
>>> 1993-09-25 0.846817 0.846817
>>> 1993-09-26 0.821423 0.821423
>>> 1993-09-27 0.821423 NaN
>>> ... ...
>>> 2021-02-03 0.870358 0.870358
>>> 2021-02-04 0.870358 NaN
>>> 2021-02-05 0.870358 NaN
>>> 2021-02-06 0.870358 NaN
>>> 2021-02-07 0.870353 0.870353
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_mean(a) >>> old_ts = expanding_mean_(old) >>> new_ts = expanding_mean(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_mean(dict(x = a, y = a**2)), dict(x = expanding_mean(a), y = expanding_mean(a**2))) >>> assert eq(expanding_mean([a,a**2]), [expanding_mean(a), expanding_mean(a**2)]) 
expanding_rms¶
- 
pyg.timeseries._expanding.expanding_rms(a, axis=0, data=None, state=None)¶
- equivalent to pandas (a**2).expanding().mean()**0.5). - works with np.arrays - handles nan without forward filling. - supports state parameters - Parameters
- a: array, pd.Series, pd.DataFrame, list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = (a**2).expanding().mean()**0.5; ts = expanding_rms(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = (a**2).expanding().mean()**0.5; ts = expanding_rms(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 0.160462 0.160462
>>> 1993-09-24 0.160462 NaN
>>> 1993-09-25 0.160462 NaN
>>> 1993-09-26 0.160462 NaN
>>> 1993-09-27 0.160462 NaN
>>> ... ...
>>> 2021-02-03 1.040346 1.040346
>>> 2021-02-04 1.040346 NaN
>>> 2021-02-05 1.040338 1.040338
>>> 2021-02-06 1.040337 1.040337
>>> 2021-02-07 1.040473 1.040473
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_rms(a) >>> old_ts = expanding_rms_(old) >>> new_ts = expanding_rms(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_rms(dict(x = a, y = a**2)), dict(x = expanding_rms(a), y = expanding_rms(a**2))) >>> assert eq(expanding_rms([a,a**2]), [expanding_rms(a), expanding_rms(a**2)]) 
expanding_std¶
- 
pyg.timeseries._expanding.expanding_std(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().std(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().std(); ts = expanding_std(a) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().std(); ts = expanding_std(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 NaN NaN
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 NaN NaN
>>> 1993-09-26 NaN NaN
>>> 1993-09-27 NaN NaN
>>> ... ...
>>> 2021-02-03 0.590448 0.590448
>>> 2021-02-04 0.590448 NaN
>>> 2021-02-05 0.590475 0.590475
>>> 2021-02-06 0.590475 NaN
>>> 2021-02-07 0.590411 0.590411
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_std(a) >>> old_ts = expanding_std_(old) >>> new_ts = expanding_std(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_std(dict(x = a, y = a**2)), dict(x = expanding_std(a), y = expanding_std(a**2))) >>> assert eq(expanding_std([a,a**2]), [expanding_std(a), expanding_std(a**2)]) 
expanding_sum¶
- 
pyg.timeseries._expanding.expanding_sum(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().sum(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().sum(); ts = expanding_sum(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().sum(); ts = expanding_sum(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 NaN NaN
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 0.645944 0.645944
>>> 1993-09-26 2.816321 2.816321
>>> 1993-09-27 2.816321 NaN
>>> ... ...
>>> 2021-02-03 3976.911348 3976.911348
>>> 2021-02-04 3976.911348 NaN
>>> 2021-02-05 3976.911348 NaN
>>> 2021-02-06 3976.911348 NaN
>>> 2021-02-07 3976.911348 NaN
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_sum(a) >>> old_ts = expanding_sum_(old) >>> new_ts = expanding_sum(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_sum(dict(x = a, y = a**2)), dict(x = expanding_sum(a), y = expanding_sum(a**2))) >>> assert eq(expanding_sum([a,a**2]), [expanding_sum(a), expanding_sum(a**2)]) 
expanding_skew¶
- 
pyg.timeseries._expanding.expanding_skew(a, bias=False, axis=0, data=None, state=None)¶
- the expanding version of skew; equivalent to a.expanding().skew(), which pandas does not implement - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_skew(a) >>> old_ts = expanding_skew_(old) >>> new_ts = expanding_skew(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_skew(dict(x = a, y = a**2)), dict(x = expanding_skew(a), y = expanding_skew(a**2))) >>> assert eq(expanding_skew([a,a**2]), [expanding_skew(a), expanding_skew(a**2)]) 
expanding_min¶
- 
pyg.timeseries._min.expanding_min(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().min(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().min(); ts = expanding_min(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().min(); ts = expanding_min(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 NaN NaN
>>> 1993-09-26 0.775176 0.775176
>>> 1993-09-27 0.691942 0.691942
>>> 1993-09-28 0.691942 NaN
>>> ... ...
>>> 2021-02-04 0.100099 0.100099
>>> 2021-02-05 0.100099 NaN
>>> 2021-02-06 0.100099 NaN
>>> 2021-02-07 0.100099 0.100099
>>> 2021-02-08 0.100099 0.100099
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_min(a) >>> old_ts = expanding_min_(old) >>> new_ts = expanding_min(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_min(dict(x = a, y = a**2)), dict(x = expanding_min(a), y = expanding_min(a**2))) >>> assert eq(expanding_min([a,a**2]), [expanding_min(a), expanding_min(a**2)]) 
expanding_max¶
- 
pyg.timeseries._max.expanding_max(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().max(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().max(); ts = expanding_max(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().max(); ts = expanding_max(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 NaN NaN
>>> 1993-09-26 0.875409 0.875409
>>> 1993-09-27 0.875409 NaN
>>> 1993-09-28 0.875409 NaN
>>> ... ...
>>> 2021-02-04 3.625858 3.625858
>>> 2021-02-05 3.625858 NaN
>>> 2021-02-06 3.625858 3.625858
>>> 2021-02-07 3.625858 NaN
>>> 2021-02-08 3.625858 NaN
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_max(a) >>> old_ts = expanding_max_(old) >>> new_ts = expanding_max(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_max(dict(x = a, y = a**2)), dict(x = expanding_max(a), y = expanding_max(a**2))) >>> assert eq(expanding_max([a,a**2]), [expanding_max(a), expanding_max(a**2)]) 
expanding_median¶
- 
pyg.timeseries._median.expanding_median(a, axis=0)¶
- equivalent to pandas a.expanding().median(). - works with np.arrays 
- handles nan without forward filling. 
- There is no state-aware version since this requires essentially the whole history to be stored. 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().median(); ts = expanding_median(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().median(); ts = expanding_median(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 1.562960 1.562960
>>> 1993-09-24 0.908910 0.908910
>>> 1993-09-25 0.846817 0.846817
>>> 1993-09-26 0.821423 0.821423
>>> 1993-09-27 0.821423 NaN
>>> ... ...
>>> 2021-02-03 0.870358 0.870358
>>> 2021-02-04 0.870358 NaN
>>> 2021-02-05 0.870358 NaN
>>> 2021-02-06 0.870358 NaN
>>> 2021-02-07 0.870353 0.870353
- Example
- dict/list inputs 
 - >>> assert eq(expanding_median(dict(x = a, y = a**2)), dict(x = expanding_median(a), y = expanding_median(a**2))) >>> assert eq(expanding_median([a,a**2]), [expanding_median(a), expanding_median(a**2)]) 
expanding_rank¶
- 
pyg.timeseries._rank.expanding_rank(a, axis=0)¶
- returns a rank of the current value within history, scaled to be -1 if it is the smallest and +1 if it is the largest - works on numpy arrays too - skips nan, no ffill - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
 - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series([1.,2., np.nan, 0.,4.,2.], drange(-5)) >>> rank = expanding_rank(a) >>> assert eq(rank, pd.Series([0, 1, np.nan, -1, 1, 0.25], drange(-5))) >>> # >>> # 2 is largest in [1,2] so goes to 1; >>> # 0 is smallest in [1,2,0] so goes to -1 etc. - Example
- numpy equivalent 
 - >>> assert eq(expanding_rank(a.values), expanding_rank(a).values) 
cumsum¶
- 
pyg.timeseries._expanding.cumsum(a, axis=0, data=None, state=None)¶
- equivalent to pandas a.expanding().sum(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.expanding().sum(); ts = expanding_sum(a) >>> assert eq(ts,panda) - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a[a<0.1] = np.nan
>>> panda = a.expanding().sum(); ts = expanding_sum(a)
>>> pd.concat([panda,ts], axis=1)
>>> 0 1
>>> 1993-09-23 NaN NaN
>>> 1993-09-24 NaN NaN
>>> 1993-09-25 0.645944 0.645944
>>> 1993-09-26 2.816321 2.816321
>>> 1993-09-27 2.816321 NaN
>>> ... ...
>>> 2021-02-03 3976.911348 3976.911348
>>> 2021-02-04 3976.911348 NaN
>>> 2021-02-05 3976.911348 NaN
>>> 2021-02-06 3976.911348 NaN
>>> 2021-02-07 3976.911348 NaN
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = expanding_sum(a) >>> old_ts = expanding_sum_(old) >>> new_ts = expanding_sum(new, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(expanding_sum(dict(x = a, y = a**2)), dict(x = expanding_sum(a), y = expanding_sum(a**2))) >>> assert eq(expanding_sum([a,a**2]), [expanding_sum(a), expanding_sum(a**2)]) 
cumprod¶
- 
pyg.timeseries._expanding.cumprod(a, axis=0, data=None, state=None)¶
- equivalent to pandas np.exp(np.log(a).expanding().sum()). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = 1 + pd.Series(np.random.normal(0.001,0.05,10000), drange(-9999)) >>> panda = np.exp(np.log(a).expanding().sum()); ts = cumprod(a) >>> assert abs(ts-panda).max() < 1e-10 - Example
- nan handling 
- Unlike pandas, timeseries does not forward fill the nans.
>>> a = 1 + pd.Series(np.random.normal(-0.01,0.05,100), drange(-99, 2020))
>>> a[a<0.975] = np.nan
>>> panda = np.exp(np.log(a).expanding().sum()); ts = cumprod(a)
>>> pd.concat([panda,ts], axis=1)
>>> 2019-09-24 1.037161 1.037161
>>> 2019-09-25 1.050378 1.050378
>>> 2019-09-26 1.158734 1.158734
>>> 2019-09-27 1.158734 NaN
>>> 2019-09-28 1.219402 1.219402
>>> ... ...
>>> 2019-12-28 4.032919 4.032919
>>> 2019-12-29 4.032919 NaN
>>> 2019-12-30 4.180120 4.180120
>>> 2019-12-31 4.180120 NaN
>>> 2020-01-01 4.244261 4.244261
- Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:50] >>> new = a.iloc[50:] >>> ts = cumprod(a) >>> old_ts = cumprod_(old) >>> new_ts = cumprod(new, **old_ts) >>> assert eq(new_ts, ts.iloc[50:]) - Example
- dict/list inputs 
 - >>> assert eq(cumprod(dict(x = a, y = a**2)), dict(x = cumprod(a), y = cumprod(a**2))) >>> assert eq(cumprod([a,a**2]), [cumprod(a), cumprod(a**2)]) 
rolling window functions¶
rolling_mean¶
- 
pyg.timeseries._rolling.rolling_mean(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).mean(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).mean(); ts = rolling_mean(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).mean(); ts = rolling_mean(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 6 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_mean(a,10) >>> old_ts = rolling_mean_(old,10) >>> new_ts = rolling_mean(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_mean(dict(x = a, y = a**2),10), dict(x = rolling_mean(a,10), y = rolling_mean(a**2,10))) >>> assert eq(rolling_mean([a,a**2],10), [rolling_mean(a,10), rolling_mean(a**2,10)]) 
rolling_rms¶
- 
pyg.timeseries._rolling.rolling_rms(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas (a**2).rolling(n).mean()**0.5. - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = (a**2).rolling(10).mean()**0.5; ts = rolling_rms(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = (a**2).rolling(10).mean()**0.5; ts = rolling_rms(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 6 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_rms(a,10) >>> old_ts = rolling_rms_(old,10) >>> new_ts = rolling_rms(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_rms(dict(x = a, y = a**2),10), dict(x = rolling_rms(a,10), y = rolling_rms(a**2,10))) >>> assert eq(rolling_rms([a,a**2],10), [rolling_rms(a,10), rolling_rms(a**2,10)]) 
rolling_std¶
- 
pyg.timeseries._rolling.rolling_std(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).std(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).std(); ts = rolling_std(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99.9% nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).std(); ts = rolling_std(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 2 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_std(a,10) >>> old_ts = rolling_std_(old,10) >>> new_ts = rolling_std(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_std(dict(x = a, y = a**2),10), dict(x = rolling_std(a,10), y = rolling_std(a**2,10))) >>> assert eq(rolling_std([a,a**2],10), [rolling_std(a,10), rolling_std(a**2,10)]) 
rolling_sum¶
- 
pyg.timeseries._rolling.rolling_sum(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).sum(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).sum(); ts = rolling_sum(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99.9% nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).sum(); ts = rolling_sum(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 2 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_sum(a,10) >>> old_ts = rolling_sum_(old,10) >>> new_ts = rolling_sum(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_sum(dict(x = a, y = a**2),10), dict(x = rolling_sum(a,10), y = rolling_sum(a**2,10))) >>> assert eq(rolling_sum([a,a**2],10), [rolling_sum(a,10), rolling_sum(a**2,10)]) 
rolling_skew¶
- 
pyg.timeseries._rolling.rolling_skew(a, n, bias=False, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).skew(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- bias:
- affects the skew calculation definition, see scipy documentation for details. 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).skew(); ts = rolling_skew(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99.9% nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).skew(); ts = rolling_skew(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 2 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_skew(a,10) >>> old_ts = rolling_skew_(old,10) >>> new_ts = rolling_skew(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_skew(dict(x = a, y = a**2),10), dict(x = rolling_skew(a,10), y = rolling_skew(a**2,10))) >>> assert eq(rolling_skew([a,a**2],10), [rolling_skew(a,10), rolling_skew(a**2,10)]) 
rolling_min¶
- 
pyg.timeseries._min.rolling_min(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).min(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
- >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).min(); ts = rolling_min(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).min(); ts = rolling_min(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 6 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_min(a,10) >>> old_ts = rolling_min_(old,10) >>> new_ts = rolling_min(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_min(dict(x = a, y = a**2),10), dict(x = rolling_min(a,10), y = rolling_min(a**2,10))) >>> assert eq(rolling_min([a,a**2],10), [rolling_min(a,10), rolling_min(a**2,10)]) 
rolling_max¶
- 
pyg.timeseries._max.rolling_max(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).max(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).max(); ts = rolling_max(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).max(); ts = rolling_max(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') >>> #original: 4534 timeseries: 4525 panda: 6 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_max(a,10) >>> old_ts = rolling_max_(old,10) >>> new_ts = rolling_max(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_max(dict(x = a, y = a**2),10), dict(x = rolling_max(a,10), y = rolling_max(a**2,10))) >>> assert eq(rolling_max([a,a**2],10), [rolling_max(a,10), rolling_max(a**2,10)]) 
rolling_median¶
- 
pyg.timeseries._median.rolling_median(a, n, axis=0, data=None, state=None)¶
- equivalent to pandas a.rolling(n).median(). - works with np.arrays 
- handles nan without forward filling. 
- supports state parameters 
 - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- size of rolling window 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- agreement with pandas 
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> panda = a.rolling(10).median(); ts = rolling_median(a,10) >>> assert abs(ts-panda).max()<1e-10 - Example
- nan handling 
 - Unlike pandas, timeseries does not include the nans in the rolling calculation: it skips them. Since pandas rolling engine does not skip nans, they propagate. In fact, having removed half the data points, rolling(10) will return 99% of nans - >>> a[a<0.1] = np.nan >>> panda = a.rolling(10).median(); ts = rolling_median(a,10) >>> print('#original:', len(nona(a)), 'timeseries:', len(nona(ts)), 'panda:', len(nona(panda)), 'data points') #original: 4634 timeseries: 4625 panda: 4 data points - Example
- state management 
 - One can split the calculation and run old and new data separately. - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> ts = rolling_median(a,10) >>> old_ts = rolling_median_(old,10) >>> new_ts = rolling_median(new, 10, **old_ts) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- dict/list inputs 
 - >>> assert eq(rolling_median(dict(x = a, y = a**2),10), dict(x = rolling_median(a,10), y = rolling_median(a**2,10))) >>> assert eq(rolling_median([a,a**2],10), [rolling_median(a,10), rolling_median(a**2,10)]) 
rolling_quantile¶
- 
pyg.timeseries._stride.rolling_quantile(a, n, quantile=0.5, axis=0, data=None, state=None)¶
- equivalent to a.rolling(n).quantile(q) except… - supports numpy arrays - supports multiple q values - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> res = rolling_quantile(a, 100, 0.3) >>> assert sub_(res, a.rolling(100).quantile(0.3)).max() < 1e-13 - Example
- multiple quantiles 
 - >>> res = rolling_quantile(a, 100, [0.3, 0.5, 0.75]) >>> assert abs(res[0.3] - a.rolling(100).quantile(0.3)).max() < 1e-13 - Example
- state management 
 - >>> res = rolling_quantile(a, 100, 0.3) >>> old = rolling_quantile_(a.iloc[:2000], 100, 0.3) >>> new = rolling_quantile(a.iloc[2000:], 100, 0.3, **old) >>> both = pd.concat([old.data, new]) >>> assert eq(both, res) - Parameters
- a: array/timeseries - n: integer - window size. - q: float or list of floats in [0,1]
- quantile(s). 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Returns
 - timeseries/array of quantile(s) 
rolling_rank¶
- 
pyg.timeseries._rank.rolling_rank(a, n, axis=0, data=None, state=None)¶
- returns a rank of the current value within a given window, scaled to be -1 if it is the smallest and +1 if it is the largest - works on numpy arrays too - skips nan, no ffill - Parameters
- a: array, pd.Series, pd.DataFrame or list/dict of these
- timeseries 
- n: int
- window size 
- axis: int, optional
- 0/1/-1. The default is 0. 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series([1.,2., np.nan, 0., 4., 2., 3., 1., 2.], drange(-8)) >>> rank = rolling_rank(a, 3) >>> assert eq(rank.values, np.array([np.nan, np.nan, np.nan, -1, 1, 0, 0, -1, 0])) >>> # 0 is smallest in [1,2,0] so goes to -1 >>> # 4 is largest in [2,0,4] so goes to +1 >>> # 2 is middle of [0,4,2] so goes to 0 - Example
- numpy equivalent 
 - >>> assert eq(rolling_rank(a.values, 10), rolling_rank(a, 10).values) - Example
- state management 
 - >>> a = np.random.normal(0,1,10000) >>> old = rolling_rank_(a[:5000], 10) # grab both data and state >>> new = rolling_rank(a[5000:], 10, **old) >>> assert eq(np.concatenate([old.data,new]), rolling_rank(a, 10)) 
exponentially weighted moving functions¶
ewma¶
- 
pyg.timeseries._ewm.ewma(a, n, time=None, axis=0, data=None, state=None)¶
- ewma is equivalent to a.ewm(n).mean() but with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a: array/timeseries - n: int/fraction - the number of days (or a ratio) to scale the history - time: Calendar, 'b/d/y/m' or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit.
- if we have intraday data and set time = 'd', then 
- the ewm calculation retained is the one run on the last observation of each day; 
- the ewm calculation on each intraday observation is the same as ewm(past EOD history + current intraday observation). 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> ts = ewma(a,10); df = a.ewm(10).mean() >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewma(a.values, 10), ewma(a,10).values) - Example
- nan handling 
- >>> a[a.values<0.1] = np.nan
>>> ts = ewma(a,10, time = 'i'); df = a.ewm(10).mean()  # note: pandas assumes time passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
>>> pd.concat([ts,df], axis=1)
>>> 0 1
>>> 1993-09-24 0.263875 0.263875
>>> 1993-09-25 NaN 0.263875
>>> 1993-09-26 NaN 0.263875
>>> 1993-09-27 NaN 0.263875
>>> 1993-09-28 NaN 0.263875
>>> ... ...
>>> 2021-02-04 NaN 0.786506
>>> 2021-02-05 0.928817 0.928817
>>> 2021-02-06 NaN 0.928817
>>> 2021-02-07 0.839168 0.839168
>>> 2021-02-08 0.831109 0.831109
- Example
- state management 
 - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> old_ts = ewma_(old, 10) >>> new_ts = ewma(new, 10, **old_ts) # instantiation with previous ewma >>> ts = ewma(a,10) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- Support for time & clock 
 - >>> daily = a >>> monthly = daily.resample('M').last() >>> m_ts = ewma(monthly, 3) ## 3-month ewma run on monthly data >>> d_ts = ewma(daily, 3, 'm') ## 3-month ewma run on daily data >>> daily_resampled_to_month = d_ts.resample('M').last() >>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10 - So you can run a 3-monthly ewma on daily, where within month, most recent value is used with the EOM history. - Example
- Support for dict/list of arrays 
 - >>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> a = dict(x = x, y = y) >>> assert eq(ewma(dict(x=x, y=y),10), dict(x=ewma(x,10), y=ewma(y,10))) >>> assert eq(ewma([x,y],10), [ewma(x,10), ewma(y,10)]) - Returns
 - an array/timeseries of ewma 
ewmrms¶
- 
pyg.timeseries._ewm.ewmrms(a, n, time=None, axis=0, data=None, state=None)¶
- ewmrms is equivalent to (a**2).ewm(n).mean()**0.5 but with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a: array/timeseries - n: int/fraction - the number of days (or a ratio) to scale the history - time: Calendar, 'b/d/y/m' or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, converging to the last observation in each time unit.
- if we have intraday data and set time = 'd', then 
- the ewm calculation retained is the one run on the last observation of each day; 
- the ewm calculation on each intraday observation is the same as ewm(past EOD history + current intraday observation). 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> ts = ewmrms(a,10); df = (a**2).ewm(10).mean()**0.5 >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewmrms(a.values, 10), ewmrms(a,10).values) - Example
- nan handling 
- >>> a[a.values<0.1] = np.nan
>>> ts = ewmrms(a,10, time = 'i'); df = (a**2).ewm(10).mean()**0.5  # note: pandas assumes time passes per index entry, even if the value is nan
>>> assert abs(ts-df).max()<1e-10
>>> pd.concat([ts,df], axis=1)
>>> 0 1
>>> 1993-09-24 0.263875 0.263875
>>> 1993-09-25 NaN 0.263875
>>> 1993-09-26 NaN 0.263875
>>> 1993-09-27 NaN 0.263875
>>> 1993-09-28 NaN 0.263875
>>> ... ...
>>> 2021-02-04 NaN 0.786506
>>> 2021-02-05 0.928817 0.928817
>>> 2021-02-06 NaN 0.928817
>>> 2021-02-07 0.839168 0.839168
>>> 2021-02-08 0.831109 0.831109
- Example
- state management 
 - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> old_ts = ewmrms_(old, 10) >>> new_ts = ewmrms(new, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmrms(a,10) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- Support for time & clock 
 - >>> daily = a >>> monthly = daily.resample('M').last() >>> m_ts = ewmrms(monthly, 3) ## 3-month ewma run on monthly data >>> d_ts = ewmrms(daily, 3, 'm') ## 3-month ewma run on daily data >>> daily_resampled_to_month = d_ts.resample('M').last() >>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10 - So you can run a 3-monthly ewma on daily, where within month, most recent value is used with the EOM history. - Example
- Support for dict/list of arrays 
 - >>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> a = dict(x = x, y = y) >>> assert eq(ewmrms(dict(x=x, y=y),10), dict(x=ewmrms(x,10), y=ewmrms(y,10))) >>> assert eq(ewmrms([x,y],10), [ewmrms(x,10), ewmrms(y,10)]) - Returns
- an array/timeseries of ewmrms 
ewmstd¶
- 
pyg.timeseries._ewm.ewmstd(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, state=None)¶
- ewmstd is equivalent to a.ewm(n).std() but with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a: array/timeseries - n: int/fraction - the number of days (or a ratio) to scale the history - time: Calendar, 'b/d/y/m' or a timeseries of time (use clock(a) to see output)
- If time parameter is provided, we allow multiple observations per unit of time. i.e., converging to the last observation in time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> ts = ewmstd(a,10); df = a.ewm(10).std() >>> assert abs(ts-df).max()<1e-10 >>> ts = ewmstd(a,10, bias = True); df = a.ewm(10).std(bias = True) >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewmstd(a.values, 10), ewmstd(a,10).values) - Example
- nan handling 
 - >>> a[a.values<-0.1] = np.nan >>> ts = ewmstd(a,10, time = 'i'); df = a.ewm(10).std() # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 >>> ts = ewmstd(a,10, time = 'i', bias = True); df = a.ewm(10).std(bias = True) # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 - Example
- state management 
 - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> old_ts = ewmstd_(old, 10) >>> new_ts = ewmstd(new, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmstd(a,10) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- Support for time & clock 
 - >>> daily = a >>> monthly = daily.resample('M').last() >>> m_ts = ewmstd(monthly, 3) ## 3-month ewmstd run on monthly data >>> d_ts = ewmstd(daily, 3, 'm') ## 3-month ewmstd run on daily data >>> daily_resampled_to_month = d_ts.resample('M').last() >>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10 - So you can run a 3-month ewmstd on daily data: within each month, the most recent value is combined with the end-of-month (EOM) history. - Example
- Support for dict/list of arrays 
 - >>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> a = dict(x = x, y = y) >>> assert eq(ewmstd(dict(x=x, y=y),10), dict(x=ewmstd(x,10), y=ewmstd(y,10))) >>> assert eq(ewmstd([x,y],10), [ewmstd(x,10), ewmstd(y,10)]) - Returns
 - an array/timeseries of ewmstd values 
ewmvar¶
- 
pyg.timeseries._ewm.ewmvar(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, state=None)¶
- ewmvar is equivalent to a.ewm(n).var() but with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a : array/timeseries 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> ts = ewmvar(a,10); df = a.ewm(10).var() >>> assert abs(ts-df).max()<1e-10 >>> ts = ewmvar(a,10, bias = True); df = a.ewm(10).var(bias = True) >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewmvar(a.values, 10), ewmvar(a,10).values) - Example
- nan handling 
 - >>> a[a.values<-0.1] = np.nan >>> ts = ewmvar(a,10, time = 'i'); df = a.ewm(10).var() # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 >>> ts = ewmvar(a,10, time = 'i', bias = True); df = a.ewm(10).var(bias = True) # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 - Example
- state management 
 - >>> old = a.iloc[:5000] >>> new = a.iloc[5000:] >>> old_ts = ewmvar_(old, 10) >>> new_ts = ewmvar(new, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmvar(a,10) >>> assert eq(new_ts, ts.iloc[5000:]) - Example
- Support for time & clock 
 - >>> daily = a >>> monthly = daily.resample('M').last() >>> m_ts = ewmvar(monthly, 3) ## 3-month ewmvar run on monthly data >>> d_ts = ewmvar(daily, 3, 'm') ## 3-month ewmvar run on daily data >>> daily_resampled_to_month = d_ts.resample('M').last() >>> assert abs(daily_resampled_to_month - m_ts).max() < 1e-10 - So you can run a 3-month ewmvar on daily data: within each month, the most recent value is combined with the end-of-month (EOM) history. - Example
- Support for dict/list of arrays 
 - >>> x = pd.Series(np.random.normal(0,1,1000), drange(-999)); y = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> a = dict(x = x, y = y) >>> assert eq(ewmvar(dict(x=x, y=y),10), dict(x=ewmvar(x,10), y=ewmvar(y,10))) >>> assert eq(ewmvar([x,y],10), [ewmvar(x,10), ewmvar(y,10)]) - Returns
 - an array/timeseries of ewmvar values 
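- As a sanity sketch (assuming, as described above, that ewmvar is the squared counterpart of ewmstd with the same bias default), the two should agree up to a square root:
 - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> assert abs(ewmstd(a, 10) - ewmvar(a, 10)**0.5).max() < 1e-10 ## sketch: std should be the square root of var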
ewmcor¶
- 
pyg.timeseries._ewm.ewmcor(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, data=None, state=None)¶
- calculates pair-wise correlation between a and b. - Parameters
- a : array/timeseries 
- b : array/timeseries 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- min_sample: float, optional
- minimum weight of observations before we return a reading. The default is 0.25. This ensures that we don’t get silly numbers due to a small population. 
- bias: bool, optional
- vol estimation for a and b should really be unbiased. Nevertheless, we track pandas and set bias = True as a default. 
- axis: int, optional
- axis of calculation. The default is 0. 
- data: placeholder, ignore, optional
- ignore. The default is None. 
- state: dict, optional
- Output from a previous run of ewmcor. The default is None. 
 - Example
- matching pandas 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = pd.Series(np.random.normal(0,1,9000), drange(-8999)) >>> ts = ewmcor(a, b, n = 10); df = a.ewm(10).corr(b) >>> assert abs(ts-df).max()<1e-10 - Example
- numpy arrays support 
 - >>> assert eq(ewmcor(a.values, b.values, 10), ewmcor(a, b, 10).values) - Example
- nan handling 
 - >>> a[a.values<-0.1] = np.nan >>> ts = ewmcor(a, b, 10, time = 'i'); df = a.ewm(10).corr(b) # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 - Example
- state management 
 - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> old_a = a.iloc[:5000]; old_b = b.iloc[:5000] >>> new_a = a.iloc[5000:]; new_b = b.iloc[5000:] >>> old_ts = ewmcor_(old_a, old_b, 10) >>> new_ts = ewmcor(new_a, new_b, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmcor(a,b,10) >>> assert eq(new_ts, ts.iloc[5000:]) 
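- As a rough sanity sketch of the definition (the correlation of a series with itself should be 1, and with its negative -1):
 - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> assert abs(ewmcor(a, a, 10).iloc[-1] - 1) < 1e-10 ## sanity sketch >>> assert abs(ewmcor(a, -a, 10).iloc[-1] + 1) < 1e-10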
ewmLR¶
- 
pyg.timeseries._ewm.ewmLR(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, c=None, m=None, state=None)¶
- calculates pair-wise linear regression between a and b. We have a and b for which we want to fit: - >>> b_i = c + m a_i >>> LSE(c,m) = \sum w_i (c + m a_i - b_i)^2 >>> dLSE/dc = 0 <==> \sum w_i (c + m a_i - b_i) = 0 [1] >>> dLSE/dm = 0 <==> \sum w_i a_i (c + m a_i - b_i) = 0 [2] - >>> c + m E(a) = E(b) [1] >>> c E(a) + m E(a^2) = E(ab) [2] - >>> c E(a) + m E(a)^2 = E(a)E(b) [1] * E(a) >>> m (E(a^2) - E(a)^2) = E(ab) - E(a)E(b) >>> m = covar(a,b)/var(a) >>> c = E(b) - m E(a) - Parameters
- a : array/timeseries 
- b : array/timeseries 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- min_sample: float, optional
- minimum weight of observations before we return a reading. The default is 0.25. This ensures that we don’t get silly numbers due to a small population. 
- bias: bool, optional
- vol estimation for a and b should really be unbiased. Nevertheless, we track pandas and set bias = True as a default. 
- axis: int, optional
- axis of calculation. The default is 0. 
- c, m: placeholders, ignore, optional
- ignore. The default is None. 
- state: dict, optional
- Output from a previous run of ewmLR. The default is None. 
 - Example
- numpy arrays support 
 - >>> assert eq(ewmLR(a.values, b.values, 10), ewmLR(a, b, 10).values) - Example
- nan handling 
 - >>> a[a.values<-0.1] = np.nan >>> ts = ewmcor(a, b, 10, time = 'i'); df = a.ewm(10).corr(b) # note: pandas assumes, 'time' pass per index entry, even if value is nan >>> assert abs(ts-df).max()<1e-10 - Example
- state management 
 - >>> from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> old_a = a.iloc[:5000]; old_b = b.iloc[:5000] >>> new_a = a.iloc[5000:]; new_b = b.iloc[5000:] >>> old_ts = ewmLR_(old_a, old_b, 10) >>> new_ts = ewmLR(new_a, new_b, 10, **old_ts) # instantiation with previous ewma >>> ts = ewmLR(a,b,10) >>> assert eq(new_ts.c, ts.c.iloc[5000:]) >>> assert eq(new_ts.m, ts.m.iloc[5000:]) - Example
 - >>> from pyg import * >>> a0 = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> a1 = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = (a0 - a1) + pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> a = pd.concat([a0,a1], axis=1) >>> LR = ewmLR(a,b,50) >>> assert abs(LR.m.mean()[0]-1)<0.5 >>> assert abs(LR.m.mean()[1]+1)<0.5 
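- A single-regressor sketch of the closed form above (m = covar(a,b)/var(a), c = E(b) - m E(a)): fitting b = 2a + 1 + noise should recover the slope and intercept on average (the 0.5 tolerance is illustrative, not part of the original docstring):
 - >>> from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> b = 2 * a + 1 + pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> LR = ewmLR(a, b, 50) >>> assert abs(LR.m.mean() - 2) < 0.5 ## illustrative tolerance >>> assert abs(LR.c.mean() - 1) < 0.5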
ewmGLM¶
- 
pyg.timeseries._ewm.ewmGLM(a, b, n, time=None, min_sample=0.25, bias=True, data=None, state=None)¶
- Calculates a General Linear Model fitting b to a. - Parameters
- a : a 2-d array/pd.DataFrame of values used to fit b 
- b : a 1-d array/pd.Series 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- min_sample: float, optional
- minimum weight of observations before we return the fitting. The default is 0.25. This ensures that we don’t get silly numbers due to a small population. 
- data: placeholder, ignore, optional
- ignore. The default is None. 
- state: dict, optional
- Output from a previous run of ewmGLM. The default is None. 
 - Theory
 - See https://en.wikipedia.org/wiki/Generalized_linear_model for full details. Briefly, we assume b is a single column while a is multi-column. We minimize the least square error (LSE) fitting: - >>> b[i] = \sum_j m_j a_j[i] >>> LSE(m) = \sum_i w_i (b[i] - \sum_j m_j * a_j[i])^2 - >>> dLSE/dm_k = 0 >>> <==> \sum_i w_i (b[i] - \sum_j m_j * a_j[i]) a_k[i] = 0 >>> <==> E(b*a_k) = m_k E(a_k^2) + \sum_{j<>k} m_j E(a_j a_k) - E is the expectation under the weights w, and we can rewrite this as: - >>> a2 x m = ab ## matrix multiplication >>> a2[i,j] = E(a_i * a_j) >>> ab[j] = E(a_j * b) >>> m = a2.inverse x ab ## matrix multiplication - Example
- simple fit 
 - >>> from pyg import * >>> a = pd.DataFrame(np.random.normal(0,1,(10000,10)), drange(-9999)) >>> true_m = np.random.normal(1,1,10) >>> noise = np.random.normal(0,1,10000) >>> b = (a * true_m).sum(axis = 1) + noise - >>> fitted_m = ewmGLM(a, b, 50) 
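- As a rough check (a sketch, assuming the result is a timeseries of the fitted coefficients, one column per column of a), the time-average of the fit should sit close to true_m:
 - >>> assert abs(fitted_m.mean().values - true_m).max() < 0.5 ## illustrative tolerance, assuming a DataFrame of coefficients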
ewmskew¶
- 
pyg.timeseries._ewm.ewmskew(a, n, time=None, bias=False, min_sample=0.25, axis=0, data=None, state=None)¶
- ewmskew calculates an exponentially weighted moving skew of a (pandas offers no a.ewm(n).skew() equivalent), with… - supports np.ndarrays as well as timeseries - handles nan by skipping them - allows state-management - ability to supply a ‘clock’ to the calculation - Parameters
- a : array/timeseries 
- n : int/fraction - The number of days (or a ratio) used to scale the history 
- time : Calendar, ‘b/d/y/m’ or a timeseries of time (use clock(a) to see output)
- If the time parameter is provided, we allow multiple observations per unit of time, i.e. the calculation converges to the last observation within each time unit.
- if we have intraday data, and set time = ‘d’, then 
- the ewm calculation on last observations per day is what is retained. 
- the ewm calculation on each intraday observation is same as an ewm(past EOD + current intraday observation) 
 
 
- data: None.
- unused at the moment. Allow code such as func(live, **func_(history)) to work 
- state: dict, optional
- state parameters used to instantiate the internal calculations, based on history prior to ‘a’ provided. 
 - Example
- matching pandas & state management 
 - >>> import pandas as pd; import numpy as np; from pyg import * >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) - >>> old = a.iloc[:10] >>> new = a.iloc[10:] >>> for f in [ewma_, ewmstd_, ewmrms_, ewmskew_]: >>>     both = f(a, 3) >>>     o = f(old, 3) >>>     n = f(new, 3, **o) >>>     assert eq(o.data, both.data.iloc[:10]) >>>     assert eq(n.data, both.data.iloc[10:]) >>>     assert both - 'data' == n - 'data' - >>> assert abs(a.ewm(10).mean() - ewma(a,10)).max() < 1e-14 >>> assert abs(a.ewm(10).std() - ewmstd(a,10)).max() < 1e-14 - Example
- numpy arrays support 
 - >>> assert eq(ewmskew(a.values, 10), ewmskew(a,10).values) - Example
- nan handling 
 - while pandas ffills values, pyg.timeseries skips nans: - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> a[a.values>0.1] = np.nan >>> ts = ewma(a,10) >>> assert eq(ts[~np.isnan(ts)], ewma(a[~np.isnan(a)], 10)) - Example
- initiating the ewma with past state 
 - >>> old = np.random.normal(0,1,100) >>> new = np.random.normal(0,1,100) >>> old_ = ewma_(old, 10) >>> new_ = ewma(new, 10, **old_) # instantiation with the state of the previous ewma run >>> new_2 = ewma(np.concatenate([old,new]), 10)[-100:] >>> assert eq(new_, new_2) - Example
- Support for time & clock 
 - >>> daily = pd.Series(np.random.normal(0,1,10000), drange(-9999)).cumsum() >>> monthly = daily.resample('M').last() >>> m = ewma(monthly, 3) ## 3-month ewma run on monthly data >>> d = ewma(daily, 3, 'm') ## 3-month ewma run on daily data >>> daily_resampled_to_month = d.resample('M').last() >>> assert abs(daily_resampled_to_month - m).max() < 1e-10 - So you can run a 3-month ewma on daily data: within each month, the most recent value is combined with the end-of-month (EOM) history. - Returns
 - an array/timeseries of ewmskew values 
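- As a rough sanity sketch of the output: on symmetric (normal) data the skew should hover near zero, while on right-skewed (lognormal) data it should be positive (the thresholds are illustrative):
 - >>> a = pd.Series(np.random.normal(0,1,10000), drange(-9999)) >>> assert abs(ewmskew(a, 100).iloc[-1]) < 1 ## illustrative threshold >>> b = pd.Series(np.random.lognormal(0,1,10000), drange(-9999)) >>> assert ewmskew(b, 100).iloc[-1] > 0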
functions exposing their state¶
simple functions¶
- 
pyg.timeseries._rolling.diff_(a, n=1, axis=0, data=None, instate=None)¶
- Equivalent to diff(a, n) but returns the full state. See diff for full details 
- 
pyg.timeseries._rolling.shift_(a, n=1, axis=0, instate=None)¶
- Equivalent to shift(a,n) but returns the full state. See shift for full details 
- 
pyg.timeseries._rolling.ratio_(a, n=1, data=None, instate=None)¶
- Equivalent to ratio(a, n) but returns the full state. See ratio for full documentation 
- 
pyg.timeseries._ts.ts_count_(a, axis=0, data=None, instate=None)¶
- ts_count_(a) is equivalent to ts_count(a) except vec is also returned. See ts_count for full documentation 
- 
pyg.timeseries._ts.ts_sum_(a, axis=0, data=None, instate=None)¶
- ts_sum_(a) is equivalent to ts_sum(a) except vec is also returned. See ts_sum for full documentation 
- 
pyg.timeseries._ts.ts_mean_(a, axis=0, data=None, instate=None)¶
- ts_mean_(a) is equivalent to ts_mean(a) except vec is also returned. See ts_mean for full documentation 
- 
pyg.timeseries._ts.ts_rms_(a, axis=0, data=None, instate=None)¶
- ts_rms_(a) is equivalent to ts_rms(a) except it also returns vec see ts_rms for full documentation 
- 
pyg.timeseries._ts.ts_std_(a, axis=0, data=None, instate=None)¶
- ts_std_(a) is equivalent to ts_std(a) except vec is also returned. See ts_std for full documentation 
- 
pyg.timeseries._ts.ts_skew_(a, bias=False, min_sample=0.25, axis=0, data=None, instate=None)¶
- ts_skew_(a) is equivalent to ts_skew except vec is also returned. See ts_skew for full details 
- 
pyg.timeseries._ts.ts_min_(a, axis=0, data=None, instate=None)¶
- ts_min_(a) is equivalent to ts_min(a) except vec is also returned; ts_min(a) is equivalent to pandas a.min() 
- 
pyg.timeseries._ts.ts_max_(a, axis=0, data=None, instate=None)¶
- ts_max_(a) is equivalent to ts_max(a) except vec is also returned; ts_max(a) is equivalent to pandas a.max() 
- 
pyg.timeseries._rolling.ffill_(a, n=0, axis=0, instate=None)¶
- Returns a forward filled array, up to n values forward. Supports state management. 
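- These underscored functions all follow the same pattern: run func_ on history to capture the state, then feed that state into func for the live portion. A minimal sketch with diff_/diff:
 - >>> from pyg import *; import pandas as pd; import numpy as np >>> a = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> old = a.iloc[:900]; new = a.iloc[900:] >>> old_ = diff_(old) ## full-state run on history >>> live = diff(new, **old_) ## the documented func(live, **func_(history)) pattern >>> assert eq(live, diff(a).iloc[900:])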
expanding window functions¶
- 
pyg.timeseries._expanding.expanding_mean_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_mean(a) but returns also the state variables. For full documentation, look at expanding_mean.__doc__ 
- 
pyg.timeseries._expanding.expanding_rms_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_rms(a) but returns also the state variables. For full documentation, look at expanding_rms.__doc__ 
- 
pyg.timeseries._expanding.expanding_std_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_std(a) but returns also the state variables. For full documentation, look at expanding_std.__doc__ 
- 
pyg.timeseries._expanding.expanding_sum_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_sum(a) but returns also the state variables. For full documentation, look at expanding_sum.__doc__ 
- 
pyg.timeseries._expanding.expanding_skew_(a, bias=False, axis=0, data=None, instate=None)¶
- Equivalent to expanding_skew(a) but returns also the state variables. For full documentation, look at expanding_skew.__doc__ 
- 
pyg.timeseries._min.expanding_min_(a, axis=0, data=None, instate=None)¶
- Equivalent to a.expanding().min() but returns the full state, i.e. both data (the expanding().min()) and m (the current minimum) 
- 
pyg.timeseries._max.expanding_max_(a, axis=0, data=None, instate=None)¶
- Equivalent to a.expanding().max() but returns the full state, i.e. both data (the expanding().max()) and m (the current maximum) 
- 
pyg.timeseries._expanding.cumsum_(a, axis=0, data=None, instate=None)¶
- Equivalent to expanding_sum(a) but returns also the state variables. For full documentation, look at expanding_sum.__doc__ 
- 
pyg.timeseries._expanding.cumprod_(a, axis=0, data=None, instate=None)¶
- Equivalent to cumprod(a) but returns also the state variable. For full documentation, look at cumprod.__doc__ 
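- A sketch of what the exposed state looks like for expanding_max_ (assuming, as described above, that the result carries both data and the running maximum m):
 - >>> from pyg import * >>> a = pd.Series(np.random.normal(0,1,1000), drange(-999)) >>> res = expanding_max_(a) >>> assert eq(res.data, a.expanding().max()) ## assumes res exposes .data and the running max .m >>> assert res.m == a.max()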
rolling window functions¶
- 
pyg.timeseries._rolling.rolling_mean_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_mean(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_mean.__doc__ 
- 
pyg.timeseries._rolling.rolling_rms_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_rms(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_rms.__doc__ 
- 
pyg.timeseries._rolling.rolling_std_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_std(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_std.__doc__ 
- 
pyg.timeseries._rolling.rolling_sum_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_sum(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_sum.__doc__ 
- 
pyg.timeseries._rolling.rolling_skew_(a, n, bias=False, axis=0, data=None, instate=None)¶
- Equivalent to rolling_skew(a) but returns also the state variables t0,t1 etc. For full documentation, look at rolling_skew.__doc__ 
- 
pyg.timeseries._min.rolling_min_(a, n, vec=None, axis=0, data=None, instate=None)¶
- Equivalent to rolling_min(a) but returns also the state. For full documentation, look at rolling_min.__doc__ 
- 
pyg.timeseries._max.rolling_max_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_max(a) but returns also the state. For full documentation, look at rolling_max.__doc__ 
- 
pyg.timeseries._median.rolling_median_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_median(a) but returns also the state. For full documentation, look at rolling_median.__doc__ 
- 
pyg.timeseries._rank.rolling_rank_(a, n, axis=0, data=None, instate=None)¶
- Equivalent to rolling_rank(a) but returns also the state variables. For full documentation, look at rolling_rank.__doc__ 
- 
pyg.timeseries._stride.rolling_quantile_(a, n, quantile=0.5, axis=0, data=None, instate=None)¶
- Equivalent to rolling_quantile(a) but returns also the state. For full documentation, look at rolling_quantile.__doc__ 
exponentially weighted moving functions¶
- 
pyg.timeseries._ewm.ewma_(a, n, time=None, data=None, instate=None)¶
- Equivalent to ewma but returns a state parameter for instantiation of later calculations. See ewma documentation for more details 
- 
pyg.timeseries._ewm.ewmrms_(a, n, time=None, axis=0, data=None, instate=None)¶
- Equivalent to ewmrms but returns a state parameter for instantiation of later calculations. See ewmrms documentation for more details 
- 
pyg.timeseries._ewm.ewmstd_(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, instate=None)¶
- Equivalent to ewmstd but returns a state parameter for instantiation of later calculations. See ewmstd documentation for more details 
- 
pyg.timeseries._ewm.ewmvar_(a, n, time=None, min_sample=0.25, bias=False, axis=0, data=None, instate=None)¶
- Equivalent to ewmvar but returns a state parameter for instantiation of later calculations. See ewmvar documentation for more details 
- 
pyg.timeseries._ewm.ewmcor_(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, data=None, instate=None)¶
- Equivalent to ewmcor but returns a state parameter for instantiation of later calculations. See ewmcor documentation for more details 
- 
pyg.timeseries._ewm.ewmLR_(a, b, n, time=None, min_sample=0.25, bias=True, axis=0, c=None, m=None, instate=None)¶
- Equivalent to ewmLR but returns a state parameter for instantiation of later calculations. See ewmLR documentation for more details 
- 
pyg.timeseries._ewm.ewmGLM_(a, b, n, time=None, min_sample=0.25, bias=True, data=None, instate=None)¶
- Equivalent to ewmGLM but returns a state parameter for instantiation of later calculations. See ewmGLM documentation for more details 
- 
pyg.timeseries._ewm.ewmskew_(a, n, time=None, bias=False, min_sample=0.25, axis=0, data=None, instate=None)¶
- Equivalent to ewmskew but returns a state parameter for instantiation of later calculations. See ewmskew documentation for more details 
Index handling¶
df_fillna¶
- 
pyg.timeseries._index.df_fillna(df, method=None, axis=0, limit=None)¶
- Equivalent to df.fillna() except: - supports np.ndarray as well as dataframes 
- supports multiple methods of filling/interpolation 
- supports removal of nans from the start of, or from all of, the timeseries 
- supports action on multiple timeseries 
 - Parameters
 - df : dataframe/numpy array - method: string, list of strings or None, optional
- Either a fill method (bfill, ffill, pad), or an interpolation method: ‘linear’, ‘time’, ‘index’, ‘values’, ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘barycentric’, ‘krogh’, ‘spline’, ‘polynomial’, ‘from_derivatives’, ‘piecewise_polynomial’, ‘pchip’, ‘akima’, ‘cubicspline’, or ‘fnna’: removes all values up to the first non-nan, or ‘nona’: removes all nans 
- axisint, optional
- axis. The default is 0. 
- limit: int, optional
- when filling, how many nans get filled. The default is None (indefinite) 
 - Example
- method ffill or bfill 
 - >>> from pyg import *; import numpy as np >>> df = np.array([np.nan, 1., np.nan, 9, np.nan, 25]) >>> assert eq(df_fillna(df, 'ffill'), np.array([ np.nan, 1., 1., 9., 9., 25.])) >>> assert eq(df_fillna(df, ['ffill','bfill']), np.array([ 1., 1., 1., 9., 9., 25.])) - >>> df = np.array([np.nan, 1., np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 9, np.nan, 25]) >>> assert eq(df_fillna(df, 'ffill', limit = 2), np.array([np.nan, 1., 1., 1., np.nan, np.nan, np.nan, np.nan, 9., 9., 25.])) - df_fillna does not maintain state of latest ‘prev’ value: use ffill_ for that. - Example
- interpolation methods 
 - >>> from pyg import *; import numpy as np >>> df = np.array([np.nan, 1., np.nan, 9, np.nan, 25]) >>> assert eq(df_fillna(df, 'linear'), np.array([ np.nan, 1., 5., 9., 17., 25.])) >>> assert eq(df_fillna(df, 'quadratic'), np.array([ np.nan, 1., 4., 9., 16., 25.])) - Example
- method = fnna and nona 
 - >>> from pyg import *; import numpy as np >>> ts = np.array([np.nan] * 10 + [1.] * 10 + [np.nan]) >>> assert eq(df_fillna(ts, 'fnna'), np.array([1.]*10 + [np.nan])) >>> assert eq(df_fillna(ts, 'nona'), np.array([1.]*10)) - >>> assert len(df_fillna(np.array([np.nan]), 'nona')) == 0 >>> assert len(df_fillna(np.array([np.nan]), 'fnna')) == 0 - Returns
 - array/dataframe with nans removed/filled 
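- df_fillna also accepts pandas objects; a quick sketch (assuming the same ffill semantics as pandas when no interpolation method is involved):
 - >>> s = pd.Series([np.nan, 1., np.nan, 9., np.nan, 25.], drange(-5)) >>> assert eq(df_fillna(s, 'ffill'), s.ffill()) ## sketch: should match pandas ffill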
df_index¶
- 
pyg.timeseries._index.df_index(seq, index='inner')¶
- Determines a joint index of multiple timeseries objects. - Parameters
- seq: sequence whose index needs to be determined
- a (possibly nested) sequence of timeseries/non-timeseries objects within lists/dicts 
- index: str, optional
- method to determine the index. The default is ‘inner’. 
 - Returns
 - pd.Index
- The joint index. 
 - Example
 - >>> tss = [pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)] >>> more_tss_as_dict = dict(zip('abcde',[pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)])) >>> res = df_index(tss + [more_tss_as_dict], 'inner') >>> assert len(res) == 6 >>> res = df_index(more_tss_as_dict, 'outer') >>> assert len(res) == 14 
df_reindex¶
- 
pyg.timeseries._index.df_reindex(ts, index=None, method=None, limit=None)¶
- A slightly more general version of df.reindex(index) - Parameters
- ts: dataframe or numpy array (or list/dict of these)
- timeseries to be reindexed 
- index: str, timeseries or pd.Index
- The new index 
- method: str, list of str, float, optional
- various methods of handling nans are available. The default is None. See df_fillna for a full list. 
 - Returns
- timeseries/np.ndarray (or list/dict of these)
- the reindexed timeseries. 
 - Example
- index = inner/outer 
 - >>> tss = [pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)] >>> res = df_reindex(tss, 'inner') >>> assert len(res[0]) == 6 >>> res = df_reindex(tss, 'outer') >>> assert len(res[0]) == 14 - Example
- index provided 
 - >>> tss = [pd.Series(np.random.normal(0,1,10), drange(-i, 9-i)) for i in range(5)] >>> res = df_reindex(tss, tss[0]) >>> assert eq(res[0], tss[0]) >>> res = df_reindex(tss, tss[0].index) >>> assert eq(res[0], tss[0]) 
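- A sketch combining reindexing with a fill method (see df_fillna for the available methods):
 - >>> full = pd.Series(np.arange(5.), drange(-4)) >>> ts = full.iloc[:3] >>> res = df_reindex(ts, full, method = 'ffill') ## sketch: reindex then forward fill >>> assert eq(res.values, np.array([0., 1., 2., 2., 2.]))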
presync¶
- 
pyg.timeseries._index.presync()¶
- Much of timeseries analysis in pandas is spent aligning multiple timeseries before feeding them into a function. presync allows easy presynching of all parameters of a function. - Parameters
- function: callable, optional
- function to be presynched. The default is None. 
- index: str, optional
- index join policy. The default is ‘inner’. 
- method: str/int/list of these, optional
- method of nan handling. The default is None. 
- columns: str, optional
- columns join policy. The default is ‘inner’. 
- default: float, optional
- value when no data is available. The default is np.nan. 
 - Returns
 - presynch-decorated function - Example
 - >>> from pyg import * >>> x = pd.Series([1,2,3,4], drange(-3)) >>> y = pd.Series([1,2,3,4], drange(-4,-1)) >>> z = pd.DataFrame([[1,2],[3,4]], drange(-3,-2), ['a','b']) >>> addition = lambda a, b: a+b - #We get some nonsensical results: - >>> assert list(addition(x,z).columns) == list(x.index) + ['a', 'b'] - #But: - >>> assert list(presync(addition)(x,z).columns) == ['a', 'b'] >>> res = presync(addition, index='outer', method = 'ffill')(x,z) >>> assert eq(res.a.values, np.array([2,5,6,7])) - Example 2
- alignment works for parameters ‘buried’ within… 
 - >>> function = lambda a, b: a['x'] + a['y'] + b >>> f = presync(function, 'outer', method = 'ffill') >>> res = f(dict(x = x, y = y), b = z) >>> assert eq(res, pd.DataFrame(dict(a = [np.nan, 4, 8, 10, 11], b = [np.nan, 5, 9, 11, 12]), index = drange(-4))) - Example 3
- alignment of numpy arrays 
 - >>> addition = lambda a, b: a+b >>> a = presync(addition) >>> assert eq(a(pd.Series([1,2,3,4], drange(-3)), np.array([[1,2,3,4]]).T), pd.Series([2,4,6,8], drange(-3))) >>> assert eq(a(pd.Series([1,2,3,4], drange(-3)), np.array([1,2,3,4])), pd.Series([2,4,6,8], drange(-3))) >>> assert eq(a(pd.Series([1,2,3,4], drange(-3)), np.array([[1,2,3,4],[5,6,7,8]]).T), pd.DataFrame({0:[2,4,6,8], 1:[6,8,10,12]}, drange(-3))) >>> assert eq(a(np.array([1,2,3,4]), np.array([[1,2,3,4]]).T), np.array([2,4,6,8])) - Example 4
- inner join alignment of columns in dataframes by default 
 - >>> x = pd.DataFrame({'a':[2,4,6,8], 'b':[6,8,10,12.]}, drange(-3)) >>> y = pd.DataFrame({'wrong':[2,4,6,8], 'columns':[6,8,10,12]}, drange(-3)) >>> assert len(a(x,y)) == 0 >>> y = pd.DataFrame({'a':[2,4,6,8], 'other':[6,8,10,12.]}, drange(-3)) >>> assert eq(a(x,y),x[['a']]*2) >>> y = pd.DataFrame({'a':[2,4,6,8], 'b':[6,8,10,12.]}, drange(-3)) >>> assert eq(a(x,y),x*2) >>> y = pd.DataFrame({'column name for a single column dataframe is ignored':[1,1,1,1]}, drange(-3)) >>> assert eq(a(x,y),x+1) - >>> a = presync(addition, columns = 'outer') >>> y = pd.DataFrame({'other':[2,4,6,8], 'a':[6,8,10,12]}, drange(-3)) >>> assert sorted(a(x,y).columns) == ['a','b','other'] - Example 4
- ffilling, bfilling 
 - >>> x = pd.Series([1.,np.nan,3.,4.], drange(-3)) >>> y = pd.Series([1.,np.nan,3.,4.], drange(-4,-1)) >>> assert eq(a(x,y), pd.Series([np.nan, np.nan,7], drange(-3,-1))) - but, we provide easy conversion of internal parameters of presync: - >>> assert eq(a.ffill(x,y), pd.Series([2,4,7], drange(-3,-1))) >>> assert eq(a.bfill(x,y), pd.Series([4,6,7], drange(-3,-1))) >>> assert eq(a.oj(x,y), pd.Series([np.nan, np.nan, np.nan, 7, np.nan], drange(-4))) >>> assert eq(a.oj.ffill(x,y), pd.Series([np.nan, 2, 4, 7, 8], drange(-4))) - Example 5
- indexing to a specific index 
 - >>> index = pd.Index([dt(-3), dt(-1)]) >>> a = presync(addition, index = index) >>> x = pd.Series([1.,np.nan,3.,4.], drange(-3)) >>> y = pd.Series([1.,np.nan,3.,4.], drange(-4,-1)) >>> assert eq(a(x,y), pd.Series([np.nan, 7], index)) - Example 6
- returning complicated stuff 
 - >>> from pyg import * >>> a = pd.DataFrame(np.random.normal(0,1,(100,10)), drange(-99)) >>> b = pd.DataFrame(np.random.normal(0,1,(100,10)), drange(-99)) - >>> def f(a, b): >>> return (a*b, ts_sum(a), ts_sum(b)) - >>> old = f(a,b) >>> self = presync(f) >>> args = (); kwargs = dict(a = a, b = b) >>> new = self(*args, **kwargs) >>> assert eq(new, old) 
add/sub/mul/div/pow operators¶
- 
pyg.timeseries._index.add_(a, b)¶
- addition of a and b supporting presynching (inner join) of timeseries 
- 
pyg.timeseries._index.mul_(a, b)¶
- multiplication of a and b supporting presynching (inner join) of timeseries 
- 
pyg.timeseries._index.div_(a, b)¶
- division of a by b supporting presynching (inner join) of timeseries 
- 
pyg.timeseries._index.sub_(a, b)¶
- subtraction of b from a supporting presynching (inner join) of timeseries 
- 
pyg.timeseries._index.pow_(a, b)¶
- equivalent to a**b supporting presynching (inner join) of timeseries
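- A sketch of the inner-join behaviour shared by these operators:
 - >>> from pyg import * >>> x = pd.Series([1,2,3,4], drange(-3)) >>> y = pd.Series([1,2,3,4], drange(-4,-1)) >>> assert eq(add_(x, y), pd.Series([3,5,7], drange(-3,-1))) ## sketch of inner-join arithmetic >>> assert eq(sub_(x, y), pd.Series([-1,-1,-1], drange(-3,-1)))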