acgc.stats.timeseries
timeseries analysis functions
1#!/usr/bin/env python3 2'''timeseries analysis functions''' 3 4import numpy as np 5import pandas as pd 6import statsmodels.api as sm 7from statsmodels.tsa.seasonal import STL, DecomposeResult 8from statsmodels.nonparametric.smoothers_lowess import lowess 9 10__all__ = ['STL_harmonic', 11 'STL_smoother', 12 'DecomposeResult_to_DataFrame', 13 'boxcar', 14 'boxcarpoly'] 15 16def STL_smoother( data, period=None, smooth=4, **kwargs): 17 '''STL decomposition of time series into trend, seasonal, and residual components 18 with additional smoothing of the seasonal component 19 20 Parameters 21 ---------- 22 data : array_like 23 time series data to be decomposed 24 period : int 25 The length of the seasonal cycle 26 smooth : int, optional 27 length of the lowess filter applied to the STL seasonal cycle, default is 4 28 **kwargs : 29 Additional keyword arguments to pass to the STL function. 30 For example, robust=True. See the statsmodels documentation for more details 31 32 Returns 33 ------- 34 result : DecomposeResult 35 object containing the original data, seasonal component, trend component, residuals, and weights (if robust=True) 36 ''' 37 38 result = STL(data, period, **kwargs).fit() 39 40 #Additional smoothing of the seasonal component 41 seasonal_smooth = lowess(result.seasonal, 42 result.seasonal.index, 43 frac=smooth/len(data), 44 return_sorted=False) 45 seasonal_smooth = pd.Series(seasonal_smooth, 46 index=result.seasonal.index, 47 name='seasonal') 48 49 # Adjust residuals 50 resid_adjusted = result.resid + (result.seasonal - seasonal_smooth) 51 resid_adjusted = resid_adjusted.rename('resid') 52 53 return DecomposeResult(data, 54 seasonal_smooth, 55 result.trend, 56 resid_adjusted, 57 weights=result.weights) 58 59def STL_harmonic( y, x=None, data=None, 60 period=None, trend_ratio=1.5, 61 nharmonics=4, 62 maxiter=4, rtol=1e-2, robust=False, verbose=False, 63 **kwargs): 64 '''Decompose time series into seasonal, trend, and residual components 65 using harmonic regression for the seasonal component and lowess smoothing/filtering to separate the trend and residual components. 66 67 This is similar STL decomposition with the following differences: 68 * The seasonal component is modeled using harmonic regression, which guarantees a smooth seasonal cycle. 69 * The seasonal cycle is fixed in this decomposition, while it can vary in STL. 70 * STL requires regularly spaced data, while this decomposition can handle irregularly spaced data, 71 as long as the time variable `x` is provided. Nevertheless, the quality is likely to be better if the time series has no large gaps. 72 73 Like STL, the decomposition is iterative. The seasonal cycle is fitted to detredended data, 74 then the seasonal cycle is subtracted from the data. Lowess is applied to the deseasonalized data to estimate 75 the trend. The updated trend is subtracted from the original data and an updated seasonal cycle 76 is fitted to the updated detrended data. This process is repeated until the seasonal cycle converges 77 or the maximum number of iterations is reached. 78 79 Parameters 80 ---------- 81 y : array-like, pandas.Series or str 82 The time series to decompose. If a string, it is treated as a column name in `data`. 83 x : array-like or str, optional 84 The time variable corresponding to `y`. If a string, it is treated as a column name in `data`. 85 If None (default), the index of `y` is used if it has one, otherwise a range from 0 to len(y)-1 is used. 86 data : DataFrame, optional 87 A DataFrame containing the data. Required if `y` or `x` are strings referring to column names. 88 Not needed if `y` and `x` are array-like. 89 period : float or np.timedelta64 or pd.Timedelta 90 The length of the seasonal cycle in the same units as `x`. 91 If `x` is datetime-like, `period` should be a np.timedelta64 or pandas.Timedelta, e.g. `pd.Timedelta(days=365.25)` for annual seasonal cycles. 92 If `x` is in months, `period` should be 12, and if `x` is in days, then `period` should be 365.25. 93 trend_ratio : float, optional 94 The ratio of the trend smoother span to the seasonal cycle length (period). Default is 1.5, which means the trend smoother span will be 1.5 times the period. 95 nharmonics : int, optional 96 The number of harmonics to include in the seasonal model. Default is 4, which includes the annual, semiannual, triannual, and quadriannual harmonics for yearly seasonality. 97 maxiter : int, optional 98 The maximum number of iterations for the decomposition. Default is 4. 99 rtol : float, optional 100 The relative tolerance for convergence of the seasonal component. Default is 1e-2. 101 robust : bool or int, optional 102 Specifies whether robust lowess fitting should be used for trend smoothing. 103 If False (default), no robust fitting is done. If True, 3 iterations of robust fitting are done within the lowess filter. 104 If an integer >= 1, lowess uses that many iterations of robust fitting. 105 verbose : bool, optional 106 If True, prints convergence information. Default is False. 107 **kwargs : 108 Additional keyword arguments to pass to the lowess function for trend smoothing. For example, ` 109 110 Returns 111 ------- 112 DecomposeResult 113 An object containing the original data, seasonal component, trend component, and residuals. 114 ''' 115 116 if period is None: 117 raise ValueError('Keyword `period` must be specified') 118 119 # If y is a string, treat it as a column name in data 120 if isinstance(y, str): 121 try: 122 y = data[y] 123 except KeyError: 124 raise ValueError(f'y must be a 1-D array or a column name in data. {y} not found in data.') 125 126 # If x is a string, treat it as a column name in data. 127 if isinstance(x, str): 128 try: 129 x = data[x] 130 except KeyError: 131 raise ValueError(f'x must be a 1-D array or a column name in data. {x} not found in data.') 132 # If x is None, use the index of y or a range if y has no index. 133 elif x is None: 134 try: 135 x = y.index 136 except AttributeError: 137 x = np.arange(len(y)) 138 139 # Ensure nharmonics is a positive integer 140 if not isinstance(nharmonics, int) or nharmonics < 1: 141 raise ValueError('nharmonics must be a positive integer') 142 143 # Robust fitting iterations 144 if robust is False: 145 robust_iterations=1 146 elif robust is True: 147 robust_iterations=3 148 elif isinstance(robust, int) and robust >= 1: 149 robust_iterations=robust 150 robust = True 151 else: 152 raise ValueError('Invalid value for robust parameter. Must be True, False, or an integer >= 1.') 153 154 # Normalize x, so that it is the number of periods since the start of the time series 155 # xnorm = ( x - np.min(x) ) / period 156 xnorm = ( x - x[0] ) / period 157 158 # Initialize trend and seasonal_old 159 trend = np.ones_like(y)*np.mean(y) 160 seasonal_old = np.zeros_like(y) 161 converged = False 162 163 for i in range(1,maxiter+1): 164 165 # Detrend 166 detrended = y - trend 167 168 # Fit seasonal cycle with nharmonics harmonics (annual, semiannual, triannual, etc.) 169 X = np.column_stack([func(2*i*np.pi*xnorm) for i in range(1, nharmonics+1) for func in (np.sin, np.cos)]) 170 if robust: 171 res = sm.RLM(detrended, X, M=sm.robust.norms.TukeyBiweight()).fit() 172 else: 173 res = sm.OLS(detrended, X).fit() 174 seasonal = res.predict(X) 175 176 # Update trend by fitting lowess to data - seasonal 177 trend = lowess(y-seasonal, xnorm, 178 frac = trend_ratio/np.max(xnorm), it=robust_iterations, 179 # frac=ratio*period/len(y), it=robust_iterations, 180 return_sorted=False, **kwargs) 181 182 # Exit if seasonal cycle has converged 183 if np.max(np.abs(seasonal - seasonal_old)) < rtol*np.mean(np.abs(seasonal)): 184 converged = True 185 break 186 187 # Update seasonal_old for next iteration 188 seasonal_old = seasonal 189 190 # Print convergence message if verbose 191 if verbose: 192 if converged: 193 print(f'STL_harmonic converged in {i} iterations') 194 else: 195 print(f'STL_harmonic did not converge in {maxiter} iterations') 196 197 # Compute residual 198 resid = y - trend - seasonal 199 200 # Weights are not computed in this implementation, so set to NaN 201 weights = np.full_like(resid, np.nan) 202 203 if isinstance(y, pd.Series): 204 seasonal = pd.Series(seasonal, index=y.index, name='seasonal') 205 trend = pd.Series(trend, index=y.index, name='trend') 206 resid = pd.Series(resid, index=y.index, name='resid') 207 weights = pd.Series(weights, index=y.index, name='weights') 208 209 # return trend, seasonal, resid 210 return DecomposeResult(y, 211 seasonal, 212 trend, 213 resid, 214 weights) 215 216def DecomposeResult_to_DataFrame(result): 217 '''Convert a DecomposeResult object to a pandas DataFrame. 218 219 Parameters 220 ---------- 221 result : DecomposeResult 222 223 Returns 224 ------- 225 DataFrame 226 A DataFrame with columns 'original', 'seasonal', 'trend', 'residual', and 'weights'. 227 ''' 228 try: 229 index = result.observed.index 230 except AttributeError: 231 index = np.arange(len(result.data)) 232 return pd.DataFrame({'original': result.observed, 233 'seasonal': result.seasonal, 234 'trend': result.trend, 235 'residual': result.resid, 236 'weights': result.weights}, 237 index=index) 238 239 240def boxcarpoly( array, width, order=2, align='center'): 241 '''Calculate a boxcar polynomial (i.e. running polynomial fit) of an array. 242 243 See `boxcar` for parameter definitions 244 ''' 245 return boxcar( array, width, order=order, align=align, method='polynomial') 246 247def boxcar( array, width, align='center', method='mean', order=2 ): 248 '''Calculate a boxcar average (i.e. running mean, median, or polynomial fit) of an array. 249 250 Elements of input array are assumed to be equally spaced along an (unspecified) 251 coordinate dimension. 252 253 A centered boxcar average with even widths is traditionally 254 not allowed. This BOXCAR program allows even widths by 255 giving half weights to elements on either end of the 256 averaging window and full weights to all others. A centered 257 average with even width therefore uses uses WIDTH+1 258 elements in its averaging kernel. 259 260 Parameters 261 ---------- 262 array : array of float (N,) 263 1D array of values to average 264 width : int 265 number of elements to include in the average. 266 align : str 267 specifies how the averaging window for each element of the output array 268 aligns with the input array. 269 Values: center (default), forward (trailing elements), backward (leading elements) 270 method : {'mean' (default), 'median', 'polynomial'} 271 specifies the averaging kernel function 272 order : int, default=2 273 specifies the polynomial order to fit within each boxcar window. 274 This has no effect unless `method`='polynomial' 275 A parabola is order=2; order=0 is equivalent to method="mean" 276 277 Returns 278 ------- 279 result : array of float (N,) 280 1D array with averages with same size as input array. 281 Elements on either end may be NaN 282 ''' 283 284 N = array.size 285 286 # Initialize smoothed array 287 smootharray = np.empty_like(array) 288 smootharray[:] = np.NaN 289 290 # uniform averaging kernel 291 kernel = np.ones(width) 292 293 # Setup for backward boxcar 294 if align=='backward': 295 # Half widths before and after element I 296 297 HW1 = width-1 298 HW2 = width-HW1-1 299 300 # Setup for forward boxcar 301 elif align=='forward': 302 HW1 = 0 303 HW2 = width-HW1-1 304 305 # Setup for centered boxcar 306 elif align=='center': 307 # Separate treatments for odd and even widths 308 if np.mod(width,2)==1: 309 # Odd widths 310 HW1 = np.floor(width/2) 311 HW2 = width-HW1-1 312 313 else: 314 # Even widths 315 HW1 = width/2 316 HW2 = width-HW1 317 318 # Uniform kernel in middle 319 kernel = np.ones(width+1) 320 321 # Half weight kernel for ends 322 kernel[0] = 0.5 323 kernel[width] = 0.5 324 325 # Normalize the kernel 326 kernel=kernel/kernel.sum() 327 else: 328 raise ValueError( f'align={align} not implemented. ' 329 + 'Value should be "center", "forward" or "backward".' ) 330 331 # Convert to integer type 332 HW1 = int(HW1) 333 HW2 = int(HW2) 334 335 # Do boxcar 336 for i in range(HW1,N-HW2-1): 337 338 # Sub array that we operate on 339 sub = array[i-HW1:i+HW2+1] 340 341 if method=='median': 342 # Running median 343 smootharray[i] = np.nanmedian(sub) 344 345 elif method=='mean': 346 # Running mean 347 # Kernel average, only over finite elements 348 #(NaNs removed from numerator and denominator 349 smootharray[i] = np.nansum( sub*kernel ) / np.sum(kernel[np.where(np.isfinite(sub))]) 350 351 elif method=='polynomial': 352 353 # Local x coordinate 354 x = np.arange(-HW1,HW2+1) 355 356 # Fit with polynomial of specified order 357 # Avoid NaNs 358 idx = np.isfinite(sub) 359 360 # number of valid points (non NaN) 361 nvalid = np.sum(idx) 362 363 if nvalid==0: 364 # smoothed value is NaN when there are no valid points 365 smootharray[i] = np.nan 366 367 else: 368 # A polynomial fit requires at least (order+1) points. 369 # When there are fewer points, use the highest possible order. 370 ordmax = np.max([np.min([order,np.sum(idx)-1]),0]) 371 372 p = np.polyfit(x[idx],sub[idx],ordmax,w=kernel[idx]) 373 374 # The fitted value at local x=0 is just the constant term 375 smootharray[i] = p[order] 376 377 else: 378 raise ValueError( f'method={method} not implemented. ' 379 + 'Value should be "mean", "median" or "polynomial"') 380 381 return smootharray
60def STL_harmonic( y, x=None, data=None, 61 period=None, trend_ratio=1.5, 62 nharmonics=4, 63 maxiter=4, rtol=1e-2, robust=False, verbose=False, 64 **kwargs): 65 '''Decompose time series into seasonal, trend, and residual components 66 using harmonic regression for the seasonal component and lowess smoothing/filtering to separate the trend and residual components. 67 68 This is similar STL decomposition with the following differences: 69 * The seasonal component is modeled using harmonic regression, which guarantees a smooth seasonal cycle. 70 * The seasonal cycle is fixed in this decomposition, while it can vary in STL. 71 * STL requires regularly spaced data, while this decomposition can handle irregularly spaced data, 72 as long as the time variable `x` is provided. Nevertheless, the quality is likely to be better if the time series has no large gaps. 73 74 Like STL, the decomposition is iterative. The seasonal cycle is fitted to detredended data, 75 then the seasonal cycle is subtracted from the data. Lowess is applied to the deseasonalized data to estimate 76 the trend. The updated trend is subtracted from the original data and an updated seasonal cycle 77 is fitted to the updated detrended data. This process is repeated until the seasonal cycle converges 78 or the maximum number of iterations is reached. 79 80 Parameters 81 ---------- 82 y : array-like, pandas.Series or str 83 The time series to decompose. If a string, it is treated as a column name in `data`. 84 x : array-like or str, optional 85 The time variable corresponding to `y`. If a string, it is treated as a column name in `data`. 86 If None (default), the index of `y` is used if it has one, otherwise a range from 0 to len(y)-1 is used. 87 data : DataFrame, optional 88 A DataFrame containing the data. Required if `y` or `x` are strings referring to column names. 89 Not needed if `y` and `x` are array-like. 90 period : float or np.timedelta64 or pd.Timedelta 91 The length of the seasonal cycle in the same units as `x`. 92 If `x` is datetime-like, `period` should be a np.timedelta64 or pandas.Timedelta, e.g. `pd.Timedelta(days=365.25)` for annual seasonal cycles. 93 If `x` is in months, `period` should be 12, and if `x` is in days, then `period` should be 365.25. 94 trend_ratio : float, optional 95 The ratio of the trend smoother span to the seasonal cycle length (period). Default is 1.5, which means the trend smoother span will be 1.5 times the period. 96 nharmonics : int, optional 97 The number of harmonics to include in the seasonal model. Default is 4, which includes the annual, semiannual, triannual, and quadriannual harmonics for yearly seasonality. 98 maxiter : int, optional 99 The maximum number of iterations for the decomposition. Default is 4. 100 rtol : float, optional 101 The relative tolerance for convergence of the seasonal component. Default is 1e-2. 102 robust : bool or int, optional 103 Specifies whether robust lowess fitting should be used for trend smoothing. 104 If False (default), no robust fitting is done. If True, 3 iterations of robust fitting are done within the lowess filter. 105 If an integer >= 1, lowess uses that many iterations of robust fitting. 106 verbose : bool, optional 107 If True, prints convergence information. Default is False. 108 **kwargs : 109 Additional keyword arguments to pass to the lowess function for trend smoothing. For example, ` 110 111 Returns 112 ------- 113 DecomposeResult 114 An object containing the original data, seasonal component, trend component, and residuals. 115 ''' 116 117 if period is None: 118 raise ValueError('Keyword `period` must be specified') 119 120 # If y is a string, treat it as a column name in data 121 if isinstance(y, str): 122 try: 123 y = data[y] 124 except KeyError: 125 raise ValueError(f'y must be a 1-D array or a column name in data. {y} not found in data.') 126 127 # If x is a string, treat it as a column name in data. 128 if isinstance(x, str): 129 try: 130 x = data[x] 131 except KeyError: 132 raise ValueError(f'x must be a 1-D array or a column name in data. {x} not found in data.') 133 # If x is None, use the index of y or a range if y has no index. 134 elif x is None: 135 try: 136 x = y.index 137 except AttributeError: 138 x = np.arange(len(y)) 139 140 # Ensure nharmonics is a positive integer 141 if not isinstance(nharmonics, int) or nharmonics < 1: 142 raise ValueError('nharmonics must be a positive integer') 143 144 # Robust fitting iterations 145 if robust is False: 146 robust_iterations=1 147 elif robust is True: 148 robust_iterations=3 149 elif isinstance(robust, int) and robust >= 1: 150 robust_iterations=robust 151 robust = True 152 else: 153 raise ValueError('Invalid value for robust parameter. Must be True, False, or an integer >= 1.') 154 155 # Normalize x, so that it is the number of periods since the start of the time series 156 # xnorm = ( x - np.min(x) ) / period 157 xnorm = ( x - x[0] ) / period 158 159 # Initialize trend and seasonal_old 160 trend = np.ones_like(y)*np.mean(y) 161 seasonal_old = np.zeros_like(y) 162 converged = False 163 164 for i in range(1,maxiter+1): 165 166 # Detrend 167 detrended = y - trend 168 169 # Fit seasonal cycle with nharmonics harmonics (annual, semiannual, triannual, etc.) 170 X = np.column_stack([func(2*i*np.pi*xnorm) for i in range(1, nharmonics+1) for func in (np.sin, np.cos)]) 171 if robust: 172 res = sm.RLM(detrended, X, M=sm.robust.norms.TukeyBiweight()).fit() 173 else: 174 res = sm.OLS(detrended, X).fit() 175 seasonal = res.predict(X) 176 177 # Update trend by fitting lowess to data - seasonal 178 trend = lowess(y-seasonal, xnorm, 179 frac = trend_ratio/np.max(xnorm), it=robust_iterations, 180 # frac=ratio*period/len(y), it=robust_iterations, 181 return_sorted=False, **kwargs) 182 183 # Exit if seasonal cycle has converged 184 if np.max(np.abs(seasonal - seasonal_old)) < rtol*np.mean(np.abs(seasonal)): 185 converged = True 186 break 187 188 # Update seasonal_old for next iteration 189 seasonal_old = seasonal 190 191 # Print convergence message if verbose 192 if verbose: 193 if converged: 194 print(f'STL_harmonic converged in {i} iterations') 195 else: 196 print(f'STL_harmonic did not converge in {maxiter} iterations') 197 198 # Compute residual 199 resid = y - trend - seasonal 200 201 # Weights are not computed in this implementation, so set to NaN 202 weights = np.full_like(resid, np.nan) 203 204 if isinstance(y, pd.Series): 205 seasonal = pd.Series(seasonal, index=y.index, name='seasonal') 206 trend = pd.Series(trend, index=y.index, name='trend') 207 resid = pd.Series(resid, index=y.index, name='resid') 208 weights = pd.Series(weights, index=y.index, name='weights') 209 210 # return trend, seasonal, resid 211 return DecomposeResult(y, 212 seasonal, 213 trend, 214 resid, 215 weights)
Decompose time series into seasonal, trend, and residual components using harmonic regression for the seasonal component and lowess smoothing/filtering to separate the trend and residual components.
This is similar STL decomposition with the following differences:
- The seasonal component is modeled using harmonic regression, which guarantees a smooth seasonal cycle.
- The seasonal cycle is fixed in this decomposition, while it can vary in STL.
- STL requires regularly spaced data, while this decomposition can handle irregularly spaced data,
as long as the time variable
xis provided. Nevertheless, the quality is likely to be better if the time series has no large gaps.
Like STL, the decomposition is iterative. The seasonal cycle is fitted to detredended data, then the seasonal cycle is subtracted from the data. Lowess is applied to the deseasonalized data to estimate the trend. The updated trend is subtracted from the original data and an updated seasonal cycle is fitted to the updated detrended data. This process is repeated until the seasonal cycle converges or the maximum number of iterations is reached.
Parameters
- y (array-like, pandas.Series or str):
The time series to decompose. If a string, it is treated as a column name in
data. - x (array-like or str, optional):
The time variable corresponding to
y. If a string, it is treated as a column name indata. If None (default), the index ofyis used if it has one, otherwise a range from 0 to len(y)-1 is used. - data (DataFrame, optional):
A DataFrame containing the data. Required if
yorxare strings referring to column names. Not needed ifyandxare array-like. - period (float or np.timedelta64 or pd.Timedelta):
The length of the seasonal cycle in the same units as
x. Ifxis datetime-like,periodshould be a np.timedelta64 or pandas.Timedelta, e.g.pd.Timedelta(days=365.25)for annual seasonal cycles. Ifxis in months,periodshould be 12, and ifxis in days, thenperiodshould be 365.25. - trend_ratio (float, optional): The ratio of the trend smoother span to the seasonal cycle length (period). Default is 1.5, which means the trend smoother span will be 1.5 times the period.
- nharmonics (int, optional): The number of harmonics to include in the seasonal model. Default is 4, which includes the annual, semiannual, triannual, and quadriannual harmonics for yearly seasonality.
- maxiter (int, optional): The maximum number of iterations for the decomposition. Default is 4.
- rtol (float, optional): The relative tolerance for convergence of the seasonal component. Default is 1e-2.
- robust (bool or int, optional): Specifies whether robust lowess fitting should be used for trend smoothing. If False (default), no robust fitting is done. If True, 3 iterations of robust fitting are done within the lowess filter. If an integer >= 1, lowess uses that many iterations of robust fitting.
- verbose (bool, optional): If True, prints convergence information. Default is False.
- **kwargs :: Additional keyword arguments to pass to the lowess function for trend smoothing. For example, `
Returns
- DecomposeResult: An object containing the original data, seasonal component, trend component, and residuals.
17def STL_smoother( data, period=None, smooth=4, **kwargs): 18 '''STL decomposition of time series into trend, seasonal, and residual components 19 with additional smoothing of the seasonal component 20 21 Parameters 22 ---------- 23 data : array_like 24 time series data to be decomposed 25 period : int 26 The length of the seasonal cycle 27 smooth : int, optional 28 length of the lowess filter applied to the STL seasonal cycle, default is 4 29 **kwargs : 30 Additional keyword arguments to pass to the STL function. 31 For example, robust=True. See the statsmodels documentation for more details 32 33 Returns 34 ------- 35 result : DecomposeResult 36 object containing the original data, seasonal component, trend component, residuals, and weights (if robust=True) 37 ''' 38 39 result = STL(data, period, **kwargs).fit() 40 41 #Additional smoothing of the seasonal component 42 seasonal_smooth = lowess(result.seasonal, 43 result.seasonal.index, 44 frac=smooth/len(data), 45 return_sorted=False) 46 seasonal_smooth = pd.Series(seasonal_smooth, 47 index=result.seasonal.index, 48 name='seasonal') 49 50 # Adjust residuals 51 resid_adjusted = result.resid + (result.seasonal - seasonal_smooth) 52 resid_adjusted = resid_adjusted.rename('resid') 53 54 return DecomposeResult(data, 55 seasonal_smooth, 56 result.trend, 57 resid_adjusted, 58 weights=result.weights)
STL decomposition of time series into trend, seasonal, and residual components with additional smoothing of the seasonal component
Parameters
- data (array_like): time series data to be decomposed
- period (int): The length of the seasonal cycle
- smooth (int, optional): length of the lowess filter applied to the STL seasonal cycle, default is 4
- **kwargs :: Additional keyword arguments to pass to the STL function. For example, robust=True. See the statsmodels documentation for more details
Returns
- result (DecomposeResult): object containing the original data, seasonal component, trend component, residuals, and weights (if robust=True)
217def DecomposeResult_to_DataFrame(result): 218 '''Convert a DecomposeResult object to a pandas DataFrame. 219 220 Parameters 221 ---------- 222 result : DecomposeResult 223 224 Returns 225 ------- 226 DataFrame 227 A DataFrame with columns 'original', 'seasonal', 'trend', 'residual', and 'weights'. 228 ''' 229 try: 230 index = result.observed.index 231 except AttributeError: 232 index = np.arange(len(result.data)) 233 return pd.DataFrame({'original': result.observed, 234 'seasonal': result.seasonal, 235 'trend': result.trend, 236 'residual': result.resid, 237 'weights': result.weights}, 238 index=index)
Convert a DecomposeResult object to a pandas DataFrame.
Parameters
- result (DecomposeResult):
Returns
- DataFrame: A DataFrame with columns 'original', 'seasonal', 'trend', 'residual', and 'weights'.
248def boxcar( array, width, align='center', method='mean', order=2 ): 249 '''Calculate a boxcar average (i.e. running mean, median, or polynomial fit) of an array. 250 251 Elements of input array are assumed to be equally spaced along an (unspecified) 252 coordinate dimension. 253 254 A centered boxcar average with even widths is traditionally 255 not allowed. This BOXCAR program allows even widths by 256 giving half weights to elements on either end of the 257 averaging window and full weights to all others. A centered 258 average with even width therefore uses uses WIDTH+1 259 elements in its averaging kernel. 260 261 Parameters 262 ---------- 263 array : array of float (N,) 264 1D array of values to average 265 width : int 266 number of elements to include in the average. 267 align : str 268 specifies how the averaging window for each element of the output array 269 aligns with the input array. 270 Values: center (default), forward (trailing elements), backward (leading elements) 271 method : {'mean' (default), 'median', 'polynomial'} 272 specifies the averaging kernel function 273 order : int, default=2 274 specifies the polynomial order to fit within each boxcar window. 275 This has no effect unless `method`='polynomial' 276 A parabola is order=2; order=0 is equivalent to method="mean" 277 278 Returns 279 ------- 280 result : array of float (N,) 281 1D array with averages with same size as input array. 282 Elements on either end may be NaN 283 ''' 284 285 N = array.size 286 287 # Initialize smoothed array 288 smootharray = np.empty_like(array) 289 smootharray[:] = np.NaN 290 291 # uniform averaging kernel 292 kernel = np.ones(width) 293 294 # Setup for backward boxcar 295 if align=='backward': 296 # Half widths before and after element I 297 298 HW1 = width-1 299 HW2 = width-HW1-1 300 301 # Setup for forward boxcar 302 elif align=='forward': 303 HW1 = 0 304 HW2 = width-HW1-1 305 306 # Setup for centered boxcar 307 elif align=='center': 308 # Separate treatments for odd and even widths 309 if np.mod(width,2)==1: 310 # Odd widths 311 HW1 = np.floor(width/2) 312 HW2 = width-HW1-1 313 314 else: 315 # Even widths 316 HW1 = width/2 317 HW2 = width-HW1 318 319 # Uniform kernel in middle 320 kernel = np.ones(width+1) 321 322 # Half weight kernel for ends 323 kernel[0] = 0.5 324 kernel[width] = 0.5 325 326 # Normalize the kernel 327 kernel=kernel/kernel.sum() 328 else: 329 raise ValueError( f'align={align} not implemented. ' 330 + 'Value should be "center", "forward" or "backward".' ) 331 332 # Convert to integer type 333 HW1 = int(HW1) 334 HW2 = int(HW2) 335 336 # Do boxcar 337 for i in range(HW1,N-HW2-1): 338 339 # Sub array that we operate on 340 sub = array[i-HW1:i+HW2+1] 341 342 if method=='median': 343 # Running median 344 smootharray[i] = np.nanmedian(sub) 345 346 elif method=='mean': 347 # Running mean 348 # Kernel average, only over finite elements 349 #(NaNs removed from numerator and denominator 350 smootharray[i] = np.nansum( sub*kernel ) / np.sum(kernel[np.where(np.isfinite(sub))]) 351 352 elif method=='polynomial': 353 354 # Local x coordinate 355 x = np.arange(-HW1,HW2+1) 356 357 # Fit with polynomial of specified order 358 # Avoid NaNs 359 idx = np.isfinite(sub) 360 361 # number of valid points (non NaN) 362 nvalid = np.sum(idx) 363 364 if nvalid==0: 365 # smoothed value is NaN when there are no valid points 366 smootharray[i] = np.nan 367 368 else: 369 # A polynomial fit requires at least (order+1) points. 370 # When there are fewer points, use the highest possible order. 371 ordmax = np.max([np.min([order,np.sum(idx)-1]),0]) 372 373 p = np.polyfit(x[idx],sub[idx],ordmax,w=kernel[idx]) 374 375 # The fitted value at local x=0 is just the constant term 376 smootharray[i] = p[order] 377 378 else: 379 raise ValueError( f'method={method} not implemented. ' 380 + 'Value should be "mean", "median" or "polynomial"') 381 382 return smootharray
Calculate a boxcar average (i.e. running mean, median, or polynomial fit) of an array.
Elements of input array are assumed to be equally spaced along an (unspecified) coordinate dimension.
A centered boxcar average with even widths is traditionally not allowed. This BOXCAR program allows even widths by giving half weights to elements on either end of the averaging window and full weights to all others. A centered average with even width therefore uses uses WIDTH+1 elements in its averaging kernel.
Parameters
- array (array of float (N,)): 1D array of values to average
- width (int): number of elements to include in the average.
- align (str): specifies how the averaging window for each element of the output array aligns with the input array. Values: center (default), forward (trailing elements), backward (leading elements)
- method ({'mean' (default), 'median', 'polynomial'}): specifies the averaging kernel function
- order (int, default=2):
specifies the polynomial order to fit within each boxcar window.
This has no effect unless
method='polynomial' A parabola is order=2; order=0 is equivalent to method="mean"
Returns
- result (array of float (N,)): 1D array with averages with same size as input array. Elements on either end may be NaN
241def boxcarpoly( array, width, order=2, align='center'): 242 '''Calculate a boxcar polynomial (i.e. running polynomial fit) of an array. 243 244 See `boxcar` for parameter definitions 245 ''' 246 return boxcar( array, width, order=order, align=align, method='polynomial')
Calculate a boxcar polynomial (i.e. running polynomial fit) of an array.
See boxcar for parameter definitions