Setup

library(astsa)
## Warning: package 'astsa' was built under R version 4.0.3
set.seed(280875)

Time Series Plots

Johnson & Johnson Quarterly Earnings

The following figure shows quarterly earnings per share for the U.S. company Johnson & Johnson, furnished by Professor Paul Griffin (personal communication) of the Graduate School of Management, University of California, Davis. There are 84 quarters (21 years) measured from the first quarter of 1960 to the last quarter of 1980.

plot(jj, type="o", ylab="Quarterly Earnings per Share")

Global Warming

The data are the global mean land–ocean temperature index from 1880 to 2015, with base period 1951-1980. In particular, the data are deviations, measured in degrees centigrade, from the 1951-1980 average.

plot(globtemp, type="o", ylab="Global Temperature Deviations")

Speech Data

The following figure shows a small .1 second (1000 point) sample of recorded speech for the phrase aaa · · · hhh, and we note the repetitive nature of the signal and the rather regular periodicities. One current problem of great interest is computer recognition of speech, which would require converting this particular signal into the recorded phrase aaa · · · hhh.

plot(speech)

Dow Jones Industrial Average

As an example of financial time series data, the following figure shows the daily returns (or percent change) of the Dow Jones Industrial Average (DJIA) from April 20, 2006 to April 20, 2016. It is easy to spot the financial crisis of 2008 in the figure. The data shown are typical of return data.
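The code below computes returns as differences of the logged series. This is the usual approximation: writing the return as \(r_t = (x_t - x_{t-1})/x_{t-1}\), we have \[\log x_t - \log x_{t-1} = \log(1 + r_t) \approx r_t\] for small \(r_t\), hence the comment "approximate returns" in the code.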

library(xts)
## Warning: package 'xts' was built under R version 4.0.3
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 4.0.3
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
djiar = diff(log(djia$Close))[-1] # approximate returns
plot(djiar, main="DJIA Returns", type="n")

lines(djiar)

El Niño and Fish Population

We may also be interested in analyzing several time series at once. The figures below show monthly values of an environmental series called the Southern Oscillation Index (SOI) and associated Recruitment (number of new fish) furnished by Dr. Roy Mendelssohn of the Pacific Environmental Fisheries Group (personal communication). Both series are for a period of 453 months ranging over the years 1950–1987.

plot(soi, ylab="", xlab="", main="Southern Oscillation Index")

plot(rec, ylab="", xlab="", main="Recruitment")

fMRI Imaging

A fundamental problem in classical statistics occurs when we are given a collection of independent series or vectors of series, generated under varying experimental conditions or treatment configurations. Such a set of series is shown in the figures below, where we observe data collected from various locations in the brain via functional magnetic resonance imaging (fMRI). In this example, five subjects were given periodic brushing on the hand. The stimulus was applied for 32 seconds and then stopped for 32 seconds; thus, the signal period is 64 seconds. The sampling rate was one observation every 2 seconds for 256 seconds (n = 128).

ts.plot(fmri1[,2:5], col=1:4, ylab="BOLD", main="Cortex") 

ts.plot(fmri1[,6:9], col=1:4, ylab="BOLD", main="Thalamus & Cerebellum")

Earthquakes and Explosions

plot(EQ5, main="Earthquake")

plot(EXP6, main="Explosion")

Time Series Statistical Models

White Noise, Moving Averages, and Filtering

w = rnorm(500,0,1) # 500 N(0,1) variates 
v = filter(w, sides=2, filter=rep(1/3,3)) # moving average
plot.ts(w, main="white noise")

plot.ts(v, ylim=c(-3,3), main="moving average")

Autoregressions

Suppose we consider the white noise series \(w_t\) of the previous example as input and calculate the output using the second-order equation \[x_t = x_{t-1} - .9x_{t-2} + w_t\]

w = rnorm(550,0,1) # 50 extra to avoid startup problems 
x = filter(w, filter=c(1,-.9), method="recursive")[-(1:50)] # remove first 50 
plot.ts(x, main="autoregression")

Random Walk with Drift

A model for analyzing trend, such as that seen in the global temperature data above, is the random walk with drift model given by \[x_t = \delta + x_{t-1} + w_t\]

for \(t = 1, 2, \ldots\), with initial condition \(x_0 = 0\), and where \(w_t\) is white noise.
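Iterating the model with \(x_0 = 0\) shows that the process is a cumulative sum with mean function \(\delta t\), \[x_t = \delta t + \sum_{j=1}^{t} w_j,\] which is what cumsum computes below; the dashed line abline(a=0, b=.2) is the mean function with \(\delta = .2\).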

set.seed(154) # so you can reproduce the results
w = rnorm(200); x = cumsum(w) # two commands in one line
wd = w +.2; xd = cumsum(wd)
plot.ts(xd, ylim=c(-5,55), main="random walk", ylab='')
lines(x, col=4); abline(h=0, col=4, lty=2); abline(a=0, b=.2, lty=2)

Signal in Noise

Many realistic models for generating time series assume an underlying signal with some consistent periodic variation, contaminated by adding random noise. For example, it is easy to detect the regular cycle in the fMRI series displayed above. Consider the model \[x_t = 2\cos\Big(2\pi \frac{t+15}{50}\Big) + w_t\] for \(t = 1, 2, \ldots, 500\), where the first term is regarded as the signal, shown in the first of the following figures. Note that \(2\pi(t+15)/50 = 2\pi t/50 + .6\pi\), which is the form used in the code.

cs = 2*cos(2*pi*1:500/50 + .6*pi); w = rnorm(500,0,1)
plot.ts(cs, main=expression(2*cos(2*pi*t/50+.6*pi)))

plot.ts(cs+w, main=expression(2*cos(2*pi*t/50+.6*pi) + N(0,1))) 

plot.ts(cs+5*w, main=expression(2*cos(2*pi*t/50+.6*pi) + N(0,25)))

Estimation of Correlation

Sample ACF and Scatterplots

Estimating autocorrelation is similar to estimating correlation in the usual setup, where we have pairs of observations, say \((x_i, y_i)\), for \(i = 1, \ldots, n\). For example, if we have time series data \(x_t\) for \(t = 1, \ldots, n\), then the pairs of observations for estimating \(\rho(h)\) are the \(n-h\) pairs given by \(\{(x_t, x_{t+h});\ t = 1, \ldots, n-h\}\). The figures below show an example using the SOI series, where \(\hat\rho(1) = .604\) and \(\hat\rho(6) = -.187\).
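For reference, the sample ACF printed below is \[\hat\rho(h) = \frac{\hat\gamma(h)}{\hat\gamma(0)}, \qquad \hat\gamma(h) = \frac{1}{n}\sum_{t=1}^{n-h}(x_{t+h}-\bar{x})(x_t-\bar{x}),\] so each \(\hat\rho(h)\) is essentially the correlation computed from the \(n-h\) lagged pairs.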

(r = round(acf(soi, 6, plot=FALSE)$acf[-1], 3)) # first 6 sample acf values 
## [1]  0.604  0.374  0.214  0.050 -0.107 -0.187
plot(lag(soi,-1), soi); legend('topleft', legend=r[1]) 

plot(lag(soi,-6), soi); legend('topleft', legend=r[6])

A Simulated Time Series

To compare the sample ACF for various sample sizes to the theoretical ACF, consider a contrived set of data generated by tossing a fair coin, letting \(x_t = 1\) when a head is obtained and \(x_t = -1\) when a tail is obtained. Then, construct \(y_t\) as \[y_t = 5 + x_t - .7x_{t-1}.\]
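Because the \(x_t\) are independent with mean zero and variance one, \(y_t\) is an MA(1) plus a constant, and its theoretical ACF is \[\rho_y(1) = \frac{-.7}{1+.7^2} = -.47, \qquad \rho_y(h) = 0 \ \text{for} \ |h| > 1,\] against which the sample ACFs printed below (based on \(n = 10\) and \(n = 100\) values of \(y_t\)) can be compared.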

set.seed(101010)
x1 = 2*rbinom(11, 1, .5) - 1 # simulated sequence of coin tosses
x2 = 2*rbinom(101, 1, .5) - 1
y1 = 5 + filter(x1, sides=1, filter=c(1,-.7))[-1]
y2 = 5 + filter(x2, sides=1, filter=c(1,-.7))[-1]
plot.ts(y1, type='s'); plot.ts(y2, type='s') 

c(mean(y1), mean(y2))
## [1] 5.080 5.002
acf(y1, lag.max=4, plot=FALSE)
## 
## Autocorrelations of series 'y1', by lag
## 
##      0      1      2      3      4 
##  1.000 -0.688  0.425 -0.306 -0.007
acf(y2, lag.max=4, plot=FALSE) 
## 
## Autocorrelations of series 'y2', by lag
## 
##      0      1      2      3      4 
##  1.000 -0.480 -0.002 -0.004  0.000

ACF of a Speech Signal

acf(speech, 250)

SOI and Recruitment Correlation Analysis

acf(soi, 48, main="Southern Oscillation Index")

acf(rec, 48, main="Recruitment")

ccf(soi, rec, 48, main="SOI vs Recruitment", ylab="CCF")

Time Series Regression and Exploratory Data Analysis

Estimating a Linear Trend

Consider the monthly price (per pound) of chicken in the US from mid-2001 to mid-2016 (180 months), say \(x_t\), shown in the following figure. There is an obvious upward trend in the series, and we might use simple linear regression to estimate that trend by fitting the model \[x_t = \beta_0 + \beta_1 t + w_t.\]

summary(fit <- lm(chicken~time(chicken), na.action=NULL)) 
## 
## Call:
## lm(formula = chicken ~ time(chicken), na.action = NULL)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.7411 -3.4730  0.8251  2.7738 11.5804 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -7.131e+03  1.624e+02  -43.91   <2e-16 ***
## time(chicken)  3.592e+00  8.084e-02   44.43   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.696 on 178 degrees of freedom
## Multiple R-squared:  0.9173, Adjusted R-squared:  0.9168 
## F-statistic:  1974 on 1 and 178 DF,  p-value: < 2.2e-16
plot(chicken, ylab="cents per pound") 
abline(fit)

Pollution, Temperature and Mortality

The data shown below are extracted series from a study by Shumway et al. (1988) of the possible effects of temperature and pollution on weekly mortality in Los Angeles County. Note the strong seasonal components in all of the series, corresponding to winter-summer variations, and the downward trend in cardiovascular mortality over the 10-year period. A scatterplot matrix, shown in the figure below, indicates a possible linear relation between mortality and the pollutant particulates, and a possible relation to temperature. Note the curvilinear shape of the temperature-mortality curve, indicating that higher temperatures as well as lower temperatures are associated with increases in cardiovascular mortality.

Based on the scatterplot matrix, we entertain, tentatively, four models, where \(M_t\) denotes cardiovascular mortality, \(T_t\) denotes temperature, and \(P_t\) denotes the particulate levels. They are

\[\begin{eqnarray*} M_t &=& \beta_0 + \beta_1 t + w_t \\ M_t &=& \beta_0 + \beta_1 t + \beta_2(T_t - T_\cdot) + w_t \\ M_t &=& \beta_0 + \beta_1 t + \beta_2(T_t - T_\cdot) + \beta_3(T_t - T_\cdot)^2 + w_t \\ M_t &=& \beta_0 + \beta_1 t + \beta_2(T_t - T_\cdot) + \beta_3(T_t - T_\cdot)^2 + \beta_4 P_t + w_t \end{eqnarray*}\]

where \(T_\cdot\) denotes the mean temperature, computed in the code below as temp = tempr - mean(tempr).

plot(cmort, main="Cardiovascular Mortality", xlab="", ylab="") 

plot(tempr, main="Temperature", xlab="", ylab="")

plot(part, main="Particulates", xlab="", ylab="")

ts.plot(cmort,tempr,part, col=1:3) # all on same plot (not shown)

pairs(cbind(Mortality=cmort, Temperature=tempr, Particulates=part))

temp = tempr-mean(tempr)
temp2 = temp^2

trend = time(cmort)
fit = lm(cmort~ trend + temp +temp2 +part, na.action=NULL)
summary(fit) # regression results
## 
## Call:
## lm(formula = cmort ~ trend + temp + temp2 + part, na.action = NULL)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -19.0760  -4.2153  -0.4878   3.7435  29.2448 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.831e+03  1.996e+02   14.19  < 2e-16 ***
## trend       -1.396e+00  1.010e-01  -13.82  < 2e-16 ***
## temp        -4.725e-01  3.162e-02  -14.94  < 2e-16 ***
## temp2        2.259e-02  2.827e-03    7.99 9.26e-15 ***
## part         2.554e-01  1.886e-02   13.54  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.385 on 503 degrees of freedom
## Multiple R-squared:  0.5954, Adjusted R-squared:  0.5922 
## F-statistic:   185 on 4 and 503 DF,  p-value: < 2.2e-16
summary(aov(fit)) # ANOVA table
##              Df Sum Sq Mean Sq F value Pr(>F)    
## trend         1  10667   10667  261.62 <2e-16 ***
## temp          1   8607    8607  211.09 <2e-16 ***
## temp2         1   3429    3429   84.09 <2e-16 ***
## part          1   7476    7476  183.36 <2e-16 ***
## Residuals   503  20508      41                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(lm(cmort~cbind(trend, temp, temp2, part)))) # Table 2.1 
##                                  Df Sum Sq Mean Sq F value Pr(>F)    
## cbind(trend, temp, temp2, part)   4  30178    7545     185 <2e-16 ***
## Residuals                       503  20508      41                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
num = length(cmort) # sample size
AIC(fit)/num - log(2*pi) # AIC
## [1] 4.721732
BIC(fit)/num - log(2*pi) # BIC
## [1] 4.771699
(AICc = log(sum(resid(fit)^2)/num) + (num+5)/(num-5-2)) # AICc
## [1] 4.722062
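As a sketch (these fits are not part of the original output), the other three candidate models can be fit the same way, reusing trend, temp, temp2, and part from above, and compared by AIC; fit1 through fit3 are hypothetical names introduced here.

fit1 = lm(cmort~ trend, na.action=NULL)                 # trend only
fit2 = lm(cmort~ trend + temp, na.action=NULL)          # add temperature
fit3 = lm(cmort~ trend + temp + temp2, na.action=NULL)  # add curvature
sapply(list(fit1, fit2, fit3, fit), AIC)                # smaller AIC is better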

Differencing Chicken Prices

fit = lm(chicken~time(chicken), na.action=NULL) # regress chicken on time

plot(resid(fit), type="o", main="detrended") 

plot(diff(chicken), type="o", main="first difference") 

acf(chicken, 48, main="chicken")

acf(resid(fit), 48, main="detrended") 

acf(diff(chicken), 48, main="first difference")

Differencing Global Temperature

plot(diff(globtemp), type="o")

mean(diff(globtemp)) # drift estimate = .008 
## [1] 0.007925926
acf(diff(globtemp), 48)

Scatterplot Matrices, SOI and Recruitment

lag1.plot(soi, 12) 

lag2.plot(soi, rec, 8)

Using Regression to Discover a Signal in Noise

set.seed(90210) # so you can reproduce these results

x = 2*cos(2*pi*1:500/50 + .6*pi) + rnorm(500,0,5)

z1 = cos(2*pi*1:500/50)

z2 = sin(2*pi*1:500/50)

summary(fit <- lm(x~0+z1+z2))
## 
## Call:
## lm(formula = x ~ 0 + z1 + z2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.8584  -3.8525  -0.3186   3.3487  15.5440 
## 
## Coefficients:
##    Estimate Std. Error t value Pr(>|t|)    
## z1  -0.7442     0.3274  -2.273   0.0235 *  
## z2  -1.9949     0.3274  -6.093 2.23e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.177 on 498 degrees of freedom
## Multiple R-squared:  0.07827,    Adjusted R-squared:  0.07456 
## F-statistic: 21.14 on 2 and 498 DF,  p-value: 1.538e-09
plot.ts(x)

plot.ts(x, col=8, ylab=expression(hat(x))) 

lines(fitted(fit), col=2)

Moving Average Smoother

wgts = c(.5, rep(1,11), .5)/12
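These weights implement the symmetric moving average \[m_t = \sum_{j=-6}^{6} a_j x_{t-j},\] with \(a_j = 1/12\) for \(|j| \le 5\) and \(a_{\pm 6} = 1/24\), so that a full 12-month seasonal cycle is averaged out of the SOI series.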

soif = filter(soi, sides=2, filter=wgts)

plot(soi)

lines(soif, lwd=2, col=4)

nwgts = c(rep(0,20), wgts, rep(0,20))

plot(nwgts, type="l", ylim = c(-.02,.1), xaxt='n', yaxt='n', ann=FALSE)

plot(soi)

lines(ksmooth(time(soi), soi, "normal", bandwidth=1), lwd=2, col=4)

gauss = function(x) { 1/sqrt(2*pi) * exp(-(x^2)/2) }

x = seq(from = -3, to = 3, by = 0.001)

plot(x, gauss(x), type ="l", ylim=c(-.02,.45), xaxt='n', yaxt='n', ann=FALSE)

plot(soi)

lines(lowess(soi, f=.05), lwd=2, col=4) 

lines(lowess(soi), lty=2, lwd=2, col=2)

Smoothing Spline and Smoothing One Series as a Function of Another

plot(soi)

lines(smooth.spline(time(soi), soi, spar=.5), lwd=2, col=4) 
lines(smooth.spline(time(soi), soi, spar= 1), lty=2, lwd=2, col=2)

plot(tempr, cmort, xlab="Temperature", ylab="Mortality")
lines(lowess(tempr, cmort))

ARIMA Models

The sample path of an AR(1) process

plot(arima.sim(list(order=c(1,0,0), ar=.9), n=100), ylab="x",main=(expression(AR(1)~~~phi==+.9)))

plot(arima.sim(list(order=c(1,0,0), ar=-.9), n=100), ylab="x", main=(expression(AR(1)~~~phi==-.9)))

The MA(1) Process

plot(arima.sim(list(order=c(0,0,1), ma=.5), n=100), ylab="x", main=(expression(MA(1)~~~theta==+.5)))

plot(arima.sim(list(order=c(0,0,1), ma=-.5), n=100), ylab="x", main=(expression(MA(1)~~~theta==-.5)))

The ARMA Model and Parameter Redundancy

Consider what happens when we fit an ARMA(1,1) to data that are actually iid: in the output below, the AR and MA estimates nearly cancel each other, illustrating parameter redundancy.

set.seed(8675309)
x = rnorm(150, mean=5) # generate iid N(5,1)s 

arima(x, order=c(1,0,1))
## 
## Call:
## arima(x = x, order = c(1, 0, 1))
## 
## Coefficients:
##           ar1     ma1  intercept
##       -0.9595  0.9527     5.0462
## s.e.   0.1688  0.1750     0.0727
## 
## sigma^2 estimated as 0.7986:  log likelihood = -195.98,  aic = 399.96
ARMAtoMA(ar = .9, ma = .5, 10) # first 10 psi-weights for x_t = .9x_{t-1} + w_t + .5w_{t-1}
##  [1] 1.4000000 1.2600000 1.1340000 1.0206000 0.9185400 0.8266860 0.7440174
##  [8] 0.6696157 0.6026541 0.5423887
ARMAtoMA(ar = -.5, ma = -.9, 10) # first 10 pi-weights: reverse the roles of w_t and x_t, i.e., write w_t = -.5w_{t-1} + x_t - .9x_{t-1}
##  [1] -1.400000000  0.700000000 -0.350000000  0.175000000 -0.087500000
##  [6]  0.043750000 -0.021875000  0.010937500 -0.005468750  0.002734375

The AR(2) with Complex Roots

\[x_t = 1.5x_{t-1} - .75x_{t-2} + w_t\]

z = c(1,-1.5,.75)    # coefficients of the polynomial
(a = polyroot(z)[1]) # print one root = 1 + i/sqrt(3)
## [1] 1+0.57735i
arg = Arg(a)/(2*pi)  # arg in cycles/pt
1/arg                # = 12, the pseudo period
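Unpacking the computation above: the printed root is \(a = 1 + i/\sqrt{3}\), so \(\text{Arg}(a) = \tan^{-1}(1/\sqrt{3}) = \pi/6\) radians, i.e., \((\pi/6)/(2\pi) = 1/12\) cycles per point, giving the pseudo-period of \(12\) points per cycle that is visible in the simulated series below.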

set.seed(8675309)
ar2 = arima.sim(list(order=c(2,0,0), ar=c(1.5,-.75)), n = 144) 
plot(ar2, axes=FALSE, xlab="Time")
axis(2); axis(1, at=seq(0,144,by=12)); box() 
abline(v=seq(0,144,by=12), lty=2)

ACF = ARMAacf(ar=c(1.5,-.75), ma=0, 50) 
plot(ACF, type="h", xlab="lag") 
abline(h=0)  

The \(\psi\)-weights for an ARMA Model

ARMAtoMA(ar=.9, ma=.5, 50) # for a list 
##  [1] 1.400000000 1.260000000 1.134000000 1.020600000 0.918540000 0.826686000
##  [7] 0.744017400 0.669615660 0.602654094 0.542388685 0.488149816 0.439334835
## [13] 0.395401351 0.355861216 0.320275094 0.288247585 0.259422826 0.233480544
## [19] 0.210132489 0.189119240 0.170207316 0.153186585 0.137867926 0.124081134
## [25] 0.111673020 0.100505718 0.090455146 0.081409632 0.073268669 0.065941802
## [31] 0.059347622 0.053412859 0.048071573 0.043264416 0.038937975 0.035044177
## [37] 0.031539759 0.028385783 0.025547205 0.022992485 0.020693236 0.018623913
## [43] 0.016761521 0.015085369 0.013576832 0.012219149 0.010997234 0.009897511
## [49] 0.008907760 0.008016984
plot(ARMAtoMA(ar=.9, ma=.5, 50)) 
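These weights match the closed form implied by the ARMA(1,1) recursion \(\psi_j = \phi\,\psi_{j-1}\) for \(j \ge 2\), with \(\psi_1 = \phi + \theta\): \[\psi_j = 1.4\,(.9)^{j-1}, \qquad j = 1, 2, \ldots\]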

The PACF of an AR(p)

ACF = ARMAacf(ar=c(1.5,-.75), ma=0, 24)[-1]

PACF = ARMAacf(ar=c(1.5,-.75), ma=0, 24, pacf=TRUE)

plot(ACF, type="h", xlab="lag", ylim=c(-.8,1)); abline(h=0) 

plot(PACF, type="h", xlab="lag", ylim=c(-.8,1)); abline(h=0)

Preliminary Analysis of the Recruitment Series

We model the Recruitment series shown above. There are 453 months of observed recruitment ranging over the years 1950-1987. The ACF and PACF given below are consistent with the behavior of an AR(2).

acf2(rec, 48) # will produce values and a graphic

##      [,1]  [,2]  [,3]  [,4] [,5]  [,6]  [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## ACF  0.92  0.78  0.63  0.48 0.36  0.26  0.18 0.13 0.09  0.07  0.06  0.02 -0.04
## PACF 0.92 -0.44 -0.05 -0.02 0.07 -0.03 -0.03 0.04 0.05 -0.02 -0.05 -0.14 -0.15
##      [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
## ACF  -0.12 -0.19 -0.24 -0.27 -0.27 -0.24 -0.19 -0.11 -0.03  0.03  0.06  0.06
## PACF -0.05  0.05  0.01  0.01  0.02  0.09  0.11  0.03 -0.03 -0.01 -0.07 -0.12
##      [,26] [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37]
## ACF   0.02 -0.02 -0.06 -0.09 -0.12 -0.13 -0.11 -0.05  0.02  0.08  0.12  0.10
## PACF -0.03  0.05 -0.08 -0.04 -0.03  0.06  0.05  0.15  0.09 -0.04 -0.10 -0.09
##      [,38] [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46] [,47] [,48]
## ACF   0.06  0.01 -0.02 -0.03 -0.03 -0.02  0.01  0.06  0.12  0.17  0.20
## PACF -0.02  0.05  0.08 -0.02 -0.01 -0.02  0.05  0.01  0.05  0.08 -0.04
(regr = ar.ols(rec, order=2, demean=FALSE, intercept=TRUE)) 
## 
## Call:
## ar.ols(x = rec, order.max = 2, demean = FALSE, intercept = TRUE)
## 
## Coefficients:
##       1        2  
##  1.3541  -0.4632  
## 
## Intercept: 6.737 (1.111) 
## 
## Order selected 2  sigma^2 estimated as  89.72
regr$asy.se.coef # standard errors of the estimates
## $x.mean
## [1] 1.110599
## 
## $ar
## [1] 0.04178901 0.04187942

Forecasting

Using the parameter estimates as the actual parameter values, the figure below shows the result of forecasting the Recruitment series over a 24-month horizon, \(m = 1, 2, \ldots, 24\). The actual forecasts are calculated as \[x^n_{n+m} = 6.74 + 1.35\,x^n_{n+m-1} - .46\,x^n_{n+m-2}\] for \(n = 453\) and \(m = 1, 2, \ldots, 24\), where \(x^n_t = x_t\) for \(t \le n\).

regr = ar.ols(rec, order=2, demean=FALSE, intercept=TRUE)
fore = predict(regr, n.ahead=24)
ts.plot(rec, fore$pred, col=1:2, xlim=c(1980,1990), ylab="Recruitment")
U = fore$pred+fore$se; L = fore$pred-fore$se
xx = c(time(U), rev(time(U))); yy = c(L, rev(U)) 
polygon(xx, yy, border = 8, col = gray(.6, alpha = .2)) 
lines(fore$pred, type="p", col=2)

Backcasting an ARMA(1,1)

To backcast, we reverse the series, forecast the reversed series, and then reverse the forecasts, as the code comments below indicate.

set.seed(90210)
x = arima.sim(list(order = c(1,0,1), ar =.9, ma=.5), n = 100)
xr = rev(x) # xr is the reversed data 
pxr = predict(arima(xr, order=c(1,0,1)), 10) # predict the reversed data
pxrp = rev(pxr$pred)  # re-order the predictions (for plotting)
pxrse = rev(pxr$se)   # reorder the SEs
nx = ts(c(pxrp, x), start=-9)
plot(nx, ylab=expression(X[~t]), main='Backcasting')
U = nx[1:10] + pxrse;  L = nx[1:10] - pxrse
xx = c(-9:0, 0:-9); yy = c(L, rev(U))
polygon(xx, yy, border = 8, col = gray(0.6, alpha = 0.2)) 
lines(-9:0, nx[1:10], col=2, type='o')

Estimation

Yule–Walker Estimation of the Recruitment Series

rec.yw = ar.yw(rec, order=2)
rec.yw$x.mean #  (mean estimate)
## [1] 62.26278
rec.yw$ar #  (coefficient estimates) 
## [1]  1.3315874 -0.4445447
sqrt(diag(rec.yw$asy.var.coef)) # = .04, .04 (standard error) 
## [1] 0.04222637 0.04222637
rec.yw$var.pred #  (error variance estimate )
## [1] 94.79912
rec.pr = predict(rec.yw, n.ahead=24) 
ts.plot(rec, rec.pr$pred, col=1:2) 
lines(rec.pr$pred + rec.pr$se, col=4, lty=2) 
lines(rec.pr$pred - rec.pr$se, col=4, lty=2)
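For the record, the estimates above solve the sample Yule–Walker equations \[\hat{\boldsymbol{\phi}} = \hat{R}_p^{-1}\hat{\boldsymbol{\rho}}_p, \qquad \hat{\sigma}_w^2 = \hat\gamma(0)\big(1 - \hat{\boldsymbol{\rho}}_p'\hat{R}_p^{-1}\hat{\boldsymbol{\rho}}_p\big),\] which use only the sample autocovariances; this is why they differ slightly from the OLS fit obtained earlier.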

MLE for the Recruitment Series

rec.mle = ar.mle(rec, order=2) 
rec.mle$x.mean 
## [1] 62.26153
rec.mle$ar 
## [1]  1.3512809 -0.4612736
sqrt(diag(rec.mle$asy.var.coef))   
## [1] 0.04099159 0.04099159

Fitting the Glacial Varve Series

Consider the series of glacial varve thicknesses from Massachusetts for \(n = 634\) years. It was argued that a first-order moving average model might fit the logarithmically transformed and differenced varve series. The sample ACF and PACF, shown below, confirm the tendency of \(\Delta \log(x_t)\) to behave as a first-order moving average process, as the ACF has a significant peak only at lag one and the PACF decreases exponentially.

x = diff(log(varve))
acf(x)

pacf(x)
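As a quick cross-check (a sketch, not part of the original analysis), the MA(1) can also be fit directly by maximum likelihood before working through the grid search and Gauss–Newton steps below; the resulting moving-average estimate should land near the Gauss–Newton value obtained at the end of this section.

arima(x, order=c(0,0,1)) # MLE fit of an MA(1) to diff(log(varve))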

# Evaluate Sc on a Grid

w = c(0); z = c(0)             # work vectors, initialized at zero
Sc = c(); Sz = c(); Szw = c()  # storage for sums of squares and cross-products
num = length(x)
th = seq(-.3,-.94,-.01) 
for (p in 1:length(th)){
  for (i in 2:num){ w[i] = x[i]-th[p]*w[i-1] } 
  Sc[p] = sum(w^2) }
plot(th, Sc, type="l", ylab=expression(S[c](theta)), xlab=expression(theta),
             lwd=2)

# Gauss-Newton Estimation
r = acf(x, lag=1, plot=FALSE)$acf[-1]
rstart = (1-sqrt(1-4*(r^2)))/(2*r) 
w = c(0); z = c(0)
Sc = c(); Sz = c(); Szw = c(); para = c()  # para holds the parameter iterates
niter = 12
para[1] = rstart

for (p in 1:niter){
   for (i in 2:num){ w[i] = x[i]-para[p]*w[i-1] 
                     z[i] = w[i-1]-para[p]*z[i-1] }
   Sc[p] = sum(w^2)
   Sz[p] = sum(z^2)
   Szw[p] =sum(z*w)
   para[p+1] = para[p] + Szw[p]/Sz[p]  }

round(cbind(iteration=0:(niter-1), thetahat=para[1:niter] , Sc , Sz ), 3) 
##       iteration thetahat      Sc      Sz
##  [1,]         0   -0.495 158.739 171.240
##  [2,]         1   -0.668 150.747 235.266
##  [3,]         2   -0.733 149.264 300.562
##  [4,]         3   -0.756 149.031 336.823
##  [5,]         4   -0.766 148.990 354.173
##  [6,]         5   -0.769 148.982 362.167
##  [7,]         6   -0.771 148.980 365.801
##  [8,]         7   -0.772 148.980 367.446
##  [9,]         8   -0.772 148.980 368.188
## [10,]         9   -0.772 148.980 368.522
## [11,]        10   -0.773 148.980 368.673
## [12,]        11   -0.773 148.980 368.741
abline(v = para[1:12], lty=2)
points(para[1:12], Sc[1:12], pch=16)

The figure above displays the conditional sum of squares \(S_c(\theta)\) as a function of \(\theta\), as well as indicating the value of \(\theta\) at each step of the Gauss–Newton algorithm. Note that the Gauss–Newton procedure takes large steps toward the minimum initially, and then takes very small steps as it gets close to the minimizing value.
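In the code above, each pass through the loop implements the one-parameter Gauss–Newton update \[\theta_{(p+1)} = \theta_{(p)} + \frac{\sum_t z_t w_t}{\sum_t z_t^2},\] where \(w_t\) and \(z_t\) are evaluated at \(\theta_{(p)}\); this is exactly the line para[p+1] = para[p] + Szw[p]/Sz[p].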