Statistical Details

The convergence failure warning shows the score test for the following hypothesis: that the unknown maximum likelihood estimate (MLE) is consistent with the parameter given in the final iteration of the model-fitting algorithm. This hypothesis test is possible because the relative gradient criterion is algebraically equivalent to the score test statistic. Remarkably, the score test does not require knowledge of the true MLE.

Score Test

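Consider first the case of a single parameter, θ. Let l be the log-likelihood function for θ and let x be the data. The score is the derivative of the log-likelihood function with respect to θ:

$$U(\theta) = \frac{\partial l(\theta \mid x)}{\partial \theta}$$

The observed information is:

$$I(\theta) = -\frac{\partial^2 l(\theta \mid x)}{\partial \theta^2}$$

The statistic for the score test of H0:

$$\theta = \theta_0$$

is:

$$S(\theta_0) = \frac{U(\theta_0)^2}{I(\theta_0)}$$

This statistic has an asymptotic Chi-square distribution with 1 degree of freedom under the null hypothesis.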
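As an illustration of the single-parameter case, the following minimal sketch computes a score test for the rate of independent Poisson counts. The Poisson model, the function name `poisson_score_test`, and the data are assumptions chosen for the example and are not part of the Mixed Model procedure. The p-value uses the identity that the upper tail of a Chi-square distribution with 1 degree of freedom at s equals erfc(sqrt(s/2)).

```python
import math

def poisson_score_test(counts, rate0):
    """Score test of H0: rate = rate0 for i.i.d. Poisson counts (illustrative)."""
    n = len(counts)
    total = sum(counts)
    # Log-likelihood up to a constant: l(rate) = total * log(rate) - n * rate
    score = total / rate0 - n        # U(rate0): first derivative of l
    info = total / rate0 ** 2        # I(rate0): minus the second derivative of l
    stat = score ** 2 / info
    # Upper tail of a Chi-square with 1 df: P(X > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

counts = [3, 5, 4, 6, 2]                             # hypothetical data, sample mean 4
stat_at_mle, p_at_mle = poisson_score_test(counts, 4.0)
stat_off, p_off = poisson_score_test(counts, 5.0)
print(stat_at_mle, p_at_mle)
print(stat_off, p_off)
```

At the sample mean, which is the MLE for this model, the score is zero and the statistic vanishes. Moving the hypothesized value away from the MLE increases the statistic and lowers the p-value. Note that the statistic is computed without ever solving for the MLE.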

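The score test can be generalized to multiple parameters. Consider the vector of parameters θ. Then the test statistic for the score test of H0:

$$\theta = \theta_0$$

is:

$$S(\theta_0) = U(\theta_0)' \, I(\theta_0)^{-1} \, U(\theta_0)$$

where

$$U(\theta) = \frac{\partial l(\theta \mid x)}{\partial \theta}, \quad I(\theta) = -\frac{\partial^2 l(\theta \mid x)}{\partial \theta \, \partial \theta'}$$

and $U'$ denotes the transpose of the matrix U.

The test statistic is asymptotically Chi-square distributed with k degrees of freedom. Here k is the number of unbounded parameters.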

Relative Gradient

The convergence criterion for the Mixed Model fitting procedure is based on the relative gradient $g(\theta)' H(\theta)^{-1} g(\theta)$. Here, $g(\theta)$ is the gradient of the log-likelihood function and $H(\theta)$ is its Hessian.

Let $\theta_0$ be the value of $\theta$ where the algorithm terminates. Note that the relative gradient evaluated at $\theta_0$ is the score test statistic. A p-value is calculated using a Chi-square distribution with k degrees of freedom. This p-value gives an indication of whether the value of the unknown MLE is consistent with $\theta_0$. The number of unbounded parameters listed in the Random Effects Covariance Parameter Estimates report equals k.
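The calculation can be sketched for a problem with k = 2 unbounded parameters. The gradient and curvature values below are hypothetical, and the curvature matrix is assumed to be positive definite (that is, the Hessian of the negative log-likelihood) so that the quadratic form is nonnegative. For 2 degrees of freedom, the Chi-square upper tail has the closed form exp(-s/2), which keeps the example within the standard library.

```python
import math

def relative_gradient(g, h):
    """Return g' H^{-1} g for a 2-parameter problem (illustrative helper)."""
    (h11, h12), (h21, h22) = h
    det = h11 * h22 - h12 * h21
    # Solve H z = g using the closed-form inverse of a 2x2 matrix
    z1 = (h22 * g[0] - h12 * g[1]) / det
    z2 = (-h21 * g[0] + h11 * g[1]) / det
    return g[0] * z1 + g[1] * z2

def chi2_sf_2df(stat):
    """Upper tail of a Chi-square distribution with 2 df: exp(-stat / 2)."""
    return math.exp(-stat / 2.0)

g_final = [0.30, -0.10]                 # hypothetical gradient at termination
h_final = [[4.0, 1.0], [1.0, 2.0]]      # hypothetical positive definite curvature
stat = relative_gradient(g_final, h_final)
p_value = chi2_sf_2df(stat)
print(stat, p_value)
```

A large p-value, as in this example, indicates that the terminating parameter value is consistent with the unknown MLE. A small p-value suggests that the algorithm stopped far from it.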

Random Coefficient Model

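The standard random coefficient model specifies a random intercept and slope for each subject. Let $y_{ij}$ denote the measurement of the jth observation on the ith subject and let $x_{ij}$ denote the corresponding value of the covariate. Then the random coefficient model can be written as follows:

$$y_{ij} = a_i + b_i x_{ij} + e_{ij}$$

where

$$\begin{pmatrix} a_i \\ b_i \end{pmatrix} \sim \text{iid } N\left( \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, G \right), \quad G = \begin{pmatrix} \sigma_a^2 & \sigma_{ab} \\ \sigma_{ab} & \sigma_b^2 \end{pmatrix}$$

and

$$e_{ij} \sim \text{iid } N(0, \sigma^2)$$

You can reformulate the model to reflect the fixed and random components that are estimated by JMP as follows.

$$y_{ij} = \alpha + \beta x_{ij} + a_i^* + b_i^* x_{ij} + e_{ij}$$

where

$$a_i^* = a_i - \alpha, \quad b_i^* = b_i - \beta$$

and

$$\begin{pmatrix} a_i^* \\ b_i^* \end{pmatrix} \sim \text{iid } N(0, G)$$

with G and $\sigma^2$ defined as above.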
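The equivalence of the two formulations can be checked numerically. The sketch below assumes the usual form of the model, in which each subject has an intercept and slope drawn from a bivariate normal distribution with mean (α, β) and covariance matrix G, and the reformulated version separates the fixed part α + βx from centered random deviations. The function name, parameter values, and covariate values are hypothetical.

```python
import math
import random

def simulate_subject(alpha, beta, g, sigma, xs, rng):
    """Simulate one subject's responses under both forms of the model."""
    (g11, g12), (_, g22) = g
    # Cholesky factor of the 2x2 covariance matrix G
    l11 = math.sqrt(g11)
    l21 = g12 / l11
    l22 = math.sqrt(g22 - l21 ** 2)
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    a_star = l11 * z1                  # centered random intercept
    b_star = l21 * z1 + l22 * z2       # centered random slope
    a_i, b_i = alpha + a_star, beta + b_star
    errors = [rng.gauss(0, sigma) for _ in xs]
    # Original form: subject-specific intercept and slope
    y_original = [a_i + b_i * x + e for x, e in zip(xs, errors)]
    # Reformulated form: fixed effects plus centered random effects
    y_reformulated = [alpha + beta * x + a_star + b_star * x + e
                      for x, e in zip(xs, errors)]
    return y_original, y_reformulated

rng = random.Random(1)
y1, y2 = simulate_subject(10.0, 2.0, [[4.0, 0.5], [0.5, 1.0]], 0.3,
                          [0, 1, 2, 3], rng)
print(y1)
print(y2)
```

The two lists agree up to floating-point rounding, which reflects that the reformulation only moves the means α and β from the random effects distribution into the fixed part of the model.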

Repeated Measures

The form of the repeated measures model is $y_{ijk} = \alpha_{ij} + s_{ik} + e_{ijk}$, where

$\alpha_{ij}$ can be written as a treatment and time factorial

$s_{ik}$ is the random effect of the kth subject assigned to the ith treatment

$j = 1, \ldots, m$ denotes the repeated measurements over time.

Assume that the $s_{ik}$ are independent and identically distributed $N(0, \sigma_s^2)$ variables. Denote the number of treatment factors by t and the number of subjects by s. Then the distribution of the errors $e_{ijk}$ is $N(0, \Sigma)$, where $\Sigma$ is block diagonal with one block for each subject within treatment.

Denote the block diagonal component of the covariance matrix $\Sigma$ corresponding to the ikth subject within treatment by $\Sigma_{ik}$. In other words, $\Sigma_{ik} = \mathrm{Var}(y_{ik} \mid s_{ik})$. Because observations over time within a subject are not typically independent, it is necessary to estimate the variance of $y_{ijk} \mid s_{ik}$. Failure to account for the correlation leads to distorted inference. The following sections describe the structures available for $\Sigma_{ik}$.

Unstructured Covariance Structure

Here, the variance among observations taken at time j is:

$$\mathrm{Var}(y_{ijk} \mid s_{ik}) = \sigma_j^2$$

The covariance between observations taken at times j and j′ is:

$$\mathrm{Cov}(y_{ijk}, y_{ij'k} \mid s_{ik}) = \sigma_{jj'}$$

Observations at every time have a unique variance and observations within the same subject at every pair of distinct times have a unique covariance.

AR(1) Covariance Structure

The covariance between observations taken at times j and j′ is:

$$\mathrm{Cov}(y_{ijk}, y_{ij'k} \mid s_{ik}) = \sigma^2 \rho^{|t_j - t_{j'}|}$$

Here $t_j$ is the time of observation j. In this structure, observations taken at any given time have the same variance, $\sigma^2$. The parameter ρ, where -1 < ρ < 1, is the correlation between two observations that are one unit of time apart. As the time difference between observations increases, the magnitude of their covariance decreases because ρ is raised to a higher power. In many applications, AR(1) provides an adequate model of the within-subject correlation, providing more power without sacrificing Type I error control.
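A minimal sketch of the within-subject covariance matrix under this structure, assuming the covariance between observations at times tj and tj′ is the common variance multiplied by ρ raised to the absolute time difference. The function name and the numeric values are hypothetical.

```python
def ar1_covariance(times, sigma2, rho):
    """Build the AR(1) covariance matrix sigma2 * rho ** |t_j - t_k|."""
    # Assumes 0 < rho < 1 or integer time differences, so the power is real
    return [[sigma2 * rho ** abs(tj - tk) for tk in times] for tj in times]

cov = ar1_covariance([0, 1, 2, 4], sigma2=2.0, rho=0.5)
for row in cov:
    print(row)
```

The diagonal holds the common variance, and entries shrink as the gap between observation times grows. Unequal spacing, such as the jump from time 2 to time 4, is handled through the exponent.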

Compound Symmetry Covariance Structure

In JMP, a compound symmetry covariance structure is implemented using the independent errors, mixed-model approach. Random effects are classified into two categories: G-side or R-side. See Searle, Casella, and McCulloch (1992) for additional details.

The G-side random effects are associated with the design matrix for random effects. The R-side random effects are associated with residual error. Between-subject variance is part of the design structure and is modeled on the G-side. Within-subject variance falls into the residual structure and is modeled on the R-side. In the independent structure:

•	The random effects G-side variance is modeled by $s_{ik}$ ~ iid $N(0, \sigma_s^2)$.

•	The R-side variance is modeled by $e_{ijk}$ ~ iid $N(0, \sigma^2)$.

It follows that the covariance matrix is given as follows:

$$\Sigma_{ik} = \sigma_s^2 J + \sigma^2 I$$

where J is a matrix consisting of 1s and I is an identity matrix.
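The matrix can be built directly from the two variance components. This sketch assumes a subject with m repeated measurements, a subject variance added to every entry (the J term), and a residual variance added on the diagonal only (the I term). The function name and values are hypothetical.

```python
def compound_symmetry(m, sigma_s2, sigma2):
    """Return the m x m matrix sigma_s2 * J + sigma2 * I as nested lists."""
    return [[sigma_s2 + (sigma2 if j == k else 0.0) for k in range(m)]
            for j in range(m)]

cov = compound_symmetry(3, sigma_s2=1.0, sigma2=2.0)
for row in cov:
    print(row)

# Implied correlation between any two measurements on the same subject
correlation = cov[0][1] / cov[0][0]
print(correlation)
```

Every pair of measurements on the same subject has the same covariance, so the implied within-subject correlation is the subject variance divided by the total variance, regardless of how far apart in time the measurements are.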

Alternatively, all variance could be modeled R-side. Under the Gaussian assumption, this compound-symmetry covariance structure (Type=CS in SAS) is equivalent to the independence model. This structure is not available in JMP and is listed here for informational purposes only.

where

and

Spatial and Temporal Variability

Consider the simple model $y_i = \mu + e_i$. The spatial or temporal structure is modeled through the error term, $e_i$. In general, the spatial correlation model can be defined as

$$\mathrm{Var}(e_i) = \sigma_i^2$$

and

$$\mathrm{Cov}(e_i, e_j) = \sigma_{ij}$$

Let $s_i$ denote the location of $y_i$, where $s_i$ is specified by coordinates reflecting space or time. The spatial or temporal structure is typically restricted by assuming that the covariance is a function of the Euclidean distance, $d_{ij}$, between $s_i$ and $s_j$. The covariance can be written as $\mathrm{Cov}(e_i, e_j) = \sigma^2 f(d_{ij})$, where $f(d_{ij})$ represents the correlation between observations $y_i$ and $y_j$.

In the case of two or more location coordinates, if f(dij) does not depend on direction, then the covariance structure is isotropic. If it does, then the structure is anisotropic.

Spatial Correlation Structures

The correlation structures for spatial models available in JMP are shown below. These are parametrized by ρ, which is positive unless it is otherwise constrained.

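•	Spherical

$$f(d_{ij}) = \left[ 1 - 1.5 \left( \frac{d_{ij}}{\rho} \right) + 0.5 \left( \frac{d_{ij}}{\rho} \right)^3 \right] 1\{d_{ij} < \rho\}$$

where $1\{d_{ij} < \rho\}$ equals 1 if $d_{ij} < \rho$ and 0 otherwise.

•	Exponential

$$f(d_{ij}) = \exp\left( -\frac{d_{ij}}{\rho} \right)$$

•	Gaussian

$$f(d_{ij}) = \exp\left( -\frac{d_{ij}^2}{\rho^2} \right)$$

•	Power

$$f(d_{ij}) = \rho^{d_{ij}}$$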

For an anisotropic model, the correlation function contains a parameter, ρκ, for each direction.
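The isotropic correlation functions can be sketched as follows. The functional forms are an assumption: they follow the parameterization commonly used for these structures, in which ρ is the range of the spherical model and a scale parameter in the exponential and Gaussian models. The function names are hypothetical.

```python
import math

def spherical(d, rho):
    """Spherical correlation: exactly zero at and beyond the range rho."""
    if d >= rho:
        return 0.0
    r = d / rho
    return 1.0 - 1.5 * r + 0.5 * r ** 3

def exponential(d, rho):
    """Exponential correlation: exp(-d / rho)."""
    return math.exp(-d / rho)

def gaussian(d, rho):
    """Gaussian correlation: exp(-(d / rho)^2)."""
    return math.exp(-(d / rho) ** 2)

def power(d, rho):
    """Power correlation: rho ** d."""
    return rho ** d

rho = 2.0
print(spherical(rho, rho))                     # correlation at the range
print(exponential(3 * rho, rho))               # correlation at distance 3 * rho
print(gaussian(math.sqrt(3) * rho, rho))       # correlation at distance sqrt(3) * rho
```

Under these forms, the exponential correlation at distance 3ρ and the Gaussian correlation at distance √3ρ both equal exp(-3), which is just under 0.05. The spherical correlation reaches zero exactly at ρ.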

Variogram

When the spatial process is second-order stationary, the structures listed in Spatial Correlation Structures define variograms. Borrowed from geostatistics, the variogram is the standard tool for describing and estimating spatial variability. It measures spatial variability as a function of the distance, dij, between observations using the semivariance.

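Let Z(s) denote the value of the response at a location s. The semivariance between observations at $s_i$ and $s_j$ is given as follows:

$$\gamma(s_i, s_j) = \frac{1}{2} \mathrm{Var}\left( Z(s_i) - Z(s_j) \right)$$

If the response has a constant mean, then the expression can be simplified to the following:

$$\gamma(s_i, s_j) = \frac{1}{2} E\left[ \left( Z(s_i) - Z(s_j) \right)^2 \right]$$

If the process is isotropic, the semivariance depends only on the distance h between points and the function can be written as follows:

$$\gamma(h) = \frac{1}{2} E\left[ \left( Z(s_i) - Z(s_j) \right)^2 \right], \quad \lVert s_i - s_j \rVert = h$$

The following terms are associated with variograms: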

Nugget

Defined as the intercept. This represents a jump discontinuity at h = 0.

Sill

Defined as the value of the semivariogram at the plateau reached for larger distances. It corresponds to the variance of an observation. In models with no nugget effect, the sill is $\sigma^2$. In models with a nugget effect, the sill is $\sigma^2 + c_1$, where $c_1$ represents the nugget. The partial sill is defined as $\sigma^2$.

Range

Defined as the distance at which the semivariogram reaches the sill. At distances less than the range, observations are spatially correlated. For distances greater than or equal to the range, spatial correlation is effectively zero. In spherical models, ρ is the range. In exponential models, 3ρ is the practical range. In Gaussian models, $\sqrt{3}\rho$ is the practical range. The practical range is defined as the distance at which the covariance is reduced to 5% of the sill, so that the semivariogram has reached 95% of the sill.

In Fit a Spatial Structure Model, the repeated effects covariance parameter estimates represent the various semivariogram features:

Spatial Spherical

An estimate of the range, ρ.

Nugget

A scaled estimate of $c_1$. The Residual times the Nugget is $c_1$.

Residual

The partial sill or the sill in no nugget models.
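The relationship between the reported estimates and the semivariogram features amounts to simple arithmetic. The sketch below assumes that the reported Nugget is the nugget effect divided by the Residual, so that their product recovers the nugget on the original scale. The function name and the numeric estimates are hypothetical.

```python
def variogram_features(residual, nugget_scaled=0.0):
    """Recover semivariogram features from the reported estimates."""
    partial_sill = residual                 # Residual is the partial sill
    nugget = residual * nugget_scaled       # Residual times the scaled Nugget
    sill = partial_sill + nugget            # equals the Residual when there is no nugget
    return {"partial_sill": partial_sill, "nugget": nugget, "sill": sill}

features = variogram_features(residual=2.0, nugget_scaled=0.25)
print(features)
```

With no nugget in the model, the scaled nugget is zero and the sill coincides with the Residual estimate.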

Variogram Estimate

For a given isotropic spatial structure, the estimated variogram is obtained using a nonlinear least squares fit of the observed data to the appropriate function in Spatial Correlation Structures.

Empirical Semivariance

To compute the empirical semivariance, the distances between all pairs of points for the variables selected for the variogram covariance are computed. The range of the distances is divided into 10 equal intervals. If the data do not allow for 10 intervals, then as many intervals as possible are constructed.

Distance classes consisting of pairs of points are constructed. The hth distance class consists of all pairs of points whose distances fall in the hth interval.

Consider the following notation:

total number of pairs of points

$C_h$

distance class consisting of pairs of points whose distance falls into the hth largest interval

Z(x)

value of the response at x, where x is a vector of temporal or spatial coordinates

γ(h)

semivariance for distance class $C_h$

The semivariance function, γ, is defined as follows:

Here

is an estimate of the nugget effect.

The Kackar-Harville Correction

The variance matrix of the fixed effects is always modified to include a Kackar-Harville correction. The variance matrix of the BLUPs, and the covariances between the BLUPs and the fixed effects, are not Kackar-Harville corrected. The rationale for this approach is that corrections for BLUPs can be computationally and memory intensive when the random effects have many levels. In SAS, the Kackar-Harville correction is done for both fixed effects and BLUPs only when DDFM=KENWARDROGER is set.

For covariance structures that have nonzero second derivatives with respect to the covariance parameters, the Kenward-Roger covariance matrix adjustment includes a second-order term. This term can result in standard error shrinkage. Also, the resulting adjusted covariance matrix can then be indefinite and is not invariant under reparameterization. The first-order Kenward-Roger covariance matrix adjustment eliminates the second derivatives from the calculation. All spatial structures and the AR(1) structure are covariance structures that generally lead to nonzero second derivatives.

Because JMP implements the first-order Kenward-Roger adjustment, the following results hold:

•	Standard errors for linear combinations involving only fixed effects parameters match PROC MIXED DDFM=KENWARDROGER(FIRSTORDER). This presumes that one has taken care to transform between the different parameterizations used by PROC MIXED and JMP.

•	Standard errors for linear combinations involving only BLUP parameters match PROC MIXED DDFM=SATTERTHWAITE.

•	Standard errors for linear combinations involving both fixed effects and BLUPs do not match PROC MIXED for any DDFM option if the data are unbalanced. However, these standard errors are between those obtained using the DDFM=SATTERTHWAITE and DDFM=KENWARDROGER options. If the data are balanced, JMP matches SAS regardless of the DDFM option, because the Kackar-Harville correction is null.

Degrees of Freedom

The degrees of freedom for tests involving only linear combinations of fixed effect parameters are calculated using the first-order Kenward-Roger correction. So JMP’s results for these tests match PROC MIXED using the DDFM=KENWARDROGER(FIRSTORDER) option. If there are BLUPs in the linear combination, JMP uses a Satterthwaite approximation to get the degrees of freedom. The results then follow a pattern similar to what is described for standard errors in the preceding paragraph.

For more details about the Kackar-Harville correction and the Kenward-Roger DF approach, see Kenward and Roger (1997). The Satterthwaite method is described in detail in the SAS PROC MIXED documentation (SAS/STAT 12.3 User’s Guide, Chapter 59).