software for linear mixed models

131
Mixed Models Symposium Agricultural field trials 10.10.2007 University of Hohenheim Dr. sc. agr. Andreas Büchse 07/2002 - 04/2007 Fachgebiet Bioinformatik. Universität Hohenheim since 05/2007 Agrarzentrum Limburgerhof. BASF Aktiengesellschaft [email protected]

Upload: others

Post on 21-Feb-2022

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Software for Linear Mixed Models

Mixed Models

Symposium Agricultural field trials10.10.2007

University of Hohenheim

Dr. sc. agr. Andreas Büchse07/2002 - 04/2007 Fachgebiet Bioinformatik. Universität Hohenheimsince 05/2007 Agrarzentrum Limburgerhof. BASF Aktiengesellschaft

[email protected]

Page 2: Software for Linear Mixed Models

Outline

1. Definition of mixed models

2. Examples for use (usefulness) of mixed models2.1 Series of experiments

2.2 Hierarchical Designs

2.3 Repeated measurement

2.4 Split-Plot, multifactorial trials

2.5 Random coefficient models

2.6 Spatial correlation

3. Summary, Conclusions, rules of thumb

Page 3: Software for Linear Mixed Models

First...

I will put the slides on the symposium homepage

www.uni-hohenheim.de/bioinformatik

next week

Or send an eMail to

[email protected]

Page 4: Software for Linear Mixed Models

Literature

Littell, Milliken, Stroup, Wolfinger (1996): SAS System for Mixed Models. SAS Institute, Cary, NC. 2nd ed. 2006!

Verbeke, G & G. Molenberghs, 2000: Linear Mixed Models for Longitudinal Data. Springer-Verlag, New York.

Piepho, H.P., Büchse, A., and Emrich, K. (2003): A hitchhiker's guide to the mixed model analysis of randomized experiments. Journal of Agronomy and Crop Science, 189, 310-322

Piepho, H.P., Büchse, A., Richter, C. (2004): A mixed modelling approach to randomized experiments with repeated measures. Journal of Agronomy and Crop Science, 190, 230-247

Zimmerman, D. L. & D. A. Harville, 1991: A Random Field Approach to the Analysis of Field-Plot Experiments and Other Spatial Experiments. Biometrics 47: 223-23.

Longford, N. T., 1993: Random Coefficient Models. Oxford University Press.

Page 5: Software for Linear Mixed Models

Definition

What does this mean, „mixed“ model?

Page 6: Software for Linear Mixed Models

a variety trial

One location, several varieties, e.g. 4 replications

Randomized Complete Block Design

Page 7: Software for Linear Mixed Models

fixed effects model

Model of a randomized complete block design

yik = μ + bk + αi + eik

yij = yield of the i-th variety in the k-th blockμ = constant (fix)bk = effect of the k-th block (fix)αi = effect of the i-th variety (fix)eik = residual error ~N (0, σ²e)

Only one random effect ⇒ no mixed model

Page 8: Software for Linear Mixed Models

a variety trial (again)

One location, several varieties, maybe from a cross of twolines, e.g. 4 replicationsRandomized Complete Block Design

Page 9: Software for Linear Mixed Models

a variety trial (again)

If the breeder is interested in estimating response to selection

R = i h σG

R = response to selectioni = selection intensityh = sqrt of heritabilityσG = sqrt of genetic variance

Then, it is reasonable to treat variety effects as randomeffects (sampled from a Gaussian distribution)

Page 10: Software for Linear Mixed Models

mixed model

Model of a randomized complete block design, withrandom variety effects

yik = μ + bk + αi + eik

yik = yield of the i-th variety in the k-th blockμ = constant (fix)bk = effect of the k-th block (fix)αi = effect of the i-th variety ~N (0, σ²G)eik = residual error ~N (0, σ²e)

more than one random effect ⇒ mixed model !

Page 11: Software for Linear Mixed Models

Definition

A mixed linear model is a model that have morethan one random effect (not only the residual error).

It has random and fixed effects.

Random = random sample from a (normal) distribution

Page 12: Software for Linear Mixed Models

Fixed effects versus random effects

Interest in linear models [...] lies mainly in estimating (and testing hypotheses about) linear functions of the effects in the models. These effects are what we call fixed effects, and the models are correspondingly called fixed effectsmodels. There are, however, situations where we have no interestin linear functions of effects but where, by the nature of the data and their derivation, the things of prime interestconcerning the effects are variances. Effects of this nature are called random effects [...] and the models are calledrandom effects models.Models involving a mixture of fixed and random effects arecalled mixed models.

SEARLE, Linear models, 1971

Page 13: Software for Linear Mixed Models

Fixed or random?

Are inferences going to be drawn from these dataabout just these levels of the factor?

„Yes“ – then the effects are to be considered as fixed effects

„No“ – then, presumably, inferences will be madenot just about the levels occurring in the data butabout some population of levels [...] the effectsare considered as being random.

SEARLE, Linear models, 1971

Page 14: Software for Linear Mixed Models

Fix or random?

Levels of factor aresample from a randomdistribution?

no fix factor estimate fixedeffects (BLUE)

yes

randomfactor

What isthe maininterest?

only distribution of random effects

Estimate variancecomponents (REML)

distribution of randomeffects as well as effectsthemselves

Estimate variancecomponents (REML) and realized random effects(BLUP)

Page 15: Software for Linear Mixed Models

Are years random? (SEARLE, Linear models, 1971)

[...] take the case of year effects, for example in studying wheat yields: are the effects of years on yield to be considered fixed or random?

The years themselves are unlikely to be random, for they will probably be a group of consecutiveyears over which the data have been gathered orthe experiments run.

But the effects on yield may reasonably beconsidered random - unless, perhaps, one isinterested in comparing specific years for somepurpose.

Page 16: Software for Linear Mixed Models

Are blocks fix or random?

Everlasting story...

• With fix blocks s.e.m. is underestimated

• For s.e.d. it does not matter!

• With random blocks, in small experiments, the variancecomponent is sometimes estimated <0, which results in model reduction

• > my opinion: blocks fix is OK.

Page 17: Software for Linear Mixed Models

Some examples

• Genotypic lines in a breeding experiment (random)• Varieties in a VCU-trial (fix, or random with Shrinkage ->

BLUP, see Michel & Piepho)• Different doses of plant protection agents (fix)• Different agents at same level (fix)• Locations (fix or random)• Regions (fix)• Years (random, mostly)• Complete blocks (both is possible)• Incomplete blocks (both is possible, random allows use of

interblock information (see Williams, Proceedings p. 266)

Page 18: Software for Linear Mixed Models

some theory

Page 19: Software for Linear Mixed Models

General form of a mixed model in matrix notation

y = Xβ + Zu + e

y = a vector with n observations

X, Z = Designmatrices

β = a vector of fixed effectsu = a vector of random effectse = a vector of residuals

e ∼ N(0,R) u ∼ N(0,G)

R, G = variance-covariance matrices

Page 20: Software for Linear Mixed Models

General form of a mixed model in matrix style

y = Xβ + Zu + e e ∼ N(0,R) u ∼ N(0,G)

⎟⎟⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜⎜⎜

+⎟⎟⎟

⎜⎜⎜

⎛⋅

⎟⎟⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜⎜⎜

+⎟⎟⎟

⎜⎜⎜

⎛⋅

⎟⎟⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜⎜⎜

=

⎟⎟⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜⎜⎜

6

5

4

3

2

1

3

2

1

2

1

100001010100010001

101101101011011011

141515151612

eeeeee

bb

αααμ

6 plots, 2 fixed block effects, 3 random variety effects

Page 21: Software for Linear Mixed Models

G: the variance-covariance-matrix of random effects

A 3 by 3 matrix because we have 3 varieties

Genetic variance on the diagonal

No covariance between genotypic effects

IG 22

2

2

2

100010001

000000

GG

G

G

G

σσσ

σσ

=⎟⎟⎟

⎜⎜⎜

⎛=

⎟⎟⎟

⎜⎜⎜

=

Page 22: Software for Linear Mixed Models

R: the residual variance-covariance-matrix (iid)

6 by 6 matrix because we have 6 plots.Residual variance on the diagonalNo covariance (in case of independent errors)This is in every case the form of R in a fixed effects model. But don‘t have to be in a mixed model. In a mixed model alternative forms of R are possible, we will see later.

IR 22

2

2

2

2

2

2

100000010000001000000100000010000001

000000000000000000000000000000

ee

e

e

e

e

e

e

σσ

σσ

σσ

σσ

=

⎟⎟⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜⎜⎜

=

⎟⎟⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜⎜⎜

=

Page 23: Software for Linear Mixed Models

Estimation

The fixed and random effects are estimated withthe method of WEIGHTED LEAST SQUARES

Weights depend on variance components (G, R)

Normally the variance components are unknownand have to be estimated themselves

An accepted method for the estimation is REML(Patterson & Thompson 1971)

Page 24: Software for Linear Mixed Models

Estimation of random and fixed effects

( ) yVX'XVX'β 11 −−−=ˆ

Weighted least square mean estimation of:

V = ZGZ' + R

For more details see e.g. Littell et al. „SAS-System for MIXED Models“and cited literature

Difficult matrix equations, but the computer does thisfor us ☺

Page 25: Software for Linear Mixed Models

Software for Linear Mixed Models

SAS PROC MIXED

MIXED Procedure in SPSS

GenStat

ASReml

S-Plus

R

Several other packages (Statistica, Systat, ...)

Page 26: Software for Linear Mixed Models

Some examples for using mixed models

Page 27: Software for Linear Mixed Models

Example 1: Trial series

A cultivar performance trial14 oat varieties, 18 sites

Model for a single siteyik = μ + bk + αi + eik

yik = yield of the i-th variety in the k-th blockμ = constant (fix)bk = effect of the k-th block (fix)α i = effect of the i-th variety (fix)eik = residual error ~N (0, σ²e)

Page 28: Software for Linear Mixed Models

From single trial to a series

yik = μ + αi + bk + eik add index j for sites

yijk = μj + αij + bjk + eijk

yijk = yield of the i-th variety in the k-th block at site jμj = mean yield at site j (fix or random?)αij = effect of the i-th variety at site j (fix?) bjk = effect of the k-th block at site j (fix?)eijk = residual error ~N (0, σ²e)

Page 29: Software for Linear Mixed Models

Main effects for site and variety

yijk = μj + αij + bjk + eijk = μ + αi + βj + (αβ)ij + bjk + eijk

yijk = yield of the i-th variety in the k-th block at site jμ = global meanαi = effect of the i-th varietyβj = effect of the j-th site(αβ)ij = interaction of the i-th variety by the j-th sitebjk = effect of the k-th block at site jeijk = residual error ~N (0, σ²e)

Which effects should be random?

Page 30: Software for Linear Mixed Models

have sites fixed or random effects?

Aim of the series?

want to make inferences for a region or a state (in thisexample whole Germany) -> random

Are the sites a random sample from „a population“ of sitesin Germany?

The sites are mostly no random sample, but maybe thesite effects are.

(for single years site effects are confounded with site byyear interaction!)

Page 31: Software for Linear Mixed Models

Are variety effects fixed or random?

Aim of the series?

want to know about performance of the testedvarieties

-> fix

The varieties are no random sample from „a population“ of varieties

Page 32: Software for Linear Mixed Models

mixed model

yijk = μ + αi + βj + (αβ)ij + bjk + eijk

sites: randomvarieties: fixedsite × variety: random (sites are random)blocks within site: random (sites are random)

βj ∼ N(0, σ²β)(αβ)ij ∼ N(0, σ²αβ)bjk ∼ N(0, σ²b)eijk ∼ N(0, σ²e)

Page 33: Software for Linear Mixed Models

SAS Code for trial series

data t; input variety site block yield; datalines; 1 1 1 58.1300 1 1 2 60.8000 <more data> 14 18 4 60.0000 ; proc mixed data=t; class site variety block; model yield= variety / ddfm=kenwardroger; random site variety*site block(site);lsmeans variety/pdiff; run;

Denominator degrees of freedom according to themethod of Kenward & Roger

Page 34: Software for Linear Mixed Models

Output

Cov Parm Estimate

site 82.1917site*variety 5.9844block(site) 8.0897Residual 9.4411

Type 3 Tests of Fixed Effects

Num DenEffect DF DF F Value Pr > F

variety 13 221 8.67 <.0001

This are thevariancecomponents

Page 35: Software for Linear Mixed Models

OutputLeast Squares Means

StandardEffect variety Estimate Error DF t Value Pr > |t|

variety 1 53.6514 2.2676 20.2 23.66 <.0001variety 2 51.0719 2.2676 20.2 22.52 <.0001variety 3 55.6722 2.2676 20.2 24.55 <.0001variety 4 56.2379 2.2676 20.2 24.80 <.0001variety 5 55.9819 2.2676 20.2 24.69 <.0001variety 6 53.6103 2.2676 20.2 23.64 <.0001variety 7 53.8129 2.2676 20.2 23.73 <.0001variety 8 52.3715 2.2676 20.2 23.10 <.0001variety 9 58.4689 2.2676 20.2 25.78 <.0001variety 10 56.5832 2.2676 20.2 24.95 <.0001variety 11 53.8617 2.2676 20.2 23.75 <.0001variety 12 57.2378 2.2676 20.2 25.24 <.0001variety 13 54.7246 2.2676 20.2 24.13 <.0001variety 14 53.7700 2.2676 20.2 23.71 <.0001

Page 36: Software for Linear Mixed Models

Output

Differences of Least Squares Means

StandardEffect variety _variety Estimate Error DF t Value Pr > |t|

variety 1 2 2.5794 0.9629 221 2.68 0.0079variety 1 3 -2.0208 0.9629 221 -2.10 0.0370variety 1 4 -2.5865 0.9629 221 -2.69 0.0078variety 1 5 -2.3306 0.9629 221 -2.42 0.0163variety 1 6 0.04111 0.9629 221 0.04 0.9660variety 1 7 -0.1615 0.9629 221 -0.17 0.8669variety 1 8 1.2799 0.9629 221 1.33 0.1852variety 1 9 -4.8175 0.9629 221 -5.00 <.0001variety 1 10 -2.9318 0.9629 221 -3.04 0.0026variety 1 11 -0.2103 0.9629 221 -0.22 0.8273variety 1 12 -3.5864 0.9629 221 -3.72 0.0002variety 1 13 -1.0732 0.9629 221 -1.11 0.2663variety 1 14 -0.1186 0.9629 221 -0.12 0.9021...variety 13 14 0.9546 0.9629 221 0.99 0.3226

Page 37: Software for Linear Mixed Models

allow heterogenous error for each site

proc mixed data=t lognote; class site variety block; model yield= variety / ddfm=kenwardroger; random site variety*site block(site);

repeated / group=site;lsmeans variety/pdiff; run;

Page 38: Software for Linear Mixed Models

CPU-time effect of different syntax

/*heterogenous errors - long cpu time*/proc mixed data=t lognote; class site variety block; model yield = variety / ddfm=kenwardroger; random site variety*site block(site); large G-matrixrepeated / group=site;run;

/*heterogenous errors - short cpu time*/proc mixed data=t lognote; class site variety block; model yield = variety / ddfm=kenwardroger; random int variety block / subject=site;repeated / group=site;run;

Page 39: Software for Linear Mixed Models

allow heterogenous error for each siteCov Parm Subject Group Estimate

Intercept site 82.3446variety site 6.0686block site 7.7449Residual site 1 3.2903Residual site 2 5.3894Residual site 3 11.0351Residual site 4 5.9991Residual site 5 3.5523Residual site 6 9.7576Residual site 7 5.9514Residual site 8 4.7286Residual site 9 9.7070Residual site 10 15.2087Residual site 11 6.3460Residual site 12 5.3596Residual site 13 7.2015Residual site 14 7.9454Residual site 15 44.0650Residual site 16 4.9044Residual site 17 7.6436Residual site 18 11.1075

Page 40: Software for Linear Mixed Models

Model selection

Homogenous errors or heterogenous errors?

Comparison of AIC (Akaike criterion)

Fit Statistics heterogeous homogenous

-2 Res Log Likelihood 5471.4 5635.7AIC (smaller is better) 5513.4 5643.7AICC (smaller is better) 5514.3 5643.7BIC (smaller is better) 5532.1 5647.2

Page 41: Software for Linear Mixed Models

Model selection

Likelihood Ratio test

Fit Statistics heterogeous homogenous

-2 Res Log Likelihood 5471.4 5635.7

The log-Likelihood of heterogenous is higher

T = 164.3

Compare to a Chi²-distribution at 17 df.

Chi²(5%, 17) = 27.6 ⇒ significantly better

Page 42: Software for Linear Mixed Models

outlook

Several different structures for genotype-environmentinteraction possible

Piepho, H.-P., 1997: Analyzing genotype-environment data by mixedmodels with multiplicative effects. Biometrics 53: 761-766.

F. v. Eeuwijk: Statistical methods for the analysis of seriesof plant breeding experiments. (Proceed. of this meeting, 242 ff.)

2-step-analysis: Estimate means „variety×site“ in first step. Then analysethis means (not the plot values) over all sites in 2nd step

Michel et al. (Proceedings 136-141)

Page 43: Software for Linear Mixed Models

Example 2: Hierarchical designs

(analogous to a example from Snedecor & Cochran, Statistical methods, 1967)

plant calcium content [%] of single leaves

1 3.28 3.09 3.03 3.03

2 3.52 3.48 3.38 3.38

3 2.88 2.80 2.81 2.76

4 3.34 3.38 3.23 3.26

Page 44: Software for Linear Mixed Models

Calcium content of the four plants

calcium

2.7

2.8

2.9

3.0

3.1

3.2

3.3

3.4

3.5

3.6

plant

1 2 3 4

Page 45: Software for Linear Mixed Models

Statistic inference

What is the best estimator for average Ca-content?

1) Simple arithmetic mean from 16 leaves

2) Calculate the average per plant first and thencalculate the average over four plant means

Does this matter?

Page 46: Software for Linear Mixed Models

Calculate plant means

data beet;input plant @@;do leaf=1 to 4;input calcium @@; output;end;

datalines;1 3.28 3.09 3.03 3.032 3.52 3.48 3.38 3.383 2.88 2.80 2.81 2.764 3.34 3.38 3.23 3.26;proc means data = beet noprint;

by plant;var calcium;output out = plantmean mean=plantmean;

proc print;run;

Page 47: Software for Linear Mixed Models

Plant means

plant _FREQ_ plantmean

1 4 3.1075

2 4 3.4400

3 4 2.8125

4 4 3.3025

Global mean =

(3.1075 + 3.4400 + 2.8125 + 3.3025) / 4 = 3.166

Page 48: Software for Linear Mixed Models

Simple arithmetic mean from 16 values

proc means data = beet noprint;*by plant;var calcium;output out = plantmeanmean=plantmean;proc print;run;

_TYPE_ _FREQ_ plantmean

0 16 3.166

Page 49: Software for Linear Mixed Models

Statistic inference

3.166 is only an estimator for the real mean

> We want to calculate standard errors

1) Calculate the simple arithmetic mean and its standarderror from 16 leaves

2) Calculate the average per plant first and thencalculate the average over four plant means and itsstandard error2 a) With fixed plant effects

2 b) With random plant effects

Does this matter?

Page 50: Software for Linear Mixed Models

SAS PROC MIXED

Proc mixed data=beet; /*independent observations*/class plant;model Calcium = /ddfm=kr;estimate "independent" intercept 1 /cl;

run;

Proc mixed data=beet; /*fix plant effect*/class plant; model Calcium = plant / ddfm=kr;estimate "fix plants" intercept 1 plant 0.25 0.25 0.250.25 /cl;

run;

Proc mixed data=beet; /*random plant effect*/class plant;model Calcium = /ddfm=kr;random plant;estimate "random plants" intercept 1 /cl ;

run;

Page 51: Software for Linear Mixed Models

Three different results

StandardLabel Estimate Error DF Lower Upper

independent 3.166 0.064 15 3.030 3.301

fix plants 3.166 0.020 12 3.121 3.210

random plants 3.166 0.136 3 2.733 3.599

Page 52: Software for Linear Mixed Models

Three different results

StandardLabel Estimate Error DF Lower Upper

independent 3.166 0.064 15 3.030 3.301

fix plants 3.166 0.020 12 3.121 3.210

random plants 3.166 0.136 3 2.733 3.599

Independent: wrong! Because of correlation within plants

Fix plants: narrow inference space, only valid for the sampledplants

Random plants: broad inference space. Inference possible forpopulation of all plants (that have had a chance to be part of thesample)

Page 53: Software for Linear Mixed Models

different sources of random variation

StandardLabel Estimate Error DF Lower Upper

random plants 3.166 0.136 3 2.733 3.599

Covariance Parameter Estimates

plant 0.072380residual 0.006602

calcium

2.7

2.8

2.9

3.0

3.1

3.2

3.3

3.4

3.5

3.6

plant

1 2 3 4

Page 54: Software for Linear Mixed Models

different sources of random variation

StandardLabel Estimate Error DF Lower Upper

random plants 3.166 0.136 3 2.733 3.599

Covariance Parameter Estimates

plant 0.072380residual 0.006602

01851,016

006602,04

07238,0)(22

=+=+=rtr

Var leavesplant σσμ

136,001851,0)(.. === μVarES

Page 55: Software for Linear Mixed Models

Two ways to specify the correlation

y = Xβ + Zu + e

G and R Matrix

y = Xβ + e

only R Matrix

1)Proc mixed data=beet;class plant;model Calcium = /ddfm=kr;random plant;run;

2)Proc mixed data=beet;class plant;model Calcium = /ddfm=kr;repeated /subject=plant type=cs;run;

Page 56: Software for Linear Mixed Models

1) G and R matrix

Covariance Parameter Estimates

plant 0.072380residual 0.006602

IG 2

2

2

2

2

1000010000100001

07238.0

000000000000

G

G

G

G

G

σ

σσ

σσ

=

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

⋅=

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

=

Covariance between leavesfrom the same plant

IR 2

2

2

2

2

1000010000100001

0066.0

000000000000

e

e

e

e

e

σ

σσ

σσ

=

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

⋅=

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

=

This is the R-matrix for 4 leaves of 1 plant. The total R is a 16 by 16 matrix

Page 57: Software for Linear Mixed Models

2) Only R matrix

Covariance Parameter Estimates

Cov Parm Subject Estimate

CS plant 0.07238Residual 0.006602

Covariancebetween leavesfrom the sameplant

JIR 22

22222

22222

22222

22222

plante

planteplantplantplant

plantplanteplantplant

plantplantplanteplant

plantplantplantplante

σσ

σσσσσσσσσσσσσσσσσσσσ

+=

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

++

++

=

This structure is called „compound-symmetry“

Page 58: Software for Linear Mixed Models

Estimated R Matrix for plant 1

Row Col1 Col2 Col3 Col41 0.07898 0.07238 0.07238 0.072382 0.07238 0.07898 0.07238 0.072383 0.07238 0.07238 0.07898 0.072384 0.07238 0.07238 0.07238 0.07898

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

++

++

=

2222

2222

2222

2222

²²

²²

plantplantplantplant

plantplantplantplant

plantplantplantplant

plantplantplantplant

σσσσσσσσσσσσσσσσσσσσ

R

Page 59: Software for Linear Mixed Models

The complete V-Matrix, it‘s block diagonal

Row Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10 Col11 Col12 Col13 Col14 Col15 Col161 0.07898 0.07238 0.07238 0.07238 0 0 0 0 0 0 0 0 0 0 0 02 0.07238 0.07898 0.07238 0.07238 0 0 0 0 0 0 0 0 0 0 0 03 0.07238 0.07238 0.07898 0.07238 0 0 0 0 0 0 0 0 0 0 0 04 0.07238 0.07238 0.07238 0.07898 0 0 0 0 0 0 0 0 0 0 0 05 0 0 0 0 0.07898 0.07238 0.07238 0.07238 0 0 0 0 0 0 0 06 0 0 0 0 0.07238 0.07898 0.07238 0.07238 0 0 0 0 0 0 0 07 0 0 0 0 0.07238 0.07238 0.07898 0.07238 0 0 0 0 0 0 0 08 0 0 0 0 0.07238 0.07238 0.07238 0.07898 0 0 0 0 0 0 0 09 0 0 0 0 0 0 0 0 0.07898 0.07238 0.07238 0.07238 0 0 0 0

10 0 0 0 0 0 0 0 0 0.07238 0.07898 0.07238 0.07238 0 0 0 011 0 0 0 0 0 0 0 0 0.07238 0.07238 0.07898 0.07238 0 0 0 012 0 0 0 0 0 0 0 0 0.07238 0.07238 0.07238 0.07898 0 0 0 013 0 0 0 0 0 0 0 0 0 0 0 0 0.07898 0.07238 0.07238 0.0723814 0 0 0 0 0 0 0 0 0 0 0 0 0.07238 0.07898 0.07238 0.0723815 0 0 0 0 0 0 0 0 0 0 0 0 0.07238 0.07238 0.07898 0.0723816 0 0 0 0 0 0 0 0 0 0 0 0 0.07238 0.07238 0.07238 0.07898

Every row or column represents one leaf, every 4 by 4 block represents one plant. The values on the diagonal are thevariance of single observations, added up from residual and plant variance. Beside the diagonal are the covariances of leaves from the same plant (= the plant variance).

Page 60: Software for Linear Mixed Models

Repeated measurements

Each plant is measured four times (4 leaves)

The plant is the „subject“ in repeated measurementsyntax and is measured „repeatedly“

Proc mixed data=beet;class plant;model Calcium = /ddfm=kr;repeated /subject=plant type=cs r rcorr;run;

Page 61: Software for Linear Mixed Models

covariance and correlation

Covariance Parameter Estimates

plant 0.072380residual 0.006602

Covariance between leavesfrom the same plant

92,0078982,007238,0

006602,007238,007238,0),( 22

2

' ==+

=+

=leavesplants

plantsjiji yycorr

σσσ

The higher the correlation, the lower the informationget from an additional observation from the same plant

Page 62: Software for Linear Mixed Models

Estimated R Correlation Matrix for plant 1

Row Col1 Col2 Col3 Col4

1 1.0000 0.9164 0.9164 0.91642 0.9164 1.0000 0.9164 0.91643 0.9164 0.9164 1.0000 0.91644 0.9164 0.9164 0.9164 1.0000

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

=

44434241

43333231

42322221

41312111

ρρρρρρρρρρρρρρρρ

corrR

Page 63: Software for Linear Mixed Models

Example 3: Repeated measurements

Mineralisation in wheat after ploughing of grass-clover

Mineral N measured in 4 depths

3 different soil tillages

randomized complete blocks

no description

1 rotortiller (2 times), ploughing2 rotortiller (1 time), ploughing3 only ploughing

Page 64: Software for Linear Mixed Models

4 depths in the soil under a block design

Block I Block II Block III Block IV

Tiefe 1 Tiefe 2 Tiefe 3 Tiefe 4

Page 65: Software for Linear Mixed Models

How can we model repeated mesurements?

How to work out the correct linear model

See Piepho et al. (2004): A mixed modelapproach for randomized experiments withrepeated measures. JACS 190, 230-247.

Page 66: Software for Linear Mixed Models

1. step: set up model for single depth

yij = μ + bj + τi + eij

yij = Nmin-content of i-th tillage in j-th block (log-transformed!)

μ = global meanbj = effect of j-th blockτi = effect of i-th treatmenteij = error of ij-h plot

Page 67: Software for Linear Mixed Models

model over 4 depths

Block I Block II Block III Block IV

yij1 = μ1 + bj1 + τi1 + eij1

yij2 = μ2 + bj2 + τi2 + eij2

yij3 = μ3 + bj3 + τi3 + eij3

yij4 = μ4 + bj4 + τi4 + eij4

Page 68: Software for Linear Mixed Models

2. step: combine depths

yijt = μt + bjt + τit + eijt t =1,2,3,4: index for depth

factorial split-up of treatment effects:

μt + τit = μ + αi + βt + (αβ)it

αi = main effect of i-th tillage

βt = main effect of t-th depth

(αβ)it = interaction tillage × depth

Page 69: Software for Linear Mixed Models

complete model

yijt = μ + bj + (bβ)jt + αi + βt + (αβ)it + eijt

bj = block effect(bβ)jt = depth specific block effectαi = main effect of i-th tillageβt = main effect of t-th depth(αβ)it = interaction tillage × depth

Page 70: Software for Linear Mixed Models

Set up the linear model

y = Xβ + e

Observations from same plot are not independent, theyare correlated!

IR 2

2

2

2

2

000000000000

e

e

e

e

e

σ

σσ

σσ

=

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

=

Page 71: Software for Linear Mixed Models

possible correlation structures

Compound Symmetry (CS):corr(eijs, eijt) = ρ

Autoregressiv [AR(1)]:corr(eijs, eijt) = ρ|s-t|

Unstructured (UN):Each depth has its own variance and covariances

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

++

++

=

22222

22222

22222

22222

ploteplotplotplot

plotploteplotplot

plotplotploteplot

plotplotplotplote

σσσσσσσσσσσσσσσσσσσσ

R

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

=

11

11

23

2

2

32

2

ρρρρρρρρρρρρ

σ eR

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

=

24434241

34233231

24232221

14131221

e

e

e

e

σσσσσσσσσσσσσσσσ

R

Page 72: Software for Linear Mixed Models

compound symmetry

eijt = fij + gijt (Split-Plot model)

fij = ij-th plot effect

gijt = residual = depth specific deviation of plot effect

( ) 2g

2f

2f

ijtijs σσσρe,ecorr+

==

Page 73: Software for Linear Mixed Models

AR(1) model

0.00.10.20.30.40.50.60.70.80.91.0

0 1 2 3 4 5 6

Korrelation

Abstand |s-t|

ρ = 0,9

ρ = 0,3

ρ = 0,6

Page 74: Software for Linear Mixed Models

model selection

Modell -2 log L§ AIC$ (smaller is better)

homogeneous variances:Unabhängig 21,9 23,9CS 15,7 19,7AR(1) 13,9 17,9

Heterogeneous variances:independent 15,1 23,1CS 6,7 16,7AR(1) 6,4 16,4Unstructured -2,5 17,5

§ log L = log-Likelihood; p = Zahl der Varianzparameter$ AIC = Akaike Information Criterion = -2 log L + 2p

best model

Page 75: Software for Linear Mixed Models

MIXED Code

proc mixed;class depth block tillage;model log_nmin= depth block block*depth

tillage tillage*depth /ddfm=kr;repeated depth /sub=block*tillage type=cs;lsmeans tillage*depth /pdiff;run;

Observations from different plots („subjects“) are independent

„subject“ = individual unit, which is measured repeatedly

Page 76: Software for Linear Mixed Models

MIXED Code

proc mixed;class depth block tillage;model log_nmin= depth block block*depth

tillage tillage*depth /ddfm=kr;repeated depth /sub=block*tillage type=cs;/*repeated depth /sub=block*tillage type=ar(1);repeated depth /sub=block*tillage type=csh;repeated depth /sub=block*tillage type=arh(1);repeated depth /sub=block*tillage type=un; */lsmeans tillage*depth /pdiff;run;

Observations from different plots („subjects“) are independent

„subject“ = individual unit, which is measured repeatedly

Page 77: Software for Linear Mixed Models
Page 78: Software for Linear Mixed Models

Model selection

We don‘t know the true model!

There are no good or bad models, but somemodels are useful – to make inferences

Akaike's Information Criterion (AIC), AICC

Based on logLikelihood and penalized for numberof parameters

Selects good models with low complexity

Page 79: Software for Linear Mixed Models

Example 4: Split-Plot designs

Effect of different crops on yield of following cornvarieties.

PreCrops in mainplots

Corn varieties in subplots

Data from Dr. C. Böhmel

(Univ. Hohenheim)

PreCrop A PreCrop Bvariety 1 variety 2variety 2 block 1 variety 4variety 3 variety 1variety 4 variety 3

PreCrop B PreCrop Avariety 3 variety 4variety 2 block 2 variety 3variety 1 variety 2variety 4 variety 1

Page 80: Software for Linear Mixed Models

Complete trial design

A B C D E FS1 S6S2 S2S3 S8S4 S3S5 S1S6 S7S7 S9S8 S4S9 S5

D F A E C B

E D C B F A

bloc

k 1

bloc

k 2

bloc

k 3

3 blocks, 6 crops, 9 varieties

Page 81: Software for Linear Mixed Models

Linear Model for RCB

yijk = µ + ai + bj + rk + abij + eijk

yij yield of the j-th variety after i-th crop in k-th block

µ global mean or constant (fix)

ai effect of the i-th crop (fix)

bj effect of the j-th variety (fix)

rk effect of the k-th block (fix)

abij interaction between i-th crop and j-th variety (fix)

eijk residual error of the ijk-th plot ~N(0,σ²e)

Page 82: Software for Linear Mixed Models

Linear Model for Split-plot-design

yijk = µ + ai + bj + rk + abij + fik + eijk

yij yield of the j-th variety after i-th crop in k-th blockµ global mean or constant (fix)ai effect of the i-th crop (fix)bj effect of the j-th variety (fix)rk effect of the k-th block (fix)abij interaction between i-th crop and j-th variety (fix)fik error of the ik-th mainplot ~N(0, σ ²ra)eijk residual error of the ijk-th plot ~N(0,σ²e)

Page 83: Software for Linear Mixed Models

Why do we need the mainplot error?

1: the observations are not independent. Plots in same mainplot are closer to another, should bemore equal to another. -> Covariance betweenplots from same mainplot

2: Repeated measurement design/ mainplots aremeasured 9 times (9 varieties)

3: „Analyze as randomized“: Two steps of randomization. Each step is connected to oneeffect. Randomization units that not contain all treatments should be random.

Page 84: Software for Linear Mixed Models

SAS-Code (MIXED)

proc mixed data = corn;class block variety crop;model yield = block variety crop

variety*crop /ddfm=kr;random crop*block ;lsmeans variety*crop/pdiff;run;

Page 85: Software for Linear Mixed Models

alternative SAS-Code (MIXED)

proc mixed data = corn;class block variety crop;model yield = block variety crop

variety*crop /ddfm=kr;repeated/subject=crop*block type=cs;lsmeans variety*crop/pdiff;run;

Page 86: Software for Linear Mixed Models

Output (MIXED)

Covariance ParameterEstimates

Cov Parm Estimate

block*crop 22.47Residual 275.67

Type 3 Tests of Fixed Effects

Num DenEffect DF DF F Value Pr > F

Block 2 10 2.88 0.1029Variety 8 96 15.10 <.0001Crop 5 10 1.17 0.3878Variety*Crop 40 96 1.22 0.2185

Covariance between yieldsfrom the same mainplot

Page 87: Software for Linear Mixed Models

Analysis with software for fixed effects model (Proc GLM)

proc glm data=corn; class block variety crop;model yield = block variety crop

variety*crop crop*block;random crop*block / test;lsmeans variety*crop;run;

Page 88: Software for Linear Mixed Models

Output GLM “Tests of Hypotheses for Mixed Model Analysis of Variance”

Same F-values as with MIXED

source df SS MS Fvalue ProbFblock 2 2752 1376.0 2.88 0.103variety 8 33293 4161.6 15.10 <.0001crop 5 2796 559.2 1.17 0.388variety*crop 40 13407 335.2 1.22 0.219block*crop 10 4779 477.9 1.73 0.084error 96 26465 275.7

Page 89: Software for Linear Mixed Models

Drawback of Proc GLM

F-tests are correct (when using the „/test“ – option),

but standard errors of difference for meancomparisons mostly not!

Standard error of difference for variety comparisonsafter different crops:

( ) ( ) 76,1983

67,27547,2222)(

22

'' =+⋅

=+⋅

=−r

yyVar erajiji

σσ

sed = sqrt(198.76) = 14.09

Page 90: Software for Linear Mixed Models

Approximation of denominator df

Satterthwaite-approximation

²)(²)1()1(]²)1[()1()1(

rae

rae

MQaMQbaMQMQbaraDDF

⋅+⋅−⋅−+⋅−⋅−⋅−⋅

=

proc mixed data = corn;class block variety crop;model yield = block variety

crop variety*crop /ddfm=kr;random crop*block ;lsmeans variety*crop/pdiff;run; Method of

Kenward & Roger

Page 91: Software for Linear Mixed Models

correct sed with MIXED

Differences of Least Squares MeansStandard

Effect V Crop V Crop Estimate Error DF tValue Pr>|t|

Crop 1 2 0.0846 5.9499 10 0.01 0.9889Crop 1 3 -3.7664 5.9499 10 -0.63 0.5409V 1 2 -19.3639 5.5345 96 -3.50 0.0007V 1 3 -33.9237 5.5345 96 -6.13 <.0001V*Crop 1 1 1 2 -13.8173 14.0983 98 -0.98 0.3295V*Crop 1 1 1 3 -4.5292 14.0983 98 -0.32 0.7487V*Crop 1 1 2 1 -34.3912 13.5566 96 -2.54 0.0128V*Crop 1 1 3 1 -28.6052 13.5566 96 -2.11 0.0375

Page 92: Software for Linear Mixed Models

Split plot with repeated measurements

At four different dates some plants out of each plotwere harvested

How can we model this?

Repeated measurement for mainplots and subplots!

Page 93: Software for Linear Mixed Models

First step

Take the model for one single date:

proc mixed data = corn;class block variety crop;model yield = block variety crop

variety*crop /ddfm=kr;random crop*block ;lsmeans variety*crop/pdiff;run;

Page 94: Software for Linear Mixed Models

Second step

Add a fixed time effect

proc mixed data = corn;class block variety crop date;model yield = block variety crop datevariety*crop variety*datevariety*crop*date/ddfm=kr;random crop*block ;lsmeans variety*crop*date/pdiff;run;

Page 95: Software for Linear Mixed Models

third step

Add a covariance structure for the main- and subplots-The structure for mainplots in the G-matrix

-The structure for subplots in the R-matrix

proc mixed data = corn;class block variety crop date;model yield = block variety crop date variety*crop variety*date variety*crop*date/ddfm=kr;random date / subject=crop*block type=cs;repeated date / subject=crop*block*variety

type=cs;lsmeans variety*crop*date/pdiff;run;

Page 96: Software for Linear Mixed Models

Output

Covariance Parameter Estimates

Cov Parm Subject Estimate

Variance WDH*ZF 70.2625CS WDH*ZF 72.0857CS WDH*Sorte*ZF 81.8311Residual 326.37

Page 97: Software for Linear Mixed Models

Output G-matrix

Estimated G Matrix

Row block crop date Col1 Col2 Col3 Col4

1 date 1 1 167 142.35 72.0857 72.0857 72.08572 date 1 1 182 72.0857 142.35 72.0857 72.08573 date 1 1 193 72.0857 72.0857 142.35 72.08574 date 1 1 207 72.0857 72.0857 72.0857 142.35

Estimated G Correlation Matrix

Row block crop date Col1 Col2 Col3 Col4

1 date 1 1 167 1.0000 0.5064 0.5064 0.50642 date 1 1 182 0.5064 1.0000 0.5064 0.50643 date 1 1 193 0.5064 0.5064 1.0000 0.50644 date 1 1 207 0.5064 0.5064 0.5064 1.0000

Page 98: Software for Linear Mixed Models

Output R-matrix

Estimated R Matrix for block*crop 1 1

Row Col1 Col2 Col3 Col41 408.20 81.8311 81.8311 81.83112 81.8311 408.20 81.8311 81.83113 81.8311 81.8311 408.20 81.83114 81.8311 81.8311 81.8311 408.20

Estimated R Correlation Matrix for block*crop 1 1

Row Col1 Col2 Col3 Col41 1.0000 0.2005 0.2005 0.20052 0.2005 1.0000 0.2005 0.20053 0.2005 0.2005 1.0000 0.20054 0.2005 0.2005 0.2005 1.0000

Page 99: Software for Linear Mixed Models

fourth step

Compound Symmetry: type = CS

Select the best covariance structure

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

++

++

1111

1111

1111

1111

²²

²²

σσσσσσσσσσσσσσσσσσσσ

Page 100: Software for Linear Mixed Models

Select the best covariance structure

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

11

11

²

23

2

2

32

ρρρρρρρρρρρρ

σ

Autoregressiv: type = AR(1)

Page 101: Software for Linear Mixed Models

Select the best covariance structure

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

24342414

43232313

42322212

41312121

σρσσρσσρσσρσσσρσσρσσρσσρσσσρσσρσσρσσρσσσ

Heterogenous CS: type = CSH

Page 102: Software for Linear Mixed Models

Select the best covariance structure

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

2434

224

314

432323

213

24232

2212

341

23121

21

σρσσρσσρσσρσσσρσσρσσρσσρσσσρσσρσσρσσρσσσ

Heterogenous AR(1): type = ARH(1)

Page 103: Software for Linear Mixed Models

Select the best covariance structure

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

24342414

43232313

42322212

41312121

σσσσσσσσσσσσσσσσ

Unstrukturiert: type = UN

Page 104: Software for Linear Mixed Models

Unstructured showed best AIC

CS AIC (smaller is better) 5570.1

CSH AIC (smaller is better) 5538.7

AR(1) AIC (smaller is better) 5561.2

ARH(1)AIC (smaller is better) 5526.1

UN AIC (smaller is better) 5516.4

Page 105: Software for Linear Mixed Models

last step

Perform hypotheses tests, calculate means and standard errors with the chosen covariancestructure

Page 106: Software for Linear Mixed Models

Example 5: Random coefficient models

A sugar beet variety trial series underinfestation with Rhizoctonia solani

Infestation measured in % necrotic rootsurface

4 tolerant varieties, 1 susceptible varietiy

17 sites in 2002

Data provided from German Institute of sugar beetresearch.

Ladewig & Lukashyk (proceedings, p. 98-103)

Page 107: Software for Linear Mixed Models

Relation between infestation and yield, one site

0

2

4

6

8

10

12

14

16

0 10 20 30 40 50 60 70 80

Befallsdruck (% verfaulte Rübenkörperoberfläche Indikatorsorte)

Ber

eini

gter

Zuc

kere

rtrag

(t/h

a)

"Fabiola""Premiere""Syncro""Solea""Impuls"

BZE von vier rhizoctonia-toleranten Zuckerrübensorten in Abhängigkeitvom Rhizoctonia-Befallsdruck. Standort: Gangelt 2002 (LIZ Jülich)

variety 1variety 2variety 3variety 4variety 5 (susceptible)

% infestation susceptible variety

yield

Page 108: Software for Linear Mixed Models

ANCOVA-Model

infestation (x)

yield (y)

µ

ai

αi{

xyßδδ

=

regression of variety ixyßiδδ

=

generalregression

xµxayE iiii )()( γβαβ +++=+=

Page 109: Software for Linear Mixed Models

[ ] ijkjkijkijjiijjiijk ebxvvuuy +++++++++= )()( γγβααμ

µ = interceptαi = effect of i-th varietyuj = effect of j-h site (random)(αu)ij = interaction variety by site (random)xijk = infestation of ijk-th plotß = general regression between infestation and yieldγi = deviation of i-th variety from regressionvj = devitation of j-th site from regression (random)(γv)ij = deviation from regression due to interaction (random)bjk = effect of k-th block at j-th site (random)eijk= residual error

Complete model for the trial series

Page 110: Software for Linear Mixed Models

„Random Coefficient Model“

Random intercept (site effects) and randomdeviations from mean regression

Regressions are random sample from a population of regression lines

• Laird & Ware, Biometrics 1982; • Longford 1993, Random

Coefficient Models; • Wolfinger 1996, JABES; • Littell et al. 1996, SAS System for

Mixed Models

Page 111: Software for Linear Mixed Models

Σ1 und Σ2 unstructured variance-covariance-matrices

⎟⎟⎠

⎞⎜⎜⎝

⎛==⎟⎟

⎞⎜⎜⎝

⎛2

2

varvuv

uvu

j

j

vu

σσσσ

⎟⎟⎠

⎞⎜⎜⎝

⎛==⎟⎟

⎞⎜⎜⎝

⎛2

,

,2

)()(

varvvu

vuu

ij

ij

vu

γγα

γαα

σσσσ

γα

Model

• allowing for covariance between intercept and slope makes parameter estimates „resistent“against scale effects

Variance of site-intercepts(σ²u) and site-slopes (σ²ν); σuν = covariance

Page 112: Software for Linear Mixed Models

SAS-Code (1)

Proc mixed data=gesamt method=reml maxiter=100;class site variety block;model yield = variety variety*infestation/ solutionnoint ddfm=kr;random int infestation / subject= site type=unr ;random int infestation / subject= site*variety

type=unr ;random block/subject= site;

lsmeans variety /pdiff at infestation =0;lsmeans variety /pdiff at infestation =0.1;lsmeans variety /pdiff at infestation =0.3;lsmeans variety /pdiff at infestation =0.5;lsmeans variety /pdiff at infestation =1;ods output lsmeans=estimates_yield ;ods output diffs=diffs_yield ;

run;

Page 113: Software for Linear Mixed Models

SAS-Output (1)

WARNING: Did not converge.

Covariance Parameter ValuesAt Last Iteration

Cov Parm Subject Estimate

UN(1,1) site 1.5704UN(2,1) site -0.5173UN(2,2) site -0.4490Var(1) site*variety 0.2140Var(2) site*variety 1.4670Corr(2,1) site*variety 8.13E-34Block site 0.02270Residual 1.4028

Page 114: Software for Linear Mixed Models

• Change scaling of covariate so (additition, subtraction, division, multiplication) that variances on diagonal of variance-cov-matrices have equal dimension

• Reduction of model

• try models with heterogenous errors

Model selection to reach convergence

Page 115: Software for Linear Mixed Models

SAS-Code (2)

data gesamt; set Gesamt;infestation_v=((infestation)-0)/100; run;Proc mixed data=gesamt method=reml maxiter=100;

class site variety block;model yield = variety variety*infestation/ solution nointddfm=kr;

* random int infestation/ subject=site type=unr ;random int infestation/ subject=site*variety type=unr ;random block/subject=site;parms (2)(2)(0.2)(0.03)(1.3);

lsmeans variety/pdiff at infestation=0;lsmeans variety/pdiff at infestation=0.1;lsmeans variety/pdiff at infestation=0.3;lsmeans variety/pdiff at infestation=0.5;lsmeans variety/pdiff at infestation=1;ods output lsmeans=estimates_yield ;ods output diffs=diffs_yield ;

run;

Page 116: Software for Linear Mixed Models

Covariance Parameter Estimates

Cov Parm Subject Estimate

Var(1) site*variety 2.8899Var(2) site*variety 2.3892Corr(2,1) site*variety -0.6637Block site 0.03468Residual 1.3555

StandardEffect variety Estimate Error DF t Value Pr > |t|

variety Fabiola 9.2547 0.4779 59.6 19.36 <.0001variety Impuls 10.2814 0.4558 56.7 22.56 <.0001variety Premiere 10.0701 0.4797 60.5 20.99 <.0001variety Solea 10.1740 0.5997 60.9 16.97 <.0001variety Syncro 9.8342 0.4972 60.2 19.78 <.0001infestation*variety Fabiola -3.5664 0.7648 20 -4.66 0.0001infestation*variety Impuls -7.9021 0.7013 19 -11.27 <.0001infestation*variety Premiere -1.6013 0.8433 23.1 -1.90 0.0701infestation*variety Solea -1.8531 1.0612 32.6 -1.75 0.0902infestation*variety Syncro -0.6893 0.8913 21.7 -0.77 0.4476

Num DenEffect DF DF F Value Pr > F

variety 5 59.5 397.51 <.0001infestation*variety 5 22.5 30.53 <.0001

site x varietysite x variety x infest.Corr between interceptand slope

SAS-Output (2): lsmeans

variety 1sus.varietyvariety 2variety 4variety 3variety 1sus.varietyvariety 2variety 4variety 3

Page 117: Software for Linear Mixed Models

expected values, 17 sites, 2002

0

2

4

6

8

10

12

0 10 20 30 40 50 60 70 80 90 100

Befallsdruck [% verfaulte Rübenkörperoberfläche Indikatorsorte]

Ber

eini

gter

Zuc

kere

rtrag

[t/h

a]

FabiolaImpulsPremiereSoleaSyncro

Fabiola

Impuls

PremiereSyncro

Solea

error bar = mean standard error of difference

susceptible

variety 3variety 2variety 4

variety 1variety 1suscep. varietyvariety 2variety 4variety 3

Page 118: Software for Linear Mixed Models

data 2005, centred for site effects

High residual error, due to small plots (30 plants)

Page 119: Software for Linear Mixed Models

Example 6: Spatial models

Several examples for spatial models in thissymposium (e.g. SCHNEIDER et al. page 209 Proceedings)

What are thebasic methodsof spatialstatistics?

Page 120: Software for Linear Mixed Models

Starting point

Spatial statistics can be seen as an multidimensional extension of repeatedmeasurements. [...] Observations are correlatedin two spatial (mostly) dimensions.

LITTELL et al. 1996

• The nearer two points are together, the higherthe correlation should be

• Correlation depends on distance

• Several models possible to describe thisdistance-correlation dependency

Page 121: Software for Linear Mixed Models

Semivariance: a measure of similarity

h = a distance „class“

νi, νj = measurements at single observations

∑=

−=hhji

jiij

vvhN

h)|,(

)²()(2

1)(γ

The higher the correlation, the lower the semi-variance

Page 122: Software for Linear Mixed Models

Anisotropic Variogram

dotted = East west, solid = north-south

sill

nugget

Page 123: Software for Linear Mixed Models

Some models to describe the empirical variogram

)/exp()( ρijij ddf −=

0)(;(22

31)( 3

3

=≤⎥⎥⎦

⎢⎢⎣

⎡+−= ijij

ijijij dfelsed

dddf ρ

ρρ

)/exp()( 22 ρij

ddf ij −=

dij = distance between two points i and j

ρ = an estimable parameter (the correlation between points)

Exponential, Spherical and Gaussian model

Page 124: Software for Linear Mixed Models

How to fit spatial models with SAS: A uniformity trial

data spatvar;input rep bloc row col yield;datalines;1 4 1 1 10.54111 4 1 2 8.58061 2 1 3 11.27901 2 1 4 12.43441 4 2 1 10.34161 4 2 2 11.3103...4 15 7 7 12.02534 15 7 8 10.42984 14 8 5 7.32204 14 8 6 10.51044 15 8 7 12.68084 15 8 8 10.4482;

Page 125: Software for Linear Mixed Models

The spherical model

proc mixed scoring=5;

model yield= ;

parms (0 to 10 by 2.5) (1 to 10 by 3);

repeated / subject=intercept type=sp(sph)(row col);

run;

Page 126: Software for Linear Mixed Models

Output for spherical model

Covariance Parameter Estimates

Cov Parm Subject Estimate

SP(SPH) Intercept 2.7120Residual 3.2607

Fit Statistics

-2 Res Log Likelihood 227.5AIC (smaller is better) 231.5AICC (smaller is better) 231.7BIC (smaller is better) 235.8

Page 127: Software for Linear Mixed Models
Page 128: Software for Linear Mixed Models

Output for spherical model

Covariance Parameter Estimates

Cov Parm Subject Estimate

SP(SPH) Intercept 2.7120Residual 3.2607

Fit Statistics

-2 Res Log Likelihood 227.5AIC (smaller is better) 231.5AICC (smaller is better) 231.7BIC (smaller is better) 235.8

0)(;(22

31)( 3

3

=≤⎥⎥⎦

⎢⎢⎣

⎡+−= ijij

ijijij dfelsed

dddf ρ

ρρ

Page 129: Software for Linear Mixed Models

Spherical model with nugget effect

proc mixed scoring=50 convh=1e-06;model yield= ;parms (0 to 10 by 2.5)

(1 to 10 by 3) (0.05 to 1.05 by 0.25);repeated / subject=intercept local

type=sp(sph)(row col);run;

Page 130: Software for Linear Mixed Models

Output for spherical model with nugget

Covariance Parameter Estimates

Cov Parm Subject Estimate

Variance Intercept 3.2189SP(SPH) Intercept 2.7162Residual 0.02144

Fit Statistics

-2 Res Log Likelihood 227.5AIC (smaller is better) 233.5AICC (smaller is better) 233.9BIC (smaller is better) 240.0

Page 131: Software for Linear Mixed Models

Some take home messages

An effect is (should be) random when it is ...

... related to a randomization step

... a random sample from a population (of effects)

... models the covariance between observations

Adress the mixed model structure in R or G matrix

Different software available (not only SAS)

Model selection important (the mixed world is morecomplex than the fixed effects models)