mann kendalltest (reparado)
Post on 11-Feb-2018
7/22/2019 Mann KendallTest (Reparado)
Kendall, M. G. (1970) Rank Correlation Methods, 4th ed. London: Griffin

Statistical Methods for Environmental Pollution Monitoring, Richard O. Gilbert (1987)
http://www.swrcb.ca.gov/water_issues/programs/tmdl/docs/303d_policydocs/205.pdf
(Good intro, but lacks look-up table; pdf image cuts off last sentence on each page)

Myles Hollander and Douglas A. Wolfe (1999)
Nonparametric Statistical Methods, 2nd Edition
Wiley-Interscience
ISBN-10: 0471190454
ISBN-13: 978-0471190455

A User-Written SAS Program for Estimating Temporal Trends and Their Magnitude
http://www.sjrwmd.com/technicalreports/pdfs/TP/SJ2004-4.pdf

Techniques of Water-Resources Investigations of the United States Geological Survey
Book 4, Hydrologic Analysis and Interpretation
Chapter A3 Statistical Methods in Water Resources
By D.R. Helsel and R.M. Hirsch
http://pubs.usgs.gov/twri/twri4a3/pdf/twri4a3-new.pdf

Detecting Trends of Annual Values of Atmospheric Pollutants by the Mann-Kendall Test and Sen's Slope Estimates - The Excel Template Application MAKESENS
http://www.fmi.fi/kuvat/MAKESENS_MANUAL.pdf

Statistical Sirens: The Allure of Nonparametrics, Ecology 76(6), 1995, Douglas H. Johnson, pp. 1998-2000
http://www.jstor.org/pss/1940733

Why Kendall tau?
http://rsscse.org.uk/ts/bts/noether/text.html

Kendall's tau and Spearman's Rho
http://www.statisticssolutions.com/methods-chapter/statistical-tests/kendall-spearman-rank-correlation-coefficient/

Non-parametric Measures of Bivariate Relationships
http://www.unesco.org/webworld/idams/advguide/Chapt4_2.htm

Kendall's rank correlation
http://www.statsdirect.com/help/nonparametric_methods/kend.htm
(clearer description of how to handle ties)

Powerpoint on nonparametric time series
http://www.webs.uidaho.edu/envs541/Module_08/8_2.pdf
This Excel file has been designed to calculate a Mann-Kendall trend statistic for ten data points (i.e., ten years).

Instructions
Enter your data values into the green-highlighted cells C5:C14 of the sheet labeled "MannKendall".
Change the slide title (B1), Y-axis title (C4) and the year labels (if necessary).
If you have fewer than ten years of data, you must also:
Enter the number of time periods (e.g., years) into cell C18.
Clear the contents of any irrelevant cells from D26 to L34.
That's it. The worksheet will calculate the Mann-Kendall S statistic (FYI, some authors refer to it as the K statistic).

Here is what the worksheet is doing:
For every pair of values in the n*n matrix, subtract the value in yearK from the value in yearJ in all n(n-1)/2 cases/cells where yearJ > yearK.
(Subtract the value on the left from the value on the top for all cells above the diagonal: top value minus left value for each cell.
Above the diagonal will be values for which the column value is from a later year than the row value, or yearJ > yearK.)
Count the number of n(n-1)/2 cells that yielded a positive value (result > 0) and put the count value in the first column to the right.
Count the number of n(n-1)/2 cells that yielded a negative value (result < 0) and put the count value in the first column to the right.
Count the number of n(n-1)/2 cells that yielded a zero value (tied values) and put the count value in the row at the bottom.
Sum all the pluses and all the minuses and subtract the total of minuses from the total of pluses.
S = number of cells with positive values minus the number of cells with negative values.
The sign of S indicates the slope of the trend (positive=upward, negative=downward).
If n>=10, then calculate the variance and use the formula for the normal approximation of the probability of S.
There are two formulae, one if there are no tied values and another if there are tied values.
If n < 10, use the lookup table (n >= 5 is required to reach p < .05).

Evaluation
# Positive diffs    45.00
# Negative diffs    -35.00
S                   80.00
Variance(S) *       0             n(n-1)(2n+5)/18 (This formula may be conservative in the presence of tied values.)
ZS **               0.00
Zcrit,.05           165           1.96 (positive or negative) is the critical value for Z, two-tailed, at p < .05. The one-tailed value is 1.65.
Interpretation      -2.64689641

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~/
* Note: This variance formula assumes there are no tied values (i.e., no differences=0).
Tied values may reduce the validity of the normal approximation when the number of data values is close to 10.
If there are tied values, then the following formula with the correction factor for the tied values should be used:

Var(S) = [ n(n-1)(2n+5) - SUM(p=1..q) tp(tp-1)(2tp+5) ] / 18

Where q is the number of tied groups and tp is the number of data values in the pth group.

I'm not ENTIRELY sure, but I think in this infant mortality example there is actually ONE tied "group." It looks like Gilbert defines a "group" as a group of years that share the same value. I think (according to how I am interpreting Gilbert's example) that our 6.1 value constitutes one "group."
The value 6.1 occurs in three different years.
Because we only had 10 years to begin with, and three are tied, we should use the lookup table to gauge the significance of S.
According to the table, S must be at least 30 for significance at the p < .05 level, and 30 is what we have.
But I will also recalculate the variance with the correction factor.
So, based on Gilbert's example (see last sheet in this file), q=1 and t1=3.
Var(S) correction term for tied values: 3(3-1)((2*3)+5) = 3*2*11 = 66
Our variance after correction: (10*9*25 - 66)/18 = 125 - 66/18 = 121.33

Evaluation of Tied Data
# Positive diffs    45.00
# Negative diffs    -35.00
S                   80.00
Variance(S)         -102          using correction factor for tied data
ZS **               #NUM!
Zcrit,.05           1.96          1.96 (positive or negative) is the critical value for Z, two-tailed, at p < .05. The one-tailed value is 1.65.
Interpretation      #NUM!

In any case, the correction factor decreases the variance, which increases the Z-score, and the likelihood of significance.
So if there are tied values, and we do not use the variance formula with the correction factor, our test is conservative.

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~/
** The direction of Z indicates the direction of the trend. A positive (negative) value of Z indicates an upward (downward) trend.
Formula for ZS:
if S > 0 then Z = (S-1)/SQRT(variance S)
if S = 0 then Z = 0
if S < 0 then Z = (S+1)/SQRT(variance S)

Some sources said the calculation for the normal approximation of the probability of S should only be used if n>=10,
but others said only when n>=40. But there was some ambiguity about the definition of n (#years versus
#values in the matrix). If nyears=10, then the number of values inside the matrix is n(n-1)/2, or 10*9/2 = 45.
So I'm thinking nyears>=10 is okay.

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~/
Lookup Table for Significance of S:

 n    critical value of S for alpha=.05
 5     11
10     30
15     40
20     62
25     85
30    111
35    139
40    169

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~/
Information on the Power of Mann-Kendall S Test:
For Mann-Kendall S to yield a significance level of:
p < .10 it requires 4 or more data points (e.g., years).
p < .05 it requires 5 or more data points (e.g., years).
p < .01 it requires 6 or more data points (e.g., years).
p < .001 it requires 7 or more data points (e.g., years).

[Chart: Critical Values of Mann Kendall S Statistic for alpha=.05 and Varying Values of N. x-axis: Number of Years (data points), 5 to 40. y-axis: Critical Value of S, 0 to 180.]
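The worksheet's steps (pairwise later-minus-earlier differences, the counts of pluses and minuses, S, the no-ties variance and the continuity-corrected Z) can be sketched in Python. This is only an illustrative sketch, not the workbook's formulas: the function names are hypothetical, the variance is the no-ties form n(n-1)(2n+5)/18, and the five-value series in the demo is made up.

```python
import math
from itertools import combinations


def mann_kendall_s(values):
    """Return (S, n_pos, n_neg, n_ties) over all n(n-1)/2 pairs.

    combinations() yields (earlier, later) pairs in time order, which
    corresponds to the cells above the diagonal of the worksheet matrix.
    """
    n_pos = n_neg = n_ties = 0
    for earlier, later in combinations(values, 2):
        if later > earlier:
            n_pos += 1
        elif later < earlier:
            n_neg += 1
        else:
            n_ties += 1
    return n_pos - n_neg, n_pos, n_neg, n_ties


def variance_no_ties(n):
    """Variance of S when there are no tied values: n(n-1)(2n+5)/18."""
    return n * (n - 1) * (2 * n + 5) / 18


def z_score(s, variance):
    """Normal approximation with the +/-1 continuity correction."""
    if s > 0:
        return (s - 1) / math.sqrt(variance)
    if s < 0:
        return (s + 1) / math.sqrt(variance)
    return 0.0


if __name__ == "__main__":
    demo = [1, 2, 3, 4, 5]            # made-up, strictly increasing series
    print(mann_kendall_s(demo))       # (10, 10, 0, 0)
    print(variance_no_ties(11))       # 165.0
    print(round(z_score(-35, variance_no_ties(11)), 2))  # -2.65
```

Comparing values directly (rather than testing the sign of a floating-point difference against zero) keeps the tie count independent of rounding in the subtraction.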
Graph Title: Infant Mortality. New Mexico, 1999-2009
y-axis title: Deaths per 1000 Live Births

1999 6.62
2000 6.42
2001 6.06
2002 5.40
2003 6.28
2004 6.07
2005 5.68
2006 6.14
2007 5.03
2008 4.74
2009 5.00

n=number of time periods
n= 11

Subtract each earlier year from each later year

year J:            1999   2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
year K:             6.6    6.4    6.1    5.4    6.3    6.1    5.7    6.1    5.0    4.7    5.0    # +    # -
1999   6.6               -0.20  -0.55  -1.22  -0.34  -0.54  -0.93  -0.47  -1.58  -1.88  -1.62   0.00  10.00
2000   6.4                      -0.36  -1.02  -0.14  -0.35  -0.74  -0.28  -1.39  -1.68  -1.42   0.00   9.00
2001   6.1                             -0.67   0.21   0.01  -0.38   0.08  -1.03  -1.33  -1.06   3.00   5.00
2002   5.4                                     0.88   0.68   0.29   0.75  -0.36  -0.66  -0.40   4.00   3.00
2003   6.3                                           -0.21  -0.60  -0.13  -1.25  -1.54  -1.28   0.00   6.00
2004   6.1                                                  -0.39   0.07  -1.04  -1.33  -1.07   1.00   4.00
2005   5.7                                                          0.46  -0.65  -0.94  -0.68   1.00   3.00
2006   6.1                                                                -1.11  -1.40  -1.14   0.00   3.00
2007   5.0                                                                       -0.29  -0.03   0.00   2.00
2008   4.7                                                                               0.26   1.00   0.00
2009   5.0
# ties (diff=0):      0      0      0      0      0      0      0      0      0      0      0  10.00  45.00

S= 10.00 minus 45.00
S= -35.00

If n>=10, then use the variance calculation to estimate probability.
If n < 10, use the lookup table (n >= 5 is required to reach p < .05).

Evaluation (Normal Approximation, N>=10)
Variance(S)       165     = n(n-1)(2n+5)/18. This formula may be conservative in the presence of tied values.
ZS                -2.65
Zcrit,.05         1.96    (two-tailed. For one-tailed test use 1.65)
Interpretation    Sig. Decreasing

Evaluation (Lookup Table for Fewer Than 10 Years)
If S>=S-crit, then reject H0
S-crit (p

[Chart: Infant Mortality. New Mexico, 1999-2009. x-axis: years 1999 to 2009.]
Source: NM Death Certificate and Birth Certificate Data. NMDOH Bureau of Vital Records and Statistics.
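As a cross-check on the 1999-2009 sheet, the following sketch recomputes S, the variance and Z from the eleven two-decimal rates listed there and applies the two-tailed critical value of 1.96. The label "Sig. Decreasing" is taken from the sheet; the other two labels, the `sign` helper and the variable names are assumptions made for illustration.

```python
import math

# Deaths per 1000 live births, 1999-2009, as listed on the sheet.
rates = [6.62, 6.42, 6.06, 5.40, 6.28, 6.07, 5.68, 6.14, 5.03, 4.74, 5.00]


def sign(x):
    """Return +1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


n = len(rates)

# S: sum of signs of (later - earlier) over every pair of years.
s = sum(sign(rates[j] - rates[i]) for i in range(n) for j in range(i + 1, n))

# No-ties variance; all eleven rates are distinct, so no correction applies.
variance = n * (n - 1) * (2 * n + 5) / 18

# Continuity correction: move S one unit toward zero before standardizing.
z = 0.0 if s == 0 else (s - sign(s)) / math.sqrt(variance)

Z_CRIT = 1.96  # two-tailed, p < .05
if z <= -Z_CRIT:
    label = "Sig. Decreasing"
elif z >= Z_CRIT:
    label = "Sig. Increasing"   # hypothetical label, not shown on the sheet
else:
    label = "Not Sig."          # hypothetical label, not shown on the sheet

print(s, variance, round(z, 2), label)  # -35 165.0 -2.65 Sig. Decreasing
```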
Graph Title: New Mexico Infant Mortality Rate from 1999-2009
y-axis title: Inf Deaths per 1000 Live Births

2000 6.6
2001 6.4
2002 6.1
2003 5.4
2004 6.3
2005 6.1
2006 5.7
2007 6.1
2008 5.1
2009 5.0

n=number of time periods
n= 10

Subtract each earlier year from each later year

year J:            2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
year K:             6.6    6.4    6.1    5.4    6.3    6.1    5.7    6.1    5.1    5.0    # +    # -
2000   6.6               -0.20  -0.50  -1.20  -0.30  -0.50  -0.90  -0.50  -1.50  -1.60   0.00   9.00
2001   6.4                      -0.30  -1.00  -0.10  -0.30  -0.70  -0.30  -1.30  -1.40   0.00   8.00
2002   6.1                             -0.70   0.20   0.00  -0.40   0.00  -1.00  -1.10   1.00   4.00
2003   5.4                                     0.90   0.70   0.30   0.70  -0.30  -0.40   4.00   2.00
2004   6.3                                           -0.20  -0.60  -0.20  -1.20  -1.30   0.00   5.00
2005   6.1                                                  -0.40   0.00  -1.00  -1.10   0.00   3.00
2006   5.7                                                          0.40  -0.60  -0.70   1.00   2.00
2007   6.1                                                                -1.00  -1.10   0.00   2.00
2008   5.1                                                                       -0.10   0.00   1.00
2009   5.0
# ties (diff=0):      0      0      0      0      0      1      0      2      0      0   6.00  36.00

S= 6.00 minus 36.00
S= -30.00

If n>=10, then use the variance calculation to estimate probability.
If n < 10, use the lookup table (n >= 5 is required to reach p < .05).

Evaluation (Normal Approximation, N>=10)
Variance(S)       125     = n(n-1)(2n+5)/18. This formula may be conservative in the presence of tied values.
ZS                -2.59
Zcrit,.05         1.96
Interpretation    Sig. Decreasing

Evaluation (Lookup Table for Fewer Than 10 Years)
If S>=S-crit, then reject H0
S-crit (p

[Chart: New Mexico Infant Mortality Rate from 1999-2009. x-axis: Year, 2000 to 2009. y-axis: Inf Deaths per 1000 Live Births, 0 to 7.]
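Because the 2000-2009 series contains the value 6.1 three times, it may help to see how much the tie correction changes the result. The sketch below assumes the standard tie-corrected variance [n(n-1)(2n+5) - sum of t(t-1)(2t+5)]/18 from Gilbert (1987), with one tie group per repeated value; the function names are hypothetical and this is not the workbook's own calculation.

```python
import math
from collections import Counter

# Rates for 2000-2009 as listed on the sheet (6.1 appears three times).
rates = [6.6, 6.4, 6.1, 5.4, 6.3, 6.1, 5.7, 6.1, 5.1, 5.0]


def s_statistic(values):
    """Number of positive minus number of negative later-minus-earlier pairs."""
    n = len(values)
    return sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n)
        for j in range(i + 1, n)
    )


def tie_corrected_variance(values):
    """Var(S) with one correction term t(t-1)(2t+5) per tied group of size t."""
    n = len(values)
    tie_sizes = [t for t in Counter(values).values() if t > 1]
    correction = sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)
    return (n * (n - 1) * (2 * n + 5) - correction) / 18


def z_score(s, variance):
    """Normal approximation with the +/-1 continuity correction."""
    if s == 0:
        return 0.0
    return (s - 1 if s > 0 else s + 1) / math.sqrt(variance)


n = len(rates)
s = s_statistic(rates)
var_plain = n * (n - 1) * (2 * n + 5) / 18
var_ties = tie_corrected_variance(rates)
z_plain = z_score(s, var_plain)
z_ties = z_score(s, var_ties)

print(s)                                   # -30
print(var_plain, round(var_ties, 2))       # 125.0 121.33
print(round(z_plain, 2), round(z_ties, 2)) # -2.59 -2.63
```

With a single tie group of three values the correction is small (66/18, about 3.67), so Z moves only from about -2.59 to about -2.63.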
Table A 30, Upper-tail Probabilities for the Null Distribution of the Kendall K Statistic.
For N>10 use the approximation given in section 8.12 (of Hollander and Wolfe)
One-sided p = Prob [S >= x] = Prob [S <= -x]
N = Number of time periods
This includes interpolated values (red text)

 x      N=3      N=4      N=5      N=6      N=7      N=8      N=9     N=10   p=0.05
 1      0.5                        0.5      0.5                        0.5     0.05
 2   0.3335    0.375    0.408     0.43    0.443    0.452     0.46   0.4655     0.05
 3    0.167    0.271    0.325     0.36    0.386    0.406   0.4205    0.431     0.05
 4             0.167    0.242   0.2975   0.3335     0.36    0.381   0.3975     0.05
 5            0.1045   0.1795    0.235    0.281    0.317   0.3435    0.364     0.05
 6             0.042    0.117   0.1855    0.236    0.274    0.306    0.332     0.05
 7                     0.0795    0.136    0.191   0.2365    0.272      0.3     0.05
 8                      0.042    0.102    0.155    0.199    0.238    0.271     0.05
 9                    0.02515    0.068    0.119   0.1685   0.2085    0.242     0.05
10                     0.0083    0.048   0.0935    0.138    0.179    0.216     0.05
11                               0.028    0.068   0.1135   0.1545     0.19     0.05
12                             0.01815   0.0515    0.089     0.13    0.168     0.05
13                              0.0083    0.035   0.0715     0.11    0.146     0.05
14                             0.00485    0.025    0.054     0.09    0.127     0.05
15                              0.0014    0.015   0.0425    0.075    0.108     0.05
16                                       0.0102    0.031     0.06    0.093     0.05
17                                       0.0054   0.0233    0.049    0.078     0.05
18                                       0.0034   0.0156    0.038    0.066     0.05
19                                       0.0014  0.01135     0.03    0.054     0.05
20                                       0.0008   0.0071    0.022    0.045     0.05
21                                       0.0002  0.00495   0.0172    0.036     0.05
22                                                0.0028   0.0124   0.0295     0.05
23                                               0.00185  0.00935    0.023     0.05
24                                                0.0009   0.0063  0.01865     0.05
25                                               0.00055   0.0046   0.0143     0.05
26                                                0.0002   0.0029   0.0113     0.05
27                                                0.0002  0.00205   0.0083     0.05
28

RED: These do not appear on the table in the textbook because they are impossible values, but they ARE possible if there are tied cells. We still need to figure out how to handle ties.
Significant at p

S-crit (p
If S>=S-crit, then reject H0

The table was adapted from D. Helsel and R. M. Hirsch, Statistical Methods in Water Resources.
Helsel and Hirsch cited Table A30 in Myles Hollander and Douglas A. Wolfe (1999).

Original Table from Helsel & Hirsch:
Table B8 -- Quantiles (p-values) for Kendall's S statistic and tau correlation coefficient
For N>10 use the approximation given in section 8.2.2
One-sided p = Prob [S >= x] = Prob [S <= -x]
N = Number of time periods

 x     N=4     N=5     N=8     N=9
 0   0.625   0.592   0.548    0.54
 2   0.375   0.408   0.452    0.46
 4   0.167   0.242    0.36   0.381
 6   0.042   0.117   0.274   0.306
 8           0.042   0.199   0.238
10          0.0083   0.138   0.179
12                   0.089    0.13
14                   0.054    0.09
16                   0.031    0.06
18                  0.0156   0.038
20                  0.0071   0.022
22                  0.0028  0.0124
24                  0.0009  0.0063
26                  0.0002  0.0029
28

 x     N=3     N=6     N=7    N=10
 1     0.5     0.5     0.5     0.5
 3   0.167    0.36   0.386    0.43
 5           0.235   0.281    0.36
 7           0.136   0.191     0.3
 9           0.068   0.119    0.24
11           0.028   0.068    0.19
13          0.0083   0.035    0.15
15          0.0014   0.015    0.11
17                  0.0054    0.08
19                  0.0014    0.05
21                  0.0002    0.04
23                            0.02
25                            0.01
27                            0.01

[Chart: one series per N=3 through N=10, with reference lines at p=0.05 and p=0.025; visible x-axis labels run from 22 to 30.]
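The non-interpolated entries of these tables can be reproduced by counting permutations, which may be a useful check on a transcribed lookup table. The sketch below assumes no tied values and uses the fact that, for a permutation of n distinct values with k inversions, S = n(n-1)/2 - 2k. The function names are hypothetical and nothing here is taken from Helsel and Hirsch or Hollander and Wolfe beyond the definition of the one-sided p-value.

```python
from math import factorial


def inversion_counts(n):
    """counts[k] = number of permutations of n distinct items with k inversions."""
    counts = [1]
    for m in range(2, n + 1):
        new = [0] * (len(counts) + m - 1)
        for k, c in enumerate(counts):
            # Placing the m-th item can add anywhere from 0 to m-1 inversions.
            for added in range(m):
                new[k + added] += c
        counts = new
    return counts


def upper_tail_p(n, x):
    """One-sided p = Prob[S >= x] under the null of no trend, assuming no ties."""
    max_s = n * (n - 1) // 2
    counts = inversion_counts(n)
    favourable = sum(c for k, c in enumerate(counts) if max_s - 2 * k >= x)
    return favourable / factorial(n)


if __name__ == "__main__":
    print(round(upper_tail_p(4, 0), 3))   # 0.625
    print(round(upper_tail_p(4, 4), 3))   # 0.167
    print(round(upper_tail_p(5, 8), 3))   # 0.042
    print(round(upper_tail_p(5, 10), 4))  # 0.0083
```

Without ties, S always has the same parity as n(n-1)/2, which is why the original table lists only even x for N=4, 5, 8, 9 and only odd x for N=3, 6, 7, 10.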
Gilbert, 1987, on Tied Values in Mann-Kendall Test

                     23     24    0.1      6    0.1     24     24    0.1     23    # +    # -
23                           1  -22.9    -17  -22.9      1      1  -22.9      0   3.00   4.00
24                              -23.9    -18  -23.9      0      0  -23.9     -1   0.00   5.00
0.1                                      5.9      0   23.9   23.9      0   22.9   4.00   0.00
6                                              -5.9     18     18   -5.9     17   3.00   2.00
0.1                                                   23.9   23.9      0   22.9   3.00   0.00
24                                                              0  -23.9     -1   0.00   2.00
24                                                                 -23.9     -1   0.00   2.00
0.1                                                                        22.9   1.00   0.00
23
# ties (diff=0):             0      0      0      1      1      2      2      1  14.00  15.00

S= -1.00

This is from the Gilbert (1987) article.
Gilbert says:
the number of tied groups=3 (!?)*
t1=2 for the tied value 23
t2=3 for the tied value 24
t3=3 for the tied value .1
*Does he mean the number of different/unique values with a tie?
There are five columns, above, with tied values. But the number of unique values that happen to have ties in the
matrix = 3. How on Earth am I supposed to ask SAS to do that!

Sorted values: 0.1, 0.1, 0.1, 6, 23, 23, 24, 24, 24
The "value" of 23 happens twice
The "value" of 24 happens three times
The "value" of 0.1 happens three times
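One way to read Gilbert's "tied groups" is as the distinct values that occur more than once, which makes them easy to obtain by counting occurrences of each value rather than by inspecting the difference matrix. The sketch below does this in Python for the nine values on this sheet (it is not SAS, and the variable names are hypothetical); the variance line assumes the tie-corrected formula [n(n-1)(2n+5) - sum of t(t-1)(2t+5)]/18.

```python
from collections import Counter

# The nine values from the Gilbert (1987) example, in time order.
values = [23, 24, 0.1, 6, 0.1, 24, 24, 0.1, 23]

# A tie group is a distinct value that occurs more than once;
# its size t is the number of times that value occurs.
counts = Counter(values)
tie_groups = {value: t for value, t in counts.items() if t > 1}
q = len(tie_groups)

# Each group of size t produces t(t-1)/2 zero differences in the matrix.
tied_pairs = sum(t * (t - 1) // 2 for t in tie_groups.values())

n = len(values)
correction = sum(t * (t - 1) * (2 * t + 5) for t in tie_groups.values())
variance = (n * (n - 1) * (2 * n + 5) - correction) / 18

print(tie_groups)          # {23: 2, 24: 3, 0.1: 3}
print(q, tied_pairs)       # 3 7
print(round(variance, 2))  # 83.67
```

The seven tied pairs match the seven zero cells in the matrix (0+0+0+1+1+2+2+1), while q=3 matches Gilbert's count of tied groups, so the number of columns containing a zero is not the quantity the formula needs.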