IMSL_WILCOXON


The IMSL_WILCOXON function performs a Wilcoxon rank sum test or a Wilcoxon signed rank test.



Syntax


Result = IMSL_WILCOXON( x1 [ , x2 ] [, /DOUBLE] [, FUZZ=value] [, STATS=variable] )

Return Value


If a Wilcoxon rank sum test is performed, returns the two-sided p-value for the Wilcoxon rank sum statistic that is computed with average ranks used in the case of ties.

If a Wilcoxon signed rank test is performed, returns an array of length two containing the following values:

  • The asymptotic probability of not exceeding the standardized (to an asymptotic variance of 1.0) minimum of (W+, W–) using method 1, under the null hypothesis that the distribution is symmetric about 0.0.
  • The asymptotic probability of not exceeding the standardized (to an asymptotic variance of 1.0) minimum of (W+, W–) using method 2, under the null hypothesis that the distribution is symmetric about 0.0.
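
The two return shapes described above can be seen side by side in a short sketch. It reuses the data from the Examples section and only the calling forms given under Syntax; the variable names pRankSum and pSigned are illustrative, and the calls assume the Advanced Math and Stats license is available.

```idl
; Sketch only: requires the IDL Advanced Math and Stats (IMSL) license.
x1 = [7.3, 6.9, 7.2, 7.8, 7.2]
x2 = [7.4, 6.8, 6.9, 6.7, 7.1]

; Two positional arguments: rank sum test, result is one two-sided p-value.
pRankSum = IMSL_WILCOXON(x1, x2)

; One positional argument: signed rank test, result is a two-element array
; (method 1 probability first, then method 2 probability).
d = [-25.0, -21.0, -19.0, -15.0, -13.0, -11.0, -8.0]
pSigned = IMSL_WILCOXON(d)

PRINT, 'Rank sum p-value:          ', pRankSum
PRINT, 'Signed rank, methods 1, 2: ', pSigned[0], pSigned[1]
```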

Arguments


x1

One-dimensional array containing the first sample.

x2

(Optional) One-dimensional array containing the second sample.

Keywords


DOUBLE

If present and nonzero, double precision is used.

FUZZ

Nonnegative constant used to determine ties in computing ranks in the combined samples. A tie is declared when two observations in the combined sample are within Fuzz of each other. Default: for a Wilcoxon rank sum test, Fuzz = 100 × ε × max{|x1_i|, |x2_j|}, where ε is machine precision and x1_i and x2_j are the elements of x1 and x2; for a Wilcoxon signed rank test, Fuzz = 0.0.
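
As a rough illustration of the rank sum default, the sketch below evaluates the formula with core IDL routines. It assumes that "machine precision" corresponds to the EPS field returned by MACHAR() for single precision; the value IMSL_WILCOXON uses internally may differ. The name fuzzDefault is illustrative.

```idl
x1 = [7.3, 6.9, 7.2, 7.8, 7.2]
x2 = [7.4, 6.8, 6.9, 6.7, 7.1]

; Assumption: machine precision is taken as the EPS field of MACHAR().
eps = (MACHAR()).eps

; 100 x eps x (largest absolute value over both samples)
fuzzDefault = 100.0 * eps * MAX(ABS([x1, x2]))
PRINT, 'Approximate default Fuzz: ', fuzzDefault

; The same tolerance can be passed explicitly (requires the IMSL license).
p = IMSL_WILCOXON(x1, x2, FUZZ = fuzzDefault)
```

For this data the tolerance is on the order of 1e-4, far smaller than the 0.1 spacing of the measurements, so only exactly repeated values are treated as ties.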

STATS

Named variable into which a one-dimensional array of length 10 is stored. The array contains the statistics shown in Table 18-1 (rank sum test) or Table 18-2 (signed rank test). If a Wilcoxon rank sum test is performed:

Table 18-1: Stats Values for Wilcoxon Rank Sum Test

Row  Statistics
0    Wilcoxon W statistic (the sum of the ranks of the x1 observations), adjusted for ties in such a manner that W is as small as possible
1    2 × E(W) – W, where E(W) is the expected value of W
2    Probability of obtaining a statistic less than or equal to min{W, 2 × E(W) – W}
3    W statistic, adjusted for ties in such a manner that W is as large as possible
4    2 × E(W) – W, where E(W) is the expected value of W, adjusted for ties in such a manner that W is as large as possible
5    Probability of obtaining a statistic less than or equal to min{W, 2 × E(W) – W}, adjusted for ties in such a manner that W is as large as possible
6    W statistic with average ranks used in case of ties
7    Estimated standard error of Stats(6) under the null hypothesis of no difference
8    Standard normal score associated with Stats(6)
9    Two-sided p-value associated with Stats(6)

If a Wilcoxon signed rank test is performed:

Table 18-2: Stats Values for Wilcoxon Signed Rank Test

Row  Statistics
0    The positive rank sum, W+, using method 1.
1    The absolute value of the negative rank sum, W–, using method 1.
2    The standardized (to an asymptotic variance of 1.0) minimum of (W+, W–) using method 1.
3    The asymptotic probability of not exceeding Stats(2) under the null hypothesis that the distribution is symmetric about 0.0.
4    The positive rank sum, W+, using method 2.
5    The absolute value of the negative rank sum, W–, using method 2.
6    The standardized (to an asymptotic variance of 1.0) minimum of (W+, W–) using method 2.
7    The asymptotic probability of not exceeding Stats(6) under the null hypothesis that the distribution is symmetric about 0.0.
8    The number of zero observations.
9    The total number of observations that are tied, and that are not within Fuzz of zero.

Discussion


If Two Positional Arguments Are Supplied

The IMSL_WILCOXON function performs the Wilcoxon rank sum test for identical population distribution functions. The Wilcoxon test is a linear transformation of the Mann-Whitney U test. If the difference between the two populations can be attributed solely to a difference in location, then the Wilcoxon test becomes a test of equality of the population means (or medians) and is the nonparametric equivalent of the two-sample t-test. The IMSL_WILCOXON function obtains ranks in the combined sample after first eliminating missing values from the data. The rank sum statistic is then computed as the sum of the ranks in the x1 sample.
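
The linear relationship to the Mann-Whitney U statistic can be checked by hand. The sketch below assumes the common convention that U counts the (x1, x2) pairs in which the x1 value is larger, with ties counted as one half, so that U = W – n1(n1 + 1)/2; the W value is Stats(6) from Example 2. The names uFromW and uCount are illustrative.

```idl
x1 = [7.3, 6.9, 7.2, 7.8, 7.2]
x2 = [7.4, 6.8, 6.9, 6.7, 7.1]
n1 = N_ELEMENTS(x1)

; U from the rank sum: W = 34.5 is Stats(6) in Example 2 (average ranks).
w = 34.5
uFromW = w - n1 * (n1 + 1) / 2.0

; U by direct counting: pairs with x1 > x2, ties counted as one half.
uCount = 0.0
FOR i = 0, n1 - 1 DO uCount += TOTAL(x1[i] GT x2) + 0.5 * TOTAL(x1[i] EQ x2)

; Both values are 19.5 for this data.
PRINT, uFromW, uCount
```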

Three methods for handling ties are used. (A tie is counted when two observations are within Fuzz of each other.) Method 1 uses the largest possible rank for tied observations in the smaller sample, while Method 2 uses the smallest possible rank for these observations. Together, these two methods give the range of possible rank sums. Method 3 uses the average rank of the tied observations for ties between samples. Asymptotic standard normal scores are computed for the W score (based on a variance that has been adjusted for ties) when average ranks are used (see Conover 1980, p. 217). The probability associated with the two-sided alternative is then computed.
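
The average-rank calculation (Method 3) can be sketched in core IDL, without the IMSL license. The helper name avg_ranks, the tolerance 1.0e-4, and the tie-adjusted variance expression (the usual one for rank sums) are assumptions made for illustration; this is not the IMSL implementation. The code is meant to be saved in a .pro file and run with .RUN.

```idl
; Hypothetical helper (not part of IDL or IMSL): average ranks with a tolerance.
FUNCTION avg_ranks, v, fuzz
   n = N_ELEMENTS(v)
   r = FLTARR(n)
   FOR i = 0, n - 1 DO BEGIN
      nBelow = TOTAL((v[i] - v) GT fuzz)      ; values strictly smaller than v[i]
      nTied  = TOTAL(ABS(v - v[i]) LE fuzz)   ; values tied with v[i], itself included
      r[i] = nBelow + (nTied + 1) / 2.0       ; mean of the ranks the tie group occupies
   ENDFOR
   RETURN, r
END

; Main-level program: rank sum statistic for the Example 1 data.
x1 = [7.3, 6.9, 7.2, 7.8, 7.2]
x2 = [7.4, 6.8, 6.9, 6.7, 7.1]
n1 = N_ELEMENTS(x1)
n2 = N_ELEMENTS(x2)
n  = n1 + n2

r  = avg_ranks([x1, x2], 1.0e-4)
w  = TOTAL(r[0:n1-1])                         ; sum of the ranks in the x1 sample
ew = n1 * (n + 1) / 2.0                       ; expected value of W under H0

; Tie-adjusted variance of W (assumed formula, see lead-in).
varW = n1 * n2 * TOTAL(r^2) / (n * (n - 1.0)) - n1 * n2 * (n + 1.0)^2 / (4.0 * (n - 1.0))

z = (w - ew) / SQRT(varW)                     ; standard normal score
p = 2.0 * (1.0 - GAUSS_PDF(ABS(z)))           ; two-sided p-value

PRINT, 'W, std error, z, p: ', w, SQRT(varW), z, p
END
```

For this data W is 34.5, and the standard error, normal score, and p-value agree closely with Stats(6) through Stats(9) printed in Example 2.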

Hypothesis Tests

In each of the tests listed in Table 18-3, the first line gives the hypothesis (and its alternative) under assumptions 1 to 3 below, while the second line gives the hypothesis when assumption 4 is also true. The rejection region is the same for both hypotheses and is given in terms of Method 3 for handling ties. If another method for handling ties is desired, use a different output statistic (Stats(0) or Stats(3)).

Table 18-3: Hypothesis Tests

Test  Null Hypothesis          Alternative Hypothesis   Action
1     H0: Pr(x1 < x2) = 0.5    H1: Pr(x1 < x2) ≠ 0.5    Reject if Stats(9) is less than the significance level of the test. Alternatively, reject the null hypothesis if Stats(6) is too large or too small.
      H0: E(x1) = E(x2)        H1: E(x1) ≠ E(x2)
2     H0: Pr(x1 < x2) ≤ 0.5    H1: Pr(x1 < x2) > 0.5    Reject if Stats(6) is too small.
      H0: E(x1) ≥ E(x2)        H1: E(x1) < E(x2)
3     H0: Pr(x1 < x2) ≥ 0.5    H1: Pr(x1 < x2) < 0.5    Reject if Stats(6) is too large.
      H0: E(x1) ≤ E(x2)        H1: E(x1) > E(x2)

Assumptions

  1. x1 and x2 contain random samples from their respective populations.
  2. All observations are mutually independent.
  3. The measurement scale is at least ordinal (i.e., an ordering less than, greater than, or equal to exists among the observations).
  4. If f(x) and g(y) are the distribution functions of x and y, then g(y) = f(x + c) for some constant c (i.e., the distribution of y is, at worst, a translation of the distribution of x).

Tables of critical values of the W statistic are given in the references for small samples.
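
The two-sided decision rule (Test 1) can be written as a short sketch. The significance level alpha = 0.05 is an illustrative choice, the data are those of Example 1, and the call assumes the IMSL license is available.

```idl
; Sketch only: requires the IDL Advanced Math and Stats (IMSL) license.
alpha = 0.05                                  ; illustrative significance level
x1 = [7.3, 6.9, 7.2, 7.8, 7.2]
x2 = [7.4, 6.8, 6.9, 6.7, 7.1]
p = IMSL_WILCOXON(x1, x2, STATS = stats)

; Test 1: reject H0 when the two-sided p-value in stats[9] is below alpha.
IF stats[9] LT alpha THEN PRINT, 'Reject H0' ELSE PRINT, 'Do not reject H0'
```

With the p-value of 0.141238 reported in Example 1, the null hypothesis is not rejected at the 5-percent level.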

If One Positional Argument Is Supplied

The IMSL_WILCOXON function performs a Wilcoxon signed rank test of symmetry about zero. In one sample, this test can be viewed as a test that the population median is zero. In matched samples, a test that the medians of the two populations are equal can be computed by first computing difference scores. These difference scores would then be used as input to IMSL_WILCOXON. A general reference for the methods used is Conover (1980).

The IMSL_WILCOXON function computes statistics using two methods for handling zero and tied observations. (Observations within Fuzz of each other are said to be tied.) In the first method, observations within Fuzz of zero are not counted, and the average rank of tied observations is used. In the second method, observations within Fuzz of zero are randomly assigned a positive or negative sign, and the ranks of tied observations are randomly permuted.

The W+ and W– statistics are computed as the sum of the ranks of the positive observations and the sum of the ranks of the negative observations, respectively. Asymptotic probabilities are computed using standard methods (see, e.g., Conover 1980, page 282).
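
A minimal core-IDL sketch of the first method is shown here for orientation. The function name signed_rank_sketch and the tolerance are hypothetical, and the standardization uses the textbook no-ties mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24, which is an assumption about what "standard methods" means; IMSL_WILCOXON may adjust these when ties are present. The code is meant to be saved in a .pro file and run with .RUN.

```idl
; Hypothetical helper (not part of IDL or IMSL): method-1 style W+, W-, and z.
FUNCTION signed_rank_sketch, x, fuzz
   keep = WHERE(ABS(x) GT fuzz, n)            ; drop observations within fuzz of zero
   IF n EQ 0 THEN RETURN, [0.0, 0.0, 0.0]
   d = x[keep]
   a = ABS(d)
   r = FLTARR(n)
   FOR i = 0, n - 1 DO BEGIN
      nBelow = TOTAL((a[i] - a) GT fuzz)      ; smaller absolute values
      nTied  = TOTAL(ABS(a - a[i]) LE fuzz)   ; tied absolute values, itself included
      r[i] = nBelow + (nTied + 1) / 2.0       ; average rank of the tie group
   ENDFOR
   wPlus  = TOTAL(r * (d GT 0))               ; ranks of the positive observations
   wMinus = TOTAL(r * (d LT 0))               ; ranks of the negative observations
   ; Standardize min(W+, W-) with the no-ties null mean and variance (assumption).
   z = ((wPlus < wMinus) - n * (n + 1) / 4.0) / SQRT(n * (n + 1) * (2 * n + 1) / 24.0)
   RETURN, [wPlus, wMinus, z]
END

; Main-level program: the Example 3 input.
x = [-25.0, -21.0, -19.0, -15.0, -13.0, -11.0, -8.0]
s = signed_rank_sketch(x, 1.0e-4)
pLower = GAUSS_PDF(s[2])                      ; probability of not exceeding z
PRINT, 'W+, W-, z, p: ', s[0], s[1], s[2], pLower
END
```

For the Example 3 input, which has no zeros or ties, this gives W+ = 0, W– = 28, and a standardized minimum and probability that agree closely with the Method 1 column of the Example 3 output.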

Hypothesis Tests

The W+ and W– statistics may be used to test the following hypotheses about the median, M. In deciding whether to reject the null hypothesis, use the bracketed statistic if method 2 for handling ties is preferred. Possible null hypotheses and alternatives are given as follows:

  • H0: M ≤ 0
    H1: M > 0
    Reject if stats(0) [or stats(4)] is too large.

  • H0: M ≥ 0
    H1: M < 0
    Reject if stats(1) [or stats(5)] is too large.

  • H0: M = 0
    H1: M ≠ 0
    Reject if stats(2) [or stats(6)] is too small. Alternatively, if an asymptotic test is desired, reject if 2*stats(3) [or 2*stats(7)] is less than the significance level.

Tabled values of the test statistic can be found in the references and should be used where possible. If the number of nonzero observations is too large for the tables, the asymptotic probabilities computed by IMSL_WILCOXON can be used instead.
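
The asymptotic two-sided rule can be sketched as follows. The significance level alpha = 0.05 is an illustrative choice, the input is the Example 3 data, and the call assumes the IMSL license is available.

```idl
; Sketch only: requires the IDL Advanced Math and Stats (IMSL) license.
alpha = 0.05                                  ; illustrative significance level
x = [-25.0, -21.0, -19.0, -15.0, -13.0, -11.0, -8.0]
p = IMSL_WILCOXON(x, FUZZ = 0.0001, STATS = stats)

; Two-sided asymptotic test of H0: M = 0 using method 1;
; substitute stats[7] for stats[3] to use method 2.
IF 2 * stats[3] LT alpha THEN PRINT, 'Reject H0: M = 0' ELSE PRINT, 'Do not reject H0: M = 0'
```

With the Method 1 probability of 0.00898023 reported in Example 3, twice that value is about 0.018, so the null hypothesis is rejected at the 5-percent level.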

Assumptions

The assumptions required for the hypothesis tests are as follows:

  1. The distribution of each Xi is symmetric.
  2. The Xi are mutually independent.
  3. All Xi's have the same median.
  4. An ordering of the observations exists (i.e., X1 > X2 and X2 > X3 implies that X1 > X3).

If other assumptions are made, related hypotheses that are more (or less) restrictive can be tested.

Examples


Example 1

The following example is taken from Conover (1980, p. 224). It involves the mixing time of two mixing machines using a total of 10 batches of a certain kind of batter, five batches for each machine. The null hypothesis is not rejected at the 5-percent level of significance. The warning error is always printed when one or more ties are detected.

x1 = [7.3, 6.9, 7.2, 7.8, 7.2]  
x2 = [7.4, 6.8, 6.9, 6.7, 7.1]
p = IMSL_WILCOXON(x1, x2, Stats = stats)
PRINT, 'p-Value = ', p

p-Value = 0.141238

Example 2

The following example uses the same data as the previous example. Now, all the statistics are output in the array Stats. First, a procedure is defined to output the results.

.RUN
; Print the ten rank sum statistics described in Table 18-1.
PRO print_results, stats
   PRINT, 'Wilcoxon W Statistic .....', stats(0)
   PRINT, '2*E(W) - W ...............', stats(1)
   PRINT, 'P-Value .....................', stats(2)
   PRINT, 'Adjusted Wilcoxon Statistic..', stats(3)
   PRINT, 'Adjusted 2*E(W) - W .........', stats(4)
   PRINT, 'Adjusted P-Value ............', stats(5)
   PRINT, 'W Statistics for Averaged Ranks ..', stats(6)
   PRINT, 'Std Error of W (Averaged Ranks) ..', stats(7)
   PRINT, 'Std Normal Score of W (Averaged Ranks)..', stats(8)
   PRINT, 'Two-Sided P-Value of W (Averaged Ranks) ..', stats(9)
END

x1 = [7.3, 6.9, 7.2, 7.8, 7.2]
x2 = [7.4, 6.8, 6.9, 6.7, 7.1]
p = IMSL_WILCOXON(x1, x2, Stats = stats)
print_results, stats

   Wilcoxon W Statistic .................... 34.0000
   2*E(W) - W .............................. 21.0000
   P-Value ................................ 0.110072
   Adjusted Wilcoxon Statistic ............. 35.0000
   Adjusted 2*E(W) - W ..................... 20.0000
   Adjusted P-Value ...................... 0.0745036
   W Statistics for Averaged Ranks ......... 34.5000
   Std Error of W (Averaged Ranks) ......... 4.75803
   Std Normal Score of W (Averaged Ranks)... 1.47120
   Two-Sided P-Value of W (Averaged Ranks). 0.141238

Example 3

This example illustrates the application of the Wilcoxon signed rank test to a test on a difference of two matched samples (matched pairs), where X1 = {223, 216, 211, 212, 209, 205, 201} and X2 = {208, 205, 202, 207, 206, 204, 203}. A test that the median difference is 10.0 (rather than 0.0) is performed by subtracting 10.0 from each of the differences prior to calling IMSL_WILCOXON. As can be seen from the output, the null hypothesis is rejected. The warning error will always be printed when the number of observations is 50 or less unless printing is turned off for warning errors.

.RUN
; Print the signed rank statistics described in Table 18-2.
PRO output_results, stats
   PRINT, 'Statistic                Method 1     Method 2'
   PRINT, 'W+ ...................', stats(0), stats(4)
   PRINT, 'W- ...................', stats(1), stats(5)
   PRINT, 'Standardized Minimum...', stats(2), stats(6)
   PRINT, 'p-value ...............', stats(3), stats(7)
   PRINT
   PRINT, 'Number of zeros .......', stats(8)
   PRINT, 'Number of ties ........', stats(9)
END

x = [-25.0, -21.0, -19.0, -15.0, -13.0, -11.0, -8.0]
p = IMSL_WILCOXON(x, Fuzz = 0.0001, Stats = stats)
OUTPUT_RESULTS, stats

Statistic               Method 1     Method 2
W+ .....................0.00000      0.00000
W- .....................28.0000      28.0000
Standardized Minimum ... -2.36643    -2.36643
p-value ................ 0.00898023   0.00898024

Number of zeros .........0.00000
Number of ties ..........0.00000
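
The input array used in this example can be reproduced from the two matched samples. The sketch below assumes the differences are taken as the second sample minus the first, a sign convention inferred from the listed input values rather than stated in the text; the names sample1 and sample2 are illustrative.

```idl
sample1 = [223.0, 216.0, 211.0, 212.0, 209.0, 205.0, 201.0]
sample2 = [208.0, 205.0, 202.0, 207.0, 206.0, 204.0, 203.0]

; Difference scores, shifted by the hypothesized median difference of 10.0.
x = (sample2 - sample1) - 10.0

; Yields -25, -21, -19, -15, -13, -11, -8, the input passed to IMSL_WILCOXON.
PRINT, x
```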

Errors


Warning Errors

STAT_AT_LEAST_ONE_TIE—At least one tie is detected between the samples.

Fatal Errors

STAT_ALL_X_Y_MISSING—Each element of x1 and/or x2 is a missing (NaN, Not a Number) value.

Version History


6.4    Introduced


