>  Docs Center  >  Libraries  >  ASTROLIB  >  SAFE_CORRELATE

SAFE_CORRELATE

SAFE_CORRELATE

Name


      SAFE_CORRELATE

Purpose


      This function computes the probability by which the null hypothesis of
      uncorrelated data may be rejected while accounting for uncertainty in
      the data values.

Explanation


      This function generates NSIM simulated X,Y datasets based on the
      provided points and their erros. These are then used to compute
      the probability that uncorrelated data could explain the arrangement
      of the points, the probability-to-exceed or PTE, using Spearman's rank
      correlation test. Each simulated dataset is assigned a probability of
      1/NSIM of occuring. Thus, for a given dataset, the probability that the
      true data (given the uncertainties) are arranged as simulated AND
      that this particular arrangment of data can be explained without an
      underlying correlation is PTE/NSIM. These values are summed to compute
      the overall probability that the data represent an uncorrelated
      arrangement of points (in other words, the p-value or PTE for the null
      hypothesis of uncorrelated data).
      A tutorial on SAFE_CORRELATE is available at
      http://parkeloyd.com/output/code/safe_correlate/

Calling Sequence


      Result = SAFE_CORRELATE(X, Y, XERR, YERR, [NSIM=1e4, SEED=SEED])

Inputs


      X,Y: N-element vectors of the data points. These are ignored if
                  PDF input is supplied for X or Y (see below).
     
      XERR,YERR: The data point errors. These may be supplied as a scalar,
                  N-element vector, 2xM array, or Nx2xM array.
                  scalar: The identical Gaussian 1-sigma error for all
                          points.
                  N vector: The Gaussian 1-sigma error for each respective
                            point.
                  Nx2xM array: M points sampling the probability distribution
                              function (PDF) for each data point. The values
                              are contained in [N,0,*] and probability
                              densities in [N,1,*]. This is useful for
                              non-Gaussian errors, especially upper limits.

Keyword Parameters


      NSIM: The number of X,Y datasets to simulate. Default = 1e4.
      SEED: Random number seed for use with RANDOMN and RANDOMU. Useful for
            ensuring reproducible results. Can either be an input value or
            a variable into which the used value will be stored.

Examples


      Data with identical errors:
        xerr = 2.0
        yerr = 3.0
         
        ;generate linear data with errors
        N = 10
        x = findgen(N) + randomn(seed,N)*xerr
        y = findgen(N) + randomn(seed,N)*yerr
         
        ;plot
        ep = errorplot(x,y,replicate(xerr,N),replicate(yerr,N),'o')
         
        ;corrrelate
        print,safe_correlate(x,y,xerr,yerr)
         
      Data with differing errors, 5e3 simulations:
        ;generate nonuniform errors
        N = 10
        xerr = randomu(seed,N) + 1.0
        yerr = randomu(seed,N)*1.5 + 1.0
       
        ;generate linear data with errors
        x = findgen(N) + randomn(seed,N)*xerr
        y = findgen(N) + randomn(seed,N)*yerr
       
        ;plot
        ep = errorplot(x,y,xerr,yerr,'o')
       
        ;correlate
        print,safe_correlate(x,y,xerr,yerr,nsim=5e3)
       
      Data with non-gaussian errors
        ;generate linear data with some scatter
        N = 10
        x = findgen(N) + 5 + 2*randomn(seed,N)
        y = findgen(N) + 5 + 3*randomn(seed,N)
       
        ;assign uniform pdfs to the x data and gamma distributions to the
        ;y data (just for example, since the data were actaully generated
        ;from a Gaussian PDF)
        ;note that the PDFs do not have to be normalized
        M = 1000 ;number of points sampling pdfs
        xerr = fltarr(N,2,M)
        yerr = fltarr(N,2,M)
        t = 0.7 ;gamma distribution scale parameter
        for i = 0,N-1 do begin &$
          xvalues = findgen(M)/(M-1) + x[i] - 0.5 &$ ;width = 1.0
          xprobs = replicate(1.0, M) &$
          xerr[i,0,*] = xvalues &$
          xerr[i,1,*] = xprobs &$
          yvalues = findgen(M)/(M-1)*y[i]*2.0 &$
          k = y[i]/t + 1 &$
          yprobs = yvalues^(k-1)*exp(-yvalues/t)/t^k/gamma(k) &$
          yerr[i,0,*] = yvalues &$
          yerr[i,1,*] = yprobs &$
        endfor
       
        ;correlate
        print,safe_correlate(x,y,xerr,yerr)

Reference


      See Numerical Recipes by Press et al. for information on the
      Spearman Rank correlation test.

Modification History


      Written by: R. O. Parke Loyd, 2014-07



© 2019 Harris Geospatial Solutions, Inc. |  Legal
My Account    |    Store    |    Contact Us