CND README 11 September 2004

CND stands for CytoNuclear Disequilibria. This package includes two programs for calculating cytonuclear disequilibria from joint cytonuclear counts: CNDd ( http://statgen.ncsu.edu/cnd/CNDd.php ) handles diallelic data sets, and CNDm ( http://statgen.ncsu.edu/cnd/CNDm.php ) handles multiallelic data sets. The package can be downloaded via ftp at ftp://statgen.ncsu.edu/pub/cnd .


CNDd

This program calculates cytonuclear disequilibria for dilocus systems (Asmussen and Basten, 1996). It takes as input one or more sets of joint cytonuclear counts and computes disequilibria, normalized disequilibria, variances, standard deviations, test statistics, asymptotic p values, p values based on Fisher's exact test, and the minimum sample sizes needed to detect the observed disequilibria with 50% and 90% power.

In 2004, Chris added code to calculate exact sample sizes for 2x3 tables (Sanchez et al., 2006). See CNDd.html or CNDd.txt for details.
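
As a rough illustration of the kind of quantity CNDd reports, the sketch below computes genotypic and allelic disequilibria from one set of joint counts. It assumes the usual definition of a disequilibrium as a joint frequency minus the product of its marginal frequencies. The function name, the row and column layout, and the example counts are hypothetical and are not taken from CNDd; the program's own conventions are described in CNDd.html.

```python
def cytonuclear_disequilibria(counts):
    """Disequilibria from a 3 x 2 table of joint cytonuclear counts.

    Assumed layout (not CNDd's input format): rows are the nuclear
    genotypes AA, Aa, aa; columns are the cytotypes M, m.
    """
    n = sum(sum(row) for row in counts)
    if n == 0:
        raise ValueError("table has no observations")

    # Marginal frequency of cytotype M.
    freq_m = sum(row[0] for row in counts) / n

    # Genotypic disequilibrium for each nuclear genotype:
    # joint frequency with M minus the product of the marginals.
    d = []
    for row in counts:
        joint = row[0] / n
        genotype = sum(row) / n
        d.append(joint - genotype * freq_m)
    d1, d2, d3 = d

    # Allelic disequilibrium for allele A with cytotype M:
    # each heterozygote carries one copy of A, hence the factor 1/2.
    allelic = d1 + 0.5 * d2
    return {"D1": d1, "D2": d2, "D3": d3, "D": allelic}


if __name__ == "__main__":
    example = [[20, 5], [10, 10], [5, 20]]  # made-up counts
    for name, value in cytonuclear_disequilibria(example).items():
        print(f"{name} = {value:+.4f}")
```

The three genotypic disequilibria always sum to zero, which is a handy sanity check on any hand calculation.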


CNDm

This program calculates disequilibrium statistics for cytonuclear data with multiple alleles (Asmussen and Basten, 1996; Basten and Asmussen, 1997). It also does exact tests by collapsing the data into a 2 by 2 table for each combination of nuclear genotype and cytotype in turn. Finally, it can run a Monte Carlo and a Markov chain simulation to estimate the probability that the entire data set is in equilibrium (as opposed to any single genotype/cytotype pair).
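
The collapsing step can be sketched as follows. This is only an illustration under stated assumptions: the table layout (rows are nuclear genotypes, columns are cytotypes), the function names and the example counts are hypothetical, and the two-sided p value uses the common rule of summing the probabilities of all tables no more likely than the observed one. The README does not say which convention CNDm follows.

```python
from math import comb


def collapse(table, i, j):
    """Collapse an r x c table into the 2 x 2 table for cell (i, j).

    Returns (a, b, c, d): a = genotype i with cytotype j,
    b = genotype i with any other cytotype,
    c = any other genotype with cytotype j, d = everything else.
    """
    total = sum(sum(row) for row in table)
    row_i = sum(table[i])
    col_j = sum(row[j] for row in table)
    a = table[i][j]
    return a, row_i - a, col_j - a, total - row_i - col_j + a


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for a 2 x 2 table."""
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        # given fixed row and column totals.
        return comb(r1, x) * comb(r2, c1 - x) / denom

    low, high = max(0, c1 - r2), min(r1, c1)
    p_obs = prob(a)
    # Small relative tolerance so ties with the observed table count.
    p = sum(prob(x) for x in range(low, high + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    table = [[3, 1], [1, 2], [0, 1]]  # made-up counts
    for i in range(len(table)):
        for j in range(len(table[0])):
            p = fisher_exact_two_sided(*collapse(table, i, j))
            print(f"genotype {i}, cytotype {j}: p = {p:.4f}")
```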

This program doesn't check for Hardy-Weinberg disequilibria at the nuclear loci.


PLATFORMS

Macintosh

The Macintosh archive, CNDMac.hqx, contains four programs:

  1. CNDm.classic, the multiallelic binary for non-carbonized Macs.

  2. CNDd.classic, the diallelic binary for non-carbonized Macs.

  3. CNDm.Carbon, the multiallelic binary for PowerPC Macs, which should run under Mac OS 9 or Mac OS X.

  4. CNDd.Carbon, the diallelic binary for PowerPC Macs, which should run under Mac OS 9 or Mac OS X.

Unpack the archive CNDMac.hqx with StuffIt Expander. If you have Mac OS X with the developer tools installed, consider downloading the UNIX version and compiling it yourself.

UNIX

The UNIX distribution, CNDUnix.tar.gz, is a compressed tar archive of the source and documentation. Put CNDUnix.tar.gz in an appropriate place, uncompress it, untar it, and edit the Makefile to specify your compiler and bin directory. Once this is done, make the programs. For example, if I had downloaded the file to my home directory (say /Users/basten), I might create a directory CND and do the following:

  %  mkdir CND
  %  cp CNDUnix.tar.gz CND
  %  cd CND
  %  gunzip CNDUnix.tar.gz
  %  tar xf CNDUnix.tar
  %  vi Makefile
  %  make install

When editing the Makefile with vi (or any text editor), check the lines:

  CC=cc
  BINDIR= ../bin
  CFLAGS=

You might change them to

  CC=gcc
  BINDIR= /usr/local/bin
  CFLAGS= -O3

if you have gcc, want the binaries in /usr/local/bin/ and want the code optimized.

MS-Windows

Download the CNDWin.zip file and unzip it in an appropriate place. To run the programs, open a command window, change directory to the CNDWin\bin directory and type the name of the program you want to run. If you want to analyze data, you need to put the data files in the CNDWin\bin directory. For example, if you unzipped CNDWin.zip in C:\home, and copied the file hyla.dat from C:\home\CNDWin\dat to C:\home\CNDWin\bin, do this:

  c:\>  cd c:\home\CNDWin\bin
  c:\>  CNDd -i hyla.dat

You can then look at the output file (CNDd.out) with any text editor. A plain-text editor such as Notepad is best, because a word processor such as Word may alter the file's formatting.


GENERAL USAGE

The files CNDd.dat and CNDm.dat are sample data files and show how to format your own data. There are also three real data sets, from bluegill, Hyla tree frogs and eels.

For both programs, you can get a summary of options by running them with the -h option. For example, when you run CNDm with -h (CNDm -h), you get:

  USAGE:
  
     CNDm [-i input] [-o output] [-e logfile] [-r Reps] [-M MonteReps] 
      [-m MarkovReps] [-X ] [-V] [-t]
  
  DEFAULTS:
  
       [-o CNDm.out]
       [-e CNDm.err]
       [-i CNDm.dat]
       [-r 0 ] [-rB 0 ] [-rC 0 ]
       [-M 0 ] [-MB 0 ] [-MC 0 ]
       [-m 0 ] [-mB 0 ] [-mC 0 ]
       [-X ] Turns off exact test calculations.
       [-V ] Turns off verbosity mode.
  
  Notes:
      1. Setting Reps sets both MonteReps and MarkovReps to Reps.
      2. rB is the number of batches, and sets MB and mB.
      3. rC is the number of observations per batch, and sets MC and mC.
      4. Reps, if positive, will be reset to rB * rC. These default to 10 and 100.
      5. Markov and Monte B and C values are set similar to Reps.
      6. If you are on a Mac or PC, you need to quit now.

If MonteReps equals 0, a Monte Carlo shuffle test for the entire data set will not be run; if it is greater than 0, the test will be performed. MonteReps is 0 by default unless the -r or -M flag is used with a long integer value giving the number of shuffles to do. The shuffles are grouped into batches so that the sample standard error of the statistic can be calculated. The number of batches is 10 by default, and you can set it with the -rB or -MB flag.
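
To make the batching idea concrete, here is a minimal sketch of a shuffle test whose replicates are split into batches, so that the spread of the per-batch estimates gives a standard error. It is not CNDm's implementation: the README does not name the test statistic, so a Pearson chi-square statistic stands in for it, and the function names, the table layout and the fixed seed are all hypothetical. The defaults of 10 batches and 100 shuffles per batch mirror the values given in the usage notes.

```python
import random
from statistics import mean, stdev


def chi_square(genos, cytos, n_rows, n_cols):
    """Pearson chi-square statistic for paired labels (stand-in statistic)."""
    counts = [[0] * n_cols for _ in range(n_rows)]
    for g, c in zip(genos, cytos):
        counts[g][c] += 1
    n = len(genos)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(row[j] for row in counts) for j in range(n_cols)]
    stat = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:  # skip cells in empty rows or columns
                stat += (counts[i][j] - expected) ** 2 / expected
    return stat


def shuffle_test(table, batches=10, per_batch=100, seed=0):
    """Return (p estimate, standard error) from batches * per_batch shuffles."""
    if batches < 2:
        raise ValueError("need at least two batches for a standard error")
    rng = random.Random(seed)
    # Expand the table into one (genotype, cytotype) pair per individual.
    genos, cytos = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            genos.extend([i] * count)
            cytos.extend([j] * count)
    if not genos:
        raise ValueError("table has no observations")
    n_rows, n_cols = len(table), len(table[0])
    observed = chi_square(genos, cytos, n_rows, n_cols)

    batch_p = []
    for _ in range(batches):
        hits = 0
        for _ in range(per_batch):
            # Shuffling cytotypes breaks any association but keeps the margins.
            rng.shuffle(cytos)
            if chi_square(genos, cytos, n_rows, n_cols) >= observed - 1e-9:
                hits += 1
        batch_p.append(hits / per_batch)
    return mean(batch_p), stdev(batch_p) / batches ** 0.5


if __name__ == "__main__":
    p, se = shuffle_test([[12, 3], [8, 9], [2, 11]])  # made-up counts
    print(f"p = {p:.3f} (standard error {se:.3f})")
```

In this sketch the total number of shuffles is the product of the number of batches and the number of shuffles per batch.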

If MarkovReps equals 0, a Markov Chain Monte Carlo test for the entire data set will not be run; if it is greater than 0, the test will be run. MarkovReps is 0 by default unless the -r or -m flag is used. The replicates are grouped into batches so that the sample standard error of the statistic can be calculated. The number of batches is 10 by default, and you can set it with the -rB or -mB flag.

If the -r flag is used, then both MarkovReps and MonteReps will be set to the number of reps indicated by the -r option. Note that if a -M or -m option is typed on the command line AFTER the -r option, its value will take precedence. The number of batches can be set with the -rB flag. The -rC flag sets the number of observations per batch. The number of reps will be the product of these numbers if both are set.

-X turns off the exact test for each cytonuclear genotype. This is the opposite of the behavior prior to 24 July 1996.

There is also a -V option, which controls verbosity mode. Note that the usage summary above describes -V as turning verbosity off.

The -t option causes the program to output its results in LaTeX2e format.

For Macintoshes, just click on the icon of choice. In the console window, you can use the options to specify the input, output and error files. You can also specify the number of reps as above. In Windows, select the program in the File Manager and choose Run under the File menu.

Do not type the square brackets; they just indicate optional arguments. For example,

  CNDm -o test.out -e test.err -i test.i -r 10000

would set the number of reps for both the Markov and Monte Carlo approximations to 10,000.

The -h flag with CNDd yields:

  USAGE:
 
   CNDd [-o outfile] [-e errorfile] [-i inputfile] -t
 
  DEFAULTS:
 
         [-o CNDd.out]
         [-e CNDd.err]
         [-i CNDd.dat]
         [-t ] Means output data analysis in LaTeX2e format 
         [-a]  Set alpha = p(Type I error) for exact sample size (0.05)
         [-b]  Set beta = p(Type II error) for same (0.5)
         [-S ] Set the Sanchez flag for exact sample sizes [0,1,2,3,4,5]
  -S with greater than 0 may require a long time. 
  The following options are for theoretical calculations.
         [-p1] Set p.1 (0.1)
         [-p2] Set p.2 (0.1)
         [-pM] Set p1. (0.1)
         [-r1] Set rho1 (0.5)
         [-r2] Set rho2 (0.5)
         [-d1] Set D1 (0.0) and ignore rho1
         [-d2] Set D2 (0.0) and ignore rho2
         [-pu] Set upper bound for power in power curve
         [-Nl] Set lower bound for sample size (50)
         [-Nu] Set upper bound for sample size (100)
         [-Ni] Set increment on sample size (25)
 
        Note that -p1, -p2 and -pM require real valued arguments in (0,1)
        with 0 < p1+p2 < 1.  -Nl, -Nu and -Ni set the lower bound, upper bound
        and increment on the sample size.  -S with value > 1 mean ignore data files.
        The given values are the defaults.

See the HTML files in the doc folder for more information.


REFERENCES

  1. M.A. Asmussen and C.J. Basten (1994). Sampling theory for cytonuclear disequilibria. Genetics 138:1351-1363.

  2. M.A. Asmussen and C.J. Basten (1996). Constraints and normalized measures for cytonuclear disequilibria. Heredity 76:207-214.

  3. C.J. Basten and M.A. Asmussen (1997). The exact test for cytonuclear disequilibria. Genetics 146:1165-1171.

  4. M.S. Sanchez, C.J. Basten, A.M. Ferrenberg, M.A. Asmussen and J. Arnold (2006). Exact sample sizes needed to detect dependence in 2 x 3 tables. Theoret. Pop. Biol. 69:111-120.


SEE ALSO

CNDd(1), CNDm(1)


CONTACT INFO

In general, it is best to contact us via email.

  Christopher J. Basten 
  Syngenta Biotechnology, Inc.
  Research Triangle Park   
  Phone: (919)597-3021
  christopher.basten at syngenta.com

  Maria S. Sanchez
  Department of Environmental Science, Policy and Management
  University of California 
  Berkeley, CA 94720   USA
  msanchez at nature.berkeley.edu

The BRC web site ( http://statgen.ncsu.edu/ ) has links to a software page and from there to the newest version of this program (Cytonuclear Disequilibria).

The direct link is here: http://statgen.ncsu.edu/brcwebsite/software_BRC.php#Cytonuclear-diseq

There are versions for Windows ( ftp://statgen.ncsu.edu/pub/cnd/CNDWin.zip ), Macintosh ( ftp://statgen.ncsu.edu/pub/cnd/CNDMac.hqx ) and Unix ( ftp://statgen.ncsu.edu/pub/cnd/CNDUnix.tar.gz ).


