MultiRegress

NAME

MultiRegress - Multiple Regression analysis of QTL data

SYNOPSIS

MultiRegress [ -o output ] [ -i input ] [ -t trait ] [ -c cat ] [ -S size ] [ -w window ] [ -u MaxSteps ] [ -I hypo ]

DESCRIPTION

MultiRegress uses stepwise regression to map quantitative trait loci. The data consist of trait values that will be mapped onto expected genotypes. The standard data set can be translated by JZmapqtl using model 9. Map information is encoded in the data file and thus a separate map is not needed.

You might rightly ask, "What does this program add to the QTL Cartographer system?" First, unlike the other programs in the QTL Cartographer system, it does not require a separate map. Second, since all of the genotypic expected values have already been calculated, just about any type of cross can be analyzed: the user could write a program to calculate expected genotypes at specified sites. Finally, it can speed up the process of finding QTL when using MImapqtl (see the EXAMPLES section).

OPTIONS

See QTLcart(1) for more information on the global options: -h for help, -A for automatic, -V for non-Verbose, -W path for a working directory, -R file to specify a resource file, -e to specify the log file, -s to specify a seed for the random number generator, and -X stem to specify a filename stem. The options below are specific to this program.

If you use this program without specifying any options, then you will get into a menu that allows you to set them interactively.

-o: This requires a filename for output. MultiRegress will append to the file if it exists, and create a new file if it does not. If not used, then MultiRegress will use qtlcart.mr.
-i: This requires an input filename. This file must exist. The format is defined below. The default file is qtlcart.zr.
-t: Use this to specify which trait MultiRegress will analyze. If this number is greater than the number of traits, then all traits whose names do not begin with a minus sign will be analyzed. If 0, then no traits except those beginning with a plus sign will be analyzed. The default is to analyze trait 1 only.
-c: This tells MultiRegress whether to use the categorical traits in the analysis. Use a 1 to include categorical traits and a 0 to exclude them.
-S: Requires a real number in the range 0.0 to 1.0. This is a threshold p value for adding or deleting sites from the model. The default is 0.05.
-w: Requires a non-negative real number. This defines a window around each site already in the regression model; sites inside that window are blocked from further analysis.
-u: Requires an integer-valued argument. This allows you to specify a hard limit on the number of parameters in the regression analysis. By default, it is 100.
-I: Requires a value of 10 or 30. The value 10 means just analyze additive effects, while the value 30 means analyze for dominance and additive effects.
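
The trait selection rule described for -t can be sketched in Python as follows. This is only an illustration of the rule as stated above, not the program's own code: the function name select_traits and its arguments are hypothetical, and rejecting a negative trait number is an assumption, since the text does not say how such a value is handled.

```python
def select_traits(t, names):
    """Return the 1-based numbers of the traits that would be analyzed.

    t     -- the value given to -t
    names -- trait names in file order (hypothetical argument)
    """
    if t < 0:
        # Assumption: the man page does not define negative trait numbers.
        raise ValueError("trait number must be non-negative")
    if t > len(names):
        # Larger than the number of traits: everything not marked with '-'.
        return [i for i, name in enumerate(names, 1) if not name.startswith("-")]
    if t == 0:
        # Zero: only traits explicitly marked with '+'.
        return [i for i, name in enumerate(names, 1) if name.startswith("+")]
    # Otherwise a single trait is analyzed.
    return [t]
```

For instance, with three traits named Trait.1, Trait.2 and Trait.3, a value of 4 selects all three, while the default value of 1 selects only the first.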

INPUT FORMAT

Here is an example of the data input file:

        #     1002909319   -filetype JZmapqtl.zr 
        #
        #       QTL Cartographer v. 1.15e, October 2001
        #       This output file (qtlcart.zr) was 
        #       created by JZmapqtl...
        #
        #       It is 13:55:19 on Friday, 12 October 2001
        #
        #
        #  This output of JZmapqtl is meant to be used 
        #  with MultiRegress 
        -walk         2.00      Interval distance in cM
        -cross          B1      Cross
        -otraits         1      Number of explanatory variables
        -traits          2      Number of Traits 
        -positions 39           Number of positions
        -n               9      Sample Size
        -Trait 1   Trait.1
                                5             5.3           6.2           
                                4.1           5.5 
                            5.8           6.7           6.1           
                            6.4 
        -Trait 2   Trait.2
                           15            15.3          16.2          
                           24.1          25.5 
                           25.8          16.7          26.1          
                           16.4 
        -Otrait 1   Sex
                1     2     1     1     1     2     2     
                1     2 
        -Site 1 -parameter additive -chromosome 1  
        -marker 1 -name c1m1  -position 0.000100  -values
                          0.5           0.5           0.5         
                          0.499         0.499 
                         -0.5          -0.5          -0.5          
                         -0.5 
        -Site 2 -parameter additive -chromosome 1  
        -marker 1 -name c1m1  -position 0.020100  -values
                   0.4984        0.4984        0.4984        
                   0.3026        0.3026 
                  -0.4984       -0.4984       -0.4984       
                  -0.4984 
        -Site 3  ......

The data file above was created by JZmapqtl with model 9. The header of the file is similar to the qtlcart.cro format: The first line has a long integer and specifies the filetype as JZmapqtl.zr. Some header information is followed by parameter definitions that include the distances between sites (same as the walking speed in Zmapqtl, JZmapqtl and MImapqtl), the cross, numbers of categorical traits and quantitative traits, positions (or sites) and sample size (n). The data set above has a sample size of 9 for two traits, one categorical trait and 39 genotype sites. The cross and walk parameters are not needed by MultiRegress: They are provided as a reminder of how the data set was created. The genotypes are expected QTL types based on flanking marker information.

After the parameters, the traits are listed. For each trait, there will be a token -Trait followed by the trait number, trait name and n real values. After the traits come the categorical traits in the same format: The token -Otrait is followed by the categorical trait number, name and then n integer values.

Finally, data for each of the sites are presented. Site data start with the token -Site followed by information about the site. The token -parameter is followed by the word additive or dominance indicating what expected value is calculated. The other tokens indicate which chromosome and left-flanking marker define the site, and the position is from the left telomere of the chromosome. The token -values is followed by n expected values of the QTL genotype at the site. This structure is repeated for each site.
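
For users who want to read or generate such a file themselves, the layout described above can be parsed with a short Python sketch like the one below. It is not the reader used by MultiRegress. The names parse_zr, HEADER_CASTS and SITE_CASTS are hypothetical, and the sketch assumes that tokens are separated by whitespace, that comment lines begin with #, and that numeric values may wrap across several lines as in the example.

```python
HEADER_CASTS = {"-walk": float, "-cross": str, "-otraits": int,
                "-traits": int, "-positions": int, "-n": int}
SITE_CASTS = {"chromosome": int, "marker": int, "position": float}


def _is_number(token):
    """True if the token parses as a real number (e.g. '-0.5')."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def _read_site_fields(site, fields):
    """Store '-key value' pairs; anything after '-values' is numeric data."""
    i = 0
    while i < len(fields):
        token = fields[i]
        if token == "-values":
            site["values"].extend(float(v) for v in fields[i + 1:])
            return
        if token.startswith("-") and i + 1 < len(fields):
            name = token[1:]
            site[name] = SITE_CASTS.get(name, str)(fields[i + 1])
            i += 2
        else:
            i += 1


def parse_zr(text):
    """Return (header, traits, otraits, sites) parsed from the file text."""
    header, traits, otraits, sites = {}, [], [], []
    values = None  # list currently receiving wrapped numeric data
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment
        key = fields[0]
        if key in HEADER_CASTS:
            # Parameter line: token, value, then free-text description.
            header[key[1:]] = HEADER_CASTS[key](fields[1])
        elif key in ("-Trait", "-Otrait"):
            record = {"number": int(fields[1]), "name": fields[2], "values": []}
            (traits if key == "-Trait" else otraits).append(record)
            values = record["values"]
            values.extend(float(v) for v in fields[3:])
        elif key == "-Site":
            site = {"number": int(fields[1]), "values": []}
            sites.append(site)
            values = site["values"]
            _read_site_fields(site, fields[2:])
        elif _is_number(key):
            # Continuation line of data (may start with a negative value).
            if values is not None:
                values.extend(float(v) for v in fields)
        elif key.startswith("-") and sites:
            # Second line of a site description (-marker, -name, ...).
            _read_site_fields(sites[-1], fields)
    for record in otraits:
        record["values"] = [int(v) for v in record["values"]]  # categorical
    return header, traits, otraits, sites
```

A useful sanity check after parsing is that every trait, categorical trait and site carries exactly n values, where n is the sample size given in the header.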

EXAMPLES

Suppose we have a data set for an SF3 population in qtlcart.zr with three traits and the filename stem has already been set to qtlcart.

        % MultiRegress -I 30 -t 4

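This performs a stepwise regression with backward elimination steps on the data set. Because the trait value (4) is greater than the number of traits (3), all three traits are analyzed (provided none of their names begins with a minus sign), and both additive and dominance effects are estimated.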

One can also speed up the process of finding QTL using multiple interval mapping. The core algorithms of MImapqtl are computationally intensive. As an example, using MImapqtl to search for QTL de novo takes 934 seconds on a Macintosh G4 with an 867 MHz processor. Contrast this with the following sequence:

        % JZmapqtl -X mletest -M 9 -A -V
        % MultiRegress  -A -V
        % Rqtl -i mletest.mr -o mletestPhase0.mqt
        % MImapqtl -p 1  -IsMPrtseC

Converting the data with JZmapqtl and searching for putative QTL with MultiRegress yields a starting point for MImapqtl. Rqtl translates the output of MultiRegress so that MImapqtl can use it as an initial model. The -p 1 option tells MImapqtl to set the phase variable to one, and thus the program expects the input model to be in mletestPhase0.mqt. This method takes about 25 seconds, roughly 37 times faster than the de novo search, and finds a set of QTL very similar to the one MImapqtl finds when searching from scratch.

BUGS

If you have a multitrait data set, then use all of the traits. Convert them all with JZmapqtl by using a trait value greater than the number of traits, and be sure that none of the traits have names beginning with a minus sign.

CONTACT INFO

In general, it is best to contact us via email (basten@statgen.ncsu.edu).

        Christopher J. Basten, B. S. Weir and Z.-B. Zeng
        Bioinformatics Research Center, North Carolina State University
        1523 Partners II Building/840 Main Campus Drive
        Raleigh, NC 27695-7566     USA
        Phone: (919)515-1934

Please report all bugs via email to qtlcart-bug@statgen.ncsu.edu.

The QTL Cartographer web site ( http://statgen.ncsu.edu/qtlcart ) has links to the manual, man pages, ftp server and supplemental materials.
