• Usage: R CMD INSTALL [options] pkgs

    Install the add-on packages specified by pkgs. The elements of pkgs can
    be relative or absolute paths to directories with the package (bundle)
    sources, or to gzipped package 'tar' archives. The library tree to
    install to can be specified via '--library'. By default, packages are
    installed in the library tree rooted at the first directory given in the
    environment variable R_LIBS if this is set and non-null, and into the
    default R library tree (/usr/lib/R/library) otherwise.

    Options:
      -h, --help            print short help message and exit
      -v, --version         print INSTALL version info and exit
          --configure-args=ARGS
                            set arguments for the package's configure script
                            (if any)
          --configure-vars=VARS
                            set variables for the configure script (if any)
      -c, --clean           remove all files created during installation
      -s, --save[=ARGS]     save the package source as an image file, and
                            arrange for this file to be loaded when the
                            package is attached; if given, ARGS are passed
                            to R when creating the save image
          --no-save         do not save the package source as an image file
          --lazy            use lazy loading
          --no-lazy         do not use lazy loading
          --lazy-data       use lazy loading for data
          --no-lazy-data    do not use lazy loading for data (current default)
      -d, --debug           turn on shell and build-help debugging
      -l, --library=LIB     install packages to library tree LIB
          --no-configure    do not use the package's configure script
          --no-docs         do not build and install documentation
          --with-package-versions
                            allow for multiple versions of the same package
          --use-zip-data    collect data files in zip archive
          --use-zip-help    collect help and examples into zip archives
          --use-zip         combine '--use-zip-data' and '--use-zip-help'
          --fake            do minimal install for testing purposes
          --no-lock         install on top of any existing installation
                            without using a lock directory
          --build           build binary tarball(s) of the installed package(s)

    Report bugs to .
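
    From inside an R session, the same command line can be assembled and passed to system2(). The following is only a sketch: the helper name install_args, the tarball name and the library directory are hypothetical, and the block builds the argument vector without running the installation.

    ```r
    # Build the argument vector for `R CMD INSTALL` (helper name is hypothetical).
    install_args <- function(pkgs, library = NULL, no_docs = FALSE) {
      opts <- character(0)
      if (!is.null(library)) opts <- c(opts, paste0("--library=", library))
      if (no_docs) opts <- c(opts, "--no-docs")
      c("CMD", "INSTALL", opts, pkgs)
    }

    # Hypothetical tarball and library directory
    args <- install_args("mypkg_0.1.tar.gz", library = "/tmp/Rlibs")
    print(args)

    # To actually run the installation (assumes the tarball and directory exist):
    # system2(file.path(R.home("bin"), "R"), args)
    ```

    Keeping the options in a character vector makes it easy to add or drop flags such as --no-docs before the command is run.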

  • File location:
    /usr/share/vim/addons/ftplugin/r.vim
    The default xterm does not support my locale, although R itself recognizes it. I use zh_CN.GBK, so I modified the r.vim script as follows:
    " ftplugin for R files
    "
    " Author: Iago Mosqueira
    " Author: Johannes Ranke
    " Author: Fernando Henrique Ferraz Pereira da Rosa
    " Maintainer: Johannes Ranke
    " Last Change: 2006 Feb 28
    " SVN: $Id: r.vim 37 2006-02-28 22:36:51Z ranke $
    "
    " Code written in vim is sent to R through a perl pipe
    " [funnel.pl, by Larry Clapp ], as individual lines,
    " blocks, or the whole file.

    " Press to open a new xterm with a new R interpreter listening
    " to its standard input (you can type R commands into the xterm)
    " as well as to code pasted from within vim.
    "
    " After selecting a visual block, 'r' sends it to the R interpreter
    "
    " In insert mode, sends the active line to R and moves to the next
    " line (write and process mode).
    "
    " Maps:
    " Start a listening R interpreter in new xterm
    " Start a listening R-devel interpreter in new xterm
    " Start a listening R --vanilla interpreter in new xterm
    " Run current file
    " Run line under cursor
    " r Run visual block
    " Write and process

    " Only do this when not yet done for this buffer
    if exists("b:did_ftplugin")
    finish
    endif

    " Don't load another plugin for this buffer
    let b:did_ftplugin = 1

    "disable backup for .r-pipe
    setl backupskip=.*pipe

    "set r-friendly tabbing
    set expandtab
    set tabstop=4
    set shiftwidth=4

    "Start a listening R interpreter in new xterm
    "noremap :!xterm -T 'R' -e funnel.pl ~/.r-pipe "R && echo -e 'Interpreter has finished. Exiting. Goodbye.\n'"&
    "call konsole here because xterm cannot support my locale: zh_CN.GBK
    noremap :!konsole -T 'R' -e funnel.pl ~/.r-pipe "R && echo -e 'Interpreter has finished. Exiting. Goodbye.\n'"&

    "Start a listening R-devel interpreter in new xterm
    "noremap :!xterm -T 'R' -e funnel.pl ~/.r-pipe "R-devel && echo 'Interpreter has finished. Exiting. Goodbye.'"&
    "call konsole here because xterm cannot support my locale: zh_CN.GBK
    noremap :!konsole -T 'R' -e funnel.pl ~/.r-pipe "R-devel && echo 'Interpreter has finished. Exiting. Goodbye.'"&

    "Start a listening R --vanilla interpreter in new xterm
    "noremap :!xterm -T 'R' -e funnel.pl ~/.r-pipe "R --vanilla && echo 'Interpreter has finished. Exiting. Goodbye.'"&
    "call konsole here because xterm cannot support my locale: zh_CN.GBK
    noremap :!konsole -T 'R' -e funnel.pl ~/.r-pipe "R --vanilla && echo 'Interpreter has finished. Exiting. Goodbye.'"&

    "send line under cursor to R
    noremap :execute line(".") 'w >> ~/.r-pipe'
    inoremap :execute line(".") 'w >> ~/.r-pipe'

    "send visual selected block to R
    vnoremap r :w >> ~/.r-pipe

    "write and process mode (somehow mapping does not work)
    inoremap :execute line(".") 'w >> ~/.r-pipe'o

    "send current file to R
    noremap :execute '1 ,' line("$") 'w >> ~/.r-pipe'
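  • In SAS, normality can be tested with the UNIVARIATE procedure.

    Syntax: proc univariate data=luan.norm1 normal;
    If the data set has fewer than 2000 observations, use the W:Normal (Shapiro-Wilk) test; otherwise use the D:Normal (Kolmogorov-Smirnov) test.
    If the p-value reported for W:Normal (Pr < W) is below 0.05, normality is rejected. Likewise, if the p-value reported for D:Normal (Pr > D) is below 0.05, normality is rejected.
    In R, shapiro.test(x) tests whether x follows a normal distribution (for data sets with fewer than 2000 observations).

    shapiro.test(rnorm(100, mean = 5, sd = 3))
    shapiro.test(runif(100, min = 2, max = 4))

    In SPSS, use the menu item Analyze > Nonparametric Tests > 1-Sample K-S.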
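
    The decision rule above can be sketched in R as follows. This is a minimal illustration on simulated data; the variable names and the helper is_rejected are made up for the example. For the larger sample it uses ks.test() against a normal distribution whose mean and standard deviation are estimated from the data, which is an assumption of this sketch and makes the test conservative.

    ```r
    set.seed(1)                                   # reproducible simulated data
    x_norm <- rnorm(100, mean = 5, sd = 3)
    x_unif <- runif(100, min = 2, max = 4)

    # Shapiro-Wilk test for small and moderate samples
    sw_norm <- shapiro.test(x_norm)
    sw_unif <- shapiro.test(x_unif)

    # Kolmogorov-Smirnov test for a larger sample, comparing against a normal
    # distribution with parameters estimated from the sample (assumption)
    x_big  <- rnorm(3000, mean = 5, sd = 3)
    ks_big <- ks.test(x_big, "pnorm", mean = mean(x_big), sd = sd(x_big))

    # Normality is rejected when the p-value falls below the chosen level
    is_rejected <- function(test, alpha = 0.05) test$p.value < alpha

    print(c(normal  = is_rejected(sw_norm),
            uniform = is_rejected(sw_unif),
            large   = is_rejected(ks_big)))
    ```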

  • 2005-02-27

    Introduction to Geneland

    Geneland : an R package for landscape genetics.
    Geneland is free software distributed as an add-on to the free statistical
    software R. It is currently available for Linux, Windows and Mac OS.

    Introduction
    Features
    Statistical model
    Estimation algorithm
    Example

    Haploid organisms

    Installation
    Getting started
    Reference manual
    Known bugs
    Changes log
    Credit
    References

    Mailing list
    Geneland hot-line


    Introduction

    The main purpose of Geneland is to process geo-referenced individual
    multilocus genetic data and to detect population structure, i.e.
    sub-populations. Although the term population often refers to genetic
    structure only, it is often realistic to assume that populations are
    spatially organised. It therefore makes sense not only to estimate the
    population membership of each individual in a dataset but also to try to
    delineate the spatial domain of each population. To this end, Geneland
    uses both spatial and genetic information to estimate the number of
    populations in a dataset and to delineate their spatial organisation.




    Features
    Geneland
    - estimates the number of populations present in the dataset
    - produces maps giving the population membership of each geographical
    pixel, either as probabilities or as population labels
    - produces files giving the population membership of each individual
    - computes pair-wise Fst for all pairs of inferred populations


    Statistical model


    Geneland is based on four main assumptions:

    i) The number of populations is unknown, and all values between 1 and an
    upper bound (which has to be set by the user) are considered equally likely.

    ii) Sub-populations are spread over areas given by the union of some
    polygons in the spatial domain. In mathematical terms, we assume a hidden
    colored Poisson-Voronoi tessellation.

    iii) Hardy-Weinberg equilibrium is assumed within each population.

    iv) Allele frequencies in each population are unknown and treated as random
    variables following either the so-called Dirichlet model or the F-model
    (Falush et al. [2003]).
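
    To make assumption ii) concrete, a colored Poisson-Voronoi tessellation can be simulated in a few lines of R. This is an illustrative sketch, not Geneland's own code: the Poisson rate, the unit-square domain, the grid size and all variable names are assumptions made for the example.

    ```r
    set.seed(42)
    max_pop  <- 3                                 # assumed upper bound on the number of populations
    n_nuclei <- max(1, rpois(1, lambda = 10))     # Poisson number of nuclei, at least one
    nuclei   <- cbind(x = runif(n_nuclei), y = runif(n_nuclei))
    colour   <- sample(max_pop, n_nuclei, replace = TRUE)  # population label of each tile

    # Regular grid of pixels over the unit square
    grid <- expand.grid(x = seq(0, 1, length.out = 50),
                        y = seq(0, 1, length.out = 50))

    # Each pixel belongs to the Voronoi tile of its nearest nucleus
    nearest <- apply(grid, 1, function(p) {
      which.min((nuclei[, "x"] - p["x"])^2 + (nuclei[, "y"] - p["y"])^2)
    })

    # Population label of every pixel: the union of same-colored tiles
    # forms the area occupied by one population
    pop_map <- matrix(colour[nearest], nrow = 50)

    # image(pop_map)   # uncomment to display the tessellation
    ```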


    Estimation algorithm
    All unknown quantities in the model are treated as random variables, namely:
    - number of populations
    - number, locations and population memberships of polygons coding the
    spatial organisation of populations
    - allele frequencies (in the present-day populations and also in the
    ancestral population in the F-model)
    - drift parameters (in the F-model)

    Inference is carried out through an MCMC algorithm. All parameters are
    considered unknown within a Bayesian model, and averaging over samples from
    their posterior distribution makes it possible to plot maps of the
    posterior probability of population membership.
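
    The averaging step can be sketched as follows. The MCMC output here is faked with random labels, and the object names are hypothetical; the point is only to show how posterior membership probabilities are obtained as frequencies over sampled states.

    ```r
    set.seed(7)
    n_iter <- 200; n_pixels <- 6; n_pop <- 3

    # Hypothetical MCMC output: one sampled population label per pixel
    # (columns) and per iteration (rows)
    labels <- matrix(sample(n_pop, n_iter * n_pixels, replace = TRUE),
                     nrow = n_iter, ncol = n_pixels)

    # Posterior probability that pixel j belongs to population k:
    # the fraction of iterations in which pixel j carried label k
    post_prob <- sapply(seq_len(n_pop), function(k) colMeans(labels == k))

    # Modal (most probable) population of each pixel
    modal_pop <- max.col(post_prob, ties.method = "first")

    print(round(post_prob, 2))
    print(modal_pop)
    ```

    Each row of post_prob sums to one, so it can be mapped either as per-population probabilities or, through modal_pop, as a single population label per pixel.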


    Example
    Below is an example of output of Geneland on a real data set of 88
    wolverines sampled at 10 micro-satellite loci
    (data kindly provided by Lisette Waits from the University of Idaho).

    Geneland provides an estimate of the number of populations at
    Hardy-Weinberg equilibrium and a map of the areas occupied by these
    populations.


    Haploid organisms


    The model currently implemented in Geneland makes computations for diploid
    organisms.
    Extension to haploid organisms is scheduled for a future version.
    If you want to use the present version of Geneland with haploid organisms,
    you first need to diploidise your haploid data. Silke Werth has written
    some R functions to handle haploid genotypes.

    Two warnings:
    - Under maximum likelihood estimation, the diploidised dataset and the
    original haploid dataset give exactly the same results.
    With Bayesian estimators, however, the diploidised dataset gives a
    trade-off between the haploid Bayesian estimator and the haploid maximum
    likelihood estimator. As both are well-behaved estimators, any average of
    them makes sense.
    - In addition, the diploidised dataset is formally equivalent to using 2*n
    individuals instead of n; hence, in a Bayesian setting, it leads to
    underestimating the true uncertainty about the parameters.
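
    One plausible reading of "diploidise" is to code each haploid individual as a homozygous diploid by duplicating every allele. The sketch below follows that reading; it is not Silke Werth's code, and the data layout (one row per individual, one column per locus), the function name and the column naming are assumptions.

    ```r
    # Hypothetical haploid genotypes: one row per individual, one column per locus
    haploid <- matrix(c(1, 3,
                        2, 3,
                        1, 4), ncol = 2, byrow = TRUE,
                      dimnames = list(NULL, c("locA", "locB")))

    # Duplicate every locus column so that each individual is coded as a
    # homozygous diploid: columns become locA.1, locA.2, locB.1, locB.2
    diploidise <- function(h) {
      d <- h[, rep(seq_len(ncol(h)), each = 2), drop = FALSE]
      colnames(d) <- paste(rep(colnames(h), each = 2), 1:2, sep = ".")
      d
    }

    diploid <- diploidise(haploid)
    print(diploid)
    ```

    Because every allele appears twice, the diploidised table carries 2*n allele copies per locus, which is the source of the second warning above.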


    Installation

    - In order to use Geneland, you first need to have R installed on your
    computer; see the R homepage.
    Note that compatibility of the current version of Geneland is checked with
    the current version of R only.

    - Launch R

    - Type install.packages("Geneland") in the R prompt

    - Answer yes to Delete downloaded files (y/N)?


    Getting started

    - Launch R

    - Load Geneland with the command library(Geneland)

    - Launch the on-line help of R with the command help.start()

    - Poke around the help of Geneland with help(Geneland).

    A complete sequence of example commands is given in the help page of
    function mcmcFmodel, which you can access by typing help(mcmcFmodel).


    Known bugs
    In Geneland 0.5
    - Under windows: extra carriage return in some output files after the tenth
    column if the number of populations is allowed to be larger than 10. Files
    seem to be OK, just a bit messy.

    - On Mac OS X 10.4: compilation error at installation

    - Under windows: mcmcFmodel fails with an error message along the lines of:
    The instruction at "0x5ad71531" referenced memory at "0x00000014".
    The memory could not be "read". Click OK to terminate the program.


    In Geneland 0.6
    - Function simFmodel returns a list containing element c instead of
    color.nuclei

    - hard-coded call of X11() in functions simFmodel, Plot* and PosteriorMode

    - documentation refers to path.data which no longer exists

    - array ptemp not passed as argument in subroutine rpriof in Fortran

    - Documentation of PostProcessChain() wrongly explains the storage format
    of images (row wise instead of column wise)

    - PostProcessChain transmits coordinates but should transmit its transpose.
    Therefore, the function PostProcessChain() did not compute the limits of
    the spatial domain correctly.
    This could result in wrong maps with PosteriorMode(). PostProcessChain()
    worked correctly if the range of the x values of the individuals includes
    the range of the y values (or vice-versa). If your data comply with this
    condition, any map computed with a previous version of Geneland should be
    OK. Otherwise it should have resulted in very fancy maps (displaying
    bundles of straight lines; the more the data depart from this condition,
    the fancier).


    Changes log
    Changes from version 0.4 to version 0.5:
    - In Fortran subroutine mcmc (called by mcmcFmodel): true coordinates are
    written to an ASCII file named hidden.coord.txt

    - Input data passed to functions as R objects instead of through a path to
    ASCII files


    Changes from version 0.5 to version 0.6:
    - path.mcmc not written any longer in parameter file parameter.txt (in
    order to avoid issues with path containing spaces)

    - All functions of Geneland now first transform input data (coordinates,
    genotypes and allele.numbers) into matrices (to avoid trouble with data
    frames)

    - warning message of setplot function avoided by replacing last instruction
    return (xlim, ylim, oldpin, newpin) by list(xlim, ylim, oldpin, newpin)

    - carriage return after the tenth column in files proba.pop.membership.txt
    and proba.pop.membership.perm.txt removed. Output is now written correctly
    with up to a thousand populations (though this is not recommended).

    - path.mcmc now set as paste(tempdir(),"/",sep="") in the example of
    mcmcFmodel
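
    The setplot change above reflects a general R idiom: a function returns exactly one object, so several values have to be wrapped in a list. A minimal sketch, with a made-up function name and fields:

    ```r
    # Return several values by wrapping them in a named list
    # (plot_limits and its fields are illustrative, not Geneland code)
    plot_limits <- function(x, y) {
      xlim <- range(x)
      ylim <- range(y)
      list(xlim = xlim, ylim = ylim)   # return(xlim, ylim) is not valid in current R
    }

    lims <- plot_limits(c(2, 5, 3), c(10, 7, 12))
    print(lims$xlim)
    print(lims$ylim)
    ```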


    Changes from version 0.6 to version 0.7:
    -- NO SYNTACTIC CHANGE --

    * Function simFmodel now returns a list containing element color.nuclei

    * call of X11() in functions simFmodel, Plot* and PosteriorMode replaced by
    get(getOption("device"))()

    * Documentation of PostProcessChain no longer refers to
    path.data (which no longer exists since version 0.6)

    * Array ptemp now passed as argument in subroutines
    rpriof and rpriorfa in Fortran

    * PostProcessChain passes matrix t(coordinates) instead of
    coordinates

    * Functions PlotTesselation and PosteriorMode now use coordinates directly
    instead of the useless matrix s

    * Documentation of PostProcessChain() now correctly explains the
    storage format of images (column wise instead of row wise).

    * Example code in mcmcFmodel implemented with 5 loci, 10 alleles/locus
    on a rectangular domain, with a longer MCMC run

    * Character strings for paths to the various files now declared in
    Fortran as character*256 instead of character*200

    * Declaration of useless matrix s removed from PostProcessChain

    * Additional functionality in PostProcessChain:
    it now also computes posterior probabilities of population membership
    for individuals and writes them to a file named
    "proba.pop.membership.indiv.txt".
    The modal population of each individual is written to a file named
    "modal.pop.indiv.txt"


    Credit
    People who helped improve Geneland with stimulating comments include:
    - Annie Bouvier
    - Aurélie Coulon
    - Arnaud Estoup
    - Frédéric Mortier


    Bibliography
    On the implementation of mixture models in population genetics:
    - Pritchard, J.K., Stephens, M. and Donnelly, P.,
    Inference of population structure using multilocus genotype data,
    Genetics, 2000, 155, 945-959.

    - Falush, D., Stephens, M. and Pritchard, J.K.,
    Inference of population structure using multilocus genotype data:
    Linked loci and correlated allele frequencies,
    Genetics, 2003, 164, 1567-1587.


    On the implementation of variable dimension MCMC algorithms in population
    genetics:
    - Corander, J.C., Waldmann, P. and Sillanpaa, M.J.,
    Bayesian analysis of genetic differentiation between populations,
    Genetics, 2003, 163, 367-374.

    - Corander, J.C., Waldmann, P., Martinen, P. and Sillanpaa, M.J.,
    BAPS2: Enhanced possibilities for the analysis of genetic population
    structure,
    Bioinformatics, 2004, 20(15).


    On the use of Voronoi tessellations in population genetics:
    - Dupanloup, I., Schneider, S. and Excoffier, L.,
    A simulated annealing approach to define genetic structure of populations,
    Molecular Ecology, 2002, 11, 2571-2581.


    References on this model

    - Guillot, G., Mortier, F. and Estoup, A., Geneland: a program for
    landscape genetics, to appear in Molecular Ecology Notes.

    - Guillot, G., Estoup, A., Mortier, F. and Cosson, J.F., A spatial
    statistical model for landscape genetics, to appear in Genetics.


    Mailing List
    If you want to be informed of changes in Geneland and progress on related
    works, please let me know by email
    (guillot[at]inapg.inra.fr). You will be added to the mailing list.


    Geneland hot-line


    Given that:
    - most people in ecology are not familiar with R
    - Geneland is primarily intended for Linux users, while most users are
    actually Windows users
    - the Geneland on-line help does not give much detail
    - there are probably a few bugs in the present version

    I'll try to answer all user inquiries about Geneland (as long as I can).

    Feel free to report any bugs, criticism and more generally any good or bad
    experience with Geneland (guillot[at]inapg.inra.fr). Try to give as much
    detail as you can, especially, try to give the exact sequence of R commands
    involved in the problem.

    For general questions about R, the R FAQ and the R mailing lists are the
    right places to ask.

  • Tonight I found a bioinformatics website for R, http://www.bioconductor.org/, which provides a suite of bioinformatics solutions. I also found two packages for handling population genetics data:
    1. genetics: Population Genetics: http://cran.r-project.org/src/cont ... iptions/genetics.html
    Classes and methods for handling genetic data. Includes classes to represent genotypes and haplotypes at single markers up to multiple markers on multiple chromosomes. Functions include allele frequencies, flagging homo/heterozygotes, flagging carriers of certain alleles, estimating and testing for Hardy-Weinberg disequilibrium, and estimating and testing for linkage disequilibrium.

    2. rmetasim: An individual-based population genetic simulation environment
    http://cran.r-project.org/src/cont ... iptions/rmetasim.html
    http://linum.cofc.edu/software.html
    An interface between R and the metasim simulation engine. Facilitates the use of the metasim engine to build and run individual-based population genetics simulations. The simulation environment is documented in: Allan Strand. Metasim 1.0: an individual-based environment for simulating population genetics of complex population dynamics. Mol. Ecol. Notes, 2:373-376, 2002. (Please contact Allan Strand with comments, bug reports, etc.)


  • # Permutation test: angle between the first principal axes of two random
    # subsamples of the same data set
    rm(list = ls(all = TRUE))
    replicates    <- 5000
    vectorlength1 <- 49                  # rows drawn into the first subsample
    data1 <- read.table("e:/rstudy/gdfj.txt", header = FALSE)
    vectorlength <- nrow(data1)          # number of observations
    vectors      <- ncol(data1)          # number of variables

    angles <- numeric(replicates)
    macs1  <- matrix(0, nrow = replicates, ncol = vectors)
    macs2  <- matrix(0, nrow = replicates, ncol = vectors)

    for (k in 1:replicates) {
      # Split the rows at random into two disjoint subsamples
      xx <- sample(1:vectorlength, vectorlength1, replace = FALSE)
      newdata1 <- data1[xx, ]
      newdata2 <- data1[-xx, ]
      # First eigenvector (principal axis) of each covariance matrix
      eigenvector1 <- eigen(cov(newdata1))$vectors[, 1]
      eigenvector2 <- eigen(cov(newdata2))$vectors[, 1]
      # Divide by the mean so that both vectors follow the same sign convention
      mac1 <- eigenvector1 / mean(eigenvector1)
      mac2 <- eigenvector2 / mean(eigenvector2)
      # Angle between the two vectors, in degrees; the cosine is clamped to
      # [-1, 1] to guard against rounding error
      cosine <- sum(mac1 * mac2) / (sqrt(sum(mac1^2)) * sqrt(sum(mac2^2)))
      angles[k] <- acos(min(1, max(-1, cosine))) * 180 / pi
      macs1[k, ] <- mac1
      macs2[k, ] <- mac2
    }
    write.table(angles, file = "angles.txt")
    write.table(macs1, file = "macs1.txt")
    write.table(macs2, file = "macs2.txt")
    hist(angles, probability = TRUE, main = "Permutation distribution of the angle",
         xlab = "Angle (degrees)", ylab = "Density")
    print("Game is over!!!!")

    The simulated θ1 is shown in the figure.
  • 2005-02-27

    The apropos function

    This function finds the exact names of objects that match a partial name. It is very useful when you only remember part of a name.
    For example:
    > apropos("sqr")
    [1] "sqrt"
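
    Since apropos() treats its first argument as a regular expression and accepts a mode argument, the search can be narrowed further. A short sketch (the patterns are arbitrary examples):

    ```r
    # Partial match on object names in the search path
    hits <- apropos("sqr")

    # Restrict the search to functions whose name starts with "read."
    fun_hits <- apropos("^read\\.", mode = "function")

    print(hits)
    print(head(fun_hits))
    ```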

  • gap package: http://www.ucl.ac.uk/~rmjdjhz/software/gap_1.0-3.zip
    Author's homepage: http://www.hgmp.mrc.ac.uk/~jzhao/
    This is a package bundle based on the original R package for genetic data analysis (gap); it currently consists of gap, pathmix and pointer (alpha).

    License:
    Programs included in this package by Jing hua Zhao are released under the GPL. Specific requirements may apply to programs written by other authors.

    geneland: http://cran.r-project.org/bin/window ... lease/Geneland_0.7.zip
    http://www.inapg.inra.fr/ens_rech/mathinfo/p ... eneland.html#Introduction_
    Introduction

    The main purpose of Geneland is to process geo-referenced individual multilocus genetic data and to detect population structure, i.e. sub-populations. Although the term population often refers to genetic structure only, it is often realistic to assume that populations are spatially organised. It therefore makes sense not only to estimate the population membership of each individual in a dataset but also to try to delineate the spatial domain of each population. To this end, Geneland uses both spatial and genetic information to estimate the number of populations in a dataset and to delineate their spatial organisation.

    Author's homepage: http://www.inapg.inra.fr/ens_rech/ma ... l/guillot/welcome.html
    A collection of tools for spatial population genetics, very useful for modelling the subdivision of populations.

  • 2005-02-27

    Arrays

    > # dim: retrieve or set the dimensions of an object
    > x=1:12;dim(x)=c(3,4)
    > print(x)
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    [3,]    3    6    9   12
    > print(x[3,]) # third row
    [1]  3  6  9 12
    > x[,] # the whole array
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    [3,]    3    6    9   12
    >
    > # array: create or test arrays
    > y=array(1:20,dim=c(4,5))
    > print(y)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    5    9   13   17
    [2,]    2    6   10   14   18
    [3,]    3    7   11   15   19
    [4,]    4    8   12   16   20
    > z=array(c(1:3,3:1),dim=c(3,2))
    > print(z)
         [,1] [,2]
    [1,]    1    3
    [2,]    2    2
    [3,]    3    1
    >
    > # extract elements y[1,3], y[2,2], y[3,1] using z as an index matrix
    > print(y[z])
    [1] 9 6 3