-
2005-05-12
Installing R packages on Ubuntu
Usage: R CMD INSTALL [options] pkgs

Install the add-on packages specified by pkgs. The elements of pkgs can
be relative or absolute paths to directories with the package (bundle)
sources, or to gzipped package 'tar' archives. The library tree to
install to can be specified via '--library'. By default, packages are
installed in the library tree rooted at the first directory given in the
environment variable R_LIBS if this is set and non-null, and into the
default R library tree (/usr/lib/R/library) otherwise.

Options:
  -h, --help            print short help message and exit
  -v, --version         print INSTALL version info and exit
      --configure-args=ARGS
                        set arguments for the package's configure script
                        (if any)
      --configure-vars=VARS
                        set variables for the configure script (if any)
  -c, --clean           remove all files created during installation
  -s, --save[=ARGS]     save the package source as an image file, and
                        arrange for this file to be loaded when the
                        package is attached; if given, ARGS are passed
                        to R when creating the save image
      --no-save         do not save the package source as an image file
      --lazy            use lazy loading
      --no-lazy         do not use lazy loading
      --lazy-data       use lazy loading for data
      --no-lazy-data    do not use lazy loading for data (current default)
  -d, --debug           turn on shell and build-help debugging
  -l, --library=LIB     install packages to library tree LIB
      --no-configure    do not use the package's configure script
      --no-docs         do not build and install documentation
      --with-package-versions
                        allow for multiple versions of the same package
      --use-zip-data    collect data files in zip archive
      --use-zip-help    collect help and examples into zip archives
      --use-zip         combine '--use-zip-data' and '--use-zip-help'
      --fake            do minimal install for testing purposes
      --no-lock         install on top of any existing installation
                        without using a lock directory
      --build           build binary tarball(s) of the installed package(s)

Report bugs to.
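The default-library rule in the help text can be sketched in R. This is only an illustration: default_install_lib is a hypothetical helper, not part of R, and it simply mirrors the rule as quoted above (first entry of R_LIBS when set, /usr/lib/R/library otherwise). It does not query what a given R installation would actually do.

```r
# Hypothetical helper: which library tree would be used by default,
# following the rule stated in the help text above.
default_install_lib <- function(r_libs = Sys.getenv("R_LIBS"),
                                fallback = "/usr/lib/R/library") {
  if (nzchar(r_libs)) {
    # R_LIBS is a list of directories separated by the platform path
    # separator (":" on Unix); the first entry is the install target
    strsplit(r_libs, .Platform$path.sep, fixed = TRUE)[[1]][1]
  } else {
    # R_LIBS unset or empty: fall back to the default tree
    fallback
  }
}

default_install_lib()    # result depends on the current environment
default_install_lib("")  # falls back to "/usr/lib/R/library"
```

Passing '--library' on the command line overrides this default in either case.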
-
2005-03-26
Modifying the Vim R script to call konsole instead of xterm
File location:
/usr/share/vim/addons/ftplugin/r.vim
The default xterm does not support my locale, although R itself recognizes it. I use zh_CN.GBK, so I modified the r.vim script as follows:
" ftplugin for R files
"
" Author: Iago Mosqueira
" Author: Johannes Ranke
" Author: Fernando Henrique Ferraz Pereira da Rosa
" Maintainer: Johannes Ranke
" Last Change: 2006 Feb 28
" SVN: $Id: r.vim 37 2006-02-28 22:36:51Z ranke $
"
" Code written in vim is sent to R through a perl pipe
" [funnel.pl, by Larry Clapp], as individual lines,
" blocks, or the whole file.
" Press to open a new xterm with a new R interpreter listening
" to its standard input (you can type R commands into the xterm)
" as well as to code pasted from within vim.
"
" After selecting a visual block, 'r' sends it to the R interpreter
"
" In insert mode,sends the active line to R and moves to the next
" line (write and process mode).
"
" Maps:
"Start a listening R interpreter in new xterm
"Start a listening R-devel interpreter in new xterm
"Start a listening R --vanilla interpreter in new xterm
"Run current file
"Run line under cursor
" r Run visual block
"Write and process
" Only do this when not yet done for this buffer
if exists("b:did_ftplugin")
finish
endif
" Don't load another plugin for this buffer
let b:did_ftplugin = 1
"disable backup for .r-pipe
setl backupskip=.*pipe
"set r-friendly tabbing
set expandtab
set tabstop=4
set shiftwidth=4
"Start a listening R interpreter in new xterm
"noremap:!xterm -T 'R' -e funnel.pl ~/.r-pipe "R && echo -e 'Interpreter has finished. Exiting. Goodbye.\n'"&
"here we call a konsole terminal because xterm cannot support my locale: zh_CN.GBK
noremap:!konsole -T 'R' -e funnel.pl ~/.r-pipe "R && echo -e 'Interpreter has finished. Exiting. Goodbye.\n'"&
"Start a listening R-devel interpreter in new xterm
"noremap:!xterm -T 'R' -e funnel.pl ~/.r-pipe "R-devel && echo 'Interpreter has finished. Exiting. Goodbye.'"&
"here we call a konsole terminal because xterm cannot support my locale: zh_CN.GBK
noremap:!konsole -T 'R' -e funnel.pl ~/.r-pipe "R-devel && echo 'Interpreter has finished. Exiting. Goodbye.'"&
"Start a listening R --vanilla interpreter in new xterm
"noremap:!xterm -T 'R' -e funnel.pl ~/.r-pipe "R --vanilla && echo 'Interpreter has finished. Exiting. Goodbye.'"&
"here we call a konsole terminal because xterm cannot support my locale: zh_CN.GBK
noremap:!konsole -T 'R' -e funnel.pl ~/.r-pipe "R --vanilla && echo 'Interpreter has finished. Exiting. Goodbye.'"&
"send line under cursor to R
noremap:execute line(".") 'w >> ~/.r-pipe'
inoremap:execute line(".") 'w >> ~/.r-pipe'
"send visual selected block to R
vnoremap r :w >> ~/.r-pipe
"write and process mode (somehow mapping does not work)
inoremap:execute line(".") 'w >> ~/.r-pipe' o
"send current file to R
noremap:execute '1 ,' line("$") 'w >> ~/.r-pipe'
-
2005-03-09
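Testing for normality in SAS, SPSS and R
In SAS, this can be done with PROC UNIVARIATE.
Syntax: proc univariate data=luan.norm1 normal;
If the dataset has fewer than 2000 observations, use the W:Normal test; otherwise use the D:Normal test.
If the Pr < W value reported for W:Normal is below 0.05, the data do not follow a normal distribution. Likewise, if the Pr > D value reported for D:Normal is below 0.05, the data do not follow a normal distribution.
In R, shapiro.test(x) tests whether the data follow a normal distribution (for datasets with fewer than 2000 observations):
shapiro.test(rnorm(100, mean = 5, sd = 3))
shapiro.test(runif(100, min = 2, max = 4))
In SPSS, use the menu item Analyze > Nonparametric Tests > 1-Sample K-S.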
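The SAS rule of thumb above (W test for small samples, D test otherwise) could be mimicked in R roughly as follows. This is a sketch under assumptions: normality_p is a hypothetical helper, the 2000 cutoff is taken from the text, and ks.test with the mean and standard deviation estimated from the same data is only an approximation (its p-values are then conservative), not a reproduction of what SAS computes.

```r
# Hypothetical helper: pick a normality test by sample size and return the p-value
normality_p <- function(x, cutoff = 2000) {
  x <- x[!is.na(x)]
  if (length(x) < cutoff) {
    # Shapiro-Wilk test (the "W" statistic)
    shapiro.test(x)$p.value
  } else {
    # Kolmogorov-Smirnov test (the "D" statistic) against a normal
    # distribution with parameters estimated from the data (approximation)
    ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value
  }
}

set.seed(1)
p_small <- normality_p(rnorm(100, mean = 5, sd = 3))   # uses shapiro.test
p_large <- normality_p(rnorm(3000, mean = 5, sd = 3))  # uses ks.test
```

A p-value below 0.05 is read the same way as in the SAS output: the data are taken not to follow a normal distribution.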
-
2005-02-27
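Normality test
In SAS, this can be done with PROC UNIVARIATE.
Syntax: proc univariate data=luan.norm1 normal;
If the dataset has fewer than 2000 observations, use the W:Normal test; otherwise use the D:Normal test.
If the Pr < W value reported for W:Normal is below 0.05, the data do not follow a normal distribution. Likewise, if the Pr > D value reported for D:Normal is below 0.05, the data do not follow a normal distribution.
In R, shapiro.test(x) tests whether the data follow a normal distribution (for datasets with fewer than 2000 observations):
shapiro.test(rnorm(100, mean = 5, sd = 3))
shapiro.test(runif(100, min = 2, max = 4))
-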
2005-02-27
Introduction to Geneland
Geneland : an R package for landscape genetics.
Geneland is a free software distributed as an add-on to the free statistical
software R. It is currently available for Linux and Windows and Mac-OS.
Introduction
Features
Statistical model
Estimation algorithm
Example
Haploid organisms
Installation
Getting started
Reference manual
Known bugs
Changes log
Credit
References
Mailing list
Geneland hot-line
Introduction
The main purpose of Geneland is to process geo-referenced individual
multilocus genetic data and to detect population structure, i.e.
sub-populations. Although the term population often refers to a genetic
structure only, it is often realistic to assume that populations are spatially
organised. Therefore it makes sense not only to estimate the population
membership of each individual of a dataset but also to try to delineate
the spatial domain of each such population. Toward this aim, Geneland makes
use of both spatial and genetic information to estimate the number of
populations in a dataset and delineate their spatial organisation.
Features
Geneland
- estimates the number of populations present in the dataset
- produces maps giving the population membership of each geographical pixel,
either as probabilities or as a population label
- produces files giving the population membership of each individual
- computes pair-wise Fst for all pairs of inferred populations
Statistical model
Geneland is based on four main assumptions :
i) The number of populations is unknown and all values between 1 and an
upper bound (which has to be set by the user) are considered equally likely.
ii) Sub-populations are spread over areas given by the union of some
polygons in the spatial domain. In mathematical terms, we assume a hidden
colored Poisson-Voronoi tessellation such as the ones given below :
iii) Hardy-Weinberg equilibrium is assumed within each population
iv) Allele frequencies in each population are unknown and treated as random
variables following either the so-called Dirichlet model or the F-model
(Falush et al. [2003] )
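Assumption ii) can be pictured with a small simulation. The following is only a sketch of a colored Poisson-Voronoi tessellation on the unit square, not Geneland's own code: simulate_tessellation and its arguments (rate, n_pop, grid_size) are hypothetical names and values chosen for illustration.

```r
# Sketch: colored Poisson-Voronoi tessellation on the unit square.
# All names and defaults are illustrative assumptions.
simulate_tessellation <- function(rate = 10, n_pop = 3, grid_size = 50) {
  # Poisson number of nuclei (at least one), placed uniformly at random
  n_nuclei <- max(1, rpois(1, rate))
  nuclei <- cbind(x = runif(n_nuclei), y = runif(n_nuclei))
  # each nucleus receives a "color", i.e. a population label
  color <- sample.int(n_pop, n_nuclei, replace = TRUE)
  # centres of a regular grid of pixels
  centres <- (seq_len(grid_size) - 0.5) / grid_size
  pixels <- expand.grid(x = centres, y = centres)
  # squared distance from every pixel to every nucleus
  d2 <- outer(pixels$x, nuclei[, "x"], "-")^2 +
        outer(pixels$y, nuclei[, "y"], "-")^2
  # a pixel belongs to the Voronoi cell of its nearest nucleus
  nearest <- max.col(-d2, ties.method = "first")
  matrix(color[nearest], grid_size, grid_size)
}

set.seed(42)
pop_map <- simulate_tessellation()
```

Calling image(pop_map) displays the resulting map; the area of each population is the union of the polygons sharing its color.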
Estimation algorithm
All unknown quantities in the model are treated as random variables, namely
:
- number of populations
- number, locations and population memberships of polygons coding the
spatial organisation of populations
- allele frequencies (in the present time populations and also in the
ancestral population in the F-model)
- drift parameters (in the F-model)
Inference is carried out through an MCMC algorithm. All parameters are
considered as unknown within a Bayesian model, and averages over samples from
their posterior distribution make it possible to plot maps of the posterior
probability of population membership like these :
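The averaging step can be illustrated with made-up numbers. In this sketch the matrix labels stands in for MCMC output (one row per saved iteration, one column per pixel); it is randomly generated here and does not come from Geneland, and all variable names are assumptions.

```r
# Illustrative only: fake "MCMC output" holding the sampled population
# label of each pixel at each saved iteration
set.seed(1)
n_iter <- 200
n_pixel <- 6
n_pop <- 2
labels <- matrix(sample.int(n_pop, n_iter * n_pixel, replace = TRUE),
                 nrow = n_iter, ncol = n_pixel)

# posterior probability of membership: for each pixel, the fraction of
# iterations in which it was assigned to population k
post_prob <- sapply(seq_len(n_pop), function(k) colMeans(labels == k))

# modal population of each pixel
modal_pop <- max.col(post_prob, ties.method = "first")
```

Each row of post_prob sums to one; mapping one of its columns over the pixel grid gives a probability map, and modal_pop gives a population-label map.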
Example
Below is an example of output of Geneland on a real data set of 88
wolverines sampled at 10 micro-satellite loci
(data kindly provided by Lisette Waits from the University of Idaho).
Geneland provides estimates of the number of populations at Hardy-Weinberg
equilibrium....
and a map of the areas of such populations :
Haploid organisms
The model currently implemented in Geneland makes computations for diploid
organisms.
Extension to haploid organisms is scheduled for a future version.
If you want to use the present version of Geneland with haploid organisms,
you first need to diploidise your haploid data. Here is a link to some R
functions written by Silke Werth to handle haploid genotypes.
Two warnings:
- The diploidised dataset or the original haploid dataset would give
exactly similar results under maximum likelihood estimators.
But with Bayesian estimators the diploidised dataset gives a trade-off
between the haploid Bayesian estimator and the haploid maximum likelihood
estimator. As they are both well behaved estimators, any average of them
makes sense.
- In addition, the diploidised dataset is formally equivalent to using 2*n
individuals instead of n, hence, in a Bayesian setting it leads to
underestimating the true uncertainty about the parameters.
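One way to read "diploidise" is to code every haploid individual as a homozygote at each locus. The sketch below does that under two assumptions that are not stated in the text: genotypes are stored with one row per individual, and a diploid genotype matrix uses two adjacent columns per locus. The function diploidise is a hypothetical helper and is not one of Silke Werth's functions.

```r
# Hypothetical helper: turn an (individuals x loci) haploid matrix into an
# (individuals x 2*loci) matrix in which both allele copies are identical
diploidise <- function(haploid) {
  haploid <- as.matrix(haploid)
  # repeat every locus column twice, keeping the two copies adjacent
  diploid <- haploid[, rep(seq_len(ncol(haploid)), each = 2), drop = FALSE]
  colnames(diploid) <- NULL
  diploid
}

hap <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)  # 3 individuals, 2 loci
dip <- diploidise(hap)                         # 3 individuals, 4 columns
```

Every allele now appears twice, which is why the second warning compares the diploidised dataset to one with 2*n individuals.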
Installation
- In order to use Geneland, you first need to have R installed on your
computer, see the R homepage.
Note that compatibility of the current version of Geneland is checked with
the current version of R only.
- Launch R
- Type install.packages("Geneland") in the R prompt
- Answer yes to Delete downloaded files (y/N)?
Getting started
- Launch R
- Load Geneland with the command library(Geneland)
- Launch the on-line help of R with the command help.start()
- Poke around the help of Geneland with help(Geneland).
A complete sequence of example commands is given in the help page of
function mcmcFmodel, which you can access by typing help(mcmcFmodel).
Known bugs
In Geneland 0.5
- Under windows: extra carriage return in some output files after the tenth
column if the number of populations is allowed to be larger than 10. Files
seem to be OK, just a bit messy.
- On Mac OS X 10.4: compilation error at installation
- Under windows: error message along the line of "the instruction at
"0x5ad71531" referenced memory at
"0x00000014". The memory could not be "read". Click OK to terminate the
program." from mcmcFmodel.
In Geneland 0.6
- Function simFmodel returns a list containing element c instead of
color.nuclei
- call of X11()
- documentation refers to path.data which no longer exists
- array ptemp not passed as argument in subroutine rpriof in Fortran
- Documentation of PostProcessChain() wrongly explains the storage format of
images (row wise instead of column wise)
- PostProcessChain transmits coordinates but should transmit its transpose.
Therefore, the function PostProcessChain() did not compute the limits of
the spatial domain correctly.
It could result in wrong maps with PosteriorMode(). PostProcessChain()
worked correctly if the range of the x values of the individuals includes
the range of the y values (or vice-versa). If your data complies with this
condition, any map computed with a previous version of Geneland should be OK.
Otherwise it should have resulted in very fancy maps (displaying bundles of
straight lines; the more it departs from this condition, the fancier).
Changes log
Changes from version 0.4 to version 0.5:
- In Fortran subroutine mcmc (called by mcmcFmodel): true coordinates are
written to an ASCII file named hidden.coord.txt
- Input data passed to functions as R objects instead of through a path to
ASCII files
Changes from version 0.5 to version 0.6:
- path.mcmc not written any longer in parameter file parameter.txt (in
order to avoid issues with path containing spaces)
- All function of Geneland now first transform input data (coordinates,
genotypes and allele.numbers) into matrices (to avoid troubles with data
frames)
- warning message of setplot function avoided by replacing last instruction
return (xlim, ylim, oldpin, newpin) by list(xlim, ylim, oldpin, newpin)
- carriage return after the tenth column in files proba.pop.membership.txt
and proba.pop.membership.perm.txt removed. Output is now written correctly
with up to a thousand populations (though this is not recommended).
- path.mcmc now set as paste(tempdir(),"/",sep="") in the example of
mcmcFmodel
Changes from version 0.6 to version 0.7:
-- NO SYNTACTIC CHANGE --
* Function simFmodel now returns a list containing element color.nuclei
* Calls to X11() in functions simFmodel, Plot* and PosteriorMode replaced by
get(getOption("device"))()
* Documentation of PostProcessChain no longer refers to
path.data (which no longer exists since version 0.6)
* Array ptemp now passed as argument in subroutines
rpriof and rpriorfa in Fortran
* PostProcessChain passes matrix t(coordinates) instead of
coordinates
* Functions PlotTesselation and PosteriorMode use coordinates directly
instead of the useless matrix s
* Documentation of PostProcessChain() now explains correctly the
storage format of images (column wise instead of row wise).
* Example code in mcmcFmodel implemented with 5 loci, 10 alleles/locus
on a rectangular domain, on a longer MCMC run
* Character strings for path to the various files now declared in
Fortran as character*256 instead of character*200
* Declaration of useless matrix s removed from PostProcessChain
* Additional functionality in PostProcessChain:
now also computes posterior probabilities of population membership
for individuals and writes them in a file named
"proba.pop.membership.indiv.txt"
The modal population for each individual is written in a file named
"modal.pop.indiv.txt"
Credit
People who helped improve Geneland with stimulating comments include :
- Annie Bouvier
- Aurélie Coulon
- Arnaud Estoup
- Frédéric Mortier
Bibliography
On the implementation of mixture models in population genetics:
- J.K. Pritchard, M. Stephens and P. Donnelly,
Inference of population structure using multilocus genotype data,
Genetics, pp 945-959 vol. 155, 2000
- Falush D., M. Stephens and J.K. Pritchard,
Inference of population structure using multilocus genotype data:
Linked loci and correlated allele frequencies, Genetics, pp 1567-1587,
vol 164, 2003
On the implementation of variable dimension MCMC algorithms in population
genetics:
- Corander, J.C., Waldmann, P. and Sillanpaa, M.J.,
Bayesian analysis of genetic differentiation between populations,
Genetics, 2003, 163, 367-374
- Corander, J.C., P. Waldmann, P. Martinen and M.J. Sillanpaa,
BAPS2: Enhanced possibilities for the analysis of genetic population
structure,
Bioinformatics, vol. 20, number 15, 2004
On the use of Voronoi tessellations in population genetics :
- Dupanloup, I., Schneider, S. and Excoffier, L.,
A simulated annealing approach to define genetic structure of populations,
Molecular Ecology, 2002, 11, 2571-2581.
References on this model
- G. Guillot, F. Mortier, A. Estoup. Geneland: a program for landscape
genetics. To appear in Molecular Ecology Notes.
- G. Guillot, A. Estoup, F. Mortier, J.F. Cosson. A spatial statistical
model for landscape genetics. To appear in Genetics.
Mailing List
If you want to be informed of changes in Geneland and progress on related
works, please let me know by email
(guillot[at]inapg.inra.fr). You will be added to the mailing list.
Geneland hot-line
Given that:
- most people in ecology are not familiar with R
- Geneland is primarily intended for Linux users, while most users are
actually Windows users
- the Geneland on-line help does not give much detail
- there are probably a few bugs in the present version
I'll try to answer all user inquiries about Geneland (as long as I can).
Feel free to report any bugs, criticism and more generally any good or bad
experience with Geneland (guillot[at]inapg.inra.fr). Try to give as much
detail as you can; in particular, try to give the exact sequence of R commands
involved in the problem.
For general questions about R, the R FAQ and the R mailing lists are the
right places to ask.
-
2005-02-27
R bioinformatics and genetics packages
Tonight I found an R bioinformatics site, http://www.bioconductor.org/, which provides a complete set of bioinformatics tools. I also found two packages for handling population genetics data:
1. genetics: Population Genetics: http://cran.r-project.org/src/cont ... iptions/genetics.html
Classes and methods for handling genetic data. Includes classes to represent genotypes and haplotypes at single markers up to multiple markers on multiple chromosomes. Functions include allele frequencies, flagging homo/heterozygotes, flagging carriers of certain alleles, estimating and testing for Hardy-Weinberg disequilibrium, estimating and testing for linkage disequilibrium.
2. rmetasim: An individual-based population genetic simulation environment
http://cran.r-project.org/src/cont ... iptions/rmetasim.html
http://linum.cofc.edu/software.html
An interface between R and the metasim simulation engine. Facilitates the use of the metasim engine to build and run individual based population genetics simulations. The simulation environment is documented in: Allan Strand. Metasim 1.0: an individual-based environment for simulating population genetics of complex population dynamics. Mol. Ecol. Notes, 2:373-376, 2002. (Please contact Allan Strand with comments, bug reports, etc).
-
2005-02-27
A permutation program for testing the significance of multivariate allometric coefficients
# Permutation test: angle between the multivariate allometric vectors of two groups
rm(list = ls(all = TRUE))
replicates <- 5000        # number of random reassignments
vectorlength1 <- 49       # size of group 1; the remaining rows form group 2
data1 <- read.table("e:/rstudy/gdfj.txt", header = FALSE)
vectorlength <- nrow(data1)   # total number of individuals
vectors <- ncol(data1)        # number of measured traits
angles <- numeric(replicates)
macs1 <- matrix(0, replicates, vectors)
macs2 <- matrix(0, replicates, vectors)
for (k in 1:replicates) {
  # randomly reassign the individuals to the two groups
  xx <- sample(1:vectorlength, vectorlength1, replace = FALSE)
  newdata1 <- data1[xx, ]
  newdata2 <- data1[-xx, ]
  # first eigenvector of each covariance matrix (first principal component);
  # the obsolete EISPACK argument of eigen() is no longer passed
  eigenvector1 <- eigen(cov(newdata1))$vectors[, 1]
  eigenvector2 <- eigen(cov(newdata2))$vectors[, 1]
  # multivariate allometric coefficients: loadings scaled to a mean of 1
  mac1 <- eigenvector1 / mean(eigenvector1)
  mac2 <- eigenvector2 / mean(eigenvector2)
  # angle in degrees between the two coefficient vectors
  xxxx <- sum(mac1 * mac2) / (sqrt(sum(mac1^2)) * sqrt(sum(mac2^2)))
  angles[k] <- acos(min(1, xxxx)) * 180 / pi   # guard against rounding above 1
  macs1[k, ] <- mac1
  macs2[k, ] <- mac2
}
write.table(angles, file = "angles.txt")
write.table(macs1, file = "macs1.txt")
write.table(macs2, file = "macs2.txt")
hist(angles, probability = TRUE, main = "Permutation frequency distribution", xlab = "Angle", ylab = "Frequency")
print("Game is over!!!!")
The simulated θ1 is shown in the figure.
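To turn the permuted angles into a significance level, the observed angle between the two real groups is compared with the permutation distribution. The sketch below is self-contained and uses simulated data rather than gdfj.txt; allometry_angle is a hypothetical helper, and treating the first 49 rows as group 1 is an assumption about how the observed groups are laid out.

```r
# Hypothetical helper: angle (degrees) between the allometric vectors of two groups
allometry_angle <- function(a, b) {
  v1 <- eigen(cov(a))$vectors[, 1]
  v2 <- eigen(cov(b))$vectors[, 1]
  m1 <- v1 / mean(v1)   # scale loadings to a mean of 1
  m2 <- v2 / mean(v2)
  cosine <- sum(m1 * m2) / sqrt(sum(m1^2) * sum(m2^2))
  acos(max(-1, min(1, cosine))) * 180 / pi
}

# simulated data: 99 individuals, 3 traits driven by a common size factor
set.seed(123)
size <- rnorm(99)
traits <- sapply(c(0.8, 1.0, 1.2),
                 function(slope) slope * size + rnorm(99, sd = 0.2))

group1 <- 1:49   # assumed layout: first 49 rows are group 1
observed <- allometry_angle(traits[group1, ], traits[-group1, ])

# permutation distribution of the angle under random group membership
perm_angles <- replicate(999, {
  idx <- sample(nrow(traits), length(group1))
  allometry_angle(traits[idx, ], traits[-idx, ])
})

# one-sided p-value: share of permuted angles at least as large as the observed one
p_value <- (sum(perm_angles >= observed) + 1) / (length(perm_angles) + 1)
```

Adding 1 to numerator and denominator counts the observed arrangement as one of the permutations, so the p-value can never be exactly zero.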
-
2005-02-27
apropos function
This function finds the exact name of an object when you only remember part of it. It is very useful.
For example:
> apropos("sqr")
[1] "sqrt"
-
2005-02-27
Two new R packages: gap and geneland
gap package: http://www.ucl.ac.uk/~rmjdjhz/software/gap_1.0-3.zip
Author's homepage: http://www.hgmp.mrc.ac.uk/~jzhao/
This is a package bundle based on the original R package for genetic data analysis (gap); it currently consists of gap, pathmix and pointer (alpha).
License:
Programs included in this package by Jing hua Zhao will be under GPL. Specific requirements may be possible for programs written by other authors.
geneland: http://cran.r-project.org/bin/window ... lease/Geneland_0.7.zip
http://www.inapg.inra.fr/ens_rech/mathinfo/p ... eneland.html#Introduction_
Introduction: The main purpose of Geneland is to process geo-referenced individual multilocus genetic data and to detect population structure, i.e. sub-populations. Although the term population often refers to a genetic structure only, it is often realistic to assume that populations are spatially organised. Therefore it makes sense not only to estimate the population membership of each individual of a dataset but also to try to delineate the spatial domain of each such population. Toward this aim, Geneland makes use of both spatial and genetic information to estimate the number of populations in a dataset and delineate their spatial organisation.
Author's homepage: http://www.inapg.inra.fr/ens_rech/ma ... l/guillot/welcome.html
A collection of tools for spatial population genetics. It is very useful for modelling the subdivision of populations.
-
2005-02-27
Arrays
> # dim: retrieve or set the dimension of an object
> x=1:12;dim(x)=c(3,4)
> print(x)
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> print(x[3,]) # the third row
[1]  3  6  9 12
> x[,] # the whole array
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
>
> # array: create or test arrays
> y=array(1:20,dim=c(4,5))
> print(y)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20
> z=array(c(1:3,3:1),dim=c(3,2))
> print(z)
     [,1] [,2]
[1,]    1    3
[2,]    2    2
[3,]    3    1
>
> # extract elements y[1,3],y[2,2],y[3,1]
> print(y[z])
[1] 9 6 3
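The same index-matrix trick also works for assignment: each row of a two-column matrix is read as a (row, column) pair. A short sketch, reusing y from above; the column names on z and the variable names picked and zeroed are only there for readability.

```r
y <- array(1:20, dim = c(4, 5))

# each row of z is one (row, column) position in y
z <- cbind(row = c(1, 2, 3), col = c(3, 2, 1))

picked <- y[z]   # same elements as y[1,3], y[2,2], y[3,1]

# assignment through the index matrix changes exactly those three cells
y[z] <- 0L
zeroed <- which(y == 0, arr.ind = TRUE)   # positions that were overwritten
```

This avoids writing a loop when a scattered set of cells has to be read or overwritten.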







