Tuesday, June 28, 2011

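Linux: compressing and decompressing tar/gz/bz/gz2/bz2...

Linux extraction: the tar command

The tar command: tar [-cxtzjvfpPN] files and directories ... Options:
-c : create an archive (c for "create").
-x : extract an archive.
-t : list the files inside a tar file.
Note that only one of c/x/t may be given in a single command; they cannot be combined.
-z : filter the archive through gzip, i.e. compress or decompress it with gzip.
-j : filter the archive through bzip2, i.e. compress or decompress it with bzip2.
-v : list files as they are processed. Commonly used, but not recommended for background jobs.
-f : the archive file name. The name must come immediately after f, with no other option in between.
   For example, "tar -zcvfP tfile sfile" is wrong; write
   "tar -zcvPf tfile sfile" instead.
-p : preserve the original file permissions (they do not change according to the user running tar).
-P : keep absolute paths in member names (do not strip the leading /).
-N : only files newer than the date that follows (yyyy/mm/dd) are added to the new archive.
--exclude FILE : do not include FILE in the archive.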
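Example 1: Pack everything under /etc into /tmp/etc.tar
[root@linux ~]# tar -cvf /tmp/etc.tar /etc <== archive only, no compression
[root@linux ~]# tar -zcvf /tmp/etc.tar.gz /etc <== archive, then compress with gzip
[root@linux ~]# tar -jcvf /tmp/etc.tar.bz2 /etc <== archive, then compress with bzip2
# Note that the file name after f is your own choice; by convention .tar marks a tar archive.
# With the z option, use .tar.gz or .tgz to indicate a gzip-compressed tar file.
# With the j option, use .tar.bz2 as the extension.
# Running the commands above prints a warning:
# "tar: Removing leading `/' from member names". This relates to how absolute paths are handled.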

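Example 2: List the files inside /tmp/etc.tar.gz created above
[root@linux ~]# tar -ztvf /tmp/etc.tar.gz
# Because the archive was compressed with gzip, listing the files in the tar file
# requires the z option as well. This is important.

Example 3: Extract /tmp/etc.tar.gz under /usr/local/src
[root@linux ~]# cd /usr/local/src
[root@linux src]# tar -zxvf /tmp/etc.tar.gz
# By default an archive can be extracted anywhere. In this example
# I first change the working directory to /usr/local/src and then extract /tmp/etc.tar.gz,
# so the extracted directory is /usr/local/src/etc. If you go into /usr/local/src/etc,
# you may find that the file attributes there differ from those in /etc/.

Example 4: Under /tmp, extract only etc/passwd from /tmp/etc.tar.gz
[root@linux ~]# cd /tmp
[root@linux tmp]# tar -zxvf /tmp/etc.tar.gz etc/passwd
# Use tar -ztvf to look up the file names inside the tar file; to extract a single file,
# pass its name as shown. Note that the leading / has been removed from the member names in etc.tar.gz.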

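Example 5: Back up everything in /etc/ and preserve its permissions
[root@linux ~]# tar -zcvpf /tmp/etc.tar.gz /etc
# The -p option is important, especially when you want to keep the original file attributes.
Example 6: In /home, back up only files newer than 2005/06/01
[root@linux ~]# tar -N "2005/06/01" -zcvf home.tar.gz /home
Example 7: Back up /home and /etc, but not /home/dmtsai
[root@linux ~]# tar --exclude /home/dmtsai -zcvf myfile.tar.gz /home/* /etc
Example 8: Pack /etc/ and unpack it directly under /tmp, without creating an archive file
[root@linux ~]# cd /tmp
[root@linux tmp]# tar -cvf - /etc | tar -xvf -
# This is somewhat like cp -r /etc /tmp, but it still has its uses.
# Note that the output file is - and the input file is also -, with a | between them.
# These stand for standard output, standard input, and a pipe, respectively.
# We will come back to this command and explain it further in the Bash shell chapter.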

source: http://www.21andy.com/blog/20060820/389.html

For *.gz2 files, use gunzip2 *.gz2.
For example, gunzip2 *.tar.gz2 produces a *.tar file,
which tar -vxf *.tar then unpacks.

.tar
Unpack: tar xvf FileName.tar
Pack: tar cvf FileName.tar DirName
.gz
Decompress (1): gunzip FileName.gz
Decompress (2): gzip -d FileName.gz
Compress: gzip FileName
.tar.gz and .tgz
Extract: tar zxvf FileName.tar.gz
Compress: tar zcvf FileName.tar.gz DirName
.bz2
Decompress (1): bzip2 -d FileName.bz2
Decompress (2): bunzip2 FileName.bz2
Compress: bzip2 -z FileName
.tar.bz2
Extract: tar jxvf FileName.tar.bz2
Compress: tar jcvf FileName.tar.bz2 DirName
.bz
Decompress (1): bzip2 -d FileName.bz
Decompress (2): bunzip2 FileName.bz
.tar.bz
Extract: tar jxvf FileName.tar.bz
.Z
Decompress: uncompress FileName.Z
Compress: compress FileName
.tar.Z
Extract: tar Zxvf FileName.tar.Z
Compress: tar Zcvf FileName.tar.Z DirName
.zip
Extract: unzip FileName.zip
Compress: zip -r FileName.zip DirName
.rar
Extract: rar x FileName.rar
Compress: rar a FileName.rar DirName

Download rar from http://www.rarsoft.com/download.htm
After unpacking, copy rar_static to /usr/bin (any other directory on $PATH works too):
[root@www2 tmp]# cp rar_static /usr/bin/rar

.lha
Extract: lha -e FileName.lha
Compress: lha -a FileName.lha FileName

[root@www2 tmp]# cp lha /usr/bin/

.rpm
Unpack: rpm2cpio FileName.rpm | cpio -div
.deb
Unpack: ar p FileName.deb data.tar.gz | tar zxf -
.tar .tgz .tar.gz .tar.Z .tar.bz .tar.bz2 .zip .cpio .rpm .deb .slp .arj .rar .ace .lha .lzh .lzx .lzs .arc .sda .sfx .lnx .zoo .cab .kar .cpt .pit .sit .sea
Extract: sEx x FileName.*
Compress: sEx a FileName.* FileName

Download sEx from http://sourceforge.net/projects/sex
[root@www2 tmp]# cp sEx /usr/bin/
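The gzip command

Reducing file size has two obvious benefits: it saves storage space, and it shortens transfer time when files are sent over a network. gzip is a command commonly used on Linux to compress and decompress files; it is both convenient and easy to use.
Syntax: gzip [options] file-to-compress-or-decompress
-c write the output to standard output and keep the original file.
-d decompress the compressed file.
-l for each compressed file, list the compressed size, uncompressed size, compression ratio, and uncompressed file name.
-r recurse into the given directory and compress (or decompress) every file in it.
-t test whether the compressed file is intact.
-v print the file name and compression ratio for every file compressed or decompressed.
-num set the compression speed with the number num: -1 or --fast is the fastest method (lowest compression ratio),
-9 or --best is the slowest method (highest compression ratio). The default is 6.
gzip *
% Compress every file in the current directory into a .gz file.

gzip -dv *
% Decompress every compressed file in the current directory and print detailed information.

gzip -l *
% Show detailed information for every file compressed in the first example, without decompressing.

gzip usr.tar
% Compress the tar backup file usr.tar; the compressed file then has the extension .tar.gz.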

source: http://hi.baidu.com/koomo007/blog/item/4904bb2642928c09918f9d02.html


tar -tzvf u2file.tar.gz
-rw-r--r-- user/user 45489156 2008-08-04 23:59:46 foder/access.log.20080804
-rw-r--r-- user/user 37469223 2008-08-05 23:59:46 foder/access.log.20080805

tar -zxvf u2file.tar.gz foder/access.log.20080805

tar -zxvf u2file.tar.gz --wildcards 'foder/access.log.*'

tar -xzvf u2file.tar.gz -C /new/dir/ foder/access.log.20080805    # -C sets the directory to extract into


Thursday, June 23, 2011

Scatterplot with marginal histograms

The scatterplot is one of the most ubiquitous and useful graphics. It's also very basic. One of its shortcomings is that it can hide important aspects of the marginal distributions of the two variables. To address this weakness, you can add a histogram of each margin to the plot. We demonstrate using the SF-36 MCS and PCS subscales in the HELP data set.

scatterhist = function(x, y, xlab="", ylab=""){
 # layout: panel 1 = scatterplot (bottom left), 2 = x histogram (top), 3 = y histogram (right)
 zones=matrix(c(2,0,1,3), ncol=2, byrow=TRUE)
 layout(zones, widths=c(4/5,1/5), heights=c(1/5,4/5))
 xhist = hist(x, plot=FALSE)
 yhist = hist(y, plot=FALSE)
 top = max(c(xhist$counts, yhist$counts))
 par(mar=c(3,3,1,1))
 plot(x,y)
 par(mar=c(0,3,1,1))
 barplot(xhist$counts, axes=FALSE, ylim=c(0, top), space=0)
 par(mar=c(3,0,1,1))
 barplot(yhist$counts, axes=FALSE, xlim=c(0, top), space=0, horiz=TRUE)
 par(oma=c(3,3,0,0))
 mtext(xlab, side=1, line=1, outer=TRUE, adj=0,     at=.8 * (mean(x) - min(x))/(max(x)-min(x)))
 mtext(ylab, side=2, line=1, outer=TRUE, adj=0,     at=(.8 * (mean(y) - min(y))/(max(y) - min(y))))
}

ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
with(ds, scatterhist(mcs, pcs, xlab="MCS", ylab="PCS"))

An Example of ANOVA using R

 In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are all equal, and therefore generalizes the t-test to more than two groups. The t-test tells us whether the difference between the means of two groups is "significant".



by EV Nordheim, MK Clayton & BS Yandell, November 11, 2003

In class we handed out "An Example of ANOVA". Below we redo the example using R.
There are three groups with seven observations per group. We denote group i values by yi:
> y1 = c(18.2, 20.1, 17.6, 16.8, 18.8, 19.7, 19.1)
> y2 = c(17.4, 18.7, 19.1, 16.4, 15.9, 18.4, 17.7)
> y3 = c(15.2, 18.8, 17.7, 16.5, 15.9, 17.1, 16.7)

Now we combine them into one long vector, with a second vector, group, identifying group membership.

> y = c(y1, y2, y3)
> n = rep(7, 3)
> n
[1] 7 7 7
> group = rep(1:3, n)
> group
[1] 1 1 1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 3 3 3

Here are summaries by group and for the combined data. First we show stem-leaf diagrams.

> tmp = tapply(y, group, stem)
The decimal point is at the |
16 | 8
17 | 6
18 | 28
19 | 17
20 | 1
The decimal point is at the |
15 | 9
16 | 4
17 | 47
18 | 47
19 | 1
The decimal point is at the |
15 | 29
16 | 57
17 | 17
18 | 8
> stem(y)
The decimal point is at the |
15 | 299
16 | 4578
17 | 14677
18 | 24788
19 | 117
20 | 1

Now we show summary statistics by group and overall. We locally define a temporary
function, tmpfn, to make this easier.

> tmpfn = function(x) c(sum = sum(x), mean = mean(x), var = var(x),n = length(x))
> tapply(y, group, tmpfn)
       sum       mean        var          n
130.300000  18.614286   1.358095   7.000000
       sum       mean        var          n
123.600000  17.657143   1.409524   7.000000
       sum       mean        var          n
117.900000  16.842857   1.392857   7.000000
> tmpfn(y)
       sum       mean        var          n
371.800000  17.704762   1.798476  21.000000

While we could show you how to use R to mimic the computation of SS by hand, it is
more natural to go directly to the ANOVA table. See Appendix 11 for other examples of the
use of R commands for ANOVA.

> data = data.frame(y = y, group = factor(group))
> fit = lm(y ~ group, data)
> anova(fit)
Analysis of Variance Table
Response: y
          Df Sum Sq Mean Sq F value  Pr(>F) 
group      2 11.007  5.5033  3.9683 0.03735 *
Residuals 18 24.963  1.3868                 
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
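
For readers who do want to see the sums of squares computed by hand, here is a minimal sketch using the same three groups. The variable names (ss_trt, ss_err and so on) are illustrative and do not come from the handout; the values it prints should agree with the anova(fit) table above.

```r
# Data from the example above
y1 <- c(18.2, 20.1, 17.6, 16.8, 18.8, 19.7, 19.1)
y2 <- c(17.4, 18.7, 19.1, 16.4, 15.9, 18.4, 17.7)
y3 <- c(15.2, 18.8, 17.7, 16.5, 15.9, 17.1, 16.7)
y <- c(y1, y2, y3)
group <- factor(rep(1:3, each = 7))

# Group means and sizes, and the grand mean
group_means <- tapply(y, group, mean)
group_n <- tapply(y, group, length)
grand_mean <- mean(y)

# Treatment (between-group) and error (within-group) sums of squares
ss_trt <- sum(group_n * (group_means - grand_mean)^2)
ss_err <- sum((y - group_means[group])^2)

# Degrees of freedom: groups - 1 and observations - groups
df_trt <- nlevels(group) - 1
df_err <- length(y) - nlevels(group)

# F statistic is the ratio of the two mean squares
f_value <- (ss_trt / df_trt) / (ss_err / df_err)
p_value <- pf(f_value, df_trt, df_err, lower.tail = FALSE)

round(c(SSTrt = ss_trt, SSErr = ss_err, F = f_value, p = p_value), 4)
```

Indexing group_means by the factor group expands the three means to one value per observation, which is what makes the within-group sum of squares a one-liner.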

The anova(fit) object can be used for other computations on the handout and in class.
For instance, the tabled F values can be found by the following. First we extract the treatment
and error degrees of freedom. Then we use qf to get the tabled F values.

> df = anova(fit)[, "Df"]
> names(df) = c("trt", "err")
> alpha = c(0.05, 0.01)
> qf(alpha, df["trt"], df["err"], lower.tail = FALSE)
[1] 3.554557 6.012905

A confidence interval on the pooled variance can be computed as well using the anova(fit)
object. First we get the residual sum of squares, SSErr, then we divide by the appropriate
chi-square tabled values.

> anova(fit)["Residuals", "Sum Sq"]
[1] 24.96286
> anova(fit)["Residuals", "Sum Sq"]/qchisq(c(0.025, 0.975), 18,lower.tail = FALSE)
[1] 0.7918086 3.0328790

Five statistical things I wished I had been taught 20 years ago

These are the pieces of hard won statistical knowledge I wish someone had taught me 20 years ago rather than my meandering, random walk approach.

1. Nonparametric statistics. These are statistical tests that make a bare minimum of assumptions about underlying distributions; in biology we are rarely confident that we know the underlying distribution, and hand-waving about the central limit theorem can only get you so far. Wherever possible you should use a nonparametric test. This is Mann-Whitney (or Wilcoxon if you prefer) for testing "medians" of two distributions (medians is in quotes because this is not quite true; they test something closely related to the median), Spearman's rho (rather than Pearson's r2) for correlation, and the Kruskal-Wallis test rather than ANOVA (though if I get this right, you can't do in Kruskal-Wallis the more sophisticated nested models you can do with ANOVA). Finally, don't forget the rather wonderful Kolmogorov-Smirnov test (I always think it sounds like really good vodka) of whether two sets of observations come from the same distribution. All of these methods share a basic theme of working on the rank of items in a distribution, not the actual level. So, if in doubt, do things on the rank of the metric rather than the metric itself.
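
As a rough illustration of the rank-based tests named above, the sketch below runs each of them on simulated data. The data, seed and variable names are assumptions made up for the example; only the functions themselves (wilcox.test, cor, kruskal.test, ks.test) are standard R.

```r
set.seed(1)                # arbitrary seed so the sketch is reproducible
x <- rexp(40)              # a skewed sample
y <- rexp(40) + 0.5        # same shape, shifted upwards
g <- factor(rep(c("a", "b", "c"), length.out = 40))

# Two-sample location test on ranks (Mann-Whitney / Wilcoxon rank-sum)
p_wilcox <- wilcox.test(x, y)$p.value

# Rank correlation: unchanged by any strictly increasing transformation of x
rho_raw <- cor(x, y, method = "spearman")
rho_log <- cor(log(x), y, method = "spearman")

# Rank-based alternative to one-way ANOVA
p_kruskal <- kruskal.test(x ~ g)$p.value

# Do the two samples come from the same distribution?
p_ks <- ks.test(x, y)$p.value

c(wilcoxon = p_wilcox, kruskal = p_kruskal, ks = p_ks)
c(rho_raw = rho_raw, rho_log = rho_log)
```

The two Spearman values are identical because taking logs does not change the ordering of x, which is the "work on the rank" theme in practice.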

2. R (or I guess S). R is a cranky, odd statistical language/system with a great scientific plotting package. It's a package written mainly by statisticians for statisticians, and is rather unforgiving the first time you use it. It is definitely worth persevering. It's basically a combination of Excel spreadsheets on steroids (with no data entry; an R data frame is really the same logical set as an Excel workbook, but able to handle millions of points, not thousands), a statistical methods compendium (it's usually the case that statistical methods are written first in R, and you can almost guarantee that there are no bugs in the major functions, unlike many other scenarios) and a graphical data exploration tool (in particular the lattice and ggplot packages). The syntax is inconsistent, the documentation sometimes wonderful, often awful, and the learning curve is like the face of the Eiger. But once you've met p.adjust(), xyplot() and apply(), you can never turn back.

3. The problem of multiple testing, and how to handle it, either with the Expected value or FDR, and the backstop of many a piece of bioinformatics: large-scale permutation. Large-scale permutation is sometimes frowned upon by the more maths/distribution purists, but often it is the only way to get a sensible sense of whether something is likely "by chance" (whatever the latter phrase means; it's a very open question) given the complex, heterogeneous data we have. Ten years ago perhaps the lack of large-scale compute resources meant this option was less open to people, but these days basically everyone should be working out how to appropriately permute the data to allow a good estimate of the "surprisingness" of an observation.
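
A small sketch of both ideas follows: an FDR correction with p.adjust(), and a simple permutation test for a difference in means. Everything here is simulated; the mix of 950 null and 50 non-null tests, the group sizes and the number of permutations are arbitrary assumptions chosen for illustration.

```r
set.seed(2)  # arbitrary seed for reproducibility

# 1000 hypothetical tests: 950 true nulls and 50 real effects
p_null <- runif(950)
p_alt <- rbeta(50, 0.1, 5)   # skewed towards small p-values
p_raw <- c(p_null, p_alt)

# Benjamini-Hochberg FDR adjustment versus the stricter Bonferroni correction
p_fdr <- p.adjust(p_raw, method = "BH")
p_bonf <- p.adjust(p_raw, method = "bonferroni")
c(raw = sum(p_raw < 0.05), fdr = sum(p_fdr < 0.05), bonferroni = sum(p_bonf < 0.05))

# Permutation test for a difference in means between two groups of 30
a <- rnorm(30, mean = 0)
b <- rnorm(30, mean = 0.8)
observed <- mean(b) - mean(a)
pooled <- c(a, b)
perm_diff <- replicate(5000, {
  shuffled <- sample(pooled)   # break any link between value and group label
  mean(shuffled[31:60]) - mean(shuffled[1:30])
})

# Two-sided p-value; the +1 counts the observed labelling as one permutation
perm_p <- (sum(abs(perm_diff) >= abs(observed)) + 1) / (length(perm_diff) + 1)
perm_p
```

How to shuffle is the real design decision: the permutation has to preserve whatever structure in the data is not under test, and a plain label shuffle like this one is only appropriate when observations are exchangeable under the null.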

4. The relationship between P-value, effect size and sample size. This needs to be drilled into everyone: we're far too trigger-happy quoting P-values, when we should often be quoting P-values and effect sizes. Once a P-value is significant, its higher significance is sort of meaningless (or rather, it confounds effect size things with sample size things, the latter often being about relative frequency). So, if something is significantly correlated/different, then you want to know how much of an effect this observation has. This is not just about GWAS-like statistics; in genomic biology we're all too happy about quoting some small P-value, not realising that with a million or so points, even very small deviations will often be significant. Quote your r2, rhos or proportion of variance explained...
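
To make the point concrete, here is a sketch with a million simulated points and a deliberately tiny true effect. The sample size, slope and seed are assumptions picked for the illustration.

```r
set.seed(3)                  # arbitrary seed for reproducibility
n <- 1e6
x <- rnorm(n)
y <- 0.01 * x + rnorm(n)     # the true effect is tiny by construction

test <- cor.test(x, y)
p_value <- test$p.value
r_squared <- unname(test$estimate)^2

# A very small p-value alongside a negligible share of variance explained
c(p_value = p_value, r_squared = r_squared)
```

The P-value alone would suggest a striking result; the r2 shows that x accounts for almost none of the variation in y.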

5. Linear models and PCA. There is often a tendency to jump to quite complex models (networks, or biologically inspired combinations) when our first instinct should be to crack out the well-established lm() (linear model) for prediction and princomp() (PCA) for dimensionality reduction. These are old-school techniques, and often if you want to talk about statistical fits one needs to make Gaussian assumptions about distributions, but most of the things we do could be done well in a linear model, and most of the correlation we look at could have been found with a PCA biplot. The fact that these are 1970s bits of statistics doesn't mean they don't work well.
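
A minimal sketch of both tools, using R's built-in mtcars data set as a stand-in (the data set and the choice of variables are assumptions for illustration, not part of the original post):

```r
# Linear model: predict fuel efficiency from weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)
r_squared <- summary(fit)$r.squared   # proportion of variance explained
r_squared

# PCA on standardised variables (cor = TRUE uses the correlation matrix)
pca <- princomp(mtcars[, c("mpg", "wt", "hp", "disp")], cor = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)

# Biplot of the first two components (opens a graphics device)
biplot(pca)
```

Setting cor = TRUE matters here because the four variables are on very different scales; without it the component with the largest raw variance would dominate.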

Sunday, June 19, 2011

Side-by-side histograms

In R, the lattice package provides a similarly direct approach.
library(lattice)
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
ds$gender = ifelse(ds$female==1, "female", "male")
histogram(~ cesd | gender, data=ds)

sources: http://sas-and-r.blogspot.com/2011/06/example-840-side-by-side-histograms.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+SASandR+%28SAS+and+R%29

Computing Odds Ratios in R

For two binary variables, x and y, each taking the values 0 and 1, the odds ratio is defined on the basis of the following four numbers:

            N00 = the number of data records with x = 0 and y = 0
            N01 = the number of data records with x = 0 and y = 1
            N10 = the number of data records with x = 1 and y = 0
            N11 = the number of data records with x = 1 and y = 1

Specifically, the odds ratio is given by the following expression:

OR = (N00 × N11) / (N01 × N10)

Similarly, confidence intervals for the odds ratio are easily constructed by appealing to the asymptotic normality of log OR, which has a limiting variance given by the sum of the reciprocals of these four numbers (so its standard error is the square root of that sum).  The R procedure oddsratioWald.proc, available from the companion website for Exploring Data, computes the odds ratio and the upper and lower confidence limits at a specified level alpha from these four values:

oddsratioWald.proc <- function(n00, n01, n10, n11, alpha = 0.05){
  #  Compute the odds ratio between two binary variables, x and y,
  #  as defined by the four numbers nij:
  #    n00 = number of cases where x = 0 and y = 0
  #    n01 = number of cases where x = 0 and y = 1
  #    n10 = number of cases where x = 1 and y = 0
  #    n11 = number of cases where x = 1 and y = 1
  OR <- (n00 * n11)/(n01 * n10)
  #  Compute the Wald confidence intervals:
  siglog <- sqrt((1/n00) + (1/n01) + (1/n10) + (1/n11))
  zalph <- qnorm(1 - alpha/2)
  logOR <- log(OR)
  loglo <- logOR - zalph * siglog
  loghi <- logOR + zalph * siglog
  ORlo <- exp(loglo)
  ORhi <- exp(loghi)
  oframe <- data.frame(LowerCI = ORlo, OR = OR, UpperCI = ORhi, alpha = alpha)
  oframe
}

Including “alpha = 0.05” in the parameter list fixes the default value for alpha at 0.05, which yields the 95% confidence intervals for the computed odds ratio, based on the Wald approximation described above.  An important practical point is that these intervals become infinitely wide if any of the four numbers Nij are equal to zero; also, note that in this case, the computed odds ratio is either zero or infinite.  Finally, it is worth noting that if the numbers Nij are large enough, the procedure just described can encounter numerical overflow problems (i.e., the products in either the numerator or the denominator become too large to be represented in machine arithmetic).  If this is a possibility, a better alternative is to regroup the computations as follows:

OR = (N00 / N01) × (N11 / N10)
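
In R specifically, counts returned by table() are integers, and an integer product above .Machine$integer.max becomes NA with a warning. The sketch below shows the naive and regrouped forms side by side; the four counts are hypothetical values chosen only to trigger the problem.

```r
# Hypothetical large counts, stored as integers as table() would return them
n00 <- 60000L; n01 <- 50000L
n10 <- 40000L; n11 <- 70000L

# n00 * n11 exceeds .Machine$integer.max (2147483647), so it becomes NA
or_naive <- suppressWarnings((n00 * n11) / (n01 * n10))

# Regrouped form: each ratio is computed in floating point first
or_regrouped <- (n00 / n01) * (n11 / n10)

# Equivalent computation on the log scale
or_log <- exp(log(n00) - log(n01) + log(n11) - log(n10))

c(naive = or_naive, regrouped = or_regrouped, log_scale = or_log)
```

Converting the counts with as.numeric() before multiplying is another way around the integer limit, though the regrouped form also keeps the intermediate values small.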

To use the routine just described, it is necessary to have the four numbers defined above, which form the basis for a two-by-two contingency table.  Because contingency tables are widely used in characterizing categorical data, these numbers are easily computed in R using the table command.  As a simple example, the following code reads the UCI mushroom dataset and generates the two-by-two contingency table for the EorP and GillSize attributes:

> mushrooms <- read.csv("mushroom.csv")
> table(mushrooms$EorP, mushrooms$GillSize)
       b    n
  e 3920  288
  p 1692 2224

(Note that the first line reads the csv file containing the mushroom data; for this command to work as shown, it is necessary for this file to be in the working directory.  Alternatively, you can change the working directory using the setwd command.) 

To facilitate the computation of odds ratios, the following preliminary procedure combines the table command with the oddsratioWald.proc procedure, allowing you to compute the odds ratio and its level-alpha confidence interval from the two-level variables directly:

TableOR.proc00 <- function(x,y,alpha=0.05){
  xtab <- table(x,y)
  n00 <- xtab[1,1]
  n01 <- xtab[1,2]
  n10 <- xtab[2,1]
  n11 <- xtab[2,2]
  oddsratioWald.proc(n00, n01, n10, n11, alpha)
}

The primary disadvantage of this procedure is that it doesn’t tell you which levels of the two variables are being characterized by the computed odds ratio.  In fact, this characterization describes the first level of each of these variables, and the following slight modification makes this fact explicit:

TableOR.proc <- function(x,y,alpha=0.05){
   xtab <- table(x,y)
   n00 <- xtab[1,1]
   n01 <- xtab[1,2]
   n10 <- xtab[2,1]
   n11 <- xtab[2,2]
   outList <- vector("list",2)
   outList[[1]] <- paste("Odds ratio between the level [",dimnames(xtab)[[1]][1],"] of the first variable and the level [",dimnames(xtab)[[2]][1],"] of the second variable:",sep=" ")
   outList[[2]] <- oddsratioWald.proc(n00,n01,n10,n11,alpha)
   outList
}

Specifically, I have used the fact that the dimension names of the 2x2 table xtab correspond to the levels of the variables x and y, and I have used the paste command to include these values in a text string displayed to the user.  (I have enclosed the levels in square brackets to make them stand out from the surrounding text, particularly useful here since the levels are coded as single letters.)  Applying this procedure to the mushroom characteristics EorP and GillSize yields the following results:

> TableOR.proc(mushrooms$EorP, mushrooms$GillSize)
[1] "Odds ratio between the level [ e ] of the first variable and the level [ b ] of the second variable:"

   LowerCI       OR  UpperCI alpha
1 15.62615 17.89073 20.48349  0.05


Almost certainly, the formatting I have used here could be improved – probably a lot – but the key point is to provide a result that is reasonably complete and easy to interpret. 

Finally, I noted in my last post that if we are interested in using odds ratios to compare or rank associations, it is useful to code the levels so that the computed odds ratio is larger than 1.  In particular, note that applying the above procedure to characterize the relationship between edibility and the Bruises characteristic yields:

> TableOR.proc(mushrooms$EorP, mushrooms$Bruises)
[1] "Odds ratio between the level [ e ] of the first variable and the level [ f ] of the second variable:"

     LowerCI        OR   UpperCI alpha
1 0.09014769 0.1002854 0.1115632  0.05


It is clear from these results that both Bruises and GillSize exhibit odds ratios with respect to mushroom edibility that are significantly different from the neutral value 1 (i.e., the 95% confidence interval excludes the value 1 in both cases), but it is not obvious which variable has the stronger association, based on the available data.  The following procedure automatically restructures the computation so that the computed odds ratio is larger than or equal to 1, allowing us to make this comparison:

AutomaticOR.proc <- function(x,y,alpha=0.05){
   xtab <- table(x,y)
   n00 <- xtab[1,1]
   n01 <- xtab[1,2]
   n10 <- xtab[2,1]
   n11 <- xtab[2,2]
   rawOR <- (n00*n11)/(n01*n10)
   if (rawOR < 1){
     # swap the columns so the second level of y is used as the reference
     n01 <- xtab[1,1]
     n00 <- xtab[1,2]
     n11 <- xtab[2,1]
     n10 <- xtab[2,2]
     iLevel <- 2
   } else {
     iLevel <- 1
   }
   outList <- vector("list",2)
   outList[[1]] <- paste("Odds ratio between the level [",dimnames(xtab)[[1]][1],"] of the first variable and the level [",dimnames(xtab)[[2]][iLevel],"] of the second variable:",sep=" ")
   outList[[2]] <- oddsratioWald.proc(n00,n01,n10,n11,alpha)
   outList
}

Note that this procedure first constructs the 2x2 table on which everything is based and then computes the odds ratio in the default coding: if this value is smaller than 1, the coding of the second variable (y) is reversed.  The odds ratio and its confidence interval are then computed and the levels of the variables used in computing it are presented as before.  Applying this procedure to the Bruises characteristic yields the following result, from which we can see that GillSize appears to have the stronger association, as noted last time:

> AutomaticOR.proc(mushrooms$EorP, mushrooms$Bruises)
[1] "Odds ratio between the level [ e ] of the first variable and the level [ t ] of the second variable:"

   LowerCI       OR  UpperCI alpha
1 8.963532 9.971541 11.09291  0.05

Saturday, June 18, 2011


What are you trying to say?
Don't be silly.
How strong are your glasses? (What is your prescription?)
Just because. (No particular reason.)
It isn't the way I hoped it would be.
You will never guess.
No one could do anything about it.
I saw something deeply disturbing.
Money is a good servant but a bad master. (Be the master of money, not its slave.)
I am not available. (I'm busy.)
Wisdom in the mind is better than money in the hand.
Never say die. It's a piece of cake. (Don't lose heart; it's easy.)
Don't worry. You'll get used to it soon.
I know how you feel.
You win some, you lose some.
Don't bury your head in the sand. (Don't run away from reality.)
I didn't expect you to do such a good job.
You are coming along well.
She is well-built. (She has a great figure.)
You look neat and fresh.
You have a beautiful personality.
You flatter me immensely.
You should be slow to judge others.
I hope you will excuse me if I make any mistakes.
It was most careless of me.
It was quite by accident.
I wish I had all the time I'd ever wasted, so I could waste it all over again.
I like you the way you were.
You two go ahead to the movie without me; I don't want to be a third wheel.
Do you have anyone in mind?
How long have you known her?
It was love at first sight.
I'd better hit the books. (I need to study.)
a piece of one's mind: blunt criticism
He gave me a piece of his mind: "Don't shift responsibility onto others."
a cat and dog life: a life of constant quarrelling
The husband and his wife are always quarrelling, and they are leading a cat and dog life.
a dog's life: a miserable life
The man lived a dog's life.
A to Z: from beginning to end
I know that from A to Z.
above somebody: too difficult for somebody to understand
Well, this sort of talk is above me.
all ears: listening with full attention
When you tell Mary some gossip, she is all ears.
all the more / all the better: even more so
You'll be all the better for a holiday.
all dressed up: dressed very smartly
She is all dressed up with nowhere to go.
all in all: on the whole; also, everything to someone
The daughter is all in all to him.
all out: with maximum effort
They went all out.
all over: completely finished; over the whole body, everywhere
I'm glad it is all over.
I'm wet all over.
all set: ready
He is all set for an early morning start.
all you have to do: the only thing you need to do
All you have to do is calm yourself down and wait for the good news.
as easy as falling off a log / as easy as snapping your fingers / as easy as ABC: very easy
To me, a good storyteller, it would be as easy as falling off a log.
as busy as a bee: very busy
Mum is always as busy as a bee in the morning.
at one's fingertips: thoroughly familiar; readily available
How to get to that little island is at his fingertips.
at one's wit's end: out of ideas
Don't ask him. He is at his wit's end too.
big shot: an important person
He is a big shot in our little town.
black sheep: the disreputable member of a family or group
Every family has a black sheep.
black and blue: covered in bruises
The thief was caught red-handed and beaten black and blue.
black and white: in writing
The proof is in black and white and the murderer has no excuse.
blind alley: dead end
You are heading into a blind alley.
blow hot and cold: to keep changing one's mind
This guy seemed to have no ideas of his own. He always blew hot and cold.
blow one's own trumpet: to boast
Don't blow your own trumpet. Let us see what on earth you can do.
born with a silver spoon in one's mouth: born into a wealthy family
He was born with a silver spoon in his mouth.
brand new: completely new
a brand new coat
break the ice: to break an awkward silence
The couple hadn't spoken to each other for a week. They were both waiting for the other one to break the ice.
by a blow: with a single blow
He was beaten to the ground by a blow.
can't stand it any longer: cannot bear it any more
I can't stand it any longer. I quit.
carry something too far: to overdo something
You are carrying your joke too far.
castle in the sky: an unrealistic dream
Your plan is nearly a castle in the sky.
cat's got one's tongue: at a loss for words
chain smoker: a person who smokes continually
come up with: to think of, to produce
Let me come up with some ideas.
come easily: to be easy for someone
Languages come easily to some people.
cup of tea: something one likes
Movies are not my cup of tea.
cut it out: stop it
Cut it out! I can't stand you any longer.
call it a day: to stop working for the day
Let us call it a day.
dark horse: an unexpected winner
Nobody thought that John would win the game. He was a dark horse in the final.
Dear John letter: a letter ending a relationship
Jack received a Dear John letter from his girlfriend because he had broken her heart.
do someone good: to be beneficial to someone
Doing some morning exercises does you good.
Do you get me? (Do you understand what I mean?)
doesn't count: this one does not count
It doesn't count this time. Try again.
doesn't make sense: incomprehensible; meaningless
The sentence you made doesn't make any sense to me.
down and out: destitute
Being down and out, he couldn't support his family.
drive at: to mean, to intend
What's he driving at?
drop in: to pay a casual visit
I dropped in on him on my way to the hospital.
drop me a line: write to me
On arriving at the university, please drop me a line.
early bird: a person who gets up early
The early bird catches the worm. (Those who act first get the advantage.)
easy come, easy go: what is easily gained is easily lost
eat my words: to take back what one said and admit being wrong
I said something bad to my mum. Although I wanted to eat my words, it didn't work, for I had hurt my mum's feelings.
face the music: to accept the consequences
He knew he'd never get away with it so he decided to face the music and give himself up to the police.
face up to: to confront something bravely
You must learn to face up to your responsibilities.
fed up: tired of, annoyed with
I am rather fed up with your complaints.
feel free to do something: do not hesitate
Please feel free to make suggestions.
few and far between: rare, scarce
Human beings are few and far between in this zone.
French leave: leaving without saying goodbye
give me a headache: to cause me trouble
The naughty boy gave me a headache.
give me a hand: help me
go Dutch: to split the bill
God bless you: may God protect you
God bless you in your examinations.
God knows: nobody knows
Got it? (Do you understand?)
green thumb / green fingers: skill at gardening
hands are full: very busy
have a ball: to have a great time
have had it: to have had enough
I have had it with all your excuses.
hold water: to stand up to scrutiny
None of his arguments seem to hold water.
in every sense of the word: in every respect
It's a lie in every sense of the word.
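keep an eye on: to watch closely
kill time: to pass the time
lazy bones: a lazy person
Get up, lazy bones!
leave it to me: let me handle it
leave me alone: do not bother me
like father, like son: a son resembles his father
like it or not: whether you like it or not
make a fool of oneself: to do something that makes one look foolish
make big money: to earn a lot of money
make both ends meet: to live within one's income
We have to cut our expenses to make both ends meet.
make waves: to cause a stir; to stir up trouble
His achievement made waves in his country.
make yourself at home: relax and feel comfortable
no good: a bad end
A bad man comes to no good.
no kidding: seriously; you're not joking
none of your business: it does not concern you
not really: not exactly
old hand: an experienced person
He is an old hand at stealing.
old story: the same thing as always
I am tired of it; same old story.
on one's word of honor: on one's solemn promise
on occasion: now and then
of one's own accord: voluntarily
packed like sardines: very crowded
During the holidays, people on the trains are packed like sardines.
pass away: to die
pay the price: to suffer the consequences
You are playing with fire and you must pay the price one day.
put up with: to tolerate
I can't put up with your rudeness any more; leave my room.
red-letter day: an important or memorable day
red tape: excessive bureaucracy
red carpet: a ceremonial welcome for an honoured guest
run into: to meet by chance
I ran into an old friend in the shop yesterday.
run out of: to use up
Quick, quick, we are running out of time.
show up: to appear, to arrive (to flaunt is "show off")
small potatoes: unimportant people or things
so what? (Why should that matter?)
stand up for: to defend or support (to tolerate is "stand for")
suit one's taste: to be to one's liking
Sunday dress: one's best clothes
sure thing: a certainty
take one's time: to not hurry
Take your time and enjoy it.
take the words out of one's mouth: to say exactly what someone was about to say
that's it: that is exactly it; that is all
that is really something: that is impressive
there is nothing I can do: I cannot help
there you go: here you are
there is nothing wrong with me: I am fine
under the table: secretly
under the weather: feeling unwell
what's going on: what is happening
what a man: what a brave man
walking dictionary: a person with a very large vocabulary
what is up: how are things
Hi, I haven't seen you for a long time. What's up?
world class: among the best in the world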


Tuesday, June 14, 2011



Friday, June 10, 2011

Perl and Linux

sed -i "s/Linux/Linuxidc/g" `grep Linux -rl /home/dir`
perl -p -i -e "s/Linux/Linuxidc/g" *.java

This article comes from the Linux公社 website (http://www.linuxidc.com/). Original link: http://www.linuxidc.com/Linux/2008-02/10954.htm


use File::Copy;
 copy("file1","file2") or die "Copy failed: $!";

The copy function takes two parameters: a file to copy from and a file to copy to.

The move function also takes two parameters: the current name and the intended name of the file to be moved. If the destination already exists and is a directory, and the source is not a directory, then the source file will be renamed into the directory specified by the destination.

If possible, move() will simply rename the file. Otherwise, it copies the file to the new location and deletes the original.
sources: http://perldoc.perl.org/File/Copy.html

Sunday, June 5, 2011


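  18. Vinegar stains: if clothing is stained with vinegar or soy sauce, sprinkle on a little white sugar and rub it in, then wash clean with warm water.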

source: http://wenwen.soso.com/z/q2006577134.htm?pid=mail.wen8