    #1  
Old 06-19-2012
Registered User
 
Histograms on Linux

Hi,

I have a file in the following format:


Code:
chr1 122333 1223344 1.87887877778
chr1 2324343 433433 1.98794334343
chr2 343545 54545454 9.0009123232
chr3 843749 89449379 10.004343722
................................................
................................................

I would like to plot a histogram, with a normal curve overlaid, for the 4th column of the above input file.

I have been working with R. I can get the histogram, but the normal curve is truncated. There are more than 2090000 records in the input file, and the values in the 4th column range from 1.7566666 to 162430.090000.

Any help is appreciated. This is the R code I have been using:


Code:
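setwd("/path/")
# Read the tab-separated file; with no header row the columns are named V1..V4
data <- read.table("input.txt", sep = "\t", header = FALSE)
x <- data$V4
# Density-scaled histogram with the default axes suppressed
h <- hist(x, axes = FALSE, xaxt = 'n', yaxt = 'n', col = "green", main = "GATA3-TPMDistribution", xlab = "TPMValues", probability = TRUE)
axis(side = 1, at = seq(0, 162450, 1000))
axis(side = 2, at = seq(0, 0.008, 0.0001))
s <- sd(x)
m <- mean(x)
# Normal curve using the sample mean and sd, then a kernel density estimate
curve(dnorm(x, mean = m, sd = s), add = TRUE, lwd = 3, col = "red")
lines(density(x), col = "blue")
abline(v = m, col = "blue")
mtext(paste("mean ", round(m, 1), "; sd ", round(s, 1), "; N ", length(x), sep = ""), side = 1, cex = .75)
# Closes the current graphics device (the matching png()/pdf() call is not shown in this post)
dev.off()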

Any thoughts on how to achieve the same thing on Linux?
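One possible reason for seeing only part of the curve, offered as a guess since the plots are not preserved: hist() chooses the axis limits from the data alone, so any part of the fitted normal curve lying left of the smallest value, or above the tallest bar, is clipped. Below is a minimal sketch that widens both axes before drawing. It uses simulated skewed values from rlnorm() as a stand-in for column 4, and the window of 4 standard deviations and the 100 breaks are arbitrary choices, not values from the post.

```r
set.seed(1)
# Stand-in for data$V4: simulated right-skewed values (assumption, not the real file)
x <- rlnorm(10000, meanlog = 2, sdlog = 1.5)

m <- mean(x)
s <- sd(x)

# Compute the bins without drawing, so the bar heights are known up front
h <- hist(x, breaks = 100, plot = FALSE)

# Widen x to cover mean +/- 4 sd, and y to cover the peak of the normal curve
xlim <- range(c(h$breaks, m - 4 * s, m + 4 * s))
ylim <- c(0, max(h$density, dnorm(m, mean = m, sd = s)))

# Draw the precomputed histogram on the density scale with the wider limits
plot(h, freq = FALSE, xlim = xlim, ylim = ylim, col = "green",
     main = "GATA3-TPMDistribution", xlab = "TPMValues")
# Normal curve drawn over the full x range, not just the data range
curve(dnorm(x, mean = m, sd = s), from = xlim[1], to = xlim[2],
      add = TRUE, lwd = 3, col = "red")
abline(v = m, col = "blue")
```

With real data, x would come from data$V4 as in the code above. If the values are strongly skewed, the left half of the curve will extend below zero, which is expected for a normal fit to data like this.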

Last edited by jacobs.smith; 06-19-2012 at 02:00 PM..
    #2  
Old 06-19-2012
Mead Rotor
 
In what way is it being truncated? What are the results you expect, and what are the results you get?

How large is your input data in bytes, not records?
    #3  
Old 06-19-2012
Registered User
 
Hi Corona,

I can only see the tail end of the normal curve, but I would like to see the whole curve: its start, peak, and end.

I would like to look at the distribution of the values in the 4th column using a histogram. The histogram itself looks fine, but without seeing the full curve I can't infer anything.

My input file is around 30MB.

Please let me know if you need any more detail.

I would like my histogram to have a curve like this:

[image not preserved]

But I am getting it this way:

[image not preserved]
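Since the values span roughly five orders of magnitude (1.7566666 to 162430.090000), a normal curve on the raw scale may not be very informative however the axes are set. One option, which is an assumption on my part and not something proposed in this thread, is to plot the histogram of the log10 values and fit the normal curve on that scale. A sketch with simulated data standing in for column 4:

```r
set.seed(1)
# Stand-in for data$V4: simulated positive, right-skewed values (assumption)
x <- rlnorm(10000, meanlog = 2, sdlog = 1.5)

# All values are positive, so the logarithm is defined for every record
lx <- log10(x)
m <- mean(lx)
s <- sd(lx)

# Density-scaled histogram of the log-transformed values
hist(lx, breaks = 100, probability = TRUE, col = "green",
     main = "GATA3-TPMDistribution (log10 scale)", xlab = "log10(TPMValues)")
# Normal curve fitted to the log-transformed values
curve(dnorm(x, mean = m, sd = s), add = TRUE, lwd = 3, col = "red")
# Kernel density estimate for comparison
lines(density(lx), col = "blue")
```

Whether a log scale is appropriate depends on what the TPM values are meant to show, so treat this as one way to inspect the distribution, not as the fix.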



Last edited by jacobs.smith; 06-19-2012 at 12:03 PM..