Help generating a script for next-generation sequencing data
Post 302578109 by m.d.ludwig on Wednesday 30th of November 2011 09:56:24 PM
Kelly,

Can the (non-".") value in the 15th column of data occur more than once in a file? And if so, does each occurrence increment the frequency of that value? If it does, then you might find something like:
Code:
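awk '
    BEGIN { FS = OFS = "\t" }             # tab is the column separator?
    FNR == 1 { next }                     # skip the header line of each file
    { N[$15]++ }                          # count each value seen in column 15
    END { for (p in N) print p, N[p] }    # print each value and its frequency
' file1 file2... > outputfile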

might help get you started.
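
If the "." entries are only placeholders and should not be counted, a variant along these lines may be closer to what you want. It is just a sketch: the function name count_col15 is made up for illustration, and it assumes tab-separated input with one header line per file, the same as the script above. The output is sorted so the most frequent values come first.

```shell
# Hypothetical helper: frequency of each non-"." value in column 15.
# Assumes tab-separated files, each starting with one header line.
count_col15() {
    awk '
        BEGIN { FS = OFS = "\t" }
        FNR == 1 { next }                  # skip the header line of each file
        $15 == "" || $15 == "." { next }   # ignore missing or placeholder values
        { N[$15]++ }                       # count the remaining values
        END { for (p in N) print p, N[p] }
    ' "$@" | sort -t "$(printf '\t')" -k2,2nr -k1,1   # highest count first, then by name
}
```

It would be called as count_col15 file1 file2 > outputfile.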
 

paste(1)						      General Commands Manual							  paste(1)

Name
       paste - merge file data

Syntax
       paste file1 file2...
       paste -dlist file1 file2...
       paste -s [-dlist] file1 file2...

Description
       In the first two forms, paste concatenates corresponding lines of the given input files file1, file2, etc.  It treats each file as a column or
       columns of a table and pastes them together horizontally (parallel merging).

       In the last form, the command combines subsequent lines of the input file (serial merging).

       In all cases, lines are glued together with the tab character, or with characters from an optionally specified  list.   Output  is  to  the
       standard output, so it can be used as the start of a pipe, or as a filter, if - is used in place of a file name.
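
       The difference between parallel and serial merging can be sketched with two small files.  The file names letters and numbers
       are made up for illustration, and the default tab is used as the delimiter.

```shell
# Two throwaway input files (hypothetical names) in a temporary directory.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/letters"
printf '1\n2\n3\n' > "$tmpdir/numbers"

# Parallel merging: line N of each file becomes one tab-separated output line.
parallel=$(paste "$tmpdir/letters" "$tmpdir/numbers")

# Serial merging: all lines of a single file are joined into one output line.
serial=$(paste -s "$tmpdir/letters")

printf '%s\n' "$parallel"
printf '%s\n' "$serial"
rm -rf "$tmpdir"
```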

Options
       -       Used in place of any file name, to read a line from the standard input.	(There is no prompting).

       -dlist  Replaces the new-line characters of all but the last file with characters from list (default tab).  One or more characters immediately following -d
	       replace the default tab as the line concatenation character.  The list is used circularly, i. e. when exhausted, it is reused.	In
	       parallel  merging  (i. e. no -s option), the lines from the last file are always terminated with a new-line character, not from the
	       list.  The list may contain the special escape sequences: \n (new-line), \t (tab), \\ (backslash), and \0 (empty string, not a null
	       character).   Quoting  may  be  necessary,  if characters have special meaning to the shell (for example, to get one backslash, use
	       -d"\\" ).
	       Without this option, the new-line characters of each but the last file (or last line in case of the -s option) are  replaced  by  a
	       tab character.  This option allows replacing the tab character by one or more alternate characters (see below).

       -s      Merges subsequent lines rather than one from each input file.  Uses tab for concatenation, unless a list is specified with the -d
	       option.  Regardless of the list, the very last character of the file is forced to be a new-line.

Examples
       ls | paste -d" " -
       list directory in one column
       ls | paste - - - -
       list directory in four columns
       paste -s -d"\t\n" file
       combine pairs of lines into lines
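
       The same idioms can be tried on fixed input instead of a directory listing, which makes the result predictable.  The numbers
       fed in through printf are arbitrary sample data.

```shell
# Four columns: each "-" consumes one line of standard input per output line.
four_cols=$(printf '%s\n' 1 2 3 4 5 6 7 8 | paste - - - -)

# Pairs of lines: the delimiter list alternates tab and new-line,
# so every second line break is kept.
pairs=$(printf '%s\n' 1 2 3 4 | paste -s -d'\t\n' -)

printf '%s\n' "$four_cols"
printf '%s\n' "$pairs"
```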

Diagnostics
       line too long
		 Output lines are restricted to 511 characters.

       too many files
		 Except for -s option, no more than 12 input files may be specified.

See Also
       cut(1), grep(1), pr(1)

																	  paste(1)