Here there are four tab-separated columns. The first column is the algorithm used for prediction; there are four of them, A-D. The second column holds the predicted targets (which are actually genes), x and y. The third and fourth columns give the start and end of the predicted site in the gene's sequence.
I need to collapse the entries in column 2 wherever the ranges in columns 3 and 4 overlap, something like this:
Here, for example, the first line shows that algorithms A, B and C all predict gene x, and their predicted positions fall into the same site: positions 70-80 from algorithm B and 75-85 from algorithm C both lie inside the site predicted by algorithm A, which is 65-85. The last column indicates how many algorithms predicted this site. By contrast, the site predicted by algorithm D for gene x does not overlap the others, so it is shown on a separate line. The results for gene y are read the same way.
Hope this is clear.
Thank you in advance
Last edited by flyfisherman; 01-25-2014 at 02:42 PM..
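One possible way to approach the merge described above is to sort by gene and start position, then sweep through the lines with awk. This is only a sketch under assumptions: the function name `merge_sites` and the output layout (gene, merged start, merged end, comma-joined algorithms, count) are made up for illustration, since the exact desired layout is not shown, and any overlap is treated as "the same site", not only full containment.

```shell
#!/bin/sh
# merge_sites (hypothetical name): read "algorithm<TAB>gene<TAB>start<TAB>end"
# lines and collapse overlapping sites of the same gene into one line.
# Output columns (an assumed layout): gene, start, end, algorithms, count.
merge_sites() {
    tab=$(printf '\t')
    # Sort by gene, then numerically by start and end, so that
    # overlapping sites of one gene end up on adjacent lines.
    sort -t "$tab" -k2,2 -k3,3n -k4,4n "$@" |
    awk -F '\t' -v OFS='\t' '
        function emit() {
            if (n) print gene, lo, hi, algos, n
        }
        # New gene, or a start beyond the current site: begin a new site.
        $2 != gene || $3 + 0 > hi {
            emit()
            gene = $2; lo = $3 + 0; hi = $4 + 0; algos = $1; n = 1
            next
        }
        # Otherwise the line overlaps the current site: extend it.
        {
            if ($4 + 0 > hi) hi = $4 + 0
            algos = algos "," $1
            n++
        }
        END { emit() }
    '
}
```

Called as `merge_sites predictions.tsv` (the file name is hypothetical), the lines A/x/65/85, B/x/70/80 and C/x/75/85 from the post would collapse into a single line `x 65 85 A,B,C 3`. Note that the count is the number of input lines merged, so an algorithm listed twice for the same site would be counted twice.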
Howdy experts,
We have some ranges of numbers which belong to particular groups, as below.
GroupNo StartRange EndRange
Group0125 935300 935399
Group2006 935400 935476
937430 937459
Group0324 935477 935549
... (6 Replies)
Hi,
I am going to fetch a list of numbers that start with "0032" from a file in a format like the one given below:
"
0032459999 0032458888 0032457777
0032451111 0032452222 0032453333
0032459999 0032458888 0032457777
0032451111 0032452222 0032453333
"
I want to get a unique... (6 Replies)
Hi, I have a small piece of awk code (see below) that generates random numbers.
gawk -F"," 'BEGIN { srand(); for (i = 1; i <= 30; i++) printf("%s AM329_%04d\n",$0,int(36 * rand())+1) }' OFS=, AM329_hole_names.csv
The code works fine and generates alphanumeric numbers like AM329_0001,... (2 Replies)
I am trying to extract specific information from a large *.sam file (it's originally 28Gb).
I want to extract all lines that are on chr3 somewhere in the range of 112,937,439-113,437,438.
Here is a sample line from my file so you can get a feel for what each line looks like:
seq.4 0 ... (8 Replies)
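A filter of this kind could be sketched with awk, assuming the file follows the standard SAM layout in which column 3 is the reference name and column 4 is the 1-based leftmost mapping position. The function name `sam_region` is hypothetical, and the test looks only at the start position of each read, not at where the read ends.

```shell
#!/bin/sh
# sam_region (hypothetical name): print SAM alignment lines whose reference
# name (column 3) matches CHROM and whose start position (column 4) lies
# within START..END inclusive.
# usage: sam_region CHROM START END [file...]
sam_region() {
    chrom=$1; lo=$2; hi=$3
    shift 3
    awk -F '\t' -v chrom="$chrom" -v lo="$lo" -v hi="$hi" '
        /^@/ { next }    # skip SAM header lines
        $3 == chrom && $4 + 0 >= lo + 0 && $4 + 0 <= hi + 0
    ' "$@"
}
```

For the range in the question this would be called as `sam_region chr3 112937439 113437438 input.sam` (the file name is a placeholder). awk reads the file as a stream, so the 28 GB size should not need to fit in memory.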
I want to create entries based on the series as in examples below:
Input:
2dat3 grht-5&&-15
3dat3 grht-16&&-30
4dat3 ftht-4&&-12
5sat3 ftht-16&&-20
Output:
2dat3 grht-5
2dat3 grht-6
2dat3 grht-7
2dat3 grht-8 (7 Replies)
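An expansion like this could be sketched in awk as follows. It assumes that the second field always has the form `name-FIRST&&-LAST` and that the series runs up to LAST inclusive (the output shown above is cut off, so the end point is an assumption); `expand_series` is a made-up name.

```shell
#!/bin/sh
# expand_series (hypothetical name): turn "2dat3 grht-5&&-15" into one
# line per number in the series: "2dat3 grht-5", "2dat3 grht-6", ...
expand_series() {
    awk '
        {
            if (match($2, /-[0-9]+&&-[0-9]+$/)) {
                # Text before the first number, without the dash.
                prefix = substr($2, 1, RSTART - 1)
                # "5&&-15" -> range[1] = 5, range[2] = 15
                split(substr($2, RSTART + 1), range, "&&-")
                for (i = range[1] + 0; i <= range[2] + 0; i++)
                    print $1, prefix "-" i
            } else {
                print    # leave lines without a range untouched
            }
        }
    ' "$@"
}
```

It can be used as `expand_series input.txt`, or at the end of a pipeline, since it reads standard input when no file is given.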
Hi all,
I wanted to save the values of a file that contains unique entries based on a specific column (column 4). My sample file looks like the following:
input file: 200006-07file.txt
145 35 10 3
147 35 12 4
146 36 11 3
145 34 12 5
143 31 15 4
146 30 14 5
desired output files:... (5 Replies)
I have some files named file1, file2, file3......etc. These files are in a folder f1. The contents of the files are shown below. I would like to count the unique pairs in the third column of each file. Some files have no data; for those, zero should be printed. Your help would be appreciated.
file1
ARG... (1 Reply)
Discussion started by: samra
LEARN ABOUT MINIX
roff is a text formatter. Its input consists of the text to be output, intermixed with formatting commands. A formatting command is a line containing the control character followed by a two-character command name, and possibly one or more arguments. The control character is initially . (dot). The formatted output is produced on standard output. The formatting commands are listed below, with n being a number, c being a character, and t being a title. A + before n means it may be signed, indicating a positive or negative change from the current value. Initial values for n, where relevant, are given in parentheses.
.ad Adjust right margin.
.ar Arabic page numbers.
.br Line break. Subsequent text will begin on a new line.
.bl n Insert n blank lines.
.bp +n Begin new page and number it n. No n means +1.
.cc c Control character is set to c.
.ce n Center the next n input lines.
.de zz Define a macro called zz. A line with .. ends definition.
.ds Double space the output. Same as .ls 2.
.ef t Even page footer title is set to t.
.eh t Even page header title is set to t.
.fi Begin filling output lines as full as possible.
.fo t Footer titles (even and odd) are set to t.
.hc c The character c (e.g., %) tells roff where hyphens are permitted.
.he t Header titles (even and odd) are set to t.
.hx Header titles are suppressed.
.hy n Hyphenation is done if n is 1, suppressed if it is 0. Default is 1.
.ig Ignore input lines until a line beginning with .. is found.
.in n Indent n spaces from the left margin; force line break.
.ix n Same as .in but continue filling output on current line.
.li n Literal text on next n lines. Copy to output unmodified.
.ll +n Line length (including indent) is set to n (65).
.ls +n Line spacing: n (1) is 1 for single spacing, 2 for double, etc.
.m1 n Insert n (2) blank lines between top of page and header.
.m2 n Insert n (2) blank lines between header and start of text.
.m3 n Insert n (1) blank lines between end of text and footer.
.m4 n Insert n (3) blank lines between footer and end of page.
.na No adjustment of the right margin.
.ne n Need n lines. If fewer are left, go to next page.
.nn +n The next n output lines are not numbered.
.n1 Number output lines in left margin starting at 1.
.n2 n Number output lines starting at n. If 0, stop numbering.
.ni +n Indent line numbers by n (0) spaces.
.nf No more filling of lines.
.nx f Switch input to file f.
.of t Odd page footer title is set to t.
.oh t Odd page header title is set to t.
.pa +n Page adjust by n (1). Same as .bp.
.pl +n Paper length is n (66) lines.
.po +n Page offset. Each line is started with n (0) spaces.
.ro Page numbers are printed in Roman numerals.
.sk n Skip n pages (i.e., make them blank), starting with next one.
.sp n Insert n blank lines, except at top of page.
.ss Single spacing. Equivalent to .ls 1.
.ta Set tab stops, e.g., .ta 9 17 25 33 41 49 57 65 73 (default).
.tc c Tabs are expanded into c. Default is space.
.ti n Indent next line n spaces; then go back to previous indent.
.tr ab Translate a into b on output.
.ul n Underline the letters and numbers in the next n lines.