awk to output lines less than number Post: 302950958

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to print number of lines with awk ?

Can some body tell me how to print number of line from a particular file, with sed. ? Input file format AAAA BBBB CCCC SDFFF DDDD DDDD Command to print line 2 and 3 ? BBBB CCCC And also please tell me how to assign column sum to variable. I user the following command it...

2. Shell Programming and Scripting

awk - Counting number of similar lines

Hi All I have the input file OMAK_11. OMAK 000002EXCLUDE 1341 OMAK 000002EXCLUDE 1341 OMAK 000002EXCLUDE 1341 OMAK 000003EXCLUDE 1341 OMAK 000003EXCLUDE 1341 OMAK 000003EXCLUDE ...

3. Shell Programming and Scripting

awk, ignore first x number of lines.

Is there a way to tell awk to ignore the first 11 lines of a file?? example, I have a csv file with all the heading information in the first lines. I want to split the file into 5-6 different files but I want to retain the the first 11 lines of the file. As it is now I run this command: ...

4. Shell Programming and Scripting

How do I print out lines with the same number in front using awk?

Hi, I need help in printing out the dates with the largest value in front of it using awk. 436 28/Feb/2008 436 27/Feb/2008 436 20/Feb/2008 422 13/Feb/2008 420 23/Feb/2008 409 21/Feb/2008 402 26/Feb/2008 381 22/Feb/2008 374 24/Feb/2008 360...

5. Shell Programming and Scripting

How to (n)awk lines of CSV with certain number of fields?

I have a CSV file with a variable number of fields per record. How do I print lines of a certain number of fields only? Several permutations of the following (including the use of escape characters) have failed to retrieve the line I'm after (1,2,3,4)... $ cat myfile 1,2,3,4 1,2,3 $ # Print...

6. Shell Programming and Scripting

Awk number of lines

How do I get the last NR of a csv file? If I use the line awk -F, '{print NR}' csvfile.csv and there are 42 lines, I get: ... 39 40 41 42 How do I extract the last number, which in this case is 42? ---------- Post updated at 11:05 AM ---------- Previous update was at 10:57 AM...

7. Shell Programming and Scripting

awk number output

Hi, I have a problem when doing calculations in awk. I want to add up a few numbers and output the result. testfile: 48844322.87 7500.00 10577415.87 3601951.41 586877.64 1947813.89 $ awk '{x=x+$1};END{print x}' testfile 6.55659e+07The problem is the number format. It should show...

8. Shell Programming and Scripting

Getting awk Output as number instead of String

Hi All, I have a file a.txt, content as mentioned below: 22454750 This data in this control file and I have a variable called vCount which contains a number. I need to extract the 22454750 from the above file and compare with the variable vCount. If match fine or else exit. ...

9. Shell Programming and Scripting

awk - Skip x Number of Lines in Counter

Hello, I am new to AWK and in UNIX in general. I am hoping you can help me out here. Here is my data: root@ubuntu:~# cat circuits.list WORD1 AA BB CC DD Active ISP1 ISP NAME1 XX-XXXXXX1 WORD1 AA BB CC

10. UNIX for Beginners Questions & Answers

How to output non-number lines with grep?

I want to check my data quality. I want to output the lines with non-number. I used the grep command: grep '' myfile.csv Since my file is csv file, I don't want to output the lines with comma. And I also don't want to output "." or space. But I still get the lines like the following:...

LEARN ABOUT DEBIAN

tabix

tabix(1)						       Bioinformatics tools							  tabix(1)

NAME

       bgzip - Block compression/decompression utility

       tabix - Generic indexer for TAB-delimited genome position files

SYNOPSIS

       bgzip [-cdhB] [-b virtualOffset] [-s size] [file]

       tabix [-0lf] [-p gff|bed|sam|vcf] [-s seqCol] [-b begCol] [-e endCol] [-S lineSkip] [-c metaChar] in.tab.bgz [region1 [region2 [...]]]

DESCRIPTION

       Tabix  indexes a TAB-delimited genome position file in.tab.bgz and creates an index file in.tab.bgz.tbi when region is absent from the com-
       mand-line. The input data file must be position sorted and compressed by bgzip which has a gzip(1) like interface. After indexing, tabix is
       able  to quickly retrieve data lines overlapping regions specified in the format "chr:beginPos-endPos". Fast data retrieval also works over
       network if URI is given as a file name and in this case the index file will be downloaded if it is not present locally.

OPTIONS OF TABIX

       -p STR	 Input format for indexing. Valid values are: gff, bed, sam, vcf and psltab. This option should not be applied together  with  any
		 of -s, -b, -e, -c and -0; it is not used for data retrieval because this setting is stored in the index file. [gff]

       -s INT	 Column  of  sequence name. Option -s, -b, -e, -S, -c and -0 are all stored in the index file and thus not used in data retrieval.
		 [1]

       -b INT	 Column of start chromosomal position. [4]

       -e INT	 Column of end chromosomal position. The end column can be the same as the start column. [5]

       -S INT	 Skip first INT lines in the data file. [0]

       -c CHAR	 Skip lines started with character CHAR. [#]

       -0	 Specify that the position in the data file is 0-based (e.g. UCSC files) rather than 1-based.

       -h	 Print the header/meta lines.

       -B	 The second argument is a BED file. When this option is in use, the input file may not be sorted or indexed. The entire input will
		 be read sequentially. Nonetheless, with this option, the format of the input must be specificed correctly on the command line.

       -f	 Force to overwrite the index file if it is present.

       -l	 List the sequence names stored in the index file.

EXAMPLE

       (grep ^"#" in.gff; grep -v ^"#" in.gff | sort -k1,1 -k4,4n) | bgzip > sorted.gff.gz;

       tabix -p gff sorted.gff.gz;

       tabix sorted.gff.gz chr1:10,000,000-20,000,000;

NOTES

       It  is  straightforward	to  achieve overlap queries using the standard B-tree index (with or without binning) implemented in all SQL data-
       bases, or the R-tree index in PostgreSQL and Oracle. But there are still many reasons to use tabix. Firstly, tabix directly  works  with  a
       lot  of	widely used TAB-delimited formats such as GFF/GTF and BED. We do not need to design database schema or specialized binary formats.
       Data do not need to be duplicated in different formats, either. Secondly, tabix works on compressed data files while most SQL databases	do
       not.  The  GenCode annotation GTF can be compressed down to 4%.	Thirdly, tabix is fast. The same indexing algorithm is known to work effi-
       ciently for an alignment with a few billion short reads. SQL databases probably cannot easily handle data at this scale. Last but  not  the
       least,  tabix supports remote data retrieval. One can put the data file and the index at an FTP or HTTP server, and other users or even web
       services will be able to get a slice without downloading the entire file.

AUTHOR

       Tabix was written by Heng Li. The BGZF library was originally implemented by Bob Handsaker and modified by Heng Li for remote  file  access
       and in-memory caching.

SEE ALSO

       samtools(1)

tabix-0.2.0							    11 May 2010 							  tabix(1)

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to print number of lines with awk ?

Discussion started by: maheshsri

2. Shell Programming and Scripting

awk - Counting number of similar lines

Discussion started by: dhanamurthy

3. Shell Programming and Scripting

awk, ignore first x number of lines.

Discussion started by: trey85stang

4. Shell Programming and Scripting

How do I print out lines with the same number in front using awk?

Discussion started by: SIFA