Post 302950958 by cmccabe on Thursday 30th of July 2015 01:12:49 PM
awk to output lines less than number

I am trying to output all lines in a file where $7 is less than 30. The code below does create a result file, but it contains every line of the original file. The original file is tab-delimited; is that the problem? Thank you.

Code:
 awk 'BEGIN{FS=OFS=","} $7 < 30 {print}' file.txt > result.txt

file.txt
Code:
chr1    40539722    40539865    chr1:40539722-40539865    PPT1    1    159
chr1    40539722    40539865    chr1:40539722-40539865    PPT1    2    161
chr1    40539722    40539865    chr1:40539722-40539865    PPT1    3    161
chr1    40539722    40539865    chr1:40539722-40539865    PPT1    3    28

Desired result.txt
Code:
 chr1    40539722    40539865    chr1:40539722-40539865    PPT1    3    28

---------- Post updated at 12:12 PM ---------- Previous update was at 11:59 AM ----------

It was the FS=OFS=",": it should be FS=OFS="\t". Guess I need to pay more attention. FS is the (input) field separator and OFS is the output field separator, right? Thank you.

Last edited by cmccabe; 07-30-2015 at 02:14 PM.. Reason: added desired result
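
For anyone landing here later, a minimal sketch of the corrected command, assuming file.txt really is tab-separated with seven fields per line (the printf lines only recreate the sample data from the post). With FS=",", a line containing no commas is one big field, so $7 is empty, and an empty field compares as less than 30; that is why every line was printed.

```shell
# Recreate the sample file.txt from the post, tab-separated (assumed layout).
printf 'chr1\t40539722\t40539865\tchr1:40539722-40539865\tPPT1\t1\t159\n'  > file.txt
printf 'chr1\t40539722\t40539865\tchr1:40539722-40539865\tPPT1\t2\t161\n' >> file.txt
printf 'chr1\t40539722\t40539865\tchr1:40539722-40539865\tPPT1\t3\t161\n' >> file.txt
printf 'chr1\t40539722\t40539865\tchr1:40539722-40539865\tPPT1\t3\t28\n'  >> file.txt

# FS="\t" splits each input line on tabs, so $7 is the last column.
# OFS="\t" only matters if a field is modified; it is kept for symmetry.
# A pattern with no action prints the matching line, same as {print}.
awk 'BEGIN{FS=OFS="\t"} $7 < 30' file.txt > result.txt

cat result.txt
```

Only the line ending in 28 should end up in result.txt. Since none of the fields contain spaces, plain awk '$7 < 30' file.txt (default whitespace splitting) should give the same result here.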
 

tabix(1)                      Bioinformatics tools                      tabix(1)

NAME
       bgzip - Block compression/decompression utility
       tabix - Generic indexer for TAB-delimited genome position files

SYNOPSIS
       bgzip [-cdhB] [-b virtualOffset] [-s size] [file]

       tabix [-0lf] [-p gff|bed|sam|vcf] [-s seqCol] [-b begCol] [-e endCol]
             [-S lineSkip] [-c metaChar] in.tab.bgz [region1 [region2 [...]]]

DESCRIPTION
       Tabix indexes a TAB-delimited genome position file in.tab.bgz and
       creates an index file in.tab.bgz.tbi when region is absent from the
       command-line. The input data file must be position sorted and
       compressed by bgzip, which has a gzip(1) like interface. After
       indexing, tabix is able to quickly retrieve data lines overlapping
       regions specified in the format "chr:beginPos-endPos". Fast data
       retrieval also works over network if a URI is given as a file name;
       in this case the index file will be downloaded if it is not present
       locally.

OPTIONS OF TABIX
       -p STR    Input format for indexing. Valid values are: gff, bed, sam,
                 vcf and psltab. This option should not be applied together
                 with any of -s, -b, -e, -c and -0; it is not used for data
                 retrieval because this setting is stored in the index
                 file. [gff]

       -s INT    Column of sequence name. Options -s, -b, -e, -S, -c and -0
                 are all stored in the index file and thus not used in data
                 retrieval. [1]

       -b INT    Column of start chromosomal position. [4]

       -e INT    Column of end chromosomal position. The end column can be
                 the same as the start column. [5]

       -S INT    Skip first INT lines in the data file. [0]

       -c CHAR   Skip lines starting with character CHAR. [#]

       -0        Specify that the position in the data file is 0-based
                 (e.g. UCSC files) rather than 1-based.

       -h        Print the header/meta lines.

       -B        The second argument is a BED file. When this option is in
                 use, the input file may not be sorted or indexed. The
                 entire input will be read sequentially. Nonetheless, with
                 this option, the format of the input must be specified
                 correctly on the command line.

       -f        Force to overwrite the index file if it is present.

       -l        List the sequence names stored in the index file.

EXAMPLE
       (grep ^"#" in.gff; grep -v ^"#" in.gff | sort -k1,1 -k4,4n) | bgzip > sorted.gff.gz;
       tabix -p gff sorted.gff.gz;
       tabix sorted.gff.gz chr1:10,000,000-20,000,000;

NOTES
       It is straightforward to achieve overlap queries using the standard
       B-tree index (with or without binning) implemented in all SQL
       databases, or the R-tree index in PostgreSQL and Oracle. But there
       are still many reasons to use tabix. Firstly, tabix directly works
       with a lot of widely used TAB-delimited formats such as GFF/GTF and
       BED. We do not need to design database schema or specialized binary
       formats. Data do not need to be duplicated in different formats,
       either. Secondly, tabix works on compressed data files while most
       SQL databases do not. The GenCode annotation GTF can be compressed
       down to 4%. Thirdly, tabix is fast. The same indexing algorithm is
       known to work efficiently for an alignment with a few billion short
       reads. SQL databases probably cannot easily handle data at this
       scale. Last but not the least, tabix supports remote data
       retrieval. One can put the data file and the index at an FTP or
       HTTP server, and other users or even web services will be able to
       get a slice without downloading the entire file.

AUTHOR
       Tabix was written by Heng Li. The BGZF library was originally
       implemented by Bob Handsaker and modified by Heng Li for remote
       file access and in-memory caching.

SEE ALSO
       samtools(1)

tabix-0.2.0                       11 May 2010                        tabix(1)