Select lines based on character length


 
# 1  
Old 03-14-2016

Hi,

I've got a file like this:

Code:
22	22:35645163:T:<CN0>:0	0	35645163	T	<CN0>
22	rs140738445:20902439:TTTTTTTG:T	0	20902439	T	TTTTTTTG
22	rs149602065:40537763:TTTTTTG:T	0	40537763	T	TTTTTTG
22	rs71670155:50538408:TTTTTTG:T	0	50538408	T	TTTTTTG
22	rs147956527:27899116:TTTTTG:T	0	27899116	T	TTTTTG
22	rs112169882:26309326:T:TTTTTC	0	26309326	T	TTTTTC
22	rs112170669:29942398:T:TTTTTC	0	29942398	T	TTTTTC
22	rs148467612:32268721:TTTTTA:T	0	32268721	T	TTTTTA
22	rs71806779:32681699:TTTTTA:T	0	32681699	T	TTTTTA
22	rs7291429	0	17294251	G	T
22	rs2192431:17303596:T:G	0	17303596	G	T
22	rs175140	0	17306104	G	T
22	rs175147:17309362:G:T	0	17309362	G	T
22	rs12628206:17316990:T:G	0	17316990	G	T
22	rs7510758:17432482:T:G	0	17432482	G	T
22	rs4819923:17433210:T:G	0	17433210	G	T
I need to print only the lines that have exactly one character in each of columns 5 and 6. So, the output should look like this:

Code:
22	rs7291429	0	17294251	G	T
22	rs2192431:17303596:T:G	0	17303596	G	T
22	rs175140	0	17306104	G	T
22	rs175147:17309362:G:T	0	17309362	G	T
22	rs12628206:17316990:T:G	0	17316990	G	T
22	rs7510758:17432482:T:G	0	17432482	G	T
22	rs4819923:17433210:T:G	0	17433210	G	T
So far, I've tried to use awk:

Code:
awk '{print $5,$6}' in.file | awk 'length <3' > out.file
Unfortunately, (1) this keeps only columns 5 and 6, so I do not get the whole line in the end, and (2) it does not select the lines that I need.

Any help would be greatly appreciated!

Many thanks!
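
For reference, a small sketch of why the attempt above selects nothing: `print $5,$6` joins the two fields with awk's output field separator (a single space by default), so a pair of one-character fields comes out as three characters, and `length <3` rejects it. The sample line is taken from the data above; the helper name `pair_length` is made up for illustration.

```shell
# Hypothetical helper: report the length of what the first awk stage emits.
pair_length() {
    awk '{print $5,$6}' | awk '{print length($0)}'
}

# One of the lines the poster wants to keep (fields separated by tabs).
# The first stage emits "G T", so the reported length is 3, not 2.
printf '22\trs7291429\t0\t17294251\tG\tT\n' | pair_length
```

Testing each field separately, or testing the fields on the original line rather than on a reprinted copy, avoids both problems, since the whole record is still available to print.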
# 2  
Old 03-14-2016
How about
Code:
awk '$5 $6 ~ "^..$"' file
22    rs7291429    0    17294251    G    T
22    rs2192431:17303596:T:G    0    17303596    G    T
22    rs175140    0    17306104    G    T
22    rs175147:17309362:G:T    0    17309362    G    T
22    rs12628206:17316990:T:G    0    17316990    G    T
22    rs7510758:17432482:T:G    0    17432482    G    T
22    rs4819923:17433210:T:G    0    17433210    G    T

---------- Post updated at 11:59 ---------- Previous update was at 11:57 ----------

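The pattern above may give a false match if one field is empty and the other has two characters, because the concatenation is then still two characters long. To guard against that, try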
Code:
awk '$5 "," $6 ~ "^.,.$"' file
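
To see the difference between the two patterns, here is a small sketch with a made-up line that has only five fields, so `$6` is empty and `$5` holds two characters. The function names `match_concat` and `match_comma` and the sample line are assumptions for illustration, not part of the thread's data.

```shell
# First suggestion: plain concatenation of fields 5 and 6.
match_concat() { awk '$5 $6 ~ "^..$"'; }

# Second suggestion: a comma between the fields pins one character to each side.
match_comma() { awk '$5 "," $6 ~ "^.,.$"'; }

# Made-up short line: five fields only, field 5 is "GT", field 6 is missing.
bad_line='22 rsX 0 100 GT'

printf '%s\n' "$bad_line" | match_concat   # prints the line (false match)
printf '%s\n' "$bad_line" | match_comma    # prints nothing
```

The comma works here because it cannot occur inside a whitespace-separated single-letter field in this data; any character that never appears in columns 5 and 6 would serve the same purpose.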

# 3  
Old 03-14-2016
Or
Code:
awk 'length($5)==1 && length($6)==1' in.file > out.file
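
A self-contained way to try this, assuming the data is tab-separated as in the original post: the sketch below feeds three of the sample lines to the same length test. The `-F'\t'` option and the function name `single_char_alleles` are additions for illustration; with awk's default whitespace splitting the result on this data is the same.

```shell
# Hypothetical wrapper: keep lines whose 5th and 6th tab-separated
# fields are each exactly one character long. Reads files or stdin.
single_char_alleles() {
    awk -F'\t' 'length($5) == 1 && length($6) == 1' "$@"
}

# Three lines from the sample data: only the second has single-character
# values in both columns 5 and 6.
{
    printf '22\trs149602065:40537763:TTTTTTG:T\t0\t40537763\tT\tTTTTTTG\n'
    printf '22\trs7291429\t0\t17294251\tG\tT\n'
    printf '22\t22:35645163:T:<CN0>:0\t0\t35645163\tT\t<CN0>\n'
} | single_char_alleles
```

Only the `rs7291429` line is printed. Because each field is measured on its own, a short line with an empty column 6 cannot slip through the way it can with plain concatenation.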

This User Gave Thanks to MadeInGermany For This Post:
# 4  
Old 03-14-2016

Great solutions! (so simple, as usual!)
Many thanks!
 