Grep in delimited file


 
# 1  
Old 07-06-2009
Grep in delimited file

Here is my file : (pipe delimited file)

abc|abc|11112|ffff|3333|oo
abc|abc|11112|ffff|3333|oo
abc|11112|ffff|2222|oo
abc|abc|11112|ffff|3333|oo
abc|22222|ffff|1111|oo

I need 2 things:
1. Need to grep for all rows where the 5th column is 3333. (I can do this using awk, but my actual file has around 18 million rows, so awk takes more than an hour, as 'awk' reads the file line by line.) So I am looking for a solution using grep.
2. Need to grep for all the lines having the pattern 'abc' in column 1 and 'ffff' in column 4.

Please let me know whether this looks feasible. Thanks a lot for helping me out!
# 2  
Old 07-06-2009
Quote:
Originally Posted by vinithepoo
Here is my file : (pipe delimited file)

abc|abc|11112|ffff|3333|oo
abc|abc|11112|ffff|3333|oo
abc|11112|ffff|2222|oo
abc|abc|11112|ffff|3333|oo
abc|22222|ffff|1111|oo

I need 2 things:
1. Need to grep for all rows where the 5th column is 3333. (I can do this using awk, but my actual file has around 18 million rows, so awk takes more than an hour, as 'awk' reads the file line by line.) So I am looking for a solution using grep.
What makes you think that grep does not read the file 'line by line'?
Quote:
Originally Posted by vinithepoo

2. Need to grep for all the lines having the pattern 'abc' in column 1 and 'ffff' in column 4.

Please let me know whether this looks feasible. Thanks a lot for helping me out!
It's feasible, and 'awk' would be one of the tools to use.
Please post your 'awk' code - let's see what you've got.
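
For reference, a minimal awk sketch of both requests, run against the sample rows from the first post. The temp file and the function names (col5_is_3333, col1_abc_col4_ffff) are made up for illustration. LC_ALL=C is an assumption: it often speeds up byte-wise matching, but how much it helps depends on the awk implementation and the system locale.

```shell
# Sample rows copied from the first post, written to a throwaway file.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
abc|abc|11112|ffff|3333|oo
abc|abc|11112|ffff|3333|oo
abc|11112|ffff|2222|oo
abc|abc|11112|ffff|3333|oo
abc|22222|ffff|1111|oo
EOF

# Request 1: rows whose 5th field is exactly 3333.
col5_is_3333() {
    LC_ALL=C awk -F'|' '$5 == "3333"' "$1"
}

# Request 2: rows whose 1st field is abc and whose 4th field is ffff.
col1_abc_col4_ffff() {
    LC_ALL=C awk -F'|' '$1 == "abc" && $4 == "ffff"' "$1"
}

col5_is_3333 "$sample"
col1_abc_col4_ffff "$sample"
```

With no action block, awk prints every line for which the condition is true, so there is no per-line shell work involved.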
# 3  
Old 07-10-2009
try this instead

Code:
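#!/usr/bin/perl
use strict;
use warnings;

# Print lines whose 1st field is 'abc' and whose 4th field is 'ffff'.
# The loop ends at end of input instead of spinning forever.
while ( my $line = <STDIN> ) {
    $line =~ s/[\r\n]+$//;    # strip trailing LF or CRLF
    # '|' is a regex metacharacter, so it must be escaped for split
    my @f = split /\|/, $line;
    next unless @f >= 4;
    if ( $f[0] eq 'abc' && $f[3] eq 'ffff' ) {
        print "$line\n";
    }
}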

-----------

Or with grep, maybe try

Code:
cat $yourfile|grep -i -e "ffff"|grep -i -e "abc"

This will show only the lines containing both ffff and abc (anywhere on the line, ignoring case); the Perl script I gave you is more controlled.

Hope it helps.


Quick question: why do you want to return all the information that is duplicated? Are you grabbing information from another field?
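
For comparison, a sketch of how grep alone could be tied to specific columns, since the two chained greps above match the strings anywhere on the line. The function names and the temp file are hypothetical, and whether this beats awk on an 18-million-row file is an assumption that would need timing on the real data.

```shell
# Sample rows copied from the first post.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
abc|abc|11112|ffff|3333|oo
abc|abc|11112|ffff|3333|oo
abc|11112|ffff|2222|oo
abc|abc|11112|ffff|3333|oo
abc|22222|ffff|1111|oo
EOF

# Unanchored: 'ffff' and 'abc' may sit in any column, so every sample row passes.
anywhere() {
    grep -i 'ffff' "$1" | grep -i 'abc'
}

# Field 5 is exactly 3333: skip four "field|" groups from the start of the line.
# [|] is a literal pipe; ([|]|$) means "followed by a pipe or end of line".
grep_col5_is_3333() {
    LC_ALL=C grep -E '^([^|]*[|]){4}3333([|]|$)' "$1"
}

# Field 1 is abc and field 4 is ffff.
grep_col1_abc_col4_ffff() {
    LC_ALL=C grep -E '^abc[|]([^|]*[|]){2}ffff([|]|$)' "$1"
}

grep_col5_is_3333 "$sample"
grep_col1_abc_col4_ffff "$sample"
```

Because `[^|]*` cannot cross a pipe, each repetition consumes exactly one field, which is what pins the match to a column.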

Last edited by vgersh99; 07-10-2009 at 08:13 PM.. Reason: code tags, PLEASE!
# 4  
Old 07-10-2009
Quote:
Originally Posted by sighK
Or with grep, maybe try
Code:
cat $yourfile|grep -i -e "ffff"|grep -i -e "abc"

Why exactly do you need a 'cat' here?
Why do you need '-e' for 'grep'?
# 5  
Old 07-10-2009
Sure.

cat dumps the entire file to standard output.

| is a pipe that pushes that output into another application's stdin.

grep -i -e

-i ignores case

-e "string to search for"

I pipe it into another instance of grep to search for the second string; this makes sure every line in the output contains both strings. As far as I know, grep has no notion of a field separator, so use Perl, Python, or even C if you want something faster than awk.

The Perl script I gave you should work with little modification for what you actually want. I made it with regard to what I thought you needed.
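
A small sketch of what these options do in practice. The three-row demo file is invented for illustration (it is not the poster's data) and the variable names are hypothetical.

```shell
# Made-up demo rows: mixed case, plus one field that starts with a dash.
demo=$(mktemp)
trap 'rm -f "$demo"' EXIT
cat > "$demo" <<'EOF'
abc|abc|11112|ffff|3333|oo
ABC|x|y|zzzz|3333|oo
-x|y|z|ffff|1|oo
EOF

# These three commands select the same lines; cat and -e add nothing here.
with_cat=$(cat "$demo" | grep -i -e "abc")
with_e=$(grep -i -e "abc" "$demo")
plain=$(grep -i "abc" "$demo")

# -e matters when the pattern itself starts with a dash...
dash=$(grep -e "-x" "$demo")

# ...or when several patterns are given, which are ORed, not ANDed.
either=$(grep -e "abc" -e "ffff" "$demo")

printf '%s\n' "$plain" "$dash" "$either"
```

Note that repeating -e widens the match (either string), whereas piping one grep into another narrows it (both strings).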
# 6  
Old 07-10-2009
Quote:
Originally Posted by sighK
cat dumps entire file to output

| is a pipe to push into another application's stdin
Once again WHY do you 'cat'? Read this url
Quote:
Originally Posted by sighK
grep -i -e

-i ignores case

-e "string to search for"
'-e' only introduces the pattern (it is '-E' that selects extended regular expressions). With a single pattern that does not start with a dash it is redundant - lose the '-e'.
Quote:
Originally Posted by sighK

I pipe it into another instance of grep to search for the second string; this makes sure every line in the output contains both strings. As far as I know, grep has no notion of a field separator, so use Perl, Python, or even C if you want something faster than awk.
You don't need 2 greps piping to each other - instead use ONE 'sed':
Code:
sed '/ffff/!d; /abc/!d'
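
A sketch of that sed one-liner on the sample rows from the first post, next to a column-anchored variant. The function names are hypothetical, and the anchored pattern assumes field 4 is always followed by another field (a trailing pipe).

```shell
# Sample rows copied from the first post.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
abc|abc|11112|ffff|3333|oo
abc|abc|11112|ffff|3333|oo
abc|11112|ffff|2222|oo
abc|abc|11112|ffff|3333|oo
abc|22222|ffff|1111|oo
EOF

# The one-liner above: drop lines lacking 'ffff', then lines lacking 'abc'.
# Both strings may appear in any column, so all five sample rows survive.
both_anywhere() {
    sed '/ffff/!d; /abc/!d' "$1"
}

# Column-anchored variant. In a basic regex a bare | is a literal pipe,
# so this reads: field 1 is abc, two arbitrary fields, field 4 is ffff.
sed_col1_abc_col4_ffff() {
    sed -n '/^abc|[^|]*|[^|]*|ffff|/p' "$1"
}

both_anywhere "$sample"
sed_col1_abc_col4_ffff "$sample"
```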
