Better and efficient way to reverse search a file for first matched line number.


 
# 8  
Old 10-07-2011
Quote:
Originally Posted by alister
You are mistaken. Take a close look at the original code provided by the OP. It counts from the end of the file until the first match from the end.

Any approach that begins at the beginning of the file would have to read every line in the file for two reasons:

1) To be certain that the last match (first from the end of the file) has been found
2) The total number of lines in the file would be necessary to calculate a line number indexed from the end of the file (NR - x + 1).

Regards,
Alister
What I actually pointed out was sed's "q" command: it takes a single-line address, and once the addressed line is reached, sed terminates. So sed with q alone must do less work than having tail read the file and feed a pipe.

Code:
# wc -l <infile
   83678

# grep -n 'Start' infile
1001:Start

Code:
# time tail infile | sed -n '/Start/{=;q;}'

real    0m0.072s
user    0m0.019s
sys     0m0.046s

Code:
# time sed -n '/Start/{=;q;}' infile
1001

real    0m0.037s
user    0m0.010s
sys     0m0.022s

regards
ygemici
# 9  
Old 10-07-2011
For some reason awk/gawk solution is not working on my Solaris8.
Code:
 
$> vStart='Started'; awk -vn=$vStart '$0~n{p=NF+1} END{print p}' abc.log
awk: syntax error near line 1
awk: bailing out near line 1
 
$> vStart='Started';nawk -vn=$vStart '$0~n{print NR;exit}' abc.log
nawk: empty regular expression
 input record number 1, file abc.log
 source line number 1

Thanks to Alister,
With this I am getting what I wanted.
Code:
 
$> tail -r abc.log | sed -n '/Started/{=;q;}' 
2


Just wondering: sed has a strange way of reversing a file.
Code:
 
sed -n '1!G;h;$p'

Can we replace 'tail -r <file>' with sed's reversing and printing matched line#?
Code:
 
$> sed -n -e '1!G;h;$p; /Started/{=;q}' abc.log
sed: command garbled: 1!G;h;$p; /Started/{=;q}

ygemici, thanks for the suggestions. If this were not a reverse search, I agree 'sed -n '/Start/{=;q;}' infile' would be fastest; there is no point in using tail for a regular forward search. For some reason I could not get the awk/gawk solutions to work, otherwise I would have run the timing tests myself.
# 10  
Old 10-07-2011
Quote:
Originally Posted by ygemici
Code:
# time tail infile | sed -n '/Start/{=;q;}'

real    0m0.072s
user    0m0.019s
sys     0m0.046s

Code:
# time sed -n '/Start/{=;q;}' infile
1001

real    0m0.037s
user    0m0.010s
sys     0m0.022s

What's the point of comparing the times of two commands when one of the commands gives an incorrect result? In that comparison, only the tail|sed pipeline gives the desired result.

Regards,
Alister


---------- Post updated at 09:29 AM ---------- Previous update was at 09:24 AM ----------

Quote:
Originally Posted by kchinnam
For some reason awk/gawk solution is not working on my Solaris8.
None of the awk solutions are equivalent to your original code.
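
For comparison, a forward awk pass that is equivalent to the original code has to remember the last match and wait for the total line count, using the NR - x + 1 calculation quoted earlier in the thread. The following is only a sketch: the function name last_match_from_end is hypothetical, and it assumes an awk that accepts -v with a space before the assignment (on Solaris that would presumably be nawk rather than the old /usr/bin/awk, which may also explain the errors reported above).

```shell
# Hypothetical helper: print the line number, counted from the end of the
# file, of the last line matching an extended regex. Prints nothing when
# there is no match.
last_match_from_end() {
    # $1 = pattern (ERE), $2 = file
    # x keeps the line number of the most recent match; in END, NR holds
    # the total number of lines, so NR - x + 1 is the index from the end.
    awk -v pat="$1" '$0 ~ pat { x = NR } END { if (x) print NR - x + 1 }' "$2"
}
```

Unlike the tail -r pipeline, this always reads the whole file, which is the cost described in the quoted post.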


Quote:
Originally Posted by kchinnam
Thanks to Alister,
With this I am getting what I wanted.
Code:
 
$> tail -r abc.log | sed -n '/Started/{=;q;}' 
2

You're welcome.


Quote:
Originally Posted by kchinnam
Just wondering: sed has a strange way of reversing a file.
Code:
 
sed -n '1!G;h;$p'

Can we replace 'tail -r <file>' with sed's reversing and printing matched line#?
No, you cannot. The line number emitted by the = command would be incorrect.

The tail|sed pipeline is very easy to understand. Why would you want to replace it with a single command whose meaning isn't immediately apparent even to a competent unix admin? Saving one fork-exec is not worth it (unless you're running something very many times in a tight loop). Please be kind to whoever has to look at this code in the future ... it could be you. :)
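
If the pipeline has to run on systems without tail -r, one option is a small wrapper. This is a sketch under assumptions: the function name reverse_match_lineno is hypothetical, tac is assumed to be the reversing tool where it is installed (for example GNU coreutils), and the pattern is assumed to be a basic regex containing no "/".

```shell
# Hypothetical wrapper: print how many lines from the end of the file the
# last match is (1 = last line). Prints nothing when there is no match.
reverse_match_lineno() {
    # $1 = pattern (basic regex without '/'), $2 = file
    if command -v tac >/dev/null 2>&1; then
        tac "$2"        # reverse the file with tac where it exists
    else
        tail -r "$2"    # otherwise fall back to tail -r (Solaris, BSD)
    fi | sed -n "/$1/{=;q;}"   # print the first matching line number, then quit
}
```

The sed part is unchanged from the pipeline above, so the wrapper only decides which tool reverses the file.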

Regards,
Alister

Last edited by alister; 10-07-2011 at 11:58 AM..
# 11  
Old 10-07-2011
I totally agree with Alister about choosing easier-to-understand code, even at the expense of some CPU cycles. I am good now.
# 12  
Old 10-07-2011
Quote:
Originally Posted by alister
What's the point of comparing the times of two commands when one of the commands gives an incorrect result? In that comparison, only the tail|sed pipeline gives the desired result.

Regards,
Alister
The tail|sed pipeline prints nothing if your pattern is not in the last 10 lines.
Code:
# sed -n '/Start/{=;q;}' infile
1001

Code:
# tail infile|sed -n '/Start/{p;q;}'

I meant that if the pattern is near the beginning of the file, tail is pointless, because sed already quits at the first match.
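
The difference between the two commands can be reproduced on a small generated file. This is a sketch with made-up data: the temporary file and the variable names demo, forward and last10 are hypothetical, and the only match is placed on line 5 of 100.

```shell
# Build a 100-line test file whose only "Start" line is line 5.
demo=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100; i++) print (i == 5 ? "Start" : "line " i) }' > "$demo"

# Forward search: sed quits at the first match and reports its line number.
forward=$(sed -n '/Start/{=;q;}' "$demo")

# tail without options passes on only the last 10 lines, so sed never sees
# the match and prints nothing.
last10=$(tail "$demo" | sed -n '/Start/{=;q;}')

echo "forward: ${forward:-none}  tail|sed: ${last10:-none}"
rm -f "$demo"
```

Neither command answers the original question here, since neither counts from the end of the whole file.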

Code:
What I actually pointed out was sed's "q" command: it takes a single-line address,
and once the addressed line is reached, sed terminates. So sed with q alone
must do less work than having tail read the file and feed a pipe.

That is why I compared them. :)

regards
ygemici

---------- Post updated at 05:15 PM ---------- Previous update was at 04:54 PM ----------

Quote:
Originally Posted by kchinnam
For some reason awk/gawk solution is not working on my Solaris8.

Just wondering: sed has a strange way of reversing a file.
Code:
 
sed -n '1!G;h;$p'

Can we replace 'tail -r <file>' with sed's reversing and printing matched line#?
Code:
 
$> sed -n -e '1!G;h;$p; /Started/{=;q}' abc.log
sed: command garbled: 1!G;h;$p; /Started/{=;q}

ygemici, thanks for the suggestions. If this were not a reverse search, I agree 'sed -n '/Start/{=;q;}' infile' would be fastest; there is no point in using tail for a regular forward search. For some reason I could not get the awk/gawk solutions to work, otherwise I would have run the timing tests myself.
Code:
# sed -n '1!G;h;$p'

This code looks correct, but sed has to run the "1!G;h" commands on every line, so it reads the whole file and accumulates it in the hold space, which takes a lot of memory. It should not be used for large files.
For smaller files, and if you are not in a hurry, you can use it like this: :)
Code:
# sed -n '1!G;h;$p' infile | sed -n '/121/{=;q;}'

regards
ygemici