

get part of file with unique & non-unique string


 
# 1  
Old 09-17-2009

I have an archive file that holds a batch of statements. I would like to extract a single statement based on its unique customer # (e.g. 123456). The end of each statement is marked by "ENDSTM".

I can find the line number for the beginning of the statement section with sed.

Code:
start=`sed -n '/123456/=' filename`

This gets me the line number of the line that holds the unique customer #. The statement actually begins 5 lines before that line. From here I would like to find the next occurrence of "ENDSTM" after the line where the customer # was found. Then I could use sed to grab the section of the file I need based on the starting and ending line #s.
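The line-number approach described above could be sketched roughly as follows. This is only a sketch under assumptions: `extract_statement` and its arguments are names made up for illustration, the customer number is assumed to contain no characters that are special in a sed regular expression, and the first matching line is assumed to be the one wanted.

```shell
#!/bin/sh
# Hypothetical helper: extract_statement CUSTOMER FILE [LINES_BEFORE]
# Prints from LINES_BEFORE lines above the customer line (default 5)
# through the next ENDSTM line. Variables are global (plain POSIX sh).
extract_statement() {
    cust=$1 file=$2 before=${3:-5}

    # Line number of the first line containing the customer number.
    start=$(sed -n "/$cust/{=;q;}" "$file")
    [ -n "$start" ] || return 1

    # Line number of the first ENDSTM after that line.
    end=$(awk -v s="$start" 'NR > s && /ENDSTM/ { print NR; exit }' "$file")
    [ -n "$end" ] || return 1

    # Back up over the header lines, but never before line 1.
    first=$((start - before))
    [ "$first" -ge 1 ] || first=1

    # Print the range first..end.
    sed -n "${first},${end}p" "$file"
}
```

Called as `extract_statement 123456 filename`, this reads the file three times, which is usually fine for an archive of modest size; a single-pass alternative would have to buffer lines instead.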

How would I do this? grep, sed, awk?

Or is there a way to get what I need with one command?

Thanks,
Andrew
# 2  
Old 09-17-2009
Quote:
Originally Posted by andrewsc
...
Or is there a way to get what I need with one command?
...
Yes, here's one way to do it with Perl:

Code:
$
$ cat -n f1
     1  456789
     2  stm1 - line 1
     3  stm1 - line 2
     4  stm1 - line 3
     5  ENDSTM
     6
     7  123456
     8  stm2 - line 1
     9  stm2 - line 2
    10  stm2 - line 3
    11  stm2 - line 4
    12  ENDSTM
    13
    14  345678
    15  stm3 - line 1
    16  stm3 - line 2
    17  ENDSTM
    18
    19  567890
    20  stm4 - line 1
    21  stm4 - line 2
    22  stm4 - line 3
    23  stm4 - line 4
    24  ENDSTM
    25
$
$ perl -ne 'BEGIN{undef $/} {/.*123456.(.*?)ENDSTM/s && print $1}' f1
stm2 - line 1
stm2 - line 2
stm2 - line 3
stm2 - line 4
$
$

You can also use awk on that data file:

Code:
$
$ ##
$ awk 'BEGIN{x=0}
>      /123456/ {x=1; getline}
>      /ENDSTM/ && x==1 {x=0}
>      x==1 {print}' f1
stm2 - line 1
stm2 - line 2
stm2 - line 3
stm2 - line 4
$
$
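For comparison, a sed range address can do much the same job in one command. The sketch below is not from the thread: the function names are made up for illustration, and it assumes the customer number is safe to drop into a sed regular expression. Note that sed prints every range that starts with a matching line, so a number that appears again later in the file would produce more than one statement.

```shell
#!/bin/sh
# Hypothetical helper: print_statement CUSTOMER FILE
# Prints the customer line through the next ENDSTM line, both included.
print_statement() {
    sed -n "/$1/,/ENDSTM/p" "$2"
}

# Hypothetical helper: print_statement_body CUSTOMER FILE
# Same range, but the customer line and the ENDSTM line are dropped,
# which mirrors the output of the awk command above.
print_statement_body() {
    sed -n "/$1/,/ENDSTM/{/$1/d;/ENDSTM/d;p;}" "$2"
}
```

With the sample file, `print_statement_body 123456 f1` should print the four "stm2" lines, and `print_statement 123456 f1` adds the "123456" and "ENDSTM" lines around them.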

HTH,
tyler_durden

Last edited by durden_tyler; 09-17-2009 at 01:21 PM..
# 3  
Old 09-17-2009
Quote:
Originally Posted by durden_tyler
...
You can also use awk on that data file:

Code:
$
$ ##
$ awk 'BEGIN{x=0}
>      /123456/ {x=1; getline}
>      /ENDSTM/ && x==1 {x=0}
>      x==1 {print}' f1
stm2 - line 1
stm2 - line 2
stm2 - line 3
stm2 - line 4
$
$

The awk gets the page(s) I need. However, on the first page I actually need to start 5 lines above the "123456" (6 including that line).
# 4  
Old 09-17-2009
Quote:
Originally Posted by andrewsc
... However, on the first page I actually need to start 5 lines above the "123456" (6 including that line).
So in case of file f1, you want to start from line (7-5=) 2:

Code:
$
$ cat -n f1
     1  456789
     2  stm1 - line 1
     3  stm1 - line 2
     4  stm1 - line 3
     5  ENDSTM
     6
     7  123456
     8  stm2 - line 1
     9  stm2 - line 2
    10  stm2 - line 3
    11  stm2 - line 4
    12  ENDSTM
    13
    14  345678
    15  stm3 - line 1
    16  stm3 - line 2
    17  ENDSTM
    18
    19  567890
    20  stm4 - line 1
    21  stm4 - line 2
    22  stm4 - line 3
    23  stm4 - line 4
    24  ENDSTM
    25
$
$
$ ##
$ perl -ne 'BEGIN{undef $/} {/.*\n(([^\n]*\n){5}123456.*?)ENDSTM/s && print $1}' f1
stm1 - line 1
stm1 - line 2
stm1 - line 3
ENDSTM
 
123456
stm2 - line 1
stm2 - line 2
stm2 - line 3
stm2 - line 4
$
$

tyler_durden
# 5  
Old 09-17-2009
The Perl code does work for me. Can the awk be modified to start the section 5 rows earlier, and also to include the last line, the one with ENDSTM?

Thanks
# 6  
Old 09-17-2009
Quote:
Originally Posted by andrewsc
the perl code does work for me. Can the awk be modified to grab the section 5 rows earlier? Also, include the last line that has the ENDSTM.
...
Sorry, flexibility of regular expressions is one of the (many) reasons I tend to favor Perl. :)

Anyway -

Code:
$
$ ##
$ awk 'BEGIN {x = -1}
>      /123456/ && x<0 {x=0; n=NR}
>      /ENDSTM/ && x==0 {x=1; e=NR; s[NR]=$0}
>      x<1 {s[NR]=$0}
>      END {if (x==1) for (i=(n>5 ? n-5 : 1); i<=e; i++) print s[i]}' f1
stm1 - line 1
stm1 - line 2
stm1 - line 3
ENDSTM
 
123456
stm2 - line 1
stm2 - line 2
stm2 - line 3
stm2 - line 4
ENDSTM
$
$

tyler_durden
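
If the archive is large, keeping every line in an array until the match may be wasteful. A possible alternative, sketched here under assumptions, is to remember only the last few lines in a small ring buffer and stop reading as soon as ENDSTM has been printed. The function name `statement_with_context` and its arguments are made up for illustration, and the customer number is matched as a plain substring rather than a regular expression.

```shell
#!/bin/sh
# Hypothetical helper: statement_with_context CUSTOMER FILE [LINES_BEFORE]
# Prints LINES_BEFORE lines of leading context (default 5), the customer
# line, and everything up to and including the next ENDSTM line.
statement_with_context() {
    awk -v cust="$1" -v before="${3:-5}" '
        # First line containing the customer number: replay the buffer.
        !found && index($0, cust) {
            found = 1
            first = NR - before
            if (first < 1) first = 1      # fewer lines available than asked for
            for (i = first; i < NR; i++) print buf[i % before]
        }
        # Inside the statement: print, and stop after the end marker.
        found { print; if (/ENDSTM/) exit; next }
        # Before the statement: keep only the most recent lines.
        before > 0 { buf[NR % before] = $0 }
    ' "$2"
}
```

Run as `statement_with_context 123456 f1`, this should give the same eleven lines as the awk command above while holding at most five lines in memory.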
