parse by string and difference in substring???


 
# 1  
Old 07-01-2008

I have a big list like the following:

apple X:5_yes_a
apple X:12_no_b
apple X:45_yes_a
apple X:100_no_b
banana X:7_yes_a
banana X:13_yes_a
banana X:42_no_a
cat X:42_no_b
cat X:77_yes_d

I'd like to filter the file so that, for each $1 value, I keep only the lines where the number in $2 (after the ":" and before the first "_") is more than 10 greater than the value on the previous line.


e.g.

apple X:5_yes_a
apple X:45_yes_a
apple X:100_no_b
banana X:7_yes_a
banana X:42_no_a
cat X:42_no_b
cat X:77_yes_d

So for each new $1 I'd like to print the line, and for each repeated $1 I'd like to print the line only if the number in $2 between ":" and the next "_" on line X+1, minus the same number on line X, is > 10.

I guess the last line won't have an 'X+1' line???

It might be easiest to split $2 up first??

I have no idea where to start.
# 2  
Old 07-01-2008
With AWK (using nawk or /usr/xpg4/bin/awk on Solaris):

Code:
awk -F'[:_]' '$2>s+10||!_[$1]++;{s=$2}' file

# 3  
Old 07-01-2008
Wow!

Thank you thank you thank you.
# 4  
Old 07-01-2008
Can you explain to me how that script is working?

Thanks again
# 5  
Old 07-01-2008
Sure.
First, split the record on the field separator (FS), here any ":" or "_", so that the number ends up in $2:

Code:
awk -F'[:_]' ...

For each record evaluate the following expression:

Code:
$2>s+10||!_[$1]++

The first part is easy: it tests whether the current $2 is greater than the previous one + 10. The variable s is set in the action ({s=$2}), which runs after the expression has been tested, so during the test s still holds the value from the previous record. !array[string]++ is a common AWK idiom: it returns true when string is seen for the first time. It may be easier to understand like this:

Code:
!_[string]++ is _[string]++==0
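
To see that the two spellings behave the same, here is a small sketch that runs both on a made-up key list. The shell variables keys, first_seen and first_seen_long and the array name seen are only illustrative (they are not from the posts above); seen plays the role of _ in the one-liner.

```shell
# Hypothetical sample keys, one per line (not the data from the first post).
keys='apple
apple
banana
apple
cat
banana'

# Idiom form: the pattern is true only the first time a key shows up,
# because seen[$1]++ yields the old count (0) and then increments it.
first_seen=$(printf '%s\n' "$keys" | awk '!seen[$1]++')

# Spelled-out form: the same test written as an explicit comparison with 0.
first_seen_long=$(printf '%s\n' "$keys" | awk 'seen[$1]++ == 0')

printf '%s\n' "$first_seen"
```

Both commands print each key once, in order of first appearance.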

AWK auto-initializes variables to the empty string (in string context) or 0 (in numeric context). When it post-increments the array element for the first time, the expression yields the old value, 0 (see AWK associative arrays for more info on this). The || operator is logical OR.
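
For readers who find the one-liner too dense, the same logic can be written out in long form. This is only a sketch, run on the sample data from the first post; the names data, filtered, seen, prev, is_first and is_jump are made up here for readability.

```shell
# Sample data copied from the first post.
data='apple X:5_yes_a
apple X:12_no_b
apple X:45_yes_a
apple X:100_no_b
banana X:7_yes_a
banana X:13_yes_a
banana X:42_no_a
cat X:42_no_b
cat X:77_yes_d'

filtered=$(printf '%s\n' "$data" | awk -F'[:_]' '
{
    # With FS set to [:_], $1 is "apple X" and $2 is the number.
    is_first = (seen[$1]++ == 0)   # first record for this key?
    is_jump  = ($2 > prev + 10)    # more than 10 above the previous record?
    if (is_first || is_jump)
        print
    prev = $2                      # remember the value for the next record
}')

printf '%s\n' "$filtered"
```

On the sample data this prints the seven lines listed as the wanted result in the first post. One deliberate difference from the one-liner: here the counter is incremented on every record. In the one-liner, || short-circuits, so _[$1]++ is skipped whenever $2>s+10 is already true. The two versions can therefore disagree when the first record of a new key is itself more than 10 above the previous record, since the one-liner will then still treat the next record of that key as "first".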

Hope this helps.

Last edited by radoulov; 07-01-2008 at 04:47 PM..