find values between values in two different fields
Post 302506765 by redse171 on Monday 21st of March 2011 11:24:55 PM

Hi,

I need help finding the values that lie between CD rows ($1 = CD) with consecutive NUM labels ($6), within the same ID. The result should list the values that fall between the end ($3) of one CD row and the start ($2) of the next CD row in the data.txt file below.

data.txt
Code:
  ex    139    142    Sc_1000004    ID 4
  CD    139    142    Sc_1000004    ID 4    Num1
  sta   139    140    Sc_1000004
  ex    143    144    Sc_1000004    ID 4
  CD    148    150    Sc_1000004    ID 4    Num2
  ex    153    156    Sc_1000004    ID 4
  CD    153    156    Sc_1000004    ID 4    Num3
  sto   156    158    Sc_1000004

  ex    160    163    Sc_1000005    ID 5
  CD    160    163    Sc_1000005    ID 5    Num1
  sta   160    161    Sc_1000005
  ex    167    170    Sc_1000005    ID 5
  CD    167    170    Sc_1000005    ID 5    Num2
  ex    175    205    Sc_1000005    ID 5
  CD    175    205    Sc_1000005    ID 5    Num3
  sto   205    207    Sc_1000005

  ex    212    221    Sc_1000006    ID 6
  CD    212    221    Sc_1000006    ID 6    Num2
  sto   212    215    Sc_1000006
  ex    224    227    Sc_1000006    ID 6
  CD    224    227    Sc_1000006    ID 6    Num1
  sta   227    229    Sc_1000006

  ex    243    248    Sc_1000007    ID 7
  CD    243    248    Sc_1000007    ID 7    Num1
  sta   243    243    Sc_1000007
  ex    251    257    Sc_1000007    ID 7
  CD    251    257    Sc_1000007    ID 7    Num2
  ex    261    263    Sc_1000007    ID 7
  CD    261    263    Sc_1000007    ID 7    Num3
  sto   263    265    Sc_1000007

  ex    275    288    Sc_1000008    ID 8
  CD    275    288    Sc_1000008    ID 8    Num1
  sta   275    277    Sc_1000008

I want output like this:
Code:
  NewVal    143 - 147     ID 4
  NewVal    151 - 152     ID 4
  NewVal    164 - 166     ID 5
  NewVal    171 - 174     ID 5
  NewVal    222 - 223     ID 6
  NewVal    249 - 250     ID 7
  NewVal    258 - 260     ID 7

In the above output, for example, "143 - 147" are the values between the CD rows for Num1 and Num2, while "151 - 152" are the values between the CD rows for Num2 and Num3, for ID 4 in the input file (data.txt), and so on. If an ID has only one NUM (such as Num1 for ID 8), meaning only one CD exists for that ID, then no NewVal is extracted.

I have thousands of values that I need to extract from hundreds of files like this. I would appreciate your help or advice on doing this in awk or sed. Thanks.
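One possible awk sketch of the rule described above, offered as a starting point rather than a definitive answer. It assumes the input is split on whitespace (awk's default), so "ID 4" becomes two fields and the ID number lands in $6; the Num label is then not needed, because the gap is taken between consecutive CD rows of the same ID in file order. Each gap is reported as the previous CD end + 1 through the next CD start - 1, which is the rule the ID 4 rows follow. Under that rule the second gap for ID 5 comes out as 171 - 174 and the second gap for ID 7 as 258 - 260. The function name find_gaps is made up for illustration.

```shell
#!/bin/sh
# find_gaps (hypothetical name): print the gap between consecutive CD rows
# that share an ID. Reads the named files, or stdin when none are given.
find_gaps() {
    awk '
    # Forget the previous CD row at the start of every input file.
    FNR == 1 { prev_id = "" }

    $1 == "CD" {
        id = $6    # assumption: default whitespace splitting, so $5 is "ID"
        # Report only when the previous CD row has the same ID
        # and at least one value lies strictly between the two rows.
        if (id == prev_id && $2 > prev_end + 1)
            printf "NewVal    %d - %d     ID %s\n", prev_end + 1, $2 - 1, id
        prev_id  = id
        prev_end = $3
    }' "$@"
}
```

Calling `find_gaps data.txt` prints the NewVal lines for one file, and `find_gaps *.txt` handles many files in one pass. An ID with a single CD row, such as ID 8, produces no output because there is no earlier CD row with the same ID to compare against.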
 
