Sponsored Content
Top Forums Shell Programming and Scripting Search values between ranges in File1 within File2 Post 302282516 by Corona688 on Saturday 31st of January 2009 12:39:35 PM
Old 01-31-2009
Things like awk and grep use string matching, which are great for matching strings. Matching a range of numbers is not a string matching problem, since it means needing to actually process and compare the value of the numbers, not the strings representing them.

I'm sure better perl coders than me could shrink this program down to three lines of garbage, but this seems to work at least.

Code:
#!/usr/bin/perl

my $arr=[];
open FILE1, "file1";
# Get a list of number ranges, store it in an array of arrays
while($line=<FILE1>)
{
        if($line =~ /([0-9]+)[ \t]*([0-9]+)/)
        {
                $arr[++$#arr]=[$1, $2];
        }
}

close FILE1;

# Read in lines from file2
open FILE2, "file2";
while($line=<FILE2>)
{
        $match=0;

# Extract all numbers from $line
        while(($match == 0) && ($line =~ /([0-9]+)/g))
        {
# check each number against the ranges given
                for($n=0; ($n<$#arr)&&($match==0); $n++)
                {
# If it matches, stop hunting and print the line.
                        if(($arr[$n][0] <= $1) &&
                                ($arr[$n][1] >= $1))
                        {
                                print $line;
                                $match=1;
                        }
                }
        }
}

exit 0;

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Read each word from File1 and search each file in file2

file1: has all words to be searched. 100007 200999 299997 File2: has all file names to be searched. C:\search1.txt C:\search2.txt C:\search3.txt C:\search4.txt Outfile: should have all found lines. Logic: Read each word in file1 and search each file in the list of File2; if the... (8 Replies)
Discussion started by: clem2610
8 Replies

2. Shell Programming and Scripting

AWK: read values from file1; search for values in file2

I have read another post about this issue and am wondering how to adapt it to my own, much simpler, issue. I have a file of user IDs like so: 333333 321321 546465 ...etc I need to take each number and use it to print records wherein the 5th field matches the user ID pulled from the... (2 Replies)
Discussion started by: Bubnoff
2 Replies

3. Shell Programming and Scripting

Search & replace fields from file1 to file2

hi, I have two xml files with the name source.xml and tobe_replaced.xml. Sample data: source.xml contains: <?xml version="1.0"?> <product description="prod1" product_info="some/info"> <product description="prod2" product_info="xyz/allinfo"> <product description="abc/partialinfo"... (2 Replies)
Discussion started by: dragon.1431
2 Replies

4. Shell Programming and Scripting

Get values from different columns from file2 when match values of file1

Hi everyone, I have file1 and file2 comma separated both. file1 is: Header1,Header2,Header3,Header4,Header5,Header6,Header7,Header8,Header9,Header10 Code7,,,,,,,,, Code5,,,,,,,,, Code3,,,,,,,,, Code9,,,,,,,,, Code2,,,,,,,,,file2... (17 Replies)
Discussion started by: cgkmal
17 Replies

5. Shell Programming and Scripting

Remove lines in file1 with values from file2

Hello, I have two data files: file1 12345 aa bbb cccc 98765 qq www uuuu 76543 pp rrr bbbbb 34567 nn ccc sssss 87654 qq ppp rrrrr file2 98765 34567 I need to remove the lines from file1 if the first field contains a value that appears in file2: output 12345 aa bbb cccc 76543 pp... (2 Replies)
Discussion started by: palex
2 Replies

6. Shell Programming and Scripting

search from file1 and replace into file2

I have 2 files: file1.txt: 1|15|XXXXXX||9630716||0096000||30/04/2012|E|O|X||||20120525135617-30.04.2012|PAT66OLM|STA||||00001|STA_0096000_YYYPPPXTMEX00_20120525135617_02_P.pdf|... (2 Replies)
Discussion started by: pparthiv
2 Replies

7. UNIX for Dummies Questions & Answers

if matching strings in file1 and file2, add column from file1 to file2

I have very limited coding skills but I'm wondering if someone could help me with this. There are many threads about matching strings in two files, but I have no idea how to add a column from one file to another based on a matching string. I'm looking to match column1 in file1 to the number... (3 Replies)
Discussion started by: pathunkathunk
3 Replies

8. Shell Programming and Scripting

Search within file1 numbers from list in file2

Hello to all, I hope somebody could help me with this: I have this File1 (real has 5 million of lines): Number Category --------------- -------------------------------------- 8734060355 3 8734060356 ... (6 Replies)
Discussion started by: Ophiuchus
6 Replies

9. Shell Programming and Scripting

awk to search field2 in file2 using range of fields file1 and using match to another field in file1

I am trying to use awk to find all the $2 values in file2 which is ~30MB and tab-delimited, that are between $2 and $3 in file1 which is ~2GB and tab-delimited. I have just found out that I need to use $1 and $2 and $3 from file1 and $1 and $2of file2 must match $1 of file1 and be in the range... (6 Replies)
Discussion started by: cmccabe
6 Replies

10. Shell Programming and Scripting

Mapping the values of ids of two columns of file1 from file2

I have of two space separated files: ==> File1 <== PT|np_496075.1 st|K92748.1 st|K89648.1 PT|np_001300561.1 PT|np_497284.1 st|K90752.1 st|K90279.1 PT|np_740775.1 PT|np_497749.1 st|K90752.1 st|K92038.1 PT|np_490856.1 PT|np_497284.1 st|K90752.1 st|K88095.1 PT|np_494764.1 ==> File 2 <==... (2 Replies)
Discussion started by: sammy777888
2 Replies
CUT(1)							    BSD General Commands Manual 						    CUT(1)

NAME
cut -- select portions of each line of a file SYNOPSIS
cut -b list [-n] [file ...] cut -c list [file ...] cut -f list [-d delim] [-s] [file ...] DESCRIPTION
The cut utility selects portions of each line (as specified by list) from each file and writes them to the standard output. If no file argu- ments are specified, or a file argument is a single dash ('-'), cut reads from from the standard input. The items specified by list can be in terms of column position or in terms of fields delimited by a special character. Column numbering starts from 1. The list option argument is a comma or whitespace separated set of increasing numbers and/or number ranges. Number ranges consist of a num- ber, a dash ('-'), and a second number and select the fields or columns from the first number to the second, inclusive. Numbers or number ranges may be preceded by a dash, which selects all fields or columns from 1 to the first number. Numbers or number ranges may be followed by a dash, which selects all fields or columns from the last number to the end of the line. Numbers and number ranges may be repeated, over- lapping, and in any order. It is not an error to select fields or columns not present in the input line. The options are as follows: -b list The list specifies byte positions. -c list The list specifies character positions. -d delim Use the first character of delim as the field delimiter character instead of the tab character. -f list The list specifies fields, delimited in the input by a single tab character. Output fields are separated by a single tab character. -n Do not split multi-byte characters. -s Suppress lines with no field delimiter characters. Unless specified, lines with no delimiters are passed through unmodified. ENVIRONMENT
The LANG, LC_ALL and LC_CTYPE environment variables affect the execution of cut if the -n option is specified. Their effect is described in environ(7). EXAMPLES
Extract users' login names and shells from the system passwd(5) file as ``name:shell'' pairs: cut -d : -f 1,7 /etc/passwd Show the names and login times of the currently logged in users: who | cut -c 1-16,26-38 DIAGNOSTICS
The cut utility exits 0 on success, and >0 if an error occurs. SEE ALSO
paste(1) STANDARDS
The cut utility conforms to IEEE Std 1003.2-1992 (``POSIX.2''). HISTORY
A cut command appeared in AT&T System III UNIX. BUGS
The -c option is a synonym for the -b option, which causes incorrect behaviour in locales that support multibyte characters. When operating on fields (-f option is specified), cut does not recognise multibyte characters, and the delim character is recognised in the middle of multibyte sequences. BSD
June 6, 1993 BSD
All times are GMT -4. The time now is 02:11 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy