Search within file1 numbers from list in file2


 
# 1  
Old 09-07-2012

Hello to all,

I hope somebody can help me with this.

I have this File1 (the real one has 5 million lines):
Code:
Number          Category                               
--------------- -------------------------------------- 
8734060355                                           3 
8734060356                                           2 
8734079900                                           5 
87342060002                                          1 
87342060004                                          1 
87342440000                                          9 
87342440001                                          7 
87342440003                                          7

and this is File2 (it has fewer lines than File1):
Code:
8734060356
8734079909
87342060002
87342440000
87342440007

File1 contains the whole universe of data, but File2 doesn't contain all of the numbers.

The desired output is:
Code:
8734060356                                           2 
8734079909                                           Not found in File1
87342060002                                          1 
87342440000                                          9 
87342440007                                          Not found in File1

So:
- if the number in File2 is found in File1, print the number and its category.
- if the number in File2 is not found in File1, print the number and "Not found".

I hope this is not too complex for an awk expert.

Thanks for your help.
# 2  
Old 09-07-2012
Hi


Code:
$ awk 'NR==FNR{a[$0];next}FNR>2{if($1 in a)$0=sprintf("%-11s%60s",$1,"Not found in File1");}1' file2 file1
Number          Category
--------------- --------------------------------------
8734060355                                           3
8734060356                                           Not found in File1
8734079900                                           5
87342060002                                          Not found in File1
87342060004                                          1
87342440000                                          Not found in File1
87342440001                                          7
87342440003                                          7

Guru.
# 3  
Old 09-07-2012
Hello guruprasadpr,

Thank you for your help. It's close, but what is needed is to look up the numbers from file2 in file1, so the output should be:
Code:
8734060356                                           2 
8734079909                                           Not found in File1
87342060002                                          1 
87342440000                                          9 
87342440007                                          Not found in File1

# 4  
Old 09-07-2012
This might hog some memory:
Code:
awk 'FNR==NR{if(FNR>2) a[$1]=$2;next}
{$0=$0 "\t" (($1 in a)?a[$1]:"Not found in File1")}1' file1 file2

Try it and let me know.
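
If holding all 5 million File1 lines in an array is a concern, one possible variation is to load File2 instead and stream File1 past it. This is only a sketch, and it assumes File2 is the smaller file, as the first post says. The function name `lookup_categories` and the scratch directory are made up for illustration. The two header lines of File1 are skipped as in the command above.

```shell
#!/bin/sh
# Sketch: keep only the numbers from file2 in memory, then stream file1 once.
# lookup_categories is a hypothetical helper name, not from the thread.
lookup_categories() {
    # $1 = file1 (with 2 header lines), $2 = file2 (list of numbers)
    awk '
        NR == FNR { order[++n] = $1; want[$1]; next }  # file2: remember numbers in input order
        FNR > 2 && ($1 in want) { found[$1] = $2 }     # file1: skip headers, keep wanted categories
        END {
            for (i = 1; i <= n; i++) {
                num = order[i]
                print num "\t" ((num in found) ? found[num] : "Not found in File1")
            }
        }
    ' "$2" "$1"
}

# Sample data copied from the first post, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
Number          Category
--------------- --------------------------------------
8734060355                                           3
8734060356                                           2
8734079900                                           5
87342060002                                          1
87342060004                                          1
87342440000                                          9
87342440001                                          7
87342440003                                          7
EOF
cat > "$tmp/file2" <<'EOF'
8734060356
8734079909
87342060002
87342440000
87342440007
EOF

lookup_categories "$tmp/file1" "$tmp/file2"
```

The output keeps File2's original order, and memory use grows with the size of File2 rather than File1. The sketch assumes File2 is not empty, because `NR == FNR` would otherwise also be true for File1.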
# 5  
Old 09-07-2012
If both files are sorted and have no headers, and the second field of file1 is never empty:
Code:
join -a 2 -e 'Not found in File1' -o 2.1,1.2 file1 file2 
8734060356 2
8734079909 Not found in File1
87342060002 1
87342440000 9
87342440007 Not found in File1

With a 2-line header in file1:
Code:
join -a 2 -e 'Not found in File1' -o 2.1,1.2 <(tail -n +3 file1) file2
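
Because join compares keys as strings, both inputs need to be in the order that sort produces in the same locale, not in numeric order. With 10- and 11-digit numbers mixed together the two orders can differ. Here is a minimal sketch of sorting first. It assumes it is acceptable for the output to come out in sorted order rather than in File2's original order, and it uses temporary files instead of process substitution so that it also runs in plain sh. The function name `join_lookup` and the sample numbers are made up for illustration.

```shell
#!/bin/sh
# Sketch: sort both inputs the way join expects before joining.
# join_lookup is a hypothetical helper name, not from the thread.
join_lookup() {
    # $1 = file1 (with 2 header lines), $2 = file2 (list of numbers)
    work=$(mktemp -d)
    # Drop the 2 header lines, then sort on the first field in the C locale.
    tail -n +3 "$1" | LC_ALL=C sort -k1,1 > "$work/f1.sorted"
    LC_ALL=C sort -k1,1 "$2" > "$work/f2.sorted"
    # Same join options as above; the locale must match the one used by sort.
    LC_ALL=C join -a 2 -e 'Not found in File1' -o 2.1,1.2 \
        "$work/f1.sorted" "$work/f2.sorted"
    rm -r "$work"
}

# Made-up sample in numeric (not string) order to show why sorting matters.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
Number          Category
--------------- --------------------------------------
8735000000                                           4
87342060002                                          1
87342440000                                          9
EOF
cat > "$tmp/file2" <<'EOF'
8735000000
87342060002
87342440007
EOF

join_lookup "$tmp/file1" "$tmp/file2"
```

In this made-up sample the 10-digit number 8735000000 is numerically the smallest but sorts last as a string, which is the order join needs.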

# 6  
Old 09-08-2012
Thank you, elixir_sinari. It works perfectly! The only thing I don't understand is why you use
if(FNR>2).

Thanks for your help, delugeag. I'm not sure why it isn't working for me, though. It only prints:
Code:
$ join -a 2 -e 'Not found in File1' -o 2.1,1.2 <(tail -n +3 file1) file2
       Not found in File1
       Not found in File1
       Not found in File1
       Not found in File1
87342440007 Not found in File1
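
Output like this, where the number at the start of a line seems to vanish, is what a terminal typically shows when the lines of file2 end in a carriage return (for example, a file saved on Windows). The key then no longer matches anything in file1, and the CR moves the cursor back to the first column so the rest of the line overwrites the number. The thread never confirms the cause, so treat this as an assumption to check rather than a diagnosis. A minimal sketch of checking for and stripping CRs before the join, with made-up file names (`file1.body`, `file2.dos`, `file2.unix`) and contents:

```shell
#!/bin/sh
# Sketch: detect and strip carriage returns before running join.
tmp=$(mktemp -d)

# Made-up inputs: file1.body stands for file1 without its 2 header lines.
printf '8734060356 2\n87342060002 1\n' > "$tmp/file1.body"
# file2.dos simulates a list saved with CRLF line endings.
printf '8734060356\r\n87342060002\r\n87342440007\r\n' > "$tmp/file2.dos"

# Count lines that contain a CR; anything above 0 suggests CRLF endings.
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$tmp/file2.dos")
echo "lines with CR: $crlf_lines"

# Remove every CR, then join as before.
tr -d '\r' < "$tmp/file2.dos" > "$tmp/file2.unix"
join -a 2 -e 'Not found in File1' -o 2.1,1.2 "$tmp/file1.body" "$tmp/file2.unix"
```

If the count is 0 for the real file2, the cause lies elsewhere and this sketch does not apply.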

# 7  
Old 09-08-2012
Quote:
Originally Posted by Ophiuchus
Thank you, elixir_sinari. It works perfectly! The only thing I don't understand is why you use
if(FNR>2).
That's because you mentioned a header of 2 lines in file1.
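
To see what that test does: FNR is the line number within the current file, while NR counts lines across all files read so far. So FNR==NR is true only while awk reads the first file, and FNR>2 skips that file's first two lines. A small illustration, with made-up file names and contents (`show_counters` is a hypothetical helper):

```shell
#!/bin/sh
# Sketch: print NR and FNR for two tiny made-up files.
tmp=$(mktemp -d)
printf 'Number Category\n------ --------\n111 3\n' > "$tmp/a.txt"
printf '111\n222\n' > "$tmp/b.txt"

show_counters() {
    # Columns printed: NR, FNR, which file, kind of line, first field.
    awk '{
        where = (FNR == NR) ? "first" : "second"
        kind  = (FNR == NR) ? ((FNR > 2) ? "data" : "header") : "key"
        print NR, FNR, where, kind, $1
    }' "$tmp/a.txt" "$tmp/b.txt"
}

show_counters
```

On the fourth input line NR is 4 but FNR has restarted at 1, which is how awk can tell that it has moved on to the second file.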