Using an input (?) file to search another


 
# 1  
Old 10-05-2010
I have a file (DCN.txt) that has about 35000 lines. It looks like:
Code:
10004470028
10005470984
10006470301
10007474812
....

I have several other files (a11.txt, a12.txt, a12_1.txt, a13.txt, etc.; about 70 of them, each around 100 MB) that contain history records like this:
Code:
LINE	10005470984	01/06/2010 07:50:31	BKGND	STORE	Store:0	ProvStNum	0	  500
LINE	10005470984	01/06/2010 07:50:31	BKGND	STORE	Store:0	ProvStAddress	0	MAIN ST                
LINE	10005470984	01/06/2010 07:50:31	BKGND	STORE	Store:0	ProvLicNo	0	535193      
LINE	10005470984	01/06/2010 07:50:31	BKGND	STORE	Store:0	ProvLicState	0	TN
LINE	10005470984	01/06/2010 07:50:31	BKGND	STORE	Store:0	ProvSpecCode	0	000
....

What I want to do is find the "ProvLicNo" for every value in the DCN.txt file. So for the above examples, I would like the output as:
Code:
10005470984 ProvLicNo 535193
10004470028 Prov...

or even
Code:
LINE	10005470984	01/06/2010 07:50:31	BKGND	STORE	Store:0	ProvLicNo	0	535193   
LINE  10004470028 ...

To be honest, I'm not sure how to tackle this. I know how to in Excel (for much smaller lists), but not so much in Unix. Any help is greatly appreciated!
# 2  
Old 10-05-2010
Well, the join command can match field 1 of the first file with field 2 of the second file. It would be nice to extract just the useful lines of file 2 filtering with fgrep. The files both have to be sorted by key column in alpha order for join. I am assuming the tabs in your post are the separators, not blanks.

Code:
sort -u DCN.txt >/tmp/joina.$$
# -h keeps file names out of the matches; the pattern is a literal tab-delimited field
fgrep -h '	ProvLicNo	' a*.txt | sort -k2,2 >/tmp/joinb.$$
# join prints the key first, so sed moves it back behind LINE
join -1 1 -2 2 /tmp/joina.$$ /tmp/joinb.$$ | sed '
  s/^\([^ ]*\) LINE/LINE \1/
 ' >/tmp/join_out.$$

I could do it with pipes instead of temp files, since it is a many-to-one join, using a join I wrote in C, m1join, that does no seeks!

Much more of this, and it is time for a simple RDBMS (database).

The sort may leave the date order as in the original. Sorting the join output (or the input within each ProvLicNo) on the existing dates can be done, but the key parameters are awkward for this date format. I used new-style sort keys (-k2,2) and the -1/-2 join options; very old systems may only accept the obsolete +1 -2 and -j1/-j2 forms. See man sort.
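For what it's worth, the date sort can be sketched with character offsets inside the third field. This is only a sketch under assumptions: fields are tab-separated, field 3 always looks like MM/DD/YYYY HH:MM:SS, and the three sample lines are made up for illustration.

```shell
# Sketch: order history lines by DCN, then chronologically.
# Assumption: tab-separated fields, field 3 is "MM/DD/YYYY HH:MM:SS".
tab=$(printf '\t')

# Made-up sample records (not from the real history files).
l1="LINE${tab}10005470984${tab}01/06/2010 07:50:31${tab}BKGND${tab}STORE${tab}Store:0${tab}ProvLicNo${tab}0${tab}535193"
l2="LINE${tab}10005470984${tab}12/30/2009 18:02:10${tab}BKGND${tab}STORE${tab}Store:0${tab}ProvLicNo${tab}0${tab}535190"
l3="LINE${tab}10004470028${tab}03/15/2010 09:00:00${tab}BKGND${tab}STORE${tab}Store:0${tab}ProvLicNo${tab}0${tab}400001"

# Keys: DCN (field 2), then within field 3:
#   year = chars 7-10, month = chars 1-2, day = chars 4-5, time = chars 12-19.
sorted=$(printf '%s\n' "$l1" "$l2" "$l3" |
  sort -t "$tab" -k2,2 -k3.7,3.10n -k3.1,3.2n -k3.4,3.5n -k3.12,3.19)

printf '%s\n' "$sorted"
```

With -t set to a tab, the offsets count from the first character of the field, so no leading-blank adjustment is needed.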
# 3  
Old 10-05-2010
I've been trying to get it to work, but I keep running into an error.

It says: '/tmp/joina.randomnumber: No such file or directory'. It keeps giving different numbers every time. When I view the files, it seems to be creating joina and joinb files, but keeps stopping every time.

I'm doing a workaround by first grepping the a*.txt files for all DCNs and then grepping that output for ProvLicNo. It's not elegant, but it seems to work (on a small number, at least). However, if you know why I get the problem and have a solution, that would be beyond awesome.
# 4  
Old 10-05-2010
Unless the number of files is very large or you are on a very conservative system, the following should suffice:
Code:
awk 'FNR==NR && NF {k[$1]; next} $2 in k && $8=="ProvLicNo" {print $2, $8, $10}' DCN.txt a*.txt

Regards,
Alister
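
For anyone unfamiliar with the FNR==NR idiom, here is the same lookup spread over several lines and run against made-up sample data. It is a sketch, not a replacement: the array name wanted, the scratch directory, and the sample records are all invented for illustration, and mktemp is assumed to be installed.

```shell
# Sketch of the two-pass awk lookup on made-up sample data.
workdir=$(mktemp -d) || exit 1
tab=$(printf '\t')

printf '10004470028\n10005470984\n' >"$workdir/DCN.txt"
{
  printf 'LINE\t10005470984\t01/06/2010 07:50:31\tBKGND\tSTORE\tStore:0\tProvLicNo\t0\t535193      \n'
  printf 'LINE\t10005470984\t01/06/2010 07:50:31\tBKGND\tSTORE\tStore:0\tProvLicState\t0\tTN\n'
  printf 'LINE\t19999999999\t01/06/2010 07:50:31\tBKGND\tSTORE\tStore:0\tProvLicNo\t0\t111111\n'
} >"$workdir/a11.txt"

result=$(awk '
  FNR == NR {              # true only while the first file (DCN.txt) is read
    if (NF) wanted[$1]     # remember every non-empty DCN as an array key
    next                   # never fall through to the history-line rule
  }
  # History files: default whitespace splitting puts the DCN in $2,
  # the attribute name in $8 and its value in $10 (the date uses $3 and $4).
  $2 in wanted && $8 == "ProvLicNo" { print $2, $8, $10 }
' "$workdir/DCN.txt" "$workdir"/a*.txt)

printf '%s\n' "$result"
rm -rf "$workdir"
```

The DCN list is held in memory as array keys, so each history file is read only once regardless of how many DCNs there are.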
# 5  
Old 10-05-2010
Code:
$ ruby -ane 'BEGIN{a=File.read("DCN.txt").split("\n")}; print "#{$F[1]} #{$F[-3]} #{$F[-1]}\n" if a.include?($F[1]) and $F[-3]["ProvLicNo"]' *.txt

# 6  
Old 10-05-2010
Code:
grep -f DCN.txt a*.txt |grep ProvLicNo
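
One possible refinement, offered as a sketch: treat the DCNs as fixed strings rather than regular expressions (-F) and require whole-word matches (-w) so a DCN does not match inside a longer number. It assumes a grep that supports -w and -h, which are common but may not be present everywhere; the sample data and scratch directory are made up. Note also that a blank line in DCN.txt acts as an empty pattern and matches every line, so it is worth checking the list for those first.

```shell
# Sketch: fixed-string, whole-word matching of the DCN list.
# Assumption: grep supports -w and -h in addition to -F and -f.
workdir=$(mktemp -d) || exit 1
tab=$(printf '\t')

printf '10004470028\n10005470984\n' >"$workdir/DCN.txt"
{
  printf 'LINE\t10005470984\t01/06/2010 07:50:31\tBKGND\tSTORE\tStore:0\tProvLicNo\t0\t535193\n'
  printf 'LINE\t10005470984\t01/06/2010 07:50:31\tBKGND\tSTORE\tStore:0\tProvLicState\t0\tTN\n'
  # A longer, unwanted ID that merely contains a wanted DCN:
  printf 'LINE\t210005470984\t01/06/2010 07:50:31\tBKGND\tSTORE\tStore:0\tProvLicNo\t0\t777777\n'
} >"$workdir/a11.txt"

# -h drops the "file:" prefix; the second grep keeps only the ProvLicNo field.
matches=$(grep -h -F -w -f "$workdir/DCN.txt" "$workdir"/a*.txt |
  grep -F "${tab}ProvLicNo${tab}")

printf '%s\n' "$matches"
rm -rf "$workdir"
```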

# 7  
Old 10-06-2010
Usually, $$ is the PID of your shell script or login shell, and it does not change within the run or login session.

I know name collisions in /tmp are a problem, as is the automatic cleanup of /tmp on boot, so I make $HOME/tmp a symlink to a directory I create: /var/tmp/$LOGNAME/. If you have a private temp dir for your files, you do not have to worry about name collisions and can drop the .$$, which is not foolproof for scripts in any case: PIDs roll over pretty often on a busy system.
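
Another way to get a private scratch area, sketched here as an alternative to the symlink setup rather than a description of it: let mktemp create a uniquely named directory and remove it when the script exits. This assumes mktemp is installed (it is common but not guaranteed everywhere); the join.XXXXXX template and the sample DCNs are made up.

```shell
# Sketch: private scratch directory instead of /tmp/name.$$
# Assumption: mktemp(1) is available on the system.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/join.XXXXXX") || exit 1
trap 'rm -rf "$scratch"' EXIT      # remove the directory when the script exits

# Made-up input with one duplicate key.
printf '10005470984\n10004470028\n10005470984\n' >"$scratch/DCN.txt"

# Inside the private directory, plain file names cannot collide with other users' files.
sort -u "$scratch/DCN.txt" >"$scratch/joina"
count=$(wc -l <"$scratch/joina" | tr -d ' ')
echo "unique keys: $count"
```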