Lookup in another file and conditionally modify it inline


 
# 1  
Old 01-16-2017

Hi,

I need to take each transaction number from a lookup file and, if the same transaction is found in another file, replace a few columns in that file with other values.
Finally, both the changed and the unchanged lines must be written back to the same source file.

For example:
Code:
f1
100
200
300

Code:
f2 (tran number is second field)
1,100,AAA,BBB,X,CCC
5,200,AAA,BBB,Y,CCC
3,400,AAA,BBB,X,CCC

The output should be:

Code:
1,100_P,AAA,BBB,X,CCC
5,200_T,AAA,BBB,Y,CCC
3,400,AAA,BBB,X,CCC

As you can see, only the 1st and 2nd records have a matching transaction number in f1, so they are changed; record 3 remains as is.

I have done the following, but it is a very direct approach and takes a huge amount of time.
Code:
cp main_input.txt tmp1.txt
while read -r tran_num
do
    # Re-scan the whole working copy once per lookup value (the slow part).
    # The value is passed with -v instead of being spliced into the awk program.
    awk -v tran="$tran_num" 'BEGIN { FS = OFS = "," }
        $2 == tran && $5 == "X" { $2 = $2 "_P" }
        $2 == tran && $5 == "Y" { $2 = $2 "_T" }
        { print }' tmp1.txt > tmp2.txt
    cp tmp2.txt tmp1.txt
done < lookup_tran_file.txt
mv tmp1.txt file_input_file.txt

The above works, but I don't like that a temporary file is rewritten on every iteration to preserve the previous modification made by awk. In other words, if f1 contains 10000 records to check, the temporary file gets overwritten 10000 times. This solution also takes 1-2 hours to finish, depending on the input file size.



Moderator's Comments: Please use CODE tags as required by forum rules!

Last edited by mansoorcfc; 01-16-2017 at 04:31 AM.. Reason: Added CODE tags.
# 2  
Old 01-16-2017
You are right: running the loop in the above shell script 10000 times is an enormous waste of resources and time. It creates 20000 processes, since 2 commands are each run 10000 times. How about doing it all in one command and leaving the looping to awk? If $5 can have ONLY the values "X" and "Y", try

Code:
awk -F, -v OFS=, 'NR==FNR {T[$1]; next} $2 in T {$2=$2 ($5=="X"?"_P":"_T")} 1' file1 file2
1,100_P,AAA,BBB,X,CCC
5,200_T,AAA,BBB,Y,CCC
3,400,AAA,BBB,X,CCC

Should there be other values in $5 for which $2 should not be modified, try
Code:
awk -F, -v OFS=, 'NR==FNR {T[$1]; next} $2 in T {$2=$2 ($5=="X"?"_P":$5=="Y"?"_T":"")} 1' file1 file2
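
The commands above print the result to standard output, while the original question asks for the result to end up in the source file. Here is a minimal sketch of one way to do that, assuming the files are named f1 and f2 as in the sample. The scratch directory, the f2.tmp name, and the use of mktemp are illustrative choices, not something from the thread.

```shell
#!/bin/sh
# Sketch only: the scratch directory and sample data are illustrative.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

printf '%s\n' 100 200 300 > f1
printf '%s\n' '1,100,AAA,BBB,X,CCC' '5,200,AAA,BBB,Y,CCC' '3,400,AAA,BBB,X,CCC' > f2

# Load f1 into an array of known transaction numbers, then rewrite f2 in one pass.
# Output goes to a temporary file; f2 is replaced only if awk exits successfully.
awk -F, -v OFS=, '
    NR == FNR { T[$1]; next }
    $2 in T   { $2 = $2 ($5 == "X" ? "_P" : $5 == "Y" ? "_T" : "") }
    1
' f1 f2 > f2.tmp && mv f2.tmp f2

cat f2
```

Redirecting straight to f2 would truncate the file before awk reads it, which is why the sketch writes to a separate file first and renames it afterwards.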

# 3  
Old 01-16-2017
Assuming that your first sample file is named f1 rather than f1 being the first line of your first file and that your second input file is named f2 rather than the first line of your second input file being f2 (tran number is second field), then you can use whichever one of the suggestions RudiC provided that matches the actual input you need to process. If those two lines are actually in your input files, the following seems to produce the output you want:
Code:
awk '
BEGIN {	FS = OFS = ","
}
FNR == 1 {
	next
}
FNR == NR {
	tn[$1]
	next
}
$2 in tn {
	if($5 == "X")
		$2 = $2 "_P"
	else if($5 == "Y")
		$2 = $2 "_T"
}
1' f1 f2

If you want to try this (or either of RudiC's suggestions) on a Solaris/SunOS operating system, change awk to /usr/xpg4/bin/awk or nawk.
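
A wrapper script could make that choice automatically. The following is only a sketch: the variable name AWK and the sample input are hypothetical, and it assumes uname -s reports SunOS on the systems in question.

```shell
#!/bin/sh
# Sketch: choose an awk that understands the syntax used in this thread.
case $(uname -s) in
    SunOS)
        # Prefer the XPG4 awk on Solaris/SunOS, fall back to nawk.
        if [ -x /usr/xpg4/bin/awk ]; then
            AWK=/usr/xpg4/bin/awk
        else
            AWK=nawk
        fi
        ;;
    *)
        AWK=awk
        ;;
esac

# Quick check that the selected awk handles -v and field assignment.
printf '%s\n' 'a,b' | "$AWK" -F, -v OFS=, '{ $1 = $1 "_P" } 1'
```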
# 4  
Old 01-16-2017
Thank you, Rudi!

Thank you, Don!
@Don: f1 and f2 aren't part of the data; they are just the file names. Also, your solution removes 1 record from the output: if my input has 1000 records, I get only 999 in the output after the updates. I am trying to see why that is happening.

But both solutions worked on my actual data. This is amazing, I never expected the reply to be so quick! Thank you so much. This helps me a lot and gives me ideas for my future file processing tasks.
# 5  
Old 01-16-2017
Hi mansoorcfc,
That is exactly what I said in post #3 in this thread:
Quote:
Assuming that your first sample file is named f1 rather than f1 being the first line of your first file and that your second input file is named f2 rather than the first line of your second input file being f2 (tran number is second field), then you can use whichever one of the suggestions RudiC provided that matches the actual input you need to process.
The code I posted was to be used only if the 1st line in each file is to be treated as some kind of header line that should be ignored when producing output. The code in my script:
Code:
FNR == 1 {
	next
}

skips the 1st line in each input file. (FNR is set by awk to be the record number of the current input line from the current input file.) Therefore, the output file will have one line less than your 2nd input file (and whatever transaction number is on the 1st line of your 1st input file will not be recognized as a known transaction number when processing the 2nd input file).
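
To see the difference between NR and FNR directly, a small experiment can help. This is only a sketch with made-up sample data in a scratch directory (mktemp and the shortened records are assumptions for illustration):

```shell
#!/bin/sh
# Sketch only: scratch directory and sample data are illustrative.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

printf '%s\n' 100 200 > f1
printf '%s\n' '1,100,AAA' '5,200,AAA' '3,400,AAA' > f2

# For every input line, print the file name, NR (count over all files)
# and FNR (count within the current file).
awk '{ print FILENAME, NR, FNR }' f1 f2
# prints:
# f1 1 1
# f1 2 2
# f2 3 1
# f2 4 2
# f2 5 3
```

NR and FNR are equal only while the first file is being read, which is what the FNR == NR test relies on. FNR is 1 once per file, so a FNR == 1 { next } rule drops one line from each input file.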
# 6  
Old 01-17-2017
Hi Don,

Thank you for your response and the clarification. Sorry, I am fairly new to working with awk and couldn't see that right away. I have always used sed or other direct shell scripting approaches for file editing, but have always wanted to learn and use awk.

Thank you again so much for your help and responses. Much appreciated.
