Search and Substitute from 2 files


 
# 1  
Old 04-07-2009
Search and Substitute from 2 files

Hello,

I am not a frequent Unix user. With my limited knowledge and some scrambling I was able to get to this point; please help me get this right. I have 2 files:

file1.txt
123AAAAA111111111111111
456BBBBBB222222222222222222
789CCCCC33333333333333333

file2.txt
AAAAA11111
BBBBB22222
CCCCC33333

My output file should be file1 with the following manipulation: take the string at positions 4-8 of each file1 record, look it up in file2 (positions 1-5), and replace it with the string at positions 6-10 of the matching file2 record. File 1 has fixed-width columns but variable record lengths (the last columns can be empty for some records).

output.txt
12311111111111111111111
45622222222222222222222222
7893333333333333333333333


#!usr/bin/ksh
while read myline
do
A=`awk '{print substr($myline, 1, 3)}'`
B=`awk '{print substr($myline, 4, 5)}'`
C=`awk '{print substr($myline, 9)}'`
sub=`grep $B file2.txt | awk '{print substr($0,6)}'`
echo "${A}${sub}${C}" >> output.txt
done < file1.txt

The error that I get is "grep unable to open AAAAA". The output file is written but the substitution happens for only one record.

Thanks.
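
A note on why the loop above misbehaves: inside single quotes the shell does not expand $myline, so awk sees its own (empty) variable named myline, and because no input file is given, awk reads standard input, which inside the loop is file1.txt itself. The unquoted $B passed to grep is also a likely source of the "unable to open" message. Below is a minimal sketch of a loop that keeps the same idea, assuming the key sits in columns 1-5 of file2.txt and that a record with no match should be written unchanged. The demo_* file names and sample records are made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo files in the layout from the post (names are assumptions).
printf '%s\n' '123AAAAArest1' '456BBBBBrest2' '789ZZZZZrest3' > demo_file1.txt
printf '%s\n' 'AAAAA11111' 'BBBBB22222' > demo_file2.txt

: > demo_output.txt    # start from an empty output file

while IFS= read -r myline
do
    # Feed the current line to awk on its own stdin, so awk does not
    # consume the rest of demo_file1.txt that the loop is reading.
    A=$(printf '%s\n' "$myline" | awk '{ print substr($0, 1, 3) }')
    B=$(printf '%s\n' "$myline" | awk '{ print substr($0, 4, 5) }')
    C=$(printf '%s\n' "$myline" | awk '{ print substr($0, 9) }')

    # Look the key up in columns 1-5 of the lookup file; take columns 6-10.
    sub=$(awk -v key="$B" 'substr($0, 1, 5) == key { print substr($0, 6, 5); exit }' demo_file2.txt)

    # Assumption: a record with no match is written back unchanged.
    [ -n "$sub" ] || sub=$B

    printf '%s\n' "${A}${sub}${C}" >> demo_output.txt
done < demo_file1.txt

cat demo_output.txt
```

This starts several awk processes per record, so it is only reasonable for small files; a single awk pass over both files scales much better.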
# 2  
Old 04-07-2009
could you please post this question with better sample data?

the example is not quite clear enough.

your script has numerous errors to the point where it cannot be fixed,
so we'll need to start fresh.

some relevant questions:

+ are the files sorted according to the 4-8 column positions?
+ are the matching lines unique between the 2 files?
+ same number of lines per file?
+ 1 to 1 match line per line or scrambled?

here's what i'm thinking you want:

filea:
xxxxAAAAisforaardvark
yyyyBBBBisforbasilisk
zzzzCCCCisforchameleon

fileb:
AAAAisactuallyforant
BBBBisactuallyforbat
CCCCisactuallyforcat

output:
xxxxAAAAisactuallyforant
yyyyBBBBisactuallyforbat
zzzzCCCCisactuallyforcat

does this look right?
# 3  
Old 04-07-2009
I am a newbie to shell scripting.
I have put together the following; can someone fix this:
Code:
nawk -v f1str=$(substr($0,4,8) -v f2str=$(substr($0,1,5) 'FILENAME="file1.txt"  {array[m++]=$f1str} FILENAME="file2.txt" { if (array[f2str]) {print $0}}' file1.txt file2.txt > file3.txt
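
The command above cannot run as written: substr() is an awk function, so it cannot be called through shell $( ) substitution; FILENAME="file1.txt" assigns instead of comparing (== is the comparison); and the third argument of substr() is a length, not an end position. Here is a minimal sketch of what it appears to be aiming for, assuming the intent is to print the file2.txt lines whose first five characters occur at positions 4-8 of some file1.txt line. The demo_* names and sample records are invented for the example; use nawk in place of awk on Solaris.

```shell
#!/bin/sh
# Hypothetical demo data (file names are assumptions for illustration).
printf '%s\n' '123AAAAArest1' '789CCCCCrest3' > demo_file1.txt
printf '%s\n' 'AAAAA11111' 'BBBBB22222' 'CCCCC33333' > demo_file2.txt

# First file (NR == FNR): remember each key found at columns 4-8.
# Second file: print the line only if its first five characters are a known key.
awk 'NR == FNR { seen[substr($0, 4, 5)] = 1; next }
     substr($0, 1, 5) in seen' demo_file1.txt demo_file2.txt > demo_file3.txt

cat demo_file3.txt
```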

# 4  
Old 04-07-2009
Thanks for your response.

My output should look like:

output:
xxxxisactuallyforantisforaardvark
yyyyisactuallyforbatisforbasilisk
zzzzisactuallyforcatisforchameleon

AAAA, BBBB, CCCC in filea should be substituted with lookup values from fileb.

Appreciate your help.
# 5  
Old 04-07-2009
Try this:

Code:
awk 'NR==FNR{a[substr($0,1,4)]=substr($0,5);next}   # fileb: map key (cols 1-4) to the rest of the line
substr($0,5,4) in a {print substr($0,1,4) a[substr($0,5,4)] substr($0,9)}   # filea: swap the key for its lookup value
' fileb filea

Use nawk or /usr/xpg4/bin/awk on Solaris if you get errors.

Regards
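
For the file1.txt / file2.txt layout from the first post, the same approach should only need different offsets. A sketch, assuming the key is columns 4-8 in file1.txt and columns 1-5 in file2.txt, and that the replacement is columns 6-10 of file2.txt (the demo_* names and sample records are placeholders, not the poster's real data):

```shell
#!/bin/sh
# Hypothetical demo data in the fixed-width layout described in post #1.
printf '%s\n' '123AAAAArest1' '456BBBBBrest2' '789ZZZZZrest3' > demo_file1.txt
printf '%s\n' 'AAAAA11111' 'BBBBB22222' > demo_file2.txt

# Pass 1 (lookup file): key = cols 1-5, value = cols 6-10.
# Pass 2 (data file): if cols 4-8 are a known key, print cols 1-3,
# the looked-up value, then everything from col 9 onward.
awk 'NR == FNR { a[substr($0, 1, 5)] = substr($0, 6, 5); next }
     substr($0, 4, 5) in a { print substr($0, 1, 3) a[substr($0, 4, 5)] substr($0, 9) }
' demo_file2.txt demo_file1.txt > demo_output.txt

cat demo_output.txt
```

As in the original one-liner, records whose key is not in the lookup file (the ZZZZZ record here) are not printed.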
# 6  
Old 04-07-2009
Thank you, Franklin. Your code worked fine for me. Thanks to zenith and quirk as well.
# 7  
Old 04-09-2009
Just one more question on this: how do I modify the script to also get the rows that are skipped when no match is found?
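
One possible way, sketched under the assumption that unmatched rows should simply pass through unchanged: drop the pattern in front of the second block of the script from post #5 and decide inside the block instead. The demo_* file names and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo data in the filea/fileb layout used earlier in the thread.
printf '%s\n' 'xxxxAAAAisforaardvark' 'yyyyQQQQisforquokka' > demo_filea
printf '%s\n' 'AAAAisactuallyforant' 'BBBBisactuallyforbat' > demo_fileb

awk 'NR == FNR { a[substr($0, 1, 4)] = substr($0, 5); next }
     {
         key = substr($0, 5, 4)
         if (key in a)
             print substr($0, 1, 4) a[key] substr($0, 9)   # match: substitute the key
         else
             print                                          # no match: keep the row as is
     }' demo_fileb demo_filea > demo_out.txt

cat demo_out.txt
```

If the unmatched rows should go to a separate file instead, replacing the bare print with print > "unmatched.txt" would be one option (the file name is only an example).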