awk - Multiple files - 1 file with multi-line data


 
# 1  
Old 05-11-2016

Greetings experts,
I have 2 input files. The first has 1 record per line; in the second, multiple lines constitute 1 record, hence I declared RS=";".
Each line of the first file also ends with ";", but the \n that follows it is being treated as part of the data, which
is causing me some issues. How can I avoid this?

File1: FS=@ RS=; (forced to use RS=; because the records in file2 span multiple lines)
Code:
Source_table_name  Source_col_name   Target_table_name    Target_col_name
Src_1@src_col11@Tgt_1@tgt_col11;
Src_2@src_col21@Tgt_2@tgt_col21;
......

File2:
Code:
Src_tbl       Tgt_tbl     join condition
Src_1@Tgt_1 @FROM ( SELECT SRC_1.* FROM SRC_1 INNER JOIN 
SRC_TEMP_1 ON  SRC_1.COL1=SRC_TEMP_1.COL1) SRC JOIN (SELECT 
TGT_1.* FROM TGT_1   INNER JOIN TGT_TEMP_1 ON 
TGT_1.COL1=TGT_TEMP_1.COL1) TGT ON   SRC.COL1=TGT.COL1;
Src_2@Tgt_2@FROM( SELECT SRC_2.*  FROM SRC_2 LEFT OUTER JOIN TGT_2 ON SRC_2.COL1=TGT_2.COL1;

awk_code:
Code:
awk -F[@] 'BEGIN { RS=";"  OFS=" "}
FNR==NR{src_tgt_array[$1 OFS $3]="something";
next;
}
{
join_array[$1 OFS $2]=$3
}
end {
for (i in src_tgt_array)
{
print "src and tgt tables " i " join condition is " join_array[i]
print "----------------"
} 
}' < file1.txt file2.txt > awk_output.txt

Output:
Code:
src and tgt tables Src_1 Tgt_1 join condition is FROM ( SELECT SRC_1.*
 FROM SRC_1 INNER JOIN SRC_TEMP_1 ON  SRC_1.COL1=SRC_TEMP_1.COL1) SRC
 JOIN (SELECT TGT_1.* FROM TGT_1   INNER JOIN TGT_TEMP_1 ON TGT_1.COL1=TGT_TEMP_1.COL1) TGT 
ON   SRC.COL1=TGT.COL1
----------------
src and tgt tables 
Src_2 Tgt_2 join condition is 
----------------

As you can see, after the text src and tgt tables the data jumped to the next line, from which I assume
that the key has an embedded \n which, I guess, is not present in the join_array key (stored there without the embedded newline character).

Expected Output: -- For readability I have split the data into multiple rows
Code:
src and tgt tables Src_1 Tgt_1 join condition is FROM ( SELECT SRC_1.* FROM SRC_1 INNER JOIN SRC_TEMP_1 ON  SRC_1.COL1=SRC_TEMP_1.COL1) SRC 
JOIN (SELECT TGT_1.* FROM TGT_1   INNER JOIN TGT_TEMP_1 ON TGT_1.COL1=TGT_TEMP_1.COL1) TGT 
ON   SRC.COL1=TGT.COL1
----------------

src and tgt tables Src_2 Tgt_2 join condition is FROM( SELECT SRC_2.*  FROM SRC_2 
LEFT OUTER JOIN TGT_2 ON SRC_2.COL1=TGT_2.COL1
----------------

As RS=";", the data in the array for the first record in file1 is as expected. But for the following records, a \n is embedded
in the array key before the record's first column, as you can see from the above,
so the matching data in join_array is not printed.

How can I overcome this, please?

Edit:
Please excuse any syntax issues, as I am not able to copy/paste.
# 2  
Old 05-11-2016
I'm sorry, but we can't ignore syntax issues. If you can't accurately show us what your code really looks like, we can waste our time and yours pointing out reasons why your code won't work when it has nothing to do with your problem. Our crystal balls don't work well enough to read code that you haven't shown us.

I haven't tried to work out the logic of what your awk code is doing, but there are two obvious problems with what you have shown us:
First, just like the BEGIN keyword is all uppercase, the end in your code needs to be END.
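A quick way to see why the case matters: awk treats a lowercase end as an ordinary, uninitialized variable used as a pattern, so it is false for every record and its action never runs. The sketch below uses a made-up one-line input and the hypothetical shell variables lower and upper:

```shell
# "end" is an ordinary (uninitialized) variable used as a pattern: always false.
lower=$(printf 'x\n' | awk 'end { print "ran" }')

# "END" is the special pattern that fires once after all input has been read.
upper=$(printf 'x\n' | awk 'END { print "ran" }')

echo "lowercase end printed: [$lower]"
echo "uppercase END printed: [$upper]"
```

The first command should print nothing at all, while the second prints ran.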

And, second, the awk utility reads from standard input when no file operands are specified on the command line and when - is specified as a file operand. So try changing the last line of your script from:
Code:
}' < file1.txt file2.txt > awk_output.txt

to:
Code:
}' file1.txt file2.txt > awk_output.txt

so awk will see two file operands instead of being given your first file on standard input and your second file as the only file operand.
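To see the difference, here is a small sketch that counts how many records awk reads each way. The two scratch files and their contents are made up for the demonstration, and with_redirect and as_operands are hypothetical variable names:

```shell
# Hypothetical scratch files, created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'a1\na2\n' > "$tmpdir/file1.txt"   # 2 records
printf 'b1\n'     > "$tmpdir/file2.txt"   # 1 record

# file1.txt arrives on standard input, but awk never reads it
# because a file operand (file2.txt) is present.
with_redirect=$(awk 'END { print NR }' < "$tmpdir/file1.txt" "$tmpdir/file2.txt")

# Both files are operands, so awk reads them one after the other.
as_operands=$(awk 'END { print NR }' "$tmpdir/file1.txt" "$tmpdir/file2.txt")

echo "records read with redirect: $with_redirect"
echo "records read as operands:   $as_operands"

rm -rf "$tmpdir"
```

With the redirect awk reads 1 record; with two operands it reads 3. Note also that when awk sees only one file operand, FNR==NR is true for every record, so a block guarded by FNR==NR followed by next keeps any later block from running.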

If these changes don't fix your problem, please show us what happens after making these changes and we'll help you debug it further. And, please find a way to show us your actual code (either copy & paste or upload).

And, DO NOT show us expected output that is not your expected output. If we can't tell what was done to make it readable instead of being the actual output you expect, you are just adding confusion to your specification.
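On the embedded newline itself: with RS=";" the newline that follows each ";" becomes the first character of the next record, so the first field of every record after the first starts with \n. Below is a minimal sketch of one way to deal with that, not a tested fix for your real script. It assumes cut-down sample data (not your real files), that it is acceptable to fold the line breaks inside a join condition into spaces, and it uses the hypothetical array names pairs and joins:

```shell
# Hypothetical cut-down sample data in the same layout as file1 and file2.
tmpdir=$(mktemp -d)
printf 'Src_1@src_col11@Tgt_1@tgt_col11;\nSrc_2@src_col21@Tgt_2@tgt_col21;\n' > "$tmpdir/file1.txt"
printf 'Src_1@Tgt_1@FROM A JOIN\nB ON A.C=B.C;\nSrc_2@Tgt_2@FROM C;\n' > "$tmpdir/file2.txt"

result=$(awk -F'@' '
BEGIN { RS = ";"; OFS = " " }
{
    gsub(/\n/, " ")        # fold embedded newlines into spaces (re-splits the fields)
    gsub(/^ +| +$/, "")    # trim the spaces left at the edges of the record
}
$0 == "" { next }          # skip the empty record after the final ";"
FNR == NR { pairs[$1 OFS $3] = 1; next }   # first file: remember source/target pairs
{ joins[$1 OFS $2] = $3 }                  # second file: store the join condition
END {
    for (key in pairs)
        print key " => " joins[key]
}' "$tmpdir/file1.txt" "$tmpdir/file2.txt" | sort)

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

The output is piped through sort only because the order of for (key in pairs) is unspecified in awk. For the sample data above this should print one line per source/target pair, each followed by its join condition on a single line.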