concat 6 files in 1 file (maybe use AWK?)


 
# 1  
Old 09-12-2008

Hi,
I have the following problem:
- I have 6 different files that share one key.
- These six files must be aggregated into one output file, sorted by the key.
- The main file has to be written twice for each key: once at the beginning of the new output file and once at the end.
- An identifier must be added to the output file so that a sort produces the expected order of the lines.
- A friend told me to use AWK to combine them, but I have never used it, and I will only be able to look into it after finishing my current task.
- I really have to watch performance, because each file has about 800,000 lines and I have 150 files!


If someone can give me some tips, or point me to an example that would let me do this quickly, I would appreciate it (that way I would probably start my weekend at a decent time).
I think the biggest headache will be that there can be lines without a key (in which case I must use the last key read), that I have to read the first file twice, and that the sort must be done by the identifier in order to get the expected order!

Examples from the files, to show what I have to combine:

1st file:
Code:
410000000000001,dummy, data,separated,by,comma,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,
410000000000002,,dummy, data,separated,by,comma,,,,,,,,
410000000000003,,dummy, data,separated,by,comma,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,


Code:
410000000000001,,,dummy, data,separated,by,comma,,,,,,,,
410000000000002,,,dummy, data,separated,by,comma,,,,,,,,


Code:
410000000000001,,,dummy, data,separated,by,comma,,,,,,,,
410000000000002,,,dummy, data,separated,by,comma,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,
410000000000003,,,dummy, data,separated,by,comma,,,,,,,,
410000000000004,,,dummy, data,separated,by,comma,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,


Code:
41000000000001,dummy, data,separated,by,comma,,,,,,,,
41000000000002,dummy, data,separated,by,comma,,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
41000000000003,dummy, data,separated,by,comma,,,,,,,,
41000000000004,dummy, data,separated,by,comma,,,,,,,,
,,,,,,,


Code:
410000000000001,dummy, data,separated,by,comma,,,,,,,,
410000000000002,dummy, data,separated,by,comma,,,,,,,,


Code:
410000000000001,dummy, data,separated,by,comma,,,,,,,,
410000000000002,dummy, data,separated,by,comma,,,,,,,,

In place of:
Code:
dummy, data,separated,by,comma

I will have a variable number of fields, depending on the file.

Thanks for the help.

Best regards,
Ricardo Tomás
# 2  
Old 09-12-2008
Sorry, I am not sure I fully understand what you are trying to achieve. Is it about concatenating the 6 files into one? Can you not use cat file1 >> OutputFile, then cat file2 >> OutputFile, and so on?

As for the common key, do you mean there is a field whose content is shared by all the files? Can you provide a shorter example (just 2 or 3 files) with some contents and the expected output file? Like:

File1
xxxxxxxxxx

File2
xxxxxxxxxx

Output File
xxxxxxxxxxxxxxxxxxx

If you want to extract the first field from your files using awk, then you can do

awk -F, '{print $1}' theFile

but I am not sure if that's what you are looking for.
# 3  
Old 09-12-2008
better example

Explaining my situation better:

The information may come like this:

410000000000001,dummy, data,separated,by,comma,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,
410000000000002,,dummy, data,separated,by,comma,,,,,,,,
410000000000003,,dummy, data,separated,by,comma,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,

And in some files it comes with the first value on every line, like this:
410000000000002,,dummy, data,separated,by,comma,,,,,,,,
410000000000003,,dummy, data,separated,by,comma,,,,,,,,

I want each file to end up like this:

X,410000000000001,dummy, data,separated,by,comma,,,,,,,,
X,410000000000001,,,,,,,,,,,,,,,,,,,,,,,,,
X,410000000000001,,,,,,,,,,,,,,,,,,,,,,,,,
X,410000000000001,,,,,,,,,,,,,,,,,,,,,,,,,
X,410000000000002,,dummy, data,separated,by,comma,,,,,,,,
X,410000000000003,,dummy, data,separated,by,comma,,,,,,,,
X,410000000000003,,,,,,,,,,,,,,,,,,,,,,,,,

or like this:
410000000000001,X,dummy, data,separated,by,comma,,,,,,,,
410000000000001,X,,,,,,,,,,,,,,,,,,,,,,,,,,
410000000000001,X,,,,,,,,,,,,,,,,,,,,,,,,,,
410000000000001,X,,,,,,,,,,,,,,,,,,,,,,,,,,
410000000000002,X,,dummy, data,separated,by,comma,,,,,,,,
410000000000003,X,,dummy, data,separated,by,comma,,,,,,,,
410000000000003,X,,,,,,,,,,,,,,,,,,,,,,,,,,
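
A minimal awk sketch of the first layout above (identifier first, then the key), assuming the key is always the first comma-separated field and that a line with an empty first field belongs to the last key read. The function name fill_key and its two arguments are made up for illustration:

```shell
# fill_key ID FILE   (hypothetical helper, not from the thread)
# Prefix every line of FILE with ID and with the most recent non-empty key.
fill_key() {
    awk -F, -v id="$1" '
        NF == 0  { next }        # assumption: completely blank lines can be dropped
        $1 != "" { key = $1 }    # remember the last key that was actually present
        # Print id, then the key, then everything after the original first field.
        # For a keyless line length($1) is 0, so the whole line is appended.
        { print id "," key substr($0, length($1) + 1) }
    ' "$2"
}

# Example call (file1 is a placeholder name):
#   fill_key X file1 > file1.tagged
```

Run on the first sample file, this should give the X,410000000000001,... lines shown above. awk reads the file once, line by line, so memory use does not grow with the 800,000 lines.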

Which of the two formats I end up with depends on how the sort function works, because I need the sort to be done by the value 4100000000000XXX.

And as the final result I need this:
1,410000000000001,,dummy, data,separated,by,comma,,,,,,,,
2,410000000000001,,dummy, data,separated,by,comma,,,,,,,,
3,410000000000001,,dummy, data,separated,by,comma,,,,,,,,
...
7,410000000000001,,dummy, data,separated,by,comma,,,,,,,,
1,410000000000002,,dummy, data,separated,by,comma,,,,,,,,
2,410000000000002,,dummy, data,separated,by,comma,,,,,,,,


Which means I need to sort the files by the number 4100000000000XX in order to process all the information for the same number sequentially (it goes into an Oracle database after processing).
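
One possible way to get that final order, sketched under a few assumptions: the identifier is simply the position of the file on the command line, the main file is passed first and read again at the end so that it gets both the lowest and the highest identifier, and sort supports the -s (stable) option, which GNU and BSD sort do but POSIX does not require. merge_by_key is a hypothetical name:

```shell
# merge_by_key MAIN FILE2 [FILE3 ...]   (hypothetical helper, not from the thread)
# Tags each line with a per-file identifier, fills in missing keys,
# then groups the lines by key.
merge_by_key() {
    # "$@" "$1" reads the main file first and once more at the end.
    awk -F, '
        FNR == 1 { id++; key = "" }   # new input file: next identifier, forget the old key
                                      # (an empty input file would not advance the identifier)
        NF == 0  { next }             # assumption: completely blank lines can be dropped
        $1 != "" { key = $1 }         # remember the last key that was actually present
        { print id "," key substr($0, length($1) + 1) }
    ' "$@" "$1" |
    # Stable sort on field 2 (the key) only: lines with the same key keep their
    # input order, which is already identifier order and original line order.
    LC_ALL=C sort -s -t, -k2,2
}

# Example call (file names are placeholders):
#   merge_by_key main.csv f2.csv f3.csv f4.csv f5.csv f6.csv > merged.csv
```

With six input files this yields identifiers 1 to 7 under each key, as in the expected result above. LC_ALL=C makes sort compare raw bytes, which is usually faster and is enough when all keys have the same number of digits; if the key width varies, -k2,2n would compare the keys numerically instead.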

I hope this explains my situation better and that you can help me.

best regards,
Ricardo Tomás
# 4  
Old 09-12-2008
Each file has one or more lines
Each line has one instance of 410...XXX followed by one or more fields

You want one resulting file, with each line:
<a number>, followed by the 410....XXX, followed by fields
<a different number>, followed by the 410....XXX, followed by fields.


Where are you coming up with the number that you want at the front of each line for your final text file?
How do you want the common numbered lines (from different files) to be grouped?
How do you want the fields in the common numbered lines (from different files) to be grouped? This will be more difficult unless you can provide information about how each file has its fields ordered:

Code:
file1
:16digitnumber:, :field1:, :field2:, :field3:

file2
:16digitnumber:, :field4:, :field5:, :field6:

file3
:field7:, :16digitnumber:, :field1:, :field8:

...

Consolidated file
:somenumber:, :16digitnumber:, :field1:, :field2:, :field3:
:somenumber:, :16digitnumber:, :field4:, :field5:, :field6:
:somenumber:, :16digitnumber:, :field7:, :field8:
...
and so on....
