Splitting the file using awk


 
# 8  
Old 12-10-2012
Hi All,
To make things clear, below is my actual requirement. Sorry for not giving these details earlier.
Could you please help me get this working quickly?

Thanks a lot in advance.

Requirement Details:-
Input File Details
File Name ---> INFILE.TXT
File Content --->
Code:
V|xxxxxxxxxxxxx|01/12/2012|0411111111
V|yyyyyyyyyyyy|01/12/2012|0422222222
V|zzzzzzzzzzz|01/12/2012|0388888888
V|aaaaaaaaaaaa|01/12/2012|0388888888
M|xxxxxxxxxxxxx|01/12/2012|0411111111|20
M|yyyyyyyyyyyy|01/12/2012|0422222222|25.5
M|zzzzzzzzzzz|01/12/2012|0388888888|30.2
M|aaaaaaaaaaaa|01/12/2012|0388888888|20.2
M|bbbbbbbbbb|01/12/2012|0299999999|50.5
D|xxxxxxxxxxxxx|0411111111
D|yyyyyyyyyyyy|0422222222
D|zzzzzzzzzzz|0388888888
D|aaaaaaaaaaaa|0388888888
D|bbbbbbbbbb|0299999999
D|yasdfasdfasdfasd|0299999999
O|xxxxxxxxxxxxx|0411111111|canada
O|yyyyyyyyyyyy|0422222222|canada
O|zzzzzzzzzzz|0388888888|USA
O|aaaaaaaaaaaa|0388888888|UK
O|bbbbbbbbbb|0299999999|Behrain

The requirement is to split this input file based on the first field: all V records should go to one output file, all M records to a second output file, and so on.
The output files should look like this:
OUTFILE_V.TXT:-
Code:
V|xxxxxxxxxxxxx|01/12/2012|0411111111
V|yyyyyyyyyyyy|01/12/2012|0422222222
V|zzzzzzzzzzz|01/12/2012|0388888888
V|aaaaaaaaaaaa|01/12/2012|0388888888

OUTFILE_M.TXT:-
Code:
M|xxxxxxxxxxxxx|01/12/2012|0411111111|20
M|yyyyyyyyyyyy|01/12/2012|0422222222|25.5
M|zzzzzzzzzzz|01/12/2012|0388888888|30.2
M|aaaaaaaaaaaa|01/12/2012|0388888888|20.2
M|bbbbbbbbbb|01/12/2012|0299999999|50.5

OUTFILE_D.TXT:-
Code:
D|xxxxxxxxxxxxx|0411111111
D|yyyyyyyyyyyy|0422222222
D|zzzzzzzzzzz|0388888888
D|aaaaaaaaaaaa|0388888888
D|bbbbbbbbbb|0299999999
D|yasdfasdfasdfasd|0299999999

OUTFILE_O.TXT:-
Code:
O|xxxxxxxxxxxxx|0411111111|canada
O|yyyyyyyyyyyy|0422222222|canada
O|zzzzzzzzzzz|0388888888|USA
O|aaaaaaaaaaaa|0388888888|UK
O|bbbbbbbbbb|0299999999|Behrain

Regards,
Sagar
Moderator's Comments:
Please use code tags when posting data and code samples!

Last edited by vgersh99; 12-10-2012 at 07:46 PM.. Reason: code tags, please!
# 9  
Old 12-10-2012
Code:
awk -F'|' '{o="OUTPUT_" $1 ".TXT";print $0 >o;close(o)}' INFILE.TXT

# 10  
Old 12-10-2012
Hi vgersh99,

Thanks for your quick reply. But, I am getting only one record in each output file.

Could you please advise further to get all records in the output file.

Regards,
pavan
# 11  
Old 12-10-2012
sorry...
Code:
awk -F'|' '{o="OUTPUT_" $1 ".TXT";print $0 >>o;close(o)}' INFILE.TXT
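
A note on why the first attempt left only one record per file: in awk, `>` truncates the file each time it is opened, and `close(o)` after every line forces a reopen on the next line, so each write wipes the previous one. With `>>` every reopen appends instead. The sketch below shows the difference on a tiny sample; the file names `sample.txt`, `TRUNC_*` and `APPEND_*` are made up for illustration and are not from the thread.

```shell
# Tiny sample input (hypothetical file name, not from the thread)
printf '%s\n' 'V|a|1' 'V|b|2' 'M|a|3' > sample.txt

# ">>" also appends to files left over from an earlier run, so clear them first
rm -f TRUNC_V.TXT TRUNC_M.TXT APPEND_V.TXT APPEND_M.TXT

# ">" plus close(): every reopen truncates, only the last record per code survives
awk -F'|' '{o="TRUNC_" $1 ".TXT"; print $0 > o; close(o)}' sample.txt

# ">>" plus close(): every reopen appends, all records survive
awk -F'|' '{o="APPEND_" $1 ".TXT"; print $0 >> o; close(o)}' sample.txt

# Compare the record counts of the two V files
grep -c '' TRUNC_V.TXT APPEND_V.TXT
```

One thing to watch with `>>`: if the output files already exist from a previous run, the new records are added to the old ones, so remove the old files before each run.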

# 12  
Old 12-10-2012
Hi vgersh99,

Thanks a lot for your quick reply. I shall further test and come back if I see any issues.

Regards,
Sagar.

---------- Post updated at 07:48 PM ---------- Previous update was at 07:45 PM ----------

Hi vgersh99,

One more thing: my file is 4GB in size, and I need to split it into 9 different files based on one distinguishing code.

Could you please advise whether I will see any performance issues?

If so, is there any other way to achieve the same result?

Regards,
Sagar.
# 13  
Old 12-10-2012
If you can guarantee that the number of unique values in the first column is less than 10, then you can optimize the code a bit more by saving on the 'close':
Code:
awk -F'|' '{o="OUTPUT_" $1 ".TXT";print $0 >>o}' INFILE.TXT

test and see...
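
A possible variation, offered as a sketch rather than a tested recommendation for the real data: once `close()` is gone, each output file is opened only once per run, so plain `>` can be used as well. It truncates the file on first open and then keeps writing to the same open stream, which makes reruns safe without deleting old output first. The sample file name `INFILE_SAMPLE.TXT` is made up; the `OUTFILE_` prefix follows the names in the original requirement (the commands in this thread write `OUTPUT_`).

```shell
# Small sample in the same layout as INFILE.TXT (hypothetical file name)
printf '%s\n' \
  'V|xxxxxxxxxxxxx|01/12/2012|0411111111' \
  'M|xxxxxxxxxxxxx|01/12/2012|0411111111|20' \
  'V|yyyyyyyyyyyy|01/12/2012|0422222222' > INFILE_SAMPLE.TXT

# No close(): each output file stays open for the whole run.
# ">" truncates once, on first open, then later prints go to the same stream.
awk -F'|' '{ o = "OUTFILE_" $1 ".TXT"; print > o }' INFILE_SAMPLE.TXT

# Running it a second time does not duplicate records
awk -F'|' '{ o = "OUTFILE_" $1 ".TXT"; print > o }' INFILE_SAMPLE.TXT

grep -c '' OUTFILE_V.TXT OUTFILE_M.TXT
```

Keeping files open only works while the number of distinct codes stays under the open-file limit of the awk in use, which is why the "less than 10" condition matters.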
# 14  
Old 12-10-2012
Hi vgersh99,

Below is the expectation from the source system:
"Code Type" is the distinguishing field in the input file from source.
We are going to receive 9 different Code Types from the source.
The total size of the file is expected to be 4GB to 5GB on a daily basis.

In this scenario, will I see any performance issues?

Thanks a lot for all your inputs.

Regards,
Sagar.
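
One way to get a feel for this before the real file arrives is to time the split on synthetic data with 9 code types and scale the record count up. This is only a rough sketch: the code names `C1` to `C9`, the `PERF_` file names and the record count are all invented for the test, and the timing will depend on the machine and disks.

```shell
# Build a synthetic input: 9 hypothetical code types (C1..C9), 90000 records
awk 'BEGIN {
  for (i = 0; i < 90000; i++)
    printf "C%d|name%d|01/12/2012|%010d\n", i % 9 + 1, i, i
}' > PERF_SAMPLE.TXT

# Time a single pass that keeps all 9 output files open
start=$(date +%s)
awk -F'|' '{ o = "PERF_" $1 ".OUT"; print > o }' PERF_SAMPLE.TXT
end=$(date +%s)
echo "split took $((end - start)) seconds"

# Sanity check: the outputs together hold exactly the input record count
cat PERF_C?.OUT | grep -c ''
```

Since awk reads the input once and writes each record once, raising the loop count gives an idea of how the run time grows with file size.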