Help me pls : splitting single file in unix into different files based on data


 
# 36  
Old 10-09-2012
Just another question:
Can we create these files in a directory?
If yes, can the directory be named after the input file we are using?

For ex:

Code:
 
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){s=$0;}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{print s > fn ;s="";print > fn}}
else if(s){s=s"\n"$0}
else{if(fn){print > fn}}}' temp1.txt

I want these files to be created in a directory named temp1.

If possible, can you please modify the code to do this?
# 37  
Old 10-09-2012
Pass your target directory path to awk as a variable.

Use this. It will create the files in the given path (the directory must already exist, since awk will not create it):

Code:
awk -v dir_path="/home/temp1/" -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){s=$0;}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=dir_path""a[n]x;{print s > fn ;s="";print > fn}}
else if(s){s=s"\n"$0}
else{if(fn){print > fn}}}' file

Hope this helps you.
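
To name the directory after the input file, as asked in post #36, the shell can derive the name and create the directory before awk runs. This is a minimal sketch, assuming a POSIX shell. `split_into_dir` is a hypothetical helper name, and the awk program is a simplified stand-in for the one above: it names each output file after the component found before `.mdc|`, `.mpc|` or `.mp|` and does not carry over the buffered Layout block.

```shell
# Hypothetical helper: split INPUT into files under a directory named after INPUT.
split_into_dir() {
    input=$1
    dir=${input%.*}              # temp1.txt -> temp1 (strip the last extension)
    mkdir -p "$dir" || return 1  # awk will not create the directory itself
    awk -v dir_path="$dir/" '
        # A component line: take the last path element before .mdc| .mpc| or .mp|
        /Ab Initio/ && match($0, /[^\\]+\.(mdc|mpc|mp)\|/) {
            name = substr($0, RSTART, RLENGTH)
            sub(/\.(mdc|mpc|mp)\|$/, "", name)
            if (fn) close(fn)    # avoid running out of open file handles
            x++
            fn = dir_path name x
        }
        fn { print > fn }        # every line goes to the current component file
    ' "$input"
}
```

Calling `split_into_dir temp1.txt` would then write files such as `temp1/Output_File1`. The same `mkdir -p` step and `-v dir_path="$dir/"` argument can be put in front of the full awk command from this post.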
# 38  
Old 10-10-2012
Small trouble in naming files

Code:
 
Output_File73:
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio\\Ab Initio GDE 1.15.11.1\\Components\\Datasets\\Output_File.mdc|3|2|Pw$|@{0|}}
{30001002|XXparameter|write_metadata||3|8|s=|@{0|}}
{30001002|XXparameter|eme_dataset_location|$\{PROJECT_DIR\}/data/serial/lookup/m_cdp_cdm_gl_prod_lvl_2_lookup.dat|3|9||@{0|}}

Code:
 
Output_File77:
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\Output_File.mdc|3|2|Pw$|@{0|}}
{30001002|XXparameter|write_metadata||3|8|s=|@{0|}}
{30001002|XXparameter|eme_dataset_location|$\{PROJECT_DIR\}/data/serial/lookup/cdp2_uedw_v_thrd_prty_orig.lkp|3|9||@{0|}}

Hi,
Please check the files above.
Even though each is an Output_File component, it is actually a lookup output. I don't want files of this kind to be named Output_File.
Otherwise, the code below is working fine:
Code:
 
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){s=$0;}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{print s > fn ;s="";print > fn}}
else if(s){s=s"\n"$0}
else{if(fn){print > fn}}}' temp2.txt

After this code has run, can we do something so that the file name changes if a file contains a line like this:

Code:
 
{30001002|XXparameter|eme_dataset_location|$\{PROJECT_DIR\}/data/serial/lookup/cdp2_uedw_v_thrd_prty_orig.lkp|3|9||@{0|}}

So if "lookup" appears in a line like the one above, the file should be renamed. That is, go through the files named Output_File and Input_File, grep for this particular pattern, and if it is present change the file name to lookup accordingly.
I also mention Input_File because I think the same trouble may occur in those files.

Thanks a lot in advance.
# 39  
Old 10-10-2012
Now it's getting more complicated...

Try this:

Code:
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){if(fn && s){print s > fn;s=$0;}else{s=$0}}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{s=s"\n"$0}}
else if($0 ~ /PROJECT_DIR/ && $0 ~ /serial\/lookup/){fn="lookup"x;s=s"\n"$0}
else if(s){s=s"\n"$0}
}END{print s > fn}' file


Last edited by pamu; 10-10-2012 at 03:41 AM.. Reason: added more info..
# 40  
Old 10-10-2012
issue in 'lookup' name change code

There is a difference in the output of these two commands:
Code:
 
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){s=$0;}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{print s > fn ;s="";print > fn}}
else if(s){s=s"\n"$0}
else{if(fn){print > fn}}}' temp1.txt


Code:
 
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){if(fn && s){print s > fn;s=$0;}else{s=$0}}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{s=s"\n"$0}}
else if($0 ~ /PROJECT_DIR|serial\/lookup/){fn="lookup"x;s=s"\n"$0}
else if(s){s=s"\n"$0}
}END{print s > fn}' temp1.txt

For the first I am getting 102 files generated. This is correct.
But for the second, where the 'lookup' name change is done, I am getting only 42 files.
I am not able to analyse these two and find where it is going wrong.
Can you please look into this issue?
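
One visible difference between the command as it now stands in post #39 and the copy run here is the lookup test: `$0 ~ /PROJECT_DIR/ && $0 ~ /serial\/lookup/` requires both strings, while `/PROJECT_DIR|serial\/lookup/` matches a line containing either one. The sketch below shows how the two conditions classify the same input. The sample lines are made up and `classify_lookup_lines` is a hypothetical name; whether this accounts for the missing files in temp1.txt is not confirmed anywhere in the thread.

```shell
# classify_lookup_lines is a hypothetical name; it reads lines on standard input.
classify_lookup_lines() {
    awk '{
        both   = ($0 ~ /PROJECT_DIR/ && $0 ~ /serial\/lookup/) ? "yes" : "no"
        either = ($0 ~ /PROJECT_DIR|serial\/lookup/) ? "yes" : "no"
        print "line " NR ": both=" both " either=" either
    }'
}

# Made-up sample lines, not taken from temp1.txt.
printf '%s\n' \
    '{1|XXparameter|eme_dataset_location|$\{PROJECT_DIR\}/data/serial/lookup/a.lkp|}' \
    '{1|XXparameter|eme_dataset_location|$\{PROJECT_DIR\}/data/serial/main/b.dat|}' \
    '{1|XXparameter|write_metadata||}' | classify_lookup_lines
```

With the alternation, the second sample line (a PROJECT_DIR path outside serial/lookup) is also treated as a lookup, so any component containing such a line would be renamed to `lookup` plus the counter.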

My point of view is that, as the first code is working well, we could change the file names afterwards like this:

Go through all files named like 'Output_File' and grep each for "{30001002|XXparameter|eme_dataset_location|$\{PROJECT_DIR\}/data/serial/lookup".
If it is present, change the corresponding file name to lookup. The numbering convention is not necessary for lookup files, so if you feel it is difficult to keep, you can leave it out.
Thanks a lot in advance.
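
The rename-afterwards idea described here can be sketched as a small shell loop run after the first awk command. This is only an illustration under stated assumptions: `rename_lookup_files` is a hypothetical name, it is run inside the directory holding the split files, and it keeps the trailing number so that two renamed files cannot collide.

```shell
# Hypothetical post-processing step: run inside the directory holding the split files.
rename_lookup_files() {
    for f in Output_File* Input_File*; do
        [ -f "$f" ] || continue        # the pattern matched nothing
        # -F treats the pattern as a fixed string, so $ { } \ | need no regex escaping
        if grep -qF 'eme_dataset_location|$\{PROJECT_DIR\}/data/serial/lookup/' "$f"; then
            num=${f##*[!0-9]}          # trailing digits of the name, e.g. 77
            mv "$f" "lookup$num"
        fi
    done
}
```

After `rename_lookup_files`, a file such as Output_File77 that contains the lookup location line would be named lookup77, and files without that line keep their names. Note that `mv` overwrites an existing file of the same name without asking.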
# 41  
Old 10-10-2012
What is the output of the two commands below?

I need the final value of x. Just run them and post the value each one prints.

Code:
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){if(fn && s){print s > fn;s=$0;}else{s=$0}}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{s=s"\n"$0}}
else if($0 ~ /PROJECT_DIR|serial\/lookup/){fn="lookup"x;s=s"\n"$0}
else if(s){s=s"\n"$0}
}END{print s > fn; print x}' temp1.txt


Code:
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){s=$0;}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{print s > fn ;s="";print > fn}}
else if(s){s=s"\n"$0}
else{if(fn){print > fn}}}END{print x}' temp1.txt
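
Note that x counts matching component lines, not files. In the buffered version, a name assigned to fn is only used when the next Layout line (or END) flushes the buffer, so a name that is replaced before that flush never becomes a file. The dry run below is a rough sketch of that bookkeeping. It writes no files, `report_unwritten` is a hypothetical name, and the patterns are simplified assumptions rather than the exact ones from the commands above.

```shell
# Dry run: reports names, writes no files. Patterns are simplified assumptions.
report_unwritten() {
    awk '
        /Layout/ {                       # a new block starts: the previous one would be flushed
            if (fn != "" && s != "") written[fn] = 1
            s = $0
            next
        }
        /Ab Initio/ && match($0, /[^\\]+\.(mdc|mpc|mp)\|/) {
            name = substr($0, RSTART, RLENGTH)
            sub(/\.(mdc|mpc|mp)\|$/, "", name)
            x++
            fn = name x                  # replaces any earlier, still unflushed name
            assigned[fn] = 1
        }
        s != "" { s = s "\n" $0 }
        END {
            if (fn != "" && s != "") written[fn] = 1
            for (f in assigned) if (!(f in written)) print "never written: " f
            print "x=" (x + 0)
        }
    ' "$1"
}
```

Running `report_unwritten temp1.txt` would list every name that was assigned but replaced before a flush, followed by the final counter, which can then be compared with the number of files actually present.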

# 42  
Old 10-10-2012
O/P

For your code, both commands give 101.

Code:
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){s=$0;} else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{print s > fn ;s="";print > fn}} else if(s){s=s"\n"$0} else{if(fn){print > fn}}}' temp1.txt

Output of this command:
Quote:
Dedup_Sorted97 Filter_by_Expression72 Input_File11 Input_File39 Input_File57 Input_File78 Join37 lookup80 Replicate52
Filter_by_Expression100 Filter_by_Expression91 Input_File12 Input_File53 Input_File58 Join1 Join47 Partition_by_Key_and_Sort4 Rollup19
Filter_by_Expression101 Filter_by_Expression98 Input_File2 Input_File54 Input_File59 Join10 Join8 reformat32
Filter_by_Expression23 Filter_by_Expression99 Input_File27 Input_File55 Input_File60 Join26 lookup77 Reformat67
Filter_by_Expression24 Gather76 Input_File38 Input_File56 Input_File61 Join35 lookup79 reformat74
Code:
 
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){if(fn && s){print s > fn;s=$0;}else{s=$0}}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{s=s"\n"$0}}
else if($0 ~ /PROJECT_DIR|serial\/lookup/){fn="lookup"x;s=s"\n"$0}
else if(s){s=s"\n"$0}
}END{print s > fn}' temp1.txt

Output of this command:
Quote:
Dedup_Sorted18 Filter_by_Expression91 Input_File38 Join10 Partition_by_Key_and_Sort16 Reformat64
Dedup_Sorted21 Filter_by_Expression92 Input_File39 Join26 Partition_by_Key_and_Sort17 Reformat66
Dedup_Sorted34 Filter_by_Expression94 Input_File48 Join35 Partition_by_Key_and_Sort29 Reformat67
Dedup_Sorted7 Filter_by_Expression95 Input_File53 Join37 Partition_by_Key_and_Sort30 Reformat69
Dedup_Sorted82 Filter_by_Expression98 Input_File54 Join47 Partition_by_Key_and_Sort31 reformat74
Dedup_Sorted83 Filter_by_Expression99 Input_File55 Join63 Partition_by_Key_and_Sort4 Reformat88
Dedup_Sorted90 Filter by Expression - (Transform)45 Input_File56 Join8 Partition_by_Key_and_Sort41 Replicate51
Dedup_Sorted93 Gather76 Input_File57 lookup77 Partition_by_Key_and_Sort42 Replicate52
Dedup_Sorted97 Input_File11 Input_File58 lookup79 Partition_by_Key_and_Sort43 Replicate70
Filter_by_Expression100 Input_File12 Input_File59 lookup80 Partition_by_Key_and_Sort44 Replicate71
Filter_by_Expression101 Input_File2 Input_File60 Lookup_File79 Partition_by_Key_and_Sort46 Replicate87
Filter_by_Expression13 Input_File20 Input_File61 Lookup_File80 Partition_by_Key_and_Sort49 Rollup19
Filter_by_Expression22 Input_File25 Input_File62 Lookup_File81 Partition_by_Key_and_Sort50 Rollup65
Filter_by_Expression23 Input_File27 Input_File68 Output_File5 Partition_by_Key_and_Sort6 Rollup89
Filter_by_Expression24 Input_File28 Input_File75 Output_File73 Partition_by_Key_and_Sort85
Filter_by_Expression40 Input_File3 Input_File78 Output_File77 Partition_by_Key_and_Sort86
Filter_by_Expression72 Input_File33 Input_File9 Partition_by_Key_and_Sort14 Partition_by_Key_and_Sort96
Filter_by_Expression84 Input_File36 Join1 Partition_by_Key_and_Sort15 reformat32
3

I need 101 files for this particular temp1.txt.
Please look into this.
Thanks in advance.
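
Two long listings like the ones above are hard to compare by eye. One option, sketched here under the assumption that each command is run in its own output directory (the names run1 and run2 below are hypothetical), is to let `comm` report the file names that exist in one directory but not in the other. `only_in_first` is a hypothetical helper name, and `mktemp` is assumed to be available.

```shell
# Hypothetical helper: print names present in directory $1 but missing from $2.
only_in_first() {
    a=$(mktemp) && b=$(mktemp) || return 1
    ls "$1" | sort > "$a"
    ls "$2" | sort > "$b"
    comm -23 "$a" "$b"     # column 1 only: lines unique to the first list
    rm -f "$a" "$b"
}
```

For example, `only_in_first run1 run2` would print the names that only the first command produced, and `ls run1 | wc -l` gives the file count for that run.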