Help me pls : splitting single file in unix into different files based on data


 
# 1  
Old 10-05-2012

I have a file on Unix with sample data as follows:
Code:
--------------------------------------------------------------
--------------------------------------------------------------
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\main|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\tag|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|metadata||7|8|RF=||{0|}}

I want to split this file into separate files. For the sample data above that means 2 files, named main and tag, with the following contents:
main:
Code:
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\main|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}

___________________________________________________________________________________________
tag:
Code:
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\tag|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|metadata||7|8|RF=||{0|}}

____________________________________________________________________________________________
I need to do some further analysis on these files. My file contains many similar partitions. Can anyone help me with this in UNIX? Thanks in advance.

Last edited by Scrutinizer; 10-05-2012 at 07:03 AM.. Reason: code tags
# 2  
Old 10-05-2012
What criteria are you following to split the file?
# 3  
Old 10-05-2012
Try this:
Code:
awk '{if($0 ~ /XXparameter\|Layout\|/){a++;print > "file_"a}else{if(a){print > "file_"a}}}' file

For your sample data above, two files are created:
Code:
$ ls file_*
file_1  file_2

This User Gave Thanks to pamu For This Post:
# 4  
Old 10-05-2012
I have to second raj_saini20's request to be more specific, since e.g. all the characteristic lines are identical. However, assuming the line containing "Layout" is the separator, this should do the job:
Code:
 awk '/Layout/{fn="file"++x} x{print > fn}' file
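
If the real file holds many partitions, some awk implementations can run out of open file descriptors when every output file stays open. A minimal sketch of the same "Layout starts a new file" idea that closes each file before opening the next is shown here; sample.txt and the part_ prefix are made-up names, and the sample lines are shortened stand-ins for the real data.

```shell
# Sketch only: sample.txt and the part_ prefix are hypothetical names.
cat > sample.txt <<'EOF'
header line, ignored
{1|XXparameter|Layout|first
body of first block
{1|XXparameter|Layout|second
body of second block
trailing line of second block
EOF

awk '
/Layout/ {                  # a line containing "Layout" starts a new block
    if (fn) close(fn)       # release the previous output file, if any
    n++
    fn = "part_" n          # part_1, part_2, ...
}
fn { print > fn }           # lines before the first "Layout" are skipped
' sample.txt
```

Each output name is used only once, so closing it early does not lose data.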

This User Gave Thanks to RudiC For This Post:
# 5  
Old 10-05-2012
How can I name the new files with the names that I get from the text?

---------- Post updated at 11:02 PM ---------- Previous update was at 11:00 PM ----------

With the code above I can get the starting text, but how do I make multiple files from one file that contains all the data? Can we achieve this using a loop? Can you please help me extract the lines, i.e. the starting position and the ending line as well? Thanks in advance.
# 6  
Old 10-05-2012
Quote:
Originally Posted by Ravindra Swan
How can I name the new files with the names that I get from the text?
I don't understand. As I pointed out before, all the "Layout" lines in your sample are identical. What should be the filename?

WAIT! There's main and tag in the line after the "Layout" line. Try
Code:
$ awk 'BEGIN{FS="[\\\|]"} /Layout/{a=$0; getline; fn=$14;  print a >fn } a{print > fn}' file
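
The field number 14 above depends on splitting at every single backslash as well as every pipe, and the "\|" escape inside an awk string is not handled the same way by every awk. A hedged alternative sketch, assuming the path is always the 4th pipe-separated field of the line after the "Layout" line, splits on "|" only and strips everything up to the last backslash; sample.txt is a made-up file name.

```shell
# Sketch only: sample.txt is a hypothetical file name; the data is the thread's sample.
cat > sample.txt <<'EOF'
---- separator before the first block ----
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\main|3|2|Pw$|@{0|}}
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\tag|3|2|Pw$|@{0|}}
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
EOF

awk -F'|' '
/\|Layout\|/ {
    layout = $0                  # remember the Layout line
    if ((getline) <= 0) exit     # stop if no line follows it
    fn = $4                      # assumed: 4th field is the Windows path
    sub(/.*\\/, "", fn)          # keep only what follows the last backslash
    print layout > fn            # the Layout line goes first
}
fn { print > fn }                # then the path line and the rest of the block
' sample.txt
```

With this sample it writes the files main and tag. A name that occurs again later in the input would simply continue in the same file.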

---------- Post updated at 11:02 PM ---------- Previous update was at 11:00 PM ----------

Quote:
With the code above I can get the starting text, but how do I make multiple files from one file that contains all the data? Can we achieve this using a loop? Can you please help me extract the lines, i.e. the starting position and the ending line as well? Thanks in advance.
Again, I don't understand. The code produces a new, different output file every time it encounters a line containing "Layout". BTW, you did not specify a criterion for how to split the file, as raj_saini20 already requested. Did you try the code? What do you mean by "starting position and ending line"? Where and when do we get these, and where should we put them?
Please provide meaningful input samples and the desired output.

Last edited by RudiC; 10-05-2012 at 03:32 PM.. Reason: Sorry, lines are not identical...
This User Gave Thanks to RudiC For This Post:
# 7  
Old 10-06-2012
Thanks for your response.

Code:
--------------------------------------------------------------
--------------------------------------------------------------
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\main|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
---------------------------------------------------------- {30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\tag|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
---------------------------------------------------------- {30001002|XXparameter|metadata||7|8|RF=||{0|}} 
{30001002|XXparameter|metadata||7|8|RF=||{0|}}



The starting line of each file should be highlighted in green and the ending line in violet. (The identical last line repeats, as you can see in the sample data; I need everything up to the very last of those lines in the file.) My original file has many partitions like this sample, and Layout even appears twice in each partition I want. Sorry, I did not notice that when I posted the sample data. I'll provide the smallest original file on Monday. Please try to solve that problem. Is the criterion for the first and last line clear now?


---------- Post updated at 12:42 PM ---------- Previous update was at 11:41 AM ----------

The file names (main and tag in the sample data) also repeat. So can we initialize a variable to zero, increment it every time a file is created, and append it to the end of the file name?
For example, the file names would be:
Code:
main1
tag2
main3
main4
tag5

Is this possible? Or do you have any suggestions, e.g. can we make it:
Code:
main1
tag1
main2
main3
tag2

That is, there are 3 files for main and 2 for tag. Is this possible? Please help me out.

Last edited by Scott; 10-07-2012 at 06:49 AM.. Reason: Use code tags, please, and less formatting.
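
Both numbering schemes asked about above look achievable in awk. The following is only a sketch under assumptions: the name is taken from the 4th pipe-separated field of the line after each "Layout" line, every partition has exactly one "Layout" line (the real file reportedly has two, which would need an extra rule), and sample.txt with its shortened paths is made up for illustration.

```shell
# Sketch only: sample.txt and its shortened paths are hypothetical.
cat > sample.txt <<'EOF'
{1|XXparameter|Layout|a
{1|XXparameter|!prototype_path|E:\\Datasets\\main|3|2|
{1|XXparameter|metadata|m1
{1|XXparameter|Layout|b
{1|XXparameter|!prototype_path|E:\\Datasets\\tag|3|2|
{1|XXparameter|metadata|t1
{1|XXparameter|Layout|c
{1|XXparameter|!prototype_path|E:\\Datasets\\main|3|2|
{1|XXparameter|metadata|m2
EOF

awk -F'|' '
/\|Layout\|/ {
    layout = $0                  # remember the Layout line
    if ((getline) <= 0) exit     # stop if no line follows it
    name = $4                    # assumed: 4th field is the Windows path
    sub(/.*\\/, "", name)        # keep only what follows the last backslash
    if (fn) close(fn)            # previous file is complete
    k = ++count[name]            # one counter per name: main1, tag1, main2
    # for a single running counter instead (main1, tag2, main3): k = ++total
    fn = name k
    print layout > fn
}
fn { print > fn }                # remaining lines of the current partition
' sample.txt
```

With this sample the per-name counter produces main1, tag1 and main2; swapping in the commented line would give main1, tag2 and main3 instead.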