Help me pls : splitting single file in unix into different files based on data


 
# 50  
Old 10-14-2012
Hi,

Please have a look at the code below.

This is what I have come up with so far. (It is just a base structure; we can modify it to fit your requirement.)

Please correct me if I am missing something.
All the files fed to the while loops are lists of file names.

Code:
while read -r line
do
    # Get the XXGoport column number, add 5000 and go to the next step.
    for i in $(awk -F "|" '/XXGoport\|1260\|0\|3871\|0\|/{print $8+5000}' "$line")
    do
        while read -r line_2
        do
            # Find the XXGflow line that carries $i, subtract 5000 and go to the next step.
            # $(NF-3) is the column to print; choose the column number you need.
            for j in $(awk -F "|" -v VM="$i" '/XXGflow/ && index($0, "|" VM "|") {print $(NF-3)-5000}' "$line_2")
            do
                while read -r line_3
                do
                    # Print the XXGiport line that carries $j.
                    awk -v IM="$j" '/XXGiport/ && index($0, "|" IM "|")' "$line_3"
                done < out_file_iport   # list of files searched for the iport number
            done
        done < output_file   # list of files searched for the XXGflow number
    done
done < Input_files   # Input_files is a file listing all input files

Hope this helps you.
This User Gave Thanks to pamu For This Post:
# 51  
Old 10-14-2012
Clarification

Suppose we have:
Quote:
Input_File1
Join2
Output_File3
reformat4
Reformat5
PKS6
Lookup7
Input_File8
Now I need:

Quote:
Input_File1
Input_File1_f
Join2
Output_File3
reformat4
Reformat5
PKS6
Lookup7
Input_File8
Input_File8_f

Quote:
where cat Input_File1_f might give:

Input_File1
Reformat5
Lookup7
Output_File3
and
Quote:
cat Input_File8_f might give:

Input_File1
Reformat5
PKS6
reformat4
Output_File3

For every Input_File I have, a corresponding Input_File_f file must be generated, and it must contain the names of the files the data flows through (I guess you already have a picture of the flow with +5000 and -5000). In every Input_File_f, the first line will be the corresponding Input_File name and the last line will be the Output_File.

Is it clear now?
# 52  
Old 10-14-2012
Quote:
Originally Posted by Ravindra Swan
First line will be the corresponding Input_file name and last line will be Output_File.
Okay. But what about the middle lines? I don't see any relation between them that we could use to work them out.
# 53  
Old 10-14-2012
Clarification

The relationship between the middle lines can be worked out with the +/-5000 logic.
I will explain it in a simpler way. The numbers below are made up, but the logic is the same one we need to implement.

Suppose:
i/p1 file has O/P number 1000
join2 has I/P number 5555 , O/P number 1111
reformat3 has I/P number 6666 , O/P number 2222
output has I/P number 7777

1.) Start from i/p1 and take its O/P number, 1000.
2.) Add 5000, which gives 6000.
3.) Go to the XXGFlow file and search for the line containing 6000:

Quote:
Quote:
{2010210004|XXGflow|35|0|69|0|{@{}@384|.5|.994955837726593|{12|207000|6000|227000|6000|2149000|6000|2149000|1069000|2159000|1069000|2180000|10555|}39469|17|}}


4.) So from the XXGFlow line we take 10555.
5.) Subtract 5000: 10555-5000 = 5555.
6.) Go back to the files and search for 5555; you will find it as the I/P number of join2.
7.) Search the same file (join2) for its O/P number; you will get 1111.
8.) Add 5000: 1111+5000 = 6111.
9.) Search XXGFlow for 6111; you get 11666.
10.) Subtract 5000: 11666-5000 = 6666.
11.) Search for 6666 in our files; you will find it in reformat3.
12.) Now search for the O/P number in reformat3; you will get 2222.
13.) Add 5000: 2222+5000 = 7222. Search XXGFlow; you get 12777.
14.) Subtract 5000 to get 7777. Search for 7777 in our files; it is the Output_File.
15.) Job done.

Now Input_File_f contains:
i/p1,join2,reformat3,output
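
The fifteen steps above can be sketched in a few lines of shell and awk. This is only a toy: the two-table layout (ports.txt, flows.txt) and the function name trace_chain are assumptions made up for the illustration, not the real graph file format, and the numbers are the made-up ones from the example.

```shell
#!/bin/sh
# Toy data only: this two-table layout is an assumption for illustration,
# not the real graph file format.
workdir=$(mktemp -d)

# component   I/P number   O/P number   ("-" means the component has none)
cat > "$workdir/ports.txt" <<'EOF'
i/p1 - 1000
join2 5555 1111
reformat3 6666 2222
output 7777 -
EOF

# flow start (O/P number + 5000)   flow end (I/P number + 5000)
cat > "$workdir/flows.txt" <<'EOF'
6000 10555
6111 11666
7222 12777
EOF

# trace_chain START: print the components reached from START, one per line.
trace_chain() {
    awk -v start="$1" '
        FNR == NR { out_port[$1] = $3; by_in[$2] = $1; next }   # first file: ports
        { flow[$1] = $2 }                                       # second file: flows
        END {
            name = start
            while (name != "") {
                print name
                if (out_port[name] == "-") break       # reached an output file
                target = flow[out_port[name] + 5000]   # add 5000, look up the flow
                if (target == "") break                # no flow leaves this port
                name = by_in[target - 5000]            # subtract 5000, find the owner
            }
        }
    ' "$workdir/ports.txt" "$workdir/flows.txt"
}

chain=$(trace_chain "i/p1" | paste -sd, -)
echo "$chain"
```

With the toy tables this prints the same chain as listed above. The sketch follows a single flow per O/P number and does not guard against loops in the graph.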

Got it?

This is just a small example. We can have any number of I/P files and many other files, and in some cases more than one O/P file.

So this is what I need.
I want you to be very clear about this, because I am mostly dependent on you.
If you still have any doubts, come back to me.
For now just get to know the logic; then I'll tell you the keywords for each number (I/P and O/P) and we will try to figure it out.

I know it is a bit difficult; even for me it took 1.5 months to figure out. Basically my situation is:

I am working with the Ab Initio tool (some GE tool), which stores everything in unix.
I now need to automate something, so I analysed how a graph is stored in unix and how it is interpreted. At present we are at the stage of scripting what I analysed and need.


Thanks a lot for your patience.
# 54  
Old 10-16-2012
Zip file

Hi, please find the zip file attached.
I have zipped the main document only. If this post gets published, I'll send you the commands which need to be run.
# 55  
Old 10-16-2012
Zip file


---------- Post updated at 01:19 PM ---------- Previous update was at 01:04 PM ----------

First run this code:
Code:
 
awk -F "\\\.mdc\||\\\.mpc\||\\\.mp\|" '{if($0~/Layout\|\$\[\[recor/){if(fn && s){print s > fn;s=$0;}else{s=$0}}
else if(NF > 1 && $0 ~ /Ab Initio/){n=split($1,a,"\\");x++;fn=a[n]x;{s=s"\n"$0}}
else if($0 ~ /PROJECT_DIR|serial\/lookup/){fn="lookup"x;s=s"\n"$0}
else if(s){s=s"\n"$0}
}END{print s > fn}' temp1.txt

You will get around 101 files generated. Note that there will be 3 Output_Files.
Now run the following:

Code:
 
ls > File_name_temp

Code:
 
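while read -r line
do
# Rename every file that mentions both PROJECT_DIR and serial/lookup
# to "lookup" plus whatever is left of its name once the letters are removed.
if [[ $(awk '/PROJECT_DIR/ && /serial\/lookup/' "$line") ]]
then
file_name="lookup"$(echo "$line" | sed 's/[a-zA-Z]//g')
echo "$file_name"
mv "$line" "$file_name"
fi
done<File_name_temp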

Now you will find only one Output_File.

Code:
 
cat temp1.txt | grep "{2010210004|XXGflow|">xxgflow.txt

Now you will have a file listing the flows (I mentioned this in previous posts; it is used to trace the flow between components).

If everything works fine, get back to me.
If it does not work, we will discuss that too.

---------- Post updated at 03:07 PM ---------- Previous update was at 01:19 PM ----------

//logic:

Quote:
Code:
 
ls | grep "Input_File"
Input_File11
Input_File12
Input_File2
Input_File20
Input_File25
Input_File27
Input_File28
Input_File3
Input_File33
Input_File36
Input_File38
Input_File39
Input_File48
Input_File53
Input_File54
Input_File55
Input_File56
Input_File57
Input_File58
Input_File59
Input_File60
Input_File61
Input_File62
Input_File68
Input_File78
Input_File9

Code:
 
cat Input_File11 | grep "{2010203004|XXGoport|"

O/P:
Quote:
Quote:
{2010203004|XXGoport|261|0|694|0|{@{}@206000|1034000|11000|11000|read|0.0|@@@2160|0|}}

{2010203004|XXGoport|263|0|699|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7| 8|RF=||{0|}}


1034000+5000 = 1039000
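
This step can be done in a single awk call. A minimal sketch, assuming the port number is always the 8th |-separated field of an XXGoport line (as in the first sample line); the variable name flow_key is made up for the illustration. The second sample line has text in that field, so it is skipped.

```shell
# Both XXGoport lines are copied from the Input_File11 output above.
flow_key=$(awk -F '|' '
    # Keep only XXGoport lines whose 8th field is purely numeric, then add 5000.
    /\|XXGoport\|/ && $8 ~ /^[0-9]+$/ { print $8 + 5000 }
' <<'EOF'
{2010203004|XXGoport|261|0|694|0|{@{}@206000|1034000|11000|11000|read|0.0|@@@2160|0|}}
{2010203004|XXGoport|263|0|699|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7| 8|RF=||{0|}}
EOF
)
echo "$flow_key"
```

Against a real component file the here-document would be replaced by the file name, for example Input_File11.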

Code:
 
cat xxgflow.txt | grep "1039000"
{2010210004|XXGflow|66|0|131|0|{@{}@384|.5|.5|{12|358000|1039000|378000|1039000|857000|1039000|857000|3004000|1336000|3004000|1367000|3004000|}39552|17|}}
{2010210004|XXGflow|96|0|191|0|{@{}@384|.5|.5|{12|217000|1039000|237000|1039000|251000|1039000|251000|1034000|265000|1034000|286000|1034000|}39479|17|}}

1034000-5000=1029000
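
The flow lookup and the subtraction can be combined the same way. A sketch, assuming the end value of a flow is always the 4th field from the end of the XXGflow line, as it is in both sample lines; the variable names are made up for the illustration.

```shell
workdir=$(mktemp -d)

# The two XXGflow lines returned by the grep above.
cat > "$workdir/xxgflow.txt" <<'EOF'
{2010210004|XXGflow|66|0|131|0|{@{}@384|.5|.5|{12|358000|1039000|378000|1039000|857000|1039000|857000|3004000|1336000|3004000|1367000|3004000|}39552|17|}}
{2010210004|XXGflow|96|0|191|0|{@{}@384|.5|.5|{12|217000|1039000|237000|1039000|251000|1039000|251000|1034000|265000|1034000|286000|1034000|}39479|17|}}
EOF

flow_key=1039000
# Match the key as a whole field (between two pipes) so that a longer number
# containing the same digits is not picked up, then subtract 5000 from the end value.
iport_keys=$(awk -F '|' -v key="$flow_key" '
    index($0, "|" key "|") { print $(NF-3) - 5000 }
' "$workdir/xxgflow.txt")
echo "$iport_keys"
```

Both sample lines contain 1039000, so the command prints one candidate per line; the walkthrough here continues with 1029000.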

Now search for 1029000 in the lines that start with the keyword

Quote:
{2010202004|XXGiport|
You will find it in Replicate52. Now:

Code:
 
cat Replicate52 | grep "{2010203004|XXGoport|"
{2010203004|XXGoport|937|0|2738|0|{@{}@347000|1024000|11000|21000|out|0.0|@@@2068|0|}}
{2010203004|XXGoport|940|0|2747|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7|8|RF=||{0|}}

1024000+5000=1029000

Search for 1029000 in XXGFlow:

Code:
 
cat xxgflow.txt | grep "1029000"
{2010210004|XXGflow|64|0|127|0|{@{}@384|.5|.5|{12|358000|1029000|378000|1029000|416000|1029000|416000|1024000|455000|1024000|476000|1024000|}39268|17|}}

1024000-5000=1019000

You will find it in Filter_by_Expression (I don't know what the suffix number is):

Code:
{2010202004|XXGiport|439|0|1221|0|{@{}@477000|1019000|11000|11000|in|0.0|@@@1808|0|}}
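
Finding which component file owns a given I/P number can be scripted too. A sketch, assuming the number sits in the 8th field of the XXGiport line as in the sample; the file names component_a and component_b are hypothetical stand-ins for the generated component files.

```shell
workdir=$(mktemp -d)

# Two stand-in component files; the lines themselves are copied from this thread.
printf '%s\n' '{2010202004|XXGiport|439|0|1221|0|{@{}@477000|1019000|11000|11000|in|0.0|@@@1808|0|}}' > "$workdir/component_a"
printf '%s\n' '{2010203004|XXGoport|937|0|2738|0|{@{}@347000|1024000|11000|21000|out|0.0|@@@2068|0|}}' > "$workdir/component_b"

iport_key=1019000
# Print the name of every file that has an XXGiport line with the key in field 8.
owner=$(cd "$workdir" && awk -F '|' -v key="$iport_key" '
    /\|XXGiport\|/ && $8 == key { print FILENAME }
' * | sort -u)
echo "$owner"
```

The sort -u only removes repeats when a file has more than one matching line.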



......
The process continues until an Output_File is encountered, and
Input_File11_f
will then contain Input_File11,Replicate52,Filter_by_Expression.....Output_File5


Got it?
This process needs to be executed for all the input files,
so in a loop you can take the count of input files and pick one particular file each time.

Code:
ls | grep "Input_File" | wc -l
26

Now the loop should run 26 times, each time taking one Input_File:



for(i=0;i<26;i++)
{
//logic//
}

//logic: is the process described in the quotes above.
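
In shell the loop does not need the count at all, since it can iterate over the files directly. A sketch only: trace_flow is a hypothetical placeholder for the +5000/-5000 logic, and the empty files stand in for the generated component files.

```shell
workdir=$(mktemp -d)

# Empty stand-ins for generated files (names taken from the listings above).
touch "$workdir/Input_File11" "$workdir/Input_File2" "$workdir/Replicate52"

# trace_flow is a hypothetical placeholder for the +5000/-5000 logic;
# here it only prints the name of the starting file.
trace_flow() {
    printf '%s\n' "$1"
}

count=0
for path in "$workdir"/Input_File*
do
    case $path in
        *_f) continue ;;    # skip result files left over from an earlier run
    esac
    name=${path##*/}        # strip the directory part
    trace_flow "$name" > "${path}_f"
    count=$((count + 1))
done
echo "$count input files processed"
```

Each Input_File gets its own _f file next to it, and the first line of that file is the input file name, as required.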
Get back to me for any issues.