The UNIX and Linux Forums  

  #1 (permalink)  
Old 02-17-2008
Registered User
 

Join Date: Feb 2008
Posts: 10
parse through one text file and output many

Hi, everyone

The input file follows this pattern:

Begin Object1

txt1

end
;


Begin Object2

txt2

end
;

...


I want to split this one file into Object1.txt, Object2.txt, ..., each containing one statement from 'Begin' to ';'. For example, Object1.txt contains:
Begin Object1

txt1

end
;

---------------------------
Any thoughts?
I also have some questions:
1. Can awk or sed search for a pattern that spans many lines?
2. How can I write the output to many files?
3. Should I move this thread to "Shell Programming and Scripting"?

Thank you in advance
  #2 (permalink)  
Old 02-17-2008
Registered User
 

Join Date: Oct 2007
Posts: 120
If you use bash, you could try something like this. It is very simple and probably has lots of pitfalls, but since you are processing source code, you can expect the input to follow some syntactic rules. The script simply checks each line for the word Begin and increments the file name index whenever it finds one.
Code:
lakris@ubuntu:~/projekt/scripts$ cat projekt.txt 
Begin Object1
txt1
end
;
Begin Object2
txt2
end
;
Begin Object3
txt3
end
;
Begin Object4
txt4
end
;
lakris@ubuntu:~/projekt/scripts$ cat splitit.sh 
#!/bin/bash
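cnt=0
# IFS= and -r keep whitespace and backslashes in each line intact
while IFS= read -r line; do
  # a line that starts with Begin opens the next object
  [[ $line =~ ^[[:space:]]*Begin ]] && cnt=$((cnt+1))
  echo "$line goes into Object$cnt.txt"
done < projekt.txt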
lakris@ubuntu:~/projekt/scripts$ ./splitit.sh 
Begin Object1 goes into Object1.txt
txt1 goes into Object1.txt
end goes into Object1.txt
; goes into Object1.txt
Begin Object2 goes into Object2.txt
txt2 goes into Object2.txt
end goes into Object2.txt
; goes into Object2.txt
Begin Object3 goes into Object3.txt
txt3 goes into Object3.txt
end goes into Object3.txt
; goes into Object3.txt
Begin Object4 goes into Object4.txt
txt4 goes into Object4.txt
end goes into Object4.txt
; goes into Object4.txt
lakris@ubuntu:~/projekt/scripts$
Once you are confident that the output is what you want, replace the echo line with one that appends the line to the file, e.g. echo "$line" >> "Object$cnt.txt". It will append to any existing file with that name, so you may want to remove any Object*.txt first.
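
For reference, here is a minimal sketch of that final step, not a polished tool. The scratch directory and the two-object sample input are made up for the demo, and it uses case instead of [[ ]] only so that it also runs under plain sh. Lines that appear before the first Begin are skipped, which is an assumption about what you would want.

```shell
#!/bin/sh
# Demo setup (hypothetical): work in a scratch directory with a small sample input.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Begin Object1' 'txt1' 'end' ';' \
              'Begin Object2' 'txt2' 'end' ';' > projekt.txt

rm -f Object*.txt            # start clean, because >> appends
cnt=0
while IFS= read -r line; do
  # a line starting with Begin opens the next output file
  case $line in
    Begin*) cnt=$((cnt+1)) ;;
  esac
  # anything before the first Begin has no file to go to, so skip it
  if [ "$cnt" -gt 0 ]; then
    printf '%s\n' "$line" >> "Object$cnt.txt"
  fi
done < projekt.txt
```

After it runs, the scratch directory holds Object1.txt and Object2.txt, each with its four lines from Begin to ;.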

/Lakris
  #3 (permalink)  
Old 02-17-2008
Registered User
 

Join Date: Feb 2008
Posts: 10

Thank you very much, Lakris. I will try it out.
  #4 (permalink)  
Old 02-17-2008
Registered User
 

Join Date: Feb 2008
Posts: 10
Oh, what if the input file looks like this:
Begin aaaaa
txt1
end
;
Begin bbbbbb
txt2
end
;
Begin cccc
txt3
end
;
Begin ddd
txt4
end
;
  #5 (permalink)  
Old 02-18-2008
Registered User
 

Join Date: Oct 2007
Posts: 120
Then the first Begin statement (aaaaa) ends up in Object1.txt, the second (bbbbbb) in Object2.txt, etc. Do you want to have them named Object-aaaaa.txt, Object-bbbbbb.txt, etc.?
Have a look at the while read line construct. You can split it up to read more than one variable, or you can treat line as an array.
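
A rough sketch of the "more than one variable" idea, in case it helps. The variable names (first, name, rest, out), the scratch directory and the sample input are all made up here. Each line is read whole so that it can be written out unchanged, and a copy is then split into words to pick up the name after Begin.

```shell
#!/bin/sh
# Demo setup (hypothetical): scratch directory and a small sample input.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Begin aaaaa' 'txt1' 'end' ';' \
              'Begin bbbbbb' 'txt2' 'end' ';' > input.txt

out=""
while IFS= read -r line; do
  # split a copy of the line into first word, second word and the rest
  read -r first name rest <<EOF
$line
EOF
  if [ "$first" = "Begin" ] && [ -n "$name" ]; then
    out="$name.txt"
    : > "$out"               # create or empty the file for this object
  fi
  # copy the untouched line into the current file, if there is one
  if [ -n "$out" ]; then
    printf '%s\n' "$line" >> "$out"
  fi
done < input.txt
```

With the sample input this gives aaaaa.txt and bbbbbb.txt. If two objects share a name, the later one replaces the earlier one.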
  #6 (permalink)  
Old 02-18-2008
radoulov
addict
 

Join Date: Jan 2007
Location: Milan, Italy/Varna, Bulgaria
Posts: 1,516
Code:
awk '/^Begin/{close(f);f=$2".txt"}f{print>f}' input
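
For anyone who finds the one-liner hard to read, here is the same logic spread over several lines with comments, as I understand it. The scratch directory and the sample file are only there to make the demo self-contained; the file name input matches the one-liner.

```shell
#!/bin/sh
# Demo setup (hypothetical): scratch directory and a small sample input.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Begin aaaaa' 'txt1' 'end' ';' \
              'Begin bbbbbb' 'txt2' 'end' ';' > input

awk '
  /^Begin/ {            # a new statement starts on this line
    if (f) close(f)     # finish the previous output file, if any
    f = $2 ".txt"       # the second field names the next output file
  }
  f { print > f }       # once a file name is set, copy every line into it
' input
```

This touches on the original questions 1 and 2: awk does not need a multi-line pattern here, because the variable f remembers the current file from one line to the next, and print > f can write to as many files as there are names. Closing each file before opening the next keeps the number of open files low. Note that a name that appears twice would overwrite its earlier file.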
  #7 (permalink)  
Old 02-18-2008
Registered User
 

Join Date: Feb 2008
Posts: 10
Quote:
Originally Posted by Lakris
Then the first Begin statement (aaaaa) ends up in Object1.txt, the second (bbbbbb) in Object2.txt, etc. Do you want to have them named Object-aaaaa.txt, Object-bbbbbb.txt, etc.?
Have a look at the while read line construct. You can split it up to read more than one variable, or you can treat line as an array.
No, I don't want them named Object-aaaaa.txt. The file should be named aaaaa.txt.


Thanks