Forum: Shell Programming and Scripting. Post 302899054 by lupin..the..3rd on Friday 25th of April 2014 04:58:46 PM
Splitting a delimited text file

Howdy folks, I've got a very large plain text file that I need to split into many smaller files. My script-fu is not powerful enough for this, so any assistance is much appreciated.

The file is a database dump from a Cyrus IMAP server. It's basically thousands of emails, all concatenated into one huge file, with a delimiter line before each email. It looks something like this:

Code:
--dump-4564564.some.jibberish.whatever
From: user@domain.com
To: myfriend@email.com

Email Body

Best Regards,
Email Author

--dump-789789863.random.numbers.maybe
From: anotheruser@domain.com
To: someguy@planet.earth

Email Body

Your Friend,
another user

So as you can see, each email is preceded by a line that begins with "--dump".

What I'm looking for is:

1. To split this monolithic file into many smaller files, each containing a single email.
2. Each smaller file should contain all of the lines of text after a "--dump" delimiter, up to the next "--dump" delimiter (or end of file).
3. The "--dump" delimiter line itself should not be included in any of the smaller files.

I feel like some awk/grep/sed magic could do this, but I'm not enough of a wizard to write this script.

Thank you very much!
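
One possible approach, sketched with awk under the assumptions stated in the post: every delimiter line starts with "--dump" at the beginning of the line, and no line inside an email body starts that way. The function name split_dump and the output names email_00001.txt, email_00002.txt, ... are made up for illustration. Anything that appears before the first delimiter is skipped.

Code:
```shell
#!/bin/sh
# split_dump INPUT OUTDIR
# Hypothetical helper: writes one file per email into OUTDIR.
# Assumes each email is introduced by a line starting with "--dump".
split_dump() {
    input=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        /^--dump/ {
            # Delimiter line: finish the previous file, start the next one.
            if (out != "") close(out)
            n++
            out = sprintf("%s/email_%05d.txt", dir, n)
            printf "" > out    # create the file even if the email is empty
            next               # do not copy the delimiter itself
        }
        # Ordinary line: copy it, but only once a delimiter has been seen.
        out != "" { print > out }
    ' "$input"
}

# Example call (file and directory names are placeholders):
#   split_dump cyrus_dump.txt emails
```

Closing each output file before opening the next keeps the number of simultaneously open files at one, which matters with thousands of emails, since some awk implementations limit how many files can be open at once. Blank lines between the end of one email and the next delimiter stay at the end of that email's file.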
 
