Post 302578855 by leolson on Friday 2nd of December 2011 02:30:21 PM
12-02-2011
Need help splitting huge single record file

I was given a data file that I need to split into multiple lines/records based on a keyword. The problem is that it is 2.5GB or larger, and everything I try in perl or sed causes a segmentation fault. Can someone give me some other ideas?

The data is of the form:
Code:
RANDOMDATA*end*RANDOMDATA*end*RANDOMDATA*end*RANDOMDATA*end*



with no LFs (line feeds) to break it up.
I have tried things such as:
Code:
sed "s/\*end\*/\n/g" test1.text > test2.txt
cat test1.text | sed "s/\*end\*/\n/g" > test2.txt
perl -p -e "s/\*end\*/\n/g" test1.text > test2.txt


which all fail with:
Code:
Segmentation fault



Any ideas?

Thanks in advance!

Last edited by radoulov; 12-02-2011 at 04:41 PM.. Reason: Code tags!
 
