4 GB delimited-textfile on ONE LINE


 
# 1  
Old 04-30-2009

I have delimited text files (> 4 GB), and each one is just a single line.


OS: HP-UX 11.23

awk, cut and sed all have LINE_MAX limitations and are unable to read the single line in buffered mode.


Sample file:
xxxx|adsfadf|Afdsa|adsf|afds|Asdfas|ads|Afds|Asdf| .....till forever,

I want to insert a line break (newline) after every 17 fields.

xxx|asdfad|asdfadsf|adfsadsf
afdsadsf|adfsadsf|Afds|adsf|Asdf


This has to be done in unbuffered mode, so I tried fold -w 500 inputfile > outputfile,
then a while loop (line by line) that reads multiples of 17 fields and appends the remaining fields to the next line.


See my attempts below:


#!/usr/bin/sh -x

inputfile=$1
foldfile=${inputfile}.fold
outputfile=${inputfile}.new

i=0
multiplier=17 ### This is 17 Fields to be split



##gawk -F"|" 'OFS="," { for (i=1;i<=NF;i += 17) { print $i, $(i+1 ), $(i+2) , $(i+3) , $(i+4) , $(i+5) , $(i+6) , $(i+7) , $(i+8) , $(i+9) , $(i+10) , $(i+11) , $(i+12) , $(i+13) , $(i+14) , $(i+15) , $(i+16) } fflush(""); } ' $inputfile > $outputfile


fold -w 500 $inputfile > $foldfile

truncated_fields=''

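## IFS= and -r keep whitespace and backslashes intact; the || test picks up
## a last chunk that has no trailing newline
while IFS= read -r line || [ -n "$line" ]
do
    ## prepend whatever was held back from the previous chunk
    line="${truncated_fields}${line}"
    fields_per_line=`printf '%s\n' "$line" | awk -F"|" '{print NF}'`

    ## fold can cut in the middle of a field, so the last field is always held back
    ## ex: ( 69 - 1 ) / 17 = 4 loops + 1 remaining field
    fields_multiples_multiplier=`expr \( ${fields_per_line} - 1 \) / ${multiplier}`
    start_index_of_trunc_fields=`expr ${fields_multiples_multiplier} \* ${multiplier} + 1`

    i=0 ### reset for every chunk
    while [ $i -ne $fields_multiples_multiplier ]
    do
        start=`expr ${multiplier} \* ${i} + 1`
        end=`expr ${start} + 16`

        printf '%s\n' "$line" | cut -d"|" -f${start}-${end} >> $outputfile

        i=`expr ${i} + 1`
    done

    ## everything after the last complete record goes to the next chunk
    truncated_fields=`printf '%s\n' "$line" | cut -d"|" -f${start_index_of_trunc_fields}-`

    ## echo $fields_multiples_multiplier $start_index_of_trunc_fields $fields_per_line
    ## echo $truncated_fields
    ## sleep 10
done < $foldfile

## write out the fields still held back after the last chunk
if [ -n "$truncated_fields" ]; then
    printf '%s\n' "$truncated_fields" >> $outputfile
fi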
# 2  
Old 04-30-2009
Have you tried converting each '|' to a newline ('\n') and then reconstructing a record/line from every 17 consecutive lines?
Code:
tr '|' '\n' < myFile | nawk 'ORS=(FNR%17)?OFS:RS' OFS='|'
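
The same idea can be checked on a toy line before running it against the 4 GB file. This is only a sketch: the `regroup` function name and the group size of 3 are made up for the illustration, and plain `awk` stands in for `nawk`. It prints the separator before each field instead of after it, so a short final record does not end in a stray delimiter. `tr` works on a character stream, so it is not bothered by the length of the input line, and awk only ever sees one field per line.

```shell
# regroup N: read one long '|'-delimited line on stdin, emit N fields per line.
# (regroup is a hypothetical helper name, not something from the thread.)
regroup() {
    tr '|' '\n' |
    awk -v n="$1" '
        # Print the separator before each field: a newline at every record
        # boundary, the delimiter otherwise, nothing before the first field.
        { printf "%s%s", (NR == 1 ? "" : ((NR - 1) % n ? "|" : "\n")), $0 }
        END { if (NR) printf "\n" }'
}

# Toy input, 3 fields per record instead of 17:
printf 'a|b|c|d|e|f|g' | regroup 3
```

For the real file the call would be something like `regroup 17 < myFile > myFile.new`.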


Last edited by vgersh99; 05-01-2009 at 01:14 PM..
# 3  
Old 05-02-2009
Thanks very much

Thank you very much for your help,

But what if the delimiter is the string (1)?

Ex: one huge line that does not work with sed, awk, gawk, etc.:

text_a(1)text_b(1)text_c(1)text_d(1)text_e(1)text_c(1)text_d(1)texte(1)


The required output is:

text_a(1)text_b(1)text_c(1)text_d(1)
text_e(1)text_c(1)text_d(1)texte(1)

...etc


tr does not work with a regexp:

tr '\(1\)' '\n' < file

Any ideas?

Thanks again for your help
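
A small check of why the tr attempt cannot work: tr translates individual characters, not strings, so each "(", "1" and ")" is mapped on its own, including a "1" that is part of a field. The input string and the "#" replacement below are made up for the demonstration.

```shell
# Each of the three characters ( 1 ) is replaced separately, so the "1"
# that belongs to the first field is replaced too.
printf 'a1(1)b\n' | tr '(1)' '###'
# prints: a####b
```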
# 4  
Old 05-02-2009
assuming your 'text' fields don't contain '(':
Code:
# with '|' as a separator
tr '(' '\n' < myFile | nawk '{sub("^[0-9][0-9]*[)]", "")}ORS=(FNR%17)?OFS:RS' OFS='|'

OR
assuming your 'text' fields don't contain ')':
Code:
# preserving the original separators
tr ')' '\n' < myFile | nawk 'ORS=(FNR%17)?OFS:")"RS' OFS=')' | sed '$s/)$//'


Last edited by vgersh99; 05-02-2009 at 12:22 PM..
# 5  
Old 05-02-2009
text is unguarded

text_a, text_b and text_c can contain anything. They can contain:
a "(" alone,
a ")" alone,
and also
"(" and ")" at the same time.

One question: what is the best approach to tackle such a problem?
Perl,
C under Unix,
or something else?
My basic understanding is that if I fold the one-line file to a fixed width
and then loop line by line, it is the same as reading it chunk by chunk (1048 bytes or so).

Does Unix re-open the file for every line (opening and closing a file handle each time, which would explain why it takes so long)?

Thanks for any clarification.
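
One way the fold-and-carry-over idea from this post could look for the "(1)" delimiter is sketched below. It is not the solution the thread settled on (that was never posted). The `split_records` name is hypothetical, and the sketch assumes that the complete string "(1)" never occurs inside a field (a lone "(" or ")" is fine) and that fold itself copes with the one-line input. A single awk process reads the folded chunks, so nothing is forked per chunk. The last piece of each chunk is always held back, because fold may cut in the middle of a field or in the middle of a delimiter.

```shell
# split_records N [WIDTH]: stdin is one huge line whose fields each end in "(1)".
# Writes N fields per output line. WIDTH is the fold width (default 500).
split_records() {
    fold -w "${2:-500}" |
    awk -v n="$1" '
        {
            buf = buf $0                      # carry-over plus the new chunk
            cnt = split(buf, f, "[(]1[)]")    # the last piece may be incomplete
            full = int((cnt - 1) / n)         # complete records in this buffer
            for (r = 0; r < full; r++) {
                rec = ""
                for (k = 1; k <= n; k++) rec = rec f[r * n + k] "(1)"
                print rec
            }
            # Hold back everything after the last complete record.
            buf = f[full * n + 1]
            for (k = full * n + 2; k <= cnt; k++) buf = buf "(1)" f[k]
        }
        END { if (buf != "") print buf }'
}

# The sample line from post #3, 4 fields per record, folded at 7 characters:
printf 'text_a(1)text_b(1)text_c(1)text_d(1)text_e(1)text_c(1)text_d(1)texte(1)' |
    split_records 4 7
```

The buffer never holds more than one record plus one chunk, so memory use stays small regardless of the file size.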
# 6  
Old 05-13-2009
Thanks, issue resolved.