Reading large file, awk and cut


 
# 1  
Old 10-17-2008

Hello all,
I have two files. The first (indexFile1) contains the start offset and length of each record inside the second file. The second file can be very large, and each record in it can only be located through its entry in indexFile1. Since there are no record separators, wc -l returns 0 for the second file, no matter how large it actually is.

I want to gather all the records one at a time and write them out to a new file individually.
What is the best way to approach this processing?

I suspect I will have trouble reading the whole large file into a variable (using awk) and then running cut on the variable's contents to collect each record, like this:

FileContent=$(awk '{print $0}' largeFile2) # this is where I think I have a problem: the whole file ends up in one variable

# Each line of indexFile1 holds the start offset and length of one record in largeFile2
while read -r offset length; do
    # cut -c counts characters from 1, so shift the 0-based offset by one
    (( from = offset + 1 ))
    (( to = offset + length ))
    newOut=$(echo "$FileContent" | cut -c "$from-$to")
    echo "$newOut" >> newfile
done < indexFile1

I have it working OK when largeFile2 is small, but I can see a problem once that file gets large.
I hope you can give me some suggestions on how to do this better.
Thanks!

Last edited by gio001; 10-17-2008 at 03:46 PM..
# 2  
Old 10-18-2008
Sample input and desired output, please.
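In the meantime, here is a minimal sketch of one way to avoid loading the whole file into a variable. It assumes the offsets in indexFile1 are 0-based byte offsets, one "offset length" pair per line, and that each record should go on its own line of the output file. The function name extract_records is made up for illustration, and head -c is widely available but not guaranteed by POSIX.

```shell
#!/bin/sh
# extract_records DATA INDEX OUT   (hypothetical helper, not from the original post)
# INDEX: one "offset length" pair per line; offset is assumed 0-based, in bytes.
extract_records() {
    data=$1
    index=$2
    out=$3
    : > "$out"                                # start with an empty output file
    while read -r offset length; do
        [ -n "$length" ] || continue          # skip blank or incomplete lines
        # tail -c +N starts output at byte N (1-based);
        # head -c keeps only the first $length bytes of that stream.
        tail -c +"$((offset + 1))" "$data" | head -c "$length" >> "$out"
        printf '\n' >> "$out"                 # one record per output line
    done < "$index"
}
```

It would be called as `extract_records largeFile2 indexFile1 newfile`. Only the requested bytes pass through the pipe, so memory use stays small, but every record starts two processes, which may be slow if the index has a great many entries.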