It looks like Scrutinizer's suggestion should work just fine as long as:
trannum does not contain any characters that are special in an ERE, and
the number of bytes in a single transaction (from ##PAYMNT through 0000EOT) is not more than 2047 bytes.
So:
What is the format of trannum? Is it all alphanumeric characters? (If it isn't all alphanumeric, what characters can a trannum include?) How many characters are in a trannum? (Is it always the same number of characters, or does it vary? If it varies, what are the minimum and maximum lengths?)
What is the maximum number of bytes (not characters; bytes) in a transaction? If that number is larger than 2047, what is the maximum number of bytes in a single line of a transaction? (As long as the number of bytes in a line, including the terminating <newline> character, is no larger than 2048 bytes, we can easily do this. If it is more than 2048 bytes, it takes more work to get what you want on AIX.)
I would do it slightly differently (to quit after the desired transaction is found):
which should cut the time awk spends reading your large file about in half, on average.
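The modified script itself does not appear in this excerpt; a minimal sketch of the early-exit idea, using the ##PAYMNT/0000EOT markers from the discussion above (the exact record layout and the trannum value are assumptions, not the original poster's format), might look like:

```shell
# Small stand-in for the real multi-gigabyte file (format is assumed).
printf '%s\n' 'junk' '##PAYMNTABC123' 'amount 42' '0000EOT' \
              '##PAYMNTXYZ789' 'amount 7' '0000EOT' > bigfile

# Extract one transaction and quit as soon as its 0000EOT line is printed,
# so awk never reads the rest of the file.
awk -v t="ABC123" '
    $0 == ("##PAYMNT" t) { p = 1 }   # start of the wanted transaction
    p                    { print }   # print while inside it
    p && $0 == "0000EOT" { exit }    # stop instead of scanning to EOF
' bigfile > transaction.out
```

On average the wanted transaction sits halfway through the file, which is where the "about in half" saving comes from.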
But, the way to make big gains here would be to search for and extract multiple transactions in a single pass through your large file. If you could, for example, extract 10 transactions at a time, you would only have to read the large file once instead of 10 times and you would only have to invoke awk once instead of 10 times; both of which would be big wins for performance.
Note that extracting 10 transactions at a time does not mean that the extracted transactions would all be saved in a single file; each transaction could easily be extracted into a separate file. And 10 is just an example; an awk script could easily extract thousands of transactions into separate files in a single pass through your large transaction file, increasing your script's processing speed immensely if your script is being used to process thousands of transactions. Note also that this is why we want details about what you are doing instead of vague statements about a tiny piece of the script you are writing. The more we know, the better chance we have of making a suggestion that will significantly improve your script.
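A sketch of that one-pass, many-output-files approach (the id-list file, output names, and record markers here are all assumptions, not the original poster's actual format):

```shell
# Demo inputs; the real transaction file and id list would be much larger.
printf '%s\n' 'A1' 'C3' > wanted.txt
printf '%s\n' '##PAYMNTA1' 'a-data' '0000EOT' \
              '##PAYMNTB2' 'b-data' '0000EOT' \
              '##PAYMNTC3' 'c-data' '0000EOT' > bigfile

# One pass through bigfile, writing each wanted transaction to its own file.
awk '
    NR == FNR  { want["##PAYMNT" $1]; next }             # load the id list
    $0 in want { out = "tran_" substr($0, 9) ".out"; p = 1 }
    p          { print > out }
    p && $0 == "0000EOT" { close(out); p = 0 }
' wanted.txt bigfile
```

The close() matters when thousands of transactions are extracted, since awk implementations limit the number of simultaneously open files.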
1 . Thanks to everyone who read the post first.
2 . I have a log file whose size is 143 MB; I cannot open it with vi, and I cannot open it with xedit either.
How can I view it?
If I want to view lines 200-300, how can I implement it?
3 . Thanks (3 Replies)
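For the record, a line range can be printed from a large file without loading it into an editor; one standard approach (file name here is a stand-in) is sed with an early quit:

```shell
# Build a 400-line stand-in for the big log.
awk 'BEGIN { for (i = 1; i <= 400; i++) print "line " i }' > big.log

# Print only lines 200-300; the "300q" makes sed quit at line 300
# instead of reading the rest of the file.
sed -n '200,300p;300q' big.log > window.out
```

For interactive browsing, a pager such as `more big.log` avoids vi's file-size limits entirely.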
I have a command which prints some number of lines after and before the search string in the huge file:
nawk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r;print;c=a}b{r=$0}' b=0 a=10 s="STRING1" FILE
The file is 5 GB.
It works great and prints 10 lines after the lines which contain the search string in... (8 Replies)
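Where GNU grep is available (it is not part of stock AIX or Solaris), the same context printing can be done more simply; this is an alternative to the nawk one-liner above, not the poster's command:

```shell
printf '%s\n' one two STRING1 three four five > demo.txt

# -B N prints N lines of context before each match, -A N prints N after.
grep -B 2 -A 2 'STRING1' demo.txt > ctx.out
```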
Hi,
I have a big (2.7 GB) text file. Each line has a '|' separator to separate the columns.
I want to delete those lines which have text like '|0|0|0|0|0'
I tried:
sed '/|0|0|0|0|0/d' test.txt
Unfortunately, it scans the file but does nothing.
file content sample:... (4 Replies)
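sed writes its result to standard output, so "scans the file but does nothing" is the expected behavior when the output is not captured. The standard idiom is to redirect to a new file and move it into place:

```shell
printf '%s\n' 'a|1|2|3|4|5' 'b|0|0|0|0|0' 'c|9|8' > test.txt

# '|' is not special in a basic regular expression, so no escaping needed.
# Redirect to a new file, then replace the original.
sed '/|0|0|0|0|0/d' test.txt > test.txt.new && mv test.txt.new test.txt
```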
hi,
i have two files.
file1.sh
echo "unix"
echo "linux"
file2.sh
echo "unix linux forums"
now the output i need is
$ ./file2.sh
unix linux forums (3 Replies)
Hi,
I need a unix command to delete the first n (say 100) lines from a log file. I need to delete those lines without using any temporary file. I found sed -i is a useful command for this, but it's not supported in my environment (AIX 6.1). The file size is approx 100 MB.
Thanks in... (18 Replies)
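Where sed -i is unavailable, as on AIX 6.1, the standard ed editor can modify the file in place with no user-visible temporary file (file name below is a stand-in):

```shell
# Build a 150-line stand-in for the 100 MB log.
awk 'BEGIN { for (i = 1; i <= 150; i++) print "line " i }' > app.log

# Delete lines 1-100, write the file back, and quit.
# -s suppresses ed's byte-count output.
printf '1,100d\nw\nq\n' | ed -s app.log
```

Note that ed reads the whole file into memory, which is fine for a 100 MB log but worth knowing for much larger files.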
Hi all
I have a big file which I have attached here.
And, I have to fetch certain entries and arrange in 5 columns
Name    Drug    DAP ID    disease    approved or not
In the attached file, data is arranged with tab-separated columns in this way:
and other data is... (2 Replies)
The dataset I'm working on is about 450 GB, with about 7000 columns and 30,000,000 rows.
I want to extract about 2000 columns from the original file to form a new file.
I have the list of the column numbers I need, but don't know how to extract them.
Thanks! (14 Replies)
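One way to pull an arbitrary list of columns in a single pass is to load the column numbers into awk first; the file names and the whitespace field separator below are assumptions:

```shell
# cols.txt: one wanted column number per line (a tiny stand-in for the
# real 2000-entry list); data.txt: whitespace-separated columns.
printf '%s\n' 2 4 > cols.txt
printf '%s\n' 'a b c d e' 'f g h i j' > data.txt

# First file loads the wanted column numbers; second file is the data.
awk '
    NR == FNR { col[++n] = $1; next }    # remember the column numbers
    {
        for (i = 1; i <= n; i++)
            printf "%s%s", $(col[i]), (i < n ? OFS : ORS)
    }
' cols.txt data.txt > subset.out
```

Unlike cut, this also lets the output columns appear in the order they are listed rather than in file order.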
Dear all,
I have been stuck on this problem for some days.
I have a very big file; this file cannot be opened with vi.
There are 200 loops in this file, and each loop has one line like this:
GWA quasiparticle energy with Z factor (eV)
And I need the 98 lines after this line.
Is... (6 Replies)
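A short awk idiom prints each matching line plus the N lines that follow it (N = 98 here, per the post; the stand-in data below is invented). Note the "(eV)" is left out of the pattern, since parentheses are special in a regular expression:

```shell
# Tiny stand-in: two "loops", each a header line followed by data lines.
awk 'BEGIN {
    for (l = 1; l <= 2; l++) {
        print "GWA quasiparticle energy with Z factor (eV)"
        for (i = 1; i <= 120; i++) print "data " l " " i
    }
}' > calc.out

# On a match, arm a countdown of 99: the header plus the next 98 lines.
awk '/GWA quasiparticle energy with Z factor/ { n = 99 } n && n--' calc.out > bands.out
```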
Hi all,
I have a file like this. I want to extract only those regions which are big and continuous:
chr1 3280000 3440000
chr1 3440000 3920000
chr1 3600000 3920000 # region coming within the 3440000 3920000. so i don't want it to be printed in output
chr1 3920000 4800000
chr1 ... (2 Replies)
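If "coming within" means an interval that lies entirely inside the previously kept interval, a sketch along these lines would drop it (this assumes the file is sorted by chromosome and start position, which the sample above is):

```shell
printf '%s\n' 'chr1 3280000 3440000' \
              'chr1 3440000 3920000' \
              'chr1 3600000 3920000' \
              'chr1 3920000 4800000' > regions.txt

# Skip any region fully contained in the last region printed for the
# same chromosome; otherwise print it and remember its bounds.
awk '
    $1 == chr && $2 >= lo && $3 <= hi { next }   # contained: drop it
    { chr = $1; lo = $2; hi = $3; print }
' regions.txt > kept.out
```

Merging overlapping-but-not-contained regions into one big span would need a different script; which behavior is wanted depends on the poster's definition of "continuous".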