10-04-2018
Worst case, you're adding a lot more overhead to files that need processing anyway. There's far more to be gained by quitting early than by doing extra work.
12 hours is surprisingly slow, though. awk is definitely slower than gzip, but it can process 50 megs per second even on one of my more ancient systems. Assuming an 8:1 compression ratio for text, you're getting closer to 20 megs per second. Given that, I'm suspicious that you really are hitting disk bandwidth limits.
Are you using an SSD or a spinning disk? A spinning disk gets hit particularly hard when it has to read and write simultaneously: its bandwidth will be more than halved. And this is a worst-case situation, where your data is so large that cache is no help at all. Also, is your disk physically attached, or is it a NAS, an NFS share, a USB disk, or some other such thing? The protocol overhead of these can be horrendous in practice.
If you're not hitting disk bandwidth limits, though, multiprocessing should be a big gain.
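For what it's worth, here is a minimal sketch of what multiprocessing could look like, assuming the input is a directory of independent .gz files and that each one can be filtered on its own. The function name parallel_filter, the directory arguments, and the awk pattern /ERROR/ are all hypothetical placeholders, not your actual program. It relies on find -print0 and xargs -0 -P, which are not POSIX but are available in GNU and BSD userlands.

```shell
# parallel_filter IN_DIR OUT_DIR JOBS
# Hypothetical helper: decompress each IN_DIR/*.gz, filter it with awk,
# and write the recompressed result to OUT_DIR under the same file name.
# Up to JOBS files are processed at once. Assumes at least one .gz exists.
parallel_filter() {
    in_dir=$1
    out_dir=$2
    jobs=$3
    mkdir -p "$out_dir"
    # One file name per sh invocation (-n 1), JOBS invocations in parallel (-P).
    # Inside the inner script, $1 is the output directory and $2 is the input file.
    find "$in_dir" -type f -name '*.gz' -print0 |
        xargs -0 -n 1 -P "$jobs" sh -c '
            out="$1/$(basename "$2")"
            # /ERROR/ is a placeholder filter; substitute the real awk program.
            gzip -dc "$2" | awk "/ERROR/" | gzip -c > "$out"
        ' sh "$out_dir"
}
```

Called as, say, parallel_filter ./in ./out 4, this keeps four gzip/awk pipelines running at a time. If the disk really is the bottleneck, raising the job count won't help and may make a spinning disk seek even more, so it's worth timing it with 1, 2, and 4 jobs before committing to anything.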
Last edited by Corona688; 10-04-2018 at 04:37 PM.