11-12-2012
Yes, that was what I meant. And yes, it is very likely that grep, egrep, and sed are better at massive I/O than awk, which is interpreted. The point, I think, is that tests like this are a lot of fun, but they may not be informative unless you understand why the results can be skewed.
On my large M4000 Solaris boxes, sed always outperforms awk on simple stream editing of massive files. On Cygwin they come out really close.
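A rough way to compare the two on the same input might look like the sketch below. It assumes the edit is "delete every line starting with HEADER", which is only a placeholder pattern. The function names `time_cmd`, `compare_tools`, and `same_output` are made up for illustration, and `date +%s` gives whole seconds only and is not available in every `date` implementation.

```shell
# Run a command with its output discarded and report elapsed whole seconds.
time_cmd() {
    label=$1; shift
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo "$label: $((end - start))s"
}

# Time sed and awk doing the same edit on the same file.
compare_tools() {
    input=$1
    # Read the file once first so neither tool pays for a cold cache.
    cat "$input" > /dev/null
    time_cmd sed sed '/^HEADER/d' "$input"
    time_cmd awk awk '!/^HEADER/' "$input"
}

# Succeeds only if both tools produce byte-identical output.
same_output() {
    [ "$(sed '/^HEADER/d' "$1" | cksum)" = "$(awk '!/^HEADER/' "$1" | cksum)" ]
}
```

Checking `same_output` first matters: a timing difference means little if the two commands are not doing the same work. Run `compare_tools` several times on a file large enough to take more than a few seconds, since a single run can be skewed by caching or other load on the box.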
However, by the time I've set up a fair test and run several candidates through, I could have coded and already processed 24 files in parallel, using any reasonable method.
Which is a lot less fun, I admit.
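The parallel route can be sketched in a few lines of shell. This is only an illustration: `strip_parallel` is a hypothetical name, the `HEADER` pattern and the `.out` suffix are placeholders, and it starts one background job per file, which is reasonable for a couple of dozen files but not for thousands.

```shell
# Run the same sed edit on every file named on the command line,
# one background job per file, writing each result to FILE.out.
strip_parallel() {
    for f in "$@"; do
        sed '/^HEADER/d' "$f" > "$f.out" &
    done
    # Block until every background sed has finished.
    wait
}
```

Called as `strip_parallel data/*.txt`, it leaves the originals untouched, so the results can be checked before anything is renamed over the input.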
largefile(5) Standards, Environments, and Macros largefile(5)
NAME
largefile - large file status of utilities
DESCRIPTION
A large file is a regular file whose size is greater than or equal to 2 Gbyte (2**31 bytes). A small file is a regular file whose size is
less than 2 Gbyte.
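As a rough illustration of that threshold, the check below classifies a file by size. It is a sketch, not part of the manual page: the names `LARGE_FILE_LIMIT`, `is_large_size`, and `is_large_file` are hypothetical, and it assumes a shell whose integer arithmetic is wider than 32 bits, since 2**31 does not fit in a signed 32-bit value.

```shell
# 2**31 bytes, the boundary between a small file and a large file.
LARGE_FILE_LIMIT=2147483648

# Succeeds if the given byte count is at or above the boundary.
is_large_size() {
    [ "$1" -ge "$LARGE_FILE_LIMIT" ]
}

# Succeeds if the named regular file is a large file; returns 2 if the
# argument is not a regular file.
is_large_file() {
    [ -f "$1" ] || return 2
    # Arithmetic expansion strips any padding some wc versions emit.
    size=$(( $(wc -c < "$1") ))
    is_large_size "$size"
}
```

Note that `wc -c` may read the whole file on some systems, so this can be slow on exactly the files it is meant to detect.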
Large file aware utilities
A utility is called large file aware if it can process large files in the same manner as it does small files. A utility that is large file
aware is able to handle large files as input and generate as output large files that are being processed. The exception is where additional
files are used as system configuration files or support files that can augment the processing. For example, the file utility supports the
-m option for an alternative "magic" file and the -f option for a support file that can contain a list of file names. It is unspecified
whether a utility that is large file aware will accept configuration or support files that are large files. If a large file aware utility
does not accept configuration or support files that are large files, it will cause no data loss or corruption upon encountering such files
and will return an appropriate error.
The following /usr/bin utilities are large file aware:
adb awk bdiff cat chgrp
chmod chown cksum cmp compress
cp csh csplit cut dd
dircmp du egrep fgrep file
find ftp getconf grep gzip
head join jsh ksh ln
ls mdb mkdir mkfifo more
mv nawk page paste pathchck
pg rcp remsh rksh rm
rmdir rsh sed sh sort
split sum tail tar tee
test touch tr uncompress uudecode
uuencode wc zcat
The following /usr/xpg4/bin utilities are large file aware:
awk cp chgrp chown du
egrep fgrep file grep ln
ls more mv rm sed
sh sort tail tr
The following /usr/xpg6/bin utilities are large file aware:
getconf ls tr
The following /usr/sbin utilities are large file aware:
install mkfile mknod mvdir swap
See the USAGE section of the swap(1M) manual page for limitations of swap on block devices greater than 2 Gbyte on a 32-bit operating
system.
The following /usr/ucb utilities are large file aware:
chown from ln ls sed
sum touch
The /usr/bin/cpio and /usr/bin/pax utilities are large file aware, but cannot archive a file whose size exceeds 8 Gbyte - 1 byte.
The /usr/bin/truss utility has been modified to read a dump file and display information relevant to large files, such as offsets.
cachefs file systems
The following /usr/bin utilities are large file aware for cachefs file systems:
cachefspack cachefsstat
The following /usr/sbin utilities are large file aware for cachefs file systems:
cachefslog cachefswssize cfsadmin fsck
mount umount
nfs file systems
The following utilities are large file aware for nfs file systems:
/usr/lib/autofs/automountd /usr/sbin/mount
/usr/lib/nfs/rquotad
ufs file systems
The following /usr/bin utility is large file aware for ufs file systems:
df
The following /usr/lib/nfs utility is large file aware for ufs file systems:
rquotad
The following /usr/xpg4/bin utility is large file aware for ufs file systems:
df
The following /usr/sbin utilities are large file aware for ufs file systems:
clri dcopy edquota ff fsck
fsdb fsirand fstyp labelit lockfs
mkfs mount ncheck newfs quot
quota quotacheck quotaoff quotaon repquota
tunefs ufsdump ufsrestore umount
Large file safe utilities
A utility is called large file safe if it causes no data loss or corruption when it encounters a large file. A utility that is large file
safe is unable to properly process a large file, but returns an appropriate error.
The following /usr/bin utilities are large file safe:
audioconvert audioplay audiorecord comm diff
diff3 diffmk ed lp mail
mailcompat mailstats mailx pack pcat
red rmail sdiff unpack vi
view
The following /usr/xpg4/bin utilities are large file safe:
ed vi view
The following /usr/xpg6/bin utility is large file safe:
ed
The following /usr/sbin utilities are large file safe:
lpfilter lpforms
The following /usr/ucb utilities are large file safe:
Mail lpr
The following /usr/lib utility is large file safe:
sendmail
SEE ALSO
lf64(5), lfcompile(5), lfcompile64(5)
SunOS 5.10 7 Nov 2003 largefile(5)