Sponsored Content
Top Forums Shell Programming and Scripting Faster command to remove headers for files in a directory Post 302730417 by jim mcnamara on Monday 12th of November 2012 08:42:37 PM
Old 11-12-2012
Yes, that was what I meant. And yes it is very likely the grep, egrep, and sed are better at massive I/O than awk, which is running interpreted. The point, I think, is that a lot of tests like this are a lot of fun, but they may not be informative. Unless you understand why results can be set askew.

On my large m4000 Solaris boxes sed always outperforms awk on simple stream editing of massive files. On cygwin they come out really close.

However, by the time I've set up a fair test and run several candidates through, I could have coded and already processed 24 files in parallel, using any reasonable method.

Which is a lot less fun, I admit.
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

help:how to remove headers in output file

Hi I am running a script (which compares two directory contents) for which I am getting an output of 70 pages in which few pages are blank so I was able to delete those blank lines. But I also want to delete the headers present for each page. can any one help me by providing the code... (1 Reply)
Discussion started by: raj_thota
1 Replies

2. Shell Programming and Scripting

Which one is faster to remove control m characters?

I have a file with millions of records...Before I experiment, I would like to know which one is faster. Both the commands work absolutely fine on a smaller set of records. Please advice. sed 's/^M//g' ${INPUT_FILE} > tmp.txt mv tmp.txt ${INPUT_FILE} tr -d "\15" < ${INPUT_FILE} > ... (11 Replies)
Discussion started by: madhunk
11 Replies

3. Shell Programming and Scripting

Remove Headers throughout a data file

I have a data file with over 500,000 records/lines that has the header throughout the file. SEQ_ID Name Start_Date Ins_date Add1 Add2 1 Harris 04/02/08 03/02/08 333 Main Suite 101 2 Smith 02/03/08 01/23/08 287 Jenkins SEQ_ID Name ... (3 Replies)
Discussion started by: psmall
3 Replies

4. UNIX for Dummies Questions & Answers

Remove certain headers using mailx or sendmail

Hello, So i want to send mails in any way from a solaris 5.8 system, perhaps using mailx or sendmail. My purpose is to stay clear of systems name in head data. So i want to strip at least the "Message-Id" and the "Recieved" headers of the mail. Yet this seems to be a bit of a problem. Now i... (2 Replies)
Discussion started by: congo
2 Replies

5. Shell Programming and Scripting

Remove text between headers while leaving headers intact

Hi, I'm trying to strip all lines between two headers in a file: ### BEGIN ### Text to remove, contains all kinds of characters ... Antispyware-Downloadserver.com (Germany)=http://www.antispyware-downloadserver.c om/updates/ Antispyware-Downloadserver.com #2... (3 Replies)
Discussion started by: Trones
3 Replies

6. Shell Programming and Scripting

Merging of files with different headers to make combined headers file

Hi , I have a typical situation. I have 4 files and with different headers (number of headers is varible ). I need to make such a merged file which will have headers combined from all files (comman coluns should appear once only). For example - File 1 H1|H2|H3|H4 11|12|13|14 21|22|23|23... (1 Reply)
Discussion started by: marut_ashu
1 Replies

7. Shell Programming and Scripting

Running rename command on large files and make it faster

Hi All, I have some 80,000 files in a directory which I need to rename. Below is the command which I am currently running and it seems, it is taking fore ever to run this command. This command seems too slow. Is there any way to speed up the command. I have have GNU Parallel installed on my... (6 Replies)
Discussion started by: shoaibjameel123
6 Replies

8. UNIX for Dummies Questions & Answers

Using sed command to remove multiple instances of repeating headers in one file?

Hi, I have catenated multiple output files (from a monte carlo run) into one big output file. Each individual file has it's own two line header. So when I catenate, there are multiple two line headers (of the same wording) within the big file. How do I use the sed command to search for the... (1 Reply)
Discussion started by: rebazon
1 Replies

9. Shell Programming and Scripting

Remove headers thar dont match

Good evening I need your help please, im new at Unix and i wanted to remove the first 5 headers for 100000 records files and then create a control file .ctl that contains the number of records and all seem to work out but when i tested at production it didnt wotk. Here is the code: #!... (6 Replies)
Discussion started by: alexcol
6 Replies

10. Shell Programming and Scripting

Remove white space and duplicate headers

I have a file called "dsout" with empty rows and duplicate headers. DATE TIME TOTAL_GB USED_GB %USED --------- -------- ---------- ---------- ---------- 03/05/013 12:34 PM 3151.24316 2331.56653 73.988785 ... (3 Replies)
Discussion started by: Daniel Gate
3 Replies
largefile(5)                                            Standards, Environments, and Macros                                           largefile(5)

NAME
largefile - large file status of utilities DESCRIPTION
A large file is a regular file whose size is greater than or equal to 2 Gbyte ( 2**31 bytes). A small file is a regular file whose size is less than 2 Gbyte. Large file aware utilities A utility is called large file aware if it can process large files in the same manner as it does small files. A utility that is large file aware is able to handle large files as input and generate as output large files that are being processed. The exception is where additional files are used as system configuration files or support files that can augment the processing. For example, the file utility supports the -m option for an alternative "magic" file and the -f option for a support file that can contain a list of file names. It is unspecified whether a utility that is large file aware will accept configuration or support files that are large files. If a large file aware utility does not accept configuration or support files that are large files, it will cause no data loss or corruption upon encountering such files and will return an appropriate error. The following /usr/bin utilities are large file aware: adb awk bdiff cat chgrp chmod chown cksum cmp compress cp csh csplit cut dd dircmp du egrep fgrep file find ftp getconf grep gzip head join jsh ksh ln ls mdb mkdir mkfifo more mv nawk page paste pathchck pg rcp remsh rksh rm rmdir rsh sed sh sort split sum tail tar tee test touch tr uncompress uudecode uuencode wc zcat The following /usr/xpg4/bin utilities are large file aware: awk cp chgrp chown du egrep fgrep file grep ln ls more mv rm sed sh sort tail tr The following /usr/xpg6/bin utilities are large file aware: getconf ls tr The following /usr/sbin utilities are large file aware: install mkfile mknod mvdir swap See the USAGE section of the swap(1M) manual page for limitations of swap on block devices greater than 2 Gbyte on a 32-bit operating sys- tem. The following /usr/ucb utilities are large file aware: chown from ln ls sed sum touch The /usr/bin/cpio and /usr/bin/pax utilities are large file aware, but cannot archive a file whose size exceeds 8 Gbyte - 1 byte. The /usr/bin/truss utilities has been modified to read a dump file and display information relevant to large files, such as offsets. cachefs file systems The following /usr/bin utilities are large file aware for cachefs file systems: cachefspack cachefsstat The following /usr/sbin utilities are large file aware for cachefs file systems: cachefslog cachefswssize cfsadmin fsck mount umount nfs file systems The following utilities are large file aware for nfs file systems: /usr/lib/autofs/automountd /usr/sbin/mount /usr/lib/nfs/rquotad ufs file systems The following /usr/bin utility is large file aware for ufs file systems: df The following /usr/lib/nfs utility is large file aware for ufs file systems: rquotad The following /usr/xpg4/bin utility is large file aware for ufs file systems: df The following /usr/sbin utilities are large file aware for ufs file systems: clri dcopy edquota ff fsck fsdb fsirand fstyp labelit lockfs mkfs mount ncheck newfs quot quota quotacheck quotaoff quotaon repquota tunefs ufsdump ufsrestore umount Large file safe utilities A utility is called large file safe if it causes no data loss or corruption when it encounters a large file. A utility that is large file safe is unable to process properly a large file, but returns an appropriate error. The following /usr/bin utilities are large file safe: audioconvert audioplay audiorecord comm diff diff3 diffmk ed lp mail mailcompat mailstats mailx pack pcat red rmail sdiff unpack vi view The following /usr/xpg4/bin utilities are large file safe: ed vi view The following /usr/xpg6/bin utility is large file safe: ed The following /usr/sbin utilities are large file safe: lpfilter lpforms The following /usr/ucb utilities are large file safe: Mail lpr The following /usr/lib utility is large file safe: sendmail SEE ALSO
lf64(5), lfcompile(5), lfcompile64(5) SunOS 5.10 7 Nov 2003 largefile(5)
All times are GMT -4. The time now is 03:46 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy