Fast way to split a tab delimited file
Post 302073909 by dayanandra on Thursday 18th of May 2006 01:44:43 PM
Quote:
Originally Posted by tmarikle
awk seems to be faster on my system.

Test with 500,000 records:

Code:
#! /usr/bin/ksh

print "Single task awk"
time {
    > M.txt
    > D.txt
    nawk '{
        if ($0 ~ /^M/) print $0 >"M.txt"
        else print $0 >"D.txt"
    }' test.dat
}
ls -altr M.txt D.txt

print "Two task awk"
time {
    > M.txt
    > D.txt
    nawk '/^M/' test.dat >> M.txt &
    nawk '/^D/' test.dat >> D.txt &
    wait
}
ls -altr M.txt D.txt

print "4-way awk"
time {
    > M.txt
    > D.txt
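    # Two writers append to each output file at the same time, so the
    # original line order is not guaranteed and lines may interleave.
    nawk 'NR <  250000 && /^M/' test.dat >> M.txt &
    nawk 'NR >= 250000 && /^M/' test.dat >> M.txt &
    nawk 'NR <  250000 && /^D/' test.dat >> D.txt &
    nawk 'NR >= 250000 && /^D/' test.dat >> D.txt &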
    wait
}
ls -altr M.txt D.txt

print "Grep"
time {
    > M.txt
    > D.txt
    grep "^M" test.dat > M.txt &
    grep "^D" test.dat > D.txt &
    wait
}
ls -altr M.txt D.txt

results:
Code:
Single task awk

real    3m12.40s
user    0m4.69s
sys     0m9.63s
-rw-r--r--   1 ... 34770850 ... D.txt
-rw-r--r--   1 ... 46222065 ... M.txt

Two task awk

real    0m14.12s
user    0m5.93s
sys     0m1.55s
-rw-r--r--   1 ... 34770850 ... D.txt
-rw-r--r--   1 ... 46222065 ... M.txt

4-way awk

real    0m16.14s
user    0m10.52s
sys     0m2.48s
-rw-r--r--   1 ... 34770850 ... D.txt
-rw-r--r--   1 ... 46222065 ... M.txt

Grep

real    0m22.70s
user    0m1.50s
sys     0m3.24s
-rw-r--r--   1 ... 34770850 ... D.txt
-rw-r--r--   1 ... 46222065 ... M.txt



How do you find the time taken to execute the script?
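
For what it's worth, the quoted script already shows one way: each test is wrapped in the shell's `time` keyword, which prints the real/user/sys figures seen in the results. Below is a minimal sketch of that idea, assuming ksh or bash. The file `sample.dat` and the outputs `M2.txt` and `elapsed` are made up for illustration and are not from the original post.

```shell
#!/bin/bash
# Sketch only: sample.dat is a hypothetical stand-in for test.dat.
printf 'M\tone\nD\ttwo\nM\tthree\n' > sample.dat

# 1. The `time` keyword (ksh, bash) times a single command or a
#    whole { ...; } group. real = wall-clock time, user/sys = CPU time
#    spent in user and kernel mode. The report goes to stderr.
time {
    awk '/^M/' sample.dat > M.txt
    awk '/^D/' sample.dat > D.txt
}

# 2. SECONDS (ksh, bash) counts seconds since the shell started,
#    which is handy for logging elapsed time from inside a script.
start=$SECONDS
awk '/^M/' sample.dat > M2.txt
elapsed=$(( SECONDS - start ))
printf 'elapsed: %s s\n' "$elapsed"
```

To time an entire script from the outside, prefix the invocation with `time`, for example `time ./yourscript.ksh` (the script name here is hypothetical).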
 
