**HELP** need to split this line faster than cut-command


 
# 1  
Old 10-29-2009

Hi,

A datafile containing lines such as below needs to be split:

500000000000932491683600000000000000000000000000016800000GS0000000000932491683600*HOME

I need to extract characters 2-5, 11-20, and 35-40 from each line. I can do it with the cut command:

cut -c 2-5 file > temp1.txt
cut -c 11-20 file > temp2.txt
cut -c 35-40 file > temp3.txt
paste -d"," temp1.txt temp2.txt temp3.txt > result.txt


The problem is that with a huge amount of data this process is too slow. Can it be done faster with awk or sed? Hope you can teach me.

Thanks!!!
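
For context on why this is slow: the commands above read the input file three times with cut and then run paste over three temporary files. A minimal single-pass sketch, assuming GNU cut (the --output-delimiter option is a GNU extension, and the function name extract_cols is only illustrative, not from the thread):

```shell
# Sketch: extract columns 2-5, 11-20 and 35-40 in one pass over the file.
# Assumes GNU cut; --output-delimiter is not part of POSIX cut.
# extract_cols is a hypothetical helper name.
extract_cols() {
  cut -c 2-5,11-20,35-40 --output-delimiter=, "$1"
}

# Usage: extract_cols file > result.txt
```

With -c, GNU cut writes the output delimiter between the selected ranges, so no temporary files or paste step are needed.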
# 2  
Old 10-29-2009
Do you have GNU awk on your box?
# 3  
Old 10-29-2009
hi ,
try this...
Code:
cat temp | awk '{print substr($1,2,5) ,"," substr($1,11,20),"," substr($1,35,40)}'

# 4  
Old 10-29-2009
Quote:
Originally Posted by pravin27
hi ,
try this...
Code:
cat temp | awk '{print substr($1,2,5) ,"," substr($1,11,20),"," substr($1,35,40)}'

Code:
awk '{print substr($1,2,5) ,"," substr($1,11,20),"," substr($1,35,40)}' temp

# 5  
Old 10-29-2009
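The following code works in bash too (the ${var:offset:length} expansion needs ksh93 or bash) and avoids spawning any external commands.
Code:
#!/bin/ksh
# Offsets are zero-based: offset 1 is column 2, offset 10 is column 11, offset 34 is column 35.
# IFS= and -r keep leading blanks and backslashes in each line intact.
while IFS= read -r line; do
  printf '%s\n' "${line:1:4},${line:10:10},${line:34:6}"
done < file > result.txt

If you need more performance, an awk script will probably be faster on large files.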
# 6  
Old 10-29-2009
If you have GNU awk, it has a very efficient built-in for splitting fixed-length records: the FIELDWIDTHS variable. Try this (saved as split.awk):

Code:
BEGIN{
  FIELDWIDTHS="1 5 4 10 14 6 1"
}
{
  print $2, $4, $6
}

With this test file:
Code:
$ cat f
000000000111111111122222222223333333333444444444455555555556
123456789012345678901234567890123456789012345678901234567890
500000000000932491683600000000000000000000000000016800000GS0000000000932491683600*HOME

The awk code returns:
Code:
$ awk -f split.awk f
00000 1111111112 333334
23456 1234567890 567890
00000 0093249168 000000

If you don't have gawk, use substr() as in the posts above, but it will be less efficient than the FIELDWIDTHS built-in.
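
Note that with the widths in this post, $2 covers columns 2-6 (five characters) and the fields are printed space-separated, while the original question asks for columns 2-5 joined by commas. A hedged sketch of a variant matching the question, assuming gawk is installed (FIELDWIDTHS is a gawk extension; the function name split_fixed is made up for this example):

```shell
# Requires gawk: FIELDWIDTHS is not available in POSIX awk.
# split_fixed is a hypothetical helper name.
split_fixed() {
  gawk 'BEGIN {
          # Widths map to columns: 1 | 2-5 | 6-10 | 11-20 | 21-34 | 35-40
          FIELDWIDTHS = "1 4 5 10 14 6"
          OFS = ","
        }
        { print $2, $4, $6 }' "$1"
}

# Usage: split_fixed file > result.txt
```

Anything past column 40 is simply ignored, so no trailing width is needed.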
# 7  
Old 10-29-2009
The awk substr() calls above aren't working right: the third argument of substr() is a length, not an end position. IMO they should be like this:
Code:
awk '{print substr($1,2,4)","substr($1,11,10)","substr($1,35,6)}' file > result.txt
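
To see the difference, here is a small sketch run against a ruler line in which each character is the last digit of its own column number. It uses $0 rather than $1 so that lines containing spaces would also be handled; that change and the sample string are assumptions for illustration, not part of the posts above.

```shell
# substr(s, start, length): the third argument is a length, not an end column.
ruler='123456789012345678901234567890123456789012345678901234567890'

# Treating the third argument as an end column returns one character too many:
wrong=$(printf '%s\n' "$ruler" | awk '{ print substr($0, 2, 5) }')

# Columns 2-5 are 4 characters, 11-20 are 10, and 35-40 are 6:
right=$(printf '%s\n' "$ruler" |
  awk '{ print substr($0, 2, 4) "," substr($0, 11, 10) "," substr($0, 35, 6) }')

echo "$wrong"   # 23456
echo "$right"   # 2345,1234567890,567890
```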
