Consider this:
Quote:
$ head bigfile ; wc -l bigfile
12345678901234567890
23456789012345678901
12345678901234567890
23456789012345678901
12345678901234567890
23456789012345678901
12345678901234567890
23456789012345678901
12345678901234567890
23456789012345678901
20000 bigfile
|
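For anyone who wants to reproduce the timings, here is a minimal sketch for generating a comparable test file. It assumes the two lines shown by head simply alternate for all 20000 lines (only the first ten are visible above); the function name make_bigfile is made up for illustration.

```shell
# Print n lines (default 20000) alternating between the two sample values.
# Assumption: the whole file follows the pattern of the first ten lines.
make_bigfile() {
    awk -v n="${1:-20000}" 'BEGIN {
        for (i = 1; i <= n; i++)
            print (i % 2 ? "12345678901234567890" : "23456789012345678901")
    }'
}
# Usage: make_bigfile > bigfile
```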
@bakunin
sed definitely takes the cake (as long as the lines are not too long for it):
Quote:
[sdass@db012a:PNB] /shared/home/sdass/tmp > time sed 's/\(.\{7\}\).\{7\}/\10000000/' bigfile > newfile.sed
real 0m0.08s
user 0m0.03s
sys 0m0.00s
|
But it produces wrong lines like this, because it keeps the first 7 characters instead of the first 6 (nothing that cannot be fixed):
Quote:
12345670000000567890
23456780000000678901
|
Instead of this:
Quote:
12345600000004567890
23456700000005678901
|
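The fix is a one-character change: capture 6 characters instead of 7. A sketch, wrapped in a function so it is easy to try on a single line (the name zero_mid is just for illustration, not from the thread):

```shell
# Keep the first 6 characters, overwrite the next 7 with zeros,
# leave the rest of the line untouched.
zero_mid() {
    sed 's/\(.\{6\}\).\{7\}/\10000000/' "$@"
}
# Usage: zero_mid bigfile > newfile.sed
```

I have not re-timed this variant, but it does the same amount of work per line as the command above.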
However, speaking of speed, the ksh script you posted is as fast as a snail:
Quote:
$ cat baku
#!/usr/bin/ksh
typeset line=""
typeset start=""
typeset end=""
cat bigfile | while read line ; do
start="$(print - "$line" | cut -c1-14)"
end="$(print - "$line" | cut -c15-)"
start="${start%%???????}0000000"
print - "${start}${end}" >> newfile
done
$ time baku
real 5m6.62s
user 0m14.95s
sys 1m9.19s
|
And it produced incorrect output too.
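Much of that time probably goes into starting a print | cut pipeline twice for every line. A sketch of the same loop using only parameter expansion, so nothing is forked per line. It assumes every line has at least 13 characters, and the name zero_mid_loop is made up here. I have not timed it against the numbers above.

```shell
# Assumption: each input line is at least 13 characters long.
zero_mid_loop() {
    while IFS= read -r line; do
        rest=${line#??????}          # line without its first 6 characters
        start=${line%"$rest"}        # so this is the first 6 characters
        end=${line#?????????????}    # drop 13 characters: keeps character 14 onward
        printf '%s0000000%s\n' "$start" "$end"
    done
}
# Usage: zero_mid_loop < bigfile > newfile
```

Redirecting once after the loop, instead of appending with >> inside it, also avoids reopening the output file for every line.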
@ghostdog74
This wasn't very fast either:
Quote:
$ cat ghost
#!/usr/bin/bash
while read line; do echo ${line//${line:7:7}/0000000}; done < bigfile > newfile.bash
$ time ghost
real 0m1.31s
user 0m0.82s
sys 0m0.44s
|
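One thing to watch with this approach: ${line//pattern/0000000} replaces every occurrence of the pattern, not just the one at that position, and offset 7 in ${line:7:7} is the 8th character because bash counts from 0. A small sketch to show both effects (bash-specific; by_pattern is a made-up name):

```shell
# Same expression as in the script above, wrapped for single-line testing.
by_pattern() {
    local line=$1
    echo "${line//${line:7:7}/0000000}"
}

by_pattern 12345678901234567890   # prints 12345670000000567890 (7 kept, not 6)
by_pattern 12345671234567123456   # prints 00000000000000123456 (first 7 clobbered too)
```

Slicing by position, for example "${line:0:6}0000000${line:13}", avoids both problems.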
Mine wasn't too bad - not as fast as the sed version:
Quote:
$ time awk '{print substr($0,1,6)"0000000"substr($0,14)}' bigfile > newfile.awk
real 0m0.17s
user 0m0.11s
sys 0m0.00s
|
Nothing against anyone - just a fair comparison.
HTH