Improve script - slow process with big files


 
# 8 - 01-26-2017
Dear RudiC,

Thanks a lot for this great job.

I think I am missing something, because when I use the code I get the following output for sfile.
Code:
H265678901234567890123456789012345678901234567890123456789012345678901234567890
H26      1         2         3         4         5         6         7          
S      0.00      0.00  11                           0.0       0.0   0.0000000004
S      0.00      0.00  11                           0.0       0.0   0.0000000150
S      0.00      0.00  11                           0.0       0.0   0.0000000296
S      0.00      0.00  11                           0.0       0.0   0.0000000442
S      0.00      0.00  11                           0.0       0.0   0.0000000588
S      0.00      0.00  11                           0.0       0.0   0.0000000734
S      0.00      0.00  11                           0.0       0.0   0.0000000880
S      0.00      0.00  11                           0.0       0.0   0.0000001026
S      0.00      0.00  11                           0.0       0.0   0.0000001172
S      0.00      0.00  11                           0.0       0.0   0.0000001318

I don't get the data for column 2 and the other columns.

Could you please send me the output you got?

Thanks and regards

# 9 - 01-26-2017
This is what I get for SFILE:
Code:
H26 5678901234567890123456789012345678901234567890123456789012345678901234567890
H26      1         2         3         4         5         6         7          
S  67609.00  30835.00  11                      240038.1 2786615.9 373.82147483647
S  67609.00  30841.00  11                      240113.1 2786612.8 373.72147483647
S  67607.00  30841.00  11                      240111.7 2786588.4 373.92147483647
S  67605.00  30841.00  11                      240111.1 2786562.3 374.32147483647
S  67603.00  30841.00  11                      240116.1 2786537.1 374.42147483647
S  67609.00  30851.00  11                      240237.3 2786613.9 373.32147483647
S  67609.00  30491.00  11                      235736.9 2786612.1 368.72147483647
S  67607.00  30491.00  11                      235734.3 2786587.1 369.32147483647
S  67605.00  30491.00  11                      235737.1 2786561.2 368.72147483647
S  67603.00  30491.00  11                      235738.4 2786539.5 367.92147483647

Except for the last column, which is the difficult date/time info, it is identical to your sample output. Did you test with your sample file from post #1?
# 10 - 01-26-2017
Dear RudiC,

Yes, I used the same sample file, but I really don't understand where the issue is. I also converted the file to Unix line endings to try, but that did not work either.
# 11 - 01-26-2017
After some cogitating about GPS -> UTC date/time conversion, I could replicate the date/time column in your S file using GNU date 8.25 (although I still don't understand what you are after here). Both output files are now identical to the ones you attached in post #1. Try:
Code:
awk -F: '
# Pass 1: for each Observer_Report block, collect the fields listed in SRCH and
# print one GNU date command line, built from the FMT template below, which the
# following "sh" stage then executes.
BEGIN                   {FMT = "date +\"%d %d %d %d %11.1f %11.1f %11.1f 0%%d%%H%%M%%S %010d %010d\" -d@%s\n"
                         for (n = split ("Tape_Nb:File_Nb:Line_Name:Point_Number:Cog_Easting:Cog_Northing:Cog_Elevation:Tb_GPS_Time", IX); n>0; n--) SRCH[IX[n]]
                        }

$1 ~ /^Observer_Report/ {if (flag)      printf FMT,     OUT[IX[1]], OUT[IX[2]], OUT[IX[3]], OUT[IX[4]],
                                                        OUT[IX[5]], OUT[IX[6]], OUT[IX[7]], from, to, OUT[IX[8]] + 315961200 + 10783    # epoch = GPS + 6.1.1980 + 3h - 17 sec
                         delete OUT
                         from = NR
                         flag = 1
                        }

                         {gsub (/[ \t]/, _)     # strip spaces and TABs (a literal TAB here tends to get lost when copied; see post #13)
                         to = NR
                        }

$1 in SRCH              {OUT[$1] = $2
                        }
$1 ~ SRCH[IX[8]]        {OUT[$1] = substr($2,1,10)
                        }

END                     {printf FMT,    OUT[IX[1]], OUT[IX[2]], OUT[IX[3]], OUT[IX[4]],
                                        OUT[IX[5]], OUT[IX[6]], OUT[IX[7]], from, to, OUT[IX[8]] + 315961200 + 10783                    # epoch = GPS + 6.1.1980 + 3h - 17 sec
                        }
' /tmp/16.txt |

sh |

awk -F[:-\(] '
# Pass 2: the first input ("-") is the output of the date commands run by sh;
# it is stored in OR[]. The observer report is then read a second time to write
# the S records (one per report block) to SFILE and the X records (built from
# the Live_Seis lines) to XFILE.
BEGIN                   {HD1 = "H26 5678901234567890123456789012345678901234567890123456789012345678901234567890"
                         HD2 = "H26      1         2         3         4         5         6         7          "
                        }
NR == 1                 {print HD1 RS HD2 > XFILE
                         print HD1 RS HD2 > SFILE
                         }

FNR == NR               {OR[NR] = $0
                         MX = NR
                         next
                        }
FNR > NXTREP ||
FNR == 1                {n = split (OR[++OCNT], T, " ")
                         NXTREP = T[n] + 0
                         printf "S%10.2f%10.2f%3d1                     %9.1f%10.1f%6.1f%09d\n", T[3], T[4], 1, T[5], T[6], T[7], T[8] > SFILE
                        }

                        {sub (/^[       ]*/, _)
                         sub (/ *: */, ":")
                        }


$1 ~ /^Live_Seis/       {DATA = 1
                         sub (/Live_Seis[^:]*:/, _)
                        }
/[^0-9:() -]/           {DATA = 0
                        }
DATA                    {printf "X%6d%8d11%10.2f%10.2f%1d%5d%5d1%10.2f%10.2f%10.2f1\n", T[1], T[2], T[3], T[4], 1, $4, $5, $1, $2, $3 > XFILE 
                        }
' XFILE="xfile" SFILE="sfile" - /tmp/16.txt

diff xfile /tmp/16.xx01    # no diff = identical! 
diff sfile /tmp/16.ss01    # no diff = identical!
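
To decompose the magic numbers in the FMT line, a quick sketch with GNU date (the GPS value G below is just a hypothetical example): 315961200 appears to be 6 Jan 1980 00:00 evaluated in a UTC+1 time zone, i.e. one hour less than the GPS epoch in UTC (315964800), and 10783 is the "+ 3h - 17 sec" from the comment in the script.
Code:
# Sketch only, not part of the pipeline above: checking the offset constants.
date -u -d '1980-01-06 00:00:00 UTC' +%s       # 315964800 = GPS epoch in UTC (GNU date)
echo $(( 315964800 - 3600 ))                   # 315961200, the value used in FMT
echo $(( 3 * 3600 - 17 ))                      # 10783 = "3h - 17 sec"
G=1167264000                                   # hypothetical GPS second count
date -d @"$(( G + 315961200 + 10783 ))" +0%d%H%M%S   # same format string FMT hands to sh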


# 12 - 01-27-2017
Dear RudiC,

Thanks a lot for your help; it works perfectly now.

I have modified the code a little to get the correct value for the index point,
Code:
OUT[IX[8]]

and I had to normalize the whitespace (the sed below) to make the code work correctly.

Here is the last modification:

Code:
            read -p " " jd                              # read the base name of the input file (e.g. 16 for 16.txt)

sed -i -e "s/[[:space:]]\+/ /g" "$jd.txt"               # squeeze runs of whitespace (incl. TABs) to single blanks

awk -F: '
# Pass 1 as in post #11, now also collecting Point_Index; the GPS time stamp
# therefore moves from IX[8] to IX[9].
BEGIN                   {FMT = "date +\"%d %d %d %d %11.1f %11.1f %11.1f %d 0%%d%%H%%M%%S %010d %010d\" -d@%s\n"
                         for (n = split ("Tape_Nb:File_Nb:Line_Name:Point_Number:Cog_Easting:Cog_Northing:Cog_Elevation:Point_Index:Tb_GPS_Time", IX); n>0; n--) SRCH[IX[n]]
                        }

$1 ~ /^Observer_Report/ {if (flag)      printf FMT,     OUT[IX[1]], OUT[IX[2]], OUT[IX[3]], OUT[IX[4]],
                                                        OUT[IX[5]], OUT[IX[6]], OUT[IX[7]], OUT[IX[8]], from, to, OUT[IX[9]] + 315961200 + 10783    # epoch = GPS + 6.1.1980 + 3h - 17 sec
                         delete OUT
                         from = NR
                         flag = 1
                        }

                         {gsub (/[ \t]/, _)     # strip spaces and TABs (written with \t so the TAB cannot get lost)
                         to = NR
                        }

$1 in SRCH              {OUT[$1] = $2
                        }
$1 ~ SRCH[IX[9]]        {OUT[$1] = substr($2,1,10)
                        }

END                     {printf FMT,    OUT[IX[1]], OUT[IX[2]], OUT[IX[3]], OUT[IX[4]],
                                        OUT[IX[5]], OUT[IX[6]], OUT[IX[7]], OUT[IX[8]], from, to, OUT[IX[9]] + 315961200 + 10783                    # epoch = GPS + 6.1.1980 + 3h - 17 sec
                        }
' "$jd.txt" |

sh |

awk -F[:-\(] '
BEGIN                   {HD1 = "H26 5678901234567890123456789012345678901234567890123456789012345678901234567890"
                         HD2 = "H26      1         2         3         4         5         6         7          "
                        }
NR == 1                 {print HD1 RS HD2 > XFILE
                         print HD1 RS HD2 > SFILE
                         }

FNR == NR               {OR[NR] = $0
                         MX = NR
                         next
                        }
FNR > NXTREP ||
FNR == 1                {n = split (OR[++OCNT], T, " ")
                         NXTREP = T[n] + 0
                         printf "S%10.2f%10.2f%3d1                     %9.1f%10.1f%6.1f%09d\n", T[3], T[4], T[8], T[5], T[6], T[7], T[9] > SFILE
                        }

                        {sub (/^[       ]*/, _)
                         sub (/ *: */, ":")
                        }


$1 ~ /^Live_Seis/       {DATA = 1
                         sub (/Live_Seis[^:]*:/, _)
                        }
/[^0-9:() -]/           {DATA = 0
                        }
DATA                    {printf "X%6d%8d11%10.2f%10.2f%1d%5d%5d1%10.2f%10.2f%10.2f1\n", T[1], T[2], T[3], T[4], T[8], $4, $5, $1, $2, $3 > XFILE 
                        }
' XFILE="$jd.x" SFILE="$jd.s" - "$jd.txt"
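
For completeness, a usage sketch (the script file name is hypothetical; 16.txt is the sample file from this thread):
Code:
# Hypothetical run: save the fragment above as, say, obs2sps.sh, put the
# observer report 16.txt in the current directory, then:
bash obs2sps.sh        # enter the base name, here: 16, at the (blank) prompt
ls 16.x 16.s           # the X and S files written via XFILE / SFILE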

I appreciate your help.
# 13 - 01-27-2017
The sed is not necessary: the {gsub (/[ <TAB>]/, _)} bracket expression contained a space and a <TAB> and should remove all of those. Mayhap it got lost in transfer.
Why do you use the GPS date/time stamp and its (OK, not too) complicated transformation to UTC, if the clear-text date/time is available in the "Date" record?
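
In other words, the whitespace stripping can be written so that the TAB cannot silently disappear when the code is copied, for example (a sketch; assumes gawk or mawk, which accept \t inside a bracket expression, and a placeholder file name):
Code:
# Remove every blank and TAB from each line; \t keeps the TAB visible in the source.
awk '{ gsub(/[ \t]/, ""); print }' observer_report.txt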
# 14 - 01-27-2017
Dear RudiC,
I will check why I have problems with the TAB characters.
I use the GPS-time-to-UTC conversion only to be more precise. You are right that the date/time is already in the file, but that precision is the only reason I use the GPS time.
Thanks a lot for your help.