Is there a way to make this more efficient


 
# 8  
Old 08-12-2008
The reply I already posted in a related thread (https://www.unix.com/shell-programmin...e-sed-cut.html) can easily be extended to cover this case as well. Whether you use a single sed script or a single awk script is a matter of taste; I picked awk because it's less arcane.

Code:
awk '{ f=substr($0, 1, 6000); gsub (/[^a-zA-Z0-9+_:-]/, "", f);
    if (f ~ ... hit on ex:Msg ...) ... extract and print ex:Msg data ...;
    if (f ~ ... hit on PutDate ...) ...extract and print PutDate ...;
   ... etc' temp01

With better information about what the input is supposed to look like, the dotted parts could be made less speculative.
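For illustration only, here is one way the dotted parts might be filled in. It assumes (hypothetically, since no sample input has been posted) that once the punctuation is stripped, each value sits between the marker names used in the Perl and sed versions in this thread: ex:Msg ... ex:Msg, PutDate ... PutTime, PutTime ... ApplOrigin, and CreationTimeStamp ... CreationTimeStamp. The extract_fields wrapper and the between() helper are names made up for this sketch.

```shell
# Sketch only: the marker names and their order are assumptions about the input.
extract_fields() {
  awk '
    # Hypothetical helper: text between the first "from" and the next "to" after it.
    # Returns an empty string when either marker is missing.
    function between(s, from, to,    i, rest, j) {
      i = index(s, from)
      if (i == 0) return ""
      rest = substr(s, i + length(from))
      j = index(rest, to)
      if (j == 0) return ""
      return substr(rest, 1, j - 1)
    }
    {
      # Look only at the first 6000 characters and drop the punctuation
      f = substr($0, 1, 6000)
      gsub(/[^a-zA-Z0-9+_:-]/, "", f)
      if ((v = between(f, "ex:Msg", "ex:Msg")) != "") print "MsgId = " v
      if ((v = between(f, "PutDate", "PutTime")) != "") print "Put Date = " v
      if ((v = between(f, "PutTime", "ApplOrigin")) != "") print "Put Time = " v
      if ((v = between(f, "CreationTimeStamp", "CreationTimeStamp")) != "") print "Timestamp = " v
    }' "$@"
}
```

It would be called as extract_fields temp01, or with the data piped in on standard input. Note that a field whose value is empty is silently skipped, which may or may not be what you want.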
# 9  
Old 08-12-2008
... Actually with Perl it gets rather simple.

Code:
perl -ne '$f = substr($_, 0, 6000);
  if ($f =~ /ex:Msg(.*?)ex:Msg/) { print "MsgId = $1\n"; }
  if ($f =~ /PutDate(.*?)PutTime/) { print "Put Date = $1\n"; }
  if ($f =~ /PutTime(.*?)ApplOrigin/) { print "Put Time = $1\n"; }
  if ($f =~ /CreationTimeStamp(.*?)CreationTimeStamp/) { print "Timestamp = $1\n"; }' temp01

# 10  
Old 08-13-2008
... And finally here's sed:

Code:
cut -c1-6000 temp01 | sed -n '
  # Copy data to hold space
  h
  # Substitute MsgId and print
  s/.*ex:Msg\(.*\)ex:Msg.*/MsgId = \1/p
  # Copy back from hold space
  g
  # Similar for Put Date, Put Time, and Timestamp
  s/.*PutDate\(.*\)PutTime.*/Put Date = \1/p
  g
  s/.*PutTime\(.*\)ApplOrigin.*/Put Time = \1/p
  g
  s/.*CreationTimeStamp\(.*\)CreationTimeStamp.*/Timestamp = \1/p'

The cut could be replaced by s/^\(.\{6000\}\).*/\1/ as the first command in the script if your sed supports that syntax, or by a similar expression with 6000 periods inside the parentheses if your sed can cope with that. Or you could simply hope that every match on the expressions above happens within the first 6,000 characters, and omit the cut altogether. Again, without sample data, it's hard to say.
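As a rough sketch of the cut-less variant, assuming a sed that accepts an interval as large as \{6000\} (GNU sed does; POSIX only guarantees repeat counts up to 255, so others may reject it): the truncation becomes the first command, and everything else stays as in the script above. The extract_with_sed wrapper name is made up for this example.

```shell
# Sketch only: relies on the sed in use accepting \{6000\}.
extract_with_sed() {
  sed -n '
    # Keep only the first 6000 characters of the line (no-op on shorter lines)
    s/^\(.\{6000\}\).*/\1/
    # Copy the truncated line to hold space
    h
    s/.*ex:Msg\(.*\)ex:Msg.*/MsgId = \1/p
    # Restore the truncated line before each further substitution
    g
    s/.*PutDate\(.*\)PutTime.*/Put Date = \1/p
    g
    s/.*PutTime\(.*\)ApplOrigin.*/Put Time = \1/p
    g
    s/.*CreationTimeStamp\(.*\)CreationTimeStamp.*/Timestamp = \1/p
  ' "$@"
}
```

Usage would be extract_with_sed temp01. The -n option matters here: the p flags do the printing, and without -n sed would also print whatever is left in the pattern space at the end of each line.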

Sorry for following up on my own posts; sleeping on this question brought up new ideas.