sed or awk editing help


 
# 36  
Old 11-02-2010
Like many features, they are pretty but not fast, so you avoid them for heavy lifting where possible, though sometimes you need them even though they are slower.

Of course, if the user wanted to trim leading and trailing spaces, these four commands would run fast, but the ones that search for a space first should go last. Since sed works through its regexes left to right, it is best to have a selective string at the left end (I guess an Arabic/Hebrew-language sed would work right to left?):
Code:
sed '
  s/^  *//
  s/,  */,/g
  s/  *,/,/g
  s/  *$//
 '
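
A quick way to check the four-command script is to feed it one sample line. This is only a sketch: the function name trim_csv_spaces and the sample input are made up for illustration, and the g flag on the two comma rules is an assumption so that every comma on the line is handled, not just the first.

```shell
# trim_csv_spaces is a hypothetical wrapper name, not something from the thread.
# The g flag on the two comma rules is an assumption (handle every comma).
trim_csv_spaces() {
  sed '
    s/^  *//
    s/,  */,/g
    s/  *,/,/g
    s/  *$//
  '
}

# Sample line with spaces at both ends and around the commas.
printf '  a ,  b,c  , d  \n' | trim_csv_spaces
# prints: a,b,c,d
```

Spaces inside a field (between two words, say) are left alone, since every rule is tied to a comma or to a line end.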

---------- Post updated at 04:08 PM ---------- Previous update was at 04:00 PM ----------

Quote:
Originally Posted by ctsgnb
Maybe it is the parsing step: when ambiguous grouping is specified, it goes through a kind of "auto-completion" step.
Also, maybe it uses a memory copy instead of a memory mapping?
Early sed had a limited line size and was faster, with less indirection, but GNU sed and later seds seem to have very big or realloc()'d buffers. I don't think many programmers use a mmap()'d tmp file for the buffer. Sed is a pipe-oriented stream editor, so it could not always map the input file, and even when it could, it could not write there; and of course it needs to scan the intermediate result in multi-command scripts. So I do not think of memory mapping and sed at the same time. I suspect that as it rewrites a line, it keeps an input pointer and an output pointer, and if the line is expanding, then when the pointers collide there must be either mid-substitute moves in one buffer or copying between two buffers. I am not going to read the code, though!
# 37  
Old 11-02-2010
Yeah! Thanks for these clarifications, DGnitPick.

By the way, here is a thread in which I posted, a few days ago, an example of what I call "ambiguous grouping":
https://www.unix.com/shell-programmin...#post302464388 see in post #21

Consider how it behaves with ambiguous matching, and how \1 and & appear to be auto-completed, as does \2 when it is the last group in the line (that was on SunOS 5.9, but I got the same results on a GNU/Linux machine):
Code:
# echo MPMTR20100706043000.txt|sed -n -e 's/\([0-9][0-9]\).*\(3[0-9][0-9]\)/\1,\2/p'
MPMTR20,3000.txt
# echo MPMTR20100706043000.txt|sed -n -e 's/\([0-9][0-9]\).*\(3[0-9][0-9]\)/\1,\2,&/p'
MPMTR20,300,20100706043000.txt
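
One way to read these results without any "auto-completion" is that sed replaces only the matched span and prints everything outside it unchanged. The sketch below uses the same pattern but wraps & in brackets so the matched span is visible; the function name show_match and the brackets are only for illustration.

```shell
# show_match is a hypothetical helper name; the pattern is the one from the post.
# & is the whole matched text, so [&] marks exactly what sed will replace.
show_match() {
  sed -n -e 's/\([0-9][0-9]\).*\(3[0-9][0-9]\)/[&]/p'
}

echo MPMTR20100706043000.txt | show_match
# prints: MPMTR[2010070604300]0.txt
```

The leading MPMTR and the trailing 0.txt lie outside the match, so they are copied through untouched. That is why \1,\2 (that is, 20,300) appears to be followed by an extra 0.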


Last edited by ctsgnb; 11-02-2010 at 05:29 PM..
# 38  
Old 11-02-2010
Hey, I love mmap(), and mmap64() more, a golden door into unlimited VM and RAM use and random access to application data. But sed is all about the (expandable) microcosm of two lines. The early small buffer sed activity fit in the small L1 caches of earlier days. I rarely nit pick below the bit level -- the pixel, maybe, but not the bit! It is about seeing all the choices, weighing all the choices, and making informed choices, investing in knowing the right technique for the next time, investing in yourself, investing in your new friends.
# 39  
Old 11-02-2010
Quote:
there must be either mid-substitute moves in one buffer or copying between two buffers

Since I am not a C coder, I trust your intuition about that, dude; we have no better answer so far.
# 40  
Old 11-02-2010
If you have Solaris, truss with the -u'*' option shows you more than you want to know about the libc and other calls a running process is making. Java object creation does a lot of memcpy()! The truss or tusc commands are very educational, even if you do not have the source, do not read C/C++, or the process is already running. They show all the kernel calls even without the -u'*' feature.
# 41  
Old 11-02-2010
I'm sure the multipass solutions will be faster, so I'm posting this just to illustrate some Perl constructs, in one pass (if I'm not missing something):

Code:
perl -ple'
  s/
    ((?<=,)|(?<=^))
    \s+
    ((?=,)|(?=$))
    //xg  
  ' infile


Last edited by radoulov; 11-02-2010 at 07:07 PM..
# 42  
Old 11-02-2010
The single-pass Perl solution takes 2.5 times longer than the fastest sed solution, but it is twice as fast as the solutions that employed grouping. Interestingly, when I tried this:
Code:
perl -ple 's/^ +,/,/;s/, +,/,,/g;s/, +,/,,/g;s/, +$/,/'

It was much faster and only about 20% slower than the equivalent fastest sed solution.
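
The middle substitution appears twice in that one-liner, presumably because a single /g pass cannot handle two blank fields in a row: the comma that closes one match is consumed and cannot open the next. Here is a small sketch of that effect, translated to sed for illustration (the function names and the sample line are made up):

```shell
# Hypothetical helper names; the rule is a sed rendering of  s/, +,/,,/g.
one_pass() { sed 's/,  *,/,,/g'; }
two_pass() { sed 's/,  *,/,,/g;s/,  *,/,,/g'; }

# Two blank fields in a row share the comma between them.
echo 'a,  ,  ,b' | one_pass   # prints: a,,  ,b
echo 'a,  ,  ,b' | two_pass   # prints: a,,,b
```

After the first pass no two untouched blank fields can still be neighbours, so the second pass picks up whatever is left.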

Last edited by Scrutinizer; 11-02-2010 at 07:32 PM..