Full Discussion: Good sed Book?
Post 302731705 by DGPickett on Thursday 15th of November 2012 03:28:22 PM
So, it is an area where behavior is not trustworthy, and thus I never go there! Files with no line feed right before EOF tend to have that last "line" ignored by sed. Maybe that's POSIX, too. I think EOF, new line and form feed should all be treated as end of line, but it is a bit late for that, never mind those Mac people with just carriage return and the DOS people with both. Both made sense for a teletype: the carriage took more time to return 80 columns than the platen took to rise one line, so the carriage return was sent on its way first.
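
If you would rather not depend on how a particular sed treats an unterminated last line, one defensive option is to add the missing newline before sed ever sees the data. A minimal sketch; the helper name with_final_newline is made up for illustration, and it assumes mktemp is available for the demo:

```shell
# with_final_newline FILE: copy FILE to stdout, adding a trailing newline
# only when the last byte is not already one. (Hypothetical helper name.)
with_final_newline() {
  cat "$1"
  # tail -c 1 prints the last byte; command substitution strips a trailing
  # newline, so a non-empty result means the file ends without one.
  if [ -n "$(tail -c 1 "$1")" ]; then
    printf '\n'
  fi
}

# Demo: a two-line file whose last line has no newline.
demo_file=$(mktemp)
printf 'first\nlast' >"$demo_file"
with_final_newline "$demo_file" | sed -n '$p'
rm -f "$demo_file"
```

An empty file is passed through unchanged, since there is no last byte to test.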

Since sed is pretty easy about white space, I put the sed commands on different lines than the shell, indented meaningfully, and so have never needed one or more -e options! You, too, are worthy of well-formatted code, which reduces your errors and potential confusion, and that of future maintainers.
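
As a small illustration of that layout (the function name tidy and the sample input are invented for the example), the whole script sits in one quoted argument with one command per line, so no -e is needed:

```shell
# tidy: squeeze runs of blanks, trim both ends, then tag each line.
# One sed command per line inside a single quoted script.
tidy() {
  sed '
    s/[[:blank:]][[:blank:]]*/ /g
    s/^ //
    s/ $//
    s/^/> /
  '
}

printf '%s\n' '  alpha   beta ' 'gamma' | tidy
```

The indentation inside the quotes is only for the reader; sed skips blanks in front of each command.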

I have never used 'G' and 'H' or the space exchange ('x'), but 'g' and 'h' are nice for parsing situations where something is missing, so you want to annotate the original line with an error prefix and write it to a reject log. You 'h' it on the way in, in case of rejection, and upon rejection you 'g' it before annotating it on the way out. Similarly, I usually do not use 'D' but 's/.*\n//', so the second line is not released.
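
A sketch of that hold-and-restore pattern, under invented assumptions: the input format (id=N name=word), the function name parse_or_reject and the reject-file argument are all hypothetical. The line is saved with h, parsed in two steps, and if either step fails the untouched original is fetched back with g, prefixed and written to the reject file:

```shell
# parse_or_reject REJECT_FILE: read lines like "id=12 name=bob" on stdin,
# print "12,bob" for good lines, and log bad lines, unmodified apart from
# a prefix, to REJECT_FILE. (Names and record format are illustrative only.)
parse_or_reject() {
  sed -n '
    h
    s/^id=\([0-9][0-9]*\)  */\1,/
    t gotid
    b reject
    :gotid
    s/,name=\([a-z][a-z]*\)$/,\1/
    t good
    :reject
    g
    s/^/REJECT: /
    w '"$1"'
    b
    :good
    p
  '
}

# Demo: one good record, one half-parsed record, one hopeless record.
reject_log=$(mktemp)
printf '%s\n' 'id=12 name=bob' 'id=7 nombre=x' 'garbage' |
  parse_or_reject "$reject_log"
cat "$reject_log"
rm -f "$reject_log"
```

The second record shows why the copy matters: the first s has already rewritten it to "7,nombre=x" by the time the second s fails, and g puts the original text back before it is logged.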

The 't' is a great time saver, as one s can both modify the line and recognize what had been there ('/../') with a single regex search. Just make sure, especially in a loop, that the flag gets cleared before reuse, as it reflects every s since the last t or automatic read.
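
One common way to clear the flag is a t that branches to a label on the very next line. A minimal sketch, with an invented function name (classify) and invented input; the first s only normalizes spacing, so its success must not leak into the test that follows:

```shell
# classify: squeeze spaces, then tag lines that start with "ERROR ".
# The "t clear / :clear" pair exists only to reset the substitution flag
# left behind by the first s, so that "t bad" reflects the second s alone.
classify() {
  sed '
    s/  */ /g
    t clear
    :clear
    s/^ERROR /&/
    t bad
    s/^/ok: /
    b
    :bad
    s/^/bad: /
  '
}

printf '%s\n' 'ERROR  disk full' 'all  fine' | classify
```

Without the two "clear" lines, the second input line would also be tagged bad, because the space squeeze had already set the flag.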

I have been warming up to -n and 's/.../.../p' lately, as they fit many situations (frequency, not variety), but initially I ignored them as I was interested in the most versatile tactics.
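
For instance, -n plus the p flag selects and rewrites in one step: only lines where the substitution succeeds are printed. A small sketch with made-up input and a made-up function name (login_names):

```shell
# login_names: print the value after "user=" for lines that start with it;
# every other line is dropped because -n suppresses the automatic print.
login_names() {
  sed -n 's/^user=\([^ ]*\).*$/\1/p'
}

printf '%s\n' 'user=ann shell=ksh' 'no user here' 'user=bob' | login_names
```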

I would note that many sed flavors do not tolerate comments ('# whatever'), which is a shame. Inline documentation can help maintenance. In C, C++, Java, shell and SQL I like the switch/case/when/then/else, as each case can be commented neatly!

In data warehouses and similar places with crushingly big data sets, sed's lack of temp files and near-C speed are very well respected. It has a very important role to play in a pipe-oriented shell programming paradigm, where there are no intermediate or temp files, or any temp files are managed by the tool like sort. This results in lower latency and pipeline parallel multiprocessing, as many steps run concurrently.
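
A toy version of such a pipeline, with invented log lines and an invented function name (host_counts): every stage runs concurrently, nothing is written to an intermediate file by the script itself, and only sort may spill to temp files it manages on its own:

```shell
# host_counts: extract host=... values, count occurrences, and normalize
# the padding that uniq -c puts in front of each count.
host_counts() {
  sed -n 's/^.*host=\([^ ]*\).*$/\1/p' |
    sort |
    uniq -c |
    sed 's/^ *\([0-9][0-9]*\)  */\1 /'
}

printf '%s\n' \
  'GET /  host=b.example' \
  'noise line' \
  'GET /x host=a.example' \
  'GET /y host=b.example' | host_counts
```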

Using literal '|' and named pipes (/sbin/mknod p p -- one of those p's is a file name), especially the self-managed named pipes '<(...)' and '>(...)' in bash and luckier systems' ksh, you can build a tree of pipelines working one or many inputs to produce one or many outputs. (On unlucky systems, bash makes named pipes somewhere under /var/tmp that accumulate, a bug I reported.) Unfortunately for sed, the self-managed named pipes '<(...)' and '>(...)' are parsed as words in ksh (according to David Korn) and probably bash; they have virtual spaces around them that you cannot erase without passing them through a shell function call or the like. Life is sometimes excessively complicated! The following example does not work: the first '>(...)' after the 'w' command in $1 might resolve to, essentially, ' /dev/fd/3 ' (the writable fd number from a pipe() call), so the pipe's file name '/dev/fd/3' arrives as a separate sed command line argument, $2, rather than as part of the script; the next part of the sed script is $3, and the next named pipe, perhaps '/dev/fd/5', is $4:
Code:
$ sed '
  /xyz/w '>( sort -u >file_1 )'
  /abc/w '>( sort -u >file_2 )'
 .... '
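
One way around the word-splitting problem is to skip '>(...)' and manage the named pipes by hand, so their names can be pasted into the script text. This is a swapped-in technique, not the function-call trick mentioned above; it is a minimal sketch that assumes mkfifo and mktemp -d are available and that the temp directory name contains no newline. The function name split_sorted is invented:

```shell
# split_sorted INPUT OUT_XYZ OUT_ABC: send /xyz/ lines and /abc/ lines
# through two explicit named pipes, each feeding its own "sort -u".
split_sorted() {
  fifo_dir=$(mktemp -d) || return 1
  mkfifo "$fifo_dir/xyz" "$fifo_dir/abc" || return 1

  # Start the readers first, in the background; each blocks until sed
  # opens the matching pipe for writing.
  sort -u <"$fifo_dir/xyz" >"$2" &
  sort -u <"$fifo_dir/abc" >"$3" &

  # The pipe names are part of the script text, so "w" sees them.
  sed -n '
    /xyz/w '"$fifo_dir"'/xyz
    /abc/w '"$fifo_dir"'/abc
  ' "$1"

  wait              # let both sorts finish writing their output files
  rm -r "$fifo_dir"
}

# Demo with throwaway files.
work=$(mktemp -d)
printf '%s\n' 'b xyz' 'a abc' 'a xyz' 'b xyz' 'neither' >"$work/in"
split_sorted "$work/in" "$work/file_1" "$work/file_2"
cat "$work/file_1" "$work/file_2"
rm -r "$work"
```

The cost compared with '>(...)' is that you create, wait on and remove the pipes yourself.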


Last edited by DGPickett; 11-15-2012 at 05:21 PM..
 

SED(1)                           User Commands                          SED(1)

NAME
       sed - stream editor for filtering and transforming text

SYNOPSIS
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...

DESCRIPTION
       Sed is a stream editor. A stream editor is used to perform basic text
       transformations on an input stream (a file or input from a pipeline).
       While in some ways similar to an editor which permits scripted edits
       (such as ed), sed works by making only one pass over the input(s), and
       is consequently more efficient. But it is sed's ability to filter text
       in a pipeline which particularly distinguishes it from other types of
       editors.

       -n, --quiet, --silent
              suppress automatic printing of pattern space

       -e script, --expression=script
              add the script to the commands to be executed

       -f script-file, --file=script-file
              add the contents of script-file to the commands to be executed

       -i[SUFFIX], --in-place[=SUFFIX]
              edit files in place (makes backup if extension supplied)

       -l N, --line-length=N
              specify the desired line-wrap length for the `l' command

       --posix
              disable all GNU extensions.

       -r, --regexp-extended
              use extended regular expressions in the script.

       -s, --separate
              consider files as separate rather than as a single continuous
              long stream.

       -u, --unbuffered
              load minimal amounts of data from the input files and flush the
              output buffers more often

       --help display this help and exit

       --version
              output version information and exit

       If no -e, --expression, -f, or --file option is given, then the first
       non-option argument is taken as the sed script to interpret. All
       remaining arguments are names of input files; if no input files are
       specified, then the standard input is read.

       E-mail bug reports to: bonzini@gnu.org . Be sure to include the word
       ``sed'' somewhere in the ``Subject:'' field.

COMMAND SYNOPSIS
       This is just a brief synopsis of sed commands to serve as a reminder to
       those who already know sed; other documentation (such as the texinfo
       document) must be consulted for fuller descriptions.

   Zero-address ``commands''
       : label
              Label for b and t commands.

       #comment
              The comment extends until the next newline (or the end of a -e
              script fragment).

       }      The closing bracket of a { } block.

   Zero- or One- address commands
       =      Print the current line number.

       a \
       text   Append text, which has each embedded newline preceded by a
              backslash.

       i \
       text   Insert text, which has each embedded newline preceded by a
              backslash.

       q      Immediately quit the sed script without processing any more
              input, except that if auto-print is not disabled the current
              pattern space will be printed.

       Q      Immediately quit the sed script without processing any more
              input.

       r filename
              Append text read from filename.

       R filename
              Append a line read from filename.

   Commands which accept address ranges
       {      Begin a block of commands (end with a }).

       b label
              Branch to label; if label is omitted, branch to end of script.

       t label
              If a s/// has done a successful substitution since the last
              input line was read and since the last t or T command, then
              branch to label; if label is omitted, branch to end of script.

       T label
              If no s/// has done a successful substitution since the last
              input line was read and since the last t or T command, then
              branch to label; if label is omitted, branch to end of script.

       c \
       text   Replace the selected lines with text, which has each embedded
              newline preceded by a backslash.

       d      Delete pattern space. Start next cycle.

       D      Delete up to the first embedded newline in the pattern space.
              Start next cycle, but skip reading from the input if there is
              still data in the pattern space.

       h H    Copy/append pattern space to hold space.

       g G    Copy/append hold space to pattern space.

       x      Exchange the contents of the hold and pattern spaces.

       l      List out the current line in a ``visually unambiguous'' form.

       n N    Read/append the next line of input into the pattern space.

       p      Print the current pattern space.

       P      Print up to the first embedded newline of the current pattern
              space.

       s/regexp/replacement/
              Attempt to match regexp against the pattern space. If
              successful, replace that portion matched with replacement. The
              replacement may contain the special character & to refer to
              that portion of the pattern space which matched, and the
              special escapes \1 through \9 to refer to the corresponding
              matching sub-expressions in the regexp.

       w filename
              Write the current pattern space to filename.

       W filename
              Write the first line of the current pattern space to filename.

       y/source/dest/
              Transliterate the characters in the pattern space which appear
              in source to the corresponding character in dest.

Addresses
       Sed commands can be given with no addresses, in which case the command
       will be executed for all input lines; with one address, in which case
       the command will only be executed for input lines which match that
       address; or with two addresses, in which case the command will be
       executed for all input lines which match the inclusive range of lines
       starting from the first address and continuing to the second address.
       Three things to note about address ranges: the syntax is addr1,addr2
       (i.e., the addresses are separated by a comma); the line which addr1
       matched will always be accepted, even if addr2 selects an earlier line;
       and if addr2 is a regexp, it will not be tested against the line that
       addr1 matched.

       After the address (or address-range), and before the command, a ! may
       be inserted, which specifies that the command shall only be executed if
       the address (or address-range) does not match.

       The following address types are supported:

       number Match only the specified line number.

       first~step
              Match every step'th line starting with line first. For example,
              ``sed -n 1~2p'' will print all the odd-numbered lines in the
              input stream, and the address 2~5 will match every fifth line,
              starting with the second. (This is an extension.)

       $      Match the last line.

       /regexp/
              Match lines matching the regular expression regexp.

       \cregexpc
              Match lines matching the regular expression regexp. The c may
              be any character.

       GNU sed also supports some special 2-address forms:

       0,addr2
              Start out in "matched first address" state, until addr2 is
              found. This is similar to 1,addr2, except that if addr2 matches
              the very first line of input the 0,addr2 form will be at the end
              of its range, whereas the 1,addr2 form will still be at the
              beginning of its range.

       addr1,+N
              Will match addr1 and the N lines following addr1.

       addr1,~N
              Will match addr1 and the lines following addr1 until the next
              line whose input line number is a multiple of N.

REGULAR EXPRESSIONS
       POSIX.2 BREs should be supported, but they aren't completely because of
       performance problems. The \n sequence in a regular expression matches
       the newline character, and similarly for \a, \t, and other sequences.

BUGS
       E-mail bug reports to bonzini@gnu.org. Be sure to include the word
       ``sed'' somewhere in the ``Subject:'' field. Also, please include the
       output of ``sed --version'' in the body of your report if at all
       possible.

COPYRIGHT
       Copyright (C) 2003 Free Software Foundation, Inc.
       This is free software; see the source for copying conditions. There is
       NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
       PURPOSE, to the extent permitted by law.

SEE ALSO
       awk(1), ed(1), grep(1), tr(1), perlre(1), sed.info, any of various
       books on sed, the sed FAQ
       (http://sed.sf.net/grabbag/tutorials/sedfaq.txt),
       http://sed.sf.net/grabbag/.

       The full documentation for sed is maintained as a Texinfo manual. If
       the info and sed programs are properly installed at your site, the
       command

              info sed

       should give you access to the complete manual.

ATTRIBUTES
       See attributes(5) for descriptions of the following attributes:

       +--------------------+-----------------+
       |   ATTRIBUTE TYPE   | ATTRIBUTE VALUE |
       +--------------------+-----------------+
       |Availability        | SUNWgsed        |
       +--------------------+-----------------+
       |Interface Stability | Volatile        |
       +--------------------+-----------------+

NOTES
       Source for gsed is available on http://opensolaris.org.

sed version 4.1.4                February 2006                          SED(1)