How to remove spaces using awk,sed,perl? Post: 302446984

Post 302446984 by durden_tyler on Friday 20th of August 2010 10:46:58 AM

Quote:

Originally Posted by cola

...Why are there so many different types of regular expression?

That shouldn't be so surprising. They are just different ways of arriving at the same solution, just as there are different routes from point A to point B.

The regex in the first awk solution used a character class with the repetition quantifier "+".

Code:

/^[ 	]+/

There's a blank space and a tab character inside the brackets, which means that this regex matches one or more occurrences, anchored at the beginning of the line, of either a space or a tab character (in any combination). The sub function replaces the match with a zero-length string.
I think "\t" in place of the literal tab character should work; at least it does for gawk:

Code:

gawk '{sub(/^[ \t]+/,""); print}' file
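To see what the substitution does, here is a small sketch run against made-up sample input. The variable names `input` and `trimmed` and the sample lines are only for illustration. It calls plain `awk` rather than `gawk`, on the assumption that your awk accepts "\t" inside a regex (POSIX awk should, but I have not checked every implementation).

```shell
# Sample input: a space-indented line, a tab-indented line, and an unindented one.
# printf turns \t into a real tab character.
input=$(printf '   three spaces\n\t\tTwo tabs\nno indent')

# Strip the leading run of spaces and/or tabs from every line.
trimmed=$(printf '%s\n' "$input" | awk '{ sub(/^[ \t]+/, ""); print }')

printf '%s\n' "$trimmed"
```

Only the leading whitespace is removed; spaces between words are left alone because the "^" anchor ties the match to the start of the line.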

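The "+" repetition quantifier belongs to Extended Regular Expression (ERE) syntax. GNU sed accepts it (as "\+" in its default mode, or as a plain "+" with the -r option), but the sed binaries on most other Unix systems do not, because sed uses Basic Regular Expressions (BREs) by default. So the sed solution used a different type of regex: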

Code:

/^ \{1,\}/

The brace repetition operator (an interval expression) gives finer control over repetition. "+" means one or more, with no upper limit. {m,n} means at least m and at most n repetitions, and {m,} means at least m repetitions with no upper limit. So {1,} is equivalent to "+", which is why the sed solution works more or less the same way. In a BRE the braces have to be escaped with backslashes, hence \{1,\}. (The sed pattern contains only a space, though, so it doesn't take care of tab characters.)
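A rough sketch of how the sed pattern could be extended to cover tabs as well, and of the equivalence between \{1,\} and "+". The sample input and the variable names (`input`, `bre`, `ere`) are made up for the example. It assumes a sed that understands the POSIX class [[:blank:]] and, for the second command only, one that accepts the -E option (GNU and BSD sed do; older Unix seds may not).

```shell
# Two sample lines: one indented with spaces, one with a tab followed by a space.
input=$(printf '   spaces\n\t tab then space')

# BRE interval: the braces are backslash-escaped, and \{1,\} means "one or more".
# [[:blank:]] matches a space or a tab, so tab-indented lines are handled too.
bre=$(printf '%s\n' "$input" | sed 's/^[[:blank:]]\{1,\}//')

# The same pattern written as an ERE; assumes this sed supports -E.
ere=$(printf '%s\n' "$input" | sed -E 's/^[[:blank:]]+//')

printf '%s\n' "$bre"
if [ "$bre" = "$ere" ]; then
    echo "BRE and ERE forms give the same result"
fi
```

If your sed rejects -E, the first command on its own is the portable one.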

Quote:

...Perl regular expression is not similar to sed regular expression.

Perl is a different beast altogether. All the above concepts, BREs as well as EREs, belong to POSIX regexes. Perl, on the other hand, started off by implementing Henry Spencer's regular expression library. Its regex syntax is richer, more consistent and more extensive than that of POSIX-compliant regexes.

HTH,
tyler_durden

LEARN ABOUT OSF1

regcmp

regcmp(3)						     Library Functions Manual							 regcmp(3)

NAME

       regcmp, regex - Compile and execute regular expression

LIBRARY

       Standard C Library (libc.so, libc.a)

SYNOPSIS

       #include <libgen.h>

       char *regcmp(const char *string1, ... /*, (char *)0 */);

       char *regex(const char *re, const char *subject, ...);

STANDARDS

       Interfaces documented on this reference page conform to industry standards as follows:

       regcmp(), regex():  XPG4-UNIX

       Refer to the standards(5) reference page for more information about industry standards and associated tags.

PARAMETERS

       string1   Points to the string that is to be matched or converted.

       re        Points to a compiled regular expression string.

       subject   Points to the string that is to be matched against re.

DESCRIPTION

       The regcmp()  function compiles a regular expression consisting of the concatenated arguments and returns a pointer to the  compiled  form.
       The  end  of  arguments	is  indicated  by a null pointer.  The malloc() function is used to create space for the compiled form.  It is the
       responsibility of the process to free unneeded space so allocated.  A null pointer returned from regcmp() indicates an invalid argument.

       The regex() function executes a compiled pattern against the subject string. Additional arguments of type char must be  passed  to  receive
       matched subexpressions back.  A global character pointer, __loc1, points to the first matched character in the subject string.

       The regcmp() and regex() functions support the simple regular expressions which are defined in the grep(1) reference page, but the syntax
       and semantics are slightly different.  The following are the valid symbols and their associated meanings:

       [ ] * . ^
              The left and right bracket, asterisk, period, and circumflex symbols retain their meanings as defined in the grep(1) reference
              page.

       $      A dollar sign matches the end of the string; \n matches a new line.

       -      Used within brackets, the hyphen signifies an ASCII character range.  For example, [a-z] is equivalent to [abcd...xyz].  The
              - (hyphen) can represent itself only if used as the first or last character.  For example, the character class expression []-]
              matches the characters ] (right bracket) and - (hyphen).

       +      A regular expression followed by a + (plus sign) means one or more times.  For example, [0-9]+ is equivalent to [0-9][0-9]*.

       {m} {m,} {m,u}
              Integer values enclosed in {} braces indicate the number of times the preceding regular expression can be applied.  The value m
              is the minimum number and u is a number, less than 256, which is the maximum.  The syntax {m} indicates the exact number of
              times the regular expression can be applied.  The syntax {m,} is analogous to {m,infinity}.  The + (plus sign) and * (asterisk)
              operations are equivalent to {1,} and {0,}, respectively.

       (...)$n
              The value of the enclosed regular expression is returned.  The value is stored in the (n+1)th argument following the subject
              argument.  A maximum of ten enclosed regular expressions are allowed.  The regex() function makes its assignments
              unconditionally.

       (...)  Parentheses are used for grouping.  An operator, such as *, +, or {}, can work on a single character or a regular expression
              enclosed in parentheses.  For example, (a*(cb+)*)$0.

       Since all of the symbols defined above are special characters, they must be escaped to be used as themselves.
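       The stated equivalences between +, {1,} and the [0-9][0-9]* form can be checked from the shell.  The sketch below does not call
       regcmp() or regex(); it uses grep -E as a stand-in, on the assumption that its ERE syntax treats these three operators the same way.
       The sample input and the variable names are made up for the example.

```shell
# Sample lines: only two of them contain a run of digits.
input=$(printf 'abc\nroom 42\n7\nno digits here')

# Three spellings of "one or more digits".
plus=$(printf '%s\n' "$input" | grep -E '[0-9]+')
star=$(printf '%s\n' "$input" | grep -E '[0-9][0-9]*')
brace=$(printf '%s\n' "$input" | grep -E '[0-9]{1,}')

printf '%s\n' "$plus"
if [ "$plus" = "$star" ] && [ "$plus" = "$brace" ]; then
    echo "all three patterns select the same lines"
fi
```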

NOTES

       The regcmp() and regex() interfaces are scheduled to be withdrawn from a future version of the X/Open CAE Specification.

       These interfaces are obsolete; they are guaranteed to function properly only in the C/POSIX locale and so should be avoided.  Use the POSIX
       regcomp() interface instead of regcmp() and regex().

RETURN VALUES

       Upon successful completion, the regcmp() function returns a pointer to the compiled regular  expression.   Otherwise,  a  null  pointer	is
       returned and errno may be set to indicate the error.

       Upon  successful  completion,  the  regex() function returns a pointer to the next unmatched character in the subject string.  Otherwise, a
       null pointer is returned.

RELATED INFORMATION

       Commands: grep(1)

       Functions: malloc(3), regcomp(3)

       Standards: standards(5)
