SED help - cleaning up code, extra spaces won't go away


 
# 1  
Old 04-24-2010

Hello,

Within the script I'm working on, I need to take a column from a file and turn it into a variable holding a regex, so that I can egrep another file for the matching lines and, with -v, for the non-matching ones.

My solution is this:
Code:
VAR=`awk -F, '{print $2}' $FAIL | sed 's/-i/\|/g'`
VAR2=`echo $VAR | sed 's/ //g;s/^.\{1\}//g'`

egrep "$VAR2" file.txt >> newfile.txt
egrep -v "$VAR2" file.txt >> newfile.txt

The above works, but it's just ugly sed code.

All together...
input.txt:
Code:
-h a, -i b, -j c
-h d, -i e, -j f
-h g, -i h, -j i

Code:
VAR=`awk -F, '{print $2}' input.txt | sed 's/-i/\|/g'`

... echo $VAR, prints: | b | e | h

So I'm running the variable again, through "another filter":
Code:
VAR2=`echo $VAR | sed 's/ //g;s/^.\{1\}//g'`

... echo $VAR2, prints b|e|h
... which is what I'd need for a good egrep command.

What I don't understand, why 'this' isn't working:
Code:
VAR=`awk -F, '{print $2}' input.txt | sed 's/-i/\|/g;s/^.\{1\}//g;s/ //g'`

???
It'll produce something like this:
b e h
... I'm losing the "|", and the spaces are still there?

Can someone please help me understand what I'm doing wrong w/ SED? I can get the results I want, but it's ugly. Is what I'm doing correct? Lastly... if there's nothing wrong w/ how I'm doing things... is there a better or more efficient way?

Thanks everybody.

Last edited by vgersh99; 04-24-2010 at 09:53 AM.. Reason: code tag added
# 2  
Old 04-24-2010
Run it as:
Code:
nawk -f matt.awk input.txt file.txt

matt.awk:
Code:
BEGIN {
  FS=","
}
FNR==NR {
   match($2, "[^ ][^ ]*$")
   str=substr($2, RSTART)
   regex=(!regex)?str:regex "|" str
   next
}
$0 !~ regex
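
For anyone who wants to try the two-file idiom above without creating matt.awk, here is a minimal sketch that runs the same logic inline. It uses plain awk instead of nawk, and the contents of file.txt are made up for illustration (the thread never shows that file).

```shell
#!/bin/sh
# Sketch: the matt.awk logic run inline against throwaway sample files.
# file.txt below is hypothetical; only input.txt comes from the thread.
tmp=$(mktemp -d) || exit 1

cat > "$tmp/input.txt" <<'EOF'
-h a, -i b, -j c
-h d, -i e, -j f
-h g, -i h, -j i
EOF

printf '%s\n' b x e y h z > "$tmp/file.txt"

# FNR == NR holds only while awk reads the first file (input.txt):
# take the last word of field 2 and append it to an alternation (b|e|h).
# For the second file, print only the lines that do NOT match it.
result=$(awk -F, '
FNR == NR {
    match($2, /[^ ][^ ]*$/)
    str = substr($2, RSTART)
    regex = (regex == "") ? str : regex "|" str
    next
}
$0 !~ regex
' "$tmp/input.txt" "$tmp/file.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Dropping the `!` from the last pattern (`$0 ~ regex`) would print the matching lines instead.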

# 3  
Old 04-24-2010
Quote:
Originally Posted by Matthias03
...
What I don't understand, why 'this' isn't working:
Code:
VAR=`awk -F, '{print $2}' input.txt | sed 's/-i/\|/g;s/^.\{1\}//g;s/ //g'`

???
It'll produce something like this:
b e h
... I'm losing the "|", and the spaces are still there?
...
Well, I do see the "|" characters in my output:

Code:
$ 
$ cat -n input.txt
     1    -h a, -i b, -j c
     2    -h d, -i e, -j f
     3    -h g, -i h, -j i
$ 
$ awk -F, '{print $2}' input.txt | sed 's/-i/\|/g;s/^.\{1\}//g;s/ //g'
|b
|e
|h
$

But if I assign the quoted output of that pipeline to a shell variable, then spaces are introduced.

Code:
$ 
$ VAR="`awk -F, '{print $2}' input.txt | sed 's/-i/\|/g;s/^.\{1\}//g;s/ //g'`"
$ 
$ echo $VAR
|b |e |h
$

The command pipeline's output did *not* have spaces at the end:

Code:
$ 
$ awk -F, '{print $2}' input.txt | sed 's/-i/\|/g;s/^.\{1\}//g;s/ //g' | od -bc
0000000 174 142 012 174 145 012 174 150 012
          |   b  \n   |   e  \n   |   h  \n
0000011
$

So I guess the shell replaces those newlines with spaces when the output is assigned to a (shell) variable:

Code:
$ 
$ VAR=`awk -F, '{print $2}' input.txt | sed 's/-i/\|/g;s/^.\{1\}//g;s/ //g'`
$ echo $VAR
|b |e |h
$ 
$ ## or quoted
$ VAR="`awk -F, '{print $2}' input.txt | sed 's/-i/\|/g;s/^.\{1\}//g;s/ //g'`"
$ echo $VAR
|b |e |h
$ 
$ echo $VAR | od -bc
0000000 174 142 040 174 145 040 174 150 012
          |   b       |   e       |   h  \n
0000011
$

Here's another way to extract a pipe-delimited output from the file using plain awk:

Code:
$ 
$ cat -n input.txt
     1    -h a, -i b, -j c
     2    -h d, -i e, -j f
     3    -h g, -i h, -j i
$ 
$ awk '{sub(",","",$4); x = NR==1 ? $4 : x"|"$4} END{print x}' input.txt
b|e|h
$

tyler_durden
# 4  
Old 04-24-2010
Quote:
Originally Posted by durden_tyler
But if I assign the quoted output of that pipeline to a shell variable, then spaces are introduced.
The value of $VAR does contain the newlines. It may seem that the shell is converting newlines to spaces, but it is not. Since the variable expansion is unquoted, the shell splits the result into words at each of those newlines (and at spaces and tabs, assuming the default value of IFS). echo never sees the newlines. echo does its job, printing its arguments as a space-delimited list. If VAR is double-quoted, echo is invoked with one argument, which contains the newlines.

Code:
$ VAR=`awk -F, '{print $2}' input.txt | sed 's/-i/\|/g'`
$ echo $VAR
| b | e | h
$ echo "$VAR"
 | b
 | e
 | h
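
This is probably also why the two-step version in the first post works while the one-step version does not: sed edits each line separately, whereas `echo $VAR` hands the second sed a single, already flattened line. A rough sketch of the difference, with the sample data supplied by a helper function (`sample` is a made-up name) and the sed replacement written as a plain `|` rather than `\|`:

```shell
#!/bin/sh
# Sketch: compare the one-step and two-step pipelines from the question.
# sample() is a hypothetical stand-in for "cat input.txt".
sample() {
    printf '%s\n' '-h a, -i b, -j c' '-h d, -i e, -j f' '-h g, -i h, -j i'
}

# One step: every sed command runs once per line, so the first character
# is removed from each of the three lines and the newlines stay.
one_step=$(sample | awk -F, '{print $2}' | sed 's/-i/|/g;s/^.\{1\}//g;s/ //g')

# Two steps: the unquoted $VAR is word-split and echo joins the words
# with single spaces, so the second sed sees only one line.
VAR=$(sample | awk -F, '{print $2}' | sed 's/-i/|/g')
two_step=$(echo $VAR | sed 's/ //g;s/^.\{1\}//g')

printf 'one step:\n%s\n' "$one_step"
printf 'two step: %s\n' "$two_step"
```

In the one-step case `s/^.\{1\}//` strips the leading space of every line, not a leading `|`, which is why each line still starts with `|`.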


Sidenote: Just as echo isn't seeing any of the newlines (because the shell "consumed" them during the field splitting step), echo is not seeing any of the spaces either. You are just less likely to miss them since echo prints a space to delimit its arguments, which are often space delimited to begin with. (Although you would notice that multiple-consecutive spaces are squeezed into one.)

Code:
$ VAR='a b c'
$ # Gives the impression that field splitting did not happen, but it did.
$ echo $VAR
a b c
$ VAR='a          b              c'
$ echo $VAR
a b c
$ echo "$VAR"
a          b              c
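
One way to make the field splitting visible is to count the arguments a command actually receives. A small sketch, where `count_args` is a helper name invented for this example:

```shell
#!/bin/sh
# Sketch: count the arguments produced by an unquoted vs. a quoted expansion.
# count_args is a hypothetical helper, not a standard utility.
count_args() { echo "$#"; }

VAR='a          b              c'

unquoted=$(count_args $VAR)     # split on runs of spaces: a, b, c
quoted=$(count_args "$VAR")     # a single argument, spaces preserved

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

With the default IFS the unquoted expansion yields three arguments and the quoted one yields one, which matches what echo printed above.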

Regards,
Alister

Last edited by alister; 04-24-2010 at 12:40 PM..
# 5  
Old 04-24-2010
In short, try using:
Code:
awk -F, '{printf "%s", $2}' input.txt
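
Because printf does not emit a newline after each record, everything arrives at sed as one line and a single sed call can finish the job. A sketch of how that could look; the `sample` function and the `END {print ""}` rule (added so that sed is guaranteed a newline-terminated line) are additions for this example, not part of the suggestion above:

```shell
#!/bin/sh
# Sketch: join with awk's printf, then clean up in one sed pass.
# sample() is a hypothetical stand-in for "cat input.txt".
sample() {
    printf '%s\n' '-h a, -i b, -j c' '-h d, -i e, -j f' '-h g, -i h, -j i'
}

# awk prints " -i b -i e -i h" on a single line; sed turns each " -i "
# into "|" and then drops the leading "|".
VAR=$(sample |
      awk -F, '{printf "%s", $2} END {print ""}' |
      sed 's/ -i /|/g;s/^|//')

printf '%s\n' "$VAR"
```

Passing `"%s"` as the format keeps a stray `%` in the data from being interpreted as a format specifier.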

# 6  
Old 04-24-2010
Quote:
Originally Posted by Matthias03
What I don't understand, why 'this' isn't working:
Code:
VAR=`awk -F, '{print $2}' input.txt | sed 's/-i/\|/g;s/^.\{1\}//g;s/ //g'`

???
It'll produce something like this:
b e h
... I'm losing the "|", and the spaces are still there?

Can someone please help me understand what I'm doing wrong w/ SED? I can get the results I want, but it's ugly. Is what I'm doing correct? Lastly... if there's nothing wrong w/ how I'm doing things... is there a better or more efficient way?

Thanks everybody.
I cannot reproduce the "b e h" pipe-loss result.

My results:
Code:
$ cat input.txt 
-h a, -i b, -j c
-h d, -i e, -j f
-h g, -i h, -j i
$ VAR=`awk -F, '{print $2}' input.txt | sed 's/-i/\|/g;s/^.\{1\}//g;s/ //g'`
$ echo $VAR
|b |e |h

Personally, I'm partial to ...
Code:
awk -F'[, ]+' '{print $4}' input.txt | paste -sd\| -
b|e|h

... and ...
Code:
awk -F'[, ]+' '{printf("%s", (NR!=1 ? "|" : "") $4)}' input.txt
b|e|h
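
Putting it together with the original goal (matching and inverting against another file), a sketch might look like the following. It uses `grep -E` in place of egrep, and the contents of file.txt are invented for the example:

```shell
#!/bin/sh
# Sketch: build the alternation once, then use it for both greps.
# file.txt below is hypothetical; only input.txt comes from the thread.
tmp=$(mktemp -d) || exit 1

cat > "$tmp/input.txt" <<'EOF'
-h a, -i b, -j c
-h d, -i e, -j f
-h g, -i h, -j i
EOF

printf '%s\n' b x e y h z > "$tmp/file.txt"

# Field 4 is the value after -i; paste joins the lines with "|".
pattern=$(awk -F'[, ]+' '{print $4}' "$tmp/input.txt" | paste -sd'|' -)

matched=$(grep -E -- "$pattern" "$tmp/file.txt")
unmatched=$(grep -Ev -- "$pattern" "$tmp/file.txt")

printf 'pattern: %s\n' "$pattern"
printf 'matched:\n%s\n' "$matched"
printf 'unmatched:\n%s\n' "$unmatched"
rm -rf "$tmp"
```

Note that `b|e|h` matches those letters anywhere in a line; if whole lines are meant, adding `-x` to both grep calls would restrict the match accordingly.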

Regards,
Alister

Last edited by alister; 04-24-2010 at 03:01 PM..