Grep string causes extra spaces


# 1  

Hello,
I have an XML file. My aim is to read each line of a keywords file and search for that string in the XML file.
When a keyword is found in the XML file, I expect the script to go to the previous line of the XML file and grab the value between two strings. It almost works, with one error.

tab separated keywords.txt
Code:
test1 qqq98
test35 sss32
test26 Rsiw

1.xml file
Code:
  <id="229954e70d6b702f8d570b4be11af181">
    <display-name>test44 lgi3d</display-name>
  <id="229954e70d6b702f8d51331cbe11af181">
    <display-name>test35 kkld</display-name>
  <id="2223230did3s2Qafevrgvve1cbe11af181">
    <display-name>test26 Rsiw</display-name>

expected output:
Code:
test1 qqq98 id=""
test35 sss32 id=""
test26 Rsiw id="2223230did3s2Qafevrgvve1cbe11af181"

Code:
while read COL1 COL2 && read -r line <&3; do
A=$(grep -B1 "$COL1.*$COL2" 1.xml | grep -v "display-name" | sed -e 's/<id=\"\(.*\)\">/\1/' )
#A=$(grep -B1 "$COL1.*$COL2" 1.xml | grep -v "display-name" | grep -o -P '(?<=<id=\").*(?=\">)')
echo "$COL1 $COL2 id=\"$A\""
done < keywords.txt 3<1.xml

This gives:
Code:
test1 qqq98 id=""
test35 sss32 id=""
test26 Rsiw id="  2223230did3s2Qafevrgvve1cbe11af181"

I wonder why there are two spaces before the value of $A in the console output.

Thank you
Boris

Last edited by baris35; 5 Days Ago at 09:01 AM..
# 2  
It doesn't print "extra" spaces, but the two leading spaces in the "id" line, which you do not remove with your sed command. Try again piping through
Code:
sed -e 's/^ *<id=\"\(.*\)\">/\1/'

, i.e. include the spaces from line start...
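
A quick way to see the difference, as a minimal sketch: the sample line is copied from 1.xml in post #1, and the variable names `without_anchor` and `with_anchor` are only illustrative.

```shell
# One "id" line as it appears in 1.xml, including its two leading spaces.
line='  <id="2223230did3s2Qafevrgvve1cbe11af181">'

# Original expression: only the <id="..."> part is replaced, so the
# leading spaces, which lie outside the match, are left in place.
without_anchor=$(printf '%s\n' "$line" | sed -e 's/<id="\(.*\)">/\1/')

# Expression from this post: "^ *" pulls the leading spaces into the match.
with_anchor=$(printf '%s\n' "$line" | sed -e 's/^ *<id="\(.*\)">/\1/')

printf '[%s]\n' "$without_anchor"   # [  2223230did3s2Qafevrgvve1cbe11af181]
printf '[%s]\n' "$with_anchor"      # [2223230did3s2Qafevrgvve1cbe11af181]
```

The brackets in the printed output are there only to make the leading spaces visible.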
This User Gave Thanks to RudiC For This Post:
# 3  
How about (be aware there's NO test1 in your data samples)
Code:
awk -F"[<>]" '
NR == FNR       {T[$0]
                 next
                }
/<id/           {TMP = $2
                 next
                }
                {print $3, ($3 in T)?TMP:"id=\"\""}
' keywords.txt 1.xml 
test44 lgi3d id=""
test35 kkld id=""
test26 Rsiw id="2223230did3s2Qafevrgvve1cbe11af181"

Aside: why do you read line <&3 and then don't use it?
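
In the loop from post #1, `read -r line <&3` consumes one line of 1.xml per iteration without ever using it, and the loop stops as soon as either file runs out. A possible version without descriptor 3 is sketched below. It assumes GNU or BSD grep (for `-B1`), recreates the sample files in a scratch directory so it can be run on its own, and uses the sed expression with `^ *` from post #2. The names `tmpdir` and `result` are illustrative.

```shell
# Recreate the sample files from post #1 in a scratch directory
# (an assumption made only to keep this sketch self-contained).
tmpdir=$(mktemp -d)
printf 'test1\tqqq98\ntest35\tsss32\ntest26\tRsiw\n' > "$tmpdir/keywords.txt"
cat > "$tmpdir/1.xml" <<'EOF'
  <id="229954e70d6b702f8d570b4be11af181">
    <display-name>test44 lgi3d</display-name>
  <id="229954e70d6b702f8d51331cbe11af181">
    <display-name>test35 kkld</display-name>
  <id="2223230did3s2Qafevrgvve1cbe11af181">
    <display-name>test26 Rsiw</display-name>
EOF

# Same loop as in post #1, minus the unused read from descriptor 3.
result=$(
  while read -r COL1 COL2; do
    # Matching line plus the one before it; drop the display-name line,
    # then strip the leading spaces and the <id="..."> wrapper.
    A=$(grep -B1 "$COL1.*$COL2" "$tmpdir/1.xml" | grep -v "display-name" |
        sed -e 's/^ *<id="\(.*\)">/\1/')
    echo "$COL1 $COL2 id=\"$A\""
  done < "$tmpdir/keywords.txt"
)
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample data this should print the three lines listed as expected output in post #1.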
This User Gave Thanks to RudiC For This Post:
# 5  
Code:
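# For each keyword line, scan 1.xml bottom-up (tac), so the line read
# right after a match is the one that precedes it in the file.
while read key; do
        while read line; do
                if [[ $line =~ $key ]]; then
                        # Read the next line (the <id="..."> line) split on
                        # double quotes; the id value is the second field.
                        IFS=\" read a id z
                        break
                fi
        done < <(tac 1.xml)
        echo $key id=\"$id\"
        unset id
done < keywords.txt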

This User Gave Thanks to nezabudka For This Post:
# 6  
Code:
awk -F ">|<" '
NR == FNR       {tmp=$2; getline; T[$3] = tmp; next
                }
                {print  $0, ($0 in T)?T[$0]:"id=\"\""
                }
' 1.xml keywords.txt

--- Post updated at 20:34 ---

Code:
awk -F '[<>"]' '
NR == FNR       {tmp=$3; getline; T[$3] = tmp; next
                }
                {print  $0, "id=\"" T[$0] "\""
                }
' 1.xml keywords.txt
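
To see why `$3` is the field of interest on both kinds of line once the double quote is added to the separator set, here is a small check. It is only a sketch: the two sample lines are copied from 1.xml in post #1, and the names `id_field` and `name_field` are illustrative.

```shell
id_line='  <id="2223230did3s2Qafevrgvve1cbe11af181">'
name_line='    <display-name>test26 Rsiw</display-name>'

# Splitting on <, > or ": the id line becomes
#   "  " | id= | 2223230did... | (empty) | (empty)
id_field=$(printf '%s\n' "$id_line" | awk -F '[<>"]' '{print $3}')

# and the display-name line becomes
#   "    " | display-name | test26 Rsiw | /display-name | (empty)
name_field=$(printf '%s\n' "$name_line" | awk -F '[<>"]' '{print $3}')

printf '%s -> %s\n' "$name_field" "$id_field"
```

The script above relies on this: it saves `$3` of the id line, calls `getline` to load the following display-name line, and uses that line's `$3` as the array key.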

This User Gave Thanks to nezabudka For This Post:
# 7  
Thank you all,
I will test your code as well and keep you posted.

Kind regards
Boris