Grep string causes extra spaces


 
# 1  
Old 05-16-2019

Hello,
I have an XML file. My aim is to read each line of a keywords file and search for that string in the XML file.
When a keyword is found in the XML file, I expect the script to go to the previous line of the XML file and grab the value between two strings. It almost works, with one error.

Tab-separated keywords.txt:
Code:
test1 qqq98
test35 sss32
test26 Rsiw

1.xml file
Code:
  <id="229954e70d6b702f8d570b4be11af181">
    <display-name>test44 lgi3d</display-name>
  <id="229954e70d6b702f8d51331cbe11af181">
    <display-name>test35 kkld</display-name>
  <id="2223230did3s2Qafevrgvve1cbe11af181">
    <display-name>test26 Rsiw</display-name>

expected output:
Code:
test1 qqq98 id=""
test35 sss32 id=""
test26 Rsiw id="2223230did3s2Qafevrgvve1cbe11af181"

Code:
while read COL1 COL2 && read -r line <&3; do
A=$(grep -B1 "$COL1.*$COL2" 1.xml | grep -v "display-name" | sed -e 's/<id=\"\(.*\)\">/\1/' )
#A=$(grep -B1 "$COL".*$COL2" 1.xml | grep -v "display-name" | grep -o -P '(?<=<id=\").*(?=\">)')
echo "$COL1 $COL2 id=\"$A\""
done < keywords.txt 3<1.xml

This gives:
Code:
test1 qqq98 id=""
test35 sss32 id=""
test26 Rsiw id="  2223230did3s2Qafevrgvve1cbe11af181"

I wonder why there are two spaces before the value of $A in the console output.

Thank you
Boris

Last edited by baris35; 05-16-2019 at 10:01 AM..
# 2  
Old 05-16-2019
It doesn't print "extra" spaces; it prints the two leading spaces of the "id" line, which your sed command does not remove. Try piping through
Code:
sed -e 's/^ *<id=\"\(.*\)\">/\1/'

i.e. include the spaces from the start of the line in the pattern.
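
A quick way to see the difference, as a minimal sketch: the sample line is copied from the posted 1.xml, and the variable names `line`, `without_fix`, and `with_fix` are only illustrative.

```shell
# One "id" line from the sample 1.xml, including its two leading spaces
line='  <id="2223230did3s2Qafevrgvve1cbe11af181">'

# Original pattern: only the <id="..."> part is replaced, so the indent survives
without_fix=$(printf '%s\n' "$line" | sed -e 's/<id="\(.*\)">/\1/')

# Anchored pattern: "^ *" also consumes the leading spaces
with_fix=$(printf '%s\n' "$line" | sed -e 's/^ *<id="\(.*\)">/\1/')

printf 'id="%s"\n' "$without_fix"
printf 'id="%s"\n' "$with_fix"
```

The first printf should show the two spaces inside the quotes, the second should not.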
# 3  
Old 05-16-2019
How about this (be aware there's NO test1 in your data samples):
Code:
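awk -F"[<>]" '
# First file (keywords.txt): remember every keyword line
NR == FNR       {T[$0]
                 next
                }
# "id" line of 1.xml: keep the whole id="..." field for the next line
/<id/           {TMP = $2
                 next
                }
# display-name line: print the name, plus the saved id if the name is a keyword
                {print $3, ($3 in T)?TMP:"id=\"\""}
' keywords.txt 1.xml 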
test44 lgi3d id=""
test35 kkld id=""
test26 Rsiw id="2223230did3s2Qafevrgvve1cbe11af181"

Aside: why do you read line <&3 and then don't use it?
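
For illustration, the loop from the first post seems to behave the same without the extra descriptor. The sketch below assumes the sed fix from post #2 and a grep that supports -B1 (as the original already does). The scratch directory created with mktemp and the `result` variable are additions made only to keep the example self-contained.

```shell
# Recreate the sample files in a scratch directory (contents copied from the first post)
tmpdir=$(mktemp -d)
printf 'test1\tqqq98\ntest35\tsss32\ntest26\tRsiw\n' > "$tmpdir/keywords.txt"
cat > "$tmpdir/1.xml" <<'EOF'
  <id="229954e70d6b702f8d570b4be11af181">
    <display-name>test44 lgi3d</display-name>
  <id="229954e70d6b702f8d51331cbe11af181">
    <display-name>test35 kkld</display-name>
  <id="2223230did3s2Qafevrgvve1cbe11af181">
    <display-name>test26 Rsiw</display-name>
EOF

# Same loop as in the first post, minus "read -r line <&3" and "3<1.xml"
result=$(
  while read -r COL1 COL2; do
    A=$(grep -B1 "$COL1.*$COL2" "$tmpdir/1.xml" | grep -v "display-name" |
        sed -e 's/^ *<id="\(.*\)">/\1/')
    echo "$COL1 $COL2 id=\"$A\""
  done < "$tmpdir/keywords.txt"
)
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the extra read on descriptor 3, the loop would also stop as soon as 1.xml runs out of lines, so a keywords file longer than the XML file would be cut short.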
# 4  
Old 05-16-2019
Thank You Rudic,
This one also works as expected.

Kind regards
Boris
# 5  
Old 05-16-2019
Code:
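# For each keyword line, scan 1.xml bottom-up so that the line read right
# after a match is the "id" line that precedes it in the file
while read key; do
        while read line; do
                # $key is used as a regex against the current line
                if [[ $line =~ $key ]]; then
                        # Split the next line (the "id" line) on double quotes;
                        # the second field is the id value
                        IFS=\" read a id z
                        break
                fi
        done < <(tac 1.xml)
        echo $key id=\"$id\"
        # Reset so that a keyword without a match prints an empty id
        unset id
done < keywords.txt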

# 6  
Old 05-16-2019
Code:
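awk -F ">|<" '
# First file (1.xml): save the id="..." field, read the display-name line
# with getline, and index the saved id by the display name
NR == FNR       {tmp=$2; getline; T[$3] = tmp; next
                }
# Second file (keywords.txt): print the keyword line and its id, if any
                {print  $0, ($0 in T)?T[$0]:"id=\"\""
                }
' 1.xml keywords.txt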

--- Post updated at 20:34 ---

Code:
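awk -F '[<>"]' '
# With the double quote as an extra separator, $3 of the "id" line is the bare id value
NR == FNR       {tmp=$3; getline; T[$3] = tmp; next
                }
# A keyword that is not in T yields an empty string, so id="" is printed
                {print  $0, "id=\"" T[$0] "\""
                }
' 1.xml keywords.txt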

# 7  
Old 05-16-2019
Thank you all,
I will also test your code and keep you posted.

Kind regards
Boris