Using SED/AWK to extract xml at end of file

10-27-2010

Registered User

Join Date: Apr 2009

Last Activity: 11 February 2020, 10:24 AM EST

Posts: 2,100

Thanks Given: 26

Thanked 402 Times in 360 Posts

Can you check whether there are any non-printable characters in the XML portion? Especially around the line break that has affected the XML tag.

tyler_durden

durden_tyler

10-27-2010

Registered User

Join Date: Oct 2010

Last Activity: 30 March 2011, 4:31 AM EDT

Posts: 8

Thanks Given: 2

Thanked 0 Times in 0 Posts

Quote:

Originally Posted by durden_tyler

Can you check whether there are any non-printable characters in the XML portion? Especially around the line break that has affected the XML tag.

tyler_durden

I don't think there are any... just whitespace that separates the tags in some instances, not all.

Cheers

hugh86

10-27-2010

What's the output of this command?

Code:

sed -n '/Sending XML/,/Message sending ended/p' your_file | od -bc
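
For readers unfamiliar with this pipeline, here is a minimal sketch of what it does, run against a made-up log (the file name sample.log and its contents are assumptions for illustration, not the poster's data). The sed range prints only the lines from the first marker through the second, and od -c then shows every byte, so a stray carriage return becomes visible as \r.

```shell
# Hypothetical sample: a log whose XML line ends in CR LF.
printf 'header line\nSending XML\n<a>1</a>\r\nMessage sending ended\ntrailer\n' > sample.log

# Print only the range between the two marker lines, then dump each byte.
# od -c shows printable characters as-is and escapes such as \r and \n.
dump=$(sed -n '/Sending XML/,/Message sending ended/p' sample.log | od -c)
printf '%s\n' "$dump"
```

The thread's command uses od -bc, which adds a row of octal byte values above each row of characters; the idea is the same.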

tyler_durden

durden_tyler

10-27-2010

Hello, I added your code to my script. I'm not sure which file you were referring to, so I have attached what I used.

Code:

#!/bin/bash
echo "getXML"

echo -n "Enter the source file name WITH extension : "
read infile 
echo "Processing... : " 
sleep 1 
echo -n "Enter output file name (extenstion not applicable) : "
read outfile
sed -n '/Sending XML/,/Message sending ended/p' $outfile | od -bc
echo "Processing XML... : "
sleep 1
echo "Success..Data should be in '$outfile' if compiled correctly"

The outcome...
When I open the outfile that was created, I get "Unexpected error: Incomplete multibyte sequence in input".

On the terminal I got loads of different numbers flying across the screen. I'm not sure if they are even related to the infile I have. A sample is attached below.

Code:

e   l   d   I   D   >   <   f   i   e   l   d   N   a   m
0031640 145 076 144 141 164 145 117 015 012 040 146 102 151 162 164 150
          e   >   d   a   t   e   O  \r  \n       f   B   i   r   t   h
0031660 074 057 146 151 145 154 144 116 141 155 145 076 074 146 151 145
          <   /   f   i   e   l   d   N   a   m   e   >   <   f   i   e
0031700 154 144 126 141 154 165 145 057 076 074 057 157 142 152 145 143
          l   d   V   a   l   u   e   /   >   <   /   o   b   j   e   c
0031720 164 106 151 145 154 144 076 074 157 142 152 145 143 164 106 151
          t   F   i   e   l   d   >   <   o   b   j   e   c   t   F   i
0031740 145 154 144 076 040 074 146 151 145 154 144 111 104 076 061 065
          e   l   d   >       <   f   i   e   l   d   I   D   >   1   5
0031760 061 067 074 057 146 151 145 015 012 040 154 144 111 104 076 074
          1   7   <   /   f   i   e  \r  \n       l   d   I   D   >   <
0032000 146 151 145 154 144 116 141 155 145 076 154 151 146 145 164 151
          f   i   e   l   d   N   a   m   e   >   l   i   f   e   t   i
0032020 155 145 123 154 141 101 155 157 165 156 164 074 057 146 151 145
          m   e   S   l   a   A   m   o   u   n   t   <   /   f   i   e
0032040 154 144 116 141 155 145 076 074 146 151 145 154 144 126 141 154
          l   d   N   a   m   e   >   <   f   i   e   l   d   V   a   l
0032060 165 145 076 061 070 060 060 060 060 060 074 057 146 151 145 154

Thanks,

H

Last edited by hugh86; 10-27-2010 at 12:13 PM.. Reason: code tags

hugh86

10-27-2010

Quote:

Originally Posted by hugh86

Hello, I added your code to my script. I'm not sure which file you were referring to, so I have attached what I used.

What I wanted was for you to run my command at your command prompt (the Linux dollar prompt).

The file I was referring to was the source file, that is, the one that is being read in your Bash script.

Quote:

Code:

#!/bin/bash
echo "getXML"

echo -n "Enter the source file name WITH extension : "
read infile 
...

Since you are going to test your Bash script, I am sure you know the name of the source file that you'll enter at the prompt above. That file name will be assigned to the variable "infile" in your script.
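
The script as posted hands $outfile to sed, so the source file is never read. A minimal sketch of the variant that reads the source file and writes the matched range to the output file might look like this (extract_xml is a hypothetical helper name, not part of the original script):

```shell
#!/bin/bash
# Sketch only: extract_xml is a made-up helper for illustration.
# $1 is the source file to read, $2 is the file to write.
extract_xml() {
    sed -n '/Sending XML/,/Message sending ended/p' "$1" > "$2"
}

# Usage, with the names the script reads at its two prompts:
#   extract_xml "$infile" "$outfile"
```

Quoting the variables keeps file names that contain spaces from being split into several arguments.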

Now, let's say the source file name you have in mind is "abc.txt".

This file has some XML stuff embedded in it. My hunch is that there are Unicode characters in that XML stuff.

Try this on your Linux dollar-prompt -

Code:

perl -lne 'binmode(STDOUT, ":utf8"); while(/(.)/g){print $.,"\t",$1,"\t",ord($1) if ord($1) > 255}' abc.txt

Replace the string "abc.txt" by the actual name of your source file name.

tyler_durden

durden_tyler

10-27-2010

I tried that and replaced the file name with my source file (in my case trace.txt), but I'm not sure where the output file is. I checked trace.txt and it was unchanged. Do I not need to specify where the output goes?

Sorry if I'm being slow; I only started learning three weeks ago.

hugh86

10-27-2010

Quote:

Originally Posted by hugh86

I tried that and replaced the file name with my source file (in my case trace.txt), but I'm not sure where the output file is. I checked trace.txt and it was unchanged. Do I not need to specify where the output goes?
...

No, you do not need to specify the output file name. The output will be displayed right after your command.

(A) If you have Ubuntu Gnome, then open up "Gnome Terminal" or "Terminal".

(B) If you have Ubuntu KDE (Kubuntu?), then open up "Konsole".

You'll see a dollar prompt in the terminal window.

Type in the following command at the prompt, in a single line.

Code:

$ perl -lne 'binmode(STDOUT, ":utf8"); while(/(.)/g){print $.,"\t",$1,"\t",ord($1) if ord($1) > 255}' trace.txt

Don't type that $ symbol. That's just for you to know that the stuff from "perl -lne .... " has to be typed at the $ prompt.

You could, alternatively, copy+paste the perl command from this webpage.

When you press the Enter or Return key after "trace.txt" the output will be displayed right there on the terminal window - right below your command.

Copy your command and the output from the terminal window and post them over here.

(Put that Bash script aside for the time being. You'd want to investigate the contents of the trace.txt source file first.)
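
To illustrate the point about output, here is a small sketch using grep on a made-up file (demo.txt and result.txt are hypothetical names). A command writes its result to the terminal unless you redirect it, and in neither case is the source file changed.

```shell
# Hypothetical file standing in for trace.txt.
printf 'one\ntwo\n' > demo.txt

# Without redirection the result is written to the terminal (standard output).
grep -n 'two' demo.txt

# With > the same text goes to a file of your choosing instead.
grep -n 'two' demo.txt > result.txt

# The source file itself is not modified by either form.
cat demo.txt
```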

tyler_durden

durden_tyler
