Full Discussion: extract strings between tags
Post 302340987 by drl on Tuesday 4th of August 2009 11:23:58 PM
Hi.

I don't use XMLish files, but I ran across this utility. If you have access to xml_grep, this task can be straightforward. I modified your data file to put it into proper, well-formed XML and to differentiate between data1 and data2, then ran this script:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate extracting data from an XML file with xml_grep.
# Reference for XPath: http://en.wikipedia.org/wiki/XPath_1.0
# xml_grep: http://xmltwig.com/tool/

echo
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) xml_grep
set -o nounset
echo

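# Data file: first argument, defaulting to "data1".
FILE=${1-data1}

echo " Data file $FILE:"
cat "$FILE"

echo
echo " Results:"
# Print only the text of each String element under the key named "data1".
xml_grep --text_only --cond '*[@name="data1"]/String' "$FILE"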

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0 
GNU bash 3.2.39
/usr/bin/xml_grep version 0.7

 Data file data1:
<project>
<key name="data1">
<String>abcdef</String>
<String>abcdef1</String>
<String>abcdef2</String>
</key>

<key name="data2">
<String>abcdefg</String>
<String>abcdefg1</String>
<String>abcdefg2</String>
<String>abcdefg3</String>
</key>
</project>

 Results:
abcdef
abcdef1
abcdef2
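
The condition in the script hard-codes data1. Here is a minimal sketch of a wrapper that takes the key name as an argument instead, assuming the same file layout. The function names key_cond and strings_for_key are hypothetical, and only the --text_only and --cond options already used in the script appear here.

```shell
# key_cond NAME: print the XPath condition from s1 for an arbitrary key name.
# Hypothetical helper; assumes NAME contains no double quotes.
key_cond() {
  printf '*[@name="%s"]/String\n' "$1"
}

# strings_for_key NAME FILE: run xml_grep as in s1, if it is installed.
strings_for_key() {
  command -v xml_grep >/dev/null 2>&1 || {
    echo "xml_grep not found" >&2
    return 127
  }
  xml_grep --text_only --cond "$(key_cond "$1")" "$2"
}
```

Calling strings_for_key data2 data1 would then select the String elements under the data2 key of the file above.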

The xml_grep Perl script was available from the Debian repository for me. The site URL is listed in the script above. Good luck ... cheers, drl
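
If xml_grep cannot be installed, a rough awk fallback may be enough for this exact layout, where each tag sits on its own line. This is a sketch only: it matches text patterns rather than parsing XML, and the function name strings_in_key is hypothetical.

```shell
# strings_in_key NAME FILE: print the text of each <String> element found
# between <key name="NAME"> and the next </key>.
# Assumes one tag per line, as in the data file above; not a real XML parser.
strings_in_key() {
  awk -v key="$1" '
    # Opening tag of the wanted key: start collecting.
    index($0, "<key name=\"" key "\">") { inside = 1; next }
    # Any closing key tag ends the current block.
    /<\/key>/ { inside = 0 }
    inside && /<String>/ {
      sub(/^.*<String>/, "")      # drop everything up to the opening tag
      sub(/<\/String>.*$/, "")    # drop the closing tag and anything after it
      print
    }
  ' "$2"
}
```

With the data file shown earlier, strings_in_key data1 data1 prints abcdef, abcdef1 and abcdef2, one per line, matching the xml_grep results. It will break on input with several elements per line, extra attributes on the key tag, or nested keys, which is where a real XML tool earns its keep.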
 
