How to extract data from BNC xml with reference brackets?


 
# 15
12-16-2008
Code:
# Plain < and > are used here: in GNU sed, \< and \> are word-boundary anchors, not literal brackets.
$ grep '^<w ' file | sed 's/<w c5="\(...\)[^>]*>\([^<]*\)<\/w>/\1 \2/g'
NN1 FACTSHEET
DTQ WHAT
VBZ IS
NN1 AIDS
NN1 AIDS
VVN Acquired
AJ0 Immune
NN1 Deficiency
NN1 Syndrome
VBZ is
AT0 a
NN1 condition
VVN caused
PRP by
AT0 a
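
For anyone without the corpus at hand, the same grep and sed idea can be tried on a few made-up lines. This is only a sketch: the sample lines are an assumption modelled on BNC XML word elements (they are not data from the thread), and extract_tags is a hypothetical helper name. The sed expression uses plain < and > so that it does not depend on how a given sed treats \< and \>.

```shell
#!/bin/sh
# Hypothetical sample modelled on BNC XML <w> elements; not data from the thread.
sample='<s n="1">
<w c5="NN1" hw="factsheet" pos="SUBST">FACTSHEET </w>
<w c5="DTQ" hw="what" pos="PRON">WHAT </w>
<w c5="VBZ" hw="be" pos="VERB">IS </w>
</s>'

# Read XML on stdin and print "TAG WORD" for every line starting with <w.
#   \1 = first three characters of the c5 attribute
#   \2 = element text up to the closing </w>
# The second expression trims the trailing blank inside each sample <w> element.
extract_tags() {
    grep '^<w ' |
        sed -e 's/<w c5="\(...\)[^>]*>\([^<]*\)<\/w>/\1 \2/g' -e 's/ *$//'
}

result=$(printf '%s\n' "$sample" | extract_tags)
printf '%s\n' "$result"
```

On this sample it prints NN1 FACTSHEET, DTQ WHAT and VBZ IS, one pair per line. Only the first three characters of c5 are kept, as in the one-liner above, so a longer tag value would be truncated.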

SHELLEXP(3)                Library Functions Manual                SHELLEXP(3)

NAME
       shellexp - match string against a cruft filter pattern

SYNOPSIS
       extern int shellexp(const char *string, const char *pattern);

DESCRIPTION
       The shellexp() function is similar to fnmatch(3), but works with cruft patterns instead of standard glob(7) patterns.

       The function returns a true value if string matches the cruft pattern pattern, and a false value (0) otherwise. Returns -1 in case of pattern syntax error.

       Cruft patterns are similar to glob(7) patterns, but are not fully compatible. The following special characters are supported:

       ? (a question mark)
              matches exactly one character of string other than a slash.

       *      matches zero or more characters of string other than a slash.

       /** or /**/
              matches zero or more path components in string. Please note that you can only use ** when directly following a slash, and furthermore, only when either directly preceding a slash or at the very end of pattern. A ** followed by anything other than a slash makes pattern invalid. A ** following anything else than a slash reduces it to having the same effect as *.

       [character-class]
              Matches any character between the brackets exactly once. Named character classes are NOT supported. If the first character of the class is ! or ^, then the meaning is inverted (matches any character NOT listed between the brackets). If you want to specify a literal closing bracket in the class, then specify it as the first (or second, if you want to negate) character after the opening bracket. Also, simple ASCII-order ranges are supported using a dash character (see examples section).

       Any other character matches itself.

EXAMPLES
       /a/b*/*c matches /a/b/xyz.c, as well as /a/bcd/.c, but not /a/b/c/d.c.

       /a/**/*.c matches all of the following: /a/a.c, /a/b/a.c, /a/b/c/a.c and /a/b/c/d/a.c.

       /a/[0-9][^0-9]* matches /a/1abc, but not /a/12bc.

BUGS
       Uses constant-length 1000 byte buffers to hold filenames. Also uses recursive function calls, which are not very efficient. Does not validate the pattern before matching, so any pattern errors (unbalanced brackets or misplaced **) are only reported when and if the matching algorithm reaches them.

SEE ALSO
       fnmatch(3), glob(3), cruft(8) and dash-search(1).

AUTHOR
       This manual page was written by Marcin Owsiany <porridge@debian.org>, for the Debian GNU/Linux system (but may be used by others).

                               October 17, 2007                    SHELLEXP(3)
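
The pattern rules described above can be approximated in the shell by translating a cruft pattern into a POSIX extended regular expression and matching with grep. This is a sketch of one reading of the man page, not cruft's implementation. The names cruft_to_ere and cruft_match are hypothetical. The treatment of a trailing /** as "zero or more further components" is an assumption, and named character classes are not rejected here. Exit statuses follow shell convention (0 match, 1 no match, 2 invalid pattern) instead of the true/0/-1 values of shellexp().

```shell
#!/bin/sh
# Sketch only: approximates the documented cruft pattern rules with a POSIX ERE.
# cruft_to_ere and cruft_match are hypothetical helper names, not part of cruft.

# Print an ERE equivalent to the cruft pattern in $1; exit status 2 if invalid.
cruft_to_ere() {
    printf '%s\n' "$1" | awk '{
        n = length($0); i = 1; out = ""
        while (i <= n) {
            c = substr($0, i, 1)
            if (c == "/" && substr($0, i + 1, 2) == "**") {
                if (substr($0, i + 3, 1) == "/") {        # "/**/": any number of components
                    out = out "/([^/]+/)*"; i += 4
                } else if (i + 2 == n) {                  # trailing "/**" (assumed meaning)
                    out = out "(/[^/]+)*"; i += 3
                } else exit 2                             # "/**x" makes the pattern invalid
            } else if (c == "*") {
                while (substr($0, i, 1) == "*") i++       # "**" not after a slash acts like "*"
                out = out "[^/]*"
            } else if (c == "?") {
                out = out "[^/]"; i++
            } else if (c == "[") {
                j = i + 1; neg = ""
                nc = substr($0, j, 1)
                if (nc == "!" || nc == "^") { neg = "^"; j++ }
                start = j
                if (substr($0, j, 1) == "]") j++          # a leading ] is a literal
                while (j <= n && substr($0, j, 1) != "]") j++
                if (j > n) exit 2                         # unbalanced bracket
                out = out "[" neg substr($0, start, j - start) "]"
                i = j + 1
            } else {
                if (index(".+(){|^$\\", c) > 0) out = out "\\"   # escape ERE specials
                out = out c; i++
            }
        }
        print out
    }'
}

# cruft_match STRING PATTERN: status 0 on match, 1 on no match, 2 on invalid pattern.
cruft_match() {
    ere=$(cruft_to_ere "$2") || return 2
    printf '%s\n' "$1" | grep -Eqx -e "$ere"
}

# Try the cases from the EXAMPLES section.
while read -r pattern path; do
    if cruft_match "$path" "$pattern"; then verdict="match"; else verdict="no match"; fi
    printf '%-18s %-14s %s\n' "$pattern" "$path" "$verdict"
done <<'EOF'
/a/b*/*c /a/b/xyz.c
/a/b*/*c /a/b/c/d.c
/a/**/*.c /a/b/c/d/a.c
/a/[0-9][^0-9]* /a/12bc
EOF
```

Because * and ? are translated to [^/]* and [^/], neither can cross a slash, which is the main difference from passing the pattern to a shell case statement, where * also matches slashes.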