Post 302369209 by fubaya on Friday 6th of November 2009 10:51:10 PM
Extracting text between two strings, first instance only

There are a lot of ways to extract text from between two strings, but what if those strings occur multiple times and you only want the text between the first pair? I can't find anything that works for this case. I'm using sed to process the text after it's extracted, so I'd prefer a sed answer, but whatever works is fine with me.

It's an XML file, and the text is between <string> tags (hope the tag name doesn't cause any confusion). The text may be 1 or 100 lines long and may contain whitespace, line breaks, indentation, etc. That shouldn't matter much, but the tags can sit almost anywhere relative to the actual text, so it's not a clean "^tagTEXTtag$". I want everything, including whitespace and blank lines, between the first open tag and its close tag.

Code:
         <string>This is
         the text 
         that I want
          
          </string>

          <string>text I don't want</string>

         <string>more text
I don't want</string>
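
For what it's worth, here is a rough sketch of one way this might be done with sed alone. It is not a general XML parser: it assumes the line holding the first <string> contains no other <string> tag, and that no </string> appears before the first <string>. The sample file and the function name extract_first are made up for illustration. The idea is to restrict work to the first open/close range, strip the tags themselves, and quit as soon as the first close tag has been handled so later blocks are never read.

```shell
#!/bin/sh
# Hypothetical sample file reproducing the layout from the post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
         <string>This is
         the text 
         that I want
          
          </string>

          <string>text I don't want</string>

         <string>more text
I don't want</string>
EOF

# extract_first FILE (hypothetical helper name)
# Inside the first <string>...</string> range:
#   - drop everything up to and including the opening tag,
#   - on the line with the closing tag, drop the tag and anything
#     after it, print what is left, then quit,
#   - otherwise print the line unchanged.
extract_first() {
    sed -n '
/<string>/,/<\/string>/{
    s/.*<string>//
    /<\/string>/{
        s/<\/string>.*//
        p
        q
    }
    p
}' "$1"
}

first=$(extract_first "$sample")
printf '%s\n' "$first"
rm -f "$sample"
```

Because of the q, sed stops reading at the first closing tag, so the second and third blocks never reach the output. Whitespace and blank lines inside the block are kept as they are; only the tags and any text outside them on the same lines are dropped. The result can be piped straight into further sed processing.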

 
