Extracting text between two strings, first instance only
There are a lot of ways to extract text from between two strings, but what if those strings occur multiple times and you only want the text between the first pair? I can't find anything that works here. I'm using sed to process the text after it's extracted, so I'd prefer a sed answer, but whatever works is fine with me.
It's an XML file, and the text is between string tags (I hope that doesn't cause any confusion). The text may be 1 or 100 lines long and may contain whitespace, line breaks, indentation, etc., which shouldn't matter much, but the tags can sit almost anywhere relative to the actual text rather than in a clean "^tagTEXTtag$" layout. I want everything, including whitespace and blank lines, between the first open and close tags.
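One possible approach, offered as a sketch rather than a definitive answer: awk is used here instead of sed because it copes more easily with tags that sit mid-line or share a line. The function name first_between is made up for illustration, and the tags are assumed to be the literal strings <string> and </string>; both are passed as plain text, not as regular expressions.

```shell
#!/bin/sh
# first_between START STOP [FILE...]
# Print the text between the first START and the first STOP that follows it.
# START and STOP are matched literally. (Hypothetical helper, not from the thread.)
# Caveat: awk -v interprets backslash escapes, so avoid backslashes in the tags.
first_between() {
    start=$1
    stop=$2
    shift 2
    awk -v start="$start" -v stop="$stop" '
        !inside {
            i = index($0, start)
            if (i == 0) next                      # still before the first opening tag
            $0 = substr($0, i + length(start))    # drop everything up to and including it
            inside = 1
        }
        {
            j = index($0, stop)
            if (j > 0) {                          # first closing tag: emit what precedes it, then quit
                printf "%s", substr($0, 1, j - 1)
                exit
            }
            print                                 # a whole line inside the block
        }
    ' "$@"
}

# Example (file name is a placeholder):
#   first_between "<string>" "</string>" input.xml | sed "s/foo/bar/"
```

Because awk exits at the first closing tag, later pairs are never read, and the result can be piped straight into sed for further processing. Whitespace and blank lines between the tags are passed through unchanged.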
I want to collect characters 1-10 and 20-30 from each line of the file and write them to a file in the following format. Can someone help me with this?
string1,string2
string1,string2
string1,string2
:
:
:
: (7 Replies)
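A minimal sketch of one way to do this with awk, assuming the positions are 1-based and inclusive and that lines shorter than 30 characters should simply yield whatever is there. The function name is hypothetical.

```shell
#!/bin/sh
# cols_1_10_and_20_30 [FILE...]
# Print characters 1-10 and 20-30 of each line, joined by a comma.
# (Hypothetical helper name, for illustration only.)
cols_1_10_and_20_30() {
    # substr(s, 20, 11) covers positions 20 through 30 inclusive
    awk '{ print substr($0, 1, 10) "," substr($0, 20, 11) }' "$@"
}

# Example (file names are placeholders):
#   cols_1_10_and_20_30 input.txt > output.txt
```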
I have text files that contain a series of lines that look like this:
string0 .................................................... column3a column4a
string1**384y0439 ..................................... column3b column4b... (2 Replies)
Hi all,
I have a directory containing many subdirectories each named like KOG#### where # represents any digit 0-9. There are several files in each KOG#### folder but the one I care about is named like KOG####_final.fasta. I am trying to write a script to copy all of the KOG####_final.fasta... (3 Replies)
Hi,
I've looked at a few existing posts on this, but they don't seem to work for my inputs.
I have a text file where I want to extract all the text between two strings, every time that occurs.
Eg my input file is
Anna said that she would fetch the bucket.
Anna and Ben moved the bucket.... (9 Replies)
Hi All,
I have a file whose common pattern is like this:
.I 1
.U
87049087
.S
Some text here too
.M
This is a text
.T
Some another text here
.P
Name of the book
.W
Some lines of more text. This text needs to be extracted.
.A
more text goes here too
.I 2 (2 Replies)
Hi experts,
I've got a text file that contains the following text, which will occur in this format at least once:
+=========================>>
Some stuff that evreryone should knnow
other stufsjdokajkajokajda
aijhjajcdjajcisajcqsqdqwdqad
<<=========================+
It is likely that... (8 Replies)
Hi, the title isn't very descriptive but it'll be easier to explain what I need if I write out the coordinates from which I need to extract certain information:
ATOM 2521 C MAM X 61 44.622 49.357 12.584 1.00 0.00 C
ATOM 2522 H MAM X 61 43.644 49.102 12.205 ... (10 Replies)
Here is my task. I feel sure this can be accomplished with sed/awk but can't seem to figure out how.
I have a large flat file from which I need to extract every occurrence of a pair of characters (GG in this case) plus the previous 20 characters. The output should be a list (which I plan to make... (17 Replies)
Hi Team -
I hope everyone has been well!
I export a file from one of our source systems that gives me more information than I need. Given the way the file is laid out, I need to extract certain strings at different positions in the file and echo them to another file.
I can do this in batch easily,... (2 Replies)
Discussion started by: SIMMS7400
text::trim
Text::Trim(3pm) User Contributed Perl Documentation Text::Trim(3pm)
NAME
Text::Trim - remove leading and/or trailing whitespace from strings
VERSION
version 1.02
SYNOPSIS
use Text::Trim;
$text = " important data\n";
$data = trim $text;
# now $data contains "important data" and $text is unchanged
# or:
trim $text; # work in-place, $text now contains "important data"
@lines = <STDIN>;
rtrim @lines; # remove trailing whitespace from all lines
# Alternatively:
@lines = rtrim <STDIN>;
# Or even:
while (<STDIN>) {
trim; # Change $_ in place
# ...
}
DESCRIPTION
This module provides functions for removing leading and/or trailing whitespace from strings. It is basically a wrapper around some simple
regexes with a flexible context-based interface.
EXPORTS
All functions are exported by default.
CONTEXT HANDLING
void context
Functions called in void context change their arguments in-place.
trim(@strings); # All strings in @strings are trimmed in-place
ltrim($text); # remove leading whitespace on $text
rtrim; # remove trailing whitespace on $_
No changes are made to arguments in non-void contexts.
list context
Values passed in are changed and returned without affecting the originals.
@result = trim(@strings); # @strings is unchanged
@result = rtrim; # @result contains rtrimmed $_
($result) = ltrim(@strings); # like $result = ltrim($strings[0]);
scalar context
As list context but multiple arguments are stringified before being returned. Single arguments are unaffected. This means that under
these circumstances, the value of $" ($LIST_SEPARATOR) is used to join the values. If you don't want this, make sure you only use single
arguments when calling in scalar context.
@strings = (" hello\n", " there\n");
$trimmed = trim(@strings);
# $trimmed = "hello there"
local $" = ', ';
$trimmed = trim(@strings);
# Now $trimmed = "hello, there"
$trimmed = rtrim;
# $trimmed = $_ minus trailing whitespace
Undefined values
If any of the functions are called with undefined values, the behaviour is in general to pass them through unchanged. When stringifying a
list (calling in scalar context with multiple arguments) undefined elements are excluded, but if all elements are undefined then the return
value is also undefined.
$foo = trim(undef); # $foo is undefined
$foo = trim(undef, undef); # $foo is undefined
@foo = trim(undef, undef); # @foo contains 2 undefined values
trim(@foo); # @foo still contains 2 undefined values
$foo = trim('', undef); # $foo is ''
FUNCTIONS
trim
Removes leading and trailing whitespace from all arguments, or $_ if none are provided.
rtrim
Like trim() but removes only trailing (right) whitespace.
ltrim
Like trim() but removes only leading (left) whitespace.
UNICODE
Because this module is implemented using perl regular expressions, it is capable of recognising and removing unicode whitespace characters
(such as non-breaking spaces) from scalars with the utf8 flag on. See Encode for details about the utf8 flag.
Note that this only applies in the case of perl versions after 5.8.0 or so.
SEE ALSO
Brent B. Powers' String::Strip performs a similar function in XS.
AUTHOR
Matt Lawrence <mattlaw@cpan.org>
ACKNOWLEDGEMENTS
Terrence Brannon <metaperl@gmail.com> for bringing my attention to String::Strip and suggesting documentation changes.
perl v5.10.1 2010-06-07 Text::Trim(3pm)