Outputting characters after a given string and reporting the characters in the row below --sed
Post 303028720 by Xterra on Sunday 13th of January 2019 10:29:45 PM
Don,
Thanks!
PS. If I wanted to search for more than one string (GCATGAAAACATACA and TTTCCAGAAATTGT) and report a different number of characters for each (3 and 6), I should be able to do it by passing the strings and the numbers of characters as variables, right?

Code:
awk -v len="3 6" -v str="GCATGAAAACATACA TTTCCAGAAATTGT" '
BEGIN {	# STR[i] is the i-th search string, LEN[i] the number of
	# characters to report after it
	n = split(str, STR)
	split(len, LEN)
}
/^@/ {	matchline = NR + 1
	qualityline = NR + 3
	next
}
NR == matchline {
	# Only the first search string found in the sequence is reported
	spot = 0
	for (i = 1; i <= n; i++)
		if (spot = index($0, STR[i])) {
			start = spot + length(STR[i])
			cnt = LEN[i]
			break
		}
	if (spot)
		printf("Codon:\t%s\tQuality Score:\t", substr($0, start, cnt))
	else	qualityline = 0
	next
}
NR == qualityline {
	printf("%s\n", substr($0, start, cnt))
}' test.txt
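
For anyone who wants to try the idea out, here is a small self-contained sketch of passing the strings and the character counts in as variables. It is only an illustration under a few assumptions: the two-record FASTQ sample is made up, the shell variable names (strings, lengths, result) are hypothetical, and when a read contains more than one of the search strings only the first one in the list is reported.

```shell
#!/bin/sh
# Hypothetical two-record FASTQ sample; reads and quality strings are invented.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
@read1
AAGCATGAAAACATACAGGTCCC
+
ABCDEFGHIJKLMNOPQRSTUVW
@read2
CCTTTCCAGAAATTGTACGTACGG
+
abcdefghijklmnopqrstuvwx
EOF

# Search strings and the number of characters to report after each one,
# kept in shell variables and handed to awk with -v.
strings="GCATGAAAACATACA TTTCCAGAAATTGT"
lengths="3 6"

result=$(awk -v str="$strings" -v len="$lengths" '
BEGIN {	n = split(str, STR)		# STR[i]: i-th search string
	split(len, LEN)			# LEN[i]: characters to report after STR[i]
}
/^@/ {	seqline = NR + 1		# sequence is the line after the header
	qualline = NR + 3		# quality is three lines after the header
	next
}
NR == seqline {
	spot = 0
	for (i = 1; i <= n; i++)
		if (spot = index($0, STR[i])) {
			start = spot + length(STR[i])
			cnt = LEN[i]
			break
		}
	if (spot)
		printf("Codon:\t%s\tQuality Score:\t", substr($0, start, cnt))
	else	qualline = 0		# no match: skip this quality line
	next
}
NR == qualline {
	printf("%s\n", substr($0, start, cnt))
}' "$tmp")

rm -f "$tmp"
printf '%s\n' "$result"
```

With the sample above this should print GGT with RST for the first read and ACGTAC with qrstuv for the second. Keeping the strings and the counts in two parallel space-separated lists means the i-th count always belongs to the i-th string, so adding a third pair is just a matter of extending both variables.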

