Sponsored Content
Top Forums Shell Programming and Scripting Search strings and highlight them using Perl or bash/awk/sed Post 302842893 by Jotne on Sunday 11th of August 2013 04:42:03 AM
Old 08-11-2013
I need some help here to help out Smilie
Whit this script I get only one single line, correct is 5, why?
Code:
awk 'NR==FNR {a[$0]=$0;next} {for (i in a) {if ($0~a[i]) print}}' b.txt a.txt
Seq3 TTAAACTTTTTTCAACCCTAATG-----CGGTTTGAACCATTAACC-----------TAAC 48

correct answer is
Code:
Seq1 -------------------------------TTAAAAAGTTTGAGTTCTAAA---------------- 21
Seq2 -----CTTGGCTCTTTCGTAAGTTTTTCATTAAGGAACTTGAATACACGGTTT----AC- 50
Seq3 TTAAACTTTTTTCAACCCTAATG-----CGGTTTGAACCATTAACC-----------TAAC 48
Seq4 --------GAAAGGAGCGGAGTG-GTCACGTGACAAGTTCTCAGACGCACGTGC--TTGT 49
Seq4 --------GAAAGGAGCGGAGTG-GTCACGTGACAAGTTCTCAGACGCACGTGC--TTGT 49

Running a test like this show correctly all possibility
Code:
awk 'NR==FNR {a[$0]=$0;next} {for (i in a) {print $0,a[i]}}' b.txt a.txt
Seq1 -------------------------------TTAAAAAGTTTGAGTTCTAAA---------------- 21 ACG
Seq1 -------------------------------TTAAAAAGTTTGAGTTCTAAA---------------- 21 TAATG
Seq1 -------------------------------TTAAAAAGTTTGAGTTCTAAA---------------- 21 AAAAAG
Seq1 -------------------------------TTAAAAAGTTTGAGTTCTAAA---------------- 21 GACAAGT
Seq1 -------------------------------TTAAAAAGTTTGAGTTCTAAA---------------- 21 CAAGC
Seq1 -------------------------------TTAAAAAGTTTGAGTTCTAAA---------------- 21 GCTTG
Seq2 -----CTTGGCTCTTTCGTAAGTTTTTCATTAAGGAACTTGAATACACGGTTT----AC- 50 ACG
Seq2 -----CTTGGCTCTTTCGTAAGTTTTTCATTAAGGAACTTGAATACACGGTTT----AC- 50 TAATG
Seq2 -----CTTGGCTCTTTCGTAAGTTTTTCATTAAGGAACTTGAATACACGGTTT----AC- 50 AAAAAG
Seq2 -----CTTGGCTCTTTCGTAAGTTTTTCATTAAGGAACTTGAATACACGGTTT----AC- 50 GACAAGT
Seq2 -----CTTGGCTCTTTCGTAAGTTTTTCATTAAGGAACTTGAATACACGGTTT----AC- 50 CAAGC
Seq2 -----CTTGGCTCTTTCGTAAGTTTTTCATTAAGGAACTTGAATACACGGTTT----AC- 50 GCTTG
Seq3 TTAAACTTTTTTCAACCCTAATG-----CGGTTTGAACCATTAACC-----------TAAC 48 ACG
Seq3 TTAAACTTTTTTCAACCCTAATG-----CGGTTTGAACCATTAACC-----------TAAC 48 TAATG
Seq3 TTAAACTTTTTTCAACCCTAATG-----CGGTTTGAACCATTAACC-----------TAAC 48 AAAAAG
Seq3 TTAAACTTTTTTCAACCCTAATG-----CGGTTTGAACCATTAACC-----------TAAC 48 GACAAGT
Seq3 TTAAACTTTTTTCAACCCTAATG-----CGGTTTGAACCATTAACC-----------TAAC 48 CAAGC
Seq3 TTAAACTTTTTTCAACCCTAATG-----CGGTTTGAACCATTAACC-----------TAAC 48 GCTTG
Seq4 --------GAAAGGAGCGGAGTG-GTCACGTGACAAGTTCTCAGACGCACGTGC--TTGT 49 ACG
Seq4 --------GAAAGGAGCGGAGTG-GTCACGTGACAAGTTCTCAGACGCACGTGC--TTGT 49 TAATG
Seq4 --------GAAAGGAGCGGAGTG-GTCACGTGACAAGTTCTCAGACGCACGTGC--TTGT 49 AAAAAG
Seq4 --------GAAAGGAGCGGAGTG-GTCACGTGACAAGTTCTCAGACGCACGTGC--TTGT 49 GACAAGT
Seq4 --------GAAAGGAGCGGAGTG-GTCACGTGACAAGTTCTCAGACGCACGTGC--TTGT 49 CAAGC
Seq4 --------GAAAGGAGCGGAGTG-GTCACGTGACAAGTTCTCAGACGCACGTGC--TTGT 49 GCTTG


Last edited by Jotne; 08-11-2013 at 05:49 AM..
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk search for Quoted strings (')

Hi All, I have files: 1. abc.sql 'This is a sample file for testing' This does not have quotations this also does not have quotations. and this 'has quotations'. here I need to list the hard coded strings 'This is a sample file for testing' and 'has quotations'. So i have... (13 Replies)
Discussion started by: kprattip
13 Replies

2. Shell Programming and Scripting

Awk search for a element in the list of strings

Hi, how do I match a particular element in a list and replace it with blank? awk 'sub///' $FILE list="AL, AK, AZ, AR, CA, CO, CT, DE, FL, GA, HI, ID, IL, IN, IA, KS, KY, LA, ME, MD, MA, MI, MN, MS, MO, MT, NE, NV, NH, NJ, NM, NY, NC, ND, OH, OK, OR, PA, RI, SC, SD, TN, TX, UT, VT, VA, WA,... (2 Replies)
Discussion started by: grossgermany
2 Replies

3. UNIX for Advanced & Expert Users

bash/grep/awk/sed: How to extract every appearance of text between two specific strings

I have a text wich looks like this: clid=2 cid=6 client_database_id=35 client_nickname=Peter client_type=0|clid=3 cid=22 client_database_id=57 client_nickname=Paul client_type=0|clid=5 cid=22 client_database_id=7 client_nickname=Mary client_type=0|clid=6 cid=22 client_database_id=6... (3 Replies)
Discussion started by: Pioneer1976
3 Replies

4. Shell Programming and Scripting

Help with sed search&replace between two strings

Hi, i read couple of threads here on forum, and googled about what bugs me, yet i still can't find solution. Problem is below. I need to change this string (with sed if it is possible): This is message text that is being quoted to look like this: This is message text that is being quotedI... (2 Replies)
Discussion started by: angrybb
2 Replies

5. Shell Programming and Scripting

Using Awk to Search Two Strings on One Line

If i wanted to search for two strings that are on lines in the log, how do I do it? The following code searches for just one string that is one one line. awk '/^/ {split($2,s,",");a=$1 FS s} /failure agaf@fafa/ {b=a} END{print b}' urfile What if I wanted to search for "failure agaf@fafa"... (3 Replies)
Discussion started by: SkySmart
3 Replies

6. Shell Programming and Scripting

Using Bash/Sed to delete between identical strings

Hi. I'm hoping that someone can help me with a bash script to delete a block of lines from a file. What I want to do is delete every line between two stings that are the same, including the line the first string is on but not the second. (Marked lines to match with !) For example if I... (2 Replies)
Discussion started by: Zykr
2 Replies

7. Shell Programming and Scripting

Advance search using sed/awk/perl

Hi, I have a file with more than 50,000 lines of records and each record is 50 bytes in length. I need to search every record in this file between positions 11-19 (9 bytes) and 32-40 (9 bytes) and in case any of the above 2 fields is alpha-numeric, i need to replace the whole 9 bytes of that... (7 Replies)
Discussion started by: kikionline
7 Replies

8. Shell Programming and Scripting

Display records between two search strings using sed

I have input file like AAA AAA CCC CCC CCC EEE EEE EEE EEE FFF FFF GGG GGG i was trying to retrieve data between two strings using sed. sed -n /CCC/,/FFF/p input_file Am getting output like CCC CCC CCC (1 Reply)
Discussion started by: NareshN
1 Replies

9. Shell Programming and Scripting

Rsync script to rewrite suffix - BASH, awk, sed, perl?

trying to write up a script to put the suffix back. heres what I have but can't get it to do anything :( would like it to be name.date.suffix rsync -zrlpoDtub --suffix=".`date +%Y%m%d%k%M%S`.~" --bwlimit=1024 /mymounts/test1/ /mymounts/test2/ while IFS=. read -r -u 9 -d '' name... (1 Reply)
Discussion started by: jmituzas
1 Replies

10. Shell Programming and Scripting

How to search and lossless replace strings using sed?

Hello, I would like to replace all occurencies of long data types by others (coresponding int) using 'sed' in the extensive source code of a software package written for 32 bit CPUs. I must use regular expressions to avoid wrong replacements like s/unsigned]+long/ulong/gLeaving out... (2 Replies)
Discussion started by: Mick P. F.
2 Replies
Locale::Codes::LangFam(3pm)				 Perl Programmers Reference Guide			       Locale::Codes::LangFam(3pm)

NAME
Locale::Codes::LangFam - standard codes for language extension identification SYNOPSIS
use Locale::Codes::LangFam; $lext = code2langfam('apa'); # $lext gets 'Apache languages' $code = langfam2code('Apache languages'); # $code gets 'apa' @codes = all_langfam_codes(); @names = all_langfam_names(); DESCRIPTION
The "Locale::Codes::LangFam" module provides access to standard codes used for identifying language families, such as those as defined in ISO 639-5. Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default ISO 639-5 language family codes will be used. SUPPORTED CODE SETS
There are several different code sets you can use for identifying language families. A code set may be specified using either a name, or a constant that is automatically exported by this module. For example, the two are equivalent: $lext = code2langfam('apa','alpha'); $lext = code2langfam('apa',LOCALE_LANGFAM_ALPHA); The codesets currently supported are: alpha This is the set of three-letter (lowercase) codes from ISO 639-5 such as 'apa' for Apache languages. This is the default code set. ROUTINES
code2langfam ( CODE [,CODESET] ) langfam2code ( NAME [,CODESET] ) langfam_code2code ( CODE ,CODESET ,CODESET2 ) all_langfam_codes ( [CODESET] ) all_langfam_names ( [CODESET] ) Locale::Codes::LangFam::rename_langfam ( CODE ,NEW_NAME [,CODESET] ) Locale::Codes::LangFam::add_langfam ( CODE ,NAME [,CODESET] ) Locale::Codes::LangFam::delete_langfam ( CODE [,CODESET] ) Locale::Codes::LangFam::add_langfam_alias ( NAME ,NEW_NAME ) Locale::Codes::LangFam::delete_langfam_alias ( NAME ) Locale::Codes::LangFam::rename_langfam_code ( CODE ,NEW_CODE [,CODESET] ) Locale::Codes::LangFam::add_langfam_code_alias ( CODE ,NEW_CODE [,CODESET] ) Locale::Codes::LangFam::delete_langfam_code_alias ( CODE [,CODESET] ) These routines are all documented in the Locale::Codes::API man page. SEE ALSO
Locale::Codes The Locale-Codes distribution. Locale::Codes::API The list of functions supported by this module. http://www.loc.gov/standards/iso639-5/id.php ISO 639-5 . AUTHOR
See Locale::Codes for full author history. Currently maintained by Sullivan Beck (sbeck@cpan.org). COPYRIGHT
Copyright (c) 2011-2013 Sullivan Beck This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. perl v5.18.2 2013-11-04 Locale::Codes::LangFam(3pm)
All times are GMT -4. The time now is 03:41 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy