Finding specific series of strings or characters (UNIX for Dummies Questions & Answers), Post 302562626 by ctsgnb on Friday 7th of October 2011 12:55:23 PM
Code:
nawk '/Seq/{x=$0;next}$0!~/[-*]/{print x RS $0}' infile

Code:
$ cat f1
>Sequence1
AGACAGATGACAGTAGACAGAT-GACGATAGCAGT
>Sequence2
AGACAGATGACAGTAGACAGATAGACGATAGCAGT
>Sequence3
AGACAGATGACAGTAGACAGATCGACGATAGCAGT
>Sequence4
AGACAGATGA-AGTAGACAGATTGACGATAGCAGT
>Sequence5
AGAC*GATGA


Code:
$ nawk '/Seq/{x=$0;next}$0!~/[-*]/{print x RS $0}' f1
>Sequence2
AGACAGATGACAGTAGACAGATAGACGATAGCAGT
>Sequence3
AGACAGATGACAGTAGACAGATCGACGATAGCAGT

A few code explanations:

nawk '               call awk
/Seq/                if the "Seq" pattern is found in the scanned line
{x=$0;               save the line in variable x
next}                skip the remaining awk instructions and process the next line from the beginning of the awk instructions
$0!~/[-*]/           if no occurrence of - or * is found in the scanned line
{print x RS $0}      print the last Sequence line scanned, a record separator ("\n" by default), and the current scanned line
' <yourfile>         name of the file that is passed to awk (argument)
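
The steps above can also be written out as a commented, multi-line script. This is only a sketch under a few assumptions: a POSIX awk is available as plain awk (nawk is the name used on Solaris), header lines start with ">", and each sequence sits on a single line, as in f1. The function name filter_clean and the variable hdr are made up for illustration and are not part of the original one-liner.

```shell
#!/bin/sh
# filter_clean: hypothetical helper name, used only for this sketch.
# Assumes one-line sequences and header lines that begin with ">".
filter_clean() {
    awk '
        # Header line: remember it and move on to the next input line.
        /^>/ { hdr = $0; next }

        # Sequence line with no "-" and no "*": print header, newline, sequence.
        $0 !~ /[-*]/ { print hdr RS $0 }
    ' "$@"
}

# Feed the sample data from the post on standard input.
filter_clean <<'EOF'
>Sequence1
AGACAGATGACAGTAGACAGAT-GACGATAGCAGT
>Sequence2
AGACAGATGACAGTAGACAGATAGACGATAGCAGT
>Sequence3
AGACAGATGACAGTAGACAGATCGACGATAGCAGT
>Sequence4
AGACAGATGA-AGTAGACAGATTGACGATAGCAGT
>Sequence5
AGAC*GATGA
EOF
```

Run on that input, it should print the same Sequence2 and Sequence3 records as the nawk one-liner. Anchoring the header test with ^> instead of matching "Seq" anywhere is a design choice for this sketch: it avoids treating a data line that happens to contain "Seq" as a header. The function can also be given a file name, as in filter_clean f1.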

Last edited by ctsgnb; 10-07-2011 at 02:08 PM..