Sponsored Content
Full Discussion: awk regular expression
Top Forums Shell Programming and Scripting awk regular expression Post 302785031 by @man on Sunday 24th of March 2013 11:44:22 AM
Old 03-24-2013
thank you Yoda and alister.

As you said shorter is not always better! Smilie I was not sure if what I used does the matching strictly or not. Now I'm
once again thanks to both of you.

Cheers!
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Regular expression query in AWK

I have a varable(var1) in a AWK script that contain data in the following format - I need to extract timestamp,priority and log message.I can extract these by using split function but i don't want to use it, since i want to extract it in one go. I have some difficulties in doing it using... (3 Replies)
Discussion started by: omprasad
3 Replies

2. Shell Programming and Scripting

awk and regular expression

Ive got a file with words and also numbers. Bla BLA 10 10 11 29 12 89 13 35 And i need to change "10,29,89,25" and also remove anything that contains actually words... (4 Replies)
Discussion started by: maskot
4 Replies

3. UNIX for Dummies Questions & Answers

regular expression and awk

I can print a line with an expression using this: awk '/regex/' I can print the line immediately before an expression using this: awk '/regex/{print x};{x=$0}' How do I print the line immediately before and then the line with the expression? (2 Replies)
Discussion started by: nickg
2 Replies

4. Shell Programming and Scripting

need help guys for Regular expression in awk

Hello Experts, Please help me to cope with the following problem I ve patterens like Input Noptx(5) // remain the same -*Nop(3); Nop(9); --Nop(8); // remain the same d3 **---Nop(7); //remain the same d3 **---Nop(7); *--Nop(6); --**Nop(5); -Nop(4); Nop(3); - represents a space... (2 Replies)
Discussion started by: user_prady
2 Replies

5. Shell Programming and Scripting

Regular expression query in AWK

Hi, I have a string like this-->"After Executing service For 10 Request" in this string i need to extract "10". the contents of the string is variable and "10" appears before "For" and after "Request" i.e, in this format "For x Request" I need to extract the value of x. How to do this in AWK?... (10 Replies)
Discussion started by: omprasad
10 Replies

6. UNIX for Advanced & Expert Users

Regular Expression Error in AWK

I have a file "fwcsales_filenames.txt" which has a list of file names that are supposed to be copied to another directory. In addition to that, I am trying to extract the date part and write to the log. I am getting the regular expression error when trying to strip the date part using the "ll"... (1 Reply)
Discussion started by: madhunk
1 Replies

7. Shell Programming and Scripting

Regular expression in AWK

Hello world, I was wondering if there is a nicer way to write the following code (in AWK): awk ' FNR==NR&&$1~/^m$/{tok1=1} FNR==NR&&$1~/^m10$/{tok1=1} ' my_file In fact, it looks for m2, m4, m6, m8 and m10 and then return a positive flag. The problem is how to define 10 thanks... (3 Replies)
Discussion started by: jolecanard
3 Replies

8. Programming

Perl: How to read from a file, do regular expression and then replace the found regular expression

Hi all, How am I read a file, find the match regular expression and overwrite to the same files. open DESTINATION_FILE, "<tmptravl.dat" or die "tmptravl.dat"; open NEW_DESTINATION_FILE, ">new_tmptravl.dat" or die "new_tmptravl.dat"; while (<DESTINATION_FILE>) { # print... (1 Reply)
Discussion started by: jessy83
1 Replies

9. Shell Programming and Scripting

Problem with Regular expression in awk

Hi, I have a file with two fields in it as shown below 14,30 28,30 16,30 22,30 21,30 3,30 Fields are separated by comma ",". I've been trying to validate the file based on the condition "each field must be a numeric value" I am using HP-UX OS. I have tried the following awk... (4 Replies)
Discussion started by: meetsriharsha
4 Replies

10. Shell Programming and Scripting

awk regular expression search

Hi All, I would like to search a regular expression by passing as an i/p variableto AWK. For Example :: 162.111.101.209.9516 162.111.101.209.41891 162.111.101.209.9516 162.111.101.209.9517 162.111.101.209.41918 162.111.101.209.9517 162.111.101.209.41937 162.111.101.209.41951... (7 Replies)
Discussion started by: Girish19
7 Replies
CD-HIT-EST(1)							   User Commands						     CD-HIT-EST(1)

NAME
cdhit-est - run CD-HIT algorithm on RNA/DNA sequences SYNOPSIS
cdhit-est [Options] DESCRIPTION
====== CD-HIT version 4.6 (built on Apr 26 2012) ====== Options -i input filename in fasta format, required -o output filename, required -c sequence identity threshold, default 0.9 this is the default cd-hit's "global sequence identity" calculated as: number of identical amino acids in alignment divided by the full length of the shorter sequence -G use global sequence identity, default 1 if set to 0, then use local sequence identity, calculated as : number of identical amino acids in alignment divided by the length of the alignment NOTE!!! don't use -G 0 unless you use alignment coverage controls see options -aL, -AL, -aS, -AS -b band_width of alignment, default 20 -M memory limit (in MB) for the program, default 800; 0 for unlimitted; -T number of threads, default 1; with 0, all CPUs will be used -n word_length, default 10, see user's guide for choosing it -l length of throw_away_sequences, default 10 -d length of description in .clstr file, default 20 if set to 0, it takes the fasta defline and stops at first space -s length difference cutoff, default 0.0 if set to 0.9, the shorter sequences need to be at least 90% length of the representative of the cluster -S length difference cutoff in amino acid, default 999999 if set to 60, the length difference between the shorter sequences and the representative of the cluster can not be bigger than 60 -aL alignment coverage for the longer sequence, default 0.0 if set to 0.9, the alignment must covers 90% of the sequence -AL alignment coverage control for the longer sequence, default 99999999 if set to 60, and the length of the sequence is 400, then the alignment must be >= 340 (400-60) residues -aS alignment coverage for the shorter sequence, default 0.0 if set to 0.9, the alignment must covers 90% of the sequence -AS alignment coverage control for the shorter sequence, default 99999999 if set to 60, and the length of the sequence is 400, then the alignment must be >= 340 (400-60) residues -A minimal alignment coverage control for the both sequences, default 0 alignment must cover >= this value for both sequences -uL maximum unmatched percentage for the longer sequence, default 1.0 if set to 0.1, the unmatched region (excluding leading and tailing gaps) must not be more than 10% of the sequence -uS maximum unmatched percentage for the shorter sequence, default 1.0 if set to 0.1, the unmatched region (excluding leading and tail- ing gaps) must not be more than 10% of the sequence -U maximum unmatched length, default 99999999 if set to 10, the unmatched region (excluding leading and tailing gaps) must not be more than 10 bases -B 1 or 0, default 0, by default, sequences are stored in RAM if set to 1, sequence are stored on hard drive it is recommended to use -B 1 for huge databases -p 1 or 0, default 0 if set to 1, print alignment overlap in .clstr file -g 1 or 0, default 0 by cd-hit's default algorithm, a sequence is clustered to the first cluster that meet the threshold (fast clus- ter). If set to 1, the program will cluster it into the most similar cluster that meet the threshold (accurate but slow mode) but either 1 or 0 won't change the representatives of final clusters -r 1 or 0, default 1, by default do both +/+ & +/- alignments if set to 0, only +/+ strand alignment -mask masking letters (e.g. -mask NX, to mask out both 'N' and 'X') -match matching score, default 2 (1 for T-U and N-N) -mismatch mismatching score, default -2 -gap gap opening score, default -6 -gap-ext gap extension score, default -1 -bak write backup cluster file (1 or 0, default 0) -h print this help Questions, bugs, contact Limin Fu at l2fu@ucsd.edu, or Weizhong Li at liwz@sdsc.edu For updated versions and information, please visit: http://cd-hit.org cd-hit web server is also available from http://cd-hit.org If you find cd-hit useful, please kindly cite: "Clustering of highly homologous sequences to reduce thesize of large protein database", Weizhong Li, Lukasz Jaroszewski & Adam Godzik. Bioinformatics, (2001) 17:282-283 "Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences", Weizhong Li & Adam Godzik. Bioinformatics, (2006) 22:1658-1659 cd-hit-est 4.6-2012-04-25 April 2012 CD-HIT-EST(1)
All times are GMT -4. The time now is 07:41 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy