Sponsored Content
Top Forums Programming C++ getline, parse and take first tokens by condition Post 302916907 by Corona688 on Friday 12th of September 2014 12:41:04 PM
Old 09-12-2014
Code:
$ awk '/^>/ { NF=1;$1="\n"$1" " } { $1=$1 } 1 ; END { print "\n" }' ORS="" OFS="" *.fasta

>seq01 AGCTACGTACATCAGTCGTGTGATCGAGCGGG
>seq02 AGCTAGAGTAGCGCGCTAGCTAGCGATGCAACGCGGTCGT
>seq03 AGCTACGTACATGCAGTCGTGTGATGCGAGCGGGA

$

Setting blank output field and record separators causes it to print spaces and newlines only when we ask (the $1=$1 trick strips them from $0).

Code:
awk '/^>/ { NF=1;$1="\n"$1" " } # Turf all but field 1, add space and newline if line begins with >
{ $1=$1 } # Get rid of all other spaces between fields
1 # Print all lines
END { print "\n" }' ORS="" OFS="" *.fasta

This User Gave Thanks to Corona688 For This Post:
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

tokens in unix ?

im trying to remove all occurences of " OF xyz " in a file where xyz could be any word assuming xyz is the last word on the line but I won't always be. at the moment I have sed 's/OF.*//' but I want a nicer solution which could be in pseudo code sed 's/OF.* (next token)//' Is... (6 Replies)
Discussion started by: seaten
6 Replies

2. UNIX for Advanced & Expert Users

How to parse through a file and based on condition form another output file

I have one file say CM.txt which contains values like below.Its just a flat file 1000,A,X 1001,B,Y 1002,B,Z ... .. total around 4 million lines of entries will be in that file. Now i need to write another file CM1.txt which should have 1000,1 1001,2 1002,3 .... ... .. Here i... (6 Replies)
Discussion started by: sivasu.india
6 Replies

3. Shell Programming and Scripting

: + : more tokens expected

Hello- Trying to add two numbers in a ksh shell scripts and i get this error every time I execute stat1_ex.ksh: + : more tokens expected stat1=`cat .stat1a.tmp | cut -f2 -d" "` stat2=`cat .stat2a.tmp | cut -f2 -d" "` j=$(($stat1 + $stat2)) # < Here a the like the errors out echo $j... (3 Replies)
Discussion started by: Nomaad
3 Replies

4. Shell Programming and Scripting

Shell script to parse/split input string and display the tokens

Hi, How do I parse/split lines (strings) read from a file and display the individual tokens in a shell script? Given that the length of individual lines is not constant and number of tokens in each line is also not constant. The input file could be as below: ... (3 Replies)
Discussion started by: yajaykumar
3 Replies

5. Shell Programming and Scripting

Replacing tokens

Hi all, I have a variable with value DateFileFormat=NAME.CODE.CON.01.#.S001.V1.D$.hent.txt I want this variable to get replaced with : var2 is a variable with string value DateFileFormat=NAME\\.CODE\\.CON\\.01\\.var2\\.S001\\.V1\\.D+\\.hent\\.txt\\.xml$ Please Help (3 Replies)
Discussion started by: abhinav192
3 Replies

6. Shell Programming and Scripting

+: more tokens expected

Hey everyone, i needed some help with this one. We move into a new file system (which should be the same as the previous one, other than the name directory has changed) and the script worked fine in the old file system and not the new. I'm trying to add the results from one with another but i'm... (4 Replies)
Discussion started by: senormarquez
4 Replies

7. Shell Programming and Scripting

Need tokens in shell script

Hi All, Im writing a shell script in which I want to get the folder names in one folder to be used in for loop. I have used: packsName=$(cd ~/packs/Acquisitions; ls -l| awk '{print $9}') echo $packsName o/p: opt temp user1 user2 ie. Im getting the output as a string. But I want... (3 Replies)
Discussion started by: AB10
3 Replies

8. Shell Programming and Scripting

Parse tab delimited file, check condition and delete row

I am fairly new to programming and trying to resolve this problem. I have the file like this. CHROM POS REF ALT 10_sample.bam 11_sample.bam 12_sample.bam 13_sample.bam 14_sample.bam 15_sample.bam 16_sample.bam tg93 77 T C T T T T T tg93 79 ... (4 Replies)
Discussion started by: empyrean
4 Replies

9. Programming

Reading tokens

I have a String class with a function that reads tokens using a delimiter. For example String sss = "6:8:12:16"; nfb = sss.nfields_b (':'); String tkb1 = sss.get_token_b (':'); String tkb2 = sss.get_token_b (':'); String tkb3 = sss.get_token_b (':'); String tkb4 =... (1 Reply)
Discussion started by: kristinu
1 Replies

10. Shell Programming and Scripting

Parse xml in shell script and extract records with specific condition

Hi I have xml file with multiple records and would like to extract records from xml with specific condition if specific tag is present extract entire row otherwise skip . <logentry revision="21510"> <author>mantest</author> <date>2015-02-27</date> <QC_ID>334566</QC_ID>... (12 Replies)
Discussion started by: madankumar.t@hp
12 Replies
AWK(1)							      General Commands Manual							    AWK(1)

NAME
awk - pattern scanning and processing language SYNOPSIS
awk [ -Fc ] [ prog ] [ file ] ... DESCRIPTION
Awk scans each input file for lines that match any of a set of patterns specified in prog. With each pattern in prog there can be an asso- ciated action that will be performed when a line of a file matches the pattern. The set of patterns may appear literally as prog, or in a file specified as -f file. Files are read in order; if there are no files, the standard input is read. The file name `-' means the standard input. Each line is matched against the pattern portion of every pattern-action statement; the associated action is performed for each matched pattern. An input line is made up of fields separated by white space. (This default can be changed by using FS, vide infra.) The fields are denoted $1, $2, ... ; $0 refers to the entire line. A pattern-action statement has the form pattern { action } A missing { action } means print the line; a missing pattern always matches. An action is a sequence of statements. A statement can be one of the following: if ( conditional ) statement [ else statement ] while ( conditional ) statement for ( expression ; conditional ; expression ) statement break continue { [ statement ] ... } variable = expression print [ expression-list ] [ >expression ] printf format [ , expression-list ] [ >expression ] next # skip remaining patterns on this input line exit # skip the rest of the input Statements are terminated by semicolons, newlines or right braces. An empty expression-list stands for the whole line. Expressions take on string or numeric values as appropriate, and are built using the operators +, -, *, /, %, and concatenation (indicated by a blank). The C operators ++, --, +=, -=, *=, /=, and %= are also available in expressions. Variables may be scalars, array elements (denoted x[i]) or fields. Variables are initialized to the null string. Array subscripts may be any string, not necessarily numeric; this allows for a form of associative memory. String constants are quoted "...". The print statement prints its arguments on the standard output (or on a file if >file is present), separated by the current output field separator, and terminated by the output record separator. The printf statement formats its expression list according to the format (see printf(3S)). The built-in function length returns the length of its argument taken as a string, or of the whole line if no argument. There are also built-in functions exp, log, sqrt, and int. The last truncates its argument to an integer. substr(s, m, n) returns the n-character sub- string of s that begins at position m. The function sprintf(fmt, expr, expr, ...) formats the expressions according to the printf(3S) format given by fmt and returns the resulting string. Patterns are arbitrary Boolean combinations (!, ||, &&, and parentheses) of regular expressions and relational expressions. Regular expressions must be surrounded by slashes and are as in egrep. Isolated regular expressions in a pattern apply to the entire line. Regu- lar expressions may also occur in relational expressions. A pattern may consist of two patterns separated by a comma; in this case, the action is performed for all lines between an occurrence of the first pattern and the next occurrence of the second. A relational expression is one of the following: expression matchop regular-expression expression relop expression where a relop is any of the six relational operators in C, and a matchop is either ~ (for contains) or !~ (for does not contain). A condi- tional is an arithmetic expression, a relational expression, or a Boolean combination of these. The special patterns BEGIN and END may be used to capture control before the first input line is read and after the last. BEGIN must be the first pattern, END the last. A single character c may be used to separate the fields by starting the program with BEGIN { FS = "c" } or by using the -Fc option. Other variable names with special meanings include NF, the number of fields in the current record; NR, the ordinal number of the current record; FILENAME, the name of the current input file; OFS, the output field separator (default blank); ORS, the output record separator (default newline); and OFMT, the output format for numbers (default "%.6g"). EXAMPLES
Print lines longer than 72 characters: length > 72 Print first two fields in opposite order: { print $2, $1 } Add up first column, print sum and average: { s += $1 } END { print "sum is", s, " average is", s/NR } Print fields in reverse order: { for (i = NF; i > 0; --i) print $i } Print all lines between start/stop pairs: /start/, /stop/ Print all lines whose first field is different from previous one: $1 != prev { print; prev = $1 } SEE ALSO
lex(1), sed(1) A. V. Aho, B. W. Kernighan, P. J. Weinberger, Awk - a pattern scanning and processing language BUGS
There are no explicit conversions between numbers and strings. To force an expression to be treated as a number add 0 to it; to force it to be treated as a string concatenate "" to it. 7th Edition April 29, 1985 AWK(1)
All times are GMT -4. The time now is 10:43 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy