"Concatenating sequence length to another file"

Post #303031776 by RudiC on Tuesday 5th of March 2019 04:59:34 PM

Try
Code:
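$ wc -cl *.fa | awk '
# First input (stdin): wc output, one "lines bytes name" record per .fa file
FILENAME == "-" {sub (/\.fa$/, "", $3)
                 T[$3] = $2 - $1        # bytes minus newlines
                 next
                }
# First line of each pos file: derive the lookup key from the file name
FNR == 1        {IX = FILENAME
                 sub (/_[^_]*\..*$/, "", IX)
                }

                {print FNR, $0, T[IX] > (FILENAME ".new")
                }
' - OFS="\t" fil*pos.txt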

$ cf *.new

---------- file_1_pos.txt.new: ----------

1    File_1_pos    253     164    350
2    File_1_pos    738     827    350

---------- file_2_pos.txt.new: ----------

1    File_2_pos    1494    1583    280
2    File_2_pos    1785    1874    280

Copy exactly as given, then mv the ".new" files over the old ".txt" files.
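
If there are more than a couple of files, the mv step could be done in a loop. This is only a sketch: the function name promote_new is made up for illustration, and it assumes the ".new" files follow the *_pos.txt.new naming seen in the output above.

```shell
# Replace each original pos file with its ".new" counterpart.
# promote_new is a hypothetical helper; pass the directory holding the files
# (defaults to the current directory).
promote_new() {
    dir=${1:-.}
    for f in "$dir"/*_pos.txt.new; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv -- "$f" "${f%.new}"       # strip the ".new" suffix
    done
}
```

Call it as promote_new . once you have checked that the ".new" files look right, since the originals are overwritten.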

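On where the appended column comes from: wc -cl prints the newline count first and the byte count second, so $2 - $1 is the number of bytes that are not newlines. A small sketch with a made-up sample file (demo.fa is purely illustrative) shows that any header line is counted as well, which may or may not be what you want for your data.

```shell
# Hypothetical sample file: one header line and two sequence lines.
tmp=$(mktemp -d)
printf '>seq1\nACGT\nAC\n' > "$tmp/demo.fa"

# wc -cl reports newlines, then bytes; the difference is every
# non-newline byte in the file, header included.
len=$(wc -cl < "$tmp/demo.fa" | awk '{print $2 - $1}')
echo "$len"    # 11 for this sample: 5 header characters + 6 bases
rm -rf "$tmp"
```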



EDIT: If there can be any number of .fa files, each with a corresponding _pos.txt file, you could try
Code:
$ wc -cl *.fa |
awk '
# stdin: wc output; skip the "total" line that wc adds for several files
FILENAME == "-" {if ($3 == "total") next
                 sub (/\.fa$/, "", $3)
                 T[$3] = $2 - $1
                 ARGV[ARGC++] = $3 "_pos.txt"   # queue the matching pos file
                 next
                }
FNR == 1        {IX = FILENAME
                 sub (/_[^_]*\..*$/, "", IX)
                }
                {print FNR, $0, T[IX] > (FILENAME ".new")
                }
' - OFS="\t"
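
The ARGV[ARGC++] line works because awk looks at ARGV again each time it finishes an input, so names appended while stdin is being read are opened afterwards. A stripped-down sketch of just that mechanism, with made-up file names a.txt and b.txt:

```shell
tmp=$(mktemp -d)
printf 'alpha\n' > "$tmp/a.txt"
printf 'beta\n'  > "$tmp/b.txt"

# File names arrive on stdin; each one is appended to ARGV and
# read as a regular input file once stdin is exhausted.
out=$(printf '%s\n' "$tmp/a.txt" "$tmp/b.txt" |
      awk 'FILENAME == "-" { ARGV[ARGC++] = $0; next }
           { print FNR, $0 }' -)
echo "$out"
rm -rf "$tmp"
```

Each queued file starts again at FNR 1, which is why the FNR == 1 rule above can recompute IX per file.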


Last edited by RudiC; 03-05-2019 at 06:16 PM..
 
