09-26-2008
To get an output by combining fields from two different files
Hi guys,
I couldn't find a solution to this problem. If anyone knows one, please help me out.
Your guidance is highly appreciated.
I have two files.
FILE1 has the following 7 columns (the hyphens are added here only to make the columns visible; in the real file the columns are separated by a single space):
155.34 - leg - 1 - 344 - TC200232 - 292 - 930
152.88 - leg - 1 - 344 - TC215306 - 2 - 743
123.94 - leg - 1 - 344 - TC210135 - 423 - 1148
FILE2
>TC200232.pep
AYNGFNNSNIIRDGVAIINSSGALKLTNRSYNVIGHAFHPNPVPIFNSSTKNVTSFSTYF
VFAIVPLEKTSGGFGFA
>TC210135.pep
GFGDFGKDSNFESQIALYGDAKVVNGGIQMSGSMGFSAGRILNKKPFKLIDGNPRKMVSF
SLHFVFSLSRENGDGFAFVMVPIGYPFDVFDGGSFGLLGNRKMKFLAVEFDTFMDEKYGD
VNDNHVGVDLSS
>TC215306.pep
PRLKQDLTLVGSVIVSDEKKSVQIPDPEREGDDLKHLVGRAIYSSPIR
I want an output file, FILE3, that is the same as FILE2 except that each line starting with '>' also contains "(region 292 to 930 of SEQ)", where 292 and 930 are columns 6 and 7 of the FILE1 row with the same id, e.g. TC200232 (present in both files):
>TC200232.pep (region 292 to 930 of SEQ)
AYNGFNNSNIIRDGVAIINSSGALKLTNRSYNVIGHAFHPNPVPIFNSSTKNVTSFSTYF
VFAIVPLEKTSGGFGFA
>TC210135.pep (region 423 to 1148 of SEQ)
GFGDFGKDSNFESQIALYGDAKVVNGGIQMSGSMGFSAGRILNKKPFKLIDGNPRKMVSF
SLHFVFSLSRENGDGFAFVMVPIGYPFDVFDGGSFGLLGNRKMKFLAVEFDTFMDEKYGD
VNDNHVGVDLSS
>TC215306.pep (region 2 to 743 of SEQ)
PRLKQDLTLVGSVIVSDEKKSVQIPDPEREGDDLKHLVGRAIYSSPIR
Last edited by smriti_shridhar; 09-26-2008 at 08:03 AM..
Reason: formatting
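One possible approach, as a minimal awk sketch rather than a definitive answer: read FILE1 first and remember columns 6 and 7 under the id in column 5, then append the region text to each matching '>' line of FILE2. It assumes FILE1 is whitespace-separated, that every header in FILE2 has the form >ID.pep, and that FILE1 is not empty. The function name annotate_fasta is made up for illustration.

```shell
# annotate_fasta is a hypothetical helper name: $1 is FILE1, $2 is FILE2.
annotate_fasta() {
    awk '
        # First file: remember the region text for each id found in column 5.
        NR == FNR { region[$5] = "(region " $6 " to " $7 " of SEQ)"; next }
        # Second file, header lines: strip the leading ">" and trailing ".pep"
        # to recover the id, then append the region if FILE1 had that id.
        /^>/ {
            id = substr($1, 2)
            sub(/\.pep$/, "", id)
            if (id in region) { print $0 " " region[id]; next }
        }
        # Sequence lines and unmatched headers pass through unchanged.
        { print }
    ' "$1" "$2"
}
```

It would be called as `annotate_fasta FILE1 FILE2 > FILE3`. Headers whose id does not appear in FILE1 are printed as they are, which may or may not be what you want.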
JOIN(1) FSF JOIN(1)
NAME
join - join lines of two files on a common field
SYNOPSIS
join [OPTION]... FILE1 FILE2
DESCRIPTION
For each pair of input lines with identical join fields, write a line to standard output. The default join field is the first, delimited
by whitespace. When FILE1 or FILE2 (not both) is -, read standard input.
-a SIDE
print unpairable lines coming from file SIDE
-e EMPTY
replace missing input fields with EMPTY
-i, --ignore-case ignore differences in case when comparing fields
-j FIELD
(obsolescent) equivalent to `-1 FIELD -2 FIELD'
-j1 FIELD
(obsolescent) equivalent to `-1 FIELD'
-j2 FIELD
(obsolescent) equivalent to `-2 FIELD'
-o FORMAT
obey FORMAT while constructing output line
-t CHAR
use CHAR as input and output field separator
-v SIDE
like -a SIDE, but suppress joined output lines
-1 FIELD
join on this FIELD of file 1
-2 FIELD
join on this FIELD of file 2
--help display this help and exit
--version
output version information and exit
Unless -t CHAR is given, leading blanks separate fields and are ignored, else fields are separated by CHAR. Any FIELD is a field number
counted from 1. FORMAT is one or more comma or blank separated specifications, each being `SIDE.FIELD' or `0'. Default FORMAT outputs the
join field, the remaining fields from FILE1, the remaining fields from FILE2, all separated by CHAR.
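EXAMPLES
The following is a hedged sketch, not part of the original manual, of how the options above might be combined for the question in this thread: list id, start and end for every id that occurs both in column 5 of a whitespace-separated table and in the '>ID.pep' header lines of a sequence file. join expects both inputs sorted on the join field, so both are sorted under LC_ALL=C first. The function name region_table is hypothetical, and mktemp is assumed to be available.

```shell
# region_table is a hypothetical helper name: $1 is the table, $2 the sequence file.
region_table() {
    ids=$(mktemp) || return 1
    # Ids from the header lines: drop the leading ">" and the ".pep" suffix, then sort.
    sed -n 's/^>\(.*\)\.pep$/\1/p' "$2" | LC_ALL=C sort > "$ids"
    # Sort the table on column 5 (ignoring leading blanks) and feed it to join as "-".
    # -1 5 -2 1 picks the join fields; -o selects id, start and end from the table.
    LC_ALL=C sort -k 5b,5 "$1" | LC_ALL=C join -1 5 -2 1 -o 1.5,1.6,1.7 - "$ids"
    rm -f "$ids"
}
```

Calling `region_table FILE1 FILE2` prints one "id start end" line per common id, in sorted id order. Because join works line by line on sorted input, it yields the lookup table only; it does not reproduce the sequence lines of the second file.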
AUTHOR
Written by Mike Haertel.
REPORTING BUGS
Report bugs to <bug-coreutils@gnu.org>.
COPYRIGHT
Copyright (C) 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
SEE ALSO
The full documentation for join is maintained as a Texinfo manual. If the info and join programs are properly installed at your site, the
command
info join
should give you access to the complete manual.
join (coreutils) 4.5.3 February 2003 JOIN(1)