Sponsored Content
Top Forums Shell Programming and Scripting awk to match keyword and return matches and unique fields Post 302963372 by cmccabe on Tuesday 29th of December 2015 06:03:55 PM
Old 12-29-2015
awk to match keyword and return matches and unique fields

Trying to use awk to find a keyword and return the matches in the row, but also $1 and $2, which are the unique id's, but they only appear once. Thank you Smilie.


file
Code:
name	31	Index	Chromosomal Position	Gene	Inheritance
		122	2106725	TSC2	AD
		124	2115481	TSC2	AD
		121	2105400	TSC2	AD
		82	135782221	TSC1	AD
		81	135782026	TSC1	AD
		126	2138218	TSC2	AD
		123	2113107	TSC2	AD
		125	2126142	TSC2	AD
name2	12	Index	Chromosomal Position	Gene	Inheritance
		1	43396568	SLC2A1	AD, AR
name3	20	Index	Chromosomal Position	Gene	Inheritance
		188	2135240	TSC1	AD
		179	2103379	TSC1 AD
		191	2137899	TSC2	AD
		181	2110617	TSC2	AD
		190	2137857	TSC2	AD
		189	2137806	TSC2	AD
		186	2133798	TSC2	AD
		187	2135074	TSC2	AD
		180	2105400	TSC2	AD
		183	2122822	TSC2	AD
		192	2138218	TSC2	AD
		185	2125937	TSC2	AD
		184	2125788	TSC2	AD
		193	2138269	TSC2	AD
		182	2112981	TSC2	AD

Desired output
Code:
name	  31	Index	Chromosomal Position	Gene	Inheritance
                  82	135782221	TSC1	AD
                  81	135782026	TSC1	AD
name3  20	Index	Chromosomal Position	Gene	Inheritance
                  188	2135240	TSC1	AD
                  179	2103379	TSC1	AD
                  191	2137899	TSC1	AD


awk
Code:
awk '/TSC1/{ print $1,$2,$0 }' file.txt > output.txt


Last edited by cmccabe; 12-29-2015 at 07:59 PM.. Reason: corrected input
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Compare Tab Separated Field with AWK to all and print lines of unique fields.

Hi. I have a tab separated file that has a couple nearly identical lines. When doing: sort file | uniq > file.new It passes through the nearly identical lines because, well, they still are unique. a) I want to look only at field x for uniqueness and if the content in field x is the... (1 Reply)
Discussion started by: rocket_dog
1 Replies

2. Shell Programming and Scripting

awk Help -- If match found return the count

Hi All, I need to get the count of records in the file, if the passing parameter matches with the list of records in the file. Below is my example source file: Test1.dat 20120913 20120913 20120912 20120912 20120912 20120912 20120912 20120913 20120913 20120912 In my script I am... (5 Replies)
Discussion started by: bbc17484
5 Replies

3. Shell Programming and Scripting

awk pattern match and count unique in column

Hi all I have a need of searching some pattern in file by month and then count unique records D11 G11 R11 -------> Pattern available in file S11 Jan$1 to $5 column contains some records in which I want to find unique for this purpose I have written script like below awk '/Jan/ ||... (4 Replies)
Discussion started by: nex_asp
4 Replies

4. Shell Programming and Scripting

awk to calculate fields only if match is found

Trying to combine the matching $5 values between file1 and file2. If a match is found then the last $6 value in the match and the sum of $7 are outputted to a new file. The awk below I hope is a good start. Thank you :). file1 chr12 9221325 9221448 chr12:9221325-9221448 A2M 1... (5 Replies)
Discussion started by: cmccabe
5 Replies

5. Shell Programming and Scripting

awk unique count of partial match with semi-colon

Trying to get the unique count of the below input, but if the text in beginning of $5 is a partial match to another line in the file then it is not unique. awk awk '!seen++ {n++} END {print n}' input 7 input chr1 159174749 159174770 chr1:159174749-159174770 ACKR1 chr1 ... (2 Replies)
Discussion started by: cmccabe
2 Replies

6. Shell Programming and Scripting

awk to combine matches and use a field to adjust coordinates in other fields

Trying to output a result that uses the data from file to combine and subtract specific lines. If $4 matches in each line then the last $6 value is added to $2 and that becomes the new$3. Each matching line in combined into one with $1 then the original $2 then the new$3 then $5. For the cases... (4 Replies)
Discussion started by: cmccabe
4 Replies

7. UNIX for Beginners Questions & Answers

Grep or awk a unique and specific word across many fields

Hi there, I have data with similar structure as this: CHR START-SNP END-SNP REF ALT PATIENT1 PATIENT2 PATIENT3 PATIENT4 chr1 69511 69511 A G homo hetero homo hetero chr2 69513 69513 T C . hetero homo hetero chr3 69814 69814 G C . . homo homo chr4 69815 69815 C A hetero . . hetero is... (10 Replies)
Discussion started by: daashti
10 Replies

8. Shell Programming and Scripting

awk to print match or non-match and select fields/patterns for non-matches

In the awk below I am trying to output those lines that Match between file1 and file2, those Missing in file1, and those missing in file2. Using each $1,$2,$4,$5 value as a key to match on, that is if those 4 fields are found in both files the match, but if those 4 fields are not found then missing... (0 Replies)
Discussion started by: cmccabe
0 Replies

9. UNIX for Beginners Questions & Answers

awk match two fields in two files

Hi, I have two TEST files t.xyz and a.xyz which have three columns each. a.xyz have more rows than t.xyz. I will like to output rows at which $1 and $2 of t.xyz match $1 and $2 of a.xyz. Total number of output rows should be equal to that of t.xyz. It works fine, but when I apply it to large... (6 Replies)
Discussion started by: geomarine
6 Replies

10. UNIX for Beginners Questions & Answers

find pattern matches in consecutive lines in certain fields-awk

I have a text file with many thousands of lines, a small sample of which looks like this: InputFile:PS002,003 D -1 5 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 6 6 -1 -1 -1 -1 0 509 0 PS002,003 PSQ 0 1 7 18 1 0 -1 1 1 3 -1 -1 ... (5 Replies)
Discussion started by: jvoot
5 Replies
Bio::LiveSeq::Gene(3pm) 				User Contributed Perl Documentation				   Bio::LiveSeq::Gene(3pm)

NAME
Bio::LiveSeq::Gene - Range abstract class for LiveSeq SYNOPSIS
# documentation needed DESCRIPTION
This is used as storage for all object references concerning a particular gene. AUTHOR - Joseph A.L. Insana Email: Insana@ebi.ac.uk, jinsana@gmx.net APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _ new Title : new Usage : $gene = Bio::LiveSeq::Gene->new(-name => "name", -features => $hashref -upbound => $min -downbound => $max); Function: generates a new Bio::LiveSeq::Gene Returns : reference to a new object of class Gene Errorcode -1 Args : one string and one hashreference containing all features defined for the Gene and the references to the LiveSeq objects for those features. Two labels for defining boundaries of the gene. Usually the boundaries will reflect max span of transcript, exon... features, while the DNA sequence will be created with some flanking regions (e.g. with the EMBL_SRS::gene2liveseq routine). If these two labels are not given, they will default to the start and end of the DNA object. Note : the format of the hash has to be like DNA => reference to LiveSeq::DNA object Transcripts => reference to array of transcripts objrefs Transclations => reference to array of transcripts objrefs Exons => .... Introns => .... Prim_Transcripts => .... Repeat_Units => .... Repeat_Regions => .... Only DNA and Transcripts are mandatory verbose Title : verbose Usage : $self->verbose(0) Function: Sets verbose level for how ->warn behaves -1 = silent: no warning 0 = reduced: minimal warnings 1 = default: all warnings 2 = extended: all warnings + stack trace dump 3 = paranoid: a warning becomes a throw and the program dies Note: a quick way to set all LiveSeq objects at the same verbosity level is to change the DNA level object, since they all look to that one if their verbosity_level attribute is not set. But the method offers fine tuning possibility by changing the verbose level of each object in a different way. So for example, after $loader= and $gene= have been retrieved by a program, the command $gene->verbose(0); would set the default verbosity level to 0 for all objects. Returns : the current verbosity level Args : -1,0,1,2 or 3 perl v5.14.2 2012-03-02 Bio::LiveSeq::Gene(3pm)
All times are GMT -4. The time now is 07:30 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy