awk to match keyword and return matches and unique fields

Post 302963372 by cmccabe on Tuesday 29th of December 2015 06:03:55 PM

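I am trying to use awk to find a keyword and return the rows that match it, together with $1 and $2 from the header line of the block each match belongs to. $1 and $2 are the unique IDs, but they appear only once per block, on the header line, and are empty on the data rows. Thank you :)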

file

Code:

name	31	Index	Chromosomal Position	Gene	Inheritance
		122	2106725	TSC2	AD
		124	2115481	TSC2	AD
		121	2105400	TSC2	AD
		82	135782221	TSC1	AD
		81	135782026	TSC1	AD
		126	2138218	TSC2	AD
		123	2113107	TSC2	AD
		125	2126142	TSC2	AD
name2	12	Index	Chromosomal Position	Gene	Inheritance
		1	43396568	SLC2A1	AD, AR
name3	20	Index	Chromosomal Position	Gene	Inheritance
		188	2135240	TSC1	AD
		179	2103379	TSC1	AD
		191	2137899	TSC2	AD
		181	2110617	TSC2	AD
		190	2137857	TSC2	AD
		189	2137806	TSC2	AD
		186	2133798	TSC2	AD
		187	2135074	TSC2	AD
		180	2105400	TSC2	AD
		183	2122822	TSC2	AD
		192	2138218	TSC2	AD
		185	2125937	TSC2	AD
		184	2125788	TSC2	AD
		193	2138269	TSC2	AD
		182	2112981	TSC2	AD

Desired output

Code:

name	31	Index	Chromosomal Position	Gene	Inheritance
		82	135782221	TSC1	AD
		81	135782026	TSC1	AD
name3	20	Index	Chromosomal Position	Gene	Inheritance
		188	2135240	TSC1	AD
		179	2103379	TSC1	AD
		191	2137899	TSC1	AD

awk tried so far

Code:

awk '/TSC1/{ print $1,$2,$0 }' file.txt > output.txt
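
The attempt above prints empty $1 and $2 for every match, because those fields are only filled in on the header line of each block. One possible approach, offered as a sketch and not a definitive answer, is to remember the most recent header line and print it once, just before the first matching row of its block. The sketch assumes the file is tab-separated and that a header is any line whose first field is non-empty; the variable names `kw`, `header` and `printed` are made up for illustration, and the `printf` lines only build a small sample so the example runs on its own.

```shell
# Build a small tab-separated sample in the same layout as the posted file
# (a hypothetical subset, only so the example is self-contained).
printf 'name\t31\tIndex\tChromosomal Position\tGene\tInheritance\n' >  file.txt
printf '\t\t122\t2106725\tTSC2\tAD\n'                                >> file.txt
printf '\t\t82\t135782221\tTSC1\tAD\n'                               >> file.txt
printf 'name2\t12\tIndex\tChromosomal Position\tGene\tInheritance\n' >> file.txt
printf '\t\t1\t43396568\tSLC2A1\tAD, AR\n'                           >> file.txt
printf 'name3\t20\tIndex\tChromosomal Position\tGene\tInheritance\n' >> file.txt
printf '\t\t188\t2135240\tTSC1\tAD\n'                                >> file.txt

awk -F'\t' -v kw="TSC1" '
    # Header line (first field not empty): remember it and reset the flag.
    $1 != "" { header = $0; printed = 0; next }

    # Data row containing the keyword anywhere on the line.
    index($0, kw) {
        if (!printed) { print header; printed = 1 }   # header once per block
        print                                          # then the matching row
    }
' file.txt > output.txt
```

Blocks with no match (name2 here) produce no output at all. Note that `index($0, kw)` is a plain substring test, so a keyword of TSC1 would also match a longer gene name that contains it; if the columns are cleanly tab-separated, testing `$5 == kw` instead would give an exact match on the Gene column. On the input as posted, row 191 is listed as TSC2, so this sketch would not print it even though it appears in the desired output.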

Last edited by cmccabe; 12-29-2015 at 07:59 PM.. Reason: corrected input



