Hello!
I am writing a program to run through two large lists of data (~300,000 rows), find where rows in one file match another, and combine them based on matching fields. Due to the large file sizes, I'm guessing AWK will be the most efficient way to do this. Overall, the input and output I'm... (5 Replies)
Hi folks,
Let's say I have the following text file:
name, lastname, 1234, name.lastname@test.com
name1, lastname1, name2.lastname2@test.com, 2345
name, 3456, lastname, name3.lastname3@test.com
4567, name, lastname, name4.lastname4@test.com
I now need the following output:
1234... (5 Replies)
Hi experts,
I need to print the first field, then the last two fields, and then the rest of the fields.
Input :
a1,abc,jsd,fhf,fkk,b1,b2
a2,acb,dfg,ghj,b3,c4
a3,djf,wdjg,fkg,dff,ggk,d4,d5
Expected output:
a1,b1,b2,abc,jsd,fhf,fkk... (6 Replies)
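One way the reordering asked for above might be done in awk is sketched below. The function name `reorder_fields` is made up for illustration, and the sketch assumes comma-separated input; lines with fewer than three fields are passed through unchanged.

```shell
# reorder_fields (hypothetical name): print field 1, then the last two
# fields, then whatever fields sat in between, all comma-separated.
reorder_fields() {
  awk 'BEGIN { FS = OFS = "," }
  NF >= 3 {
    out = $1 OFS $(NF-1) OFS $NF                 # first field + last two
    for (i = 2; i <= NF - 2; i++) out = out OFS $i   # then the middle fields
    print out
    next
  }
  { print }                                      # too short to reorder
  ' "$@"
}

# Sample lines taken from the post above.
printf '%s\n' 'a1,abc,jsd,fhf,fkk,b1,b2' 'a2,acb,dfg,ghj,b3,c4' | reorder_fields
```

Building the line in a variable avoids having to renumber the fields in place, which keeps the loop easy to follow.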
Dear AWK-experts!
I got stuck on the task of combining files after matching fields; I'm still awkward at learning AWK.
There are 2 files: one containing 3 columns with ID, coding status, and score for long noncoding RNAs:
file1 (1.txt) (>5000 lines)
... (12 Replies)
Hi all,
I want to extract rows with the pattern ALPHANUMERIC/ALPHANUMERIC in the 2nd column.
I don't want rows with more than one slash, or without any slash, in the 2nd column.
a a/b
b a/b/c
c a/b//c
d t/y
e r
f /f
I came up with the regex
grep '\/$' file
a a/b
b a/b/c
d t/y (3 Replies)
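A pattern such as `\/$` only matches lines that end in a slash, so it cannot express "exactly one slash with something alphanumeric on both sides". A minimal sketch that tests the 2nd column directly is shown below; the function name `one_slash_rows` is hypothetical, and "alphanumeric" is assumed to mean ASCII letters and digits only.

```shell
# one_slash_rows (hypothetical name): keep rows whose 2nd whitespace-separated
# column is ALPHANUMERIC/ALPHANUMERIC, i.e. exactly one slash with at least
# one letter or digit on each side.
one_slash_rows() {
  awk '$2 ~ /^[A-Za-z0-9]+\/[A-Za-z0-9]+$/' "$@"
}

# Sample rows taken from the post above.
printf '%s\n' 'a a/b' 'b a/b/c' 'c a/b//c' 'd t/y' 'e r' 'f /f' | one_slash_rows
```

On the sample rows this prints only `a a/b` and `d t/y`, because the `^` and `$` anchors force the whole column to fit the pattern.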
Hi
I have a file as below
<field1> <field2> <field3> ... <field_num1> <field_num2>
I am trying to sort on the difference between <field_num1> and <field_num2> in descending order and print all fields.
I tried this, but it doesn't sort on the difference. I'd appreciate your help.
cat... (9 Replies)
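Since the attempted command is cut off above, the following is only a sketch of a common approach (decorate, sort, undecorate), not a fix of the original. It assumes the two numeric fields are the last two whitespace-separated fields on each line; `sort_by_diff` is a made-up name.

```shell
# sort_by_diff (hypothetical name): order lines by (second-to-last field
# minus last field), largest difference first, and print the lines unchanged.
sort_by_diff() {
  tab=$(printf '\t')
  # 1. Prefix each line with the computed difference and a tab.
  awk '{ d = $(NF-1) - $NF; print d "\t" $0 }' "$@" |
    # 2. Sort numerically, descending, on that prefix only.
    sort -t "$tab" -k1,1nr |
    # 3. Drop the prefix again.
    cut -f2-
}

printf '%s\n' 'x 10 7' 'y 20 5' 'z 3 9' | sort_by_diff
```

Computing the key in awk and letting `sort` do the ordering keeps the whole thing portable; the order of lines with equal differences is left to `sort`.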
In the below I am trying to use awk to match all the $13 values in input, which is tab-delimited,
that are in $1 of gene which is just a single column of text.
However only the line with the greatest $9 value in input needs to be printed.
So in the example below all the MECP2 and LTBP1... (0 Replies)
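The example data is cut off above, so this is only a sketch of one reading of the task: for every gene named in the single-column gene file, print the one line of the tab-delimited input whose $9 is greatest. The function name `best_per_gene` and the "one line per gene" interpretation are assumptions.

```shell
# best_per_gene (hypothetical name).
#   $1 = gene list, one name per line
#   $2 = tab-delimited input; gene name in $13, score in $9
best_per_gene() {
  awk -F '\t' '
    NR == FNR { want[$1] = 1; next }        # first file: remember wanted genes
    ($13 in want) && (!($13 in best) || $9 + 0 > max[$13]) {
      max[$13]  = $9 + 0                    # best score seen so far
      best[$13] = $0                        # and the line that carried it
    }
    END { for (g in best) print best[g] }   # output order is unspecified
  ' "$1" "$2"
}
```

It would be called as `best_per_gene gene input`. The `NR == FNR` test assumes the gene file is not empty, and ties keep the first line seen.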
Hi everyone,
Given two files (test1 and test2) with the following contents:
test1:
80263760,I71
80267369,M44
80274628,L77
80276793,I32
80277390,K05
80277391,I06
80279206,I43
80279859,K37
80279866,K35
80279867,J16
80280346,I14
and test2:
80263760,PT18
80279867,PT01
I need to do some... (3 Replies)
Hi,
I have 2 tab-delimited input files as follows.
file1.tab:
green A apple
red B apple
file2.tab:
apple - A;Z
Objective:
Return $1 of file1 if,
. $1 of file2 matches $3 of file1 and,
. any single element (separated by ";") in $3 of file2 is present in $2 of file1
In order to... (3 Replies)
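The objective stated above could be sketched in awk roughly as follows. The function name `match_rows` is hypothetical, and "is present in $2 of file1" is assumed to mean that an element equals $2 exactly rather than being a substring of it.

```shell
# match_rows (hypothetical name).
#   $1 = file2.tab (lookup: key in $1, ";"-separated elements in $3)
#   $2 = file1.tab (rows to test: value in $2, key in $3)
match_rows() {
  awk -F '\t' '
    NR == FNR {                              # first file given: file2.tab
      n = split($3, parts, ";")
      for (i = 1; i <= n; i++) ok[$1, parts[i]] = 1   # (key, element) pairs
      next
    }
    (($3, $2) in ok) { print $1 }            # file1 row satisfies both tests
  ' "$1" "$2"
}
```

Called as `match_rows file2.tab file1.tab`, this should print `green` for the sample data, since `A` is one of the elements of `A;Z` for `apple` while `B` is not.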
Trying to use awk to match the contents of each line in file1 with $5 in file2. Both files are tab-delimited, and there may be a space or special character in the name being matched in file2. For example, in file1 the name is BRCA1 but in file2 the name is BRCA 1, or in file1 the name is BCR but in file2... (6 Replies)
Discussion started by: cmccabe
PTARGREP(1) Perl Programmers Reference Guide PTARGREP(1)
NAME
ptargrep - Apply pattern matching to the contents of files in a tar archive
SYNOPSIS
ptargrep [options] <pattern> <tar file> ...
Options:
--basename|-b ignore directory paths from archive
--ignore-case|-i do case-insensitive pattern matching
--list-only|-l list matching filenames rather than extracting matches
--verbose|-v write debugging message to STDERR
--help|-? detailed help message
DESCRIPTION
This utility allows you to apply pattern matching to the contents of files contained in a tar archive. You might use this to identify all
files in an archive which contain lines matching the specified pattern and either print out the pathnames or extract the files.
The pattern will be used as a Perl regular expression (as opposed to a simple grep regex).
Multiple tar archive filenames can be specified - they will each be processed in turn.
OPTIONS
--basename (alias -b)
When matching files are extracted, ignore the directory path from the archive and write to the current directory using the basename of
the file from the archive. Beware: if two matching files in the archive have the same basename, the second file extracted will
overwrite the first.
--ignore-case (alias -i)
Make pattern matching case-insensitive.
--list-only (alias -l)
Print the pathname of each matching file from the archive to STDOUT. Without this option, the default behaviour is to extract each
matching file.
--verbose (alias -v)
Log debugging info to STDERR.
--help (alias -?)
Display this documentation.
COPYRIGHT
Copyright 2010 Grant McLean <grantm@cpan.org>
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.16.2 2013-08-25 PTARGREP(1)