Filter records based on 2nd file Post: 302875145

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Filter out all the records which have a space in the 8th field of my file

I have a file whose fields are separated by a delimiter. Ex:
C;4498;qwa;cghy;;;;40;;222122
C;4498;sample;city;;;;34 2;;222123
C;4498;qwe;xcbv;;;;34-2;;222124
C;4498;jj;sffz;;;;41;;222120
C;4498;eert;qwq;;;;34 A;;222125
C;4498;jj;szxzzd;;;;34;;222127
Out of these records I...

2. Shell Programming and Scripting

Filter records in a file using AWK

I want to filter records in one of my files using the AWK command (or any other command). I am using the below code awk -F@ '$1=="0003"&&"$2==20100402" print {$0}' $INPUT > $OUTPUT I want to pass the 0003 and 20100402 values through a variable. How can I do this? Any help is much...

3. Shell Programming and Scripting

Apply condition on fixed width file and filter records

Dear members.. I have a fixed width file. Requirement is as below:- 1. Scan each record from this fixed width file 2. Check for value under field no "6" equals to "ABC". If yes, then filter this record into the output file Please suggest a unix command to achieve this, my guess awk might...

4. UNIX for Dummies Questions & Answers

Filter records in a huge text file from a filter text file

Hi Folks, I have a text file with lots of rows with duplicates in the first column, i want to filter out records based on filter columns in a different filter text file. bash scripting is what i need. Data.txt Name OrderID Quantity Sam 123 300 Jay 342 498 Kev 78 2500 Sam 420 50 Vic 10...

5. Shell Programming and Scripting

Shell script to filter records in a zip file that contains matching columns from another file

Not sure if this is the correct forum for this question. I have two files. file1.zip, file2 Input: file1.zip col1, col2 , col3 a , b , 0:0:0:0:0:c436:9346:d40b x, y, 0:0:0:0:0:880:39f9:c9a7 m, n , 0:0:0:0:0:80c7:9161:fe00 file2.txt col1 c4:36:93:46:d4:0b...

6. Shell Programming and Scripting

Filter tab file based on column value

Hello I have a tab text file with many columns and have to filter rows ONLY if column 22 has the value of '0', '1', '2' or '3' (out of 0-5). If Column 22 has value '0','1', '2' or '3' (highlighted below), then remove anything less than 10 and greater 100 (based on column 5) AND remove anything...

7. Shell Programming and Scripting

Awk/sed/cut to filter out records from a file based on criteria

I have two files and would need to filter out records based on certain criteria, these column are of variable lengths, but the lengths are uniform throughout all the records of the file. I have shown a sample of three records below. Line 1-9 is the item number "0227546_1" in the case of the first...

8. Shell Programming and Scripting

Filter duplicate records from csv file with condition on one column

I have csv file with 30, 40 columns Pasting just three column for problem description I want to filter record if column 1 matches CN or DN then, check for values in column 2 if column contain 1235, 1235 then in column 3 values must be sequence of 2345, 2345 and if column 2 contains 6789, 6789...

9. Shell Programming and Scripting

CSV File:Filter duplicate records from column1 & another column having unique record

Hi Experts, I have csv file with 30, 40 columns Pasting just 2 column for problem description. Need to print error if below combination is not present in file check for column-1 (DocumentNumber) and filter columns where value in DocumentNumber field is same. For all such rows, the field...

10. UNIX for Beginners Questions & Answers

Filter records from a log file based on timestamp

Dear Experts, I have a log file that contains a timestamp, I would like to filter record from that file based on timestamp. For example refer below file - cat sample.txt Jan 19 20:51:48 mukul-Vostro-14-3468 systemd: pam_unix(systemd-user:session): session opened for user root by (uid=0)...

LEARN ABOUT DEBIAN

cd-hit-2d-para

CD-HIT-2D-PARA.PL(1)						   User Commands					      CD-HIT-2D-PARA.PL(1)

NAME

       cd-hit-2d-para.pl - divide a big clustering job into pieces to run cd-hit-2d or cd-hit-est-2d jobs

SYNOPSIS

       cd-hit-2d-para.pl options

DESCRIPTION

	      This script divides a big clustering job into pieces and submits the jobs to remote computers over a network to run them in parallel.
	      After all the jobs have finished, the script merges the clustering results as if you had run a single cd-hit-2d or cd-hit-est-2d.

	      You can also use it to divide big jobs on a single computer if your computer does not have enough RAM (with the --L option).

   Requirements:
	      1. When you run this script over a network, the directory where you run the script and the input files must be available on all
	      the remote hosts with an identical path.

	      2. If you choose "ssh" to submit jobs, you must have passwordless ssh to every remote host; see the ssh manual for how to set up
	      passwordless ssh.

	      3. I suggest using a queuing system instead of ssh; PBS and SGE are currently supported.

	      4. cd-hit-2d, cd-hit-est-2d, cd-hit-div and cd-hit-div.pl must be in the same directory as this script.

       Options

       -i     input filename for 1st db in fasta format, required

       -i2    input filename for 2nd db in fasta format, required

       -o     output filename, required

       --P    program, "cd-hit-2d" or "cd-hit-est-2d", default "cd-hit-2d"

       --B    filename of list of hosts, required unless the --Q or --L option is supplied

       --L    number of cpus on local computer, default 0. When you are not running it over a cluster, you can use this option to divide a big
	      clustering job into small pieces. I suggest you just use "--L 1" unless you have enough RAM for each cpu.

       --S    number of segments to split 1st db into, default 2

       --S2   number of segments to split 2nd db into, default 8

       --Q    number of jobs to submit to the queuing system, default 0. By default, the program uses ssh mode to submit remote jobs.

       --T    type of queuing system, "PBS", "SGE" are supported, default PBS

       --R    restart file, used after a run has crashed

       -h     print this help

       More cd-hit-2d/cd-hit-est-2d options can be specified on the command line

	      Questions, bugs, contact Weizhong Li at liwz@sdsc.edu
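
As a rough illustration of how the options listed above fit together, the sketch below runs the script on a single computer with "--L 1". The file names (db1.fasta, db2.fasta, db2_vs_db1), the PARA_SCRIPT variable and the run_local helper are made up for this example and do not come from the manual. The --S and --S2 values simply repeat the documented defaults. If the script is not found at the given path, the sketch only prints the command it would have run.

```shell
#!/bin/sh
# Assumed location of the script; per the requirements above, cd-hit-2d,
# cd-hit-est-2d, cd-hit-div and cd-hit-div.pl must sit in the same directory.
PARA_SCRIPT=./cd-hit-2d-para.pl

# Hypothetical file names; replace them with your own FASTA files.
DB1=db1.fasta
DB2=db2.fasta
OUT=db2_vs_db1

run_local() {
    # --P cd-hit-2d  program to run (the documented default)
    # --L 1          one cpu on the local computer, no cluster
    # --S 2          split the 1st db into 2 segments (documented default)
    # --S2 8         split the 2nd db into 8 segments (documented default)
    set -- -i "$DB1" -i2 "$DB2" -o "$OUT" --P cd-hit-2d --L 1 --S 2 --S2 8

    if [ -x "$PARA_SCRIPT" ]; then
        "$PARA_SCRIPT" "$@"
    else
        # Dry run: the script is not installed here, so only show the command.
        printf '%s\n' "$PARA_SCRIPT $*"
    fi
}

run_local
```

To use a queuing system instead, the same command line would presumably take "--Q" with a job count and "--T PBS" or "--T SGE" in place of "--L 1", as described in the option list.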

cd-hit-2d-para.pl 4.6-2012-04-25				    April 2012						      CD-HIT-2D-PARA.PL(1)