I have a file whose fields are separated by a delimiter.
Ex:
C;4498;qwa;cghy;;;;40;;222122
C;4498;sample;city;;;;34 2;;222123
C;4498;qwe;xcbv;;;;34-2;;222124
C;4498;jj;sffz;;;;41;;222120
C;4498;eert;qwq;;;;34 A;;222125
C;4498;jj;szxzzd;;;;34;;222127
out of these records I... (3 Replies)
I want to filter records in one of my files using the AWK command (or any other command). I am using the code below:
awk -F@ '$1=="0003" && $2=="20100402" {print $0}' "$INPUT" > "$OUTPUT"
I want to pass the 0003 and 20100402 values through variables. How can I do this?
Any help is much... (1 Reply)
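One common way to do what this question asks is awk's -v option, which assigns a shell value to an awk variable before the program runs. The sketch below is only an illustration: the sample records, the temporary files, and the variable names code and date are assumptions, not taken from the thread.

```shell
# Hypothetical sample data; '@' is the field separator used in the question.
INPUT=$(mktemp)
OUTPUT=$(mktemp)
printf '%s\n' \
    '0003@20100402@keep' \
    '0003@20100403@drop' \
    '3@20100402@drop' > "$INPUT"

# Values that were hard-coded in the original command.
code="0003"
date="20100402"

# -v copies each shell value into an awk variable.
# Concatenating "" forces a string comparison, so "3" does not match "0003".
# A pattern with no action prints the matching record.
awk -F@ -v code="$code" -v date="$date" \
    '$1 == code "" && $2 == date ""' "$INPUT" > "$OUTPUT"

cat "$OUTPUT"
```

Passing the values with -v keeps the awk program in single quotes, so the shell does not expand anything inside it.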
Dear members..
I have a fixed width file. Requirement is as below:-
1. Scan each record from this fixed width file
2. Check whether the value in field no. 6 equals "ABC". If yes, then write this record to the output file
Please suggest a unix command to achieve this, my guess awk might... (6 Replies)
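A fixed-width field can be picked out in awk with substr(). The thread does not give the record layout, so the sketch below assumes six fields of 5 characters each, which puts field 6 in columns 26-30; the sample records and file names are hypothetical as well.

```shell
# Hypothetical fixed-width input: six fields, 5 characters each, space padded.
IN=$(mktemp)
OUT=$(mktemp)
printf '%-5s%-5s%-5s%-5s%-5s%-5s\n' \
    r1 a b c d ABC \
    r2 a b c d XYZ \
    r3 a b c d ABCD > "$IN"

# start and len describe field 6 under the assumed layout.
# Trailing padding is stripped before the comparison, so "ABC  " matches
# but "ABCD " does not.
awk -v start=26 -v len=5 '{
    f = substr($0, start, len)
    sub(/ +$/, "", f)
    if (f == "ABC") print
}' "$IN" > "$OUT"

cat "$OUT"
```

For a real file, start and len would have to be set from the actual column positions of field 6.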
Hi Folks,
I have a text file with lots of rows with duplicates in the first column. I want to filter out records based on filter columns in a separate filter text file.
Bash scripting is what I need.
Data.txt
Name OrderID Quantity
Sam 123 300
Jay 342 498
Kev 78 2500
Sam 420 50
Vic 10... (3 Replies)
Not sure if this is the correct forum for this question. I have two files. file1.zip, file2
Input:
file1.zip
col1, col2 , col3
a , b , 0:0:0:0:0:c436:9346:d40b
x, y, 0:0:0:0:0:880:39f9:c9a7
m, n , 0:0:0:0:0:80c7:9161:fe00
file2.txt
col1
c4:36:93:46:d4:0b... (1 Reply)
Hello
I have a tab text file with many columns and have to filter rows ONLY if column 22 has the value of '0', '1', '2' or '3' (out of 0-5).
If column 22 has the value '0', '1', '2' or '3' (highlighted below), then remove anything less than 10 and greater than 100 (based on column 5) AND remove anything... (1 Reply)
I have two files and need to filter out records based on certain criteria. The columns are of variable lengths, but the lengths are uniform across all the records of a file. I have shown a sample of three records below. Line 1-9 is the item number "0227546_1" in the case of the first... (15 Replies)
I have a CSV file with 30 to 40 columns.
Pasting just three columns for the problem description.
I want to filter records: if column 1 matches CN or DN, then
check the values in column 2. If column 2 contains 1235, 1235, then the values in column 3 must be the sequence 2345, 2345
and if column 2 contains 6789, 6789... (5 Replies)
Hi Experts,
I have a CSV file with 30 to 40 columns.
Pasting just 2 columns for the problem description.
I need to print an error if the combination below is not present in the file.
Check column 1 (DocumentNumber) and filter rows where the value in the DocumentNumber field is the same.
For all such rows, the field... (7 Replies)
Dear Experts,
I have a log file that contains timestamps, and I would like to filter records from that file based on the timestamp. For example, refer to the file below:
cat sample.txt
Jan 19 20:51:48 mukul-Vostro-14-3468 systemd: pam_unix(systemd-user:session): session opened for user root by (uid=0)... (6 Replies)
Discussion started by: mukulverma2408
cd-hit-2d-para
CD-HIT-2D-PARA.PL(1) User Commands CD-HIT-2D-PARA.PL(1)
NAME
cd-hit-2d-para.pl - divide a big clustering job into pieces to run cd-hit-2d or cd-hit-est-2d jobs
SYNOPSIS
cd-hit-2d-para.pl options
DESCRIPTION
This script divides a big clustering job into pieces and submits the jobs to remote computers over a network to run them in parallel. After
all the jobs have finished, the script merges the clustering results as if you had run a single cd-hit-2d or cd-hit-est-2d job.
You can also use it to divide big jobs on a single computer if your computer does not have enough RAM (with the --L option).
Requirements:
1 When running this script over a network, the directory where you
run the script and the input files must be available on all the remote hosts under an identical path.
2 If you choose "ssh" to submit jobs, you must have
passwordless ssh to every remote host; see the ssh manual for how to set up passwordless ssh.
3 I suggest using a queuing system instead of ssh;
PBS and SGE are currently supported.
4 cd-hit-2d, cd-hit-est-2d, cd-hit-div and cd-hit-div.pl must be
in the same directory as this script.
Options
-i input filename for 1st db in fasta format, required
-i2 input filename for 2nd db in fasta format, required
-o output filename, required
--P program, "cd-hit-2d" or "cd-hit-est-2d", default "cd-hit-2d"
--B filename of list of hosts, required unless the --Q or --L option is supplied
--L number of cpus on local computer, default 0. When you are not running it over a cluster, you can use this option to divide a big
clustering job into small pieces. I suggest you just use "--L 1" unless you have enough RAM for each cpu
--S Number of segments to split 1st db into, default 2
--S2 Number of segments to split 2nd db into, default 8
--Q number of jobs to submit to the queuing system, default 0. By default, the program uses ssh mode to submit remote jobs
--T type of queuing system, "PBS", "SGE" are supported, default PBS
--R restart file, used after a crashed run
-h print this help
More cd-hit-2d/cd-hit-est-2d options can be specified on the command line
Questions, bugs, contact Weizhong Li at liwz@sdsc.edu
cd-hit-2d-para.pl 4.6-2012-04-25 April 2012 CD-HIT-2D-PARA.PL(1)