06-01-2009
Thanks, this has helped me solve my problem.
9 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
Say I do a date command and get the time from 15 minutes ago.
I have a text file with the date printed out every minute or so and I want to cut the file at the date stamp given to me by the 15 minute ago time stamp.
Is there an easy way to do this?
Example:
date +%M gives me 56
I... (2 Replies)
Discussion started by: LordJezo
2 Replies
2. Shell Programming and Scripting
How to cut data from big file
My file is around 30 GB.
I tried "head -50022172 filename > newfile.txt" and then "tail -5454283 newfile.txt", but it is slow.
After that I tried sed -n '46467831,50022172p' filename > newfile.txt, which is also slow.
Please recommend a faster command to cut some data from... (4 Replies)
Discussion started by: almanto
4 Replies
3. Shell Programming and Scripting
Hi All,
I need some help to effectively parse out a subset of results from a big results file.
Below is an example of the text file. Each block that I need to parse starts with "reading sequence file 10.codon" (next block starts with another number) and ends with **p-Value(s)**. I have given... (1 Reply)
Discussion started by: Lucky Ali
1 Reply
4. Shell Programming and Scripting
Hi, I have a problem cutting the second line out of an output file and sending it to a new file. It's a #!/bin/bash script.
1 something
2 something
3 something
and after I cut
1 something
3 something
New file
2 something
Thanks in advance (7 Replies)
Discussion started by: pelle
7 Replies
5. UNIX for Dummies Questions & Answers
How can I cut a definite range of text, say from line no. 20 to 1000?
It is a trivial question, but I am very new to Unix.
Thanks :) (3 Replies)
Discussion started by: JackR
3 Replies
6. Shell Programming and Scripting
Hello everyone,
Suppose there is a very big text file (>800 MB) in which each line contains an article from Wikipedia. Each article begins with a tag (<..>) containing its URL. Currently there are 10^6 articles in the file.
I want to take random N articles, eliminate all non-alpharithmetic... (14 Replies)
Discussion started by: fedonMan
14 Replies
7. UNIX for Advanced & Expert Users
Looking for some help on using awk and cut
I have a text file that has fixed information and want to write a script that will prompt the user for an account to search for and print the output.
The sample line that has the key information looks like this:
Statement to: ... (5 Replies)
Discussion started by: ziggy6
5 Replies
8. UNIX for Dummies Questions & Answers
Hello all,
Currently I have a txt file named as a.txt with the content as:
f e100
aa bb
cc dd
ee ff
f e222
aa dd
ff gg
f e987
dd aa
f e2222
gg ff
gg aa
dd ff
ee ee
For some reason, I want to cut a.txt into smaller files, e.g. f1.txt, f2.txt, f3.txt and f4.txt. The routine is to... (6 Replies)
Discussion started by: locohd
6 Replies
9. Shell Programming and Scripting
Hello Friends,
I am stuck with the problem below. Any help will be appreciated.
I have a file which has, say, 100 lines.
The second-to-last line is one from which I want to remove certain characters.
e.g
CAST(CAST( A as varchar(50)) || ',' ||
CAST(CAST( B as varchar(50)) || ',' ||... (8 Replies)
Discussion started by: vital_parsley
8 Replies
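Several of the threads listed above ask how to pull a range of lines out of a large file. The following is a minimal sketch, not the answer given in any of those threads: it uses the same sed -n 'M,Np' form quoted in thread 2 and adds a q command so that sed stops reading once the last wanted line has been printed. The function name extract_lines and the sample data are made up for illustration.

```shell
#!/bin/sh
# extract_lines FIRST LAST FILE
# Print lines FIRST..LAST of FILE, then quit so the rest of the
# file is never read (this matters for multi-gigabyte inputs).
extract_lines() {
    sed -n "${1},${2}p;${2}q" "$3"
}

# Small demonstration on a throwaway file (hypothetical data).
tmp=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$tmp"
extract_lines 3 5 "$tmp"
rm -f "$tmp"
```

For the 30 GB case in thread 2, the call would look like extract_lines 46467831 50022172 filename > newfile.txt. sed still has to scan from the start of the file up to the first wanted line, so the saving comes only from skipping everything after the last one.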
LEARN ABOUT DEBIAN
cd-hit-para
CD-HIT-PARA.PL(1) User Commands CD-HIT-PARA.PL(1)
NAME
cd-hit-para.pl - divide a big clustering job into pieces to run cd-hit or cd-hit-est jobs
SYNOPSIS
cd-hit-para.pl options
DESCRIPTION
This script divides a big clustering job into pieces and submits the jobs to remote computers over a network to run them in parallel. After
all the jobs have finished, the script merges the clustering results as if you had run a single cd-hit or cd-hit-est.
You can also use it to divide big jobs on a single computer if your computer does not have enough RAM (with the --L option).
Requirements:
1 When running this script over a network, the directory where you
run the script and the input files must be available on all the remote hosts with an identical path.
2 If you choose "ssh" to submit jobs, you must have
passwordless ssh to every remote host; see the ssh manual to learn how to set up passwordless ssh.
3 I suggest using a queuing system instead of ssh;
PBS and SGE are currently supported.
4 cd-hit, cd-hit-2d, cd-hit-est, cd-hit-est-2d,
cd-hit-div and cd-hit-div.pl must be in the same directory as this script.
Options
-i input filename in fasta format, required
-o output filename, required
--P program, "cd-hit" or "cd-hit-est", default "cd-hit"
--B filename of list of hosts,
required unless the --Q or --L option is supplied
--L number of cpus on local computer, default 0
when you are not running it over a cluster, you can use this option to divide a big clustering job into small pieces. I suggest you
just use "--L 1" unless you have enough RAM for each cpu
--S Number of segments to split input DB into, default 64
--Q number of jobs to submit to the queuing system, default 0
by default, the program uses ssh mode to submit remote jobs
--T type of queuing system, "PBS", "SGE" are supported, default PBS
--R restart file, used to resume after a crashed run
-h print this help
More cd-hit/cd-hit-est options can be specified on the command line
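
As a rough illustration of how the options above combine for a single-computer run, the sketch below assembles a command line and prints it rather than executing it, so it works as a dry run even where cd-hit is not installed. The file names proteins.fasta and proteins_clustered are hypothetical placeholders; the option values are the ones suggested or listed as defaults above.

```shell
#!/bin/sh
# Hypothetical file names; replace with real paths.
input="proteins.fasta"
output="proteins_clustered"

# Options taken from the list above:
#   -i / -o  input FASTA file and output name (both required)
#   --P      program to run, "cd-hit" or "cd-hit-est"
#   --L 1    one local CPU, the value suggested when RAM is limited
#   --S 64   number of segments to split the input into (the default)
cmd="cd-hit-para.pl -i $input -o $output --P cd-hit --L 1 --S 64"

# Print the command instead of running it (dry run).
echo "$cmd"
```

To run the job for real, execute the printed command from the directory that holds cd-hit-para.pl and the cd-hit programs, as requirement 4 describes.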
Questions, bugs, contact Weizhong Li at liwz@sdsc.edu
cd-hit-para.pl 4.6-2012-04-25 April 2012 CD-HIT-PARA.PL(1)