Cut big text file into 2 (Post 302321610)

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to cut a text file at a certain spot?

Say I do a date command and get the time from 15 minutes ago. I have a text file with the date printed out every minute or so and I want to cut the file at the date stamp given to me by the 15 minute ago time stamp. Is there an easy way to do this? Example: date +%M gives me 56 I...

2. Shell Programming and Scripting

How to cut some data from big file

How do I cut data from a big file? My file is around 30 GB. I tried "head -50022172 filename > newfile.txt" and then "tail -5454283 newfile.txt". It's slow. After that I tried sed -n '46467831,50022172p' filename > newfile.txt, which is also slow. Please recommend a faster command to cut some data from...

3. Shell Programming and Scripting

Helping in parsing subset of text from a big results file

Hi All, I need some help to effectively parse out a subset of results from a big results file. Below is an example of the text file. Each block that I need to parse starts with "reading sequence file 10.codon" (next block starts with another number) and ends with **p-Value(s)**. I have given...

4. Shell Programming and Scripting

cut the second line in a text file

Hi, I have a problem cutting the second line out of an output file and sending it to a new file. It's a #!/bin/bash script. The file contains: 1 something, 2 something, 3 something. After the cut it should contain: 1 something, 3 something. The new file should contain: 2 something. Thanks in advance.

5. UNIX for Dummies Questions & Answers

Cut text from a file

How can I cut a definite range of text, say from line no. 20 to 1000? It is a trivial question, but I am very new to Unix. Thanks :)

6. Shell Programming and Scripting

Very big text file - Too slow!

Hello everyone, suppose there is a very big text file (>800 MB) in which each line contains an article from Wikipedia. Each article begins with a tag (<..>) containing its URL. Currently there are 10^6 articles in the file. I want to take N random articles, eliminate all non-alphanumeric...

7. UNIX for Advanced & Expert Users

Help using Awk and cut with a text file

Looking for some help on using awk and cut. I have a text file that has fixed information and want to write a script that will prompt the user for an account to search for and print the output. The sample line that has the key information looks like this: Statement to: ...

8. UNIX for Dummies Questions & Answers

How to cut a big file into small ones?

Hello all, Currently I have a txt file named as a.txt with the content as: f e100 aa bb cc dd ee ff f e222 aa dd ff gg f e987 dd aa f e2222 gg ff gg aa dd ff ee ee While, for some reason I want to cut a.txt into small ones, e.g. f1.txt, f2.txt, f3.txt and f4.txt. The routine is to...

9. Shell Programming and Scripting

Cut text from a file and remove

Hello Friends, I am stuck with the problem below. Any help will be appreciated. I have a file which has, say, 100 lines. On the second-to-last line I have text from which I want to remove certain characters, e.g. CAST(CAST( A as varchar(50)) || ',' || CAST(CAST( B as varchar(50)) || ',' ||...
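
Several of the questions above come down to the same two operations: splitting a file at a line boundary, and pulling out a line range without reading more of the file than necessary. What follows is a minimal POSIX shell sketch, not an answer taken from any of the threads. The function names split_in_two and line_range and the .1/.2 output suffixes are made up for illustration.

```shell
#!/bin/sh
# split_in_two FILE PREFIX
# Write the first half of FILE (by line count) to PREFIX.1 and the rest
# to PREFIX.2. The suffixes are illustrative, not a standard convention.
split_in_two() {
    file=$1
    prefix=$2
    total=$(wc -l < "$file")
    half=$(( (total + 1) / 2 ))      # the first part gets the extra line if odd
    head -n "$half" "$file" > "$prefix.1"
    tail -n +"$(( half + 1 ))" "$file" > "$prefix.2"
}

# line_range FIRST LAST FILE
# Print lines FIRST..LAST of FILE, then quit so sed stops reading.
line_range() {
    sed -n "${1},${2}p;${2}q" "$3"
}
```

For example, line_range 20 1000 big.txt prints lines 20 to 1000 and stops at line 1000. Quitting early only helps when the range ends well before the end of the file; a range near the end still requires reading up to that point.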

LEARN ABOUT DEBIAN

cd-hit-para

CD-HIT-PARA.PL(1)						   User Commands						 CD-HIT-PARA.PL(1)

NAME

       cd-hit-para.pl - divide a big clustering job into pieces to run cd-hit or cd-hit-est jobs

SYNOPSIS

       cd-hit-para.pl options

DESCRIPTION

	      This script divides a big clustering job into pieces and submits the jobs to remote computers over a network to run them in parallel.
	      After all the jobs have finished, the script merges the clustering results as if you had run a single cd-hit or cd-hit-est.

	      You can also use it to divide big jobs on a single computer if your computer does not have enough RAM (with the --L option).

   Requirements:
	      1 When running this script over a network, the directory where you run the script and the input files must be available on all the remote hosts under an identical path.

	      2 If you choose "ssh" to submit jobs, you must have passwordless ssh to every remote host; see the ssh manual for how to set up passwordless ssh.

	      3 I suggest using a queuing system instead of ssh; PBS and SGE are currently supported.

	      4 cd-hit, cd-hit-2d, cd-hit-est, cd-hit-est-2d, cd-hit-div and cd-hit-div.pl must be in the same directory as this script.
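
Requirement 2 can be checked before a long run. The sketch below is one possible way to do that and is not part of cd-hit-para.pl. It assumes the host list has one host per line with the host name as the first whitespace-separated field, which this page does not confirm. The function name check_hosts and the SSH variable are hypothetical.

```shell
#!/bin/sh
# check_hosts HOSTFILE
# Report which hosts accept a non-interactive (passwordless) ssh login.
# Assumption: the first field on each line of HOSTFILE is a host name.
SSH=${SSH:-ssh}   # overridable, so the loop can be dry-run without a network

check_hosts() {
    status=0
    while read -r host _rest; do
        [ -n "$host" ] || continue      # skip blank lines
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        # stdin comes from /dev/null so ssh cannot swallow the host list.
        if $SSH -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            status=1
        fi
    done < "$1"
    return "$status"
}
```

Calling check_hosts hosts.txt returns non-zero if any host fails, so it can gate the real run.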

       Options

       -i input filename in fasta format, required

       -o output filename, required

       --P program, "cd-hit" or "cd-hit-est", default "cd-hit"

       --B filename of list of hosts,

	      required unless the --Q or --L option is supplied

       --L number of cpus on local computer, default 0

	      When you are not running it over a cluster, you can use this option to divide a big clustering job into small pieces. I suggest you
	      just use "--L 1" unless you have enough RAM for each cpu.

       --S Number of segments to split input DB into, default 64

       --Q number of jobs to submit to the queuing system, default 0

	      by default, the program uses ssh mode to submit remote jobs

       --T type of queuing system, "PBS", "SGE" are supported, default PBS

       --R restart file, used after a run has crashed

       -h print this help

       More cd-hit/cd-hit-est options can be specified on the command line

	      For questions or bug reports, contact Weizhong Li at liwz@sdsc.edu
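
As an illustration of how the options above combine for a single-machine run, here is a small sketch that only builds and prints a command line. It uses only flags listed in the Options section. The function name cdhit_local_cmd and the file names are placeholders, and the segment count of 16 is an arbitrary choice, not a recommendation from this page.

```shell
#!/bin/sh
# cdhit_local_cmd INPUT OUTPUT SEGMENTS
# Print a cd-hit-para.pl command for a local run that processes one
# piece at a time (--L 1), as suggested for machines short on RAM.
# File names containing spaces are not handled in this sketch.
cdhit_local_cmd() {
    input=$1       # FASTA input, passed to -i
    output=$2      # output file name, passed to -o
    segments=$3    # number of segments, passed to --S
    printf 'cd-hit-para.pl -i %s -o %s --P cd-hit --L 1 --S %s\n' \
        "$input" "$output" "$segments"
}

# Print the command instead of running it, so it can be inspected first.
cdhit_local_cmd proteins.fasta proteins.clustered 16
```

Printing the command first is a deliberate choice: the script must sit in the same directory as the cd-hit binaries, so it is worth checking the line before running it there.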

cd-hit-para.pl 4.6-2012-04-25					    April 2012							 CD-HIT-PARA.PL(1)