10-06-2011
Awesome, works nicely... and I'll run it on my 12,000 line files and see how it goes. Thanks again.
BP_FAST_LOAD_GFF(1p) User Contributed Perl Documentation BP_FAST_LOAD_GFF(1p)
NAME
bp_fast_load_gff.pl - Fast-load a Bio::DB::GFF database from GFF files.
SYNOPSIS
% bp_fast_load_gff.pl -d testdb dna1.fa dna2.fa features1.gff features2.gff ...
DESCRIPTION
This script loads a Bio::DB::GFF database with the features contained in a list of GFF files and/or FASTA sequence files. You must use the
exact variant of GFF described in Bio::DB::GFF. Various command-line options allow you to control which database to load and whether to
allow an existing database to be overwritten.
This script is similar to load_gff.pl, but is much faster. However, it is hard-coded to use MySQL and probably only works on Unix
platforms due to its reliance on pipes. See bp_load_gff.pl for an incremental loader that works with all databases supported by
Bio::DB::GFF, and bp_bulk_load_gff.pl for a fast MySQL loader that supports all platforms.
NOTES
If the filename is given as "-" then the input is taken from standard input. Compressed files (.gz, .Z, .bz2) are automatically
uncompressed.
FASTA format files are distinguished from GFF files by their filename extensions. Files ending in .fa, .fasta, .fast, .seq, .dna and their
uppercase variants are treated as FASTA files. Everything else is treated as a GFF file. If you wish to load FASTA files from STDIN,
then use the -f command-line switch with an argument of '-', as in
gunzip my_data.fa.gz | bp_fast_load_gff.pl -d test -f -
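The extension rule above can be sketched as a small shell function. This is only an illustration of the rule as the notes state it, not code taken from bp_fast_load_gff.pl; the name classify_input is hypothetical, and lower-casing the extension (which also accepts mixed case such as .Fa) is an assumption.

```shell
# Hypothetical helper: guess how a file name would be treated under the
# extension rule described above. Not part of bp_fast_load_gff.pl.
classify_input() {
    name=$1
    # Take the text after the last dot; names without a dot have no extension.
    case "$name" in
        *.*) ext=${name##*.} ;;
        *)   ext= ;;
    esac
    # Lower-case the extension so that .FA, .FASTA, ... match too.
    # (This also accepts mixed case such as .Fa, which is an assumption.)
    ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        fa|fasta|fast|seq|dna) echo fasta ;;
        *)                     echo gff ;;
    esac
}

classify_input dna1.fa         # prints: fasta
classify_input features1.gff   # prints: gff
```

A check like this can help spot FASTA files whose names would be read as GFF (for example a sequence file named genome.txt) before starting a long load.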
The nature of the load requires that the database be on the local machine and that the indicated user have the "file" privilege to load the
tables and have enough room in /usr/tmp (or whatever is specified by the $TMPDIR environment variable), to hold the tables transiently.
If your MySQL is version 3.22.6 and was compiled using the "load local file" option, then you may be able to load remote databases with
local data using the --local option.
About maxfeature: the default value is 100,000,000 bases. If you have features that are close to or greater than 100Mb in length, then the
value of maxfeature should be increased to 1,000,000,000. This value must be a power of 10.
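Since the value must be a power of 10, it may be worth validating it before passing it on. The following is a minimal sketch of such a pre-flight check; the function name is_power_of_10 is hypothetical, and the script itself may validate the value differently or not at all.

```shell
# Hypothetical pre-flight check; not part of bp_fast_load_gff.pl.
# Succeeds when the argument is a 1 followed only by zeros, i.e. a power of 10.
is_power_of_10() {
    printf '%s\n' "$1" | grep -Eq '^10*$'
}

for value in 100000000 1000000000 250000000; do
    if is_power_of_10 "$value"; then
        echo "$value: ok"
    else
        echo "$value: not a power of 10"
    fi
done
```

The first two values are the default and the suggested larger setting from the note above; the third is an arbitrary counterexample.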
If the list of GFF or fasta files exceeds the kernel limit for the maximum number of command-line arguments, use the --long_list
/path/to/files option.
The adaptor used is dbi::mysqlopt. There is currently no way to change this.
COMMAND-LINE OPTIONS
Command-line options can be abbreviated to single-letter options, e.g. -d instead of --database.
    --database <dsn>   MySQL database name
    --create           Reinitialize/create data tables without asking
    --local            Try to load a remote database using local data.
    --user             Username to log in as
    --fasta            File or directory containing fasta files to load
    --password         Password to use for authentication
    --long_list        Directory containing a very large number of
                       GFF and/or FASTA files
    --maxfeature       Set the value of the maximum feature size (default 100Mb; must be a power of 10)
    --group            A list of one or more tag names (comma or space separated)
                       to be used for grouping in the 9th column.
    --gff3_munge       Activate GFF3 name munging (see Bio::DB::GFF)
    --summary          Generate summary statistics for drawing coverage histograms.
                       This can be run on a previously loaded database or during
                       the load.
    --Temporary        Location of a writable scratch directory
SEE ALSO
Bio::DB::GFF, bulk_load_gff.pl, load_gff.pl
AUTHOR
Lincoln Stein, lstein@cshl.org
Copyright (c) 2002 Cold Spring Harbor Laboratory
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See DISCLAIMER.txt for
disclaimers of warranty.
perl v5.14.2 2012-03-02 BP_FAST_LOAD_GFF(1p)