Hello,
I wonder if it would be possible to add to Gimley's program. I had written a perl script to identify duplicates in a large file which has a structure similar to Gimley's.
Quote:
word=word1,word2,word3
where Word is the headword and word1, word2, word3 are all equivalents of the word.
It so happens that some times two entries for the same headword can be present.
Quote:
word=word1,word2,word3
word=word1,word4,word5
I have written a program in PERL which identifies such dupes and spews them out in a file where singletons and dupes are clearly identified.
However I have not been able to add to it the added functionality of merging the duplicates into one single entry.
Thus the dupes mentioned above should merge to one single entry:
I need to find to find duplicate lines in a document and then print the line numbers of the duplicates
The files contain multiple lines with about 100 numbers on each line I need something that will output the line numbers where duplicates were found ie 1=5=7, 2=34=76
Any suggestions would be... (5 Replies)
Hi All,
Can Anyone please tell me,how can I delete a line from a file.
I am reading the file line by line using whil loop and validating each line..Suppose in the middle i found a particular line is invalid,i need to delete that particular line.
Can anyone please help.
Thanks in advance,... (14 Replies)
Hello,
I have two files. File1 or the master file contains two columns separated by a delimiter:
a=b
b=d
e=f
g=h
File 2 which is the file to be processed has only a single column
a
h
c
b
What I need is an awk script to identify unique names from file 2 which are not found in the... (6 Replies)
Hello,
I have a large database in which name homonyms are arranged in a row. Since the database is large and generated by hand, very often dupes creep in. I want to remove the dupes either using an awk or perl script.
An input is given below
The expected output is given below:
As can be... (2 Replies)
Hi
I have a file:
r58778.3|SOURCES={KEY=f665931a...,fw,221-705}|ERRORS={16_1:T,30_1:T,56_1:C,57_1:T,59_1:A,101_1:A,115:-,158_1:C,186_1:A,204:-,271_1:T,305:-,350_1:C,368_1:G,442_1:C,472_1:G,477_1:A}|SOURCE_1="Contig_1092402550638"(f665931a359e36cea0976db191ff60ff09cc816e)
I want to retain... (15 Replies)
Hello,
I have a database of name variants with the following structure:
variant=variant=variant
The number of variants can be as many as thirty to forty.
Since the database is quite large (at present around 60,000 lines) duplicate sets of variants creep in. Thus
John=Johann=Jon
and... (2 Replies)
Dear all,
I have a large dictionary database which has the following structure
source word=target word
e.g.
book=livre
Since the database is very large in spite of all the care taken, it so happens that at times the source word is repeated
e.g.
book=livre
book=tome
Since I want to... (7 Replies)
Hello,
I have a script which removes duplicates in a database with a single delimiter
=
The script is given below:
# script to remove dupes from a row with structure word=word
BEGIN{FS="="}
{for(i=1;i<=NF;i++){a++;}for(i in a){b=b"="i}{sub("=","",b);$0=b;b="";delete a}}1
How do I modify... (6 Replies)
Hi,
I have a shell script which has a for loop that scans list of files and do find and replace few variables using sed command. While doing this, it deletes the last line of all input file which is something wrong. how to fix this. please suggest. When i add an empty line in all my input file,... (5 Replies)
Discussion started by: rbalaj16
5 Replies
LEARN ABOUT DEBIAN
shell-quote
SHELL-QUOTE(1p) User Contributed Perl Documentation SHELL-QUOTE(1p)NAME
shell-quote - quote arguments for safe use, unmodified in a shell command
SYNOPSIS
shell-quote [switch]... arg...
DESCRIPTION
shell-quote lets you pass arbitrary strings through the shell so that they won't be changed by the shell. This lets you process commands
or files with embedded white space or shell globbing characters safely. Here are a few examples.
EXAMPLES
ssh preserving args
When running a remote command with ssh, ssh doesn't preserve the separate arguments it receives. It just joins them with spaces and
passes them to "$SHELL -c". This doesn't work as intended:
ssh host touch 'hi there' # fails
It creates 2 files, hi and there. Instead, do this:
cmd=`shell-quote touch 'hi there'`
ssh host "$cmd"
This gives you just 1 file, hi there.
process find output
It's not ordinarily possible to process an arbitrary list of files output by find with a shell script. Anything you put in $IFS to
split up the output could legitimately be in a file's name. Here's how you can do it using shell-quote:
eval set -- `find -type f -print0 | xargs -0 shell-quote --`
debug shell scripts
shell-quote is better than echo for debugging shell scripts.
debug() {
[ -z "$debug" ] || shell-quote "debug:" "$@"
}
With echo you can't tell the difference between "debug 'foo bar'" and "debug foo bar", but with shell-quote you can.
save a command for later
shell-quote can be used to build up a shell command to run later. Say you want the user to be able to give you switches for a command
you're going to run. If you don't want the switches to be re-evaluated by the shell (which is usually a good idea, else there are
things the user can't pass through), you can do something like this:
user_switches=
while [ $# != 0 ]
do
case x$1 in
x--pass-through)
[ $# -gt 1 ] || die "need an argument for $1"
user_switches="$user_switches "`shell-quote -- "$2"`
shift;;
# process other switches
esac
shift
done
# later
eval "shell-quote some-command $user_switches my args"
OPTIONS --debug
Turn debugging on.
--help
Show the usage message and die.
--version
Show the version number and exit.
AVAILABILITY
The code is licensed under the GNU GPL. Check http://www.argon.org/~roderick/ or CPAN for updated versions.
AUTHOR
Roderick Schertler <roderick@argon.org>
perl v5.8.4 2005-05-03 SHELL-QUOTE(1p)