Modify script to remove dupes with two delimiters
Post 302990230 by Don Cragun on Tuesday 24th of January 2017 02:26:26 AM
You didn't answer the question about what type of file is being processed! And, that is even more important now that we know you're working on a Windows system (while posting your question in a forum devoted to UNIX and UNIX-like operating systems).

If you have awk, you must have installed some UNIX utilities on your Windows system. Did you try the sort command I suggested? If so, what did it do? If not, why not?

A common, easy way to remove duplicated lines using awk is:
Code:
awk '!a[$0]++' file

but, of course, that depends on file being a text file (as defined by UNIX systems); if a DOS file lacks a final line terminator, awk may silently drop the last (incomplete) line.
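For anyone unsure why that one-liner works, here is a minimal sketch. The file names and sample data are made up for illustration, and the second command is only one possible way to cope with DOS-style input, assuming the only differences are carriage returns and a missing final line terminator.

```shell
# Hypothetical sample: five lines, two of them repeated.
printf 'a=b\nc=d\na=b\ne=f\nc=d\n' > sample.txt

# a[$0]++ yields the number of times this line has been seen so far
# (0 on first sight) and then increments it. "!" turns that 0 into true,
# and a true pattern with no action prints the line.
awk '!a[$0]++' sample.txt

# Hypothetical DOS-style sample: CRLF terminators, last line unterminated.
printf 'a=b\r\nc=d\r\na=b' > dos_sample.txt

# Strip a trailing carriage return before the lookup so that "a=b\r"
# and the unterminated final "a=b" are treated as the same line.
awk '{ sub(/\r$/, "") } !a[$0]++' dos_sample.txt
```

The first command prints a=b, c=d and e=f, keeping the first occurrence of each line in its original order. Without the sub() call in the second command, the final a=b would not match the earlier a=b followed by a carriage return, so it could show up as a false "unique" line.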

mcopy(1)                      General Commands Manual                      mcopy(1)

NAME
    mcopy - mtools utility to copy DOS files to and from a UNIX operating system

SYNOPSIS
    mcopy [-mntv] sourcefile targetfile
    mcopy [-mntv] sourcefile [sourcefiles...] targetdirectory

OPTIONS
    -m  Preserves the file modification time.
    -n  Specifies that a warning is not issued when an existing file is specified as the target file. If this option is not specified, the mcopy command verifies whether or not to overwrite an existing file.
    -t  Specifies a text file transfer. Line terminators are converted to the appropriate format.
    -v  Specifies verbose mode. The new file name is displayed if the name supplied is invalid.

DESCRIPTION
    The mcopy command copies the specified file to the named file, or copies multiple files to the named directory. The specified files or directories can be either DOS or UNIX files. If the file is a text file, line terminators are converted to the appropriate format.

    Using a drive letter designation on the DOS files such as 'a:' determines the direction of the transfer. A missing drive designation indicates a UNIX file whose path starts in the current directory.

    DOS subdirectory names that contain the '/' or '\' separator are supported. If you use the '\' separator or wildcards, you must enclose file names in quotes to protect them from the shell.

    The mcd command can be used to establish the device and the current working directory (relative to DOS), otherwise the default is A:.

    Not all UNIX file names are supported in the DOS world. The mcopy command may have to change UNIX names to fit the DOS file name conventions. The following table shows some examples of file name conversions:

    -----------------------------------------------
    UNIX name     DOS name   Reason for the change
    -----------------------------------------------
    thisisatest   THISISAT   file name too long
    file.stuff    FILE.STU   extension too long
    prn.txt       XRN.TXT    PRN is a device name
    .abc          X.ABC      null file name
    hot+cold      HOTXCOLD   illegal character
    -----------------------------------------------

RESTRICTIONS
    The following restrictions exist:
    Omitting the destination directory is not supported.
    Using the plus (+) operator is not supported.
    Using a drive letter designation on DOS files is required with this command only, not with other mtools.

EXIT STATUS
    The following exit values are returned:
    Success.
    Failure.

ENVIRONMENT VARIABLES
    The following environment variables affect the execution of mcopy:
    If set, this variable names the file that contains the name of the current mtools working directory as established by the mcd command. If this variable is not set, the file $HOME/.mcwd is used.

FILES
    Contains the name of the current mtools working directory as established by the mcd command. If this file does not exist, the default mtools working directory is A:.
    Executable file

SEE ALSO
    Commands: dos2unix(1), mcd(1), mdiskcopy(1), mread(1), mtools(1), mwrite(1), unix2dos(1)