Remove duplicate lines from a 50 MB file size
Post 302577582 by vsachan on Tuesday 29th of November 2011 11:14:36 AM
The problem is that the file size is 100 MB to 150 MB.

input file:
---------
1,2, ,TTT,DDFG,
1,2, ,TTT,DDFG,
1,2, ,TTT,DDFG,
7,8, ,TTT,DDFG,
1,2, ,TTT,DDFG,
1,2, ,TTT,DDFG,

output file should be like:
1,2, ,TTT,DDFG,
7,8, ,TTT,DDFG,
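
One common way to get this result is an awk one-liner that prints each line only the first time it appears, so the original order is kept. This is a minimal sketch, not necessarily what the poster ended up using; the file names input.txt and output.txt are assumptions made for illustration.

```shell
# Recreate the sample input from the post (input.txt is a hypothetical name).
cat > input.txt <<'EOF'
1,2, ,TTT,DDFG,
1,2, ,TTT,DDFG,
1,2, ,TTT,DDFG,
7,8, ,TTT,DDFG,
1,2, ,TTT,DDFG,
1,2, ,TTT,DDFG,
EOF

# seen[] is keyed by the whole line; the expression is true only on the
# first occurrence, so awk prints that line and skips later copies.
# Memory use grows with the number of distinct lines, not the file size.
awk '!seen[$0]++' input.txt > output.txt

# Show the de-duplicated result.
cat output.txt
```

If the original line order does not matter, `sort -u input.txt > output.txt` is an alternative; it returns the unique lines in sorted order rather than in order of first appearance.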
 
