08-02-2011
Quote (originally posted by yazu):
You can split the file (with "split" command), then "sort -u" the chunks separately and then merge them with "sort -m". (Of course whether you need it depends on the memory size of your system).
You probably won't have to split anything manually. Many (if not most) sort implementations (GNU, *BSD, Solaris, HP-UX, to name a few) will do this for you automatically. They compare the size of the file to be sorted against the system's available memory and make a conservative guess at how much to sort in memory at a time. The intermediate files are then created in $TMPDIR.
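For comparison, the manual approach from the quoted post might look roughly like the sketch below. It is only an illustration: the scratch directory, the file names and the tiny sample data are made up, and the 3-line chunk size is deliberately small so the steps are easy to follow. The one addition to what yazu wrote is -u on the merge step, on the assumption that duplicates spanning two chunks should also be dropped.

```shell
#!/bin/sh
# Sketch: sort and de-duplicate a file chunk by chunk, then merge the chunks.
LC_ALL=C; export LC_ALL       # one fixed collation for every sort call

workdir=$(mktemp -d)          # hypothetical scratch directory
input=$workdir/input.txt      # hypothetical sample file

printf '%s\n' pear apple fig apple kiwi pear fig banana > "$input"

# 1. Split into chunks of 3 lines each (real chunks would be far larger).
split -l 3 "$input" "$workdir/chunk."

# 2. Sort each chunk on its own, dropping duplicates within the chunk.
for chunk in "$workdir"/chunk.*; do
    sort -u -o "$chunk" "$chunk"
done

# 3. Merge the pre-sorted chunks; -u drops duplicates that span chunks.
sort -m -u -o "$workdir/merged.txt" "$workdir"/chunk.*

cat "$workdir/merged.txt"
```

On an implementation that spills to temporary files by itself, a plain "sort -u" on the whole input should give the same result in one step.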
As vgersh99 pointed out, there is often a -T option to override the environment variable. If that option is missing, you can simply override the environment default when invoking sort (TMPDIR=/lots/of/space sort ...).
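A minimal sketch of both variants follows. The scratch directory stands in for /lots/of/space and the sample file is made up. Since -T is not available everywhere, the script simply tries it and falls back to a message if the local sort rejects it.

```shell
#!/bin/sh
# Sketch: point sort's intermediate files at a directory with more room.
LC_ALL=C; export LC_ALL

scratch=$(mktemp -d)          # hypothetical stand-in for /lots/of/space
data=$scratch/data.txt        # hypothetical sample file
printf '%s\n' delta alpha charlie alpha bravo > "$data"

# Variant 1: set TMPDIR for this one command only.
TMPDIR=$scratch sort -u -o "$scratch/by_env.txt" "$data"

# Variant 2: use -T where the local sort accepts it (not every one does).
if sort -T "$scratch" -u -o "$scratch/by_flag.txt" "$data" 2>/dev/null; then
    echo "-T accepted"
else
    echo "-T not available here; the TMPDIR variant still applies"
fi

cat "$scratch/by_env.txt"
```

With input this small, sort will most likely never write an intermediate file, so the example only shows the invocation. The setting starts to matter once the input outgrows memory.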
Regards,
Alister