Post 302669653 by pamu on Wednesday 11th of July 2012 07:35:44 AM
find numeric duplicates from 300 million lines....

These are numeric IDs:
Code:
   222932017099186177      
   222932014385467392      
   222932017371820032      
   222932017409556480

I have a text file with 300 million lines like the ones shown above, and I want to find the duplicates in it. Please suggest a quicker way.
I expect sort | uniq -d to take a long time, and it may run out of memory.

Thanks...
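
For reference, the sort | uniq -d pipeline mentioned in the question can be tuned rather than replaced. The sketch below is one possible setup, not a measured answer: it assumes GNU coreutils sort (which, as far as I know, spills to temporary files once its buffer is full instead of holding everything in memory), and the names find_dups, SORT_MEM and ids.txt are made up for illustration.

```shell
#!/bin/sh
# Sketch: print every id that occurs more than once in the given file.
# Assumes GNU coreutils sort (-S and -T); find_dups and SORT_MEM are
# hypothetical names, not something taken from the original post.
find_dups() {
    # awk keeps only the first field, which drops the leading and trailing
    # blanks seen in the sample, and skips empty lines.
    # LC_ALL=C compares raw bytes; that is enough to bring equal ids
    # together and is usually faster than locale-aware collation.
    # -S caps the in-memory buffer; -T chooses where temp files are written.
    awk 'NF { print $1 }' "$1" |
        LC_ALL=C sort -S "${SORT_MEM:-1G}" -T "${TMPDIR:-/tmp}" |
        LC_ALL=C uniq -d
}

# Usage (ids.txt is a placeholder file name):
#   find_dups ids.txt > duplicates.txt
```

The sort here is a plain byte-wise sort rather than a numeric one, because uniq -d only needs equal lines to be adjacent. The directory given to -T needs enough free space for the temporary files, which can be about as large as the input.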
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.