I created a test file with 114,688 records:
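If you want to reproduce a file of roughly that size, a minimal sketch (the two-field record layout and the name testfile.txt are my assumptions, not the actual data used):

# Hypothetical generator: 114,688 two-field records, with repeating
# keys in column 1 so duplicate detection has something to find.
awk 'BEGIN {
    srand()
    for (i = 1; i <= 114688; i++)
        printf "%d %d\n", int(rand() * 1000), i
}' > testfile.txt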
Running vgersh99's arrays solution:
Running Aigles' sort solution:
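In outline, the two approaches being timed look something like this (a sketch only; the key field, file name and exact logic of vgersh99's and aigles' code are assumptions):

# Array approach: count each key in memory, report duplicates at the end.
awk '{ seen[$1]++ } END { for (k in seen) if (seen[k] > 1) print seen[k], k }' testfile.txt

# sort approach: push the work to sort/uniq; sort spills to temporary
# files as needed instead of holding everything in memory.
sort testfile.txt | uniq -c | awk '$1 > 1'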
Guys, the 'sort' command is highly optimized.
Arrays work great for a small number of occurrences, or when constants must be used.
Otherwise, use arrays with caution, especially when several
thousand occurrences are involved.
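If you want to check this on your own data, you can time both approaches directly (a rough harness, again assuming the file name from above):

# Whole-line duplicate detection, array vs. sort/uniq.
time awk '{ seen[$0]++ } END { for (r in seen) if (seen[r] > 1) print seen[r], r }' testfile.txt > /dev/null
time sort testfile.txt | uniq -d > /dev/null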
Dear Gurus,
What a great forum I am in! I really feel proud to be a member of it!
I thank aigles, Shell Life and vgersh99 from the bottom of my heart, because
I had been trying with the code, even moving the previous one around, but to no
avail, and I totally forgot about the array. Once again, my sincere thanks to
all of you for sparing your time and providing me with the solution. One more
clarification I would like, though I have yet to test it:
Will the array approach work with a large volume of data?
Or will the simple script do the job? Will you please give some detail?
Something for my own knowledge.
Aigles,
Try to create a data set similar to the one I tested against, as follows:
I confirm your results:
I think the problem doesn't come from the number of elements in the array.
In this test the array contains only 5 elements, but the elements are very large (up to 1300 KB) and are modified very often.
The situation was the reverse in my previous test:
there were more than 100,000 elements with a maximum size of 200 KB and a low update rate.
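A contrived sketch of the two access patterns aigles describes (hypothetical data, not the actual test scripts): rewriting a handful of huge elements forces awk to copy an ever-growing string on every assignment, while filling many small elements once each stays cheap.

# Slow pattern: few keys, each element grows large and is rewritten per line.
awk '{ big[NR % 5] = big[NR % 5] $0 }
     END { for (k in big) print k, length(big[k]) }' testfile.txt

# Fast pattern: many keys, each element small and written exactly once.
awk '{ small[NR] = $0 }
     END { n = 0; for (k in small) n++; print n " elements" }' testfile.txt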