Help with removing duplicate entries with awk or Perl
Post 302723105 by Amit Pande on Monday 29th of October 2012 11:48:54 AM
Code:
    chr1  11127067  11132181  89  chr1  11128023  11128311  chr1  11130990  11131025  5
    chr1  11127067  11132181  89  chr1  11128023  11128311  chr1  11131583  11131618  1
    chr1  11127067  11132181  89  chr1  11131908  11132010  chr1  11130990  11131025  5
    chr1  11127067  11132181  89  chr1  11131908  11132010  chr1  11131583  11131618  1
    chr1  11127067  11132181  89  chr1  11130992  11131108  chr1  11130990  11131025  5
    chr1  11127067  11132181  89  chr1  11130992  11131108  chr1  11131583  11131618  1
    chr1  11127067  11132181  89  chr1  11128311  11128447  chr1  11130990  11131025  5
    chr1  11127067  11132181  89  chr1  11128311  11128447  chr1  11131583  11131618  1
    chr1  11127067  11132181  89  chr1  11130630  11130711  chr1  11130990  11131025  5
    chr1  11127067  11132181  89  chr1  11130630  11130711  chr1  11131583  11131618  1
    chr1  11127067  11132181  89  chr1  11130729  11130979  chr1  11130990  11131025  5
    chr1  11127067  11132181  89  chr1  11130729  11130979  chr1  11131583  11131618  1
    chr1  11127067  11132181  89  chr1  11131263  11131553  chr1  11130990  11131025  5
    chr1  11127067  11132181  89  chr1  11131263  11131553  chr1  11131583  11131618  1
    chr1  11127067  11132181  89  chr1  11131587  11131709  chr1  11130990  11131025  5
    chr1  11127067  11132181  89  chr1  11131587  11131709  chr1  11131583  11131618  1
    chr1  11127067  11132181  89  chr1  11132034  11132488  chr1  11130990  11131025  5
    chr1  11127067  11132181  89  chr1  11132034  11132488  chr1  11131583  11131618  1

The output should look like this for all the lines. Sorry for the trouble, friends; I am not an expert with computers.
Code:
    chr1  11127067  11132181  89  chr1  11128023  11128311  chr1  11130990  11131025  5
                                  chr1  11131908  11132010
                                  chr1  11130992  11131108  chr1  11131583  11131618  1

 
