duplicate row based on single column


 
# 1  
Old 04-12-2011

I am new to shell scripting.

I have a .csv file with roughly 1,000 rows and about 7 columns.

Before I insert this data into a table, I have to parse and clean it based on the value of the first column, which is a string in phone-number format.

Example below:
Code:
column 1           column2
(111)222-3333      1000
(222)333-4444      1000
(111)222-3333      2000
(333)444-5555      2000

Here (111)222-3333 is a duplicate, and the row with 2000 takes precedence over the row with 1000,
so I have to remove the row (111)222-3333 1000. How do I achieve this?
Any help is greatly appreciated.

Thank you,
M.

Last edited by Franklin52; 04-13-2011 at 04:23 AM.. Reason: Please use code tags
# 2  
Old 04-12-2011
The code below keeps the last row it sees for each value in column 1, so sort your data first by column 1 and then by column 2 (smallest to largest), for example in Excel.
Code:
awk 'NR==1{print;next}{a[$1]=$2}END{for(i in a) print i,a[i]|"sort -r"}'
column 1 column2
(333)444-5555 2000
(222)333-4444 1000
(111)222-3333 2000
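
If sorting in advance is not an option, a variant along these lines might work. It is only a sketch: it assumes whitespace-separated fields as in the sample, a single header line, and numeric values in column 2. The temporary file and the array names `max` and `order` are made up for the example.

```shell
# Sample data from the first post, written to a temporary file (hypothetical setup).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
column 1 column2
(111)222-3333 1000
(222)333-4444 1000
(111)222-3333 2000
(333)444-5555 2000
EOF

result=$(awk '
NR == 1 { print; next }                 # pass the header line through unchanged
{
    if (!($1 in max)) {                 # first time this phone number is seen
        order[++n] = $1                 # remember first-seen order
        max[$1] = $2 + 0
    } else if ($2 + 0 > max[$1]) {      # a larger column-2 value takes precedence
        max[$1] = $2 + 0
    }
}
END {
    for (i = 1; i <= n; i++) print order[i], max[order[i]]
}' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Because the comparison is done inside awk, the input does not need to be sorted, and the rows come out in the order each phone number first appeared.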

# 3  
Old 04-13-2011
I cannot manipulate the Excel file.
It comes from a third party, and we have to run a batch file to process the data they send before inserting it into our DB.

---------- Post updated at 09:01 AM ---------- Previous update was at 08:57 AM ----------

I tried something like this:
Code:
awk '
{ s[$1]++ }
END {
    for (i in s) {
        if (s[i] > 1) {
            print i
        }
    }
}'

It didn't work. It treated only (111) as the duplicate, not the whole number.

So I changed it to this:
Code:
awk '
{ s[($1)$2"-"$3]++ }
END {
    for (i in s) {
        if (s[i] > 1) {
            print i
        }
    }
}'

That still doesn't help. It behaves as if I had written:
Code:
awk '
{ s[$0]++ }
END {
    for (i in s) {
        if (s[i] > 1) {
            print i
        }
    }
}'
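
One possible explanation, offered as a guess rather than a diagnosis: if the real file is comma-separated and the phone numbers contain a space after the area code, for example `(111) 222-3333,1000`, then awk's default whitespace splitting makes `$1` just `(111)`, and `($1)$2"-"$3` ends up rebuilding nearly the whole line. The sketch below uses made-up sample data in that assumed format and sets the field separator to a comma with `-F','`.

```shell
# Hypothetical data: comma-separated, with a space inside the phone number.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
(111) 222-3333,1000
(222) 333-4444,1000
(111) 222-3333,2000
(333) 444-5555,2000
EOF

# Default whitespace splitting: $1 is only the area code.
first_default=$(awk 'NR == 1 { print $1 }' "$tmp")

# Comma splitting: $1 is the whole phone number, so counting duplicates works.
dups=$(awk -F',' '
{ seen[$1]++ }
END {
    for (k in seen) if (seen[k] > 1) print k
}' "$tmp")

printf 'default $1: %s\n' "$first_default"
printf 'duplicate keys: %s\n' "$dups"
rm -f "$tmp"
```

If the actual file uses a different delimiter or quoting, the `-F` value would need to change accordingly.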