Get rid of repeated entries.


 
# 1  
Old 10-03-2005
I have the following problem. The file contains many lines, already sorted by their first field. Some first-field values are repeated. For each first-field value, I need to keep only the first and the last line that contain it. For example,
...
1 234
1 348
...
...
5 483
...
...
7 132
7 154
7 168
7 244
...

will become

...
1 234
1 348
...
...
5 483
...
...
7 132
7 244
...

I am really pulling my hair out. Could any expert help me here?
# 2  
Old 10-03-2005
Assuming your file is sorted as you say (numerically by the first field, then by the second):

input file - jiji.txt
Code:
1 234
1 348
3 123
3 124
3 125
5 483
7 132
7 154
7 168
7 244
9 123
10 123
10 124
10 125
10 126
10 194

Script - jiji.sh
Code:
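#!/bin/sh

input_file="jiji.txt"

# Loop over each distinct first-field value, in file order.
cut -d' ' -f1 "$input_file" | uniq |\
  while read -r val; do
  # Collect every line whose first field is exactly $val
  # (assumes the value contains no regex metacharacters).
  values=`sed -n "/^${val} / p" "$input_file"`
  if [ `echo "$values" | wc -l` -eq 1 ]; then
     # Only one line for this value: print it once.
     echo "$values"
  else
     # Several lines: keep only the first and the last.
     first=`echo "$values" | sed -n '1 p'`
     last=`echo "$values" | sed -n '$ p'`
     echo "$first"
     echo "$last"
  fi
done

exit 0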

Output
Code:
1 234
1 348
3 123
3 125
5 483
7 132
7 244
9 123
10 123
10 194

Cheers
ZB
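
For larger files, the same result can probably be had in a single pass instead of re-reading the file once per value. The following is only a sketch, not part of ZB's script: the function name `first_last` is made up for illustration, and it assumes the input is whitespace-separated and already grouped by its first field, as in jiji.txt.

```shell
#!/bin/sh

# first_last (hypothetical helper): for each run of lines sharing the same
# first field, print the first line and, if the run has more than one line,
# the last line too. Reads the named files, or standard input if none given.
first_last() {
  awk '
    NR == 1 || $1 != key {
      # A new run starts here: flush the last line of the previous run,
      # but only if that run had more than one line.
      if (NR > 1 && count > 1) print last
      print                 # first line of the new run
      key = $1
      count = 0
    }
    { last = $0; count++ }  # remember the most recent line of the run
    END { if (count > 1) print last }
  ' "$@"
}
```

Calling `first_last jiji.txt` on the sample input above should print the same lines as the Output listing in this post.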
# 3  
Old 10-03-2005
Cool!!! Thanks a lot, ZB!!!