Post 302701419 by varu0612 on Sunday 16th of September 2012 03:08:16 AM
Delete unique rows - optimize script

Hi all,

I have the following input, where the row key is the first column:

Code:
cat file.txt
 
[4] A response
[1] C request
[1] C response
[3] D request
[2] C request
[2] C response
[5] E request

The desired output is:

Code:
[1] C request
[1] C response
[2] C request
[2] C response

I have implemented the loop below, which works, but when the input file is larger than 300 MB, removing the unpaired rows takes ages because every iteration of the loop scans the whole file again.

Code:
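#!/bin/bash
# Usage: script input_file output_file
req=$(mktemp)   # all request rows
res=$(mktemp)   # all response rows
new=$(mktemp)   # paired rows collected so far
tmp=$(mktemp)   # responses matching the current key
grep request  "$1" > "$req"
grep response "$1" > "$res"
# Slow part: every request key triggers a full scan of $res and $req.
while read -r key _
do
    # -F matches the bracketed key, e.g. "[1] ", as a literal string
    grep -F -- "$key " "$res" > "$tmp"
    if [[ -s $tmp ]]
    then
        grep -F -- "$key " "$req" >> "$new"
        cat "$tmp" >> "$new"
    fi
done < "$req"
mv "$new" "$2"
rm "$req" "$res" "$tmp"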

Any idea how I can optimize it, or do it differently, to remove the unique rows as in the example above and speed up the process?
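One possible direction, offered only as a sketch and not as the thread's accepted answer: let awk read the file twice instead of running grep once per key. The first pass records which keys have a request and which have a response; the second pass prints the rows whose key has both. The function name `keep_pairs` is hypothetical, and the sketch assumes the last field of every row is exactly `request` or `response`, as in the sample input.

```shell
#!/bin/bash
# keep_pairs INPUT_FILE
# Prints only the rows whose key (1st column) occurs with both a
# request and a response. Hypothetical helper; assumes the last
# field of each row is exactly "request" or "response".
keep_pairs() {
    awk '
        # Pass 1 (first copy of the file): record the row kinds seen per key.
        NR == FNR {
            if ($NF == "request")  has_req[$1] = 1
            if ($NF == "response") has_res[$1] = 1
            next
        }
        # Pass 2 (second copy): print rows whose key has both kinds.
        ($1 in has_req) && ($1 in has_res)
    ' "$1" "$1"
}
```

It could be called as `keep_pairs file.txt > output.txt`. The file is read twice in total, however many keys it contains, and the rows come out in their original order. Memory use grows with the number of distinct keys, which may matter for very large inputs.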
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.