04-02-2009
How to get Duplicate rows in a file
Hi all,
I have written a shell script whose output file contains SQL output.
From that file, I want to extract the rows that appear more than once (duplicate rows).
For example, the output file looks like this:
===============================================================
<SH12_MC30_CE_VS_NY_HIST_T>
===============================================================
397 44847
400 33653
401 46455
===============================================================
<SH12_MC30_CE_VS_NY_HIST_T_BKP>
===============================================================
397 44847
398 40107
399 39338
400 33653
In this output, I want only the duplicate numeric rows. The separator lines also repeat, so they would otherwise be counted as duplicate rows too. So I want only the rows from this file that occur more than once and that consist of numbers.
Can anyone please tell me the command?
Thanks in advance.
Regards,
Raghu.
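One possible approach, as a minimal sketch rather than a definitive answer: filter the file down to numeric rows first, then print a row the second time it is seen. The helper name `dup_numeric_rows` and the file name `output.txt` are hypothetical, and the sketch assumes a "numeric row" is exactly two whitespace-separated fields made up of digits, as in the sample above.

```shell
# Hypothetical helper: print each numeric row that occurs more than once.
# Assumption: a numeric row is exactly two whitespace-separated fields of digits.
# Separator lines and <TABLE_NAME> lines never match, so they are ignored.
dup_numeric_rows() {
    awk 'NF == 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && seen[$1 " " $2]++ == 1 {
        print $1, $2
    }' "$1"
}

# Usage with a hypothetical file name:
# dup_numeric_rows output.txt
```

Each duplicated row is printed once, at its second occurrence, so the output keeps the order in which the duplicates show up in the file. For the sample above that would be `397 44847` followed by `400 33653`.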
10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
Hi all,
Can anyone please let me know if there is a way to find duplicate rows in a file? I have a file that has hundreds of numbers (one per line).
I want to find the numbers that are repeated in the file.
eg.
123434
534
5575
4746767
347624
5575
I want 5575.
Please help. (3 Replies)
Discussion started by: infyanurag
2. Shell Programming and Scripting
I have a file content like below.
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""... (5 Replies)
Discussion started by: vamshikrishnab
3. UNIX for Dummies Questions & Answers
Hi,
I am processing a file and would like to delete duplicate records as indicated by one of its column. e.g.
COL1 COL2 COL3
A 1234 1234
B 3k32 2322
C Xk32 TTT
A NEW XX22
B 3k32 ... (7 Replies)
Discussion started by: risk_sly
4. Shell Programming and Scripting
I have searched the internet for duplicate row extracting.
All I have seen is extracting good rows or eliminating duplicate rows.
How do I extract duplicate rows from a flat file in Unix?
I'm using Korn shell on HP Unix.
For.eg.
FlatFile.txt
========
123:456:678
123:456:678
123:456:876... (5 Replies)
Discussion started by: bobbygsk
5. Shell Programming and Scripting
Hi,
I have a log file having size of 48mb.
For such a large log file. I want to get the message in a particular format which includes only unique error and exception messages.
The following things to be done :
1) To remove all the date and time from the log file
2) To remove all the... (1 Reply)
Discussion started by: Pank10
6. Shell Programming and Scripting
Hi! I have a file as below:
line1
line2
line2
line3
line3
line3
line4
line4
line4
line4
I would like to extract duplicate lines (not unique, triplicate or quadruplicate lines). Output will be as below:
line2
line2
I would appreciate if anyone can help. Thanks. (4 Replies)
Discussion started by: chromatin
7. Shell Programming and Scripting
notes: i am using cygwin and notepad++ only for checking this and my OS is XP.
#!/bin/bash
typeset -i totalvalue=(wc -w /cygdrive/c/cygwinfiles/database.txt)
typeset -i totallines=(wc -l /cygdrive/c/cygwinfiles/database.txt)
typeset -i columnlines=`expr $totalvalue / $totallines`
awk -F' ' -v... (5 Replies)
Discussion started by: whitecross
8. Shell Programming and Scripting
Hi,
This is a followup to my earlier post
him mno klm 20 76 . + . klm_mango unix_00000001;
alp fdc klm 123 456 . + . klm_mango unix_0000103;
her tkr klm 415 439 . + . klm_mango unix_00001043;
abc tvr klm 20 76 . + . klm_mango unix_00000001;
abc def klm 83 84 . + . klm_mango... (5 Replies)
Discussion started by: jacobs.smith
9. Shell Programming and Scripting
Hello
I have a file with contents like this...
Part1 Field2 Field3 Field4 (line1)
Part2 Field2 Field3 Field4 (line2)
Part3 Field2 Field3 Field4 (line3)
Part1 Field2 Field3 Field4 (line4)
Part4 Field2 Field3 Field4 (line5)
Part5 Field2 Field3 Field4 (line6)
Part2 Field2 Field3 Field4... (7 Replies)
Discussion started by: ekbaazigar
10. UNIX for Beginners Questions & Answers
How can I get the duplicate rows from a file using Unix? For example, I have data like
a,1
b,2
c,3
d,4
a,1
c,3
e,5
I want the output to be like
a,1
c,3 (4 Replies)
Discussion started by: ggupta
4 Replies
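Two of the complete examples in the list above (discussions 1 and 6) can be sketched with standard tools. This is a minimal sketch under stated assumptions, not the answers given in those threads; the helper names `repeated_values` and `exactly_twice` are hypothetical.

```shell
# Hypothetical helper for discussion 1: list each line that appears more than once.
# uniq -d only compares adjacent lines, so the input is sorted first.
repeated_values() {
    sort "$1" | uniq -d
}

# Hypothetical helper for discussion 6: print both copies of every line that
# occurs exactly twice (not once, three times, or more).
# The order of the output groups is unspecified when several lines qualify.
exactly_twice() {
    awk '{ count[$0]++ }
         END {
             for (line in count)
                 if (count[line] == 2) { print line; print line }
         }' "$1"
}
```

The first helper compares whole lines, so it also fits comma-separated data such as the `a,1` example in discussion 10. The second keeps one counter per distinct line in memory, which is usually fine for files of modest size.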
rs
RS(1) BSD General Commands Manual RS(1)
NAME
rs -- reshape a data array
SYNOPSIS
rs [-[csCS][x] [kKgGw][N] tTeEnyjhHmz] [rows [cols]]
DESCRIPTION
The rs utility reads the standard input, interpreting each line as a row of blank-separated entries in an array, transforms the array
according to the options, and writes it on the standard output. With no arguments it transforms stream input into a columnar format convenient for
terminal viewing.
The shape of the input array is deduced from the number of lines and the number of columns on the first line. If that shape is inconvenient,
a more useful one might be obtained by skipping some of the input with the -k option. Other options control interpretation of the input
columns.
The shape of the output array is influenced by the rows and cols specifications, which should be positive integers. If only one of them is a
positive integer, rs computes a value for the other which will accommodate all of the data. When necessary, missing data are supplied in a
manner specified by the options and surplus data are deleted. There are options to control presentation of the output columns, including
transposition of the rows and columns.
The following options are available:
-cx Input columns are delimited by the single character x. A missing x is taken to be `^I'.
-sx Like -c, but maximal strings of x are delimiters.
-Cx Output columns are delimited by the single character x. A missing x is taken to be `^I'.
-Sx Like -C, but padded strings of x are delimiters.
-t Fill in the rows of the output array using the columns of the input array, that is, transpose the input while honoring any rows and
cols specifications.
-T Print the pure transpose of the input, ignoring any rows or cols specification.
-kN Ignore the first N lines of input.
-KN Like -k, but print the ignored lines.
-gN The gutter width (inter-column space), normally 2, is taken to be N.
-GN The gutter width has N percent of the maximum column width added to it.
-e Consider each line of input as an array entry.
-n On lines having fewer entries than the first line, use null entries to pad out the line. Normally, missing entries are taken from
the next line of input.
-y If there are too few entries to make up the output dimensions, pad the output by recycling the input from the beginning. Normally,
the output is padded with blanks.
-h Print the shape of the input array and do nothing else. The shape is just the number of lines and the number of entries on the first
line.
-H Like -h, but also print the length of each line.
-j Right adjust entries within columns.
-wN The width of the display, normally 80, is taken to be the positive integer N.
-m Do not trim excess delimiters from the ends of the output array.
-z Adapt column widths to fit the largest entries appearing in them.
With no arguments, rs transposes its input, and assumes one array entry per input line unless the first non-ignored line is longer than the
display width. Option letters which take numerical arguments interpret a missing number as zero unless otherwise indicated.
EXAMPLES
The rs utility can be used as a filter to convert the stream output of certain programs (e.g., spell, du, file, look, nm, who, and wc(1))
into a convenient ``window'' format, as in
% who | rs
This function has been incorporated into the ls(1) program, though for most programs with similar output rs suffices.
To convert stream input into vector output and back again, use
% rs 1 0 | rs 0 1
A 10 by 10 array of random numbers from 1 to 100 and its transpose can be generated with
% jot -r 100 | rs 10 10 | tee array | rs -T > tarray
In the editor vi(1), a file consisting of a multi-line vector with 9 elements per line can undergo insertions and deletions, and then be
neatly reshaped into 9 columns with
:1,$!rs 0 9
Finally, to sort a database by the first line of each 4-line field, try
% rs -eC 0 4 | sort | rs -c 0 1
SEE ALSO
jot(1), pr(1), sort(1), vi(1)
BUGS
Handles only two dimensional arrays.
The algorithm currently reads the whole file into memory, so files that do not fit in memory will not be reshaped.
Fields cannot be defined yet on character positions.
Re-ordering of columns is not yet possible.
There are too many options.
BSD
December 30, 1993 BSD