12-01-2007
This is my script.
It's a very simple shell script:
# Collect the values of field 2 (";"-separated) that appear more than once.
cut -f2 -d";" "$1" | sort | uniq -d > /tmp/mdn2
# For each duplicated value, print its first word and append the matching input lines to "duplicate".
while read -r x rest
do
echo "$x"
grep -- "$x" "$1" >> duplicate
done < /tmp/mdn2
rm -f /tmp/mdn2
$1 is the input file and duplicate is the output file.
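For comparison, here is a rough sketch of a different technique: reading the file twice with awk instead of looping over grep. It is not the script above. It compares field 2 exactly, whereas grep matches the value anywhere in the line. The function name find_dups and the sample data are made up for illustration, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Sketch: print every line whose second ";"-separated field occurs more than once.
# find_dups and the sample rows are hypothetical, not from the original post.
find_dups() {
    # First read of the file (NR == FNR): count each value of field 2.
    # Second read: print the lines whose field 2 was counted more than once.
    awk -F';' 'NR == FNR { seen[$2]++; next } seen[$2] > 1' "$1" "$1"
}

sample=$(mktemp) || exit 1
cat > "$sample" <<'EOF'
a;100;x
b;200;y
c;100;z
d;300;w
EOF

find_dups "$sample"   # prints a;100;x and c;100;z
rm -f "$sample"
```

The file is passed twice so the counts are complete before any line is printed, which keeps the output in the original input order.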
cut
CUT(1) BSD General Commands Manual CUT(1)
NAME
cut -- select portions of each line of a file
SYNOPSIS
cut -b list [-n] [file ...]
cut -c list [file ...]
cut -f list [-d delim] [-s] [file ...]
DESCRIPTION
The cut utility selects portions of each line (as specified by list) from each file and writes them to the standard output. If no file
arguments are specified, or a file argument is a single dash ('-'), cut reads from the standard input. The items specified by list can be
in terms of column position or in terms of fields delimited by a special character. Column numbering starts from 1.
The list option argument is a comma or whitespace separated set of increasing numbers and/or number ranges. Number ranges consist of a
number, a dash ('-'), and a second number and select the fields or columns from the first number to the second, inclusive. Numbers or number
ranges may be preceded by a dash, which selects all fields or columns from 1 to the first number. Numbers or number ranges may be followed
by a dash, which selects all fields or columns from the last number to the end of the line. Numbers and number ranges may be repeated,
overlapping, and in any order. It is not an error to select fields or columns not present in the input line.
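The list forms described above can be tried on a single sample line. This is only an illustrative sketch; the variable name line and its value are made up, and -d (described under the options) is used to set ':' as the delimiter.

```shell
#!/bin/sh
# Hypothetical sample line with five ':'-separated fields.
line='a:b:c:d:e'

printf '%s\n' "$line" | cut -d: -f2       # single field       -> b
printf '%s\n' "$line" | cut -d: -f-2      # leading dash: 1-2  -> a:b
printf '%s\n' "$line" | cut -d: -f4-      # trailing dash: 4-  -> d:e
printf '%s\n' "$line" | cut -d: -f1,3-4   # number plus range  -> a:c:d
printf '%s\n' "$line" | cut -d: -f3,1     # any order accepted -> a:c
```

Note the last call: the list may be written in any order, but the selected fields are still written in the order they appear in the input.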
The options are as follows:
-b list
The list specifies byte positions.
-c list
The list specifies character positions.
-d delim
Use the first character of delim as the field delimiter character instead of the tab character.
-f list
The list specifies fields, delimited in the input by a single tab character. Output fields are separated by a single tab character.
-n Do not split multi-byte characters.
-s Suppress lines with no field delimiter characters. Unless specified, lines with no delimiters are passed through unmodified.
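A small sketch of how -s changes the handling of lines without a delimiter. The two input lines are invented for the example.

```shell
#!/bin/sh
# Two hypothetical input lines: the first has no '=' delimiter, the second does.

# Without -s the delimiter-less line is passed through unmodified.
printf 'no delimiter here\nk=v\n' | cut -d= -f2      # prints "no delimiter here", then "v"

# With -s the delimiter-less line is suppressed.
printf 'no delimiter here\nk=v\n' | cut -d= -f2 -s   # prints only "v"
```

This matters when cut is used to extract a column from mixed input, such as a file with header or comment lines that do not contain the delimiter.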
ENVIRONMENT
The LANG, LC_ALL and LC_CTYPE environment variables affect the execution of cut if the -n option is specified. Their effect is described in
environ(7).
EXAMPLES
Extract users' login names and shells from the system passwd(5) file as ``name:shell'' pairs:
cut -d : -f 1,7 /etc/passwd
Show the names and login times of the currently logged in users:
who | cut -c 1-16,26-38
DIAGNOSTICS
The cut utility exits 0 on success, and >0 if an error occurs.
SEE ALSO
paste(1)
STANDARDS
The cut utility conforms to IEEE Std 1003.2-1992 (``POSIX.2'').
HISTORY
A cut command appeared in AT&T System III UNIX.
BUGS
The -c option is a synonym for the -b option, which causes incorrect behaviour in locales that support multibyte characters.
When operating on fields (-f option is specified), cut does not recognise multibyte characters, and the delim character is recognised in the
middle of multibyte sequences.
BSD June 6, 1993 BSD