Full Discussion: Multiple line duplicates
Post 302611243 by birei on Thursday 22nd of March 2012 12:18:21 PM
Hi maximus73,

So you want to learn; that's good. Here is an awk script that does what you want, if I understood your question correctly (the code is commented):
Code:
$ cat infile
xyc 1
xyd 1
xye 1
xyf 1
xyf 1
xyf 1
xyf 2
xyg 1
xyh 1
leg 2
$ cat script.awk
## For every line...
{
        ## Concatenate each number in a hash. The key will
        ## be the first field.
        data[ $1 ] = data[ $1 ] $2
}

## After processing file...
END {
        ## Flag that is set when a key is missing one of the two numbers.
        bad = 0

        ## Go through the hash.
        for ( idx in data ) {

                ## Check if both numbers are found, when one of them fails, set
                ## 'bad' variable and break the loop.
                for ( i = 1; i <= 2; i++) {
                        if ( ! index( data[idx], i ) ) {
                                bad = 1
                                break
                        }
                }

                ## Check 'bad' variable. If unset, both numbers were found, so
                ## print the key of the hash and reset variable for next loop.
                if ( ! bad ) {
                        printf "%s\n", idx
                }
                bad = 0
        }
}
$ awk -f script.awk infile
xyf
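
One caveat with the script above: because the numbers are concatenated into a single string and searched with index(), a second-column value such as 12 or 21 would count as containing both 1 and 2. The sample data only has single digits, so this does not matter here. If your real file can hold other values (an assumption on my part), a possible variation is to store each (key, number) pair as its own array element. The function name keys_with_both is hypothetical, just for illustration:

```shell
# Hypothetical helper (not part of the original script): print every key
# from column 1 that appears with both 1 and 2 in column 2.
keys_with_both() {
    awk '
        # Remember each (key, number) pair separately, plus the set of keys.
        { seen[$1, $2] = 1; keys[$1] = 1 }

        # After processing the input, print the keys that have both numbers.
        END {
            for (k in keys)
                if ((k, 1) in seen && (k, 2) in seen)
                    print k
        }
    ' "$@"
}
```

Calling it as "keys_with_both infile" on the sample above prints xyf. Note that awk does not guarantee any order for "for (k in keys)", so pipe the result through sort if several keys match and the order matters.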

 
