Script to find duplicate pattern in a file irrespective of case
Post 302714805 by Don Cragun on Friday 12th of October 2012 05:32:49 PM
This problem seems easier to me in awk than in sed:
Code:
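awk -F" *:" '
# Skip continuation lines (and empty lines): their 1st field is empty.
$1=="" {next}
# Count each name case-insensitively; report every occurrence after the 1st.
{       if(list[toupper($1)]++)
                printf("%s on line %d has been seen %d times\n",
                        $1, NR, list[toupper($1)])
}' in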

In case it isn't obvious what is going on here: this script assumes that any line starting with zero or more spaces followed by a colon is a continuation line, and that any other line is the 1st line of a configuration record. It converts the 1st field to uppercase and counts how many times each name has been seen in the 1st field. Each time it finds a duplicate entry, it reports the name, the input line number, and the number of times that name has been seen so far.
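To make that concrete, here is a small sketch that runs the same awk program against a made-up input. The sample records and the use of mktemp for a scratch file are my assumptions for illustration; they are not from the original poster's configuration file.

```shell
# Hypothetical sample input: three spellings of "alpha", two of "beta",
# and one continuation line (leading spaces followed by a colon).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Alpha : value1
      : continuation of Alpha
beta: value2
ALPHA: value3
Beta : value4
alpha: value5
EOF

# Same awk program as in the post, reading the scratch file.
out=$(awk -F" *:" '$1=="" {next}
{       if(list[toupper($1)]++)
                printf("%s on line %d has been seen %d times\n",
                        $1, NR, list[toupper($1)])
}' "$tmp")
rm -f "$tmp"

printf '%s\n' "$out"
# Prints:
# ALPHA on line 4 has been seen 2 times
# Beta on line 5 has been seen 2 times
# alpha on line 6 has been seen 3 times
```

Note that the name is printed as it is spelled on the duplicate line, not in uppercase, and that the continuation line still counts toward the line numbers because NR counts every input line.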

If your configuration file has comments on lines starting with a particular string, this script can easily be modified to skip them.
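As a rough sketch of that modification, assuming comments are lines that begin with "#" (the prefix, the cmt variable name, and the sample input are all my assumptions; adjust them to whatever your file uses):

```shell
# Hypothetical sample input with comment lines that would otherwise be
# counted as records named "# Alpha" and "# alpha".
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# Alpha: commented-out entry
Alpha: value1
# alpha: another comment
ALPHA : value2
EOF

# cmt holds the assumed comment prefix. index($0, cmt) == 1 is true only
# when the line starts with that exact string, so no regex escaping is needed.
out_nc=$(awk -F" *:" -v cmt="#" '
index($0, cmt) == 1 {next}
$1=="" {next}
{       if(list[toupper($1)]++)
                printf("%s on line %d has been seen %d times\n",
                        $1, NR, list[toupper($1)])
}' "$tmp")
rm -f "$tmp"

printf '%s\n' "$out_nc"
# Prints:
# ALPHA on line 4 has been seen 2 times
```

Without the extra rule, line 3 of this sample would be reported as a duplicate of line 1, since both have the same 1st field once it is converted to uppercase.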

Last edited by Don Cragun; 10-12-2012 at 06:41 PM.. Reason: Add explanation of how it works