Post 302878763 by Selva_2507 on Sunday 8th of December 2013 09:52:17 PM
Find duplicate rows between files

Hi champs,

I have a requirement where I need to compare two files line by line and ignore duplicates. Note that I have the files in sorted order.

I have tried using the comm command, but it's not working for my scenario.

Input file1

Code:
srv1..development..employee..empname,empid,empdesg
srv1..enhancement..Team_Level..level1,level2,level3
srv0..testing..functional..test1,test2,test3
srv2..analog..analysis_tb..step1,step2,step3
srv2..Zoo..India..chennai,delhi,Agra

Input file2

Code:
srv1..development..employee..empname,empid,empdesg,empsalary
srv1..enhancement..Team_Level..level1,level2,level3
srv2..analog..analysis_tb..step1,step2,step3
srv0..testing..functional..test1,test2,test3
srv2..Zoo..India..chennai,delhi,Agra

Command used:

Code:
comm -13 Inputfile1 Inputfile2

Output I am getting:

Code:
srv1..development..employee..empname,empid,empdesg,empsalary
srv2..analog..analysis_tb..step1,step2,step3
srv0..testing..functional..test1,test2,test3
srv2..Zoo..India..chennai,delhi,Agra

But I need the output as below:

Code:
srv1..development..employee..empname,empid,empdesg,empsalary

Note: I need to ignore duplicate lines wherever they occur in the other file, comparing on the fields separated by these delimiters (.. and ,) rather than checking line by line in order.

Please help me here.
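One way to approach this is sketched below. It is a minimal sketch, not a definitive answer. It assumes that a "duplicate" means a line in Inputfile2 that is character-for-character identical to some line in Inputfile1, so the `..` and `,` delimiters need no special handling. It also assumes that the first line of Inputfile2 carries the extra `empsalary` field, as the desired output implies. The temporary directory and the variable names (`awk_out`, `comm_out`) are hypothetical. Note that `comm` expects both inputs to be sorted in the current collating sequence, and the sample files are not (the `srv0` line follows the `srv1` lines), which would explain the unexpected output.

```shell
#!/bin/sh
# Hypothetical sample files modelled on the data in the post.
tmpdir=$(mktemp -d)
file1="$tmpdir/Inputfile1"
file2="$tmpdir/Inputfile2"

cat > "$file1" <<'EOF'
srv1..development..employee..empname,empid,empdesg
srv1..enhancement..Team_Level..level1,level2,level3
srv0..testing..functional..test1,test2,test3
srv2..analog..analysis_tb..step1,step2,step3
srv2..Zoo..India..chennai,delhi,Agra
EOF

# Assumption: the first line of file2 has the extra empsalary field.
cat > "$file2" <<'EOF'
srv1..development..employee..empname,empid,empdesg,empsalary
srv1..enhancement..Team_Level..level1,level2,level3
srv2..analog..analysis_tb..step1,step2,step3
srv0..testing..functional..test1,test2,test3
srv2..Zoo..India..chennai,delhi,Agra
EOF

# Approach 1: awk. While reading the first file (NR == FNR), remember
# every line. While reading the second file, print only lines that were
# never seen. The order of lines in either file does not matter.
awk_out=$(awk 'NR == FNR { seen[$0]; next } !($0 in seen)' "$file1" "$file2")

# Approach 2: comm, but on sorted copies. The same locale (C) is used
# for sort and comm so that both agree on the ordering.
LC_ALL=C sort "$file1" > "$tmpdir/sorted1"
LC_ALL=C sort "$file2" > "$tmpdir/sorted2"
comm_out=$(LC_ALL=C comm -13 "$tmpdir/sorted1" "$tmpdir/sorted2")

printf 'awk:  %s\n' "$awk_out"
printf 'comm: %s\n' "$comm_out"

rm -rf "$tmpdir"
```

With this sample data both approaches print only the `empsalary` line. The awk version keeps the original order of Inputfile2, while the comm version returns its result in sorted order.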
 
