remove duplicate files in a directory
Shell Programming and Scripting, post 101836 by asinha63 on Monday 13th of March 2006 02:32:22 PM

Hi all,
I have to check for duplicate files in a directory.
The directory contains the following files:
/the/folder /containing/the/file
a1.yyyymmddhhmmss
a1.yyyyMMddhhmmss
b1.yyyymmddhhmmss
b2.yyyymmddhhmmss
c.yyyymmddhhmmss
d.yyyymmddhhmmss
d.yyyymmddhhmmss

The date-time stamp can differ for the same file, hence the risk of duplicate files.
How do I validate that there are no duplicate files to be processed?
Anubha

Last edited by asinha63; 03-14-2006 at 10:25 AM. Reason: to make the issue clearer
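
One possible approach, as a minimal sketch rather than a definitive answer. It assumes every file is named stem.yyyymmddhhmmss, so that the part before the last dot identifies the file, that the timestamp is fixed width, and that names contain no newlines. The function names list_duplicate_stems and newest_per_stem are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; they assume names of the form stem.yyyymmddhhmmss.

# list_duplicate_stems DIR
# Prints each stem (the name without its trailing .timestamp) that
# occurs more than once in DIR.
list_duplicate_stems() {
    dir=$1
    for path in "$dir"/*.*; do
        [ -f "$path" ] || continue      # skip non-files and an unmatched glob
        name=${path##*/}                # strip the directory part
        printf '%s\n' "${name%.*}"      # strip the trailing .timestamp
    done | LC_ALL=C sort | uniq -d
}

# newest_per_stem DIR
# Prints, for each stem, only the file name carrying the latest timestamp.
# Fixed-width timestamps sort chronologically when sorted as plain text.
newest_per_stem() {
    dir=$1
    for path in "$dir"/*.*; do
        [ -f "$path" ] || continue
        printf '%s\n' "${path##*/}"
    done | LC_ALL=C sort | awk '
        {
            stem = $0
            sub(/\.[^.]*$/, "", stem)   # drop the trailing .timestamp
            latest[stem] = $0           # sorted input: last one seen is newest
        }
        END { for (s in latest) print latest[s] }
    ' | LC_ALL=C sort
}
```

If list_duplicate_stems prints nothing for the directory, no file appears under more than one timestamp. Otherwise newest_per_stem gives one candidate per file to process; whether the newest copy is the right one to keep is an assumption to check against the actual requirement.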
 

audit.log(4)                        File Formats                        audit.log(4)

NAME
audit.log - audit trail file

SYNOPSIS
#include <bsm/audit.h>
#include <bsm/audit_record.h>

DESCRIPTION
audit.log files are the depository for audit records stored locally or on an NFS-mounted audit server. These files are kept in directories named in the file audit_control(4) using the dir option. They are named to reflect the time they are created and are, when possible, renamed to reflect the time they are closed as well. The name takes the form yyyymmddhhmmss.not_terminated.hostname when open or if the auditd(1M) terminated ungracefully, and the form yyyymmddhhmmss.yyyymmddhhmmss.hostname when properly closed. yyyy is the year, mm the month, dd day in the month, hh hour in the day, mm minute in the hour, and ss second in the minute. All fields are of fixed width.

Audit data is generated in the binary format described below; the default for Solaris audit is binary format. See audit_syslog(5) for an alternate data format.

The audit.log file begins with a standalone file token and typically ends with one also. The beginning file token records the pathname of the previous audit file, while the ending file token records the pathname of the next audit file. If the file name is NULL the appropriate path was unavailable.

The audit.log files contain audit records. Each audit record is made up of audit tokens. Each record contains a header token followed by various data tokens. Depending on the audit policy in place by auditon(2), optional other tokens such as trailers or sequences may be included.

The tokens are defined as follows:

The file token consists of:

    token ID                    1 byte
    seconds of time             4 bytes
    microseconds of time        4 bytes
    file name length            2 bytes
    file pathname               N bytes + 1 terminating NULL byte

The header token consists of:

    token ID                    1 byte
    record byte count           4 bytes
    version #                   1 byte [2]
    event type                  2 bytes
    event modifier              2 bytes
    seconds of time             4 bytes/8 bytes (32-bit/64-bit value)
    nanoseconds of time         4 bytes/8 bytes (32-bit/64-bit value)

The expanded header token consists of:

    token ID                    1 byte
    record byte count           4 bytes
    version #                   1 byte [2]
    event type                  2 bytes
    event modifier              2 bytes
    address type/length         1 byte
    machine address             4 bytes/16 bytes (IPv4/IPv6 address)
    seconds of time             4 bytes/8 bytes (32/64-bits)
    nanoseconds of time         4 bytes/8 bytes (32/64-bits)

The trailer token consists of:

    token ID                    1 byte
    trailer magic number        2 bytes
    record byte count           4 bytes

The arbitrary data token is defined:

    token ID                    1 byte
    how to print                1 byte
    basic unit                  1 byte
    unit count                  1 byte
    data items                  (depends on basic unit)

The in_addr token consists of:

    token ID                    1 byte
    IP address type/length      1 byte
    IP address                  4 bytes/16 bytes (IPv4/IPv6 address)

The expanded in_addr token consists of:

    token ID                    1 byte
    IP address type/length      4 bytes/16 bytes (IPv4/IPv6 address)
    IP address                  16 bytes

The ip token consists of:

    token ID                    1 byte
    version and ihl             1 byte
    type of service             1 byte
    length                      2 bytes
    id                          2 bytes
    offset                      2 bytes
    ttl                         1 byte
    protocol                    1 byte
    checksum                    2 bytes
    source address              4 bytes
    destination address         4 bytes

The expanded ip token consists of:

    token ID                    1 byte
    version and ihl             1 byte
    type of service             1 byte
    length                      2 bytes
    id                          2 bytes
    offset                      2 bytes
    ttl                         1 byte
    protocol                    1 byte
    checksum                    2 bytes
    address type/type           1 byte
    source address              4 bytes/16 bytes (IPv4/IPv6 address)
    address type/length         1 byte
    destination address         4 bytes/16 bytes (IPv4/IPv6 address)

The iport token consists of:

    token ID                    1 byte
    port IP address             2 bytes

The path token consists of:

    token ID                    1 byte
    path length                 2 bytes
    path                        N bytes + 1 terminating NULL byte

The path_attr token consists of:

    token ID                    1 byte
    count                       4 bytes
    path                        count null-terminated string(s)

The process token consists of:

    token ID                    1 byte
    audit ID                    4 bytes
    effective user ID           4 bytes
    effective group ID          4 bytes
    real user ID                4 bytes
    real group ID               4 bytes
    process ID                  4 bytes
    session ID                  4 bytes
    terminal ID
      port ID                   4 bytes/8 bytes (32-bit/64-bit value)
      machine address           4 bytes

The expanded process token consists of:

    token ID                    1 byte
    audit ID                    4 bytes
    effective user ID           4 bytes
    effective group ID          4 bytes
    real user ID                4 bytes
    real group ID               4 bytes
    process ID                  4 bytes
    session ID                  4 bytes
    terminal ID
      port ID                   4 bytes/8 bytes (32-bit/64-bit value)
      address type/length       1 byte
      machine address           4 bytes/16 bytes (IPv4/IPv6 address)

The return token consists of:

    token ID                    1 byte
    error number                1 byte
    return value                4 bytes/8 bytes (32-bit/64-bit value)

The subject token consists of:

    token ID                    1 byte
    audit ID                    4 bytes
    effective user ID           4 bytes
    effective group ID          4 bytes
    real user ID                4 bytes
    real group ID               4 bytes
    process ID                  4 bytes
    session ID                  4 bytes
    terminal ID
      port ID                   4 bytes/8 bytes (32-bit/64-bit value)
      machine address           4 bytes

The expanded subject token consists of:

    token ID                    1 byte
    audit ID                    4 bytes
    effective user ID           4 bytes
    effective group ID          4 bytes
    real user ID                4 bytes
    real group ID               4 bytes
    process ID                  4 bytes
    session ID                  4 bytes
    terminal ID
      port ID                   4 bytes/8 bytes (32-bit/64-bit value)
      address type/length       1 byte
      machine address           4 bytes/16 bytes (IPv4/IPv6 address)

The System V IPC token consists of:

    token ID                    1 byte
    object ID type              1 byte
    object ID                   4 bytes

The text token consists of:

    token ID                    1 byte
    text length                 2 bytes
    text                        N bytes + 1 terminating NULL byte

The attribute token consists of:

    token ID                    1 byte
    file access mode            4 bytes
    owner user ID               4 bytes
    owner group ID              4 bytes
    file system ID              4 bytes
    node ID                     8 bytes
    device                      4 bytes/8 bytes (32-bit/64-bit)

The groups token consists of:

    token ID                    1 byte
    number groups               2 bytes
    group list                  N * 4 bytes

The System V IPC permission token consists of:

    token ID                    1 byte
    owner user ID               4 bytes
    owner group ID              4 bytes
    creator user ID             4 bytes
    creator group ID            4 bytes
    access mode                 4 bytes
    slot sequence #             4 bytes
    key                         4 bytes

The arg token consists of:

    token ID                    1 byte
    argument #                  1 byte
    argument value              4 bytes/8 bytes (32-bit/64-bit value)
    text length                 2 bytes
    text                        N bytes + 1 terminating NULL byte

The exec_args token consists of:

    token ID                    1 byte
    count                       4 bytes
    text                        count null-terminated string(s)

The exec_env token consists of:

    token ID                    1 byte
    count                       4 bytes
    text                        count null-terminated string(s)

The exit token consists of:

    token ID                    1 byte
    status                      4 bytes
    return value                4 bytes

The socket token consists of:

    token ID                    1 byte
    socket type                 2 bytes
    remote port                 2 bytes
    remote Internet address     4 bytes

The expanded socket token consists of:

    token ID                    1 byte
    socket domain               2 bytes
    socket type                 2 bytes
    local port                  2 bytes
    address type/length         2 bytes
    local port                  2 bytes
    local Internet address      4 bytes/16 bytes (IPv4/IPv6 address)
    remote port                 2 bytes
    remote Internet address     4 bytes/16 bytes (IPv4/IPv6 address)

The seq token consists of:

    token ID                    1 byte
    sequence number             4 bytes

The privilege token consists of:

    token ID                    1 byte
    text length                 2 bytes
    privilege set name          N bytes + 1 terminating NULL byte
    text length                 2 bytes
    list of privileges          N bytes + 1 terminating NULL byte

The use-of-auth token consists of:

    token ID                    1 byte
    text length                 2 bytes
    authorization(s)            N bytes + 1 terminating NULL byte

The command token consists of:

    token ID                    1 byte
    count of args               2 bytes
    argument list (count times)
      text length               2 bytes
      argument text             N bytes + 1 terminating NULL byte
    count of env strings        2 bytes
    environment list (count times)
      text length               2 bytes
      env. text                 N bytes + 1 terminating NULL byte

The ACL token consists of:

    token ID                    1 byte
    type                        4 bytes
    value                       4 bytes
    file mode                   4 bytes

The zonename token consists of:

    token ID                    1 byte
    name length                 2 bytes
    name                        <name length> including terminating NULL byte

ATTRIBUTES
See attributes(5) for descriptions of the following attributes:

+-----------------------------+-----------------------------+
|       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
+-----------------------------+-----------------------------+
|Interface Stability          |:                            |
+-----------------------------+-----------------------------+
| binary file format          |Evolving                     |
+-----------------------------+-----------------------------+
| binary file contents        |Unstable                     |
+-----------------------------+-----------------------------+

SEE ALSO
audit(1M), auditd(1M), bsmconv(1M), audit(2), auditon(2), au_to(3BSM), audit_control(4), audit_syslog(5)

NOTES
Each token is generally written using the au_to(3BSM) family of function calls.

SunOS 5.10                           6 Jan 2004                         audit.log(4)
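
The two audit file name forms given under DESCRIPTION can be told apart with a shell pattern match. The following is a minimal sketch, not part of the manual page: the function name classify_audit_name and the labels it prints are made up for illustration, and it only checks that each timestamp is 14 digits, not that it is a valid date.

```shell
#!/bin/sh
# classify_audit_name NAME   (hypothetical helper)
# Prints "open" for yyyymmddhhmmss.not_terminated.hostname,
# "closed" for yyyymmddhhmmss.yyyymmddhhmmss.hostname,
# and "unknown" for anything else.
classify_audit_name() {
    # A fixed-width timestamp: exactly 14 digits.
    ts='[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
    case $1 in
        $ts.not_terminated.?*) echo open ;;     # still open, or auditd did not exit cleanly
        $ts.$ts.?*)            echo closed ;;   # properly closed and renamed
        *)                     echo unknown ;;
    esac
}
```

Note that "open" here also covers files left behind when auditd(1M) terminated ungracefully, since the name alone cannot distinguish the two cases.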
All times are GMT -4.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.