Remove duplicated records and update last line record counts
Post 303032029 by Don Cragun on Saturday 9th of March 2019 07:04:23 PM
Your description and code are not clear enough to be sure that this is what you want, but it works with the sample data provided:
Code:
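awk '
BEGIN {	FS = OFS = ","		# read and write comma-separated fields
}
$1 == "D" {
	if($2 in a)		# key already seen: drop this duplicate
		next
	a[$2]			# remember the key
	printed++		# count the D records that get printed
}
$1 == "T" {
	$2 = printed		# replace the trailer count
}
# print every record that was not skipped above
1' file.CSV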

Clearly, field #2 by itself is not the key for determining duplicate records; the key is at least field #2, and then only when field #1 is "D". And since your code stores the entire line in the a[] array for some reason, maybe you only want to delete identical lines instead of deleting lines with identical keys?

The above code assumes you just want to delete lines with identical keys, where a line is a duplicate if field #1 is "D" and its field #2 has already been seen on an earlier "D" line. On the line where field #1 is "T", field #2 is replaced with the number of unique "D" lines seen before that line. Lines where field #1 is neither "D" nor "T" are copied to the output unchanged and are not counted.
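To make the difference between the two interpretations concrete, here is a small sketch that runs both on made-up data. The sample records, the `dedup` function name and the `mode` variable are hypothetical and only for illustration; the H/D/T layout is an assumption based on the description, not your real file.

```shell
# Hypothetical sample input; the H/D/T record layout is an assumption.
sample='H,header
D,100,alpha
D,200,beta
D,100,alpha
D,100,other
D,300,gamma
T,5'

# dedup MODE (hypothetical helper):
#   key  - drop D records whose field 2 was already seen
#   line - drop only D records that repeat an earlier line exactly
dedup() {
    awk -v mode="$1" '
    BEGIN { FS = OFS = "," }
    $1 == "D" {
        k = (mode == "line") ? $0 : $2
        if (k in a)
            next
        a[k]
        printed++
    }
    $1 == "T" {
        $2 = printed
    }
    1'
}

by_key=$(printf '%s\n' "$sample" | dedup key)
by_line=$(printf '%s\n' "$sample" | dedup line)

printf 'by key:\n%s\n\nby line:\n%s\n' "$by_key" "$by_line"
```

With this input, the key-based run keeps three "D" lines (100, 200, 300) and rewrites the trailer as T,3, while the whole-line run also keeps D,100,other and rewrites the trailer as T,4.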

You should always tell us what operating system and shell you're using when you start a new thread in this forum. The behavior of many utilities varies from operating system to operating system and the features provided by shells vary from shell to shell.

If you want to try the above code on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.