Remove duplicated records and update last line record counts


 
# 8  
Old 03-10-2019
Quote:
Originally Posted by nezabudka
Hi Don, thanks for the explanation.
Code:
awk 'BEGIN {FS=OFS=","} /^T/ {$2=length(A)} !A[$0]++'

Hi nezabudka,
Always glad to help.

This is another interesting way to do it. Unfortunately, the standards do not specify the behavior of the awk length built-in function when given an array name as an argument. This use is described on the GNU gawk man page and works in BSD awk version 20070501 (but is not documented in the BSD awk man page) that is installed on macOS Mojave (version 10.14.3).

I have no idea whether or not this will work (as an undocumented feature) on green_k's Solaris system in /usr/xpg4/bin/awk or nawk. I also do not know if gawk is installed on green_k's system.
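If length(array) turns out not to be available, keeping a counter alongside the array avoids the question entirely. This is only a sketch and has not been tried on Solaris. The function name dedupe_records, the variable names seen and n, and the sample input are made up for illustration. It assumes the same layout as the one-liner above: comma-separated fields and a single "T" record at the end with the count in field 2.

```shell
#!/bin/sh
# Hypothetical helper: drop duplicate records and rewrite the count in the
# trailing T record without calling length() on an array.
dedupe_records() {
    awk 'BEGIN { FS = OFS = "," }
         # n holds the number of unique records printed so far;
         # n + 0 forces a numeric 0 when no record preceded the trailer.
         /^T/ { $2 = n + 0 }
         # Print a record only the first time it is seen, and count it.
         !seen[$0]++ { n++; print }' "$@"
}

# Made-up sample input: three D records (one repeated) and a trailer.
printf 'D,a,1\nD,b,2\nD,a,1\nT,3\n' | dedupe_records
# prints:
# D,a,1
# D,b,2
# T,2
```

The trailer is counted only after its field has been updated, so the value written is the number of unique records that precede it.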
# 9  
Old 03-10-2019
On top of what Don Cragun said, the last approach would not account for "duplicate duplicates".


Illogic nonsense... please disregard.

Last edited by RudiC; 03-10-2019 at 08:33 AM..
# 10  
Old 03-10-2019
Quote:
Originally Posted by RudiC
On top of what Don Cragun said, the last approach would not account for "duplicate duplicates".
Hi RudiC,
I'm not sure what you mean. I don't see any reason why the code shown in post #7 should fail as long as all of the following are true:
  1. There are only "D" and "T" records in the input file.
  2. There is only one "T" record in the input file.
  3. The "T" record is the last record in the input file.
  4. The awk being used returns the number of elements in the array when length(array_name) is called.
The first three are true in the sample data provided in this thread. The fourth is true with gawk starting with version 3.1.5 according to the Linux 2.6 gawk man page available in the UNIX and Linux Man Pages repository. By experiment, it also works on the awk version 20070501 provided with macOS Mojave version 10.14.3.
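Condition 4 can be checked by experiment on any given system before relying on it. The probe below is a rough sketch: the function name awk_has_array_length and the AWK variable are hypothetical, and since the standards leave the behavior unspecified, an awk without the feature might print something else or fail with a syntax error. Both cases are treated as "not usable".

```shell
#!/bin/sh
# Hypothetical probe: does this awk return the element count for length(array)?
# Set AWK to try another implementation, e.g. AWK=nawk or AWK=/usr/xpg4/bin/awk.
awk_has_array_length() {
    # Build a two-element array and ask for its length; discard any
    # diagnostics from an awk that rejects the construct.
    result=$("${AWK:-awk}" 'BEGIN { a["x"] = 1; a["y"] = 1; print length(a) }' 2>/dev/null)
    [ "$result" = "2" ]
}

if awk_has_array_length; then
    echo "length(array) returns the element count"
else
    echo "length(array) is not usable here"
fi
```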

Unlike the code in post #5, this code does not subtract the number of duplicates found from the original count; it sets the count directly to the number of unique records found.

Am I missing something?
# 11  
Old 03-10-2019
Hi Don Cragun, sorry for posting that nonsense. My logics seem to require some lubrication. I may need some sleep. Post withdrawn.
# 12  
Old 03-10-2019
Quote:
Originally Posted by RudiC
Hi Don Cragun, sorry for posting that nonsense. My logics seem to require some lubrication. I may need some sleep. Post withdrawn.
Hi RudiC,
I know the feeling. I'm just up this late because I checked to see what was going on here after resetting all of the clocks in the house. (Daylight Saving time kicked in here this morning when the clock should have hit 2am. I hate Daylight Saving time!)

Sleep tight.

- Don