Using sed, awk or perl to remove substring of all lines except the first
Post 302809205 by Don Cragun on Saturday 18th of May 2013 06:18:47 PM
g[] holds the user names found on the current line. Duplicates are eliminated by removing entries from g[]; if the array changes, the 4th field of the current line is rebuilt before the updated line is written. key[gid, user_name] is a two-dimensional array that keeps track of which user names have already been seen for the gid on the current line, whether earlier on this line or on an earlier line (the group name is ignored). Here is a copy of my original script with extensive comments added. Let me know if anything is still not clear.
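To show the key[] idea on its own, here is a minimal sketch. The two-field gid:user input is made up for illustration and is not the real file format, and the shell variable name seen is arbitrary. It shows that the (gid, user) in key test does not create an element, while a bare reference to key[gid, user] does:

```shell
# Hypothetical input: one "gid:user" pair per line (NOT the real file format).
seen=$(printf '%s\n' '100:alice' '100:bob' '100:alice' '200:alice' |
awk -F: '
{
    if (($1, $2) in key)      # membership test; does not create the element
        print $0, "duplicate"
    else {
        key[$1, $2]           # a bare reference is enough to create the element
        print $0, "new"
    }
}')
printf '%s\n' "$seen"
```

The third input line is reported as a duplicate because 100 and alice were already seen together; 200:alice is new because the gid differs.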

Code:
awk '
# Input file format:
#       gid:do_not_care:gname:uname_list
# where: "gid" is a numeric string specifying the group ID number,
#       "do_not_care" is ignored by this script,
#       "gname" is an alphanumeric group name, and
#       "uname_list" is a comma separated list of alphanumeric user names.
#
# The two dimensional array key[] is indexed by the "gid" and a user name.  The
# array starts out empty.  If key[$1, user name] is present, the user name has
# been seen with the "gid" (either earlier on this line or on an earlier line).
# The "gname" is ignored when making this determination, so when we are done, a
# user name will appear only once for each "gid".
BEGIN { # Set the input and output field separators to ":"
        FS = OFS = ":"
}
{       # Split the "uname_list" into n individual user names:
        n = split($4, g, /,/)
        # Update the array of user names seen for this "gid":
        for(i = 1; i <= n; i++)
                # Determine if we have seen this user name with this "gid"
                if(($1,g[i]) in key) {
                        # We have seen this user name with this "gid".  Remove
                        # this name from the list of user names on this line:
                        for(j = i + 1; j <= n; j++) g[j - 1] = g[j]
                        i--     # Repeat the check for user name i.
                        n--     # Decrease the # of user names on this line.
                        c = 1   # Note that we have changed this line.
                } else  # We have not seen this user name with this "gid".
                        # Add an entry for this combination:
                        key[$1,g[i]]
        # Check to see if we need to reformat the "uname_list" (because we
        # removed a user name from the list on this line).
        if(c) { # We do need to reformat the uname_list on this line:
                c = 0     # Clear the flag for the next line.
                # If there are any user names left in the list, initialize the
                # reformatted "uname_list" to the 1st user name that is left;
                # otherwise set the "uname_list" to the empty string.
                # Note that if you want to discard this "gname" if there are no
                # remaining user names in the updated "uname_list", you could
                # do that by replacing the following uncommented line with the
                # next two lines (after removing the "# " in both lines):
                # if(n == 0) next
                # $4 = g[1]
                $4 = n ? g[1] : ""
                # For each additional remaining user name (if any exist), add a
                # comma and that name to the reformatted "uname_list":
                for(j = 2; j <= n; j++) $4 = $4 "," g[j]
        }
        # Print the original or updated line.
        print
}'  data
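As a quick check of the behavior described in the comments, here is a sketch that runs a condensed copy of the same logic on a few made-up lines. The sample groups and users are invented, and the wrapper function name dedup_gid_users and the variable out are hypothetical:

```shell
# Condensed copy of the script above, wrapped in a hypothetical shell function.
# It reads the named files, or standard input if no file is given.
dedup_gid_users() {
    awk '
    BEGIN { FS = OFS = ":" }
    {
        n = split($4, g, /,/)
        for (i = 1; i <= n; i++)
            if (($1, g[i]) in key) {
                # Already seen for this gid: shift the rest of the list left.
                for (j = i + 1; j <= n; j++) g[j - 1] = g[j]
                i--; n--; c = 1
            } else
                key[$1, g[i]]
        if (c) {
            # The list changed, so rebuild the 4th field.
            c = 0
            $4 = n ? g[1] : ""
            for (j = 2; j <= n; j++) $4 = $4 "," g[j]
        }
        print
    }' "$@"
}

# Invented sample input in gid:do_not_care:gname:uname_list format.
out=$(printf '%s\n' \
    '100:x:staff:alice,bob,alice' \
    '100:x:staff2:bob,carol' \
    '200:x:dev:alice,dave' \
    '100:x:staff3:alice' | dedup_gid_users)
printf '%s\n' "$out"
```

With this input the output should be:

100:x:staff:alice,bob
100:x:staff2:carol
200:x:dev:alice,dave
100:x:staff3:

The last line keeps its group name with an empty user list; the "if(n == 0) next" variation described in the script comments would drop that line instead.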

 
