Replace substring by longest string in common field (awk)
Post 303042959 by RavinderSingh13 on Tuesday 14th of January 2020 06:07:29 AM
Hello beca123456,

Could you please try the following?

Code:
awk 'BEGIN{FS=OFS="|"} FNR==NR{b[$1]=length($3)>a[$1]?$3:b[$1];a[$1]=length($3)>a[$1]?length($3):a[$1];next} length($3)<a[$1]{$3=b[$1]} 1'  Input_file  Input_file

A non-one-liner form of the solution is:
Code:
awk '
BEGIN{
  FS=OFS="|"
}
FNR==NR{
  b[$1]=length($3)>a[$1]?$3:b[$1]
  a[$1]=length($3)>a[$1]?length($3):a[$1]
  next
}
length($3)<a[$1]{
  $3=b[$1]
}
1
'   Input_file  Input_file

The output will be as follows.

Code:
name_10|A|BCCC|cat_1
name_11|B|DEEEEEE|cat_2
name_10|A|BCCC|cat_3
name_11|B|DEEEEEE|cat_4
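
In case it helps to see why Input_file is passed twice: on the first read (FNR==NR) awk only collects the longest 3rd field for each 1st field, and on the second read it replaces any shorter 3rd field with that value. Below is a commented sketch of the same idea using a single array. Please note that the sample file contents, the file name sample_input.txt and the function name fill_longest are only my assumptions for illustration, since the original Input_file is not shown in this post.

```shell
#!/bin/sh
# Hypothetical sample data (an assumption; the real Input_file is not shown here).
cat > sample_input.txt <<'EOF'
name_10|A|BC|cat_1
name_11|B|DEEEEEE|cat_2
name_10|A|BCCC|cat_3
name_11|B|DE|cat_4
EOF

# fill_longest is an illustrative name, not part of the original solution.
# It reads the same file twice, exactly like the commands above.
fill_longest() {
  awk '
  BEGIN { FS = OFS = "|" }

  # First read of the file: FNR equals NR only while the first
  # file argument is being processed. Keep the longest 3rd field
  # seen so far for each value of the 1st field.
  FNR == NR {
    if (length($3) > length(longest[$1])) {
      longest[$1] = $3
    }
    next
  }

  # Second read: replace a shorter 3rd field with the longest one
  # recorded for the same 1st field.
  length($3) < length(longest[$1]) {
    $3 = longest[$1]
  }

  # Print every line, changed or not.
  1
  ' "$1" "$1"
}

fill_longest sample_input.txt
```

With this sample data it prints the same four lines as the output shown above. If two different strings tie for the longest length, the first one seen is kept and the other is left unchanged, which is the same behaviour as the original command.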

Thanks,
R. Singh
