Top Forums Shell Programming and Scripting awk uniq and longest string of a column as index Post 302700327 by yifangt on Thursday 13th of September 2012 09:26:45 AM
Old 09-13-2012
awk uniq and longest string of a column as index

Thanks vgersh99!
The key point is whether the current line is a substring of any line that has already been read, or the other way round. It is a kind of recursive comparison.
What I have in mind is:
Code:
read in line;
compare current line to the old ones;
If it is new, remember it;
If any remembered line is a substring of the current line, replace that remembered line with the current line;
If it is a substring of any remembered line, ignore the current line;

As awk processes one line at a time, I thought it would be a good fit for this problem.
Code:
If it is new, remember it;

may not be accurate. Each line is certainly a unique string, but it can be a substring or "parent" string of another line.
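The steps listed above could be sketched in awk roughly as follows. This is only a minimal sketch under my own assumptions: the whole line is compared (not a single column), empty lines are skipped, and the names `keep_longest`, `kept`, `n` and `m` are made up for illustration. It compares every line against everything remembered so far, so it does a quadratic number of comparisons on large inputs.

```shell
#!/bin/sh
# keep_longest (hypothetical name): read lines on stdin and print only the
# lines that are not a substring of some other line seen in the input.
keep_longest() {
    awk '
    length($0) == 0 { next }            # assumption: ignore empty lines
    {
        cur = $0
        # If cur is a substring of (or equal to) a remembered line, ignore it.
        for (i = 1; i <= n; i++)
            if (index(kept[i], cur)) next
        # Otherwise drop every remembered line that is a substring of cur,
        # compacting the array in place, then remember cur.
        m = 0
        for (i = 1; i <= n; i++)
            if (!index(cur, kept[i])) kept[++m] = kept[i]
        kept[++m] = cur
        n = m
    }
    END { for (i = 1; i <= n; i++) print kept[i] }
    '
}
```

For example, `printf 'abc\nab\nabcd\n' | keep_longest` prints only `abcd`: `ab` is ignored because it is a substring of `abc`, and `abc` is then replaced when `abcd` arrives.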
Thanks a lot!
yi

Last edited by yifangt; 09-13-2012 at 10:33 AM.. Reason: bug of the algorithm
 
