Common prefix of a list of strings


# 1  

Is there a simple way to find the longest common prefix of a space-separated list of strings, optionally by field?

For example, given input:
Code:
"aaa_b_cc aaa_b_cc_ddd aaa_b_cc aaa_b_cd"

with no field separator, output:
Code:
aaa_b_c

with _ field separator, output:
Code:
aaa_b

I have an awk solution which appears to work (although I haven't done much testing):
Code:
get_common_prefix() {
        list="$1"
        sep="$2"

        # One string per record (RS=' '); fields split on sep (an empty sep splits per character in gawk)
        printf '%s' "$list" | awk '(NR==1) {pcount=split($0,prefix)}
                               (NR>1) {for (i=pcount;i>0;i--) {if ($i!=prefix[i]) {pcount=i-1}}}
                                END {NF=pcount;print}' RS=' ' FS="$sep" OFS="$sep"
}

myprefix=$(get_common_prefix "$1" "$2")

printf "[%s]\n" "$myprefix"

Searching didn't come up with anything more elegant that could handle both by character and by field. So, just wondering if the forum had any better solutions.


EDIT: The above (tested on Cygwin) doesn't work very well on AIX 6.1, whose awk is not gawk. It seems you can't modify NF in END the way I'm doing above (although just printing the first pcount fields of prefix works), and a blank field separator seems to be treated the same as whitespace (i.e. it won't split by character).
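The two portability problems described in the EDIT can probably be sidestepped by not touching NF and by splitting characters with substr() rather than relying on an empty FS. The following is only a sketch under those assumptions, not a tested replacement on AIX: the function name common_prefix and the awk helper pieces() are made up for illustration, and the separator is assumed to be either empty or a single non-space character.

```shell
# Hypothetical helper (not from the original post): common_prefix LIST [SEP]
common_prefix() {
    printf '%s\n' "$1" | awk -v sep="$2" '
    # Split s into array a: per character if sep is empty, otherwise on sep.
    function pieces(s, a,    n, i) {
        if (sep == "") {
            n = length(s)
            for (i = 1; i <= n; i++) a[i] = substr(s, i, 1)
            return n
        }
        return split(s, a, sep)
    }
    {
        # Default FS: each whitespace-separated word of the list is one string.
        for (w = 1; w <= NF; w++) {
            n = pieces($w, part)
            if (!seen++) {                    # first string seeds the prefix
                pcount = n
                for (i = 1; i <= n; i++) prefix[i] = part[i]
                continue
            }
            if (n < pcount) pcount = n        # a shorter string caps the prefix
            for (i = 1; i <= pcount; i++)
                if (part[i] != prefix[i]) { pcount = i - 1; break }
        }
    }
    END {
        # Join the surviving pieces by hand instead of assigning to NF.
        out = ""
        for (i = 1; i <= pcount; i++) out = out (i > 1 ? sep : "") prefix[i]
        print out
    }'
}

common_prefix "aaa_b_cc aaa_b_cc_ddd aaa_b_cc aaa_b_cd"      # prints aaa_b_c
common_prefix "aaa_b_cc aaa_b_cc_ddd aaa_b_cc aaa_b_cd" _    # prints aaa_b
```

Building the output string in END avoids depending on what $0 and NF contain after the last record, which appears to be where the implementations differ.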

Last edited by CarloM; 10-25-2013 at 05:45 AM.. Reason: Platforms
# 2  
Not sure if this should be considered more elegant, and it doesn't take into account field separators, but it would do the first job:
Code:
A=(aaa_b_cc aaa_b_cc_ddd aaa_b_cc aaa_b_cd)
CP=
for ((i=1;i<=${#A[0]}; i++))                            # for the entire first string
  do P=${A[0]:0:i}                                      # get increasing length substring
     for ((j=0;j<${#A[@]};j++))                         # for all strings in array
       do [ "${A[$j]#"$P"}" = "${A[$j]}" ] && break 2   # test if contained in all strings; if not, no CP, break out
       done
     CP=$P                                              # keep last valid common prefix
  done
echo "$CP"
aaa_b_c
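One possible way to cover the field-separator case on top of this is to trim the character-wise prefix back to a field boundary afterwards. This is a sketch only: trim_to_field is a hypothetical helper, and it assumes its first argument really is the common character prefix of all the strings that follow.

```shell
# Hypothetical helper: trim a character-wise common prefix back to a field boundary.
# Usage: trim_to_field CP SEP STRING...
trim_to_field() {
    local cp=$1 sep=$2 s rest
    shift 2
    for s in "$@"; do
        rest=${s#"$cp"}                  # what follows the prefix in this string
        case $rest in
            ""|"$sep"*) ;;               # prefix ends on a field boundary here
            *)                           # prefix cuts a field in half
                case $cp in
                    *"$sep"*) cp=${cp%"$sep"*} ;;   # drop the partial last field
                    *)        cp= ;;                # no complete field in common
                esac
                break ;;
        esac
    done
    printf '%s\n' "$cp"
}

trim_to_field aaa_b_c _ aaa_b_cc aaa_b_cc_ddd aaa_b_cc aaa_b_cd    # prints aaa_b
```

A single trim is enough because everything up to the last separator inside the prefix is shared by every string, so the shortened prefix is followed by that separator in all of them. With an empty separator the prefix is returned unchanged.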
