Adding variables to repeating strings


 
# 1  
Old 01-31-2013
Hello,

I want to add a letter to the end of a string if it repeats in a column.

So if I have a file like this:

Code:
DOG001
DOG0023
DOG004
DOG001
DOG0023
DOG001

The output should look like this:

Code:
DOG001-a
DOG0023-a
DOG004
DOG001-b
DOG0023-b
DOG001-c


How can I do this? Thanks in advance.

Last edited by Scrutinizer; 01-31-2013 at 07:47 PM.. Reason: quote -> code tags
# 2  
Old 01-31-2013
Code:
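awk 'END {
  # Second pass: print each saved line, appending a letter when its key repeats
  for (i = 0; ++i <= NR;) {
    c[substr(r[i], length(r[i]) - 1)] > 1 && r[i] = r[i] OFS l[s[i]]
    print r[i]
    }
  }
{
  # First pass: save the line and the running count for its key
  # (the key is the last two characters of the line)
  r[NR] = $0
  s[NR] = ++c[substr($0, length($0) - 1)]
  }
BEGIN {
  # l[1] = "a", l[2] = "b", ... l[26] = "z"
  split("a b c d e f g h i j k l m n o p q r s t u v w x y z", l)
  }' infile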

With some awk implementations, NR may not be available in the END block;
let me know if you're using one of these.
You may need to decide what to do if a value repeats more than 26 times and runs out of letters.
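
Both caveats above can be handled in a few lines. This is a minimal sketch, not the poster's code: it copies NR into a variable so the END block does not depend on NR, and it falls back to the plain occurrence number after "z". The names n and suffix are made up for this example, and the numeric fallback is an assumption, since the thread does not say what should follow "z".

```shell
# Sketch only: "n" and "suffix" are names made up for this example.
out=$(printf 'x\ny\nz\n' | awk '
function suffix(k) {
  # letters for occurrences 1 to 26, then the plain number (an assumed fallback)
  return (k <= 26) ? substr("abcdefghijklmnopqrstuvwxyz", k, 1) : k
}
{ n = NR }   # remember the record count so END does not need NR
END { print n, suffix(1), suffix(26), suffix(27) }')
echo "$out"   # 3 a z 27
```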
# 3  
Old 01-31-2013
This is the error message I am receiving:

Code:
-bash: echo: write error: Broken pipe
awk: f1.awk:1: awk 'END {
awk: f1.awk:1:     ^ invalid char ''' in expression

# 4  
Old 01-31-2013
Wow,
try the following:

1. Create a script file with the following content:
Code:
END {
  for (i = 0; ++i <= NR;) {
    c[substr(r[i], length(r[i]) - 1)] > 1 && r[i] = r[i] OFS l[s[i]] 
    print r[i]
    }
  }
{  
  r[NR] = $0
  s[NR] = ++c[substr($0, length($0) - 1)]  
  }
BEGIN {  
  split("a b c d e f g h i j k l m n o p q r s t u v w x y z", l)
  }

2. Run the following command:
Code:
awk -f script_name input_file

Edit: OK, I realized that you put the entire command in the script file.
Please remove awk ' and ' infile!
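
The error in post #3 is what awk reports when the script file still contains the shell wrapper. The two ways of running a program differ only in where the program text lives. A small sketch with a throwaway one-line program; count.awk and sample.txt are placeholder names, not files from this thread:

```shell
cd "$(mktemp -d)"                       # scratch directory so nothing is overwritten
printf 'DOG001\nDOG004\nDOG001\n' > sample.txt

# Form 1: program text on the command line, single-quoted for the shell
inline=$(awk 'END { print NR }' sample.txt)

# Form 2: program text in a file; the file holds only awk code,
# with no leading "awk" and no surrounding single quotes
printf 'END { print NR }\n' > count.awk
from_file=$(awk -f count.awk sample.txt)

echo "$inline $from_file"               # 3 3
```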
# 5  
Old 01-31-2013
Sorry, I'm new to this.

The file has 21,092 lines, and around line 5427 the script begins to skip some repeats and doesn't assign a letter. Towards the very bottom of the file, hardly any repeats are assigned letters. Is there a size limitation to this?

---------- Post updated at 05:49 PM ---------- Previous update was at 05:39 PM ----------

I also noticed that in some cases it jumps letters, as in the sample below:

Code:
DOG000160 a
DOG000160 b
DOG000161 e
DOG000161 f
DOG000162 b

It's calling DOG000161 "e" instead of "a", and DOG000162 "b" is really supposed to be "a". Why do you suppose this is?

Last edited by Scrutinizer; 01-31-2013 at 07:48 PM.. Reason: quote tags to code tags
# 6  
Old 01-31-2013
Quote:
Originally Posted by verse123
around line 5427 this script begins to skip some repeats and doesn't assign a letter. [...] it's calling DOG000161 "e" instead of "a". And DOG000162 "b" is really supposed to be "a". Why do you suppose this is?

Are there more than 26 occurrences of a single input value?

If so, what "letter" do you want to assign when an input value appears more than 26 times?

Do all of the values that aren't assigned trailing letters appear more than once in your input?

Do any of the values that aren't assigned trailing letters appear more than once but fewer than 27 times?
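
These questions can be answered from the data itself. A rough sketch, assuming one value per line; it counts whole lines and reports the most frequent value with its count. sample.txt is a placeholder that reuses the sample from the first post:

```shell
cd "$(mktemp -d)"
printf 'DOG001\nDOG0023\nDOG004\nDOG001\nDOG0023\nDOG001\n' > sample.txt

# Count each whole line, then report the most frequent value and its count
most=$(awk '{ n[$0]++ }
  END {
    for (v in n) if (n[v] > max) { max = n[v]; top = v }
    print top, max
  }' sample.txt)
echo "$most"   # DOG001 3
```

If the reported count is 26 or less, running out of letters is not the explanation.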
# 7  
Old 02-01-2013
There are no more than 9 occurrences of a single input value.

Not all of the values that aren't assigned trailing letters appear more than once. Some appear only once and some appear several times, but never more than 9 times.
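
With at most 9 occurrences per value, running out of letters cannot be the cause. A more likely one is in the script itself: substr($0, length($0) - 1) returns only the last two characters of the line, so every value with the same two-character ending shares one counter (DOG000161 and, say, a hypothetical DOG000261). That would explain the jumped letters and, once an ending has been seen more than 26 times, the missing ones. Below is a sketch that counts per whole line instead and uses the "-" separator from the first post. The variable n and the file sample.txt are introduced here for illustration, and the input lines are invented test data, not taken from the poster's file.

```shell
cd "$(mktemp -d)"
# All three values end in "61" but are different strings
printf 'DOG000161\nDOG000261\nDOG000161\nDOG000061\nDOG000261\n' > sample.txt

fixed=$(awk '
BEGIN { split("a b c d e f g h i j k l m n o p q r s t u v w x y z", l) }
{
  r[NR] = $0          # remember the line
  s[NR] = ++c[$0]     # occurrence number, counted per whole line
  n = NR              # record count for the END block
}
END {
  for (i = 1; i <= n; i++)
    print (c[r[i]] > 1 ? r[i] "-" l[s[i]] : r[i])
}' sample.txt)
echo "$fixed"
```

On this input the sketch prints DOG000161-a, DOG000261-a, DOG000161-b, DOG000061, DOG000261-b. With the two-character key, all five lines would share one counter and be lettered a through e.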
 