Replace string ids with unique numbers


 
# 1  
Old 05-14-2012

Hello,

I have a file with 1000 IDs in the form of strings. I want to replace each ID with a unique number throughout the file. Each ID is repeated across all the columns. I know I can use the sed command, but there are many IDs in the file that need to be converted.

Example of the input file:
Code:
B752 B295 B289
B295 Y710 B921
B289  B294 B294
B294 B289 B752
B584 B294 X216
B023 B584 B000
B99 B023 B584
B921 B99 B584
B000 T563 B000
24752 Y710  B295
T563 X216 B294 
Y710 B289 B289
X216 B752 B295
T53635 Y710 Y710
B629 24752 T53635 
BX99 B000 B289
BT24 B629 B294
Thanks in advance.
/ryan
# 2  
Old 05-14-2012
What is your expected output?
# 3  
Old 05-14-2012
Quote:
Originally Posted by ryan9011
I know I can use sed
Yes! That's the right attitude! ;-))

You could prepare a translation table in a second file and let a script read this file and invoke sed on every line. For instance (just a sketch):

The translation file:
Code:
0001=B752
0002=B295
0003=B289
...

The script:

Code:
#! /bin/ksh

chCode=""
chID=""
fTranslationTable="/path/to/some/file.xlate"
fWork="/path/to/your/file"

# every line of the translation table has the form code=ID
while IFS='=' read -r chCode chID ; do
     # replace every occurrence of the ID with its code
     sed "s/${chID}/${chCode}/g" "${fWork}" > "${fWork}.tmp"
     mv "${fWork}.tmp" "${fWork}"
done < "${fTranslationTable}"

If the codes you want to replace the IDs with only have to be unique, you could even create them dynamically: split the lines of your original file so that every ID sits on a line of its own. sort -u will then give you a list of unique IDs, and you can automatically create a code for each of them, which gives you the translation table I used above. From there on you could use my method to replace the IDs with these codes.
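A minimal sketch of how such a table could be generated, assuming the IDs are separated by blanks or tabs. The function name make_table and the file names ids.txt and ids.xlate are made up for this illustration; the output uses the same code=ID layout as the translation file above.

```shell
#!/bin/sh
# Sketch only: build a "code=ID" translation table from the unique IDs in a file.
# make_table, ids.txt and ids.xlate are hypothetical names used for this demo.

make_table() {
    # 1. print every whitespace-separated ID on a line of its own
    # 2. keep one copy of each ID (LC_ALL=C gives a predictable sort order)
    # 3. number the IDs with zero-padded four-digit codes
    awk '{ for (i = 1; i <= NF; i++) print $i }' "$1" |
        LC_ALL=C sort -u |
        awk '{ printf "%04d=%s\n", NR, $0 }'
}

# small demo using the first three lines of the sample input
tmp=$(mktemp -d)
printf 'B752 B295 B289\nB295 Y710 B921\nB289  B294 B294\n' > "$tmp/ids.txt"
make_table "$tmp/ids.txt" > "$tmp/ids.xlate"
cat "$tmp/ids.xlate"     # first line printed is 0001=B289
```

The codes follow the sorted order of the IDs, not the order in which they appear in the file. If the order of first appearance matters, the sort step would have to be replaced by something else.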

I hope this helps.

bakunin
# 4  
Old 05-14-2012
For example, if the number ID for the string B752 is 001, the code will replace B752 in
the file with 001.

The output for the first row and its linked row will be as follows:
Code:
001 B295 B289
X216 001 B295 
The rest of the strings will be converted accordingly.
# 5  
Old 05-14-2012
Did you try bakunin's solution?
Here is a similar solution:
Code:
# ./justdoit infile trans
001 002 003
002 0012 008
003  004 004
004 003 001
005 004 0013
006 005 009
007 006 005
008 007 005
009 0011 009
0010 0012  002
0011 0013 004
0012 003 003
0013 001 002
0014 0012 0012
0015 0010 0014
0016 009 003
0017 0015 004

Code:
# more trans
B752=001
B295=002
B289=003
B294=004
B584=005
B023=006
B99=007
B921=008
B000=009
24752=0010
T563=0011
Y710=0012
X216=0013
T53635=0014
B629=0015
BX99=0016
BT24=0017

Code:
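## justdoit ##
# usage: ./justdoit infile trans   (trans holds lines of the form ID=number)
cp "$1" "${1}.bck"                 # keep a backup of the original file
while IFS="=" read -r id val
do
    # replace every occurrence of the ID with its number
    sed "s/$id/$val/g" "$1" > "${1}.tmp"
    mv "${1}.tmp" "$1"
done < "$2"
more "$1"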
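
One thing to keep in mind with the sed loop: s/$id/$val/g also matches inside longer strings, so an ID such as B99 would be replaced inside a hypothetical XB99 as well, and the file is rewritten once per ID. A possible alternative (not from the posts above, just a sketch) is a single awk pass that loads the ID=number table into an array and only replaces whole fields. The function name translate and the demo file names are assumptions for this example.

```shell
#!/bin/sh
# Sketch only: translate FILE TABLE
# Prints FILE with every whitespace-separated field that exactly equals an ID
# from TABLE (lines of the form ID=number) replaced by its number.
# Note: awk rebuilds a changed line with single spaces between the fields.

translate() {
    awk '
        # first file on the command line (the table): remember ID -> number
        NR == FNR { split($0, kv, "="); map[kv[1]] = kv[2]; next }
        # second file (the data): replace whole fields only
        {
            for (i = 1; i <= NF; i++)
                if ($i in map) $i = map[$i]
            print
        }
    ' "$2" "$1"
}

# small demo; XB752 is a made-up ID that must stay untouched
tmp=$(mktemp -d)
printf 'B99=007\nBX99=0016\nB752=001\n' > "$tmp/trans"
printf 'B99 BX99 B752\nXB752 B99\n' > "$tmp/infile"
out=$(translate "$tmp/infile" "$tmp/trans")
printf '%s\n' "$out"
# prints:
# 007 0016 001
# XB752 007
```

Unlike the sed loop, this does not modify the input file; redirect the output to a new file if the result should be kept.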
