Field extract w/o - and adding formats


 
# 1  
Old 08-24-2012

I have a txt file like below:
HTML Code:
Accession  Orderable  Unique ID  Subject ID  Timepoint  Colletion Date/Time  
12-170-0185  Glucose  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00  
12-170-0185  Insulin  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00  
12-170-0200  Glucose  1756487  1  D-07_H00.05  18-JUN-2012 06:05:00  
12-170-0200  Insulin  1756487  1  D-07_H00.05  18-JUN-2012 06:05:00  
12-170-0215  Glucose  1756487  1  D-07_H00.15  18-JUN-2012 06:15:00  
......

Accession  Orderable  Unique ID  Subject ID  Timepoint  Colletion Date/Time  
12-170-0220  Glucose  1759452  3  D-07_H00.15  18-JUN-2012 06:15:00  
12-170-0220  Insulin  1759452  3  D-07_H00.15  18-JUN-2012 06:15:00  
12-170-0181  Insulin  1751183  4  D-07_H00.03  18-JUN-2012 06:03:00  
......
The field I need to extract is the first one called "Accession".

I need to convert the Accession Number into the following format, with a length of 18.

For example:

HTML Code:
12-198-0212 => "000002012198000212",
12-198-0214 => "000002012198000214",
12-198-0216 => "000002012198000216",
12-198-0218 => "000002012198000218",
12-198-0220 => "000002012198000220",
That is, 0000020 is added in front of 12-198 with the - (dash) taken out, and 00 is added in front of -0212 with the - (dash) taken out as well.

"000002012198000212", also has double quotes and a trailing comma, which are added because the output is for a text-delimited file.

Other than the "Accession" field, I don't need any of the other fields in the extracted output.

Please advise.
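
The rule described above can be sketched as a small shell function. This is a minimal illustration, not code from the thread: convert_accession is a hypothetical name, and it assumes every accession number has exactly the three dash-separated parts shown in the examples.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): reformat one accession number.
# Assumes three dash-separated parts, as in 12-198-0212.
convert_accession() {
    # Split on "-" and rebuild: 0000020 + part1 + part2 + 00 + part3,
    # wrapped in double quotes and followed by a comma.
    printf '%s\n' "$1" |
        awk -F- '{ printf "\"0000020%s%s00%s\",\n", $1, $2, $3 }'
}

convert_accession 12-198-0212   # prints "000002012198000212",
```

The quoted value is 7 + 2 + 3 + 2 + 4 = 18 characters long, which matches the length asked for.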
# 2  
Old 08-24-2012
hope this helps

Code:
 perl -alne 'if(/^\d+-\d+-\d+/) {@acc=split(/-/,$F[0]);printf("12-198-%s => \"00020%s19800%s\",\n",$acc[2],$acc[0],$acc[2]);}' input_file


Last edited by msabhi; 08-24-2012 at 04:05 PM..
# 3  
Old 08-24-2012
Sorry if I was not clear: the 12-198 part is not static. It can be 12-170 or something else.

For example:

PHP Code:
12-170-0185  Glucose  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00  
12-170-0185  Insulin  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00 
Please advise.
# 4  
Old 08-24-2012
I thought it wouldn't be constant and had written the code that way at first. Later, after going through your requirement again, I re-edited it since it seemed to be constant. Here it goes:
Code:
perl -alne 'if(/^\d+-\d+-\d+/) {@acc=split(/-/,$F[0]);printf("%s => \"00020%s%s00%s\",\n",$F[0],@acc);}'  input_file

# 5  
Old 08-24-2012
This is one way to do it:
Code:
$ cat t
Accession  Orderable  Unique ID  Subject ID  Timepoint  Colletion Date/Time
12-170-0185  Glucose  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00
12-170-0185  Insulin  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00
12-170-0200  Glucose  1756487  1  D-07_H00.05  18-JUN-2012 06:05:00
12-170-0200  Insulin  1756487  1  D-07_H00.05  18-JUN-2012 06:05:00
12-170-0215  Glucose  1756487  1  D-07_H00.15  18-JUN-2012 06:15:00
12-170-0220  Glucose  1759452  3  D-07_H00.15  18-JUN-2012 06:15:00
12-170-0220  Insulin  1759452  3  D-07_H00.15  18-JUN-2012 06:15:00
12-170-0181  Insulin  1751183  4  D-07_H00.03  18-JUN-2012 06:03:00

$ sed -e '1d' -e 's/^\([[:digit:]]\{2\}\)-\([[:digit:]]\{3\}\)-\([[:digit:]]\{4\}\).*/"0000020\1\200\3"/g' t
"000002012170000185"
"000002012170000185"
"000002012170000200"
"000002012170000200"
"000002012170000215"
"000002012170000220"
"000002012170000220"
"000002012170000181"

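The file in post #1 also contains a repeated header row and "......" filler lines, which '1d' alone would not skip. A possible variation, assuming only rows that start with an accession number should be printed, uses sed -n with the p flag and appends the trailing comma requested in post #1. The function name extract_accessions is made up for this sketch.

```shell
#!/bin/sh
# Sketch: print the converted accession number for matching rows only.
# -n suppresses the default output; the p flag prints a line only when the
# substitution succeeded, so header and filler lines are dropped.
extract_accessions() {
    sed -n 's/^\([0-9]\{2\}\)-\([0-9]\{3\}\)-\([0-9]\{4\}\).*/"0000020\1\200\3",/p'
}

# Sample rows copied from post #1, including the lines that must be skipped.
extract_accessions <<'EOF'
Accession  Orderable  Unique ID  Subject ID  Timepoint  Colletion Date/Time
12-170-0185  Glucose  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00
......

Accession  Orderable  Unique ID  Subject ID  Timepoint  Colletion Date/Time
12-170-0220  Glucose  1759452  3  D-07_H00.15  18-JUN-2012 06:15:00
EOF
```

With this input, only the two data rows produce output; the headers, the filler line and the blank line are ignored.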
# 6  
Old 08-24-2012
It works great!

Two things I'd like to ask:

1) The output shows as below:

PHP Code:
12-170-0185 => "0002012170000185",
12-170-0185 => "0002012170000185",
12-170-0200 => "0002012170000200",
12-170-0200 => "0002012170000200",
12-170-0215 => "0002012170000215"
That is, the 12-170-0185 => part is not needed, just "0002012170000185", "0002012170000185", etc. So, I just need the converted Accession Number.

2) It would also be great to have another script that shows the converted Accession Number along with the other fields.

So, the output will look like:

PHP Code:
"0002012170000185",  Glucose  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00  
"0002012170000185",  Insulin  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00  
"0002012170000200",  Glucose  1756487  1  D-07_H00.05  18-JUN-2012 06:05:00 
The reason I need two scripts (one with only the Accession Number, the other with all fields, beginning with the Accession Number) is verification: it will be very helpful when going through the rows.

Please advise.
# 7  
Old 08-24-2012
Req 1>
Code:
perl -alne 'if(/^\d+-\d+-\d+/) {@acc=split(/-/,$F[0]);printf("\"00020%s%s00%s\",\n",@acc);}' input_file

Req 2>
Code:
perl -alne 'if(/^\d+-\d+-\d+/) {@acc=split(/-/,$F[0]);$F[0]=sprintf("\"00020%s%s00%s\",",@acc); print "@F"}' input_file
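
Both one-liners prepend 00020, which gives a 16-character value, while post #1 asks for 18 characters with 0000020 in front. A rough awk sketch of the second request with the longer prefix follows. convert_rows is a hypothetical name, and replacing the field with sub() rather than assigning to $1 is a choice made here so that the original column spacing is kept.

```shell
#!/bin/sh
# Sketch (assumed helper name): convert the first field, keep the rest of the row.
convert_rows() {
    awk '/^[0-9]+-[0-9]+-[0-9]+/ {
        split($1, acc, "-")
        new = sprintf("\"0000020%s%s00%s\",", acc[1], acc[2], acc[3])
        sub(/^[^ \t]+/, new)   # swap only the leading field; spacing is untouched
        print
    }'
}

printf '%s\n' '12-170-0185  Glucose  1756487  1  D-07_H00.03  18-JUN-2012 06:03:00' | convert_rows
```

Rows that do not start with an accession number, such as the header, are not printed.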