Fetch entries in front of specific word till next word


 
# 8  
Old 11-05-2012
Hi all,

Thanks for your continued help, but this time my output file is completely blank!

I am attaching my sample input file here.

Kindly check it.
# 9  
Old 11-05-2012
The first line in your file is a killer - it does not obey any rules. You might want to deal with it separately. Try this - works on linux/mawk:
Code:
awk     '{gsub(/\n/,"") 
          Ar[++i]=$1}   
          /Phase/ {for (j in Ar)  print Ar[j], $NF; delete Ar; i=0}
        ' RS="\|?[A-Za-z ]*: " FS="   *" OFS="\t" file

What it does is split the file at "Drugs: ", "other: " etc, register all first fields until a phase information shows up, and then print all registered first fields together with the respective phase info.
The exotic record and field separators will not work on all awk implementations.
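
For awk implementations that lack regex record separators, the same idea (hold the names until a phase shows up, then print them all with that phase) can be sketched with plain POSIX features. This is only a rough sketch, not the code above: it assumes one record per line, entries separated by `|`, each entry starting with a label such as "Drug: ", and the phase text running from the first "Phase <digit>" to the end of the line. The function name `names_until_phase` is made up for illustration.

```shell
# Hypothetical helper: buffer names across lines, flush them when a phase appears.
names_until_phase() {
  awk '
    {
      line = $0
      phase = ""
      if (match(line, /Phase [0-9]/)) {       # assumed: phase text runs to end of line
        phase = substr(line, RSTART)
        line  = substr(line, 1, RSTART - 1)
      }
      n = split(line, part, "|")
      for (k = 1; k <= n; k++) {
        name = part[k]
        sub(/^[ \t]*[A-Za-z ]*: /, "", name)  # drop the leading label, e.g. "Drug: "
        sub(/[ \t]+$/, "", name)              # drop trailing blanks
        if (name != "") held[++count] = name
      }
      if (phase != "") {                      # phase found: print everything held so far
        for (j = 1; j <= count; j++) print held[j] "\t" phase
        count = 0
      }
    }' "$@"
}

# Small demonstration on made-up input in the assumed layout.
printf '%s\n' 'Drug: Ramipril|Drug: Placebo' 'Drug: Etanercept    Phase 3' | names_until_phase
```

Called as `names_until_phase file`, it reads the file instead of standard input. Note that names from a line without a phase are carried over to the next phase that appears, which is the behaviour described above.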
# 10  
Old 11-05-2012
With your new sample file, use FS="\t" in lieu of FS="   *"
# 11  
Old 11-05-2012
There are a few lines where the phase is not present.

So I carried over the phase from the previous line.

Code:
sed "s/^Drug: //;s/Drug: /|/g;s/ *||* */|/g;s/ *Phase/@Phase/" file |
awk -F"@" '{
  if ($2) s = $2   # keep the last phase seen
  n = split($1, d, "|")
  for (j = 1; j <= n; j++) print d[j] ":", s
}'

If you would rather leave the phase blank on those lines, use this instead:

Code:
sed "s/^Drug: //;s/Drug: /|/g;s/ *||* */|/g;s/ *Phase/@Phase/" file |
awk -F"@" '{
  n = split($1, d, "|")
  for (j = 1; j <= n; j++) print d[j] ":", $2   # $2 is empty without a phase
}'
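
To see what the awk stage actually receives, the sed stage of the pipeline above can be run on its own. The snippet below is just an illustration: the wrapper name `mark_phase` is made up, and the sample line is the one quoted later in this thread.

```shell
# Hypothetical wrapper around the sed stage, so its output can be inspected.
mark_phase() {
  sed "s/^Drug: //;s/Drug: /|/g;s/ *||* */|/g;s/ *Phase/@Phase/" "$@"
}

printf '%s\n' 'Drug: MK-3102|Drug: Matching placebo to MK-3102|Drug: Basal medication Phase 3' | mark_phase
# MK-3102|Matching placebo to MK-3102|Basal medication@Phase 3
```

The "Drug: " labels are gone, the names are separated by a single `|`, and an `@` marks where the phase starts, which is what `awk -F"@"` then splits on. The patterns match spaces only, so if the file separates the phase with a tab, that tab would presumably stay attached to the last name.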

# 12  
Old 11-05-2012
Hi all,

Thank you for your support, but the output on my system is just as irregular as it was!

Code:
Drug: Ramipril: 
Drug: Placebo: 
Drug: Placebo    Phase 3: 
Drug: Etanercept: 
Drug: Placebo    Phase 1: 
Phase 2: 
Drug: 1,25-dihydroxy-vitamin D3 (calcitriol): 
Drug: placebo    Phase 2: 
Drug: Pro insulin peptide: 
Drug: Pro insulin peptide: 
Drug: Saline    Phase 1: 
Phase 2: 
Procedure: Islet transplant: 
Drug: Deoxyspergualin: 
Drug: Antithymocyte globulin: 
Drug: Daclizumab or basiliximab: 
Drug: Sirolimus: 
Drug: Tacrolimus: 
Drug: Etanercept    Phase 2: 
Drug: TAK-329: 
Drug: TAK-329: 
Drug: Insulin: 
Drug: Placebo    Phase 1: 
Drug: Exenatide: 
Drug: Rapid and long acting insulin: 
Drug: long acting insulin + rapid acting + 1.25 mcg Exenatide    Phase 4: 
Drug: Insulin glargine (HOE901): 
Drug: NPH insulin    Phase 3: 
Procedure: Islet transplant: 
Drug: Belatacept: 
Drug: Basiliximab: 
Drug: Mycophenolate Mofetil    Phase 2: 
Drug: Insulin glargine new formulation (HOE901): 
Drug: Insulin glargine (HOE901)    Phase 2: 
Drug: Insulin glargine new formulation (HOE901): 
Drug: Insulin glargine (HOE901)    Phase 3: 
Drug: Insulin glargine new formulation (HOE901): 
Drug: Insulin glargine (HOE901) (Lantus)    Phase 3: 
Drug: insulin detemir: 
Drug: insulin NPH:

# 13  
Old 11-05-2012
Quote:
Originally Posted by Priyanka Chopra
Hi all,

Thank you for your support, but the output on my system is just as irregular as it was!

Have you tried my code?

As I mentioned in my previous post, a few lines in your input file have no phase. Please look at that post again.
# 14  
Old 11-05-2012
Hello Pamu,

Yes I checked. Thanks for your help.

But my expected output is different. If a line looks like this:

Drug: MK-3102|Drug: Matching placebo to MK-3102|Drug: Basal medication Phase 3

the expected output is

Code:
MK-3102                                               Phase 3
Matching placebo to MK-3102            Phase 3
Basal medication                                   Phase 3

For the last line:

Drug: Alogliptin and glimepiride|Drug: Alogliptin and glimepiride|Drug: Alogliptin and metformin|Drug: Alogliptin and metformin Phase 2|Phase 3

the expected output is

Code:
Alogliptin and glimepiride         Phase 2|Phase  3
Alogliptin and glimepiride          Phase 2|Phase  3
Alogliptin and metformin           Phase 2|Phase  3
Alogliptin and metformin           Phase 2|Phase  3

So anything between "Drug:" and the | symbol is put on its own row, with the phase given at the end of the line in the second column.

I don't want duplicates like the one in the second row, but if they are there I can manage. It would be better not to have them, though:

Code:
Alogliptin and glimepiride         Phase 2|Phase  3
Alogliptin and metformin           Phase 2|Phase  3

I can use another program for that later on; it is the separation that is a bit difficult.

If there is no phase, the second column should be blank for every name on that line,
for example, for drug: MK01 drug:VV09

Code:
  MK01
   VV09

So these two words are followed by blanks, without any phase. That is what I expect.
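
The expected output described in this post could be produced along the following lines. This is a sketch under assumptions, not a tested solution for the attached file: it assumes one record per line, names separated by `|` and prefixed with a label such as "Drug: ", and the phase text running from the first "Phase <digit>" to the end of the line. The function name `drug_phase_table` is made up for illustration.

```shell
# Hypothetical helper: one output row per distinct name on a line,
# followed by a tab and that line's phase (empty when there is none).
drug_phase_table() {
  awk '
    {
      split("", seen)                         # portable way to clear the array per line
      line = $0
      phase = ""
      if (match(line, /Phase [0-9]/)) {       # assumed: phase text runs to end of line
        phase = substr(line, RSTART)
        line  = substr(line, 1, RSTART - 1)
      }
      n = split(line, part, "|")
      for (k = 1; k <= n; k++) {
        name = part[k]
        sub(/^[ \t]*[A-Za-z ]*: /, "", name)  # drop the leading label, e.g. "Drug: "
        sub(/[ \t]+$/, "", name)              # drop trailing blanks
        if (name == "" || (name in seen)) continue   # skip empties and repeats
        seen[name] = 1
        print name "\t" phase
      }
    }' "$@"
}

# Demonstration on the sample lines quoted in this post.
printf '%s\n' \
  'Drug: MK-3102|Drug: Matching placebo to MK-3102|Drug: Basal medication Phase 3' \
  'Drug: Alogliptin and glimepiride|Drug: Alogliptin and glimepiride|Drug: Alogliptin and metformin|Drug: Alogliptin and metformin Phase 2|Phase 3' \
  'Drug: MK01|Drug: VV09' | drug_phase_table
```

Duplicates are dropped only within a single line, and a line without a phase yields names followed by an empty second column. Unlike the carry-over variant earlier in the thread, nothing is remembered from one line to the next.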