Text processing using awk


 
# 1  
Old 07-16-2014

I have two tab-delimited files (the first column is the primary key):

File 1 (multiple rows can share the same key, and I cannot merge them)
Code:
A    28,29,30,31
A    17,18,19
B    11,13,14,15
B    8,9

File 2 (there is only one row beginning with a given key)
Code:
A    2,8,18,30,31
B    3,11

I'd like to append a star (tab-separated) to a row of File 1 for each element of its second column that also appears in the second column of the File 2 row with the same key.

The output should look like:
Code:
A    28,29,30,31        **
A    17,18,19        *
B    11,13,14,15        *
B    8,9

I'm trying to write an awk solution, but I'm stuck. Please let me know if you have an idea of how I could deal with this issue.
# 2  
Old 07-16-2014
Please show us your awk approach.
# 3  
Old 07-16-2014
Something like this, but it really needs a fix: it doesn't give the expected output.

Code:
$ awk '
    FNR == NR {
        a[$1] = $2;
        next;
    }
    {
        split($2, b, ",");
        split(a[$1], c, ",");
        for (i in b) {
            if (b[i] in c) {
                printf("%s %s\t*\n", $1, a[$1]);
                next;
            }
        }
        print $1, a[$1];
    }
' file1 file2

Thanks.
# 4  
Old 07-17-2014
You were on the right track. Here is an approach with two-dimensional arrays (in awk, a single array indexed by a composite key):
Code:
awk '{split($2,F,/,/)} NR==FNR{for(i in F) A[$1,F[i]]; next} {for(i in F) if(($1,F[i]) in A) $3=$3 "*"}1' FS='\t' OFS='\t' file2 file1
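For readability, the one-liner above can be spread over several lines with comments. This is only a sketch of the same logic: the wrapper name mark_matches and the scratch directory are assumptions made for illustration, the loops walk 1..n instead of using for (i in F), and the array element is assigned 1 rather than merely referenced.

```shell
# mark_matches LOOKUP DATA: hypothetical wrapper name, not from the thread.
# LOOKUP plays the role of file2 (one row per key), DATA that of file1.
mark_matches() {
    awk '
    BEGIN { FS = OFS = "\t" }                 # tab in, tab out

    { n = split($2, F, ",") }                 # runs for every line of both files

    NR == FNR {                               # true only while reading the first file
        for (i = 1; i <= n; i++)
            A[$1, F[i]] = 1                   # composite key: key SUBSEP element
        next
    }

    {                                         # second file: one star per shared element
        for (i = 1; i <= n; i++)
            if (($1, F[i]) in A)
                $3 = $3 "*"
        print
    }' "$1" "$2"
}

# Sample data from post #1, written to a scratch directory.
tmp=$(mktemp -d)
printf 'A\t28,29,30,31\nA\t17,18,19\nB\t11,13,14,15\nB\t8,9\n' > "$tmp/file1"
printf 'A\t2,8,18,30,31\nB\t3,11\n' > "$tmp/file2"
mark_matches "$tmp/file2" "$tmp/file1"
rm -r "$tmp"
```

On the sample data this prints the four rows of File 1 with a single tab and **, *, * appended to the first three, and the last row unchanged. Note that the lookup file (file2) has to come first on the command line.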

or

Code:
awk '{split($2,F,/,/); for(i in F) if(NR==FNR){A[$1,F[i]]} else if(($1,F[i]) in A) $3=$3 "*"}NR>FNR' FS='\t' OFS='\t' file2 file1
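
As for why the attempt in post #3 misfires, the main reason seems to be that awk's in operator tests array indices, not values: after split(a[$1],c,","), the indices of c are 1, 2, 3, ..., so b[i] in c asks whether the element happens to be a small position number. That attempt also reads file1 first, so a[$1] keeps only the last row per key, and it calls next on the first hit, so it can never print more than one star. A small sketch of the index-versus-value point (the function name demo_in_operator is made up for this illustration):

```shell
demo_in_operator() {
    awk 'BEGIN {
        n = split("2,8,18,30,31", c, ",")     # c[1]="2", c[2]="8", ..., c[5]="31"
        print ((30 in c) ? "yes" : "no")      # 30 is a value, not an index
        print ((3 in c) ? "yes" : "no")       # 3 is an index (c[3] is "18")
        for (i = 1; i <= n; i++)
            seen[c[i]] = 1                    # re-key the array by value
        print ((30 in seen) ? "yes" : "no")   # now the membership test works
    }'
}
demo_in_operator
```

This prints no, yes, yes. Re-keying by value is what both solutions above do when they store (key, element) pairs as indices of A.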

This User Gave Thanks to Scrutinizer For This Post: