handling arrays with awk


 
# 1  
Old 05-24-2010

Hi,

I have an issue that I am trying to resolve using arrays in awk. I have two files, the first one is a dictionary with this format:

FILE 1 (dictionary)
Code:
'Abrir' 'Open'
'Aceptar' 'Accept'

Every line has two fields: the same word in each of two languages.


The second file is a simple list of words, each of which may be in either language of the dictionary or absent from the dictionary altogether.

FILE 2 (wordlist)
Code:
'Open'
'Aceptar'
'Absoluto'
...


I need to split this second file into three parts:
- words in dictionary, first language
- words in dictionary, second language
- words not in dictionary


What I have done with awk:

Code:
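BEGIN {
    dictfile = ARGV[1]    # first file argument: the dictionary
    listfile = ARGV[2]    # second file argument: the word list
}
{
    if (FILENAME == dictfile) {
        dic[$1] = $2      # first-language word -> its translation
        tran[$2]          # referencing the element records the second-language word
    } else {              # FILENAME == listfile
        if ($1 in dic) {
            print "word in dictionary", $1, dic[$1]
        } else if ($1 in tran) {
            print "word already translated", $1
        } else {
            print "word not in dictionary", $1
        }
    }
}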


The problem comes with the case of words already translated. It seems that the array "tran" is not properly constructed, and I can't fix it.

I know the code is quite messy, my apologies. I would appreciate any help or suggestions.

Last edited by gmartinez; 05-24-2010 at 05:02 PM.. Reason: code tags, please!
# 2  
Old 05-24-2010
In that one post, I think you referred to the same array by three different names.

Quote:
Originally Posted by gmartinez
tran[$2];}
...
if ($1 in trans) {
...
Seems like the array "tras" is not properly constructed
tran, trans, or tras? You need to pick one and stick with it.

There may also be some issue with regard to the structure of the nested if/else statements, but without code tags to preserve indentation (assuming it was there to begin with), it's a pain to read.

Regards,
Alister
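
The naming point in this reply can be illustrated with a small sketch. awk does not complain about an array name it has never seen; it treats it as a new, empty array, so a membership test against a misspelled name is simply always false. The one-line input and the names `r1` and `r2` below are made up for the illustration; `tran` and `trans` are the names quoted in the reply.

```shell
#!/bin/sh
# Illustration only: "tran" is filled, but one lookup uses the misspelled "trans".
out=$(printf "%s\n" "'Abrir' 'Open'" | awk '
{
    tran[$2]                                        # referencing tran[$2] creates the key
    if ($2 in trans) r1 = "yes"; else r1 = "no"     # trans is a different, empty array
    if ($2 in tran)  r2 = "yes"; else r2 = "no"     # same name as the one that was filled
    print "in trans:", r1
    print "in tran:", r2
}')
printf '%s\n' "$out"
```

Run as written, the first test reports `no` and the second reports `yes`, with no warning about the unknown name.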
# 3  
Old 05-25-2010
Is this what you're looking for?
Code:
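awk '
NR==FNR {a[$1]; next}                        # first file (file2): remember each listed word
$1 in a {print $1 > "InDictionary"; next}    # file1: first-language word is in the list
$2 in a {print $2 > "Translated"; next}      # file1: second-language word is in the list
{print > "NotInDictionary"}                  # file1: neither word is in the list
' file2 file1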
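
The `NR==FNR` test in the code above is true only while awk is reading the first file named on the command line, because `FNR` restarts at 1 for each file while `NR` keeps counting. Below is a minimal sketch of the same idiom applied the other way round: the dictionary is loaded first and each word of the word list is then labelled. This is a variation for illustration, not the posted answer. The array names `first` and `second`, the labels, and the temporary file names are assumptions. One caveat: if the first file is empty, `NR==FNR` stays true for the second file as well.

```shell
#!/bin/sh
# Sketch only: temporary files stand in for the two files shown in the thread.
dir=$(mktemp -d) || exit 1
printf "%s\n" "'Abrir' 'Open'" "'Aceptar' 'Accept'" > "$dir/dict"
printf "%s\n" "'Open'" "'Aceptar'" "'Absoluto'" > "$dir/words"

result=$(awk '
NR == FNR { first[$1] = $2; second[$2]; next }   # first file: load the dictionary
$1 in first  { print "first language:", $1, "->", first[$1]; next }
$1 in second { print "second language:", $1; next }
             { print "not in dictionary:", $1 }
' "$dir/dict" "$dir/words")

printf '%s\n' "$result"
rm -rf "$dir"
```

With the sample data from the first post, 'Open' is reported as second language, 'Aceptar' as first language, and 'Absoluto' as not in the dictionary.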

# 4  
Old 05-25-2010
Solved

That was a nice and simple solution, just switching the order of the input files. Thanks a lot, Franklin52.

Actually, the original code worked as well in a different environment (it seems there was some problem with the machine's local configuration). But the suggested solution is much more elegant and efficient.


Sorry for the lack of code tags and indentation in the first post.
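
For comparison, the `FILENAME`-based approach from the first post can also be written as pattern-action rules that split the word list into three files, as the question asked. This is a rough sketch under assumptions: the output file names, the `outdir` variable, and the temporary directory are invented for the example. Only `dic` and `tran` come from the original post.

```shell
#!/bin/sh
# Sketch only: temporary files stand in for the dictionary and the word list.
dir=$(mktemp -d) || exit 1
printf "%s\n" "'Abrir' 'Open'" "'Aceptar' 'Accept'" > "$dir/dict"
printf "%s\n" "'Open'" "'Aceptar'" "'Absoluto'" > "$dir/words"

awk -v outdir="$dir" '
FILENAME == ARGV[1] { dic[$1] = $2; tran[$2]; next }    # dictionary lines
$1 in dic  { print $1 > (outdir "/first_language");  next }
$1 in tran { print $1 > (outdir "/second_language"); next }
           { print $1 > (outdir "/not_in_dictionary") }
' "$dir/dict" "$dir/words"

# Collect the three parts before the temporary directory is removed.
first=$(cat "$dir/first_language")
second=$(cat "$dir/second_language")
missing=$(cat "$dir/not_in_dictionary")
printf 'first: %s\nsecond: %s\nmissing: %s\n' "$first" "$second" "$missing"
rm -rf "$dir"
```

Comparing `FILENAME` with `ARGV[1]` keeps working when the dictionary file is empty, at the cost of a string comparison on every line.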