Reducing multiple entries in a tri-lingual dictionary to single entries
Post 302942051 by Scrutinizer on Friday 24th of April 2015 12:13:03 AM
Hi, try:
Code:
awk '{n=split($1,F,/[,;]/); for(i=1; i<=n; i++) print F[i],$2,$3}' FS='\t' OFS='\t' file
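
To see what the one-liner does, here is a small sketch on made-up data. The file name `sample.txt` and its contents are assumptions for illustration only, not the poster's actual dictionary; the only thing assumed about the real file is that it is tab-separated with the comma- or semicolon-separated entries in column 1.

```shell
# Hypothetical sample: three tab-separated columns, several entries in column 1
printf 'cat,feline\tchat\tgato\nhouse;home\tmaison\tcasa\n' > sample.txt

# Same command as above, run against the sample file
out=$(awk '{n=split($1,F,/[,;]/); for(i=1; i<=n; i++) print F[i],$2,$3}' FS='\t' OFS='\t' sample.txt)
printf '%s\n' "$out"
```

Each entry from column 1 ends up on its own line, with columns 2 and 3 repeated next to it.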

--edit--
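This will work on Linux / Unix. I just noticed that it needs to work under Windows.

I can't help much there. I know there can be quoting issues (cmd.exe does not treat single quotes as quoting characters, so the one-liner above will not run there as written), and maybe CR/LF-related issues as well (a file with Windows line endings can leave a carriage return at the end of the last field).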

Perhaps you could put the script in a file and execute that:

keyword_split.awk:
Code:
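BEGIN {
  FS=OFS="\t"                # input and output are tab-separated
}
{
  n=split($1,F,/[,;]/)       # split column 1 on commas and semicolons
  for(i=1; i<=n; i++) print F[i],$2,$3   # one output line per entry from column 1
}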

And execute with
Code:
awk -f keyword_split.awk file

Or use Cygwin or some other Unix-like environment on Windows...
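
On the CR/LF point: if the file was saved with Windows line endings, one possible workaround is to strip a trailing carriage return before splitting. This is only a sketch under that assumption, not something tested on Windows. The names `keyword_split_crlf.awk` and `sample_crlf.txt` are made up for the example.

```shell
# Hypothetical input with Windows (CR/LF) line endings
printf 'cat,feline\tchat\tgato\r\n' > sample_crlf.txt

# Variant of keyword_split.awk that removes a trailing CR first
cat > keyword_split_crlf.awk <<'EOF'
BEGIN {
  FS=OFS="\t"
}
{
  sub(/\r$/, "")             # drop a trailing carriage return, if present
  n=split($1,F,/[,;]/)
  for(i=1; i<=n; i++) print F[i],$2,$3
}
EOF

out=$(awk -f keyword_split_crlf.awk sample_crlf.txt)
printf '%s\n' "$out"
```

Because `sub()` changes `$0`, awk re-splits the fields afterwards, so `$3` no longer carries the stray carriage return. Files that already have Unix line endings pass through unchanged.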

Last edited by Scrutinizer; 04-24-2015 at 01:45 PM..
