Splitting Concatenated Words in Input File with Words from a Master File
Post 302499246 by gimley on Wednesday 23rd of February 2011 07:53:42 PM
Hi Yinyuemi,
Many thanks for the timely help. The new code seems to have sorted out the residue problem, but the largest-string issue still remains.
I used the code you posted (reproduced below):
Code:
awk 'NR==FNR{a[NR]=$1;b[$1]=1;x=NR}
NR>FNR{IGNORECASE=1;{for (j=1;j<=x;j++){for(i=1;i<=NF;i++) if(length($i)>length(a[j]) && !($i in b) && $i~a[j] && $i!=a[j])
{gsub(a[j]," "a[j]" ",$0)}}}}END{$1=toupper(substr($1,1,1))substr($1,2);print}' lookup raw

With it I still get
The boy ran through slow ly
for the input
theboyranthroughslowly

Sorry to hassle you, but splitting on the largest string first is vital for the dictionary work I am doing.
Many thanks once again, and I hope to hear from you,
Best regards,
Gimley

Last edited by Franklin52; 02-24-2011 at 03:32 AM.. Reason: Please use code tags
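
The "largest string" requirement above can be illustrated with a greedy longest-match scan. This is only a sketch under stated assumptions, not the code from the thread: the function name `split_words` is hypothetical, `lookup` is assumed to hold one word per line, and `raw` one concatenated string per line. The quoted one-liner substitutes dictionary words in the order they appear in `lookup`, so a short word such as "slow" may be split out before a longer word containing it is tried; the sketch instead tries the longest candidate at each position first. It uses only POSIX awk functions and lower-cases everything rather than relying on gawk's IGNORECASE.

```shell
# split_words LOOKUP RAW
# Hypothetical helper: greedy longest-match segmentation of each line in RAW
# using the words listed (one per line) in LOOKUP.
split_words() {
    awk '
    # First file: load the dictionary in lower case and remember the longest word.
    NR == FNR {
        w = tolower($1)
        if (w != "") {
            dict[w] = 1
            if (length(w) > maxlen) maxlen = length(w)
        }
        next
    }
    # Second file: scan each line from left to right.
    {
        s = tolower($0); n = length(s)
        out = ""; res = ""; pos = 1
        while (pos <= n) {
            hit = 0
            top = n - pos + 1
            if (top > maxlen) top = maxlen
            # Longest candidate first, so "slowly" is tried before "slow".
            for (len = top; len >= 1; len--) {
                if (substr(s, pos, len) in dict) { hit = len; break }
            }
            if (hit) {
                # Flush any unmatched characters collected so far as one residue piece.
                if (res != "") {
                    sep = (out == "" ? "" : " ")
                    out = out sep res
                    res = ""
                }
                sep = (out == "" ? "" : " ")
                out = out sep substr(s, pos, hit)
                pos += hit
            } else {
                # No dictionary word starts here: keep the character as residue.
                res = res substr(s, pos, 1)
                pos++
            }
        }
        if (res != "") {
            sep = (out == "" ? "" : " ")
            out = out sep res
        }
        # Capitalise the first letter, as the quoted one-liner does.
        print toupper(substr(out, 1, 1)) substr(out, 2)
    }' "$1" "$2"
}
```

Called as `split_words lookup raw`, with a `lookup` containing the, boy, ran, through, slow and slowly, the input theboyranthroughslowly comes out as "The boy ran through slowly". Being greedy, the scan can still mis-split when a long word swallows the start of the word that follows it; handling that would need backtracking, which this sketch leaves out.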
 
