09-06-2008
Easy unix/sed question that I could have done 10 years ago!
Hi all and greetings from Ireland!
I have not used much Unix or awk/sed in years and have forgotten a lot.
It is an easy enough query, though.
I am cleansing/fixing 10,000 postal addresses using global replacements.
I have two pipe-delimited files. The first is basically a spell checker for geographical areas; the second holds the actual addresses.
Sample file 1 - 100+ lines (basically a spell checker):
|Irlllland|Ireland|
|Dubblin|Dublin|
|Corrk|Cork|
etc.
Sample file 2 - 10,000+ lines (Addresses to be cleansed):
|10 Main Street Irlllland|
|11 High Road Irlllland|
|1 High Road, Corrk|
The required output is:
|10 Main Street Ireland|
|11 High Road Ireland|
|1 High Road, Cork|
I am very rusty but reckon I need a loop with a global substitution in it.
I used to know Unix, awk and sed reasonably well but have forgotten the basic syntax.
Any helpers out there?
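One possible approach, offered as a sketch rather than a definitive answer: use awk to turn each line of the first file into a sed substitution command, then run the generated script over the second file. The file names corrections.txt, addresses.txt, fix.sed and cleansed.txt are hypothetical, and the sketch assumes the misspellings and corrections contain no characters that are special to sed (such as /, & or regular-expression metacharacters).

```shell
# Hypothetical sample data, copied from the examples above.
printf '%s\n' '|Irlllland|Ireland|' '|Dubblin|Dublin|' '|Corrk|Cork|' > corrections.txt
printf '%s\n' '|10 Main Street Irlllland|' '|11 High Road Irlllland|' '|1 High Road, Corrk|' > addresses.txt

# With "|" as the field separator, field 2 is the misspelling and field 3
# the correction. Emit one "s/wrong/right/g" command per non-empty entry.
awk -F'|' 'NF >= 3 && $2 != "" { printf "s/%s/%s/g\n", $2, $3 }' corrections.txt > fix.sed

# Apply every generated substitution to the address file in a single pass.
sed -f fix.sed addresses.txt > cleansed.txt
cat cleansed.txt
```

Because sed reads the whole script once, no explicit shell loop is needed. Note that the substitutions match substrings, so a misspelling that happens to appear inside a longer word would be replaced there too.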