Operating Systems AIX Accent words file from windows Post 302839483 by jockx on Thursday 1st of August 2013 06:02:59 PM
08-01-2013
Accent words file from windows

Hi guys, I'm having an issue with a file that comes from Windows. At first the whole file was read as one big, long line because of the famous "^M" carriage return (the one you type as "Ctrl+V Ctrl+M"). I fixed that with:

Code:
perl -pe 'if ( s/\r\n?/\n/g ) { $f=1 }; if ( $f || ! $m ) { s/([^\n])\z/$1\n/ }; $m=1' $file_input > file_output

After that, the columns in the file were run together. I fixed that too:

Code:
sed -r 's/(.{11})/\1 /'

Here ".{11}" matches the first 11 characters of each line, so the space is inserted after position 11 and acts as the new field separator.
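
The -r option is a GNU sed extension, so where it is not available the same idea can be written with a basic regular expression. A minimal sketch, assuming a made-up sample record whose first column is exactly 11 characters wide:

```shell
# Hypothetical sample record: an 11-character first column glued to the next one.
line='12345678901ABCDEF'

# Basic-regex form of the command above (no -r needed):
# capture the first 11 characters and put a space after them.
split=$(printf '%s\n' "$line" | sed 's/^\(.\{11\}\)/\1 /')
echo "$split"
# prints: 12345678901 ABCDEF
```

The leading ^ only makes explicit what the original command already does, since without the g flag sed replaces just the first match on each line.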

But I'm not so good with locales and charsets on AIX (it drives me cuckoo sometimes :) ).

My question is this:
My file contains Latin-1 characters such as "ñ, é, ü, etc." I looked at iconv, but it isn't working for me at the moment. I tried the following:

I know the charset is ISO-8859-1 because I moved the file to Ubuntu and ran "file -i", which reported charset=iso-8859-1.

Code:
iconv -f ISO-8859-1 -t UTF-8 file_input > file_output

But nothing changed.

Then:
Code:
iconv -f ISO-8859-1 -t ISO-8859-15 file_input > file_output

That didn't work either.
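
One way to tell whether iconv is converting anything at all is to compare the raw bytes instead of trusting what the terminal displays. The sketch below uses a made-up sample word rather than the real file, and assumes the codeset names are accepted as written (on AIX they may be spelled differently, for example ISO8859-1).

```shell
# Hypothetical sample: "niño" in ISO-8859-1, where ñ is the single byte 0xF1 (octal 361).
latin1_hex=$(printf 'ni\361o\n' | od -An -tx1 | tr -d ' \n')

# The same text after conversion: in UTF-8, ñ becomes the two bytes 0xC3 0xB1.
utf8_hex=$(printf 'ni\361o\n' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')

echo "latin1: $latin1_hex"   # prints: latin1: 6e69f16f0a
echo "utf-8:  $utf8_hex"     # prints: utf-8:  6e69c3b16f0a
```

If the bytes change like this but the characters still look wrong on screen, the problem is more likely the locale of the terminal or session than the conversion itself. Note also that ñ, é and ü have the same byte values in ISO-8859-1 and ISO-8859-15, so converting between those two codesets is not expected to change them.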

I also tried on an Ubuntu distro and the results are the same. So, has anyone had this problem before? I searched the web all day, and the answers are always iconv or vim, but vim doesn't have ":set filesetting" or ":set codesetting".

Any ideas or suggestions?

Thanks and regards!
 

unaccent(1)                  General Commands Manual                  unaccent(1)

NAME
       unaccent - remove accents from input stream or a string

SYNOPSIS
       unaccent [--debug_low] [--debug_high] [-h] charset [string] [expected]

DESCRIPTION
       With a single argument, unaccent reads data from stdin, replaces accented letters by their unaccented equivalent and writes the result on stdout. If the second argument ('string') is provided unaccent transforms it by replacing accented letters by their unaccented equivalent. The result is printed on the standard output. The charset of the input string or the data read from stdin is specified by the 'charset' argument (ISO-8859-15 for instance). The output is printed using the same charset. If the 'expected' argument is provided, the output string is compared to it. If they are not equal unaccent exits on error.

       unaccent relies on the iconv(3) library to convert from the specified charset to UTF-16BE (or UTF-16 if UTF-16BE is not available). You should check the manual pages for available charsets. On GNU/Linux the command iconv -l shows all available charsets.

OPTIONS
       --debug_low
              Prints human readable information about the unaccentuation process. See unac(3) for more information.

       --debug_high
              Prints very detailed information about the unaccentuation process. See unac(3) for more information.

       --help -h
              Prints a short usage and exits.

EXAMPLES
       Remove accents from the string été and check that the result is ete.

              unaccent ISO-8859-1 été ete

       Remove accents from file myfile and put the result in file myfile.unaccent

              unaccent ISO-8859-1 < myfile > myfile.unaccent

SEE ALSO
       unac(3), iconv(3)

AUTHOR
       Loic Dachary loic@senga.org http://www.senga.org/unac/

local                                                                 unaccent(1)