Hi all,
I have the following question:
How can I identify the type of a file? I'm working on Unix (Solaris 5.7) and would like to determine whether or not a file is a "flat file". I need a program that moves the flat files into one directory and the Excel files into another.
I must get... (1 Reply)
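One way to approach the question above is to look at the first few bytes of each file. The sketch below is only an illustration under stated assumptions: "Excel" is taken to mean either a legacy .xls file (OLE2 signature d0 cf 11 e0) or an .xlsx file (ZIP signature 50 4b 03 04), and anything else is treated as flat. The function names `classify_file` and `sort_files` are made up for this example, and whether `od -N` is available on Solaris 5.7 has not been checked.

```shell
#!/bin/sh
# classify_file FILE: print "excel" or "flat" based on the first four bytes.
# Assumption: d0cf11e0 (OLE2, legacy .xls) and 504b0304 (ZIP, .xlsx) mean
# Excel; every other file, including an empty one, is reported as flat.
classify_file() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case $magic in
        d0cf11e0|504b0304) echo excel ;;
        *)                 echo flat ;;
    esac
}

# sort_files SRC FLAT_DIR EXCEL_DIR: move each regular file found in SRC
# into FLAT_DIR or EXCEL_DIR according to classify_file.
sort_files() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        if [ "$(classify_file "$f")" = excel ]; then
            mv "$f" "$3"/
        else
            mv "$f" "$2"/
        fi
    done
}
```

Note that any ZIP archive or OLE2 document would also be classified as Excel here, so a stricter check may be needed for mixed directories.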
I'm trying to change the user ID that the script runs under. I tried the sudo command, but the job was submitted under ControlM, and it seems that ControlM is not allowing the user ID to change. I have included the job output below. The sudo command was supposed to set the user ID to "DWSOLAP"... (3 Replies)
Hi Friends,
I need a Unix command that outputs all the records in a file that contain junk characters.
I know the command cat -tv <Filename>, which displays the file so that we can check it for junk characters.
But my requirement is to fetch ONLY THOSE records having junk characters.... (6 Replies)
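A possible way to print only the affected records is to let grep match any byte that is neither printable ASCII nor whitespace. This is a sketch, not a definitive answer: it assumes "junk" means control characters and bytes above 0x7F, and the function name `junk_lines` is invented for the example.

```shell
#!/bin/sh
# junk_lines FILE...: print "line-number:line" for every line that contains
# a byte outside printable ASCII and whitespace. LC_ALL=C makes grep work
# byte by byte, so the result does not depend on the file's encoding.
junk_lines() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$@"
}
```

Piping the result through cat -tv makes the offending bytes visible, as with the command mentioned in the question.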
I have a directory with hundreds of files that cannot have data past column 80. I do not know of a way to combine the "grep" and "cut" commands.
I tried:
cat * | cut -c 81-120 |pg
but it only shows me the line, not the file name.
Any help would be appreciated. Been on this all... (3 Replies)
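Since cut reads from a pipe here, the file name is lost. A minimal sketch of an alternative is to let awk do both jobs, because awk tracks the current file name itself. The function name `long_lines` is made up for illustration, and "column" is assumed to mean a character position as counted by awk's length().

```shell
#!/bin/sh
# long_lines FILE...: for every line longer than 80 characters, print the
# file name, the line number within that file, and the text from column 81 on.
long_lines() {
    awk 'length($0) > 80 { print FILENAME ":" FNR ":" substr($0, 81) }' "$@"
}
```

Run in the directory in question as `long_lines *`, this lists only the offending lines, each prefixed with its file.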
I have a very large system-generated file containing around 500K rows (about 100 MB in size), like the following:
HOME|ALICE STREET|3||NEW LISTING
HOME|NEWPORT STREET|1||NEW LISTING
HOME|KING STREET|5||NEW LISTING
HOME|WINSOME AVENUE|4||MODIFICATION
CAR|TOYOTA|4||NEW LISTING
CAR|FORD|4||NEW... (9 Replies)
Hi,
Is there a way to identify the lines in a file that contain extended ASCII characters and display them?
For instance, I have a file abc.txt with the data below:
aaa|bbb|111|This is first line
aaa|bbb|222|This is secõnd line
aaa|bbb|333|This is third line
aaa|bbb|444|This is foùrth line... (3 Replies)
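For sample data like this, one option is to match only bytes with the high bit set. The sketch below assumes that "extended ASCII" means any byte from 0x80 to 0xFF, which covers õ and ù whether the file is Latin-1 or UTF-8. The name `extended_lines` is hypothetical.

```shell
#!/bin/sh
# extended_lines FILE...: print "line-number:line" for lines containing a
# byte in the range 0x80-0xFF. In the C locale, every byte from 0x00 to 0x7F
# is either a control or a printable character, so the negated class below
# matches exactly the remaining high bytes.
extended_lines() {
    LC_ALL=C grep -n '[^[:cntrl:][:print:]]' "$@"
}
```

With the four lines above in abc.txt, `extended_lines abc.txt` should report the second and fourth lines only.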
I want to filter out the special characters whose ASCII value doesn't fall within the range "" .
Example: � or Ć. So, in that case, is there any defined range which will filter out these characters?
I can filter those which fall within "" . Need to filter those special characters which don't... (14 Replies)
I am on Linux and I am supposed to receive 3 files. If any of the files is not received, I need to identify the missing file and put its name in a variable.
I have put in something like this
if ]
then echo "file $file1 was found"
else
echo "ERROR: file $file1 was not found!!!"... (8 Replies)
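The test in the snippet above did not survive, so the following is only a sketch of one way to collect the missing names, not a reconstruction of the poster's script. It assumes "received" means the path exists as a regular file, and the function name `find_missing` is invented here; `$file2` and `$file3` in the usage comment are likewise assumed names for the other two files.

```shell
#!/bin/sh
# find_missing FILE...: print the name of every argument that does not
# exist as a regular file, one per line.
find_missing() {
    for f in "$@"; do
        [ -f "$f" ] || printf '%s\n' "$f"
    done
}

# Example use (commented out so that sourcing this file has no side effects):
# missing=$(find_missing "$file1" "$file2" "$file3")
# [ -z "$missing" ] || echo "ERROR: files not found: $missing"
```

Capturing the output with command substitution leaves the list of missing files in a single variable, which is empty when everything arrived.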
I am working on Sindhi, a Perso-Arabic script, and since it shares the Unicode block with over 400 other languages, the database quite often contains characters which are not wanted: illegal characters.
I have identified the character set of Sindhi which is given below:
For clarity's sake, each... (8 Replies)
Hi Team,
I am running the script below from ControlM, and the job is reported as a failure every day, so I tried to change the if exitstatus=1 branch to only send an email and not end the job as failed. Can you let me know where I have to change this script so that it does not fail but instead sends an email and... (3 Replies)
Discussion started by: Mi4304
make-lingua-identify-language
MAKE-LINGUA-IDENTIFY-LANGUAGE(1p) User Contributed Perl Documentation MAKE-LINGUA-IDENTIFY-LANGUAGE(1p)
NAME
make-lingua-identify-language - creates language modules for Lingua::Identify
SYNOPSIS
make-lingua-identify-language Language-Tag Language-Name file1 [file2 ...]
or
make-lingua-identify-language -d TAG1-LANGUAGE1/ [TAG2-LANGUAGE2/ ...]
or
make-lingua-identify
DESCRIPTION
Creates language modules to be used by Lingua::Identify.
After creating the modules, you still have to install them.
Please note that this script is still at an early stage. Please do not even look at the code...
Without parameters, make-lingua-identify-language assumes mode -d and goes through all the directories in the current one. This is useful
in a directory where you have something like this:
.
|-- en-english
|   `-- english.txt
|-- fr-french
|   |-- french1.txt
|   `-- french2.txt
`-- pt-portuguese
    `-- portuguese.txt
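As an illustration, the layout above could be prepared with a small shell helper. This is a sketch only: the function name `make_training_tree` is made up, the text files are created empty as placeholders, and real training text would have to be put into them before running the tool.

```shell
#!/bin/sh
# make_training_tree BASE: recreate the example layout under BASE.
# Directory names follow the tag-name form (e.g. en-english); the .txt
# files are empty placeholders that must be filled with training text.
make_training_tree() {
    mkdir -p "$1/en-english" "$1/fr-french" "$1/pt-portuguese" || return 1
    : > "$1/en-english/english.txt"
    : > "$1/fr-french/french1.txt"
    : > "$1/fr-french/french2.txt"
    : > "$1/pt-portuguese/portuguese.txt"
}
```

After filling the files, changing into the base directory and running make-lingua-identify-language without parameters should, per the description above, process all three directories; the equivalent explicit form is make-lingua-identify-language -d en-english/ fr-french/ pt-portuguese/.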
OPTIONS
-d
Directory mode. Each parameter passed should be a directory whose name must be of the form tag-name (e.g., en-english/ ). Each of the
directories passed should contain text files that can be used to train Lingua::Identify.
-D
Debug mode. Only for development.
-h
Display help and exit.
-v
Show version and exit.
-verbose
Verbose mode.
-locale="<locale>"
Set a specific locale. This way, your text will all be lowercased before being analysed.
META.yml
"META.yml" files are not parsed as other files, they are ignored.
In directory mode ("-d" switch), "META.yml" files are checked for info on languages codes and sets.
Here's a simple "META.yml" for you to put in your directories:
two_letter_code: pt
three_letter_code: por
sets:
spoken_in_portugal
With that, the language will be identified with the two letter code "pt" or the three letter code "por"; it will also be in the set
":spoken_in_portugal".
CONTRIBUTING WITH NEW LANGUAGES
Please do not contribute modules you made yourself. It's easier to contribute unprocessed text, because that way new
versions of Lingua::Identify do not have to drop languages in case I can't contact you by that time.
Use make-lingua-identify-language to create a new module for your own personal use, if you must, but try to contribute unprocessed
text rather than those modules.
SEE ALSO
Lingua::Identify(3), langident(1)
A linguist and/or a shrink.
The latest CVS version of "Lingua::Identify" (which includes make-lingua-identify) can be obtained at
http://natura.di.uminho.pt/natura/viewcvs.cgi/Lingua/Identify/
ISO 639 Language Codes, at http://www.w3.org/WAI/ER/IG/ert/iso639.htm
AUTHOR
Jose Alves de Castro, <cog@cpan.org>
COPYRIGHT AND LICENSE
Copyright 2004-2005 by Jose Alves de Castro
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.14.2 2012-01-02 MAKE-LINGUA-IDENTIFY-LANGUAGE(1p)