I don't think it's wise to get rid of them, because they separate (and thus define) the words. Leave them in, count them, and then eliminate the "blank" count.
Those non-ASCII opening and closing double quotes are multibyte Unicode characters. It might be easier to convert them to ASCII quotes beforehand. The same may hold for the "blank" characters above...
Once converted, this
might give you a feeling for the "speech length".
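For what it's worth, here is a minimal sketch of that idea; it is not the command from the original reply. It assumes UTF-8 input, quotes that open and close on the same line, and that "speech length" means the number of characters between a pair of double quotes. The helper names to_ascii_quotes and speech_lengths are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: replace typographic double quotes with ASCII ".
to_ascii_quotes() {
    sed -e 's/“/"/g' -e 's/”/"/g'
}

# Hypothetical helper: print the length of each quoted passage, one per line.
# With '"' as the field separator, even-numbered fields lie inside quotes;
# "i < NF" skips a trailing passage whose closing quote is missing.
speech_lengths() {
    to_ascii_quotes | awk -F'"' '{ for (i = 2; i < NF; i += 2) print length($i) }'
}

printf '%s\n' 'He said “hello there” and left.' | speech_lengths
```

With awk implementations that count bytes rather than characters, non-ASCII text inside the quotes will inflate the lengths.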
Hello,
I read a file whose lines contain data fields separated by runs of blank characters (10, 12, 5, or 1 .......)
When I use the command "while read line" in the script (see below), only one blank character is left between the fields and the other blank characters are gone.
... (3 Replies)
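The script itself is not shown here, so this is only a guess at the cause: blanks usually collapse when the variable is expanded without quotes (for example echo $line), not in read itself. A minimal sketch under that assumption, with a hypothetical helper name copy_lines:

```shell
#!/bin/sh
# Hypothetical helper: copy stdin to stdout line by line, keeping every blank.
# IFS= stops read from trimming leading/trailing blanks, -r keeps backslashes
# literal, and the quoted "$line" prevents word splitting on output.
copy_lines() {
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done
}

printf 'a     b  c\n' | copy_lines
```

Replacing the printf line with an unquoted echo $line would print the fields separated by single spaces.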
Hi,
I am trying to do two things in my script. I would really appreciate any help in this regard.
Is there a way to delete the last line from a pipe-delimited flat file if the last line is blank? If the line is not blank, then do nothing.....
Is there a way to count words that are starting... (4 Replies)
Does anyone know how to turn off color and weird characters in the bash shell when using the command "script"? Every time users on my server used that command to record their script, they either couldn't print it because lp kept giving the "unknown format character" messages or the print paper... (1 Reply)
Hi.
I have files on my system that have weird names containing unconventional ASCII characters.
I would like to run them but I can't refer to them.
I know the ASCII codes of the problematic characters.
I can't change their names since they belong to a 3rd party program... but I want to run them.
Is there... (2 Replies)
Hi everyone,
I'm trying to write a shell script that processes a log file. The log format is generally:
(8 digit hex of unix time),(system ID),(state)\n
My shell script gets the file from the web, saves it in a local text directory. I then want to change the hex to decimal, convert from unix time... (7 Replies)
I just finished the shell script.
This script can replace weird characters (such as #$%^@!'"...) in a file or directory name with "_"
I spent a long time on replacing the apostrophe in file/directory names
added: 2012-03-14
the 124th line (/usr/bin/perl -i -e "s#\'#\\'#g" /tmp/rpdir_level$i.tmp) is... (5 Replies)
Hi,
I was trying to remove the blanks from the beginning of a line.
When I try:
sed 's/^ +//' filename
it does not work,
but when I try
sed 's/^ *//' filename
it works.
But I think the first command should also have matched any line starting with one or more blanks.
Kindly help me in understanding... (5 Replies)
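A likely explanation: sed uses basic regular expressions by default, where + is an ordinary character, so 's/^ +//' only matches a blank followed by a literal plus sign. The sketch below writes "one or more blanks" as a blank followed by " *"; the helper name strip_leading_blanks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: delete one or more leading spaces from every line.
# In a basic regular expression "  *" (space, space, star) means
# one space followed by zero or more spaces.
strip_leading_blanks() {
    sed 's/^  *//'
}

printf '   indented\n' | strip_leading_blanks

# For comparison: here + is literal, so only " +" at line start is removed.
printf ' +x\n' | sed 's/^ +//'
```

Where sed supports the -E option, sed -E 's/^ +//' should behave like the helper, because + is a repetition operator in extended regular expressions.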
Dear all,
I have the files: xaa xab xac
and I try to paste them using: $ paste -d, xaa xab xac
I see:
output
3e-130
,6e-78
,5e-74
6e-124
,0,007
,0,026
2e-119
When I type: $ paste -d, xaa xab xac |less
I see:
output
3e-130^M,6e-78^M,5e-74
6e-124^M,0,007^M,0,026 (2 Replies)
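The ^M that less shows is a carriage return, which suggests the input files have DOS-style CRLF line endings. A minimal sketch, assuming that is the cause, removes the carriage returns after pasting. The helper name paste_without_cr is hypothetical, and the demo recreates two-line versions of xaa, xab and xac in a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: paste the given files with "," as delimiter and
# drop every carriage return from the result.
paste_without_cr() {
    paste -d, "$@" | tr -d '\r'
}

# Demo with CRLF-terminated sample files in a scratch directory.
tmp=$(mktemp -d)
printf '3e-130\r\n6e-124\r\n' > "$tmp/xaa"
printf '6e-78\r\n0,007\r\n'   > "$tmp/xab"
printf '5e-74\r\n0,026\r\n'   > "$tmp/xac"
paste_without_cr "$tmp/xaa" "$tmp/xab" "$tmp/xac"
rm -r "$tmp"
```

Note that values such as 0,007 contain the delimiter themselves, so a different -d character may be safer for real data.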
I am using Korn shell on a Linux 2.6x platform, and I am using the following code to capture the lines which contain control characters in my file:
awk '/]/ {print NR}' EROLLMENT_INPUT.txt
The problem is that this code shows the file has control characters when the file is in folder A ,... (2 Replies)
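The awk pattern above looks truncated, so the following is only a sketch of one way to list such lines, assuming "control characters" means the POSIX [[:cntrl:]] class. It uses sed instead of the awk command from the post, and the helper name lines_with_cntrl is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the line numbers of lines that contain a
# control character. LC_ALL=C keeps the class definition locale-independent;
# "=" prints the current line number, -n suppresses the lines themselves.
lines_with_cntrl() {
    LC_ALL=C sed -n '/[[:cntrl:]]/=' "$@"
}

printf 'clean\nbad\001line\nalso clean\n' | lines_with_cntrl
```

Tabs and carriage returns are control characters too, so files with CRLF line endings will report every line.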
Hi All
Need Help
I have a file with the below format (ABC.TXT) :
®¿¿ABCDHEJJSJJ|XCBJSKK01|M|7348974982790
HDFLJDKJSKJ|KJALKSD02|M|7378439274898
KJHSAJKHHJJ|LJDSAJKK03|F|9898982039999
(cont......)
I need to write a script where it will check for : blank lines (between rows,before... (6 Replies)
Discussion started by: chatwithsaurav
LEARN ABOUT HPUX
iswctype
wctype(3C)
NAME
wctype(), iswalpha(), iswblank(), iswupper(), iswlower(), iswdigit(), iswxdigit(), iswalnum(), iswspace(), iswpunct(), iswprint(),
iswgraph(), iswcntrl(), iswctype() - classify wide characters
SYNOPSIS
Remarks
These functions are compliant with the XPG4 Worldwide Portability Interface wide-character classification functions. They parallel the
8-bit character classification functions defined in ctype(3C).
DESCRIPTION
These functions classify wide character values according to the rules of the coded character set identified by the last successful call to
setlocale() (see setlocale(3C)).
If setlocale() has not been called successfully, characters are classified according to the rules of the default ASCII 7-bit coded character set (see
setlocale(3C)).
Each of the classification functions is a predicate that returns non-zero for true, zero for false.
wctype() is defined for valid character class names as defined in the current locale. charclass is a string identifying a generic character class
for which codeset-specific type information is required. The following class names are defined in all locales: "alnum", "alpha", "blank",
"cntrl", "digit", "graph", "lower", "print", "punct", "space", "upper", and "xdigit". User-defined class
names may be specified if supported by the current locale as defined by its LC_CTYPE category (see setlocale(3C)). wctype() returns a value of type wctype_t that can be used in a
subsequent call to iswctype(), or a failure value if charclass is not valid in the current locale.
The classification functions return non-zero under the following circumstances, and zero otherwise:
iswctype(wc, prop)   wc has the property defined by prop.
iswalpha(wc)         wc is a letter.
iswblank(wc)         wc is a blank character; that is, a space or tab.
iswupper(wc)         wc is an uppercase letter.
iswlower(wc)         wc is a lowercase letter.
iswdigit(wc)         wc is a decimal digit (in ASCII: characters [0-9]).
iswxdigit(wc)        wc is a hexadecimal digit (in ASCII: characters [0-9], [A-F] or [a-f]).
iswalnum(wc)         wc is an alphanumeric (letters or digits).
iswspace(wc)         wc is a character that creates "white space" in displayed text (in ASCII: space, tab, carriage return, new-line,
                     vertical tab, and form-feed).
iswpunct(wc)         wc is a punctuation character (in ASCII: any printing character except the space character (040), digits, letters).
iswprint(wc)         wc is a printing character.
iswgraph(wc)         wc is a visible character (in ASCII: printing characters, excluding the space character (040)).
iswcntrl(wc)         wc is a control character (in ASCII: character codes less than 040 and the delete character (0177)).
If the argument to any of these functions is outside the domain of the function, the result is 0 (false).
Definitions for these functions and the types wint_t and wctype_t are provided in the <wchar.h> header.
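The standard class names such as blank, space and cntrl are also usable from shell tools as [:name:] in tr and in regular-expression bracket expressions, which ties this page back to counting "blank" characters in a script. A minimal sketch, run in the C locale so that it follows the ASCII definitions; the helper name count_class is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the bytes on stdin that belong to the
# character class named by $1 (for example blank, space, cntrl).
# tr -cd deletes everything outside the class; wc -c counts what is left;
# the last tr strips the padding some wc implementations add.
count_class() {
    LC_ALL=C tr -cd "[:$1:]" | wc -c | tr -d ' '
}

printf 'a b\tc\r\n' | count_class blank   # space and tab
printf 'a b\tc\r\n' | count_class space   # space, tab, carriage return, new-line
```

The difference between the two counts is the carriage return and new-line, which are white space but not blanks.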
EXTERNAL INFLUENCES
Locale
The LC_CTYPE category determines the classification of character type.
International Code Set Support
Single-byte and multibyte character code sets are supported.
AUTHOR
wctype() was developed by IBM, OSF, and HP.
SEE ALSO
ctype(3C), multibyte(3C), setlocale(3C), ascii(5), thread_safety(5).
STANDARDS CONFORMANCE wctype(3C)