How to find out the weird blank characters? Post 303003866 by RudiC on Thursday 21st of September 2017 03:48:03 PM
I don't think it's wise to get rid of them, because they separate (and thus define) the words. Leave them in, count them, and then discard the "blank" count.
Those non-ASCII opening and closing double quotes are multibyte Unicode characters. It might be easier to convert them to ASCII quotes beforehand. The same may well hold for the "blank" chars above...
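
A rough sketch of such a conversion, under a few assumptions of mine: the input is UTF-8, the curly quotes are U+201C and U+201D, and the "blank" chars are no-break spaces (U+00A0). Check with od -c or similar which bytes your file really contains. The function name to_ascii is made up for illustration.

```shell
# to_ascii: hypothetical helper name, not part of any standard tool.
# Assumes UTF-8 input. Replaces curly double quotes (U+201C, U+201D)
# with ASCII " and no-break spaces (U+00A0) with plain spaces.
to_ascii() {
    lq=$(printf '\342\200\234')   # UTF-8 bytes of U+201C
    rq=$(printf '\342\200\235')   # UTF-8 bytes of U+201D
    nbsp=$(printf '\302\240')     # UTF-8 bytes of U+00A0 (assumed "blank")
    sed -e "s/$lq/\"/g" -e "s/$rq/\"/g" -e "s/$nbsp/ /g" "$@"
}
# usage: to_ascii file > file.ascii
```

Building the byte sequences with printf keeps the invisible characters out of the script itself, so it survives copy and paste.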

Once converted, this
Code:
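awk -F\" '
        {# an even NF means a quotation is still open: append the next line,
         # stopping at end of file so an unbalanced quote cannot loop forever
         while (!(NF%2) && (getline X) > 0)
                $0 = $0 " " X
         # even-numbered fields lie between quotes; gsub returns the word count
         for (i=2; i<=NF; i+=2) print gsub (/[A-Za-z0-9]+/, "&", $i)
        }
' file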

might give you a feeling for the "speech length".
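
To see what it does, here is a small trial run on made-up input with a quotation that spans two lines. The wrapper name speech_lengths and the sample text are mine, for illustration only. The loop condition in this sketch also checks getline's return value, so an unbalanced quote at end of file cannot loop forever.

```shell
# speech_lengths: hypothetical wrapper name around the awk program above
speech_lengths() {
    awk -F\" '
        {
         # even NF: a quotation is still open, so append the next line
         while (!(NF%2) && (getline X) > 0)
                $0 = $0 " " X
         # even-numbered fields sit between quotes; gsub returns the word count
         for (i=2; i<=NF; i+=2) print gsub (/[A-Za-z0-9]+/, "&", $i)
        }
    ' "$@"
}

# made-up sample: the second quotation continues on the next line
printf '%s\n' 'He said "hello there" and she answered "no' \
              'way, not today" then left.' | speech_lengths
```

This prints 2 and 4, one line per quotation: "hello there" has two words, and "no way, not today" has four once the two input lines are joined.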

Last edited by RudiC; 09-21-2017 at 05:00 PM.
 

STRSPLIT(3pub)						       C Programmer's Manual						    STRSPLIT(3pub)

NAME
       strsplit - split string into words

SYNOPSIS
       #include <publib.h>

       int strsplit(char *src, char **words, int maxw, const char *sep);

DESCRIPTION
       strsplit splits the src string into words separated by one or more of the characters in sep (or by whitespace characters, as specified by isspace(3), if sep is the empty string). Pointers to the words are stored in successive elements in the array pointed to by words. No more than maxw pointers are stored. The input string is modified by replacing the separator character following a word with '\0'. However, if there are more than maxw words, only maxw-1 words will be returned, and the maxwth pointer in the array will point to the rest of the string. If maxw is 0, no modification is done. This can be used for counting how many words there are, e.g., so that space for the word pointer table can be allocated dynamically.

       strsplit splits the src string into words separated by one or more of the characters in sep (or by whitespace characters, as defined by isspace(3), if sep is the empty string). The src string is modified by replacing the separator character after each word with '\0'. A pointer to each word is stored into successive elements of the array words. If there are more than maxw words, a '\0' is stored after the first maxw-1 words only, and words[maxw-1] will contain a pointer to the rest of the string after the word in words[maxw-2].

RETURN VALUE
       strsplit returns the total number of words in the input string.

EXAMPLE
       Assuming that words are separated by white space, to count the number of words on a line, one might say the following.

            n = strsplit(line, NULL, 0, "");

       To print out the fields of a colon-separated list (such as PATH, or a line from /etc/passwd or /etc/group), one might do the following.

            char *fields[15];
            int i, n;

            n = strsplit(list, fields, 15, ":");
            if (n > 15)
                 n = 15;
            for (i = 0; i < n; ++i)
                 printf("field %d: %s\n", i, fields[i]);

       In real life, one would of course prefer to not restrict the number of fields, so one might either allocate the pointer table dynamically (first counting the number of words using something like the first example), or realize that since it is the original string that is being modified, one can do the following:

            char *fields[15];
            int i, n, m;

            do {
                 n = strsplit(list, fields, 15, ":");
                 /* with more than 15 words, fields[14] is the unsplit rest */
                 m = (n > 15) ? 14 : n;
                 for (i = 0; i < m; ++i)
                      printf("field %d: %s\n", i, fields[i]);
                 if (n > 15)
                      list = fields[14];
            } while (n > 15);

SEE ALSO
       publib(3), strtok(3)

AUTHOR
       The idea for this function came from C-News source code by Henry Spencer and Geoff Collyer. Their function is very similar, but this implementation is by Lars Wirzenius (lars.wirzenius@helsinki.fi)

Publib							       C Programmer's Manual						    STRSPLIT(3pub)