foreign characters in flat file (UNIX for Advanced & Expert Users), post 302256852 by jim mcnamara on Monday 10th of November 2008 05:59:21 PM
We can't tell what is in the file. If the text is not in a foreign language, then try removing all the "weird characters", that is, anything non-printable.

Code:
tr -dc '[:print:]\n' < inputfile

If that does not do it, then try this small C program, which drops every byte above 127:
Code:
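/* file should be named badchar.c */
/* copies stdin to stdout, dropping every byte above 127 (non-ASCII) */
#include <stdio.h>
int main(void)
{
      int ch=0;
      /* fgetc() returns an int so that EOF can be told apart from data */
      while((ch=fgetc(stdin))!=EOF)
         if(ch<128 ) fputc(ch, stdout);
      return 0;
}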

Your C compiler is either cc or gcc, so I write [g]cc below; use whichever one you have.
Code:
[g]cc badchar.c -o badchar

To run the program, do this:
Code:
./badchar < badinputfile > newfile

 

GETS(3)                    Linux Programmer's Manual                    GETS(3)

NAME
       fgetc, fgets, getc, getchar, gets, ungetc - input of characters and strings

SYNOPSIS
       #include <stdio.h>

       int fgetc(FILE *stream);
       char *fgets(char *s, int size, FILE *stream);
       int getc(FILE *stream);
       int getchar(void);
       char *gets(char *s);
       int ungetc(int c, FILE *stream);

DESCRIPTION
       fgetc() reads the next character from stream and returns it as an unsigned char cast to an int, or EOF on end of file or error.

       getc() is equivalent to fgetc() except that it may be implemented as a macro which evaluates stream more than once.

       getchar() is equivalent to getc(stdin).

       gets() reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a null byte ('\0'). No check for buffer overrun is performed (see BUGS below).

       fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.

       ungetc() pushes c back to stream, cast to unsigned char, where it is available for subsequent read operations. Pushed-back characters will be returned in reverse order; only one pushback is guaranteed.

       Calls to the functions described here can be mixed with each other and with calls to other input functions from the stdio library for the same input stream.

       For nonlocking counterparts, see unlocked_stdio(3).

RETURN VALUE
       fgetc(), getc() and getchar() return the character read as an unsigned char cast to an int or EOF on end of file or error.

       gets() and fgets() return s on success, and NULL on error or when end of file occurs while no characters have been read.

       ungetc() returns c on success, or EOF on error.

CONFORMING TO
       C89, C99, POSIX.1-2001.

       LSB deprecates gets(). POSIX.1-2008 marks gets() obsolescent. ISO C11 removes the specification of gets() from the C language, and since version 2.16, glibc header files don't expose the function declaration if the _ISOC11_SOURCE feature test macro is defined.

BUGS
       Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.

       It is not advisable to mix calls to input functions from the stdio library with low-level calls to read(2) for the file descriptor associated with the input stream; the results will be undefined and very probably not what you want.

SEE ALSO
       read(2), write(2), ferror(3), fgetwc(3), fgetws(3), fopen(3), fread(3), fseek(3), getline(3), getwchar(3), puts(3), scanf(3), ungetwc(3), unlocked_stdio(3), feature_test_macros(7)

COLOPHON
       This page is part of release 3.44 of the Linux man-pages project. A description of the project, and information about reporting bugs, can be found at http://www.kernel.org/doc/man-pages/.

GNU                                2012-01-18                           GETS(3)