File conversion and removing special characters from a file in Linux
I have a .CSV file. When I check it for special characters using the command
Code:
cat -vet filename.csv
I get very long lines with "^@", "^I^@" and "^@^M" characters between every letter in all of the records. Using the command below
Code:
file filename.csv
I get the output as
Quote:
filename.csv: Little-endian UTF-16 Unicode English character data, with very long lines, with CRLF, CR line terminators
I have a script to remove the Control-M (^M) characters from the file, but running it fails with the error: cannot execute binary file.
I know that ^I represents a tab, and I also have a script to convert the ^I characters into commas. Can anyone help me fix the error and also deal with the ^@ characters?
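One possible approach, sketched on the assumption that the file really is UTF-16 as file reports: the ^@ characters are the zero bytes that UTF-16 stores alongside each ASCII letter, so re-encoding the file with iconv removes them, after which tr can drop the ^M characters and turn the ^I characters into commas. The function name utf16_to_csv and the sample file are made up for illustration. The "cannot execute binary file" message typically appears when the shell is asked to execute something that is not a script or a binary for this platform, so it may be worth checking how the script is invoked.

```shell
# Hypothetical helper: convert a UTF-16LE, tab-separated, CRLF file
# into UTF-8, comma-separated text with LF line endings.
utf16_to_csv() {
    # iconv re-encodes the text, which is what removes the ^@ (NUL) bytes;
    # tr -d '\r' drops the ^M characters; tr '\t' ',' replaces ^I with commas.
    iconv -f UTF-16LE -t UTF-8 "$1" | tr -d '\r' | tr '\t' ','
}

# Demo on a small generated sample (a stand-in for filename.csv).
workdir=$(mktemp -d)
printf 'id\tname\r\n1\tabc\r\n' | iconv -f UTF-8 -t UTF-16LE > "$workdir/sample.csv"
utf16_to_csv "$workdir/sample.csv" > "$workdir/sample_utf8.csv"
cat "$workdir/sample_utf8.csv"
```

If the real file starts with a byte order mark, `-f UTF-16` instead of `-f UTF-16LE` usually lets iconv consume it. Since file also reports lone CR terminators, deleting every CR may join lines that were separated only by a CR, and fields that already contain commas would need quoting.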
I have a file special.txt with the following data.
<header info>
123$ty5%98&0asd
1@356fgbv78
09*&^5jkns43(
...........some more rows.
In my output file, I want to eliminate all the special characters and keep all the other data. I need some help. (6 Replies)
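A minimal sketch for this kind of cleanup, assuming "special character" means anything that is not a letter or a digit (the post does not define it). The helper name strip_special is hypothetical, and the sample rows are the ones quoted above.

```shell
# Hypothetical helper: keep only letters, digits and newlines.
strip_special() {
    # -c complements the set and -d deletes, so every character that is
    # NOT alphanumeric or a newline is removed from standard input.
    tr -cd '[:alnum:]\n'
}

# Single quotes keep the shell from expanding $ty5 and similar.
printf '%s\n' '123$ty5%98&0asd' '1@356fgbv78' '09*&^5jkns43(' | strip_special
```

For the three sample rows this prints 123ty5980asd, 1356fgbv78 and 095jkns43. A header line would be stripped the same way, so it may need to be handled separately.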
How to remove special characters at the end of each line in a file
file1.txt:
0003073413^M
0003073351^M
0003073379^M
0003282724^M
0003323334^M
0003217159^M
0003102760^M
0002228911^M
I used the command below, but it is not working:
perl -pi -e 's/^M\/g' file1.txt (6 Replies)
Hi,
On AIX 5200-07-00 I have a find command, shown below, that deletes files more than 7 days old from a certain location. I am being told that I cannot use the -exec option to delete files from these directories.
That said, I am curious to know how this can be done.
an sample... (3 Replies)
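One way this could be done without -exec, as a sketch only: let find print the matching names and have a shell loop do the deleting. The helper name delete_old_files and its arguments are hypothetical, since the original command is cut off in the post.

```shell
# Hypothetical helper: delete regular files older than N days under a
# directory without using find's -exec option.
delete_old_files() {
    dir=$1
    days=$2
    # find only prints the candidates; the while loop removes them.
    find "$dir" -type f -mtime +"$days" -print |
    while IFS= read -r f; do
        rm -f -- "$f"
    done
}
```

A call would look like `delete_old_files /some/dir 7`. File names containing newlines would not be handled correctly by this loop.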
My code executes a SQL file and stores the result set of the query in a text file in a fixed format. For that fixed format I have used the following code:
Code:
awk -F":"... (2 Replies)
Dear Friends,
I want to remove text between two patterns.
The problem is that it contains random special characters like \ / | * ` ~ ! $ etc.
These random special characters have no fixed length, but they always appear between a fixed pair of patterns,
e.g.
DM&^%#|#!\/?CT
Expected output... (14 Replies)
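A rough sketch with sed, assuming the fixed patterns are the literal strings DM and CT and that both markers should be kept while everything between them is dropped. The expected output is cut off in the post, so that part is a guess. The helper name strip_between is hypothetical.

```shell
# Hypothetical helper: remove whatever sits between DM and CT.
strip_between() {
    # .* matches any run of characters. The \ / | * and similar symbols
    # are in the data, not in the pattern, so they need no escaping here.
    sed 's/DM.*CT/DMCT/'
}

printf '%s\n' 'DM&^%#|#!\/?CT' | strip_between
```

Because .* is greedy, a line with several DM...CT pairs would be collapsed from the first DM to the last CT.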
Hi,
My file contains the special character "^M".
I would like to remove this character.
e.g.:
abc,abc,^M
I tried using sed, but it doesn't work.
I used the octal dump command to see the special character; it returns the following:
015
\r
Appreciate your reply. (6 Replies)
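The octal dump output identifies the character as octal 015, which is the carriage return. A small sketch that deletes it by that octal code with tr. The helper name strip_cr is hypothetical.

```shell
# Hypothetical helper: delete every carriage return (octal 015).
strip_cr() {
    tr -d '\015'
}

# od -c shows the bytes, so the missing \r can be checked by eye.
printf 'abc,abc,\r\n' | strip_cr | od -c
```

tr reads standard input and cannot edit a file in place, so the result has to be redirected to a new file.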
Hi,
I have a .csv file which has empty lines containing only commas, and some special characters in one column, as below.
Source data
1,2,3,4,%#,6
,,,,,,
1,2,3,4,5,6
Target Data
1,2,3,4,5,6
I need to remove blank lines and special characters.
I am trying to get this using the awk command below:
awk -F","... (2 Replies)
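A sketch that reproduces the Target Data shown above, assuming rows made up only of commas should be dropped and rows containing any character other than letters, digits and commas should be dropped as well. The post does not say whether such rows should be removed or only cleaned. The helper name clean_rows is hypothetical.

```shell
# Hypothetical helper: keep only rows made of letters, digits and commas.
clean_rows() {
    awk '
        /^[, \t]*$/     { next }   # skip rows holding only commas or blanks
        /[^A-Za-z0-9,]/ { next }   # skip rows with any other character
        { print }
    '
}

printf '%s\n' '1,2,3,4,%#,6' ',,,,,,' '1,2,3,4,5,6' | clean_rows
```

If the special characters should be stripped instead of the whole row being dropped, the second rule could be replaced with a gsub on the same character class.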
I have developed a small script to remove the Control-M characters that get embedded when we move a file from Windows to Unix. For some reason, it's not working in all scenarios. Sometimes I still see that the ^M has not been removed. Is there anything missing in the script:
cd ${inputDir}... (7 Replies)
Hello All ,
1. I am trying to do a task where I need to remove blank spaces from my file. I am using
awk '{$1=$1}{print}' file>file1
Input :-
;05/12/1990 ;31/03/2014 ;
Output:-
;05/12/1990 ;31/03/2014 ;
This command is not removing all spaces from... (6 Replies)
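Assigning $1=$1 only makes awk rebuild the record with single spaces between fields, so spaces remain. A sketch that deletes every space and tab instead. The helper name remove_spaces is hypothetical, and the sample line uses made-up runs of spaces because the spacing of the original input is not preserved in the post.

```shell
# Hypothetical helper: delete all spaces and tabs from each line.
remove_spaces() {
    # gsub without a third argument works on the whole record ($0).
    awk '{ gsub(/[ \t]+/, ""); print }'
}

printf '%s\n' ';05/12/1990   ;31/03/2014    ;' | remove_spaces
```

For the sample line this prints ;05/12/1990;31/03/2014;.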
Hi,
Does anyone know if there is a script or program available out there that uses a conversion table to replace special characters from a file?
I am trying to remove some special characters from a file, but there are several unprintable/control characters, some of which I need to remove but some I... (2 Replies)
Discussion started by: newbie_01
regex
regex(1F) FMLI Commands regex(1F)
NAME
regex - match patterns against a string
SYNOPSIS
regex [-e] [ -v "string"] [ pattern template] ... pattern [template]
DESCRIPTION
The regex command takes a string from the standard input, and a list of pattern / template pairs, and runs regex() to compare the string
against each pattern until there is a match. When a match occurs, regex writes the corresponding template to the standard output and
returns TRUE. The last (or only) pattern does not need a template. If that is the pattern that matches the string, the function simply
returns TRUE. If no match is found, regex returns FALSE.
The argument pattern is a regular expression of the form described in regex(). In most cases, pattern should be enclosed in single quotes
to turn off special meanings of characters. Note that only the final pattern in the list may lack a template.
The argument template may contain the strings $m0 through $m9, which will be expanded to the part of pattern enclosed in ( ... )$0 through
( ... )$9 constructs (see examples below). Note that if you use this feature, you must be sure to enclose template in single quotes so
that FMLI does not expand $m0 through $m9 at parse time. This feature gives regex much of the power of cut(1), paste(1), and grep(1), and
some of the capabilities of sed(1). If there is no template, the default is $m0$m1$m2$m3$m4$m5$m6$m7$m8$m9.
OPTIONS
The following options are supported:
-e Evaluates the corresponding template and writes the result to the standard output.
-v "string" Uses string instead of the standard input to match against patterns.
EXAMPLES
Example 1: Cutting letters out of a string
To cut the 4th through 8th letters out of a string (this example will output strin and return TRUE):
`regex -v "my string is nice" '^.{3}(.{5})$0' '$m0'`
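For readers without FMLI, a rough counterpart in plain shell (not the regex command itself) can be sketched with sed: capture characters 4 through 8 and print only the captured group. The helper name cut_4_to_8 is hypothetical.

```shell
# Rough shell counterpart (not FMLI): print characters 4-8 of a string.
cut_4_to_8() {
    # ^.{3} skips three characters, (.{5}) captures the next five,
    # and .*$ swallows the rest so only the capture is printed.
    printf '%s\n' "$1" | sed -E 's/^.{3}(.{5}).*$/\1/'
}

cut_4_to_8 "my string is nice"
```

This prints strin for the sample string. Unlike regex, sed prints a string shorter than eight characters unchanged and does not return FALSE.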
Example 2: Validating input in a form
In a form, to validate input to field 5 as an integer:
valid=`regex -v "$F5" '^[0-9]+$'`
Example 3: Translating an environment variable in a form
In a form, to translate an environment variable which contains one of the numbers 1, 2, 3, 4, 5 to the letters a, b, c, d, e:
value=`regex -v "$VAR1" 1 a 2 b 3 c 4 d 5 e '.*' 'Error'`
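The same pattern/template idea can be approximated in an ordinary shell script with a case statement, shown here as a sketch and not as FMLI syntax. The function name translate_digit is hypothetical.

```shell
# Rough shell counterpart (not FMLI): map 1-5 to a-e, anything else to Error.
translate_digit() {
    case $1 in
        1) echo a ;;
        2) echo b ;;
        3) echo c ;;
        4) echo d ;;
        5) echo e ;;
        *) echo Error ;;   # plays the role of the final '.*' pattern
    esac
}

translate_digit 3
```

Note that case uses shell glob patterns matched against the whole value, not regular expressions.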
Note the use of the pattern '.*' to mean "anything else".
Example 4: Using backquoted expressions
In the example below, all three lines constitute a single backquoted expression. This expression, by itself, could be put in a menu
definition file. Since backquoted expressions are expanded as they are parsed, and output from a backquoted expression (the cat command,
in this example) becomes part of the definition file being parsed, this expression would read /etc/passwd and make a dynamic menu of all
the login ids on the system.
`cat /etc/passwd | regex '^([^:]*)$0.*$' '
name=$m0
action=`message "$m0 is a user"`'`
DIAGNOSTICS
If none of the patterns match, regex returns FALSE, otherwise TRUE.
NOTES
Patterns and templates must often be enclosed in single quotes to turn off the special meanings of characters, especially if you use the
$m0 through $m9 variables in the template, since FMLI will expand the variables (usually to "") before regex even sees them.
Single characters in character classes (inside []) must be listed before character ranges, otherwise they will not be recognized. For
example, [a-zA-Z_/] will not find underscores (_) or slashes (/), but [_/a-zA-Z] will.
The regular expressions accepted by regcmp differ slightly from other utilities (that is, sed, grep, awk, ed, and so forth).
regex with the -e option forces subsequent commands to be ignored. In other words, if a backquoted statement appears as follows:
`regex -e ...; command1; command2`
command1 and command2 would never be executed. However, dividing the expression into two:
`regex -e ...``command1; command2`
would yield the desired result.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWcsu |
+-----------------------------+-----------------------------+
SEE ALSO
awk(1), cut(1), grep(1), paste(1), sed(1), regcmp(3C), attributes(5)
SunOS 5.10 12 Jul 1999 regex(1F)