10-28-2008
Lines with strange characters and sed...
Dear All:
I have a bunch of files that I'd like to process with a shell script. The problem is that the files have strange characters in their headers, like this:
�g�8@L-000-MSG2__-ABCD________-FIRA_____-000001___-200806181330-__
��e�
Data from BLABLABLA, Instrument: BLABLA, Date: 2008/06/18 13:30Z
Row: 1078 Col: 1130 Lat: -22.267 Lon: 22.256 *** Something here ***
For my purposes, I only need the information from (in this case) line 3 onwards. Sometimes this strange header occupies 2 lines, sometimes 3, and sometimes I don't know how many.
I made a very simple test, like this:
FILE=`find . -type f -name "FILENAME"`
for i in $FILE
do
    FNOW=`echo $i`
    # Cut the first two lines of the file
    sed '1,2d' $FNOW > newfile
    sed '/^$/d' -i newfile
    HEADER=`head -1 newfile | cut -c1-4`
    if [ "$HEADER" != "Data" ]
    then
        sed '1d' -i newfile
        sed '/^$/d' -i newfile
    fi
    # A simple check
    HEADER2=`head -1 newfile | cut -c1-4`
    echo ${HEADER2},${HEADER} >> test.txt
done
The problem is that sometimes I don't manage to cut all the "strange" headers to obtain "clean" files, as you can see in some lines of test.txt:
Data,@H
Data,ۘ
Data,Data
Data,@H
(etc)
So:
Is there any way to do this with sed? Maybe something like "delete all the leading lines until the expression «Data» is found"? Honestly, I don't know what else to try.
Thank you very much in advance
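One possible approach, as a sketch rather than a definitive answer: instead of deleting a fixed number of lines, sed can be given a range that starts at the first line beginning with "Data" and runs to the end of the file, and told to delete everything outside that range. This assumes that every useful section really does begin with a line starting with "Data", as in the sample above. The function name strip_header and the ".clean" suffix are made up for illustration.

```shell
#!/bin/sh
# strip_header FILE (hypothetical helper name):
# print FILE from the first line that starts with "Data" to the end,
# dropping empty lines along the way.
strip_header() {
    # LC_ALL=C makes sed treat the input as raw bytes, so the binary
    # junk in the header cannot upset multibyte character handling.
    #   /^Data/,$!d  deletes every line outside the range "first line
    #                starting with Data .. last line"
    #   /^$/d        deletes empty lines that remain inside the range
    LC_ALL=C sed -e '/^Data/,$!d' -e '/^$/d' "$1"
}

# Possible use over many files (the ".clean" suffix is an assumption):
#   find . -type f -name "FILENAME" | while IFS= read -r f; do
#       strip_header "$f" > "$f.clean"
#   done
```

Because the range is anchored on the content and not on a line count, it should not matter whether the strange header occupies 2 lines, 3, or more. If a file has no line starting with "Data" at all, the output for that file will be empty.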