How to filter only comments while reading a file including line break characters.


# 8  
Old 05-03-2010
Code:
sed '/^ *#/d;s/^ *//' input.txt | cut -d# -f1
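Note that `^ *` matches literal spaces only, so lines indented with tabs are not trimmed, and empty lines are not removed. A minimal sketch of a variant that handles both, assuming a sed that supports POSIX character classes such as `[[:space:]]`; the function name `strip_sed` is made up for illustration:

```shell
# Hypothetical helper: reads lines on stdin, writes cleaned lines to stdout.
strip_sed() {
    # 1. drop everything from the first "#" to end of line
    # 2. trim leading whitespace (spaces and tabs)
    # 3. trim trailing whitespace
    # 4. delete lines that are now empty
    sed -e 's/#.*//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d'
}

# Usage: strip_sed < input.txt
```

Like the original one-liner, this treats every "#" as the start of a comment, so it is only suitable for input where "#" has no other meaning.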

# 9  
Old 05-03-2010
Quote:
Originally Posted by kchinnam
I still get the leading space character[s] on the third line. I used a TAB + spacebar on the third line in my input. Again, trimming should work for any combination of space characters.

Code:
$> /usr/xpg4/bin/awk 'NF && $1 != "#"{sub("#.*","");sub(/^ */,"");print}' input.txt
one="1111111" 
two="22222222"
         three=333333333

Try this
Code:
/usr/xpg4/bin/awk 'NF && $1 != "#"{sub("#.*","");$1=$1;print}' file

# 10  
Old 05-03-2010
Code:
/usr/xpg4/bin/awk '$1!="#" && NF{gsub(/^[\t ]*/,x);sub("#.*",x);print}' file

# 11  
Old 05-03-2010
input.txt

Code:
# Need to ignore this line
                  # need to ignore this line as well
one='1111111' # we want anthing after # ignored,, but stll keep line break.
two = '22222222'
          
         
         three=333333333
                   four=4444444
  five=5555555

Here is the output of pseudocoder's code

Code:
$> sed '/^ *#/d;s/^ *//' input.txt | cut -d# -f1
                  
one='1111111' 
two = '22222222'
          
         
         three=333333333
                   four=4444444
five=5555555

Here is the output of danmero's code. It works great, and this one is easier for me to understand.

Code:
$> /usr/xpg4/bin/awk '$1!="#" && NF{gsub(/^[\t ]*/,x);sub("#.*",x);print}' input.txt
one='1111111' 
two = '22222222'
three=333333333
four=4444444
five=5555555

Dan, how would you trim trailing spaces using your approach? I did not mention this requirement earlier, but that is where I want to take this, so that my code does not break even if users make common mistakes.


Here is the output of Franklin52's code. You must be a mind reader too, apart from being great at scripting :-).

It meets my requirement of removing spaces at the end of the line too. I think trailing spaces could sometimes break code if users make typos in input files.

Code:
$> /usr/xpg4/bin/awk 'NF && $1 != "#"{sub("#.*","");$1=$1;print}' input.txt
one='1111111'
two = '22222222'
three=333333333
four=4444444
five=5555555

I would really appreciate it if you could explain how the above command works. I am getting a little dizzy with it.

Thanks in advance!

---------- Post updated at 07:59 PM ---------- Previous update was at 07:56 PM ----------

Since the trailing spaces in my input.txt file got removed in my previous post, I am posting it again.

input.txt
Code:
$> cat input.txt
# Need to ignore this line
                  # need to ignore this line as well
one='1111111' # we want anthing after # ignored,, but stll keep line break.
two = '22222222'
          
         
         three=333333333
                   four=4444444
  five=5555555                  # Let us see if we can remove suffix spaces.

# 12  
Old 05-03-2010
One file, two solutions:
Code:
# cat file
# All empty lines with any combination of space characters should be ignored.
                  # need to ignore this line as well
one="1111111" # Anything after # should be ignored,, still keep line break.
two="22222222"

                three=333333333 other   data

# awk '$1!="#" && NF{gsub(/^[\t ]*/,x);sub("#.*",x);print}' file
one="1111111"
two="22222222"
three=333333333 other   data

# awk 'NF && $1 != "#"{sub("#.*","");$1=$1;print}' file
one="1111111"
two="22222222"
three=333333333 other data

I'll add comments later.

---------- Post updated at 08:17 PM ---------- Previous update was at 08:09 PM ----------

My solution removes any leading space or tab characters and preserves the rest of the record.
Franklin52's solution, on the other hand, rebuilds the record ($1=$1), replacing each run of FS (space and/or tab characters) with the default OFS (a single space). That also drops leading and trailing whitespace.
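The difference between the two approaches can be seen side by side in a small sketch. It assumes a POSIX awk on the PATH (on Solaris, /usr/xpg4/bin/awk as used earlier in this thread); the function names `strip_keep` and `strip_squeeze` are hypothetical:

```shell
# Both helpers read lines on stdin. "NF" is true for lines with at least one
# field, so empty and whitespace-only lines are skipped; $1 != "#" skips
# lines whose first field is a lone "#".

# First approach: strip leading blanks, drop the comment, keep inner spacing.
strip_keep() {
    awk '$1 != "#" && NF { sub(/^[ \t]+/, ""); sub(/#.*/, ""); print }'
}

# Second approach: assigning to a field makes awk rebuild $0 from its fields,
# joined by OFS (a single space by default), so runs of blanks collapse.
strip_squeeze() {
    awk 'NF && $1 != "#" { sub(/#.*/, ""); $1 = $1; print }'
}

printf '\t three=3 other   data\n' | strip_keep     # three=3 other   data
printf '\t three=3 other   data\n' | strip_squeeze  # three=3 other data
```

Which one fits depends on whether spacing inside a value is significant in the input file.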
# 13  
Old 05-03-2010
Dan, thanks for the update. You are right, Frank's $1=$1 collapses multiple spaces between words. I am sure he is going to come up with a better idea :-)!

Latest input file with requirements:

Code:
$> cat input.txt
# Need to ignore this line
                  # need to ignore this line as well
one='1111111' # we want anthing after # ignored,, but stll keep line break.
two = '22222222'
# Need to remove empty lines && lines with only spaces
          
         
         three=333333333
                   four=4444444
  five=5555555  555     55              # Preserve spaces between words, but remove prefix, suffix spaces && comment at the end

Output from danmero's latest code:

Code:
$>/usr/xpg4/bin/awk '$1!="#" && NF{gsub(/^[\t ]*/,x);sub("#.*",x);print}' input.txt
one='1111111' 
two = '22222222'
three=333333333
four=4444444
five=5555555  555       55

Dan, is it possible to trim the trailing spaces at the end of the line, i.e. after 55?
Can we adjust the command a little to get it perfect?

Last edited by kchinnam; 05-03-2010 at 11:15 PM.. Reason: removed what may be a personal reference..
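One possible answer to the trailing-space question, offered as a sketch and not as something posted in this thread: add a second sub() anchored at the end of the line. The version below also removes the comment first and only then checks whether anything is left, so a line such as "#comment" (no space after the "#") is dropped too; with the `$1 != "#"` test it would be printed as an empty line. The function name `strip_trim` is hypothetical, and a POSIX awk is assumed.

```shell
# Hypothetical helper: reads lines on stdin, writes cleaned lines to stdout.
strip_trim() {
    awk '{
        sub(/#.*/, "")          # drop the comment, if any
        sub(/^[ \t]+/, "")      # trim leading spaces and tabs
        sub(/[ \t]+$/, "")      # trim trailing spaces and tabs
        if ($0 != "") print     # skip lines that ended up empty
    }'
}

# Usage: strip_trim < input.txt
```

Spacing between words is left untouched because no field is reassigned, so $0 is never rebuilt.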
# 14  
Old 05-03-2010
The only way to precisely implement a general purpose sh script to strip comments is to implement a sh language parser. In short, the goal would be to implement a sh parser in sh. Anything less would not be dependable for general purpose use. That said, for your needs (whatever they may be), perhaps an 80% solution is satisfactory 99% of the time.

If the input to the comment stripper is not constrained to some restricted format, there will be problems. The "#" is used for purposes other than to introduce a comment. And, even if it had no other use, situations like quoted strings and command substitutions would need to be taken into account.

The following POSIX-compliant sh script would be mangled by any naive solution.

nocomments.sh:
Code:
#!/bin/sh

echo There is no comment line beginning with a\
# in this file except for the leading '#!/bin/sh', which you may not want removed

echo If this' # was not quoted', I would be a comment
echo $# is the number of positional parameters, not a comment

Note that the above sh script contains no comments, other than the shebang line (#!/bin/sh), which you may not want to strip, depending on your goal.

Verifying the validity of the script, and that the line beginning with a "#" is indeed not a comment:
Code:
$ ./nocomments.sh 
There is no comment line beginning with a# in this file except for the leading #!/bin/sh, which you may not want removed
If this # was not quoted, I would be a comment
0 is the number of positional parameters, not a comment

Testing a proposed solution (sorry, danmero, I just picked yours because it's the latest post as I write this; all others suffer the same shortcomings):
Code:
$ awk '$1!="#" && NF{gsub(/^[\t ]*/,x);sub("#.*",x);print}' nocomments.sh 

echo There is no comment line beginning with a\
echo If this' 
echo $

The comment-less (shebang excepted) script has been mutilated.

Again, the proposed solutions may be sufficient for your needs; I'm simply pointing out their unsuitability for general purpose use.

Regards,
Alister

P.S. And those are just some POSIX-compliant possibilities; who knows what madness awaits beyond the standard.
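To illustrate how far a slightly less naive filter gets, here is a sketch that treats "#" as a comment marker only at the start of a line or after a space or tab. It is still nowhere near a parser: it leaves `$#` alone, but it would still mangle the quoted "#", the continuation line, and the shebang in nocomments.sh above. The function name `strip_ws_hash` is hypothetical, and a POSIX awk is assumed.

```shell
# Hypothetical helper: reads lines on stdin, writes cleaned lines to stdout.
strip_ws_hash() {
    awk '{
        sub(/(^|[ \t])#.*/, "")  # "#" counts only at line start or after a blank
        sub(/^[ \t]+/, "")       # trim leading spaces and tabs
        sub(/[ \t]+$/, "")       # trim trailing spaces and tabs
        if ($0 != "") print      # skip lines that ended up empty
    }'
}

# Usage: strip_ws_hash < input.txt
```

For a restricted name=value format this may be good enough; for arbitrary sh code the caveats above still apply.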