(g)awk: how to preserve white space (FS characters) or read a right subpart of $0?


 
# 1  
Old 07-28-2009

Hi,
I am using gawk (--posix) for extracting some information from something like the following lines (in a text file):

sms_snath_hp_C/CORE BUILD PREREQUISITE:
total 1556
drwxrwxrwx 2 sn sn 4096 2008-06-27 08:31 ./
drwxrwxrwx 13 sn sn 4096 2009-07-22 14:48 ../
-rwxrwxrwx 1 sn sn 15348 2007-05-11 08:37 This is a file name with seven spaces.jar*
-rwxrwxrwx 1 sn sn 22395 2007-05-11 08:37 This is a file name with eight spaces.jar*
-rwxrwxrwx 1 sn sn 73687 2007-05-11 08:37 ibmjcefw.jar*
-rwxrwxrwx 1 sn sn 767101 2007-05-11 08:37 ibmjceprovider.jar*

Using regular expressions (pattern matching), I ignore every line except the long-listing entries that are NOT directories.

So I consider only:
-rwxrwxrwx 1 sn sn 15348 2007-05-11 08:37 This is a file name with seven spaces.jar*
-rwxrwxrwx 1 sn sn 22395 2007-05-11 08:37 This is a file name with eighteen spaces.jar*
-rwxrwxrwx 1 sn sn 73687 2007-05-11 08:37 ibmjcefw.jar*
-rwxrwxrwx 1 sn sn 767101 2007-05-11 08:37 ibmjceprovider.jar*

The question is: how do I get the file names while preserving the white space in between?
Note that if the file name has no embedded FS characters, it is just $8 and the problem is solved. If the file name has multiple embedded FS characters, I do not want to simply concatenate $8 FS $9 FS $10 (and so on, in a loop), because I also want to preserve the exact number of FS characters between the words.
(Something like the shell's "read v1 v2 v3 v4 v5 v6 v7 fileName" would do.)

Thanks.

-sn
# 2  
Old 07-28-2009
you can try to use cut
Code:
echo $line | cut -d' ' -f8-

# 3  
Old 07-28-2009
I am sorry the line

-rwxrwxrwx 1 sn sn 22395 2007-05-11 08:37 This is a file name with eighteen spaces.jar*

in the original had multiple spaces in the name of the file (in the HTML posting here on the forum those got collapsed into single spaces).

It is something like this, where each X stands for one space:
ThisXisXXaXXXfileXXXXnameXXXXwithXXXeighteen spaces.jar*

---------- Post updated at 11:20 PM ---------- Previous update was at 10:58 PM ----------

Quote:
Originally Posted by ryandegreat25
you can try to use cut
Code:
echo $line | cut -d' ' -f8-

Hi ryandegreat25

Your quick and correct answer is appreciated. Yes, I could use "cut", "read" and so on, but all of these are shell commands (external or built-in).

Could we solve this inside the gawk script itself, that is, without calling other shell commands or scripts? I already have a gawk script in place that does some other things too. If it cannot be done, I will have to perform "surgery" on the script and split it into possibly many scripts with "read" or "cut" piped in between.

Thanks.

-sn
# 4  
Old 07-28-2009
I see. Well, maybe you could try squeezing the repeated spaces with tr -s first. I'm sure there are better suggestions out there.
Code:
echo $x | tr -s " " | cut -d' ' -f8-

I see. Let's wait for replies from others.

---------- Post updated at 02:23 PM ---------- Previous update was at 02:20 PM ----------

Can you show us your script?
# 5  
Old 07-28-2009
Quote:
Originally Posted by shri_nath
I am sorry the line

-rwxrwxrwx 1 sn sn 22395 2007-05-11 08:37 This is a file name with eighteen spaces.jar*

in the original had multiple spaces in the name of the file (in the HTML posting here on the forum those got collapsed into single spaces).
You know how to use [size] and [color] tags; now you have to learn to use [code] tags when you post sample data, so that your multiple spaces will not "collapse into a single space".
# 6  
Old 07-28-2009
All the above commands will work.
The only catch is that you have to quote the variable passed to echo.
Eg:
Code:
echo "$x" | cut -d' ' -f8-
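
To see why the quoting matters, here is a minimal sketch that runs both forms on one sample line. The variable names `unquoted` and `quoted` are only illustrative, and the trailing `*` of the sample name is left out on purpose so that the unquoted expansion cannot be affected by file name globbing.

```shell
# Sample line modelled on the thread, with runs of spaces inside the file name.
# The trailing * is omitted here (assumption made for a safe demo).
x='-rwxrwxrwx 1 sn sn 22395 2007-05-11 08:37 This is a file    name with      eight spaces.jar'

# Unquoted: the shell splits $x into words, and echo joins them with single spaces.
unquoted=$(echo $x | cut -d' ' -f8-)

# Quoted: echo receives the line as one argument, so every space survives.
quoted=$(echo "$x" | cut -d' ' -f8-)

printf '%s\n' "unquoted: $unquoted" "quoted:   $quoted"
```

Only the quoted form keeps the original spacing of the file name; cut itself leaves the delimiters between the selected fields untouched.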

Have you tried:
Code:
ls -l +d
and
ls +d

Newer versions accept them.

Maybe this will also help you:
Code:
xx='-rwxrwxrwx 1 sn sn 22395 2007-05-11 08:37 This is a file name     with  eighteen    spaces.jar*'
echo "$xx"  | sed 's/^.*[0-9] \(.*\)\*$/\1/'
Output:
This is a file name     with  eighteen    spaces.jar

***I have removed the * also for you.***
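
A small sketch of how the greedy `^.*[0-9] ` part of that sed expression behaves. It cuts at the last digit that is followed by a space, which is the end of the time stamp for the names in this thread. The second sample name (`report 2009 final.jar`) and the helper `strip_prefix` are made up for illustration; they show what would happen if a file name itself contained a digit followed by a space.

```shell
# Two sample lines: one from the thread, one hypothetical name containing "2009 ".
plain='-rwxrwxrwx 1 sn sn 73687 2007-05-11 08:37 ibmjcefw.jar*'
digit='-rwxrwxrwx 1 sn sn 73687 2007-05-11 08:37 report 2009 final.jar*'

# strip_prefix (hypothetical helper): apply the sed expression from the post above.
strip_prefix() {
  printf '%s\n' "$1" | sed 's/^.*[0-9] \(.*\)\*$/\1/'
}

plain_out=$(strip_prefix "$plain")   # the whole name is kept
digit_out=$(strip_prefix "$digit")   # everything up to "2009 " is removed as well

printf '%s\n' "$plain_out" "$digit_out"
```

For listings like the one in this thread the expression is fine; with names of the second kind, counting fields from the left is the safer approach.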

---------- Post updated at 03:34 AM ---------- Previous update was at 03:20 AM ----------

You can get rid of most of your coding with this:
Code:
ls -ltr | sed -n '/^-/ s/^.*[0-9] \(.*\)$/\1/p'


Last edited by edidataguy; 07-28-2009 at 05:28 AM..
# 7  
Old 07-28-2009
You know that the filename will be after the 7th field,
so you could do something like this:

Code:
gawk --posix 'NR > 2 && !/\/$/ {
  sub(/([^ \t]+[ \t]+){7}/,"")
  print
  }' infile

Which produces:

Code:
zsh-4.3.10[t]% cat infile
sms_snath_hp_C/CORE BUILD PREREQUISITE:
total 1556
drwxrwxrwx 2 sn sn 4096 2008-06-27 08:31 ./
drwxrwxrwx 13 sn sn 4096 2009-07-22 14:48 ../
-rwxrwxrwx 1 sn sn 15348 2007-05-11 08:37 This is a file name with seven spaces.jar*
-rwxrwxrwx 1 sn sn 22395 2007-05-11 08:37 This is a file    name with      eight spaces.jar*
-rwxrwxrwx 1 sn sn 73687 2007-05-11 08:37 ibmjcefw.jar*
-rwxrwxrwx 1 sn sn 767101 2007-05-11 08:37 ibmjceprovider.jar*
zsh-4.3.10[t]% gawk --posix 'NR > 2 && !/\/$/ {
  sub(/([^ \t]+[ \t]+){7}/,"")
  print
  }' infile
This is a file name with seven spaces.jar*
This is a file    name with      eight spaces.jar*
ibmjcefw.jar*
ibmjceprovider.jar*

If you don't want to modify the current record you can save it in a variable and then manipulate the saved record:

Code:
gawk --posix 'END {
  # print the filenames
  while (++i <= c) print fn[i]
  }
{
  # build an array to hold the filenames
  if (NR > 2 && !/\/$/) {
    rec = $0; sub(/([^ \t]+[ \t]+){7}/,"", rec)
    fn[++c] = rec
    }
  }' infile
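
As a variation on the same idea, the following sketch removes the leading fields one at a time in a loop, so it does not rely on the `{7}` interval expression and should also run in awk implementations that lack interval support. The function name `tail_from` and the temporary sample file are assumptions made for this example; they are not part of the original script.

```shell
# Build a small sample file from the listing shown in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
sms_snath_hp_C/CORE BUILD PREREQUISITE:
total 1556
drwxrwxrwx 2 sn sn 4096 2008-06-27 08:31 ./
-rwxrwxrwx 1 sn sn 22395 2007-05-11 08:37 This is a file    name with      eight spaces.jar*
-rwxrwxrwx 1 sn sn 73687 2007-05-11 08:37 ibmjcefw.jar*
EOF

names=$(awk '
# tail_from (hypothetical helper): return rec with its first n-1 fields removed.
# rec is passed by value, so $0 itself is never modified.
function tail_from(rec, n,    i) {
  sub(/^[ \t]+/, "", rec)               # drop leading blanks, as default field splitting does
  for (i = 1; i < n; i++)
    sub(/^[^ \t]+[ \t]+/, "", rec)      # strip one field plus the separators after it
  return rec
}
# Same filter as above: skip the two header lines and the directory entries.
NR > 2 && !/\/$/ { print tail_from($0, 8) }
' "$sample")

rm -f "$sample"
printf '%s\n' "$names"
```

Because the work happens on a copy of the record, the rest of an existing script can keep using $0 and the numbered fields as before.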
