Insidious Space

# 1  
Old 12-01-2008

Hi,

My ksh script generates a pattern list and a directory list and then iterates over each directory to grep for the pattern in each file in each directory.

The script uses find to output a directory list, which is preprocessed to escape spaces before continuing.
It then calls a function, which calls grep inside a while loop.
The call to grep appends a * to each directory path to find the pattern in each file in the directory.

The problem is that some of my directory names contain space characters.

grep fails on these. Depending on quoting, escaping, etc., I get either "cannot open" or word splitting on the pathname. The script works beautifully on paths without spaces.

I have tried every way of quoting and escaping I can think of. I've also tried setting IFS=''.

Curiously, the function does not appear to recognize my PATH either, and I've been forced to use absolute paths for utility calls (e.g. /usr/bin/awk), but only in the function.

Here are excerpts of code:

Code:
#!/bin/ksh

PATH=/usr/bin
export PATH

function procfile
{
SEARCH=$1
PATTERNFILE=$2
OUTPUTFILE=$3
TEMPFILE=/tmp/output_$$.tmp
OWNER=
OBJECT=

# escape spaces in paths
SEARCHFILE=`echo "$SEARCH"|/usr/bin/sed 's/ /\\\ /g'`

/usr/bin/cat ${PATTERNFILE} | while read LINE
do
  OBJECT=`echo ${LINE}|/usr/bin/awk -F, '{ print $2 }'`
  # find refs to the pattern in each of the files
  /usr/bin/grep -in "[^a-zA-Z0-9_]${OBJECT}[^a-zA-Z0-9_]" "$SEARCHFILE"/* > ${TEMPFILE}
  # iterate over the result to marry the pattern to the result in tabular format
  /usr/bin/cat ${TEMPFILE} | while read LINE2
  do
 
    # do query
    ${ORACLE_HOME}/bin/sqlplus -s...
  done
done
}

INPUT=dirlist.txt
PATTERN=oracle_objects.txt
OUTPUT=output.txt

# Query for current patternlist
${ORACLE_HOME}/bin/sqlplus ...

# Prepare the directory list
find ...
# escape spaces in paths
sed 's/ /\\\ /g' /tmp/dirlist.txt > dirlist.txt

# iterate over dirlist
cat ${INPUT} | while read LINE
do
    COUNT=`ls -l "$LINE"|wc -l`
    if [ ${COUNT} -gt 1 ]
    then
      procfile "$LINE" ${PATTERN} ${OUTPUT}
    fi
done

Any insight you can provide will be most helpful.

Thanks,
Dave
# 2  
Old 12-01-2008
It seems to work okay for me if I take out the sed commands that insert the escapes... you shouldn't need them, since you are already quoting the filenames where they are used.

Also you can probably simplify the grep by using the word match option as follows:

Code:
   /usr/bin/grep -win "${OBJECT}" "$SEARCH"/* > ${TEMPFILE}
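
To illustrate why the quotes are enough on their own, here is a minimal sketch. The scratch directory, the file name and the count_args helper are made up for the demonstration, and it assumes a POSIX-style shell with mktemp -d available. A quoted "$SEARCH" stays one word while the unquoted * outside the quotes still expands, and the names produced by the glob are not split again:

```shell
# Hypothetical scratch layout, created only for this demonstration.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/Materialized Views"
printf 'select one from dual;\n' > "$demo_dir/Materialized Views/mv a.sql"
SEARCH="$demo_dir/Materialized Views"

# Hypothetical helper: print how many separate arguments a command would receive.
count_args() { echo "$#"; }

# Quoted variable, unquoted glob: one argument per matching file.
quoted_count=$(count_args "$SEARCH"/*)

# Unquoted variable: the path is split at the space into two bogus words.
unquoted_count=$(count_args $SEARCH/*)

# The quoted form is what grep needs; with a single file, -c prints just the count.
match_count=$(grep -wc one "$SEARCH"/*)

echo "quoted: $quoted_count  unquoted: $unquoted_count  matching lines: $match_count"
rm -rf "$demo_dir"
```

If the quoted form works like this but errors about half a pathname still appear, some other command in the script is probably receiving the path unquoted.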

# 3  
Old 12-02-2008
Hi,

I followed your advice and removed the sed commands. Thanks also for the tip about grep's -w option.

Unfortunately, I'm still getting word splitting:

The current directory (value of $SEARCH):
Code:
vob/Materialized Views

Output:
Code:
vob/Materialized: No such file or directory
Views/MV_V_IBI_ACT_BILL_BY_BU_24MTHS.sql: No such file or directory

This is what I was getting before adding the sed commands.

I tried using IFS to no avail:

Code:
  OIFS=$IFS
  IFS=''
  /usr/bin/grep -win "${OBJECT}" "$SEARCH"/* > ${TEMPFILE}
  IFS=$OIFS

I'm running on Solaris 10. Is it possible there are some subshell restrictions, maybe due to ssh? Could the word splitting and the failure to honor $PATH be related?
# 4  
Old 12-02-2008
I don't believe setting IFS has any effect in that scenario... if it did then the shell would consider /usr/bin/grep -win ... all to be one word/command name, and wouldn't find it.
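
As a rough sketch of that point (count_args is a hypothetical helper, and this assumes POSIX field-splitting rules): IFS only governs how the results of unquoted expansions are split. It does not change how the shell parses the words typed on the command line, and it is irrelevant to an expansion that is already inside double quotes.

```shell
dir='vob/Materialized Views'

# Hypothetical helper: print how many separate arguments it was given.
count_args() { echo "$#"; }

# Default IFS: the unquoted expansion is split at the space.
default_split=$(count_args $dir)

OIFS=$IFS
IFS=''
# Empty IFS: no field splitting, so the unquoted expansion stays one word.
empty_ifs=$(count_args $dir)
# Words written literally on the command line are still parsed normally.
literal_words=$(count_args grep -win x)
IFS=$OIFS

# Quoted expansion: one word regardless of IFS.
quoted=$(count_args "$dir")

echo "default: $default_split  empty IFS: $empty_ifs  literal: $literal_words  quoted: $quoted"
```

So with "$SEARCH" already quoted in the grep call, changing IFS around it should make no difference either way.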

I just tested this on Solaris 10:

Code:
#!/bin/ksh

PATH=/usr/bin
export PATH

function procfile
{
SEARCH=$1
PATTERNFILE=$2
OUTPUTFILE=$3
TEMPFILE=_tmp_output_$$.tmp
OWNER=
OBJECT=

/usr/bin/cat ${PATTERNFILE} | while read LINE
do
  OBJECT=`echo ${LINE}|/usr/bin/awk -F, '{ print $2 }'`
  # find refs to the pattern in each of the files
  /usr/bin/grep -win "${OBJECT}" "$SEARCH"/* > ${TEMPFILE}
  # iterate over the result to marry the pattern to the result in tabular format
  /usr/bin/cat ${TEMPFILE} | while read LINE2
  do
    # do query
    #${ORACLE_HOME}/bin/sqlplus -s...
    echo $LINE2
  done
done
rm $TEMPFILE
}

INPUT=dirlist.txt
PATTERN=oracle_objects.txt
OUTPUT=output.txt

# Query for current patternlist
#${ORACLE_HOME}/bin/sqlplus ...

# Prepare the directory list
#find ...
find t -type d > $INPUT

# iterate over dirlist
cat ${INPUT} | while read LINE
do
    COUNT=`ls -l "$LINE"|wc -l`
    if [ ${COUNT} -gt 1 ]
    then
      procfile "$LINE" ${PATTERN} ${OUTPUT}
    fi
done

These are the test files I used:

Code:
$ find t -type f -ls
  554    1 -rw-r--r--   1 anni     other          14 Dec  2 17:05 t/testdir/testfile
  555    1 -rw-r--r--   1 anni     other          14 Dec  2 17:05 t/testdir/testfile with spaces
  559    1 -rw-r--r--   1 anni     other          14 Dec  2 17:05 t/testdir with spaces/testfile
  560    1 -rw-r--r--   1 anni     other          14 Dec  2 17:05 t/testdir with spaces/testfile with spaces
  551    1 -rw-r--r--   1 anni     other          14 Dec  2 17:05 t/testfile
  552    1 -rw-r--r--   1 anni     other          14 Dec  2 17:05 t/testfile with spaces
$ cat t/testdir/testfile
one
two
three
$ cat oracle_objects.txt
1,one
2,two

And this was the output:

Code:
$ ./s
t/testfile:1:one
t/testfile with spaces:1:one
t/testfile:2:two
t/testfile with spaces:2:two
t/testdir/testfile:1:one
t/testdir/testfile with spaces:1:one
t/testdir/testfile:2:two
t/testdir/testfile with spaces:2:two
t/testdir with spaces/testfile:1:one
t/testdir with spaces/testfile with spaces:1:one
t/testdir with spaces/testfile:2:two
t/testdir with spaces/testfile with spaces:2:two

I doubt ssh is having any effect here... failing to honour $PATH is weird though, are you sure it's exported?
# 5  
Old 12-02-2008
OK, it's pronounced "doo-mah"

Thanks for cluing me in with your 'echo $LINE2'
I was doing a few things wrong:

  1. Storing the filepath component of the grep output in a variable called $PATH (Doh!) I've changed it to $FILEPATH
  2. Suppressing the output of the process so that only errors appeared. The grep was actually working, but I assumed it was the source of the errors because at one point, probably before my initial quoting effort, grep had been throwing an error.
  3. Failing to realize the word-splitting error was coming from a deeper command in the output processing, where I was capturing the file modification date of the files tagged by grep (i.e. `ls -e $DIRPATH...`)
Now I've double-quoted all vars with path refs, changed the var names so as not to conflict, and for good measure, redirected standard error to a file.
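
For anyone who lands here with the same symptoms, a small sketch of the two mistakes. The file names, FILEPATH value and ERRLOG are invented for the demo, and it uses plain `ls -l` rather than the Solaris-specific `ls -e`. Assigning a file path to PATH breaks command lookup, and an unquoted path handed to ls gets split at the space:

```shell
# 1. Reusing PATH as an ordinary variable. Done in a subshell so the
#    deliberately broken search path does not leak into the rest of the script.
lookup=$(
  PATH='vob/Materialized Views/mv.sql'
  ls / >/dev/null 2>&1 && echo found || echo "not found"
)

# 2. Quoting the path and sending stderr to a log (hypothetical names).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/Materialized Views"
: > "$demo_dir/Materialized Views/mv.sql"
FILEPATH="$demo_dir/Materialized Views/mv.sql"
ERRLOG="$demo_dir/errors.log"

quoted_status=0
ls -l "$FILEPATH" >/dev/null 2>>"$ERRLOG" || quoted_status=$?

unquoted_status=0
ls -l $FILEPATH >/dev/null 2>>"$ERRLOG" || unquoted_status=$?

# Only the unquoted call should have written to the log.
error_lines=$(wc -l < "$ERRLOG")

echo "lookup: $lookup  quoted: $quoted_status  unquoted: $unquoted_status  logged: $error_lines"
rm -rf "$demo_dir"
```

Keeping stderr in its own file also makes it easier to see which command is actually complaining.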

Thanks for lubing my brain!

Cheers,
Dave