strange "No such file or directory" errors on NFS volumes


 
# 1  
Old 01-14-2008

We're seeing very strange "No such file or directory" errors on NFS volumes on one of our SUSE servers. Can anyone please help?

We're seeing it with both our NetApp NAS device and one of our Solaris NFS servers.

Here is what we're seeing:

Code:
stg-backup:~ # cd /rmt/sge
stg-backup:/rmt/sge # ls -l
/bin/ls: bin: No such file or directory
/bin/ls: doc: No such file or directory
/bin/ls: lib: No such file or directory
/bin/ls: man: No such file or directory
/bin/ls: mpi: No such file or directory
/bin/ls: pvm: No such file or directory
/bin/ls: ckpt: No such file or directory
/bin/ls: 3rd_party: No such file or directory
/bin/ls: qmon: No such file or directory
/bin/ls: util: No such file or directory
/bin/ls: default: No such file or directory
/bin/ls: install_qmaster: No such file or directory
/bin/ls: sge-5.3p6-doc.tar.gz: No such file or directory
/bin/ls: stg_config: No such file or directory
/bin/ls: README.inst_sgeee: No such file or directory
/bin/ls: ssh_comm_dir: No such file or directory
/bin/ls: inst_sgeee: No such file or directory
/bin/ls: sge-5.3p6-common.tar.gz: No such file or directory
/bin/ls: catman: No such file or directory
/bin/ls: sge-5.3p6-bin-glinux.tar.gz: No such file or directory
/bin/ls: utilbin: No such file or directory
/bin/ls: stg_config.old.tgz: No such file or directory
/bin/ls: examples: No such file or directory
/bin/ls: install_execd: No such file or directory
/bin/ls: inst_sge: No such file or directory
/bin/ls: sge-5.3p7-bin-solaris64.tar: No such file or directory
/bin/ls: sge-5.3p7-common.tar: No such file or directory
/bin/ls: sge-5.3p7-doc.tar: No such file or directory
/bin/ls: core: No such file or directory
/bin/ls: neilcopyrcsge.sh: No such file or directory
/bin/ls: neilstopoldinstallrcsge.sh: No such file or directory
total 0
dr-xr-xr-x  2 root root 0 2008-01-14 11:40 .
drwxr-xr-x  7 root root 0 2008-01-14 13:06 ..
stg-backup:/rmt/sge # ls -l
total 0
stg-backup:/rmt/sge # ls -l
total 0
stg-backup:/rmt/sge # pwd
/rmt/sge
stg-backup:/rmt/sge # logout
Connection to stg-backup closed.
rhobbs@stg-mkc5:~> ssh stg-backup -l root
Password:
Last login: Mon Jan 14 13:05:56 2008 from stg-mkc5.domain.co.uk
stg-backup:~ # cd /rmt/sge
stg-backup:/rmt/sge # ls -l
total 73174
drwxr-xr-x  18 sgeadmin sgeadmin     1024 2008-01-02 23:09 .
drwxr-xr-x   4 root     root            0 2008-01-14 13:09 ..
drwxr-xr-x   3 sgeadmin sgeadmin      512 2005-04-27 14:48 3rd_party
drwxr-xr-x   4 root     root          512 2007-12-04 13:43 bin
drwxr-xr-x   4 sgeadmin sgeadmin      512 2002-03-27 14:30 catman
drwxr-xr-x   2 sgeadmin sgeadmin     1024 2005-04-27 14:48 ckpt
-rw-------   1 root     root      8388403 2008-01-02 23:09 core
drwxr-xr-x   4 sgeadmin sgeadmin      512 2007-12-04 15:33 default
drwxr-xr-x   2 sgeadmin sgeadmin      512 2005-04-27 14:48 doc
drwxr-xr-x   4 sgeadmin sgeadmin      512 2005-04-14 13:39 examples
-rwxr-xr-x   1 sgeadmin sgeadmin     1354 2004-04-07 12:29 install_execd
-rwxr-xr-x   1 sgeadmin sgeadmin     1354 2004-04-07 12:29 install_qmaster
-rwxr-xr-x   1 sgeadmin sgeadmin    77667 2006-02-27 15:53 inst_sge
lrwxrwxrwx   1 sgeadmin sgeadmin        8 2007-12-04 13:37 inst_sgeee -> inst_sge
drwxr-xr-x   4 root     root          512 2007-12-04 13:43 lib
drwxr-xr-x   6 sgeadmin sgeadmin      512 2002-03-27 14:30 man
drwxr-xr-x   3 sgeadmin sgeadmin      512 2005-04-27 14:48 mpi
-rwxr-xr-x   1 root     root          125 2008-01-02 11:46 neilcopyrcsge.sh
-rwxr-xr-x   1 root     root           63 2008-01-02 11:46 neilstopoldinstallrcsge.sh
drwxr-xr-x   3 sgeadmin sgeadmin      512 2005-04-27 14:48 pvm
drwxr-xr-x   4 sgeadmin sgeadmin      512 2005-04-27 14:48 qmon
-rw-r--r--   1 root     bin           396 2004-04-07 12:29 README.inst_sgeee
-rw-r--r--   1 root     root      9312974 2005-04-27 14:46 sge-5.3p6-bin-glinux.tar.gz
-rw-r--r--   1 root     root       822815 2005-04-27 14:46 sge-5.3p6-common.tar.gz
-rw-r--r--   1 root     root      3082603 2005-04-27 14:46 sge-5.3p6-doc.tar.gz
-rw-r--r--   1 root     root     45015040 2007-12-04 10:44 sge-5.3p7-bin-solaris64.tar
-rw-r--r--   1 root     root      2508800 2007-12-04 10:43 sge-5.3p7-common.tar
-rw-r--r--   1 root     root      5580800 2007-12-04 10:43 sge-5.3p7-doc.tar
drwxrwxrwx   2 root     root          512 2007-04-17 13:09 ssh_comm_dir
drwxr-xr-x   4 root     root          512 2006-09-19 09:34 stg_config
-rw-r--r--   1 root     root         8404 2006-07-21 08:38 stg_config.old.tgz
drwxr-xr-x   5 sgeadmin sgeadmin      512 2006-02-27 16:18 util
drwxr-xr-x   4 root     root          512 2007-12-04 13:43 utilbin
stg-backup:/rmt/sge #

As you can see, we periodically get strange "ls" behaviour, which persists until I log out and back in again, at which point it works.

Sometimes it works the first time; other times it fails until I log out and in again.
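
One detail in the listings above may help narrow this down: in the failing session "." shows up as dr-xr-xr-x, owned by root, with size 0, while after logging back in it is drwxr-xr-x, owned by sgeadmin, with size 1024. That suggests (this is a guess, not something confirmed anywhere in this thread) that the broken shell is sitting in a different directory object from the one a fresh lookup of the same path reaches. A minimal sketch to test that from inside a session, assuming GNU coreutils `stat`; the function names `dir_id` and `cwd_is_stale` are made up for illustration:

```shell
#!/bin/sh
# Print "device:inode" for a path, following symlinks (GNU stat syntax).
dir_id() {
    stat -L -c '%d:%i' "$1" 2>/dev/null
}

# Return 0 (true) if the shell's current directory is no longer the same
# object that a fresh lookup of the path in $PWD resolves to.
cwd_is_stale() {
    here=$(dir_id .)
    fresh=$(dir_id "$PWD")
    [ "$here" != "$fresh" ]
}

if cwd_is_stale; then
    echo "cwd is stale: . = $(dir_id .), $PWD = $(dir_id "$PWD")"
else
    echo "cwd matches a fresh lookup of $PWD"
fi
```

Run it in the failing terminal and in a working one. If only the failing one reports a stale directory, running `cd "$PWD"` there might be enough to recover without logging out, though that is untested here.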

I hope someone knows what's causing this, because it's a nightmare! lol

Thanks in advance, people!

Last edited by fishsponge; 01-14-2008 at 09:22 AM.. Reason: removed useless "pastebin" URL
# 2  
Old 01-14-2008
Could it be tied to these messages that we're seeing in "/var/log/messages" and "/var/log/warn"?

Code:
stg-backup:/var/log # tail messages
Jan 14 14:57:09 stg-backup kernel: svc: bad direction 256, dropping request
Jan 14 14:57:09 stg-backup kernel: svc: short len 20, dropping request
Jan 14 14:57:30 stg-backup kernel: svc: bad direction 256, dropping request
Jan 14 14:57:30 stg-backup kernel: svc: short len 20, dropping request
Jan 14 14:57:39 stg-backup kernel: svc: bad direction 256, dropping request
Jan 14 14:57:39 stg-backup kernel: svc: short len 20, dropping request
Jan 14 14:58:00 stg-backup kernel: svc: bad direction 256, dropping request
Jan 14 14:58:00 stg-backup kernel: svc: short len 20, dropping request
Jan 14 14:58:09 stg-backup kernel: svc: bad direction 256, dropping request
Jan 14 14:58:09 stg-backup kernel: svc: short len 20, dropping request

Posting these messages may be a red herring, but they are also strange and so may be related somehow...
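
To check whether these kernel messages line up with the times "ls" fails, it might help to bucket them per minute and compare against the `date` output from a failing session. A rough sketch, assuming the standard syslog layout shown above (month, day, HH:MM:SS, host); `svc_per_minute` is a hypothetical helper name:

```shell
#!/bin/sh
# Count kernel "svc:" messages per minute from syslog-format input on stdin.
svc_per_minute() {
    awk '/kernel: svc: / {
             split($3, t, ":")                 # $3 is HH:MM:SS
             n[$1 " " $2 " " t[1] ":" t[2]]++  # key looks like "Jan 14 14:57"
         }
         END { for (k in n) print k, n[k] }' | sort
}

# Example use (path taken from the post above):
#   svc_per_minute < /var/log/messages
```

If the counts stay steady whether or not "ls" is failing, the messages are probably unrelated to the directory problem.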
# 3  
Old 01-14-2008
Strangely, I'm beginning to think it's some odd environment problem, because I just hit the problem again and this time decided to open a second terminal to see whether the problem would appear in two separate terminals.

Here are the results from both terminals:

Code:
TERMINAL 1:

stg-backup:/rmt/project2 # date; ls -l
Mon Jan 14 15:02:08 GMT 2008
total 0
stg-backup:/rmt/project2 #

Code:
TERMINAL 2:

stg-backup:/rmt/project2 # date; ls -l
Mon Jan 14 15:02:08 GMT 2008
total 44
drwxrwsr-x   9 root     stg  4096 2008-01-13 20:20 .
drwxr-xr-x   9 root     root    0 2008-01-14 15:01 ..
-rw-r--r--   1 root     stg    21 2008-01-13 21:16 .arkeiaNOBACKUP
-rw-r--r--   1 root     stg   480 2005-06-28 12:52 .arkeiaNOBACKUP.email
-rw-r--r--   1 root     stg    21 2008-01-12 21:15 .arkeiaNOBACKUP.old
drwxrwsr-x  11 stg      stg  4096 2008-01-11 13:27 ASR
drwxrwsr-x   8 mstuttle stg  4096 2007-08-10 15:34 demos
drwxrwxr-x  11 gwebster stg  4096 2007-03-23 18:39 gabe
drwxrwsr-x   2 root     stg  4096 2005-02-04 17:36 home
-rw-r--r--   1 root     stg    99 2004-03-18 17:46 Makefile
drwxr-xr-x   2 root     root 4096 2008-01-08 10:06 mysqlbackup
-rw-r--r--   1 root     stg     6 2004-10-29 07:50 neil.txt
-rw-------   1 root     stg   675 2004-03-22 09:58 nohup.out
lrwxrwxrwx   1 root     stg    20 2005-11-30 09:13 remote -> /rmt/sysadmin/remote
drwxrwxrwx  29 root     root 4096 2008-01-14 15:01 .snapshot
lrwxrwxrwx   1 root     stg    22 2005-09-09 17:00 stguser -> /rmt/stg14/TTS/stguser
drwxrwsrwx   4 kate     stg  4096 2004-07-27 14:48 sysadmin
stg-backup:/rmt/project2 #

As you can see, I had two terminals open on the same machine at exactly the same time, both in the same automounted NFS directory, running exactly the same command.

One terminal failed, and the other worked.

Therefore this cannot be a hardware problem, right?

To prove that this is not a coincidence, I ran the same test three more times and got exactly the same results: TERMINAL 1 was "broken" and TERMINAL 2 was working.

I then logged out on TERMINAL 1 and logged back in again, and it worked again:

Code:
stg-backup:/rmt/project2 # date; ls -l
Mon Jan 14 15:05:26 GMT 2008
total 0
stg-backup:/rmt/project2 # logout
Connection to stg-backup closed.
rhobbs@stg-mkc5:~> ssh stg-backup -l root
Password:
Last login: Mon Jan 14 15:01:36 2008 from stg-mkc5.crl.toshiba.co.uk
stg-backup:~ # cd /rmt/project2
stg-backup:/rmt/project2 # ls -l
total 44
drwxrwsr-x   9 root     stg  4096 2008-01-13 20:20 .
drwxr-xr-x   7 root     root    0 2008-01-14 15:05 ..
-rw-r--r--   1 root     stg    21 2008-01-13 21:16 .arkeiaNOBACKUP
-rw-r--r--   1 root     stg   480 2005-06-28 12:52 .arkeiaNOBACKUP.email
-rw-r--r--   1 root     stg    21 2008-01-12 21:15 .arkeiaNOBACKUP.old
drwxrwsr-x  11 stg      stg  4096 2008-01-11 13:27 ASR
drwxrwsr-x   8 mstuttle stg  4096 2007-08-10 15:34 demos
drwxrwxr-x  11 gwebster stg  4096 2007-03-23 18:39 gabe
drwxrwsr-x   2 root     stg  4096 2005-02-04 17:36 home
-rw-r--r--   1 root     stg    99 2004-03-18 17:46 Makefile
drwxr-xr-x   2 root     root 4096 2008-01-08 10:06 mysqlbackup
-rw-r--r--   1 root     stg     6 2004-10-29 07:50 neil.txt
-rw-------   1 root     stg   675 2004-03-22 09:58 nohup.out
lrwxrwxrwx   1 root     stg    20 2005-11-30 09:13 remote -> /rmt/sysadmin/remote
drwxrwxrwx  29 root     root 4096 2008-01-14 15:01 .snapshot
lrwxrwxrwx   1 root     stg    22 2005-09-09 17:00 stguser -> /rmt/stg14/TTS/stguser
drwxrwsrwx   4 kate     stg  4096 2004-07-27 14:48 sysadmin
stg-backup:/rmt/project2 #

So now I'm really confused...

The annoying thing is that I have also just noticed that this problem is causing some of the cron jobs that access remote NFS volumes to fail as well!
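
Until the root cause is found, one possible stopgap for the cron jobs is to check that the NFS directory actually lists some entries before the job starts, retrying a few times. This is only a workaround sketch, not a fix, and it is untested against this particular failure; `wait_for_dir` and its retry defaults are assumptions for illustration:

```shell
#!/bin/sh
# Return 0 once DIR lists at least one entry, trying up to TRIES times
# with DELAY seconds between attempts; return 1 if it never does.
wait_for_dir() {
    dir=$1
    tries=${2:-3}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # ls -A lists everything except "." and "..", so empty output
        # means the directory is empty or could not be read.
        if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example crontab use (the job path is a placeholder):
#   wait_for_dir /rmt/project2 5 2 && /path/to/job
```

Because the check goes through the absolute path each time rather than relying on a shell's current directory, it at least makes a failing run visible instead of letting the job work on an empty listing.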

Argh!

Someone help me, please! lol
# 4  
Old 06-17-2008
Have you found out how to fix this yet?

It seems that we are seeing the same thing here.
Thanks