Full Discussion: It's a puzzle
Operating Systems Linux It's a puzzle Post 302233130 by jwoude on Saturday 6th of September 2008 07:21:37 AM

Hi,

Recently I installed Fedora 9 on the following hardware:
- Asus A8N-SLI Deluxe motherboard bios version 1805
- 2GB twinmos ram
- AMD 4400 CPU
- Tagan PSU 550 W
- Asus EN6200LE video card
- WD 74 GB Raptor
- Areca ARC-1222 raid controller
- 4x 1TB Seagate Barracudas
- Symbios Logic 53C875J SCSI controller card (made for Compaq)
- HP surestore DAT40 tape drive

Fedora installed, booted and worked fine for a couple of days. With yum I installed all relevant updates.

Trouble started when I used Amanda for the first backups to tape. Amanda would work OK a few times, but then
the entire machine crashed. I mean really crashed: the machine would not get through the BIOS POST.
So I cleared the CMOS by removing the battery, setting the jumper appropriately and waiting 15 seconds. To no avail, the motherboard was dead.

I ordered an identical replacement motherboard and put everything back together. Linux booted fine and I did not touch Amanda
for a week. All was well, or so I thought. I used the machine intensively, copying over 1 TB of data to the raid array, installing
Horde packages and all kinds of other fun stuff. No problems whatsoever.

I did look through the logs after the first crash, obviously. The only entries of note were related to the SCSI controller card: a couple of
SCSI bus resets just prior to the crash. I did find a few articles from 2005 on the net about 53c8XX driver problems, for example:
"Please fix bug #1852 (hald causes SYM53C8xx SCSI errors, device disconnects + GNOME hang)". Surely this problem was fixed a long time ago?
For good measure I also checked all termination caps and SCSI cables.
I am pretty sure, but not absolutely sure, that these resets were related to the 53c875J SCSI controller card and not to the
Areca raid card. Anyhow, I had no problems with the raid array at all, even when using it intensively.
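
For anyone who wants to look for the same pattern in their own logs, here is a minimal sketch of a filter for SCSI reset messages. It is only an illustration: the function names and the regular expression are my own assumptions, the exact wording of kernel messages differs between drivers and kernel versions, and the pattern will probably need adjusting. On Fedora 9 the syslog file is normally /var/log/messages.

```python
import re
from typing import Iterable, List

# Assumption: reset messages mention "scsi" or a "sym" driver name followed
# by the word "reset". Real kernel messages vary, so tune this pattern.
RESET_PATTERN = re.compile(r"(scsi|sym\w*).*reset", re.IGNORECASE)


def find_scsi_resets(lines: Iterable[str]) -> List[str]:
    """Return the log lines that look like SCSI bus reset messages."""
    return [line.rstrip("\n") for line in lines if RESET_PATTERN.search(line)]


def scan_log_file(path: str = "/var/log/messages") -> List[str]:
    """Scan a syslog file and return the matching lines (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(path, errors="replace") as handle:
        return find_scsi_resets(handle)
```

Calling `scan_log_file()` as root and printing the result shows whether resets cluster just before a crash. It does not tell you which controller caused them, so the driver name in each matching line still has to be read by hand.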

The next weekend I ran an Amanda backup again. Two amflush jobs went fine, so old backups on holding disk were flushed to tape ok.
Then I proceeded with a new backup (amdump). After some time the machine crashed again. Absolutely identical symptoms.

This time, I stripped the machine down to the bare minimum:
only motherboard, PSU, 1 GB ram, AMD 4400 CPU, old PCI video card, keyboard and monitor.
Result: only one beep (that's good) and colorful gibberish on the monitor, not even the BIOS memory check and such.

So, it appears that some error related to using the backup software (scsi?) causes the motherboard to die.

Presently, I have two courses of action I can think of:
1. I ordered a new BIOS chip, hoping that the board will then get through POST.
If this works, it suggests to me that some error in the SCSI subsystem can actually overwrite (flash!) the motherboard BIOS.
Two weeks ago I would not have believed this possible, but here it is.
2. If option 1 does not work, I will order yet another replacement motherboard and think of a new backup strategy.
I do not mind chasing bugs, but losing a motherboard at every step of the way is not very appealing.
So out with the SCSI card.

BTW, until I get the machine up and running again I cannot look at the logs and present more detailed error reports.
This is all from memory.

I have spent quite some time googling this particular problem. I cannot find any similar cases.
So anyone out there, does this ring any bells?

Thanks
Jos