Bizarre System Crash... Any Ideas?

Discussion in 'Tech Talk' started by Bushflyr, May 8, 2012.

  1. Bushflyr

    Bushflyr ʇno uıƃuɐɥ ʇsnɾ Millennium Member

    3,524
    0
    Mar 17, 1999
    Western WA
    I noticed the HDD light on my server (Ubuntu 11.10 Server) was on when nothing was accessing the server. I already had an SSH window open on my Mac, so I tried a few commands to see what was going on.

    Code:
    [/usr/local/sbin/hourly.active]: htop
    -bash: /usr/bin/htop: Input/output error
    [/usr/local/sbin/hourly.active]: sudo cat /proc/mdstat
    -bash: /usr/bin/sudo: Input/output error
    [/usr/local/sbin/hourly.active]: ls
    /usr/local/sbin/ls: line 50: 19495 Bus error               /bin/ls $@ 1>&1
    [/usr/local/sbin/hourly.active]: la
    /usr/local/sbin/ls: line 50: 19500 Bus error               /bin/ls $@ 1>&1
    [/usr/local/sbin/hourly.active]: cd
    [~]: ls
    /usr/local/sbin/ls: line 50: 19507 Bus error               /bin/ls $@ 1>&1
    [~]: top
    Segmentation fault
    
    top, htop, cat, and ls gave errors, but cd worked fine.

    I tried a reboot, but wound up with a "no operating system found" sort of error. I had to go to work, so I shut down and switched off the PSU. After coming home I rebooted into recovery, powered down the system normally (halt -p), and rebooted. It came up fine except for a failed sdb in the RAID, which rebuilt fine on the spare. I'm currently running a SMART test (smartctl -t long /dev/sdb), but I don't expect any errors, as the RAID has dropped disks before and they checked out fine.

    It seems odd, though, that just failing a RAID disk (the OS is on a separate drive) would take the whole system down.

    Any thoughts?
     
  2. gemeinschaft

    gemeinschaft AKA Fluffy316

    2,201
    45
    Feb 7, 2004
    Houston, TX
    I am not sure, but I would consider setting up a cron job to check your disks daily, so you can tell whether you just had a bad drive or something else is going on.
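
    A daily check along these lines could be sketched as follows. This is only a sketch, not a tested monitoring setup: the function name degraded_arrays and the script path in the crontab comment are made up for illustration, and it assumes Linux software RAID with the usual /proc/mdstat layout, where a degraded array shows an underscore in its status map (for example [UU_]).

```shell
#!/bin/sh
# Print the name of every md array whose status map (e.g. [UU_]) shows a
# missing member. Reads /proc/mdstat-style text on stdin.
# NOTE: degraded_arrays is a hypothetical helper name, not part of mdadm.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/       { name = $1 }                     # remember the current array
        /\[[U_]*_[U_]*\]/   { if (name != "") print name }    # status map contains "_"
    '
}

# Only report when the file exists and something is actually degraded.
if [ -r /proc/mdstat ]; then
    bad=$(degraded_arrays < /proc/mdstat)
    if [ -n "$bad" ]; then
        echo "Degraded md arrays: $bad"
    fi
fi

# Example crontab entry (assumption: this script is saved as
# /usr/local/sbin/check-raid and made executable):
#   0 6 * * * /usr/local/sbin/check-raid
```

    cron normally mails a job's output to its owner when a mail agent is set up, so the script stays silent unless an array is degraded. A smartctl -H check per drive could be added in the same way.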
     


  3. Linux3

    Linux3

    1,399
    0
    Dec 31, 2008
    Not enough info about your system, but:
    cd /var/log
    ls -al
    Look at the time stamps on dmesg and syslog.
    grep sdb dmesg
    grep sdb /var/log/syslog

    Or use dmesg.0 and syslog.1,
    or whatever matches the time of the problems.

    Any errors?
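
    The log check above can be extended to the rotated logs in one pass. A rough sketch, assuming the logs live in /var/log under the usual Ubuntu names and that older rotations may be gzip-compressed; search_logs is a hypothetical helper name, and reading these files may need root or membership in the adm group:

```shell
# Search current and rotated logs for a device name (e.g. sdb).
# zgrep reads both plain and gzip-compressed files, so rotations such as
# syslog.1 and syslog.2.gz are both covered.
# NOTE: search_logs is a hypothetical helper, not a standard tool.
search_logs() {
    dev=$1
    dir=${2:-/var/log}                      # second argument overrides the log directory
    for f in "$dir"/syslog* "$dir"/dmesg* "$dir"/kern.log*; do
        [ -f "$f" ] || continue             # skip patterns that matched nothing
        matches=$(zgrep "$dev" "$f") || continue   # no match in this file: move on
        printf '== %s ==\n%s\n' "$f" "$matches"
    done
}
```

    Calling search_logs sdb prints each file that mentions the device, followed by the matching lines, which makes it easier to line up the time stamps with the time of the problem.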
     
  4. Bushflyr

    Bushflyr ʇno uıƃuɐɥ ʇsnɾ Millennium Member

    3,524
    0
    Mar 17, 1999
    Western WA
    Thanks for the ideas. I've gone through all the log files and there's nothing there. I'll try a smartctl cronjob, but I don't expect much there. The drives are all new and I've run a long test after each failure with no errors. Different drives have dropped out at different points, but it had been running reliably for a few weeks now with no probs. :dunno:
     
  5. Bushflyr

    Bushflyr ʇno uıƃuɐɥ ʇsnɾ Millennium Member

    3,524
    0
    Mar 17, 1999
    Western WA
    If by "would have prevented that" you mean "would have prevented my even installing a RAID since Win7 wouldn't know a RAID if it bit it on the ASSH," then yes, you are correct.

    Oh, wait, it doesn't do ASSH either. :upeyes:
     
    Last edited: May 11, 2012
  6. Detectorist

    Detectorist

    14,118
    3,377
    Jul 16, 2008
    Missouri
    Win 7 Professional Ultimate supports the mirrored type of RAID (RAID 1).
     
  7. Bushflyr

    Bushflyr ʇno uıƃuɐɥ ʇsnɾ Millennium Member

    3,524
    0
    Mar 17, 1999
    Western WA
    I know. But adding the exception in ruined the lyrical flow. :supergrin:

    And the intent is still correct since Windows 7 Professional Ultimate Super Duper Apex Pinnacle etc etc still doesn't do RAID 5 (which is what I'm using), RAID 6, or any sort of nested RAID. It does RAID 1. And I'm purposely leaving out "RAID" 0 because it's not really RAID as there is no Redundant in it.
     
    Last edited: May 12, 2012
  8. jarubla

    jarubla Dos Pistolas

    377
    0
    Feb 16, 2010
    UT
    RAID 5 is single parity, right? Can you ID which disk failed or hiccuped? Any chance that more than one disk reported an issue, perhaps even while it was rebuilding? Smells like a possible RAID rebuild issue to me, and disks sometimes do funny things at the worst possible times. That is one of the main reasons why I am a RAID 6 guy. The extra disk costs more, but it lets the array survive a dual-disk failure.
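
    For anyone wondering what "single parity" buys you, here is a toy illustration using shell arithmetic. The byte values are arbitrary examples, and a real array does this per stripe across whole blocks, but the XOR idea is the same:

```shell
# Three data "blocks" (one byte each here) and their XOR parity,
# as in one stripe of a 4-disk RAID 5. Values are arbitrary examples.
d1=0x5A; d2=0x3C; d3=0xF0
parity=$(( $d1 ^ $d2 ^ $d3 ))

# Pretend the disk holding d2 dropped out: XOR the survivors with the
# parity block to rebuild the missing data.
rebuilt=$(( $d1 ^ $d3 ^ $parity ))

printf 'parity=0x%02X rebuilt_d2=0x%02X\n' "$parity" "$rebuilt"
# prints: parity=0x96 rebuilt_d2=0x3C
```

    With only one parity block per stripe, losing a second member before the rebuild finishes leaves two unknowns and one equation, which is why a second hiccup during a rebuild is fatal for RAID 5 but not for RAID 6.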

    Are you able to parse through any log files to pinpoint when the issue occurred? I'm hoping an error message can be pulled so we can run it through the Ubuntu bug tracker:

    https://bugs.launchpad.net/ubuntu

    Also, as a side note, I just saw your thread over on http://ubuntuforums.org. I'm following this, as I am now curious about the outcome.

    -Jay
     
  9. Bushflyr

    Bushflyr ʇno uıƃuɐɥ ʇsnɾ Millennium Member

    3,524
    0
    Mar 17, 1999
    Western WA
    RAID 5 is single parity, but I'm also running a hot spare, so there is some extra safety there. I've lost sdb and sde at one point or another, but no errors ever showed up when scanning the drives afterward, and I re-added them to the RAID without issue.

    The first couple of times it happened I was thinking maybe cables, but there were no IO errors, and nothing was listed in any log files. At this point I'm wondering if it's possibly flaky power in my house. (I haven't gotten the UPS yet, but it's on the list.) I recall reading somewhere that RAIDs are particularly sensitive to power fluctuations. And all my lights dim for a second when the wife turns on the hair dryer.

    Also previous RAID failures never took down the OS. Everything is back up and running fine, so at this point :dunno: