SVX Diary
2011-02-26 (Sat) The RAID Degrades
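Come to think of it, this HDD was one of the models with the firmware problem. Since I run it around the clock, I estimated the chance of the bug showing up to be low and never did anything about it. Now that the drive has simply died in the ordinary way, it no longer matters.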
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.11
Device Model: ST3500320AS
Serial Number: 9QM6THGV
Firmware Version: SD15
User Capacity: 500,107,862,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sat Feb 26 22:11:11 2011 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 650) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 120) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x103b) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 084 084 006 Pre-fail Always - 9800766
3 Spin_Up_Time 0x0003 095 094 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 65
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 2045
7 Seek_Error_Rate 0x000f 077 060 030 Pre-fail Always - 8698539546
9 Power_On_Hours 0x0032 079 079 000 Old_age Always - 19213
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 65
184 Unknown_Attribute 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 204
188 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 061 039 045 Old_age Always In_the_past 39 (31 161 39 38)
194 Temperature_Celsius 0x0022 039 061 000 Old_age Always - 39 (0 14 0 0)
195 Hardware_ECC_Recovered 0x001a 040 021 000 Old_age Always - 9800766
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 2
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 2
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
ATA Error Count: 235 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 235 occurred at disk power-on lifetime: 19213 hours (800 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 71 04 9d 00 32 e0 Device Fault; Error: ABRT
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
a1 00 00 00 00 00 e0 00 20d+21:02:03.382 IDENTIFY PACKET DEVICE
ec 00 00 00 00 00 e0 00 20d+21:02:03.361 IDENTIFY DEVICE
e5 00 55 9d 00 32 e0 00 20d+21:02:03.359 CHECK POWER MODE
a1 00 00 00 00 00 e0 00 20d+21:02:01.283 IDENTIFY PACKET DEVICE
ec 00 00 00 00 00 e0 00 20d+21:02:01.261 IDENTIFY DEVICE
Error 234 occurred at disk power-on lifetime: 19213 hours (800 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 71 04 9d 00 32 e0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 e0 00 20d+21:02:03.361 IDENTIFY DEVICE
e5 00 55 9d 00 32 e0 00 20d+21:02:03.359 CHECK POWER MODE
a1 00 00 00 00 00 e0 00 20d+21:02:01.283 IDENTIFY PACKET DEVICE
ec 00 00 00 00 00 e0 00 20d+21:02:01.261 IDENTIFY DEVICE
e5 00 55 01 00 00 e0 00 20d+21:02:01.259 CHECK POWER MODE
Error 233 occurred at disk power-on lifetime: 19213 hours (800 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 71 04 9d 00 32 e0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
e5 00 55 9d 00 32 e0 00 20d+21:02:03.359 CHECK POWER MODE
a1 00 00 00 00 00 e0 00 20d+21:02:01.283 IDENTIFY PACKET DEVICE
ec 00 00 00 00 00 e0 00 20d+21:02:01.261 IDENTIFY DEVICE
e5 00 55 01 00 00 e0 00 20d+21:02:01.259 CHECK POWER MODE
00 00 00 00 00 00 00 ff 20d+21:01:56.167 NOP [Abort queued commands]
Error 232 occurred at disk power-on lifetime: 19213 hours (800 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 71 04 9d 00 32 e0 Device Fault; Error: ABRT
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
a1 00 00 00 00 00 e0 00 20d+21:02:01.283 IDENTIFY PACKET DEVICE
ec 00 00 00 00 00 e0 00 20d+21:02:01.261 IDENTIFY DEVICE
e5 00 55 01 00 00 e0 00 20d+21:02:01.259 CHECK POWER MODE
00 00 00 00 00 00 00 ff 20d+21:01:56.167 NOP [Abort queued commands]
a1 00 00 00 00 00 e0 00 20d+21:01:24.676 IDENTIFY PACKET DEVICE
Error 231 occurred at disk power-on lifetime: 19213 hours (800 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 71 04 9d 00 32 e0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 e0 00 20d+21:02:01.261 IDENTIFY DEVICE
e5 00 55 01 00 00 e0 00 20d+21:02:01.259 CHECK POWER MODE
00 00 00 00 00 00 00 ff 20d+21:01:56.167 NOP [Abort queued commands]
a1 00 00 00 00 00 e0 00 20d+21:01:24.676 IDENTIFY PACKET DEVICE
ec 00 00 00 00 00 e0 00 20d+21:01:24.651 IDENTIFY DEVICE
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
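Note that the drive above still reports an overall health result of PASSED even though the raw value of Reallocated_Sector_Ct is 2045. A rough way to catch this earlier would be to look at raw values directly instead of the overall verdict. The following is only a sketch: the watch list `WATCHED` and the function name `suspicious_attributes` are my own assumptions and not part of smartctl, and the sample rows are copied from the attribute table above.

```python
import re

# A few rows copied from the attribute table above.
SAMPLE = """\
  5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 2045
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 204
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 2
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 2
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
"""

# Assumption: a hypothetical watch list of attribute IDs whose non-zero
# raw value is worth a closer look. Adjust to taste.
WATCHED = {5, 187, 197, 198}

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

def suspicious_attributes(text):
    """Return (id, name, raw) for watched attributes with a non-zero raw value."""
    hits = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header or unrelated line
        attr_id = int(m.group(1))
        raw = int(m.group(9))
        if attr_id in WATCHED and raw > 0:
            hits.append((attr_id, m.group(2), raw))
    return hits

if __name__ == "__main__":
    for attr_id, name, raw in suspicious_attributes(SAMPLE):
        print(f"{attr_id:3d} {name}: raw={raw}")
```

Run against the sample rows, this lists attributes 5, 187, 197 and 198, and skips UDMA_CRC_Error_Count because its raw value is 0.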
# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md8 : active raid1 sdb8[2](F) sda8[0]
244189056 blocks [2/1] [U_]
md7 : active raid1 sdb7[2](F) sda7[0]
145893248 blocks [2/1] [U_]
md0 : active raid1 sdb1[2](F) sda1[1]
16383872 blocks [2/1] [_U]
unused devices: <none>
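The `[2/1]` and `(F)` markers in the output above are what show the degraded state: two devices expected, one active, and the sdb member flagged as failed. As a small sketch of how that could be checked automatically, the snippet below parses text in this format. The function name `degraded_arrays` is hypothetical, and the parsing assumes the layout stays exactly as pasted here.

```python
import re

# The /proc/mdstat text from above.
SAMPLE = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md8 : active raid1 sdb8[2](F) sda8[0]
244189056 blocks [2/1] [U_]
md7 : active raid1 sdb7[2](F) sda7[0]
145893248 blocks [2/1] [U_]
md0 : active raid1 sdb1[2](F) sda1[1]
16383872 blocks [2/1] [_U]
unused devices: <none>
"""

def degraded_arrays(text):
    """Map array name -> (status string, failed members) for degraded arrays."""
    found = {}
    current = None
    failed = []
    for line in text.splitlines():
        header = re.match(r"^(md\d+)\s*:\s*active\s+\S+\s+(.*)$", line)
        if header:
            current = header.group(1)
            # Members marked (F) have been flagged as faulty by md.
            failed = re.findall(r"(\w+)\[\d+\]\(F\)", header.group(2))
            continue
        counts = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if counts and current:
            expected, active = int(counts.group(1)), int(counts.group(2))
            if active < expected:
                found[current] = (counts.group(3), failed)
            current = None
    return found

if __name__ == "__main__":
    for name, (status, failed) in degraded_arrays(SAMPLE).items():
        print(name, status, "failed:", ",".join(failed) or "-")
```

For the sample it reports all three arrays (md8, md7, md0) as degraded, each with its sdb partition as the failed member.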
md: bind<sdb1>
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:1, o:1, dev:sdb1
disk 1, wo:0, o:1, dev:sda1
md: recovery of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
md: using 128k window, over a total of 16383872 blocks.
md: bind<sdb7>
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sda7
disk 1, wo:1, o:1, dev:sdb7
md: delaying recovery of md7 until md0 has finished (they share one or more physical units)
md: bind<sdb8>
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sda8
disk 1, wo:1, o:1, dev:sdb8
md: delaying recovery of md8 until md7 has finished (they share one or more physical units)
md: delaying recovery of md7 until md0 has finished (they share one or more physical units)
usb 5-2: reset low speed USB device using uhci_hcd and address 3
usb 5-2: reset low speed USB device using uhci_hcd and address 3
md: md0: recovery done.
md: delaying recovery of md8 until md7 has finished (they share one or more physical units)
md: recovery of RAID array md7
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
md: using 128k window, over a total of 145893248 blocks.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sdb1
disk 1, wo:0, o:1, dev:sda1
ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata4.00: BMDMA stat 0x25
ata4.00: cmd c8/00:10:d0:c9:96/00:00:00:00:00/e0 tag 0 dma 8192 in
res 71/04:04:9d:00:32/00:00:00:00:00/e0 Emask 0x1 (device error)
ata4.00: status: { DRDY DF ERR }
ata4.00: error: { ABRT }
ata4.00: both IDENTIFYs aborted, assuming NODEV
ata4.00: revalidation failed (errno=-2)
ata4: soft resetting link
ata4.00: both IDENTIFYs aborted, assuming NODEV
ata4.00: revalidation failed (errno=-2)
usb 5-2: reset low speed USB device using uhci_hcd and address 3
ata4: soft resetting link
ata4.00: both IDENTIFYs aborted, assuming NODEV
ata4.00: revalidation failed (errno=-2)
ata4.00: disabled
ata4: EH complete
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 9882064
raid1: sdb1: rescheduling sector 9882032
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 9882080
raid1: sdb1: rescheduling sector 9882048
raid1: sdb1: rescheduling sector 9882056
raid1: sdb1: rescheduling sector 9882064
raid1: sdb1: rescheduling sector 9882072
raid1: sdb1: rescheduling sector 9882080
raid1: sdb1: rescheduling sector 9882088
raid1: sdb1: rescheduling sector 9882096
raid1: sdb1: rescheduling sector 9882104
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 200956448
raid1: Disk failure on sdb7, disabling device.
raid1: Operation continuing on 1 devices.
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 200957472
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 200958496
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 200959520
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 32767776
end_request: I/O error, dev sdb, sector 32767776
md: super_written gets error=-5, uptodate=0
raid1: Disk failure on sdb1, disabling device.
raid1: Operation continuing on 1 devices.
md: md7: recovery done.
md: recovery of RAID array md8
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
md: using 128k window, over a total of 244189056 blocks.
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:1, o:0, dev:sdb1
disk 1, wo:0, o:1, dev:sda1
RAID1 conf printout:
--- wd:1 rd:2
disk 1, wo:0, o:1, dev:sda1
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 488394784
raid1: Disk failure on sdb8, disabling device.
raid1: Operation continuing on 1 devices.
md: md8: recovery done.
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 488395808
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 488396832
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 488397856
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sda7
disk 1, wo:1, o:0, dev:sdb7
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sda7
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sda8
disk 1, wo:1, o:0, dev:sdb8
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sda8
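The `res 71/04:...` field in the kernel log carries the same status and error register values (ST 71, ER 04) that smartctl recorded in its error log, and the kernel expands them to `{ DRDY DF ERR }` and `{ ABRT }`. The sketch below shows how those names fall out of the bit masks. The function names are my own, and the tables deliberately list only a subset of the bits, so status bit 0x10, which is also set in 0x71, is ignored just as it is in the kernel line above.

```python
# Status register bits, limited to the ones named in the kernel log.
STATUS_BITS = [
    (0x80, "BSY"),
    (0x40, "DRDY"),
    (0x20, "DF"),
    (0x08, "DRQ"),
    (0x01, "ERR"),
]

# Error register bits, again only a subset for illustration.
ERROR_BITS = [
    (0x80, "ICRC"),
    (0x40, "UNC"),
    (0x10, "IDNF"),
    (0x04, "ABRT"),
]

def bit_names(value, table):
    """Return the names of all bits from table that are set in value."""
    return [name for mask, name in table if value & mask]

def decode_res(res_field):
    """Decode the leading 'status/error' pair of a libata res field."""
    status_hex, error_hex = res_field.split(":")[0].split("/")
    status = int(status_hex, 16)
    error = int(error_hex, 16)
    return bit_names(status, STATUS_BITS), bit_names(error, ERROR_BITS)

if __name__ == "__main__":
    status, error = decode_res("71/04:04:9d:00:32/00:00:00:00:00/e0")
    print("status:", status)
    print("error: ", error)
```

DF (device fault) together with ABRT on every command, including IDENTIFY, matches what the log shows: the drive stopped answering sensibly and the kernel gave up and disabled ata4.00.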
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: WDC WD10EADS-00L5B1
Serial Number: WD-WCAU45330156
Firmware Version: 01.01A01
User Capacity: 1,000,204,886,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sat Feb 26 22:10:41 2011 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (23400) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x303f) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 169 166 021 Pre-fail Always - 6533
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 28
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 075 075 000 Old_age Always - 18784
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 26
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 7
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 28
194 Temperature_Celsius 0x0022 118 094 000 Old_age Always - 32
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.