Hello everyone,
we have one MSA 2040 with two controllers and 49 disk drives (two of them SSDs) running in our company.
Five of the 49 disk drives are configured as "Global Spare".
We are planning to retire the system this coming May, and we will very soon start moving the virtual machines and other data on this SAN to two newer SAN systems.
I'd really like to get your advice on how to react to the disk errors on one of the HDDs (Disk 2.16, see the warning and CLI output further down).
I'm inclined to replace the affected HDD with one of the global spare drives, since I'm afraid the performance of the whole system might suffer while the defective HDD is still in use.
So, what's the best practice for replacing the defective HDD with one of the global spare HDDs? Do I just hot-swap the defective disk for one of the disks configured as "Global Spare", or do I have to make any further configuration changes in the web interface of the controllers?
Since we are not planning to keep using the system for long, I think the 4 remaining global spares should be enough, and we will stop using the system as soon as possible.
Thank you very much for your thoughts and advice.
Here's the warning the system sends us:
Degraded System: action required (SN:**Confidential info erased**)
System Status indicates a degraded system.
Failure to take action may result in loss of availability or loss of data.
Log into system to determine corrective actions.
"show system" output:
System Information
------------------
System Name: san2
System Information: MSA2040-SSD
Midplane Serial Number: **Confidential info erased**
Vendor Name: HP
Product ID: MSA 2040 SAN
Product Brand: MSA Storage
SCSI Vendor ID: HP
SCSI Product ID: MSA 2040 SAN
Enclosure Count: 2
Health: Degraded
Health Reason: A subcomponent of this component is unhealthy.
Other MC Status: Operational
PFU Status: Idle
Supported Locales: English (English), Arabic (العربية), Portuguese (português), Spanish (español), French (français), German (Deutsch), Italian (italiano), Japanese (日本語), Korean (한국어), Dutch (Nederlands), Russian (русский), Chinese-Simplified (简体中文), Chinese-Traditional (繁體中文)
Unhealthy Component
-------------------
Component ID: Disk 2.16
Health: Degraded
Health Reason: The system determined that the indicated disk is degraded because it experienced a number of disk errors in excess of a configured threshold.
Health Recommendation: Monitor the disk.
Success: Command completed successfully. (2024-02-25 00:01:00)
Best wishes
it_faber