Failure of memex-c[013-014,038,072,075] - Dell ticket opened
Timestamp: 2019-02-20 21:31 UTC
System Failure
Announcement
A Dell support ticket was opened on 2/7/19 to address the failure of compute nodes memex-c[013-014,038,072,075]. Onsite and remote logs were emailed to Dell subject matter experts, but they have not found any hardware issues yet.
- Memex-c[013-014] were stable for about a week and were put back into rotation for SLURM jobs.
- Memex-c072 is highly unstable and shuts down during boot.
- Memex-c[038,075] are also unstable: they boot into the OS but later shut down, in some cases up to a week after booting.