Changelog Updates

Memex Login Hung on 11/17/21

by Floyd Fayton
Initial incident and probable cause: Broken pipe on the login, caused by a file update on the login. The master node and login were both rebooted, and the “wwsh file sync” commands were automated via memex_routecheck.sh (crontab and cron.hourly).
New
System Failure
Maintenance
Announcement

Login hangs after kernel message...

by Floyd Fayton, HPC Admin
Update (6/29/20): The login hang was caused by high I/O load, which is returning after a weekend hiatus. Unfortunately, limiting I/O on the login is not yet feasible due to the design of the system. The issue is not due to a lack of...
New
System Failure

Master Node Rebooted

by Floyd Fayton, HPC Admin
Incident (5/21/20): While fixing issues with the GPU nodes, the master node became unstable because several mount points were damaged. All of the mounts were runtime filesystems, so a reboot was requested and fulfilled by SRCF...
System Failure
Fix

Did You Know ... Slack Edition

by Floyd Fayton, HPC Admin
Did you know we have a Slack channel for HPC/Research Computing? Sign up for our Carnegie Institution for Science workspace (click here) and then join the #hpc channel. Please use your Google login, "@carnegiescience.edu", email...
Tips
Announcement
Welcome Guide

Did You Know ... Python Edition

by Floyd Fayton, HPC Admin
Did you know official support for Python 2 has ended? That said, we have Python 3 available on Memex by loading the module "python/3.6.7". This Python version includes conda, R, Jupyter, IntelMPI, and many other packages. Most...
Announcement
Tips
Welcome Guide