changelog Updates
Site: hpc-internal.carnegiescience.edu
Link: https://hpc-internal.carnegiescience.edu?utm_source=noticeable&utm_campaign=3f43ej0latlbxv21efel&utm_content=other&utm_id=3f43Ej0LaTLbXv21eFel.t8lIbf2iSTWZIIP91xqU&utm_medium=newspage
Language: en
Last updated: Tue, 07 Dec 2021 18:26:14 GMT
Copyright © changelog

Memex Login Hung on 11/17/21
ID: qicYM3U7JTa4BbA78PsI
Date: Wed, 17 Nov 2021 21:32:57 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/memex-login-hung-on-11-17-21

Initial incident and probable cause:

A broken pipe on the login node was caused by a file update on the login node. The master node and login node were both rebooted, and the “wwsh file sync” commands were automated in memex_routecheck.sh (run from crontab and cron.hourly). After reboot, the routing table and maintenance motd were both updated sooner than before. A ping check was also added to determine whether the network needs restarting after the routing table is updated.
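
The ping check described above can be sketched roughly as follows. This is only an illustration in Python, not the contents of memex_routecheck.sh, which is a shell script that this post does not show. The host list, the "restart only when no host answers" policy, and the function names are all assumptions made for the example; the restart action is passed in as a callable because the actual restart command is not given here.

```python
import subprocess
from typing import Callable, Iterable


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo request; True if the host replied.

    Assumes the Linux iputils `ping` (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def network_needs_restart(
    hosts: Iterable[str],
    ping: Callable[[str], bool] = ping_once,
) -> bool:
    """Return True when none of the probe hosts answers a ping.

    Assumed policy: a single reachable host means the network is up.
    """
    targets = list(hosts)
    if not targets:
        raise ValueError("at least one probe host is required")
    return not any(ping(host) for host in targets)


def check_and_restart(
    hosts: Iterable[str],
    restart: Callable[[], None],
    ping: Callable[[str], bool] = ping_once,
) -> bool:
    """Run the check after the routing table update; restart if needed.

    `restart` is a hypothetical hook for whatever restarts the network.
    Returns True if a restart was triggered.
    """
    if network_needs_restart(hosts, ping=ping):
        restart()
        return True
    return False
```

Passing `ping` and `restart` in as parameters keeps the decision logic separate from the system calls, so the check can be exercised without touching the real network.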

Tags: New, System Failure, Maintenance, Announcement

Login hangs after kernel message...
ID: kiXk98fZM3iLHtLNxMyQ
Date: Thu, 25 Jun 2020 20:09:00 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/login-hangs-after-kernel-message
Tags: New, System Failure

Master Node Rebooted
ID: zx7mwKxW16mNkfChp64x
Date: Fri, 22 May 2020 02:02:00 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/master-node-rebooted
Tags: System Failure, Fix

Memex unable to accept new user logins
ID: omAlGcd5omhvBQgZVFeq
Date: Tue, 14 Jan 2020 21:45:00 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/memex-unable-to-accept-new-user-logins
Tags: Fix, System Failure

User reported that rsync/cp/scp too slow on /memexnfs/apps,
ID: 5csmLDRBAVK9iyQKDttS
Date: Thu, 19 Dec 2019 17:35:00 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/user-reported-that-rsync-cp-scp-too-slow-on-memexnfs-apps
Tags: System Failure, Announcement, Fix

Login Node Slowness (module command hanging on memex.carnegiescience.edu)
ID: EC7QoW1ZmiKkRDgyNIDr
Date: Mon, 02 Dec 2019 19:57:00 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/login-slowness-module-command-hanging
Tags: Announcement, System Failure, Fix

System Update & Failed disk in SureStoreHD, memexnfs ZFS pool degraded
ID: y3QRYShmzZxk6tnDtkgn
Date: Mon, 14 Oct 2019 15:40:00 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/failed-disk-in-sure-store-hd-memexnfs-zfs-pool-degraded
Tags: Announcement, System Failure

Memory failing in our SureStore UHD server (replacing DIMMs today)
ID: KaPTOoCZ6rcqrRGzijD3
Date: Tue, 30 Jul 2019 18:53:00 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/memory-failing-in-our-sure-store-uhd-server-replacing-dim-ms-today
Tags: System Failure, Announcement

Intel Python Issue.. conda base corrupted
ID: oXsj5DmMQxuikxEAwXp6
Date: Tue, 09 Jul 2019 19:41:00 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/intel-python-issue-conda-base-corrupted
Tags: Announcement, System Failure, Fix

Error via getvnfs networking
ID: F4gcDNCdfQboogwv66pt
Date: Fri, 12 Apr 2019 16:07:00 GMT
Author: [email protected] (Floyd Fayton)
Link: https://noticeable.news/3f43ej0latlbxv21efel/publications/error-via-getvnfs-networking
Tags: System Failure, Fix