hotdisk
content/5.nonsense/2.bash/4.hotdisk.md
@@ -0,0 +1,51 @@
---
navigation: true
title: HotDisk
main:
  fluid: false
---
:ellipsis{left=0px width=40rem top=10rem blur=140px}

# HotDisk
---

When you have a NAS with several drives sitting in a laundry room, temperatures can rise quickly.
Hard drives are very sensitive to heat and can suffer serious damage if they stay above a certain temperature threshold for too long.
After a particularly hot summer that caused a few cold sweats while monitoring my drives’ temperatures, I started looking for a way to automatically shut down the server when disk temperatures stay above their safe limit for an extended period.

Since I couldn’t find a convincing solution, I decided to build my own.

- The script reads SMART temperature data from all SATA drives every minute.
- It counts the number of consecutive minutes the temperature stays above or below the threshold.
- It sends Discord notifications if the threshold is exceeded or when the temperature cools down.
- It triggers a system shutdown if the temperature stays above the limit for the configured duration.
- It logs all temperatures and counter states, and automatically rotates log files.

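The counting logic described above can be sketched roughly as follows. This is a minimal illustration, not the actual HotDisk code: the function names `parse_temp` and `update_counters` and the variables `hot_count`, `cool_count` and `action` are hypothetical. It also assumes that `smartctl -A` reports a `Temperature_Celsius` attribute with its raw value in the tenth column, which is common on SATA drives but not universal. Notifications and the real shutdown call are left out; the sketch only records what it would do in `action`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the per-minute check; all names are illustrative.

MAX_TEMP="${MAX_TEMP:-60}"
HOT_DURATION="${HOT_DURATION:-5}"
COOL_RESET_DURATION="${COOL_RESET_DURATION:-5}"

hot_count=0    # consecutive readings above MAX_TEMP
cool_count=0   # consecutive readings at or below MAX_TEMP
action="ok"    # result of the last update: ok, hot or shutdown

# Extract the raw temperature from `smartctl -A` style output on stdin.
# Assumption: the Temperature_Celsius row keeps its raw value in column 10.
parse_temp() {
  awk '$2 == "Temperature_Celsius" { print $10 + 0; exit }'
}

# Update the counters for one reading and store the outcome in $action.
update_counters() {
  local temp="$1"
  if (( temp > MAX_TEMP )); then
    hot_count=$(( hot_count + 1 ))
    cool_count=0
    if (( hot_count >= HOT_DURATION )); then
      action="shutdown"
    else
      action="hot"
    fi
  else
    cool_count=$(( cool_count + 1 ))
    # Only forget the hot streak once the drive has stayed cool long enough.
    if (( cool_count >= COOL_RESET_DURATION )); then
      hot_count=0
    fi
    action="ok"
  fi
}
```

In a real run, something like `update_counters "$(smartctl -A /dev/sda | parse_temp)"` would be called once per drive each minute, and the counters would have to be saved between runs because a systemd timer starts a fresh process every time.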
While I was at it, I also added an installation script that installs the main script, makes it executable, creates a systemd service and timer, and enables them automatically.
The installer also lets you configure various parameters:

| Variable              | Description                                                            | Default Value                   |
|-----------------------|------------------------------------------------------------------------|---------------------------------|
| `MAX_TEMP`            | Maximum allowed temperature (°C) before the shutdown countdown starts  | `60`                            |
| `HOT_DURATION`        | Consecutive minutes above `MAX_TEMP` before shutdown                   | `5`                             |
| `COOL_RESET_DURATION` | Consecutive minutes below `MAX_TEMP` to reset all counters             | `5`                             |
| `LOG_FILE`            | Path to the main log file                                              | `/var/log/hdd_temp_monitor.log` |
| `LOG_ROTATE_COUNT`    | Number of log files to keep                                            | `7`                             |
| `LOG_ROTATE_PERIOD`   | Log rotation period (`daily` or `weekly`)                              | `daily`                         |
| `DISCORD_WEBHOOK`     | Discord webhook URL for notifications                                  | _Required_                      |

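To illustrate how defaults like the ones in the table can be applied in Bash, here is a small sketch. It assumes the parameters arrive as environment variables, which may not be how the real installer stores them, and the function name `load_config` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: apply the documented defaults, then validate.

load_config() {
  # Keep any value already set in the environment, else use the default.
  : "${MAX_TEMP:=60}"
  : "${HOT_DURATION:=5}"
  : "${COOL_RESET_DURATION:=5}"
  : "${LOG_FILE:=/var/log/hdd_temp_monitor.log}"
  : "${LOG_ROTATE_COUNT:=7}"
  : "${LOG_ROTATE_PERIOD:=daily}"

  # The webhook has no default, so refuse to continue without it.
  if [[ -z "${DISCORD_WEBHOOK:-}" ]]; then
    echo "DISCORD_WEBHOOK is required" >&2
    return 1
  fi

  # Only the two documented rotation periods are accepted.
  case "$LOG_ROTATE_PERIOD" in
    daily|weekly) ;;
    *) echo "LOG_ROTATE_PERIOD must be daily or weekly" >&2; return 1 ;;
  esac
}
```

The `: "${VAR:=default}"` idiom assigns the default only when the variable is unset or empty, so a value such as `HOT_DURATION=10` exported before the call is left untouched.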
The installer also runs a second script that configures **logrotate** with the parameters defined above.
Finally, the installer can even be executed directly via a simple `curl` command followed by one last setup script, which is perfect for the laziest of us.

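As an idea of what the logrotate step might produce, the sketch below renders a configuration from the three log-related variables. The function name `render_logrotate_conf` is hypothetical, and `missingok` and `notifempty` are standard logrotate directives I added for illustration; the real script may use different options.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: print a logrotate configuration built from the
# LOG_FILE, LOG_ROTATE_COUNT and LOG_ROTATE_PERIOD parameters.

render_logrotate_conf() {
  local log_file="${LOG_FILE:-/var/log/hdd_temp_monitor.log}"
  local count="${LOG_ROTATE_COUNT:-7}"
  local period="${LOG_ROTATE_PERIOD:-daily}"

  cat <<EOF
${log_file} {
    ${period}
    rotate ${count}
    missingok
    notifempty
}
EOF
}
```

The output could then be written to a drop-in file, for example `render_logrotate_conf > /etc/logrotate.d/hotdisk` (the file name is an assumption; `/etc/logrotate.d/` is the usual drop-in directory).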
I also had to handle several tricky cases: running as root without sudo, using sudo directly, running as a user without sudo rights, missing dependencies, permission issues, file creation errors, disk data reading errors, and more.

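The root and sudo cases can be handled with a check along these lines. This is only a sketch with hypothetical function names (`detect_privilege`, `run_privileged`), and it has a known limit: it only checks that the `sudo` binary exists, not that the current user is actually allowed to use it.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: decide how privileged commands should be run.

# Print "root" when already root, "sudo" when the sudo binary is
# available, and "none" otherwise. The UID can be passed in for testing.
detect_privilege() {
  local uid="${1:-$(id -u)}"
  if [[ "$uid" -eq 0 ]]; then
    echo "root"
  elif command -v sudo >/dev/null 2>&1; then
    echo "sudo"
  else
    echo "none"
  fi
}

# Run a command with the right prefix, or fail with a clear message.
run_privileged() {
  case "$(detect_privilege)" in
    root) "$@" ;;
    sudo) sudo "$@" ;;
    *) echo "Error: root privileges are required" >&2; return 1 ;;
  esac
}
```

With this in place, a call such as `run_privileged systemctl daemon-reload` behaves the same whether the installer was started as root or through sudo.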
Concurrent access to the status file also had to be managed carefully.

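One common way to serialize access to a shared file in Bash is `flock` from util-linux. The sketch below shows that technique; it is not necessarily what HotDisk does, and the status file path and the helper names `with_status_lock`, `write_status` and `read_status` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: guard a status file with an exclusive lock.

STATUS_FILE="${STATUS_FILE:-/tmp/hotdisk.status}"   # illustrative path

# Run a command while holding an exclusive lock on "$STATUS_FILE.lock".
# The command runs in a subshell, so it should communicate through
# files or its output rather than through shell variables.
with_status_lock() {
  (
    # Wait up to 10 seconds for the lock held on file descriptor 9.
    flock -x -w 10 9 || exit 1
    "$@"
  ) 9>"${STATUS_FILE}.lock"
}

write_status() { printf '%s\n' "$1" > "$STATUS_FILE"; }
read_status()  { cat "$STATUS_FILE"; }
```

A run would then update the state with `with_status_lock write_status "hot_count=3"`, and a second run started at the same moment would wait for the lock instead of reading a half-written file.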
More details are available directly on the repository:

::card
#title
📜 __HotDisk__
#description
[Keep your drives cool!](https://git.djeex.fr/Djeex/hotdisk)
::