hotdisk
51
content/5.nonsense/2.bash/4.hotdisk.md
Normal file
@@ -0,0 +1,51 @@
---
navigation: true
title: HotDisk
main:
  fluid: false
---
:ellipsis{left=0px width=40rem top=10rem blur=140px}

# HotDisk
---

When a NAS with several drives sits in a laundry room, temperatures can rise quickly.
Hard drives are very sensitive to heat and can suffer serious damage if they stay above a given temperature threshold for too long.
After a particularly hot summer that gave me a few cold sweats while I watched my drives’ temperatures, I started looking for a way to shut down the server automatically when disk temperatures stay above their safe limit for an extended period.

Since I couldn’t find a convincing solution, I decided to build my own.

- The script reads SMART temperature data from all SATA drives every minute.
- It counts the number of consecutive minutes the temperature stays above or below the threshold.
- It sends Discord notifications if the threshold is exceeded or when the temperature cools down.
- It triggers a system shutdown if the temperature stays above the limit for the configured duration.
- It logs all temperatures and counter states, and automatically rotates log files.

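
The counting behaviour described in the list above can be sketched as a small state function. This is only an illustration, not the actual HotDisk code: the function name `next_state`, the one-temperature-per-call interface (for example the hottest drive), and the choice to treat a reading exactly equal to `MAX_TEMP` as "not hot" are all assumptions. The variable names and default values come from the configuration table.

```bash
# Hypothetical sketch of the hot/cool counters; not the real HotDisk script.
MAX_TEMP=${MAX_TEMP:-60}
HOT_DURATION=${HOT_DURATION:-5}
COOL_RESET_DURATION=${COOL_RESET_DURATION:-5}

# next_state TEMP HOT_COUNT COOL_COUNT
# Prints "HOT_COUNT COOL_COUNT ACTION" for the next minute,
# where ACTION is none, shutdown, or reset.
# The body runs in a subshell so its variables stay local.
next_state() (
  temp=$1 hot=$2 cool=$3 action=none
  if [ "$temp" -gt "$MAX_TEMP" ]; then
    # Hot reading: extend the hot streak and cancel any cool-down in progress.
    hot=$((hot + 1))
    cool=0
    if [ "$hot" -ge "$HOT_DURATION" ]; then action=shutdown; fi
  elif [ "$hot" -gt 0 ]; then
    # Cool reading while a hot streak is pending: count toward the reset.
    cool=$((cool + 1))
    if [ "$cool" -ge "$COOL_RESET_DURATION" ]; then
      hot=0 cool=0 action=reset
    fi
  fi
  echo "$hot $cool $action"
)
```

With the defaults, `next_state 61 4 0` prints `5 0 shutdown`, and `next_state 45 3 4` prints `0 0 reset`. A caller would run this once per minute, persist the two counters, and map `shutdown` and `reset` to the Discord notifications and the shutdown command.
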
While I was at it, I also wrote an installer that installs the main script, makes it executable, creates a systemd service and timer, and enables them automatically.
The installer also lets you configure the following parameters:

| Variable              | Description                                                           | Default Value                   |
|-----------------------|-----------------------------------------------------------------------|---------------------------------|
| `MAX_TEMP`            | Maximum allowed temperature (°C) before the shutdown countdown starts | `60`                            |
| `HOT_DURATION`        | Consecutive minutes above `MAX_TEMP` before shutdown                  | `5`                             |
| `COOL_RESET_DURATION` | Consecutive minutes below `MAX_TEMP` to reset all counters            | `5`                             |
| `LOG_FILE`            | Path to the main log file                                             | `/var/log/hdd_temp_monitor.log` |
| `LOG_ROTATE_COUNT`    | Number of log files to keep                                           | `7`                             |
| `LOG_ROTATE_PERIOD`   | Log rotation period (`daily` or `weekly`)                             | `daily`                         |
| `DISCORD_WEBHOOK`     | Discord webhook URL for notifications                                 | _Required_                      |

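
As a rough illustration of how these parameters could be expressed in shell, the sketch below applies the documented defaults and checks the two constraints the table states: `DISCORD_WEBHOOK` is required, and `LOG_ROTATE_PERIOD` is `daily` or `weekly`. How the real installer stores and validates its settings is not shown in this post, so the `validate_config` function is hypothetical.

```bash
# Hypothetical defaults and validation; values are taken from the table above.
: "${MAX_TEMP:=60}"
: "${HOT_DURATION:=5}"
: "${COOL_RESET_DURATION:=5}"
: "${LOG_FILE:=/var/log/hdd_temp_monitor.log}"
: "${LOG_ROTATE_COUNT:=7}"
: "${LOG_ROTATE_PERIOD:=daily}"

# validate_config: returns 0 when the settings are usable, 1 otherwise.
validate_config() {
  if [ -z "${DISCORD_WEBHOOK:-}" ]; then
    echo "DISCORD_WEBHOOK is required" >&2
    return 1
  fi
  case $LOG_ROTATE_PERIOD in
    daily|weekly) ;;
    *) echo "LOG_ROTATE_PERIOD must be daily or weekly" >&2; return 1 ;;
  esac
}
```

The `: "${VAR:=value}"` form only assigns a default when the variable is unset or empty, so values exported before the script runs take precedence.
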
The installer also runs a second script that configures **logrotate** with the parameters defined above.
Finally, the installer can be run directly with a single `curl` command followed by one last setup script, which is perfect for the laziest of us.

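
To give an idea of what the logrotate step might produce, here is a minimal sketch that renders a logrotate stanza from `LOG_FILE`, `LOG_ROTATE_COUNT`, and `LOG_ROTATE_PERIOD`. The function name `render_logrotate` is hypothetical, and the extra directives (`missingok`, `notifempty`, `compress`) are common logrotate options added as assumptions; the real script may use different ones.

```bash
# Hypothetical helper: print a logrotate stanza for the given settings.
# render_logrotate LOG_FILE LOG_ROTATE_COUNT LOG_ROTATE_PERIOD
render_logrotate() {
  printf '%s {\n' "$1"
  printf '    %s\n' "$3"          # rotation period: daily or weekly
  printf '    rotate %s\n' "$2"   # number of rotated files to keep
  printf '    missingok\n'        # do not fail if the log does not exist yet
  printf '    notifempty\n'       # skip rotation when the log is empty
  printf '    compress\n'         # gzip rotated files
  printf '}\n'
}
```

For example, `render_logrotate /var/log/hdd_temp_monitor.log 7 daily` prints a stanza that an installer could write to a file under `/etc/logrotate.d/`.
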
I also had to handle several tricky cases: running as root without sudo, using sudo directly, running as a non-sudo user, missing dependencies, permission issues, file creation errors, disk data reading errors, and more.

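
Two of the cases above, missing dependencies and root versus sudo, can be sketched as follows. Both function names are hypothetical, and the commands to check (for instance `smartctl` and `curl`) are an assumption about what the script depends on.

```bash
# Hypothetical helpers; not taken from the HotDisk repository.

# missing_deps CMD...: print each command that is not found in PATH.
missing_deps() (
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
)

# privilege_prefix: print the prefix needed to run privileged commands.
# Prints an empty line when already root, "sudo" when sudo is installed,
# and fails otherwise.
privilege_prefix() {
  if [ "$(id -u)" -eq 0 ]; then
    echo ""
  elif command -v sudo >/dev/null 2>&1; then
    echo "sudo"
  else
    echo "root privileges are required and sudo is not available" >&2
    return 1
  fi
}
```

An installer could call `missing_deps smartctl curl` first and stop with a clear message if it prints anything, then use the result of `privilege_prefix` in front of commands that need root.
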
Concurrent access to the status file also had to be managed carefully.

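
One common way to serialize access to a shared file in shell is `flock` from util-linux. The sketch below is an assumption about how this could be done, not a description of how HotDisk does it; the status file path and both function names are hypothetical.

```bash
# Hypothetical sketch using flock(1); requires util-linux.
STATUS_FILE=${STATUS_FILE:-/tmp/hotdisk.status}   # illustrative path

# write_status TEXT: replace the status file while holding an exclusive lock.
# The subshell body closes fd 9 on exit, which releases the lock.
write_status() (
  exec 9>"$STATUS_FILE.lock"
  flock -x 9 || exit 1
  # Write to a temporary file, then rename, so readers never see a partial file.
  printf '%s\n' "$1" > "$STATUS_FILE.tmp" && mv "$STATUS_FILE.tmp" "$STATUS_FILE"
)

# read_status: print the status file while holding a shared lock.
read_status() (
  exec 9>"$STATUS_FILE.lock"
  flock -s 9 || exit 1
  cat "$STATUS_FILE"
)
```

Keeping the lock on a separate `.lock` file means the status file itself can be replaced atomically without invalidating the lock.
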
More details are available directly on the repository:

::card
#title
📜 __HotDisk__
#description
[Keep your drives cool!](https://git.djeex.fr/Djeex/hotdisk)
::