New directory and icons
141
content/5.nonsense/2.bash/1.servarr-duplicates.md
Normal file
@ -0,0 +1,141 @@
---
navigation: true
title: Bash Scripts
main:
  fluid: false
---
:ellipsis{left=0px width=40rem top=10rem blur=140px}
# Servarr duplicates corrector
---

Six months after downloading terabytes of media, I realized that Sonarr and Radarr were copying files into my Plex library instead of creating hardlinks. The cause is counterintuitive: if you mount several folders separately in Sonarr/Radarr, each mount looks like a different filesystem, and hardlinks cannot cross filesystems. That’s why you should mount a single parent folder containing all the child folders (for example `downloads`, `movies`, and `tvseries` inside a `media` parent folder).
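
Before restructuring anything, it can help to confirm whether two folders can actually be hardlinked from where Sonarr/Radarr runs. Here is a minimal sketch, assuming GNU coreutils and write access to both folders. `can_hardlink` and the probe file name are hypothetical helpers, not part of Sonarr or Radarr:

```shell
#!/bin/bash

# Try to hardlink a throwaway probe file from one folder into another.
# Returns 0 if the link works, 1 if it is refused, 2 if the probe
# file cannot be created in the source folder.
can_hardlink() {
    local src_dir=$1 dst_dir=$2 probe link
    probe=$(mktemp "$src_dir/.hardlink-probe.XXXXXX") || return 2
    link="$dst_dir/$(basename "$probe")"
    if ln "$probe" "$link" 2>/dev/null; then
        # Clean up both names of the probe
        rm -f "$probe" "$link"
        return 0
    fi
    rm -f "$probe"
    return 1
}
```

Run where Sonarr/Radarr runs, `can_hardlink /media/seedbox /media/movies && echo "hardlinks OK"` should print the message once both folders sit under the same mount. Comparing device numbers with `stat` alone may not be enough, because separate mounts of one filesystem can report the same device and still refuse the link.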

So I restructured my directories and manually updated every path in qBittorrent, Plex, and the other apps. The last challenge was to detect the existing duplicates, delete them, and automatically create hardlinks in their place to save space.

My directory structure:

```console
.
└── media
    ├── seedbox
    ├── radarr
    │   └── tv-radarr
    ├── movies
    └── tvseries
```

The originals are in `seedbox` and must not be modified, so that they keep seeding. The copies (duplicates) are in `movies` and `tvseries`. To complicate things, `movies` and `tvseries` also contain unique originals, and any of these folders can contain subfolders, sub-subfolders, and so on.

So the idea is to:

- list the originals in `seedbox`
- list the files in `movies` and `tvseries`
- compare both lists and isolate the duplicates
- delete the duplicates
- hardlink the originals to the paths of the deleted duplicates

Yes, I asked ChatGPT and Qwen3 (which I host on a dedicated AI machine). Naturally, they suggested tools like rfind, rdfind, dupes, rdupes, rmlint... But hashing 30TB of media would take days, so I gave up quickly.

In the end, I only needed to find `.mkv` files, and the duplicates have exactly the same name as the originals, which simplifies things a lot. A simple Bash script would do the job.
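
Matching on the name alone can pair two different files that happen to share a file name. A cheap extra guard, well short of hashing, is to compare sizes too. This is a sketch assuming GNU `stat`. `is_probable_duplicate` is a hypothetical helper and is not used by the scripts in this post:

```shell
#!/bin/bash

# Return 0 when candidate looks like a space-wasting copy of original:
# same file name, same size in bytes, but not already the same inode.
is_probable_duplicate() {
    local original=$1 candidate=$2
    [ "$(basename "$original")" = "$(basename "$candidate")" ] || return 1
    [ "$(stat -c %s "$original")" = "$(stat -c %s "$candidate")" ] || return 1
    # -ef is true when both paths point at the same device and inode,
    # i.e. they are already hardlinked and nothing can be reclaimed.
    if [ "$original" -ef "$candidate" ]; then
        return 1
    fi
    return 0
}
```

Equal name and size is still not proof of identical content, so for a handful of doubtful pairs a byte-by-byte `cmp` remains an option.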

I’ll spare you the endless Q&A with ChatGPT: I was disappointed. Qwen3 was much cleaner. ChatGPT kept pushing awk-based solutions, which failed on paths containing spaces. With Qwen’s help and without awk, the results improved significantly.

To test, I first asked for a script that only lists and compares:

```bash
#!/bin/bash

# Create an associative array to store duplicates
declare -A seen

# Find all .mkv files only (exclude directories)
find /media/seedbox /media/movies /media/tvseries -type f -name "*.mkv" -print0 | \
while IFS= read -r -d '' file; do
    # Get the file's inode and name
    inode=$(stat --format="%i" "$file")
    filename=$(basename "$file")

    # If the filename has been seen before
    if [[ -n "${seen[$filename]}" ]]; then
        # Check if the inode is different from the previous one
        if [[ "${seen[$filename]}" != "$inode" ]]; then
            # Output the duplicates with full paths
            echo "Duplicates for \"$filename\":"
            echo "${seen["$filename"]} ${seen["$filename:full_path"]}"
            echo "$inode $file"
            echo
        fi
    else
        seen[$filename]="$inode"
        seen["$filename:full_path"]="$file"
    fi
done
```

This gave me output like:

```
Duplicates for "episode1.mkv":
1234567 /media/seedbox/sonarr/Serie 1/Season1/episode1.mkv
2345678 /media/tvseries/Serie 1/Season1/episode1.mkv
```

With `awk`, it would’ve stopped at `/media/seedbox/sonarr/Serie`. I’m far from an expert, but Qwen3 performed better and explained everything clearly.

Once I had verified the output, I asked for a complete script: compare, delete the duplicates, create the hardlinks.

Again, ChatGPT disappointed. Despite my requests, it created the hardlinks *before* deleting the duplicates, effectively linking and then deleting the link (though the original is kept). Not helpful.
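
For what it's worth, there is an ordering that never leaves the library path empty: create the hardlink under a temporary name next to the duplicate, then rename it over the duplicate. This is a sketch of an alternative technique, not what either model produced and not what the script in this post does. `replace_with_hardlink` and the `.hardlink-tmp` suffix are hypothetical:

```shell
#!/bin/bash

# Replace "duplicate" with a hardlink to "original" without a window
# in which the duplicate's path does not exist.
replace_with_hardlink() {
    local original=$1 duplicate=$2
    local tmp="${duplicate}.hardlink-tmp"
    # Step 1: link under a temporary name in the duplicate's folder.
    # If this fails (e.g. different filesystems), nothing was deleted.
    ln "$original" "$tmp" || return 1
    # Step 2: rename the link over the duplicate, which drops the old copy.
    if ! mv -f "$tmp" "$duplicate"; then
        rm -f "$tmp"
        return 1
    fi
}
```

If the `ln` step fails, the duplicate is still in place, which is the main reason to prefer this ordering over deleting first.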

Quick stopover at Qwen3, RTX 5090 in overdrive, and bam: a much better result. Yes, it kept the ChatGPT-style emojis, but here it is:

```bash
#!/bin/bash

echo "🔍 Step 1: Indexing original files in /media/seedbox..."
declare -A seen

# Index all .mkv files in seedbox (file name → full path)
while IFS= read -r -d '' file; do
    filename=$(basename "$file")
    seen["$filename"]="$file"
done < <(find /media/seedbox -type f -name "*.mkv" -print0)

echo "📦 Step 2: Automatically replacing duplicates..."
total_doublons=0
total_ko_saved=0

while IFS= read -r -d '' file; do
    filename=$(basename "$file")
    original="${seen[$filename]}"

    # Only act when seedbox holds a file with the same name
    if [[ -n "$original" && "$original" != "$file" ]]; then
        inode_orig=$(stat -c %i "$original")
        inode_dupe=$(stat -c %i "$file")

        # Different inodes: a real copy, not yet a hardlink
        if [[ "$inode_orig" != "$inode_dupe" ]]; then
            size_kb=$(du -k "$file" | cut -f1)
            echo "🔁 Replacing:"
            echo "   Duplicate : $file"
            echo "   Original  : $original"
            echo "   Size      : ${size_kb} KB"

            # Count the file only if both the delete and the hardlink succeeded
            if rm "$file" && ln "$original" "$file"; then
                echo "✅ Hardlink created."
                total_doublons=$((total_doublons + 1))
                total_ko_saved=$((total_ko_saved + size_kb))
            else
                echo "❌ Could not replace $file" >&2
            fi
        fi
    fi
done < <(find /media/movies /media/tvseries -type f -name "*.mkv" -print0)

echo ""
echo "🧾 Summary:"
echo "   🔗 Duplicates replaced by hardlink: $total_doublons"
echo "   💾 Approx. disk space saved: ${total_ko_saved} KB (~$((total_ko_saved / 1024)) MB)"
echo "✅ Done."
```
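
After a run, one quick check is to list the `.mkv` files that still have a link count of 1, meaning no other path points at the same data. This is a sketch assuming GNU `find`. `list_unlinked_mkv` is a hypothetical helper:

```shell
#!/bin/bash

# Print every .mkv file under the given folders whose link count is 1,
# i.e. files that are not hardlinked to any other path.
list_unlinked_mkv() {
    find "$@" -type f -name '*.mkv' -links 1 -print
}
```

`list_unlinked_mkv /media/movies /media/tvseries` should then show only the unique originals that have no counterpart in `seedbox`. Anything else in that list is worth a second look.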

So, in conclusion, I:
- Learned many Bash subtleties
- Learned never to blindly copy-paste a ChatGPT script without understanding it and dry-running it
- Learned that Qwen on an RTX 5090 is more coherent than ChatGPT-4o on server farms (not even mentioning “normal” ChatGPT)
- Learned that even with 100TB of storage, monitoring it would’ve alerted me much earlier to the 12TB of duplicates lying around
88
content/5.nonsense/2.bash/2.luks- backup.md
Normal file
@ -0,0 +1,88 @@
---
navigation: true
title: LUKS Backup
main:
  fluid: false
---
:ellipsis{left=0px width=40rem top=10rem blur=140px}

# Backup of LUKS Headers for Encrypted Disks/Volumes
---

I recently realized that having just the password is not enough to unlock a LUKS volume after a failure or corruption: the header holds the key slots, so if it is damaged the password alone cannot open the volume. I learned how to dump the LUKS headers from disks/volumes and how to use the serial numbers along with the partition names to identify exactly which header belongs to which disk/partition (I have 10 of them!).

After struggling to do this manually, I asked Qwen3 (an LLM running on my RTX 5090) to create a script that automates the listing and identification of disks, dumps the headers, and stores them in an encrypted archive ready to be copied to my backup server.

This script:
* Lists and identifies disks with their serial numbers
* Lists partitions
* Dumps headers into a secured folder under `/root`
* Creates a temporary archive
* Prompts for a password
* Encrypts the archive with that password
* Deletes the unencrypted archive

```bash
#!/bin/bash

# Keep everything created by this script readable by root only
umask 077

# Directory where LUKS headers will be backed up
DEST="/root/luks-headers-backup"
mkdir -p "$DEST"

echo "🔍 Searching for LUKS containers on all partitions..."

# Loop through all possible disks and partitions (including NVMe and SATA)
for part in /dev/sd? /dev/sd?? /dev/nvme?n?p?; do
    # Skip if the device doesn't exist
    if [ ! -b "$part" ]; then
        continue
    fi

    # Check if the device is a LUKS encrypted volume
    if cryptsetup isLuks "$part"; then
        # Find the parent disk device (e.g. nvme0n1p4 → nvme0n1)
        disk=$(lsblk -no pkname "$part" | head -n 1)
        # A whole-disk LUKS device has no parent: use the device itself
        full_disk="/dev/${disk:-$(basename "$part")}"

        # Get the serial number of the parent disk
        SERIAL=$(udevadm info --query=all --name="$full_disk" | grep ID_SERIAL= | cut -d= -f2)
        if [ -z "$SERIAL" ]; then
            SERIAL="unknown"
        fi

        # Extract the partition name (e.g. nvme0n1p4)
        PART_NAME=$(basename "$part")

        # Build the output filename with partition name and disk serial
        OUTPUT="$DEST/luks-header-${PART_NAME}__${SERIAL}.img"

        echo "🔐 Backing up LUKS header of $part (Serial: $SERIAL)..."

        # Backup the LUKS header to the output file.
        # cryptsetup refuses to overwrite an existing backup file, so
        # move old images away before running the script again.
        if cryptsetup luksHeaderBackup "$part" --header-backup-file "$OUTPUT"; then
            echo "✅ Backup successful → $OUTPUT"
        else
            echo "❌ Backup failed for $part"
        fi
    fi
done

# Create a timestamped compressed tar archive of all header backups
ARCHIVE_NAME="/root/luks-headers-$(date +%Y%m%d_%H%M%S).tar.gz"
echo "📦 Creating archive $ARCHIVE_NAME..."
tar -czf "$ARCHIVE_NAME" -C "$DEST" .

# Encrypt the archive symmetrically using GPG with AES256 cipher
echo "🔐 Encrypting the archive with GPG..."
if gpg --symmetric --cipher-algo AES256 "$ARCHIVE_NAME"; then
    echo "✅ Encrypted archive created: ${ARCHIVE_NAME}.gpg"
    # Remove the unencrypted archive for security
    rm -f "$ARCHIVE_NAME"
else
    echo "❌ Encryption failed"
fi
```
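
When it is time to use a backup, the file name is what tells you which disk a header belongs to. Here is a small sketch that splits a name built by the script above back into its parts. `describe_header_backup` is a hypothetical helper and the serial in the example is made up:

```shell
#!/bin/bash

# Split "luks-header-<partition>__<serial>.img" into its two parts.
describe_header_backup() {
    local name part serial
    name=$(basename "$1" .img)      # drop the folder and the .img suffix
    name=${name#luks-header-}       # drop the fixed prefix
    part=${name%%__*}               # everything before the first "__"
    serial=${name#*__}              # everything after the first "__"
    printf 'partition=%s serial=%s\n' "$part" "$serial"
}

# Example with a made-up serial:
#   describe_header_backup /root/luks-headers-backup/luks-header-nvme0n1p4__EXAMPLE_SERIAL_123.img
#   prints: partition=nvme0n1p4 serial=EXAMPLE_SERIAL_123
```

Device names can change between boots, so it is probably safer to match on the serial (for example against `lsblk -o NAME,SERIAL`) before running `cryptsetup luksHeaderRestore <device> --header-backup-file <file>` on anything.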

**Don’t forget to back up `/etc/fstab` and `/etc/crypttab` as well!**
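
One way to do that is to copy both files into the same folder as the header images before the archive is created, so they end up in the encrypted archive as well. This is a sketch only. `backup_mount_config` is a hypothetical helper and is not part of the script above:

```shell
#!/bin/bash

# Copy the given config files into a destination folder, keeping
# their permissions and timestamps. Missing files are skipped.
backup_mount_config() {
    local dest=$1 file
    shift
    mkdir -p "$dest" || return 1
    for file in "$@"; do
        if [ -f "$file" ]; then
            cp -p "$file" "$dest/" || return 1
        else
            echo "Skipping missing file: $file" >&2
        fi
    done
}
```

Calling `backup_mount_config /root/luks-headers-backup /etc/fstab /etc/crypttab` just before the `tar` step would be one place to hook it in.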
2
content/5.nonsense/2.bash/_dir.yml
Normal file
@ -0,0 +1,2 @@
navigation.title: Bash
icon: lucide:file-terminal