Every hard drive manufactured since the mid-1990s contains a self-monitoring system called SMART (Self-Monitoring, Analysis and Reporting Technology) that continuously logs dozens of drive health metrics — power-on hours, reallocated sectors, temperature fluctuations, error rates, and more. Most people never look at this data until a drive fails. Reading it proactively can give you weeks or months of warning before a catastrophic failure — enough time to replace the drive and restore from backup with zero data loss.
This guide explains which SMART attributes actually predict failure (many of them don’t), how to read the data in both Windows and Linux, and what specific values should trigger concern.
How to Access SMART Data
Windows: CrystalDiskInfo
Download and install CrystalDiskInfo (free, from crystalmark.info). Open it. Each drive in your system appears as a tab. The main window shows a health status at top-left (Good, Caution, or Bad) based on whether any attributes have exceeded their threshold values. The table below shows individual SMART attributes with their Current, Worst, Threshold, and Raw values.
Linux / TrueNAS / OpenMediaVault: smartctl
Install smartmontools: sudo apt install smartmontools. Run a SMART check: sudo smartctl -a /dev/sda (replace sda with your drive’s device name — use lsblk to list drives). The output shows all SMART attributes plus the drive’s overall self-assessment result.
TrueNAS and OpenMediaVault both have built-in SMART reporting in their web interfaces — navigate to Storage → Disks → Disk (in TrueNAS) or Disks → S.M.A.R.T. (in OMV) to see drive health at a glance.
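If you would rather work with the numbers than read them off the screen, the attribute table that smartctl prints can be turned into a dictionary. The following is a minimal sketch, assuming the usual ten-column ATA layout (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE). The function name parse_smart_table and the SAMPLE text are made up for illustration, and NVMe drives report a different format that this does not handle.

```python
def parse_smart_table(text):
    """Parse the attribute table from `smartctl -A` (or `-a`) output.

    Returns {attribute_id: {"name", "value", "worst", "thresh", "raw"}}.
    Assumes the ten-column ATA table; this is a sketch, not a full parser.
    """
    attrs = {}
    in_table = False
    for line in text.splitlines():
        if line.startswith("ID#"):
            in_table = True            # header row; attribute rows follow
            continue
        if not in_table:
            continue
        if not line.strip():
            break                      # a blank line ends the table
        parts = line.split(None, 9)    # RAW_VALUE (10th column) may contain spaces
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        first_raw_token = parts[9].split()[0]
        attrs[int(parts[0])] = {
            "name": parts[1],
            "value": int(parts[3]),    # normalized current value
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            # Keep only the leading number of the raw field, if it has one.
            "raw": int(first_raw_token) if first_raw_token.isdigit() else None,
        }
    return attrs


# Hypothetical sample output, trimmed to four attributes.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
194 Temperature_Celsius     0x0022   035   045   000    Old_age   Always       -       35 (Min/Max 20/45)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    for attr_id, info in parse_smart_table(SAMPLE).items():
        print(attr_id, info["name"], info["raw"])
```

To use it on a real drive, pass in the text captured from sudo smartctl -A /dev/sda, for example via subprocess.run(..., capture_output=True, text=True).stdout.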
The SMART Attributes That Actually Matter
There are 60+ potential SMART attributes. Most are irrelevant for predicting failure. Based on research from Backblaze (which monitors tens of thousands of drives in production) and academic failure prediction studies, these are the attributes that actually correlate with imminent drive failure:
Critical: Any Non-Zero Value Is a Red Flag
| SMART ID | Attribute Name | What It Means | Safe Value |
|---|---|---|---|
| 05 | Reallocated Sectors Count | Bad sectors the drive moved to spare area. Each one = physical damage. | 0 (even 1 is concerning) |
| C5 | Current Pending Sectors | Sectors the drive suspects are bad but hasn’t confirmed yet. Often precede reallocation. | 0 ideally, investigate above 0 |
| C6 | Uncorrectable Sector Count | Sectors that can’t be read even after multiple retries. Data is lost. | Must be 0 |
| BB | Reported Uncorrectable Errors | Seagate-specific. Same concept as C6 on WD drives. | Must be 0 |
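
One practical wrinkle: CrystalDiskInfo lists attribute IDs in hexadecimal, as in the table above, while smartctl prints them in decimal. The sketch below converts the four critical IDs and flags any that report a non-zero raw value. The names CRITICAL_HEX and critical_flags are illustrative, and the input is assumed to be a plain {decimal_id: raw_value} dictionary that you have already collected.

```python
# Hex IDs as shown in the table above, mapped to attribute names.
CRITICAL_HEX = {
    "05": "Reallocated Sectors Count",
    "C5": "Current Pending Sectors",
    "C6": "Uncorrectable Sector Count",
    "BB": "Reported Uncorrectable Errors",
}

# The same attributes keyed by decimal ID, as smartctl prints them.
CRITICAL = {int(hex_id, 16): name for hex_id, name in CRITICAL_HEX.items()}


def critical_flags(raw_by_id):
    """Return (id, name, raw) for each critical attribute with a non-zero raw value.

    Attributes the drive does not report are skipped rather than flagged.
    """
    return [
        (attr_id, CRITICAL[attr_id], raw_by_id[attr_id])
        for attr_id in sorted(CRITICAL)
        if raw_by_id.get(attr_id, 0) > 0
    ]


if __name__ == "__main__":
    print(sorted(CRITICAL))                           # decimal IDs
    print(critical_flags({5: 0, 197: 8, 198: 0}))     # only pending sectors flagged
```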
Important: Worth Monitoring, Not Immediately Alarming
| SMART ID | Attribute Name | What to Watch For |
|---|---|---|
| 01 | Raw Read Error Rate | Seagate drives show large raw values that aren’t concerning — look at the normalized value (should be above threshold). |
| C4 | Reallocation Event Count | Each reallocation event chips away at spare sector reserves. Rising count = drive is accumulating damage. |
| C7 | UltraDMA CRC Error Count | Transfer errors on the cable/connection between the drive and the controller. Rising values often indicate a bad SATA cable, not the drive itself. Try swapping the cable first. |
| BE / C2 | Temperature | Keep spinning drives below 50°C under load. Above 55°C = airflow problem. Starting a drive below 0°C risks damage. |
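
Two of the checks in this table are simple comparisons, sketched below. The first treats an attribute as failing once its normalized value is at or below the vendor threshold, which is how smartmontools defines a failed attribute. The second applies the temperature limits from the table. The function names and the "warm" label for the 50–55°C range are assumptions added for illustration.

```python
def below_threshold(value, thresh):
    """True when a normalized value has dropped to or below its threshold.

    A threshold of 0 means the attribute can never trip, so it is ignored.
    """
    return thresh > 0 and value <= thresh


def temperature_band(celsius):
    """Classify a drive temperature using the limits from the table above."""
    if celsius < 0:
        return "too cold to start"
    if celsius < 50:
        return "ok"
    if celsius <= 55:
        return "warm"            # assumed label for the gap between 50 and 55
    return "airflow problem"


if __name__ == "__main__":
    # A large Seagate-style raw value does not matter; the normalized value does.
    print(below_threshold(value=83, thresh=6))
    print(temperature_band(52))
```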
Context: Informational, Not Predictive
| SMART ID | Attribute Name | Notes |
|---|---|---|
| 09 | Power-On Hours Count | Tells you drive age. Consumer desktop drives are typically rated for about 2,400 power-on hours/year, well short of continuous use (8,760 hours/year). 40,000+ hours = genuinely aged, consider replacement. |
| 0C | Power Cycle Count | How many times the drive has been powered on/off. A low count relative to power-on hours is normal on NAS drives, since they run 24/7. |
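
Power-on hours are easier to judge once converted to years. The arithmetic below is a small sketch: it divides the raw hour count by the hours in a year of continuous running (24 × 365 = 8,760), or by a lighter duty cycle such as the 2,400 hours/year figure above. The function name years_of_use is illustrative.

```python
HOURS_PER_YEAR_CONTINUOUS = 24 * 365   # 8,760 hours for a drive that never powers down


def years_of_use(power_on_hours, hours_per_year=HOURS_PER_YEAR_CONTINUOUS):
    """Convert a Power-On Hours raw value into years at a given duty cycle."""
    return power_on_hours / hours_per_year


if __name__ == "__main__":
    print(round(years_of_use(40_000), 1))          # 4.6 years of 24/7 operation
    print(round(years_of_use(40_000, 2_400), 1))   # 16.7 years at 2,400 hours/year
```

So a NAS drive reaches the 40,000-hour mark in roughly four and a half years, while a lightly used desktop drive may never get there.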
Running SMART Tests Proactively
Checking SMART attributes shows you the current state, but running a SMART self-test makes the drive actively look for problems. There are two test types:
- Short test: Takes 2–5 minutes. Tests the drive’s electrical components and a sample of sectors. Run this first when checking a new or used drive.
- Long/Extended test: Tests every sector on the entire drive. Takes 8–24 hours depending on drive capacity. Detects physical surface defects the short test misses. Run this on any drive before adding it to a NAS or RAID array.
In CrystalDiskInfo: Tools → Short or Extended SMART Test.
In Linux: sudo smartctl -t long /dev/sda (check status later with sudo smartctl -l selftest /dev/sda).
For any shucked drive you’re adding to a NAS, always run a full extended test before relying on it. This catches latent defects before your data is on the drive rather than after.
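Once a test finishes, the self-test log tells you whether it passed. The sketch below parses the rows printed by sudo smartctl -l selftest /dev/sda, assuming the common ATA layout in which each row starts with "#" and columns are separated by two or more spaces. The function name, the sample log, and the rule that a status beginning with "Completed:" means failure are assumptions based on typical smartctl output, so check them against what your drive actually prints.

```python
import re

# One log row: "# 1  Extended offline    Completed without error       00%     12345         -"
ROW = re.compile(
    r"^#\s*(\d+)\s{2,}(.+?)\s{2,}(.+?)\s{2,}(\d+)%\s+(\d+)\s+(\S+)\s*$"
)


def parse_selftest_log(text):
    """Return one dict per self-test log row, most recent first (as printed)."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        num, test, status, remaining, hours, lba = match.groups()
        entries.append({
            "num": int(num),
            "test": test.strip(),
            "status": status.strip(),
            "remaining_pct": int(remaining),
            "lifetime_hours": int(hours),
            "lba": lba,                      # "-" when no error was found
            # Assumed convention: failures read "Completed: <reason>".
            "failed": status.strip().startswith("Completed:"),
        })
    return entries


# Hypothetical log with one failed extended test and one clean short test.
SAMPLE_LOG = """\
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     12345         987654321
# 2  Short offline       Completed without error       00%     12001         -
"""

if __name__ == "__main__":
    for entry in parse_selftest_log(SAMPLE_LOG):
        print(entry["num"], entry["test"], "FAILED" if entry["failed"] else "ok")
```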
Automating SMART Monitoring
Manual SMART checks are better than nothing, but automated monitoring catches problems you’d otherwise miss. Options:
- TrueNAS: Built-in SMART monitoring with email alerts. Configure under System → Email and enable SMART tests under Storage → SMART Tests. You’ll receive an email if any drive reports a problem.
- OpenMediaVault: S.M.A.R.T. section allows scheduled tests and email notification on failures.
- Windows (CrystalDiskInfo): Enable “Resident” mode — the app runs in the system tray and notifies you via pop-up if any drive enters “Caution” or “Bad” status.
- smartd (Linux): A daemon that monitors drives and sends email alerts. Configure in /etc/smartd.conf.
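
Much of what these tools do comes down to noticing when a counter goes up between checks. If you want to script that yourself, here is a minimal sketch that compares two snapshots of raw values. It assumes each snapshot is a {decimal_id: raw_value} dictionary you have saved from an earlier and a current reading. The names WATCHED and rising_counters are illustrative, and storing the snapshots and sending the alert are left out.

```python
# Counters from this guide where an increase matters more than the absolute value.
WATCHED = {
    5: "Reallocated Sectors Count",
    196: "Reallocation Event Count",
    197: "Current Pending Sectors",
    199: "UltraDMA CRC Error Count",
}


def rising_counters(previous, current, watched=WATCHED):
    """Return {id: (old, new)} for watched counters that grew between snapshots.

    Attributes missing from either snapshot are ignored.
    """
    changes = {}
    for attr_id in watched:
        old = previous.get(attr_id)
        new = current.get(attr_id)
        if old is not None and new is not None and new > old:
            changes[attr_id] = (old, new)
    return changes


if __name__ == "__main__":
    last_week = {5: 2, 197: 0, 199: 10}
    today = {5: 4, 197: 0, 199: 10}
    for attr_id, (old, new) in rising_counters(last_week, today).items():
        print(f"{WATCHED[attr_id]} rose from {old} to {new}")
```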
When to Replace a Drive: The Decision Framework
Replace immediately, no questions asked: any non-zero Uncorrectable Sector Count (C6/BB), Reallocated Sectors Count above 5 and rising, or a failed SMART self-test.
Replace soon (plan your replacement within 30 days): Reallocated Sectors Count at 1–5 with no recent increase, any Pending Sectors above 0 that persist after a long SMART test, or Power-On Hours above 50,000 on a consumer drive without RAID redundancy.
Monitor closely, replacement not urgent: Power-On Hours above 30,000 on an unredundant drive, temperatures consistently above 45°C, or a one-time CRC error that hasn’t recurred after a cable swap.
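The framework above can be written down as a short function, which makes the order of the checks explicit. This is a sketch rather than a definitive rule set: the DriveState fields and the triage function are invented for illustration, and the text does not say what to do with a stable reallocated count above 5 or a rising count of 1–5, so the code assumes "replace soon" for those cases.

```python
from dataclasses import dataclass


@dataclass
class DriveState:
    uncorrectable: int = 0             # C6 / BB raw value
    reallocated: int = 0               # 05 raw value
    reallocated_rising: bool = False   # has 05 increased recently?
    selftest_failed: bool = False
    pending_after_long_test: int = 0   # C5 raw value after an extended test
    power_on_hours: int = 0
    has_redundancy: bool = True        # RAID or similar
    sustained_temp_c: float = 0.0      # typical temperature under load
    crc_error_seen: bool = False       # one-time CRC error, gone after cable swap


def triage(d):
    """Apply the three tiers in order of urgency; the first match wins."""
    if d.uncorrectable > 0 or d.selftest_failed:
        return "replace now"
    if d.reallocated > 5 and d.reallocated_rising:
        return "replace now"
    # Assumption: any other non-zero reallocated count is treated as "soon".
    if d.reallocated > 0 or d.pending_after_long_test > 0:
        return "replace soon"
    if d.power_on_hours > 50_000 and not d.has_redundancy:
        return "replace soon"
    if d.power_on_hours > 30_000 and not d.has_redundancy:
        return "monitor"
    if d.sustained_temp_c > 45 or d.crc_error_seen:
        return "monitor"
    return "ok"


if __name__ == "__main__":
    print(triage(DriveState(reallocated=3)))
    print(triage(DriveState(power_on_hours=35_000, has_redundancy=False)))
```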