Three ways to differentiate ReLU
SMRTR summary
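ReLU, defined as max(0, x), is not differentiable at x = 0, but three approaches handle this: the pointwise derivative, which equals the Heaviside step function everywhere except 0, where it is undefined; the distributional derivative, which is the Heaviside function no matter what value is assigned at 0; and the subgradient, which allows any slope from 0 to 1 inclusive at that point.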
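
As a rough illustration of the pointwise and subgradient views, here is a minimal Python sketch. The names `heaviside`, `at_zero`, and `is_subgradient_at_zero` are hypothetical and chosen only for this example; they do not come from the original article. The check uses the standard subgradient inequality relu(y) >= relu(0) + g * (y - 0), tested on a finite grid of sample points, so it is a numerical sanity check rather than a proof.

```python
def relu(x: float) -> float:
    """ReLU: max(0, x)."""
    return max(0.0, x)


def heaviside(x: float, at_zero: float = 0.0) -> float:
    """Pointwise derivative of ReLU for x != 0.

    At x == 0 the derivative does not exist; `at_zero` is only a
    convention (an assumption of this sketch), not a derivative.
    """
    if x > 0:
        return 1.0
    if x < 0:
        return 0.0
    return at_zero


def is_subgradient_at_zero(g: float) -> bool:
    """Check relu(y) >= relu(0) + g * (y - 0) on sample points in [-5, 5]."""
    samples = [k / 10.0 for k in range(-50, 51)]
    return all(relu(y) >= g * y for y in samples)


if __name__ == "__main__":
    for g in (-0.1, 0.0, 0.5, 1.0, 1.5):
        print(g, is_subgradient_at_zero(g))
```

Slopes inside the interval from 0 to 1 pass the check and slopes outside it fail, which matches the subgradient description above. The distributional view is not coded here because it concerns how the derivative acts under an integral, where the value at the single point 0 has no effect.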
SMRTR provides this summary for quick context. The original article belongs to John D. Cook.