A rectified linear unit (ReLU) is an activation function that outputs its input directly if the input is positive and outputs zero otherwise. ReLU introduces non-linearity to a neural network while remaining piecewise linear, so it retains many of the desirable properties of linear functions. Its advantages are sparse activation and efficient computation; however, it is non-differentiable at zero and has a zero gradient for negative inputs, which can cause problems during gradient-based training.
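The behavior described above can be sketched in a few lines of NumPy (the function name `relu` here is illustrative, not from any particular library):

```python
import numpy as np

def relu(x):
    # Output the input directly if positive, otherwise output zero.
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))  # negative inputs map to 0; positive inputs pass through unchanged
```

Because negative inputs are clamped to exactly zero, many units in a layer output zero for a given input, which is the sparse activation mentioned above.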
ReLU is based on the rectifier function f(x) = max(0, x), also known as a ramp function. Both the function and its derivative are monotonic, and the range of the function is [0, ∞). Other versions of the function exist, such as leaky ReLU, which is typically defined as f(x) = x when x is positive, else f(x) = 0.01x.
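A minimal sketch of leaky ReLU, assuming the conventional slope of 0.01 for negative inputs (the slope is a tunable hyperparameter, exposed here as `alpha`):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Return x when positive, else alpha * x; alpha = 0.01 is a common default.
    return np.where(x > 0, x, alpha * x)

x = np.array([-3.0, -1.0, 0.0, 2.0])
print(leaky_relu(x))
```

Unlike plain ReLU, negative inputs here keep a small non-zero slope, so their gradient never vanishes entirely.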
ReLU is among the most commonly used activation functions in deep learning due to its simplicity and efficiency.