powerpc/32: Optimise __csum_partial()
Improve __csum_partial by interleaving loads and adds.
On a 8xx, it brings neither improvement nor degradation.
On a 83xx, it brings a 25% improvement.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>