gpu_neon: split output code, some refactoring