Basically, the LPM instruction (nominally 3 cycles) does not always take
3 cycles (I think because of the time needed to read the Flash memory).
This disrupts the timing of the signal, causing offsets that depend on
what is shown on the line.
The LPM instruction has therefore been replaced with an RJMP, which
always takes exactly 3 cycles. The timing is no longer disrupted, and
the image appears OK.
This does not work as expected, and I don't understand why.
The clock cycle count seems OK, but when the pixels do not change,
it seems to run faster.
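
To make the LPM drift described above more concrete, here is a rough
model of how a fetch that occasionally costs one extra cycle shifts the
rest of the line. It is only a sketch: the function name, the 8-pixel
line, and the pattern of slow fetches are all assumptions chosen for
illustration, not measurements from the real hardware.

```python
# Hypothetical model of per-pixel timing drift on one video line.
# The cycle patterns below are assumed, not measured.

NOMINAL_FETCH_CYCLES = 3  # nominal duration of the pixel fetch, per the notes


def line_offsets(fetch_cycles):
    """Return the accumulated timing error (in cycles) at the start of each pixel."""
    offsets = []
    drift = 0
    for cycles in fetch_cycles:
        offsets.append(drift)                   # error seen by this pixel
        drift += cycles - NOMINAL_FETCH_CYCLES  # extra cycles push later pixels
    return offsets


# Fixed-length fetch: every pixel starts on time.
steady = line_offsets([3] * 8)

# Fetch that sometimes costs one extra cycle (assumed pattern).
jittery = line_offsets([3, 4, 3, 3, 4, 3, 3, 3])

print(steady)
print(jittery)
```

In this model the error never recovers within a line: each slow fetch
shifts every later pixel, so the final offset depends on how many slow
fetches happened, which would match offsets that vary with what is
shown on the line.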