Generative Models for Image and Long Video Synthesis