Details

Description
Phenaki is a text-to-video model that generates videos from textual prompts. Generating video from text is hard because of the computational cost, the limited amount of high-quality text-video data, and the variable length of videos. Phenaki addresses this with a video tokenizer that compresses each video into a small representation of discrete tokens; the tokenizer uses causal attention in time, which lets it handle videos of variable length. A bidirectional masked transformer, conditioned on the text prompt, generates the video tokens, which are then de-tokenized to create the actual video. Phenaki can generate arbitrarily long videos conditioned on a sequence of prompts, such as a story or text that changes over time, so the content of the video can change as it plays. Joint training on a large corpus of image-text pairs and a smaller number of video-text examples lets it generalize beyond what is available in video datasets. Compared with per-frame baselines, its video encoder-decoder produces fewer tokens per video while achieving better spatio-temporal consistency. Potential applications include video content creation, storytelling, and visual effects in film and animation.
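
To make the masked-transformer step more concrete, here is a toy sketch of iterative masked token decoding: all positions start masked, and at each step the most confident guesses are kept while the rest stay masked. This is an illustration only, not Phenaki's implementation. The names toy_predict and masked_decode, the cosine schedule, and all sizes are assumptions chosen for the example, and the "model" is a random stand-in that ignores the prompt. De-tokenizing the result into frames is not shown.

```python
import math
import random

MASK = -1  # placeholder id for a position that has not been decided yet


def toy_predict(tokens, prompt, vocab_size, rng):
    """Hypothetical stand-in for the transformer.

    Returns a (token_id, confidence) guess for every position. A real model
    would condition on the text prompt and on the already-decided tokens;
    this one just draws random values.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def masked_decode(prompt, num_tokens=16, vocab_size=8, steps=4, seed=0):
    """Fill a fully masked token sequence over a fixed number of steps."""
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens
    for step in range(1, steps + 1):
        guesses = toy_predict(tokens, prompt, vocab_size, rng)
        # Assumed cosine schedule: how many tokens remain masked after this step.
        keep_masked = int(num_tokens * math.cos(math.pi / 2 * step / steps))
        masked_positions = [i for i, t in enumerate(tokens) if t == MASK]
        # Decide the most confident masked positions first.
        masked_positions.sort(key=lambda i: guesses[i][1], reverse=True)
        n_fill = max(0, len(masked_positions) - keep_masked)
        for i in masked_positions[:n_fill]:
            tokens[i] = guesses[i][0]
    return tokens


if __name__ == "__main__":
    print(masked_decode("a teddy bear swimming in the ocean"))
```

Because many positions are decided in parallel at each step, the number of model calls equals the number of steps rather than the number of tokens, which is the main appeal of this style of decoding over generating one token at a time.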