Joined 5/4/2023, 2:41:26 AM has 13 karma
Video-LLaMA: Instruction-Tuned Audio-Visual Language Model for Video Understanding