4:02 STARS (WACV'26): Self-supervised Tuning for 3D Action Recognition in Skeleton Sequences. Soroush Mehraban 129 views - 2 weeks ago
5:01 FastHMR (WACV'26): Accelerating Human Mesh Recovery via Token & Layer Merging with Diffusion Decoding Soroush Mehraban 124 views - 2 weeks ago
14:38 LightlyTrain - Train Better Models, Faster - No Labels Needed Soroush Mehraban 710 views - 10 months ago
19:30 One-step Diffusion with Distribution Matching Distillation Soroush Mehraban 1.9K views - 1 year ago
14:22 Variational Score Distillation (VSD) Helps Create Amazing 3D Scenes From Text Prompts Soroush Mehraban 599 views - 1 year ago
11:39 Null-text Inversion for Editing Real Images using Guided Diffusion Models Soroush Mehraban 1.1K views - 1 year ago
10:04 Prompt-to-Prompt (P2P) Image Editing - Method Explained Soroush Mehraban 787 views - 1 year ago
30:57 Denoising Diffusion Null-Space Model (DDNM) - Method Explained Soroush Mehraban 834 views - 1 year ago
21:44 Autoregressive Image Generation without Vector Quantization Soroush Mehraban 2.2K views - 1 year ago
10:46 GLIGEN (CVPR2023): Open-Set Grounded Text-to-Image Generation Soroush Mehraban 865 views - 1 year ago
9:09 The Entropy Enigma: Success and Failure of Entropy Minimization Soroush Mehraban 790 views - 1 year ago
13:16 Tent: Fully Test-time Adaptation by Entropy Minimization Soroush Mehraban 866 views - 1 year ago
9:44 VPD (ICCV2023): Unleashing Text-to-Image Diffusion Models for Visual Perception Soroush Mehraban 389 views - 1 year ago
30:13 TokenHMR (CVPR2024): Advancing Human Mesh Recovery with a Tokenized Pose Representation Soroush Mehraban 764 views - 1 year ago
22:26 SHViT (CVPR2024): Single-Head Vision Transformer with Memory Efficient Macro Design Soroush Mehraban 1.5K views - 1 year ago
22:17 InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation Soroush Mehraban 1.3K views - 1 year ago
28:39 GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Soroush Mehraban 1.9K views - 2 years ago
9:13 MotionAGFormer (WACV2024): Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network Soroush Mehraban 1.5K views - 2 years ago
8:25 ST-GCN: Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition Soroush Mehraban 8.4K views - 2 years ago
13:08 Graph Convolutional Networks (GCN): From CNN point of view Soroush Mehraban 15.7K views - 2 years ago
31:03 MoCo (+ v2): Unsupervised learning in computer vision Soroush Mehraban 5.4K views - 2 years ago
21:00 ConvNet beats Vision Transformers (ConvNeXt) Paper explained Soroush Mehraban 3.2K views - 2 years ago
7:05 Convolutional Block Attention Module (CBAM) Paper Explained Soroush Mehraban 15.1K views - 3 years ago
9:11 Squeeze-and-Excitation Networks (SENet) paper explained Soroush Mehraban 11.6K views - 3 years ago
38:37 Fast R-CNN: Everything you need to know from the paper Soroush Mehraban 22K views - 3 years ago