Dídac Surís - Meta Superintelligence Labs

Publications

My publications are also listed in my Google Scholar profile.

Nicolas Carion, Laura Gustafson, Yuan-Ting Hu, Shoubhik Debnath, Ronghang Hu, Dídac Surís, Chaitanya Ryali, Kalyan Vasudev Alwala, Haitham Khedr, Andrew Huang, Jie Lei, Tengyu Ma, Baishan Guo, Arpit Kalla, Markus Marks, Joseph Greer, Meng Wang, Peize Sun, Roman Rädle, Triantafyllos Afouras, Effrosyni Mavroudi, Katherine Xu, Tsung-Han Wu, Yu Zhou, Liliane Momeni, Rishi Hazra, Shuangrui Ding, Sagar Vaze, Francois Porcher, Feng Li, Siyuan Li, Aishwarya Kamath, Ho Kei Cheng, Piotr Dollár, Nikhila Ravi, Kate Saenko, Pengchuan Zhang, Christoph Feichtenhofer
SAM 3: Segment Anything with Concepts International Conference on Learning Representations (ICLR), 2026.
[BibTeX] [PDF] [Website] [Demo] [Blog] [Code]

@inproceedings{carion2025sam3,
    title={SAM 3: Segment Anything with Concepts},
    author={Nicolas Carion and Laura Gustafson and Yuan-Ting Hu and Shoubhik Debnath and Ronghang Hu and D\'idac Sur\'is and Chaitanya Ryali and Kalyan Vasudev Alwala and Haitham Khedr and Andrew Huang and Jie Lei and Tengyu Ma and Baishan Guo and Arpit Kalla and Markus Marks and Joseph Greer and Meng Wang and Peize Sun and Roman R{\"a}dle and Triantafyllos Afouras and Effrosyni Mavroudi and Katherine Xu and Tsung-Han Wu and Yu Zhou and Liliane Momeni and Rishi Hazra and Shuangrui Ding and Sagar Vaze and Francois Porcher and Feng Li and Siyuan Li and Aishwarya Kamath and Ho Kei Cheng and Piotr Doll{\'a}r and Nikhila Ravi and Kate Saenko and Pengchuan Zhang and Christoph Feichtenhofer},
    booktitle={International Conference on Learning Representations (ICLR)},
    year={2026}
}

Dante Francisco Wasmuht, Otto Brookes, Maximillian Schall, Pablo Palencia, Chris Beirne, Tilo Burghardt, Majid Mirmehdi, Hjalmar Kühl, Mimi Arandjelovic, Sam Pottie, Peter Bermant, Brandon Asheim, Yi Jin Toh, Adam Elzinga, Jason Holmberg, Andrew Whitworth, Eleanor Flatt, Laura Gustafson, Chaitanya Ryali, Yuan-Ting Hu, Baishan Guo, Andrew Westbury, Kate Saenko, Dídac Surís
The SA-FARI Dataset: Segment Anything in Footage of Animals for Recognition and Identification Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
[BibTeX] [PDF] [Website]

@inproceedings{wasmuht2025safari,
    title={The SA-FARI Dataset: Segment Anything in Footage of Animals for Recognition and Identification},
    author={Dante Francisco Wasmuht and Otto Brookes and Maximillian Schall and Pablo Palencia and Chris Beirne and Tilo Burghardt and Majid Mirmehdi and Hjalmar K{\"u}hl and Mimi Arandjelovic and Sam Pottie and Peter Bermant and Brandon Asheim and Yi Jin Toh and Adam Elzinga and Jason Holmberg and Andrew Whitworth and Eleanor Flatt and Laura Gustafson and Chaitanya Ryali and Yuan-Ting Hu and Baishan Guo and Andrew Westbury and Kate Saenko and D\'idac Sur\'is},
    booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2026}
}

Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick
pix2gestalt: Amodal Segmentation by Synthesizing Wholes Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[BibTeX] [PDF] [Website]

@article{ozguroglu2024pix2gestalt,
    title={pix2gestalt: Amodal Segmentation by Synthesizing Wholes},
    author={Ege Ozguroglu and Ruoshi Liu and D\'idac Sur\'is and Dian Chen and Achal Dave and Pavel Tokmakov and Carl Vondrick},
    journal={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2024}
}

Dídac Surís*, Sachit Menon*, Carl Vondrick
ViperGPT: Visual Inference via Python Execution for Reasoning International Conference on Computer Vision (ICCV) - ORAL PRESENTATION, 2023.
[BibTeX] [PDF] [Website]

@article{surismenon2023vipergpt,
    title={ViperGPT: Visual Inference via Python Execution for Reasoning},
    author={D\'idac Sur\'is and Sachit Menon and Carl Vondrick},
    journal={Proceedings of IEEE International Conference on Computer Vision (ICCV)},
    year={2023}
}

Purva Tendulkar, Dídac Surís, Carl Vondrick
FLEX: Full-Body Grasping Without Full-Body Grasps Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
[BibTeX] [PDF] [Website]

@inproceedings{tendulkar2022flex,
    title={FLEX: Full-Body Grasping Without Full-Body Grasps},
    author={Tendulkar, Purva and Sur\'is, D\'idac and Vondrick, Carl},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2023}
}

Dídac Surís, Carl Vondrick
Representing Spatial Trajectories as Distributions Conference on Neural Information Processing Systems (NeurIPS), 2022.
[BibTeX] [PDF] [Website] [5min Video Presentation]

@article{suris2022trajectories,
    title={Representing Spatial Trajectories as Distributions},
    author={Sur\'is, D\'idac and Vondrick, Carl},
    journal={Advances in Neural Information Processing Systems 35 (NeurIPS)},
    year={2022}
}

Dídac Surís, Carl Vondrick, Bryan Russell and Justin Salamon
It's Time for Artistic Correspondence in Music and Video Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[BibTeX] [PDF] [Website] [5min Video Presentation]

@article{suris2022musicforvideo,
    title={It's Time for Artistic Correspondence in Music and Video},
    author={Sur\'is, D\'idac and Vondrick, Carl and Russell, Bryan and Salamon, Justin},
    journal={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2022}
}

Dídac Surís, Dave Epstein and Carl Vondrick
Globetrotter: Connecting Languages by Connecting Images Conference on Computer Vision and Pattern Recognition (CVPR) - ORAL PRESENTATION, 2022.
[BibTeX] [PDF] [Code and model] [Website] [5min Video Presentation]

@article{suris2022globetrotter,
    title={Globetrotter: Connecting Languages by Connecting Images},
    author={Sur\'is, D\'idac and Epstein, Dave and Vondrick, Carl},
    journal={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2022}
}

Basile Van Hoorick, Purva Tendulkar, Dídac Surís, Dennis Park, Simon Stent and Carl Vondrick
Revealing Occlusions with 4D Neural Fields Conference on Computer Vision and Pattern Recognition (CVPR) - ORAL PRESENTATION, 2022.
[BibTeX] [PDF] [Code and models] [Website] [5min Video Presentation]

@article{vanhoorick2022revealing,
    title={Revealing Occlusions with 4D Neural Fields},
    author={Van Hoorick, Basile and Tendulkar, Purva and Sur\'is, D\'idac and Park, Dennis and Stent, Simon and Vondrick, Carl},
    journal={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2022}
}

Dídac Surís*, Ruoshi Liu* and Carl Vondrick
Learning the Predictability of the Future Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[BibTeX] [PDF] [Code and model] [Website] [Press release] [Video Presentations (1h) (15min) (5min)]

@InProceedings{suris2021hyperfuture,
    title={Learning the Predictability of the Future},
    author={Sur\'is, D\'idac and Liu, Ruoshi and Vondrick, Carl},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021}
}

Dídac Surís*, Dave Epstein, Heng Ji, Shih-Fu Chang and Carl Vondrick
Learning to Learn Words from Visual Scenes European Conference on Computer Vision (ECCV), 2020.
[BibTeX] [PDF] [Code and model] [Video Presentation] [Website]

@Article{Suris2020learning,
    author = {D. Sur\'is and D. Epstein and H. Ji and S. Chang and C. Vondrick},
    title = {Learning to Learn Words from Visual Scenes},
    journal = {European Conference on Computer Vision (ECCV)},
    year = {2020}
}

Dídac Surís*, Adrià Recasens*, David Bau, David Harwath, James Glass and Antonio Torralba
Learning Words by Drawing Images Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[BibTeX] [PDF] [Code] [Website]

@Article{Suris2019,
    author = {D. Sur\'is and A. Recasens and D. Bau and D. Harwath and J. Glass and A. Torralba},
    title = {Learning Words by Drawing Images},
    journal = {Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2019}
}

David Harwath, Adrià Recasens, Dídac Surís, Galen Chuang, Antonio Torralba and James Glass
Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input European Conference on Computer Vision (ECCV), 2018 (ORAL PRESENTATION).
[BibTeX] [PDF] [Code and data] [Video Presentation] [MIT News]

@Article{Harwath2018,
    author = {D. Harwath and A. Recasens and D. Sur\'is and G. Chuang and A. Torralba and J. Glass},
    title = {Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input},
    journal = {European Conference on Computer Vision (ECCV)},
    year = {2018}
}

Joan Serrà, Dídac Surís, Marius Miron and Alexandros Karatzoglou
Overcoming catastrophic forgetting with hard attention to the task International Conference on Machine Learning (ICML), 2018 (LONG TALK).
[BibTeX] [PDF] [Code] [Video Presentation(21:20)] [Tech World News]

@Article{Serra2018,
    author = {J. Serr\`a and D. Sur\'is and M. Miron and A. Karatzoglou},
    title = {Overcoming catastrophic forgetting with hard attention to the task},
    journal = {International Conference on Machine Learning (ICML)},
    year = {2018}
}

Dídac Surís, Amanda Duarte, Amaia Salvador, Jordi Torres and Xavier Giró-i-Nieto
Cross-modal Embeddings for Video and Audio Retrieval European Conference on Computer Vision Workshops (ECCV Workshops), 2018.
[BibTeX] [PDF]

@Article{Suris2018,
    author = {D. Sur\'is and A. Duarte and A. Salvador and J. Torres and X. Gir\'o-i-Nieto},
    title = {Cross-modal Embeddings for Video and Audio Retrieval},
    journal = {European Conference on Computer Vision Workshops (ECCV Workshops)},
    year = {2018}
}

Dídac Surís, Adrian Agustin and Josep Vidal
Delay minimization in dynamic and scalable multi-operator wireless backhauling IEEE International Conference on Communications Workshops (ICC Workshops), 2017.
[BibTeX] [PDF]

@Article{Suris2017,
    author = {D. Sur\'is and A. Agustin and J. Vidal},
    title = {Delay minimization in dynamic and scalable multi-operator wireless backhauling},
    journal = {IEEE International Conference on Communications Workshops (ICC Workshops)},
    year = {2017}
}
