18th ECCV 2024: Milan, Italy - Part VIII
- Ales Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol:
Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024, Proceedings, Part VIII. Lecture Notes in Computer Science 15066, Springer 2025, ISBN 978-3-031-73241-6
- Mattia Segù, Luigi Piccinelli, Siyuan Li, Luc Van Gool, Fisher Yu, Bernt Schiele:
Walker: Self-supervised Multiple Object Tracking by Walking on Temporal Appearance Graphs. 1-18
- Sumin Lee, Yooseung Wang, Sangmin Woo, Changick Kim:
Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition. 19-36
- Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat:
DiffiT: Diffusion Vision Transformers for Image Generation. 37-55
- Zirui Shao, Feiyu Gao, Hangdi Xing, Zepeng Zhu, Zhi Yu, Jiajun Bu, Qi Zheng, Cong Yao:
WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation. 56-74
- Changshuo Wang, Meiqing Wu, Siew-Kei Lam, Xin Ning, Shangshu Yu, Ruiping Wang, Weijun Li, Thambipillai Srikanthan:
GPSFormer: A Global Perception and Local Structure Fitting-Based Transformer for Point Cloud Understanding. 75-92
- Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma:
FreeMotion: A Unified Framework for Number-Free Text-to-Motion Synthesis. 93-109
- Zheng Jiang, Jinqing Zhang, Yanan Zhang, Qingjie Liu, Zhenghui Hu, Baohui Wang, Yunhong Wang:
FSD-BEV: Foreground Self-distillation for Multi-view 3D Object Detection. 110-126
- Yang Miao, Francis Engelmann, Olga Vysotska, Federico Tombari, Marc Pollefeys, Dániel Béla Baráth:
SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs. 127-150
- Chenming Zhu, Tai Wang, Wenwei Zhang, Kai Chen, Xihui Liu:
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities. 151-168
- Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, Pengshuo Qiu, Aojun Zhou, Pan Lu, Kai-Wei Chang, Yu Qiao, Peng Gao, Hongsheng Li:
MATHVERSE: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? 169-186
- Zhonghan Zhao, Wenhao Chai, Xuan Wang, Li Boyi, Shengyu Hao, Shidong Cao, Tian Ye, Gaoang Wang:
See and Think: Embodied Agent in Virtual Environment. 187-204
- Guangcheng Chen, Yicheng He, Li He, Hong Zhang:
PISR: Polarimetric Neural Implicit Surface Reconstruction for Textureless and Specular Objects. 205-222
- Xinpeng Liu, Yong-Lu Li, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu:
Bridging the Gap Between Human Motion and Action Semantics via Kinematic Phrases. 223-240
- Ofir Abramovich, Niv Nayman, Sharon Fogel, Inbal Lavi, Ron Litman, Shahar Tsiper, Royee Tichauer, Srikar Appalaraju, Shai Mazor, R. Manmatha:
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding. 241-259
- Zhihao Li, Biao Hou, Siteng Ma, Zitong Wu, Xianpeng Guo, Bo Ren, Licheng Jiao:
Masked Angle-Aware Autoencoder for Remote Sensing Images. 260-278
- Yi Wu, Ziqiang Li, Heliang Zheng, Chaoyue Wang, Bin Li:
Infinite-ID: Identity-Preserved Personalization via ID-Semantics Decoupling Paradigm. 279-296
- Zhi-Fan Wu, Lianghua Huang, Wei Wang, Yanheng Wei, Yu Liu:
MultiGen: Zero-Shot Image Generation from Multi-modal Prompts. 297-313
- Xianyu Chen, Ming Jiang, Qi Zhao:
GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths. 314-333
- Yifeng Zhang, Ming Jiang, Qi Zhao:
Learning Chain of Counterfactual Thought for Bias-Robust Vision-Language Reasoning. 334-351
- Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian L. Price, Dan Xu:
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis. 352-370
- Ishan Rajendrakumar Dave, Fabian Caba Heilbron, Mubarak Shah, Simon Jenni:
Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets. 371-388
- Ishan Rajendrakumar Dave, Mamshad Nayeem Rizve, Mubarak Shah:
FinePseudo: Improving Pseudo-labelling Through Temporal-Alignablity for Semi-supervised Fine-Grained Action Recognition. 389-408
- Yu Liu, Fatimah Binti Khalid, Lei Wang, Youxi Zhang, Cunrui Wang:
Elegantly Written: Disentangling Writer and Character Styles for Enhancing Online Chinese Handwriting. 409-425
- Sipeng Zheng, Bohan Zhou, Yicheng Feng, Ye Wang, Zongqing Lu:
UniCode: Learning a Unified Codebook for Multimodal Large Language Models. 426-443
- Baifeng Shi, Ziyang Wu, Maolin Mao, Xin Wang, Trevor Darrell:
When Do We Not Need Larger Vision Models? 444-462
- Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan, Wanli Ouyang, Tong He:
GVGEN: Text-to-3D Generation with Volumetric Representation. 463-479
- Zhening Liu, Xinjie Zhang, Jiawei Shao, Zehong Lin, Jun Zhang:
Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model. 480-496