Room 327

Last Update: July 21, 2017


Detailed Schedule

8:30 am Welcome
Workshop Organizer
8:40 am Invited Talk: What eye-gaze tells us about the image
Dimitris Samaras
Stony Brook University
9:05 am Invited Talk: Research at Vicarious AI: toward data efficiency, task generality and conceptual understanding
Huayan Wang
Vicarious AI
9:45 am Invited Talk: Crack the autonomous driving puzzle from vision and cognition perspective
Yibiao Zhao
10:10 am Morning Break
10:20 am Invited Talk: Misconceptions in Artificial Intelligence and the Tasks Forward
Seng-Beng Ho
Institute of High Performance Computing, A*STAR, Singapore
10:45 am Invited Talk: Learning to Reason with Compositional Models of Language and Vision
Kate Saenko
Boston University
11:10 am Keynote Talk: Dark, Beyond Deep
Song-Chun Zhu
11:50 am Poster Presentation
12:30 pm Lunch Break
1:15 pm Invited Talk: The Data-Fusion Problem -- Causal Inference and Reinforcement Learning
Elias Bareinboim
Purdue University
1:55 pm Invited Talk: Visual Cognition for Interaction
Joseph Lim
University of Southern California
2:20 pm Invited Talk: Inferring Human Interaction from Motion Trajectories
Tianmin Shu
2:45 pm Invited Talk: "Theory of Mind" from Videos
Tao Gao
GE Research
3:10 pm Invited Talk: Seeing Stability: Physical Understanding is Rooted in Automatic Visual Processing
Chaz Firestone
Johns Hopkins University
3:35 pm Afternoon Break
3:45 pm Keynote Talk
Ali Farhadi
University of Washington
4:25 pm Invited Talk: Synthesizing 3D Shapes via Modeling Multi-View Depth Maps and Silhouettes with Deep Generative Networks
Amir A. Soltani
4:50 pm Invited Talk: Deep Predictive Model for Autonomous Driving
Wongun Choi
5:15 pm Keynote Talk
Abhinav Gupta
5:55 pm Closing Remarks
Workshop Organizer
6:00 pm Poster Presentation (Continued)


Accepted Papers

Full Papers

Title Authors
What Will I Do Next? The Intention from Motion Experiment Andrea Zunino, Jacopo Cavazza, Atesh Koul, Andrea Cavallo, Cristina Becchio, and Vittorio Murino
Automatic Layout Synthesis and Visualization From Images of Interior or Exterior Spaces Masaki Nakada, Tomer Weiss, and Demetri Terzopoulos
The Role of Synchronic Causal Conditions in Visual Knowledge Learning Seng-Beng Ho
AcFR: Active Face Recognition Using Convolutional Neural Networks Masaki Nakada, Han Wang, and Demetri Terzopoulos
Joint 3D Human Motion Capture and Physical Analysis from Monocular Videos Petrissa Zell, Bastian Wandt, and Bodo Rosenhahn
Attention-based Natural Language Person Retrieval Tao Zhou, Muhao Chen, Jie Yu, and Demetri Terzopoulos
Inferring Hidden Statuses and Actions in Video by Causal Reasoning Amy Fire and Song-Chun Zhu

Extended Abstracts and Invited Posters

Title Authors
Vision and Reasoning based Image Riddles Answering through Probabilistic Soft Logic Somak Aditya, Yezhou Yang, Chitta Baral, and Yiannis Aloimonos
A Framework for Emotion Recognition with Scene Understanding Yingxuan Zhu, Jian Li, Xiaotian Yin, Lifeng Liu, and Jun Zhang
Synthesizing 3D Shapes via Modeling Multi-View Depth Maps and Silhouettes with Deep Generative Networks Amir Arsalan Soltani, Haibin Huang, Jiajun Wu, Tejas Kulkarni, and Joshua Tenenbaum
Improving Image Memorability Prediction via Coarse Scene Parsing Sejong Yoon and Jongpil Kim
Predictive-Corrective Networks for Action Detection Achal D Dave, Olga Russakovsky, and Deva Ramanan
Generating Holistic 3D Scene Abstractions for Text-Based Image Retrieval Ang Li, Jin Sun, Joe Yue-Hei Ng, Ruichi Yu, Vlad I. Morariu, and Larry S. Davis
Semantic Scene Completion From a Single Depth Image Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, and Thomas Funkhouser
Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Qi Zhao, and Jiashi Feng
Learning Motion Patterns in Videos Pavel Tokmakov, Karteek Alahari, and Cordelia Schmid
Automatic Understanding of Image and Video Advertisements Zaeem Hussain, Mingda Zhang, Xiaozhong Zhang, Keren Ye, Christopher Thomas, Zuha Agha, Nathan Ong, and Adriana Kovashka
Procedural Generation of Videos to Train Deep Action Recognition Networks César Roberto de Souza, Adrien Gaidon, Yohann Cabon, and Antonio Manuel López Peña
Fine-Grained Recognition of Thousands of Object Categories With Single-Example Training Leonid Karlinsky, Joseph Shtok, Yochay Tzur, and Asaf Tzadok
Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Chi Li, M. Zeeshan Zia, Quoc-Huy Tran, Xiang Yu, Gregory D. Hager, and Manmohan Chandraker
Seeing What Is Not There: Learning Context to Determine Where Objects Are Missing Jin Sun and David Jacobs
Cognitive Mapping and Planning for Visual Navigation Saurabh Gupta, James Davidson, Sergey Levine, Rahul Sukthankar, and Jitendra Malik
Zero-Shot Learning - the Good, the Bad and the Ugly Yongqin Xian, Bernt Schiele, and Zeynep Akata
Indoor Scene Parsing With Instance Segmentation, Semantic Labeling and Support Relationship Inference Wei Zhuo, Mathieu Salzmann, Xuming He, and Miaomiao Liu
Connecting Look and Feel: Associating the Visual and Tactile Properties of Physical Materials Wenzhen Yuan, Shaoxiong Wang, Siyuan Dong, and Edward Adelson
Expecting the Unexpected: Training Detectors for Unusual Pedestrians With Adversarial Imposters Shiyu Huang and Deva Ramanan
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Niessner