Keynote Speakers

Andrea Thomaz

Robots in the Real World: Putting Robot Learning to Work

  • This talk focuses on the unique challenges of deploying a mobile manipulation robot into an environment where it works closely with people on a daily basis. Diligent Robotics' first product, Moxi, is a mobile manipulation service robot at work in hospitals today, assisting nurses and other frontline staff with materials-management tasks. The talk will dive into the computational complexity of developing a mobile manipulator with social intelligence. Dr. Thomaz will focus on how human-robot interaction theories and robot learning algorithms translate into the real world, and on their impact on the functionality of robots that perform delivery tasks in busy human environments. The talk will include many examples and data from the field, with commentary and discussion of both the expected and unexpected hard problems in building robots that operate 24/7 as reliable teammates.

  • Dr. Andrea Thomaz is the CEO and Co-Founder of Diligent Robotics and a renowned social robotics expert. Her accolades include being recognized by the National Academy of Sciences as a Kavli Fellow, by the US President’s Council of Advisors on Science and Technology, by MIT Technology Review on its Next Generation of 35 Innovators Under 35 list, by Popular Science on its Brilliant 10 list, by TEDx as a featured keynote speaker on social robotics, and by Texas Monthly on its Most Powerful Texans of 2018 list. Andrea’s robots have been featured in the New York Times and on the covers of MIT Technology Review and Popular Science. Her passion for social robotics began during her work at the MIT Media Lab, where she focused on using AI to develop machines that address everyday human needs.

    Andrea co-founded Diligent Robotics to pursue her vision of creating socially intelligent robot assistants that collaborate with humans by doing their chores, so humans can have more time for the work they care most about. She earned her Ph.D. from MIT and her B.S. in Electrical and Computer Engineering from UT Austin, and is a Robotics Professor at UT Austin and the PI of the Socially Intelligent Machines Lab.

Zhengyou Zhang

Lifelike Robotic Learning through Hierarchical Neural Control

  • Drawing inspiration from the knowledge of animals and humans, this study presents a hierarchical learning framework that enables legged robots to exhibit lifelike agility and strategy in complex environments. We leverage advanced deep generative models, akin to the large pre-trained models used in language and image understanding, to generate motor control signals that drive legged robots to mimic real animal behavior. Our approach moves beyond conventional task-specific controllers and end-to-end RL methods by pre-training generative models on animal motion datasets, thus retaining extensive knowledge of animal behavior. Although the pre-trained model possesses ample primitive-level knowledge, it remains environment-agnostic. Subsequently, the model undergoes further learning to adapt to various environments by overcoming numerous challenging obstacles; these adaptive capabilities are stored as reusable parameters at the environmental level. A task-specific controller is then trained to tackle complex downstream tasks by utilizing knowledge from the previous stages while preserving strategic-level knowledge as reusable parameters. This flexible framework allows knowledge to accumulate continuously at each level without affecting the use of knowledge at the other levels. We successfully apply the multi-level controllers to the MAX robot, an in-house developed quadrupedal robot, enabling it to mimic animals, navigate complex obstacles, and participate in a challenging multi-agent Chase Tag Game, in which lifelike agility and strategy emerge. This research advances the field of robot control by offering new insights into the reuse of multi-level pre-trained knowledge and the effective tackling of complex real-world tasks.
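
    For illustration only, here is a minimal Python sketch of the three-level structure the abstract describes: primitive-level, environment-level, and strategic-level knowledge held as separate, reusable parameter sets that are composed at each control step. This is not the authors' implementation; every name and dimension below is a hypothetical placeholder.

    # Hypothetical sketch of the hierarchy from the abstract; NOT the
    # MAX robot's actual code. Shapes and names are illustrative only.
    import numpy as np

    class PrimitiveModel:
        """Primitive level: pre-trained generative model of animal motion."""
        def __init__(self, rng):
            self.w = rng.standard_normal((8, 4))  # frozen after pre-training

        def decode(self, latent):
            # Map a latent motion command to joint-level motor signals.
            return np.tanh(self.w @ latent)

    class EnvironmentAdapter:
        """Environment level: adapts perception to the current terrain."""
        def __init__(self, rng):
            self.w = rng.standard_normal((4, 6))  # learned per environment

        def adapt(self, observation):
            return self.w @ observation

    class TaskController:
        """Strategic level: policy for a downstream task (e.g., chase tag)."""
        def __init__(self, rng):
            self.w = rng.standard_normal((4, 2))  # learned per task

        def plan(self, goal, env_latent):
            return self.w @ goal + env_latent

    def control_step(primitive, adapter, task, observation, goal):
        # Compose the reusable levels: strategy -> environment -> primitives.
        env_latent = adapter.adapt(observation)
        latent = task.plan(goal, env_latent)
        return primitive.decode(latent)

    rng = np.random.default_rng(0)
    motors = control_step(PrimitiveModel(rng), EnvironmentAdapter(rng),
                          TaskController(rng), np.ones(6), np.ones(2))
    print(motors.shape)  # (8,) joint commands for one control step

    In this sketch, swapping in a new EnvironmentAdapter or TaskController leaves the other levels untouched, which is the sense in which knowledge can accumulate at one level without affecting the others.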

  • Zhengyou Zhang (ACM Fellow and IEEE Fellow) is the Chief Scientist at Tencent, China, and has been the Director of Tencent AI Lab and Tencent Robotics X since March 2018. Before that, he was a Partner Research Manager with Microsoft Research, Redmond, WA, USA, for 20 years. Before joining Microsoft Research in March 1998, he was a Senior Research Scientist with INRIA (French National Institute for Research in Computer Science and Control), France, for over 10 years. In 1996-1997, he spent a one-year sabbatical as an Invited Researcher with the Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan.

    Dr. Zhang is the Founding Editor-in-Chief of the IEEE Transactions on Cognitive and Developmental Systems, is on the Honorary Board of the International Journal of Computer Vision and on the Steering Committee of Machine Vision and Applications, and serves or has served as an Associate Editor for many journals, including IEEE T-PAMI, IEEE T-MM, and IEEE T-CSVT. He was a General Co-Chair of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017. He received the IEEE Helmholtz Test of Time Award at ICCV 2013 for his 1999 paper on camera calibration, now known as Zhang’s method. According to Google Scholar, his h-index is 104, and he has over 70K citations.

    Zhengyou Zhang received the B.S. degree in electronic engineering from Zhejiang University, Hangzhou, China, in 1985, the M.S. degree in computer science from the University of Nancy, Nancy, France, in 1987, and the Ph.D. degree in computer science (1990) and the Doctorate of Science (Habilitation à diriger des recherches, 1994) from the University of Paris XI, Paris, France.

Early Career Keynote Speakers

Wed, Nov 8, 1:45 - 3:30 pm, Session Chairs: Sonia Chernova & Byron Boots

Chelsea Finn

Revising Moravec’s Paradox

  • Moravec’s Paradox tells us that skills that are easy for people are often difficult for AI, and vice versa. For example, multiplying 10-digit numbers is far more difficult for an (adult) person than for a basic calculator; on the other hand, we have not yet achieved toddler-level motor skills in robots. In my talk, I will revisit this paradox specifically in the context of robotics, in an attempt to understand which problems are easy and which are difficult in modern-day robot learning. Many factors that have traditionally been thought to present difficulty for robots seemingly pose no challenge in recent work, and yet other basic robotics problems remain unsolved. I will present a hypothesis that explains these differences, and discuss the problems that remain most challenging under this hypothesis.

  • Chelsea Finn is an Assistant Professor in Computer Science and Electrical Engineering at Stanford University. Her research interests lie in the capability of robots and other agents to develop broadly intelligent behavior through learning and interaction. To this end, her work has pioneered end-to-end deep learning methods for vision-based robotic manipulation, meta-learning algorithms for few-shot learning, and approaches for scaling robot learning to broad datasets. Her research has been recognized by awards such as the Sloan Fellowship, the IEEE RAS Early Academic Career Award, the NSF CAREER Award, and the ACM Doctoral Dissertation Award. Prior to Stanford, she received her Bachelor's degree in EECS at MIT and her PhD in CS at UC Berkeley.

Yuke Zhu

Pathway to Generalist Robots: Scaling Law, Data Flywheel, and Humanlike Embodiment

  • We have witnessed remarkable advances in the development of generalist models in AI and machine learning. These models, such as OpenAI's ChatGPT, can be applied to a wide variety of tasks in open domains. The creation of these generalist AI models relies primarily on the trinity of powerful algorithms, big data, and advanced computing hardware. The compelling capabilities of these models have prompted us robot learning researchers to ask: how close are we to building generalist robots capable of performing everyday tasks? In this talk, I will present our work on the principles and methods for building general-purpose robot autonomy in the wild.

  • Yuke Zhu is an Assistant Professor in the Computer Science Department at UT Austin, where he directs the Robot Perception and Learning (RPL) Lab. He is also a core faculty member at Texas Robotics and a senior research scientist at NVIDIA. He focuses on developing intelligent algorithms for generalist robots and embodied agents that reason about and interact with the real world. His research spans robotics, computer vision, and machine learning. He received his Master's and Ph.D. degrees from Stanford University. His work has won several awards and nominations, including the Best Conference Paper Award at ICRA 2019, the Outstanding Learning Paper Award at ICRA 2022, the Outstanding Paper Award at NeurIPS 2022, and Best Paper finalist nominations at IROS 2019, IROS 2021, and RSS 2023. He has received the NSF CAREER Award and Amazon Research Awards.

Stefan Leutenegger

From Multi-Sensor SLAM to Spatial AI

  • To power the next generation of mobile robots and drones, the field of spatial perception has made much progress, from robust multi-sensor SLAM to dense, semantic, and object-level maps, with the aim of understanding open-ended environments as a basis for navigation and interaction. I will show recent progress in reliable, real-time-capable state estimation as well as 3D scene understanding, where we rely heavily on both models and learning. Our approaches are demonstrated as core building blocks for a range of robot applications, from manipulation to drones flying through the forest. With the aim of deploying robots safely amongst people, we have recently incorporated 3D human motion tracking from very dynamically moving cameras into our Spatial AI stack, posed as a tightly-coupled SLAM system that improves the robustness and accuracy not only of the human tracking but also of the camera motion estimation.

  • Stefan Leutenegger is an Assistant Professor (Tenure Track) at the Technical University of Munich (TUM) in the School of Computation, Information and Technology (CIT), with further affiliations with the Munich Institute of Robotics and Machine Intelligence (MIRMI) and the Munich Data Science Institute (MDSI). He leads the Smart Robotics Lab (SRL), which works at the intersection of perception, mobile robotics, and machine learning. He also still holds a position as Reader in the Department of Computing at Imperial College London, his previous post, where he undertook research within the SRL and the Dyson Robotics Lab. He also co-founded SLAMcore, a spin-out company aiming to commercialise localisation and mapping solutions for robots and drones. Stefan Leutenegger received a BSc and an MSc in Mechanical Engineering with a focus on Robotics and Aerospace Engineering from ETH Zurich, as well as a PhD on “Unmanned solar airplanes: design and algorithms for efficient and robust autonomous operation”, completed in 2014.

Jemin Hwangbo

Learning Legged Robot Controllers for Real-World Applications

  • Learning-based controllers have recently garnered significant attention for their impressive performance on legged robots. This control methodology, initially met with skepticism by much of the legged robotics community, now stands at the forefront of advancements in the field. Nonetheless, these controllers still tend toward short-sightedness, struggling to tackle dynamically complex, long-horizon challenges. Furthermore, many of them cannot integrate established vision and planning algorithms that have proven effective over several decades. In this presentation, I will outline our efforts to formulate a comprehensive control solution tailored for legged robots. At the RAI Lab at KAIST, we have been dedicated to the development of a complete legged robotics system, encompassing hardware design, dynamic modeling, physics simulation, state estimation, control algorithms, and planning methodologies. This holistic approach equips us with the expertise needed to devise robust, pragmatic controllers capable of navigating demanding real-world terrain.

  • Jemin Hwangbo is an assistant professor in the Department of Mechanical Engineering at KAIST and the Director of the Robotics and Artificial Intelligence Lab (RAI Lab). His group's research centers on legged robotics, encompassing design, vision, control, and navigation. He obtained his B.S. degree from the University of Toronto and his M.S. and Ph.D. from ETH Zurich. He has contributed significantly to the field of legged robotics, with four papers published in Science Robotics, one of which was featured among the ten most remarkable papers of 2019 by the journal Nature.

Karol Hausman

Bitter Lessons & Sweet Future in Robot Learning

  • Richard Sutton’s essay “The Bitter Lesson” remains one of the most insightful observations on the last 70 years of AI progress. Its significance lies in identifying external trends that, when aligned with research, can significantly advance it. In this talk, I will attempt to forecast some of these trends and consider how to position robot learning research to harness their potential. Building on this, I will explore how the use of foundation models in robotics fits this approach, and present examples demonstrating interesting results of these combinations in robot learning at scale.

  • Karol Hausman is a Staff Research Scientist at Google DeepMind and an Adjunct Professor at Stanford, where he works on robotics and machine learning. He earned his PhD from the University of Southern California and his M.Sc. from the Technical University of Munich. His primary interest lies in enabling robots to acquire general-purpose skills in real-world environments. Recently, he has been particularly enthusiastic about investigating foundation models for robot decision-making. When not debugging robots at Google, he co-teaches the Deep RL class at Stanford.

Shuran Song

What I Wish I Had for Robot Learning

  • What do we need to take robot learning to the "next level"? Is it better algorithms, improved policy representations, or advances in affordable robot hardware? While all of these factors are undoubtedly important, what I really wish for is something that underpins all of them: the right data. In particular, we need data that is scalable, reusable, and robot-complete. While "scale" often takes center stage in machine learning today, I would argue that in robotics, having data that is also both reusable and complete can be just as important. Focusing on sheer quantity and neglecting these properties makes it difficult for robot learning to benefit from the same scaling trend that other machine learning fields have enjoyed. In this talk, we will explore potential solutions to these data challenges, shed light on some of the often-overlooked hidden costs associated with each approach, and, more importantly, discuss how to potentially bypass these obstacles.

  • Shuran Song has been an Assistant Professor at Stanford University since Fall 2023. Before joining Stanford, she was on the faculty at Columbia University. Shuran received her Ph.D. in Computer Science from Princeton University and her B.Eng. from HKUST. Her research interests lie at the intersection of computer vision and robotics. Song’s research has been recognized with several awards, including Best Paper Awards at RSS’22 and T-RO’20, Best System Paper Awards at CoRL’21 and RSS’19, and Best Paper finalist nominations at RSS, ICRA, CVPR, and IROS. She is also a recipient of the NSF CAREER Award and a Sloan Foundation fellowship, as well as research awards from Microsoft, Toyota Research, Google, Amazon, and JP Morgan. To learn more about Shuran’s work, please visit: https://shurans.github.io/

Early Career Panel

Wed, Nov 8, 4:15 - 5:15 pm, Session Chairs: Leslie Kaelbling & Ken Goldberg

Banquet Speakers

Louis Whitcomb

  • A professor of mechanical engineering, Louis Whitcomb is renowned for innovative robotics research and development for space, underwater, and other extreme environments, as well as novel systems for medicine and industry. He founded and directs the Johns Hopkins Dynamical Systems and Control Laboratory, leading student researchers in nonlinear and adaptive control of robot systems; robot actuators and sensors; mechanical design; and control systems design for high-performance robot control. Whitcomb’s lab has participated in the development of underwater vehicles for oceanographic science missions, including the Nereus hybrid underwater vehicle, which dove to the bottom of the Mariana Trench, and the Nereid Under-Ice hybrid underwater vehicle, deployed under Arctic sea ice in 2014, 2016, and 2019. Whitcomb was co-PI on these vehicle development projects with his collaborators at the Woods Hole Oceanographic Institution. A veteran of more than 25 oceanographic expeditions and sea trials, Whitcomb also develops manipulators and control algorithms for medical robotic arms, enabling dexterous surgical tasks and improving upper-limb prostheses.

Justin Manley

  • Justin Manley is an innovative technologist and executive with experience in the startup, public-corporation, academic, and public sectors. He has been working with marine technology and robotics since 1990 and is a recognized leader in uncrewed systems development and operations.

    After earning degrees and holding professional roles at MIT, he moved on to support the National Oceanic and Atmospheric Administration's Office of Ocean Exploration. Subsequently, he helped launch Liquid Robotics, one of the first venture-backed ocean robotics startups, and then led business development for Teledyne Marine Systems.

    Drawn back to entrepreneurial endeavors, Justin founded Just Innovation Inc. in 2015. He has supported a variety of clients with a focus on robotics and oceantech. Today, Justin is a Senior Advisor at Oceankind, which seeks to improve the health of global ocean ecosystems while supporting the livelihoods of people who rely on them. He is also a Venture Partner at AiiM Partners, an impact fund that invests in businesses that have differentiated technologies and evidence of high-growth market traction in key climate sectors. 

    Justin is extensively involved in the marine technology profession through a variety of leadership roles. He is a Senior Member of IEEE, a Fellow of the Institute of Marine Engineering, Science and Technology (IMarEST), President of the Marine Technology Society, and a member of the U.S. Ocean Exploration Advisory Board. He has judged numerous XPRIZE competitions, advises several oceantech startups, and holds two patents in the field of uncrewed systems technology.