The PIXL lunch meets every Monday during the semester at noon in
room 402 of the Computer Science building. To get on the mailing
list to receive announcements, sign up for the "pixl-talks" list at
Monday, April 19, 2021
Monday, April 26, 2021
Monday, February 01, 2021
Human Visual Perception of Art as Computation
I will describe two projects aiming to further our understanding of visual art, building on insights from computer science. First, in representational art, I discuss the question of how line drawings work. It has been a long-standing mystery that line drawings are very different from our natural world, yet they can be understood by people who have never seen pictures before. I argue that this question can be resolved using the idea of Abstracted Shading from computer graphics. Second, I discuss abstract art, and, particularly, the notion of visual indeterminacy in ambiguous artworks. I discuss how a particular crowdsourced methodology could help quantify perceptual ambiguity in artworks.
Aaron Hertzmann is a Principal Scientist at Adobe Research. He received a BA in computer science and art & art history from Rice University in 1996, and a PhD in computer science from New York University in 2001. He was a Professor at University of Toronto for 10 years, and has also worked at Pixar Animation Studios, University of Washington, Microsoft Research, Mitsubishi Electric Research Lab, and Interval Research Corporation. He is an Affiliate Professor at University of Washington, an ACM Fellow, an IEEE Fellow, and the Editor-in-Chief of Foundations and Trends in Computer Graphics and Vision.
Monday, February 08, 2021
Talk by PIXL researchers
We have a series of short presentations by PIXL students and post-docs on their latest research.
Monday, February 15, 2021
Towards Personable, Consistent & Intuitive AI
While there has been tremendous progress in AI/ML in the last decade in computer vision, speech and audio, these technologies are not typically developed keeping the end human in mind. In order to make these technologies accessible to different communities, there is a need to develop AI that can understand and reason in human-like ways to better engage and interact with humans in a manner that they would like and connect with.
Towards my goal of advancing human-centered AI, I will first describe my work in developing personable, creative AI where I have looked at how AI can empower humans to accomplish mutual goals and inspire them in their creative processes through visual doodle generation and getting an agent to dance interestingly to music. Next, I will describe my work in teaching AI to reason about visual content in human-like, consistent ways in the context of visual question answering (VQA). Finally, I will briefly discuss some ongoing work that looks at an intuitive physical understanding of how objects interact with each other by encoding heuristics of human-like reasoning through modular approaches.
Purva Tendulkar is a visiting researcher at UC San Diego, where she is working with Prof. Xiaolong Wang. She completed her Master's in Computer Science in 2020 at Georgia Tech, where she was advised by Prof. Devi Parikh. She is interested in developing AI systems that better understand the physical and digital worlds that surround us, with the goal of improving our interactions within them. She has previously interned at Aibee Inc, Nanyang Technological University (NTU) and Indian Institute of Technology (IIT Bombay) and has collaborated with Allen AI and Microsoft Research. She graduated from College of Engineering Pune (COEP) in 2018 with a Bachelor's degree in Computer Science.
Monday, February 22, 2021
Learning the Predictability of the Future
Carl Vondrick + Didac Suris Coll-Vinent
Not everything in the future is predictable. We cannot anticipate the outcomes of coin flips, and we cannot forecast the exact trajectory of a person walking. Selecting what to predict is therefore a central issue for future prediction. In this talk, I will introduce a framework for learning from unlabeled video what is predictable in the future. Instead of committing up front to features to predict, our approach learns from data which features are predictable. Based on the observation that hyperbolic geometry naturally and compactly encodes hierarchical structure, we propose a predictive model in hyperbolic space. When the model is most confident, it will predict at a concrete level of the hierarchy, but when the model is not confident, it learns to automatically select a higher level of abstraction. Although our representation is trained with unlabeled video, visualizations show that action hierarchies emerge in the representation.
Carl Vondrick is on the computer science faculty at Columbia University. His research group studies computer vision and machine learning. His research is supported by the NSF, DARPA, Amazon, and Toyota, and his work has appeared in the national news, such as CNN, NPR, and the Associated Press, as well as in some children's books.
Dídac Surís is a second-year PhD student in computer vision at Columbia University, advised by Prof. Carl Vondrick. He obtained his BS and MS in telecommunications at the Polytechnic University of Catalonia, in Barcelona. Before starting his PhD, he interned with the research team at Telefonica, and he conducted research stays at MIT with Prof. Antonio Torralba and at University of Toronto with Prof. Sanja Fidler. His research interests include multimodal machine learning, video prediction, and self-supervised representation learning. More generally, he is interested in the areas of artificial intelligence that exploit all the available information using as little human supervision as possible.
Wednesday, March 03, 2021
Detailed Human Action Recognition and Anticipation in Videos
Videos convey informative content in our daily life. Recently, deep-learning-based networks have proven successful in understanding generic actions in videos. However, most existing video models are only good at capturing coarse actions and may fail to provide detailed descriptions of fine-grained human-object interactions in videos. In this talk, I will introduce my recent work in video understanding, focusing on fine-grained action recognition and anticipation. I will share our winning solutions at the CVPR EPIC-Kitchens Egocentric Action Recognition Challenge in 2019 and 2020. The core idea is to encourage the action model's interaction with the object model via detection guidance. The designed cross-gated attention mechanism could largely benefit the overall human-object action predictions. Then I will introduce a new framework to anticipate future human activity via imagination and contrastive learning. I will also discuss how to protect video privacy via natural language dialog.
Yu Wu is a fourth-year PhD candidate at University of Technology Sydney, advised by Prof. Yi Yang. He is interested in video understanding and multimodal perception, especially fine-grained action recognition and human activity anticipation. His research is supported by the Google PhD Fellowship.
Monday, March 08, 2021
Fair Attribute Classification through Latent Space De-biasing
Fairness in visual recognition is becoming a prominent and critical topic of discussion as recognition systems are deployed at scale in the real world. Models trained from data in which target labels are correlated with protected attributes (e.g., gender, race) are known to learn and exploit those correlations. In this work, we introduce a method for training accurate target classifiers while mitigating biases that stem from these correlations. We use GANs to generate realistic-looking images, and perturb these images in the underlying latent space to generate training data that is balanced for each protected attribute. We augment the original dataset with this perturbed generated data, and empirically demonstrate that target classifiers trained on the augmented dataset exhibit a number of both quantitative and qualitative benefits. We conduct a thorough evaluation across multiple target labels and protected attributes in the CelebA dataset, and provide an in-depth analysis and comparison to existing literature in the space.
Monday, March 22, 2021
Quantum and computational imaging: from multipath imaging to healthcare applications
I will review some of our work in the area of computational imaging based on, or inspired by, quantum technologies for light detection. The key aspect we have been investigating is the role of temporal (time-of-flight) information that can be recorded using single photon counting (and other) techniques. This temporal information allows a range of applications such as non-line-of-sight imaging or tracking, which I will briefly overview as an introduction to more general “multi-path” imaging, i.e. use of return echoes from a scene that have bounced multiple times between objects before being detected. Temporal information alone of multipath echoes can be sufficient to reconstruct a full 3D image of the scene and can be equally applied to light, radar and acoustic sensing. An extreme example of multipath information is diffuse imaging, i.e. imaging through highly scattering media with potential applications for through-body imaging (e.g. imaging inside the brain) and remote heart activity monitoring.
Daniele Faccio is a Royal Academy Chair in Emerging Technologies and Fellow of the Royal Society of Edinburgh and Optical Society of America. He joined the University of Glasgow in 2017 where he leads the Extreme-Light group and is Director of Research for the School of Physics and Astronomy. He is also adjunct professor at the University of Arizona, Tucson (USA) and previously was at Heriot-Watt University and University of Insubria (Italy). He has been visiting scientist at MIT (USA), Marie-Curie fellow at ICFO, Barcelona (Spain) and EU-ERC fellow 2012-2017. He was awarded the Philip Leverhulme Prize in Physics in 2015, the Royal Society of Edinburgh Senior Public Engagement medal and the Royal Society Wolfson Merit Award in 2017. He worked in the optical telecommunications industry for four years before obtaining his PhD in Physics in 2007 at the University of Nice-Sophia Antipolis (France). His research focuses on the physics of classical and quantum states of light, on how we can harness light to answer fundamental questions (e.g. through analogue gravity studies and the study of quantum states of light in non-inertial reference frames) and on how we can harness light to improve society (e.g. through applications of computational imaging to diffuse imaging and bio-imaging).
Monday, April 05, 2021
Monday, April 12, 2021