

The PIXL lunch meets every Monday during the semester at noon in room 402 of the Computer Science building. To get on the mailing list to receive announcements, sign up for the "pixl-talks" list at lists.cs.princeton.edu.

Upcoming Talks


Monday, February 27, 2017
Linguang Zhang


Monday, March 06, 2017
Shuran Song


Monday, March 13, 2017
Manolis


Monday, March 27, 2017
Kyle


Monday, April 03, 2017
Elena Sizikova


Monday, April 10, 2017
Nora


Monday, April 17, 2017
Xinyi / Amit


Monday, April 24, 2017
Riley


Monday, May 01, 2017
Maciej / Andy


Previous Talks


Monday, February 13, 2017
VoCo 2.0 and Wave/FFT-net
Zeyu Jin

Abstract
Editing audio narration using conventional software typically involves many painstaking low-level manipulations. Some state-of-the-art systems allow the editor to work in a text transcript of the narration and perform select, cut, copy, and paste operations directly in the transcript; these operations are then automatically applied to the waveform in a straightforward manner. However, an obvious gap in the text-based interface is the ability to type new words not appearing in the transcript, for example inserting a new word for emphasis or replacing a misspoken word. We present a system called VoCo, a text-based speech editor that can synthesize a new word or short phrase such that it blends seamlessly in the context of the existing narration. The main idea is to use a text-to-speech synthesizer to say the word in a generic voice, and then use voice conversion to convert it into a voice that matches the narration. Offering a range of degrees of control to the editor, our interface supports fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and even guidance by the editor's own voice. An earlier version was presented at Adobe MAX (https://www.youtube.com/watch?v=I3l4XLZ59iw). The first half of this talk will focus on the latest version of VoCo, submitted to SIGGRAPH.

VoCo has generated quite a bit of buzz lately, but it is not perfect. Since it is a data-driven method that concatenates audio snippets to form new words, its quality degrades with segmentation errors and insufficient data. We would like to address this problem with parametric methods. Although existing parametric methods introduce artifacts that make them unfit for VoCo, we borrowed the idea from WaveNet and devised a parameter-to-waveform generator called FFT-net. While WaveNet can be seen as reenacting the mathematical structure of a wavelet transform, with coefficients replaced by neural connections, FFT-net mimics the fast Fourier transform. Our experiments show that FFT-net can produce a nearly perfect audio signal from widely used parametric representations such as MFCC+pitch. In the second half of this talk, I would like to demonstrate its strength in text-to-speech synthesis, transformation, audio feature sonification, and, most importantly, making a parametric-model-based VoCo.


Monday, February 20, 2017
Studying the Internet of (Any)Things: Confessions of a Human-Computer Interaction Researcher
Marshini

Abstract
The Internet has become a part of our daily lives, yet we often do not stop to consider how users interact with this infrastructure or what is needed to make the Internet of any and every thing run smoothly in everyday activities. In this talk, I present my current and ongoing work, which focuses on helping users understand, gain control over, and make more effective use of Internet infrastructure in their day-to-day tasks. This will entail research confessions from the perspective of human-computer interaction and cover projects in ubiquitous computing, usable security, and information and communications technologies for development.

Bio
Marshini Chetty is a research scholar in the Department of Computer Science at Princeton University specializing in human-computer interaction and ubiquitous computing. Marshini designs, implements, and evaluates technologies to help users manage different aspects of Internet use, from security to performance. She often works in resource-constrained settings and uses her work to help inform policy. She has a Ph.D. in Human-Centered Computing from the Georgia Institute of Technology, USA, and a Master's and Bachelor's in Computer Science from the University of Cape Town, South Africa. Her passions are all things broadband-related and trying to make the world a better place, one bit at a time.