Jason Yosinski

Neural Net Hacker

Machine Learning Researcher

Hello there! I'm Jason, a Machine Learning scientist, founding member of Uber AI Labs (previously Geometric Intelligence), and scientific advisor to Recursion Pharmaceuticals. We're hiring: if you're an experienced ML researcher or coder and interested in working on some fun datasets and problems, drop me an email. Note: we're mostly looking for (1) researchers with a PhD or equivalent level of research experience, (2) research engineers / coders with significant open-source or private industry track records, or (3) those with some combination of the above. If emailing, please include a CV or resume and links to your Google Scholar and GitHub pages (or, if you don't have these pages, a summary of your publications and coding projects).

My research focuses on training and understanding neural networks and figuring out how to make them better. I completed my Ph.D. at Cornell, where at various times I worked with Hod Lipson (at the Creative Machines Lab), Jeff Clune, Yoshua Bengio (at U. Montreal's MILA), Thomas Fuchs (at Caltech JPL), and Google DeepMind. I was fortunate to be supported by a NASA Space Technology Research Fellowship, which gave me the opportunity to trek around and work with all these great folks.


PPGN teaser image

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space

Anh Nguyen, Jason Yosinski, Yoshua Bengio, Alexey Dosovitskiy, and Jeff Clune

arXiv (pdf) | code | project page | quick overview »

Methods that generate images by iteratively following class gradients in image space in some cases have been used to produce unrealistic adversarial or fooling images (Szegedy et al., 2013; Nguyen et al., 2014) and in other cases have been used as pseudo-generative models to produce somewhat realistic images that show good global structure but still don't look fully natural (Yosinski et al., 2015) or do look natural but lack diversity (Nguyen et al., 2016). Deficiencies in previous approaches result in part from training and sampling methods that have just been hacked together to produce pretty pictures rather than designed from the ground up as a trainable, generative model. In this paper, we formalize consistent training and sampling procedures for such models and as a result obtain much more diverse and visually compelling samples. Read more »

Conv3 matches

Convergent Learning: Do different neural networks learn the same representations?

Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson, and John Hopcroft

ICLR 2016 arXiv (pdf) | code | video of talk and more info »

Deep neural networks have recently been working really well, which has prompted active investigation into the features learned in the middle of the network. The investigation is hard because it requires making sense of millions of learned parameters. But it’s also valuable, because any understanding we acquire promises to help us build and train better models. In this paper we investigate the extent to which neural networks exhibit what we call convergent learning, which is when the representations learned by multiple nets converge to a set of features which are either individually similar between networks or where subsets of features span similar low-dimensional spaces. We probe representations by training multiple networks and then comparing and contrasting their individual, learned features at the level of neurons or groups of neurons. This initial investigation has led to several insights which you will find out if you read the paper. Read more »
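To make the idea of comparing individual neurons across networks concrete, here is a minimal sketch. It assumes Pearson correlation over a shared set of inputs as the similarity measure and a simple best-match rule; both choices, along with the names `pearson`, `match_units`, and the toy activation values, are assumptions made for illustration rather than details taken from the text above.

```python
from statistics import mean, pstdev

def pearson(a, b):
    """Pearson correlation between two equal-length activation vectors."""
    ma, mb = mean(a), mean(b)
    cov = mean([(x - ma) * (y - mb) for x, y in zip(a, b)])
    return cov / (pstdev(a) * pstdev(b))

def match_units(acts_a, acts_b):
    """For each unit in net A, find the most correlated unit in net B.

    acts_a and acts_b are lists of per-unit activation vectors, where
    position i in every vector corresponds to the same input example.
    Returns a list of (best_index_in_b, correlation) pairs.
    """
    matches = []
    for a in acts_a:
        corrs = [pearson(a, b) for b in acts_b]
        best = max(range(len(corrs)), key=lambda j: corrs[j])
        matches.append((best, corrs[best]))
    return matches

# Toy activations (hypothetical): two units per net, four shared inputs.
# Net B holds roughly the same two features as net A, in swapped order.
net_a = [[0.0, 1.0, 2.0, 3.0], [1.0, 0.0, 1.0, 0.0]]
net_b = [[0.9, 0.1, 1.1, 0.0], [0.1, 2.1, 3.9, 6.0]]

matches = match_units(net_a, net_b)
for i, (j, r) in enumerate(matches):
    print(f"A unit {i} best matches B unit {j} (r = {r:.3f})")
```

In this toy case each unit in net A finds a strongly correlated partner in net B at a different index, which is the kind of one-to-one correspondence the question of convergent learning asks about.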

Deep Vis Toolbox

Understanding Neural Networks Through Deep Visualization

Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson

ICML DL Workshop paper | video | code and more info »

Recent years have produced great advances in training large, deep neural networks (DNNs), including notable successes in training convolutional neural networks (convnets) to recognize natural images. However, our understanding of how these models work, especially what computations they perform at intermediate layers, has lagged behind. Here we introduce two tools for better visualizing and interpreting neural nets. The first is a set of new regularization methods for finding preferred activations using optimization, which leads to clearer and more interpretable images than had been found before. The second tool is an interactive toolbox that visualizes the activations produced on each layer of a trained convnet. You can input image files or read video from your webcam, which we've found fun and informative. Both tools are open source. Read more »
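The first tool, finding a preferred input by optimization under regularization, can be sketched on a toy problem. The example below assumes a single linear unit with activation a(x) = w · x in place of a real convnet neuron, and uses an L2 penalty as the regularizer; the function name, the penalty form, and all constants are assumptions chosen for illustration.

```python
def maximize_activation(w, steps=500, lr=0.1, l2_penalty=0.5):
    """Gradient ascent on a(x) - (l2_penalty / 2) * ||x||^2 for a(x) = w . x.

    For a linear unit, the gradient of a(x) with respect to x is simply w.
    The L2 penalty contributes -l2_penalty * x, which keeps x from growing
    without bound. A real network would obtain the gradient by backprop.
    """
    x = [0.0] * len(w)
    for _ in range(steps):
        x = [xi + lr * (wi - l2_penalty * xi) for xi, wi in zip(x, w)]
    return x

# Hypothetical weights of the unit being visualized.
w = [1.0, -2.0, 0.5]
x_star = maximize_activation(w)
print([round(v, 4) for v in x_star])
```

With the penalty in place, the iterates settle at w / l2_penalty, a bounded input aligned with the unit's weights. With `l2_penalty=0` the same loop would keep growing x indefinitely, which hints at why some regularization is needed to get interpretable results.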

Example fooling images

Deep Neural Networks are Easily Fooled

Anh Nguyen, Jason Yosinski, and Jeff Clune

CVPR paper | code | more »

Deep neural networks (DNNs) have recently been doing very well at visual classification problems (e.g. recognizing that one image is of a lion and another image is of a school bus). A recent study by Szegedy et al. showed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a network to label the image as something else entirely (e.g. mislabeling a lion as a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). We show methods of producing fooling images both with and without the class gradient in pixel space. The results shed light on interesting differences between human vision and state-of-the-art DNNs. Read more »

Plot of convnet transfer performance

How Transferable are Features in Deep Neural Networks?

Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson

NIPS paper | code | more »

Many deep neural networks trained on natural images exhibit a curious phenomenon: they all learn roughly the same Gabor filters and color blobs on the first layer. These features seem to be generic — useful for many datasets and tasks — as opposed to specific — useful for only one dataset and task. By the last layer features must be task specific, which prompts the question: how do features transition from general to specific throughout the network? In this paper, presented at NIPS 2014, we show the manner in which features transition from general to specific, and also uncover a few other interesting results along the way. Read more »

Generative Stochastic Networks (GSN) example

Generative Stochastic Networks

Yoshua Bengio, Éric Thibodeau-Laufer, Guillaume Alain, and Jason Yosinski

First arXiv paper | ICML paper | Latest arXiv paper

Unsupervised learning of models for probability distributions can be difficult due to intractable partition functions. We introduce a general family of models called Generative Stochastic Networks (GSNs) as an alternative to maximum likelihood. Briefly, we show how to learn the transition operator of a Markov chain whose stationary distribution estimates the data distribution. Because this transition distribution is a conditional distribution, it's often much easier to learn than the data distribution itself. Intuitively, this works by pushing the complexity that normally lives in the partition function into the “function approximation” part of the transition operator, which can be learned via simple backprop. We validate the theory by showing several successful experiments on two image datasets and with a particular architecture that mimics the Deep Boltzmann Machine but without the need for layerwise pretraining.
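The claim that a chain built from a conditional (denoising) distribution can have the data distribution as its stationary distribution can be checked on a tiny discrete example. This sketch assumes three states, a fixed corruption process, and the exact denoising posterior in place of a learned one; in a GSN that conditional would be approximated by a trained network. All names and numbers here are hypothetical.

```python
def denoising_posterior(p, C):
    """Exact P(x | x_tilde), proportional to p[x] * C[x][x_tilde].

    Returns post with post[x_tilde][x]. A GSN would learn this conditional;
    here it is computed in closed form as a stand-in.
    """
    K = len(p)
    post = []
    for xt in range(K):
        weights = [p[x] * C[x][xt] for x in range(K)]
        total = sum(weights)
        post.append([wgt / total for wgt in weights])
    return post

def transition_operator(C, post):
    """One chain step: corrupt x to x_tilde, then denoise x_tilde to x'."""
    K = len(C)
    return [[sum(C[x][xt] * post[xt][x2] for xt in range(K))
             for x2 in range(K)] for x in range(K)]

def run_chain(T, start, steps=200):
    """Push a distribution over states through the chain `steps` times."""
    K = len(T)
    dist = list(start)
    for _ in range(steps):
        dist = [sum(dist[x] * T[x][x2] for x in range(K)) for x2 in range(K)]
    return dist

p_data = [0.6, 0.3, 0.1]                      # toy data distribution
C = [[0.8, 0.1, 0.1],                         # C[x][x_tilde]: corruption
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

T = transition_operator(C, denoising_posterior(p_data, C))
final = run_chain(T, start=[0.0, 0.0, 1.0])   # start far from p_data
print([round(v, 6) for v in final])
```

Starting from a distribution concentrated on the least likely state, the chain's distribution converges to `p_data`, even though the transition operator was built only from conditional distributions and no partition function was computed for the chain itself.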

Endless Forms shapes


Jason Yosinski, Jeff Clune, and Hod Lipson

Watch the two-minute intro video. Users on EndlessForms.com collaborate to produce interesting crowdsourced designs. Since launch, over 4,000,000 shapes have been seen and evaluated by human eyes. This volume of user input has produced some really cool shapes. EndlessForms has received some favorable press. Evolve your own shape »



Sara Lohmann*, Jason Yosinski*, Eric Gold, Jeff Clune, Jeremy Blum and Hod Lipson

(read the paper) Many labs work on gait learning research, but since they each use different robotic platforms to test out their ideas, it is hard to compare results between these teams. To encourage greater collaboration between scientists, we have developed Aracna, an open-source, 3D printed platform that anyone can use for robotic experiments.

Cornell Chatbots

AI vs. AI

Igor Labutov*, Jason Yosinski*, and Hod Lipson

As part of a class project, Igor Labutov and I cobbled together a speech-to-text + chatbot + text-to-speech system that could converse with a user. We then hooked up two such systems, gave them names (Alan and Sruthi), and let them talk together, producing endless robotic comedy. Somehow the video became popular. There was an XKCD about it, and Sruthi even told Robert Siegel to “be afraid” on NPR. Dress appropriately for the coming robot uprising with one of our fashionable t-shirts.


Gait Learning on QuadraTot

Jason Yosinski, Jeff Clune, Diana Hidalgo, Sarah Nguyen, Juan Zagal, and Hod Lipson

code on GitHub

(read the paper) Getting robots to walk is tricky. We compared many algorithms for automating the creation of quadruped gaits, with all the learning done in hardware (read: very time consuming). The best gaits we found were nearly 9 times faster than a hand-designed gait and exhibited complex motion patterns that contained multiple frequencies, yet showed coordinated leg movement. More recent work blends learning in simulation and reality to create even faster gaits.

Never mind all this, just show me the videos!
Or, if you prefer, here's a slightly outdated CV.

Selected Papers and Posters more »

  1. Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. How transferable are features in deep neural networks? Advances in Neural Information Processing Systems 27 (NIPS '14), pages 3320-3328. 8 December 2014.
    See also: earlier arXiv version. Oral presentation (1.2%).
  2. Anh Nguyen, Jason Yosinski, and Jeff Clune. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 8 June 2015. Oral presentation (3.3%), CVPR 2015 Community Top Paper Award.
  3. Yoshua Bengio, Éric Thibodeau-Laufer, Guillaume Alain, and Jason Yosinski. Deep Generative Stochastic Networks Trainable by Backprop. Proceedings of the International Conference on Machine Learning. 21 June 2014.
    See also: supplemental section, earlier and later arXiv versions.
  4. Jeff Clune, Jason Yosinski, Eugene Doan, and Hod Lipson. EndlessForms.com: Collaboratively Evolving Objects and 3D Printing Them. Proceedings of the 13th International Conference on the Synthesis and Simulation of Living Systems. 21 July 2012.
    Winner of Best Poster award.
  5. Sara Lohmann, Jason Yosinski, Eric Gold, Jeff Clune, Jeremy Blum, and Hod Lipson. Aracna: An Open-Source Quadruped Platform for Evolutionary Robotics. Proceedings of the 13th International Conference on the Synthesis and Simulation of Living Systems. 19 July 2012.
    Winner of Best Presentation award.
  6. Jason Yosinski, Jeff Clune, Diana Hidalgo, Sarah Nguyen, Juan Cristobal Zagal, and Hod Lipson. Evolving Robot Gaits in Hardware: the HyperNEAT Generative Encoding Vs. Parameter Optimization. Proceedings of the 20th European Conference on Artificial Life, Paris, France, pages 890-897. 8 August 2011.

Google scholar | see all 29 papers and posters »

Selected Press more »

XKCD: AI Sept 7, 2011

BBC: First 'chatbot' conversation ends in argument
(Video interview with Igor and me)
Sept 8, 2011

New Scientist: One Percent: Evolve your own objects for 3D printing
(on homepage)
19 August 2011

Through the Wormhole with Morgan Freeman: Are Robots the Future of Human Evolution? See my walking robots from 7:00 - 7:45 and 9:40 - 11:10. (Season 4, episode 7; unreliable video link.) July 10, 2013

see more press »


Before coming to Cornell, I did my undergrad at Caltech and then worked on estimation at a research-oriented applied math startup for a couple of years.