AI Makes Stunning Photos From Your Drawings (pix2pix) | Two Minute Papers #133

November 10, 2019

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. In an earlier work, we were able to change a photo of an already existing design according to our taste. That was absolutely amazing. But now, hold onto your papers and have a look at this! Because here, we can create something out of thin air!

The input in this problem formulation is an image, and the output is an image of a different kind. Let’s call this process image translation. It is translation in the sense that, for instance, we can add an aerial view of a city as an input and get the map of this city as an output. Or we can draw the silhouette of a handbag and have it translated to an actual, real-looking object. And we can go even crazier: for instance, day-to-night conversion of a photograph is also possible. What an incredible idea, and look at the quality of the execution. Ice cream for my eyes.

And, as always, please don’t think of this algorithm as the end of the road. Like all papers, this is a stepping stone, and a few more works down the line, the kinks will be fixed and the output quality is going to be vastly improved.

The technique uses a conditional adversarial network to accomplish this. This works the following way: there is a generative neural network that creates new images all day, and a discriminator network is also available all day to judge whether these images look natural or not. During this process, the generator network learns to draw more realistic images, and the discriminator network learns to tell fake images from real ones. If they train together for long enough, they will be able to reliably create these image translations for a large set of different scenarios.

There are two key differences that make this piece of work stand out from the classical generative adversarial networks. One: both neural networks have the opportunity to look at the before and after images. Normally we restrict the problem to only looking at the after images, the final results. And two: instead of only positive, both positive and negative examples are generated. This means that the generator network is also asked to create really bad images on purpose so that the discriminator network can more reliably learn the distinction between flippant attempts and quality craftsmanship.

Another great selling point here is that we don’t need several different algorithms for each of the cases. The same generic approach is used for all the maps and photographs; the only thing that is different is the training data.

Twitter has blown up with fun experiments, most of them including cute drawings ending up as horrifying-looking cats. As the title of the video says, the results are always going to be stunning, but sometimes a different kind of stunning than we’d expect. It’s so delightful to see that people are having a great time with this technique, and it is always a great choice to put out such a work for a wide audience to play with.

And if you got excited for this project, there are tons, and I mean tons, of links in the video description, including one to the source code of the project, so make sure to have a look and read up some more on the topic. There’s going to be lots of fun to be had! You can also try it for yourself: there is a link to an online demo in the description, and if you post your results in the comments section, I guarantee there will be some amusing discussions.

I feel that soon, a new era of video games and movies will dawn where most of the digital models are drawn by computers. As automation and mass production are standard in many industries nowadays, we’ll surely be hearing people going: “Do you remember the good old times when video games were handcrafted? Man, those were the days!”

If you enjoyed this episode, make sure to subscribe to the series; we try our best to put out two of these videos per week. We would be happy to have you join our growing club of Fellow Scholars and be a part of our journey to the world of incredible research works such as this one. Thanks for watching and for your generous support, and I’ll see you next time!
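
The first of the two differences mentioned in the transcript, a discriminator that looks at the before and after images together, can be illustrated with a toy sketch. This is not the project's source code. Images are short lists of numbers, the discriminator is a single logistic unit, and every name and value below (`discriminator`, `weights`, the sample pixels) is a hypothetical stand-in chosen for illustration. The loss is the standard binary cross-entropy used by adversarial networks in general.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def discriminator(weights, bias, source, target):
    """Toy conditional discriminator: scores the (source, target) pair.

    A classical GAN discriminator would only be given `target`.
    """
    pair = source + target  # concatenate the "before" and "after" images
    if len(weights) != len(pair):
        raise ValueError("weights must cover both source and target pixels")
    logit = sum(w * p for w, p in zip(weights, pair)) + bias
    return sigmoid(logit)  # probability that the pair is a real translation


def bce(prob, label):
    """Binary cross-entropy for one prediction (label is 1.0 or 0.0)."""
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(prob + eps)
             + (1.0 - label) * math.log(1.0 - prob + eps))


def discriminator_loss(weights, bias, source, real_target, fake_target):
    """The discriminator wants real pairs scored 1 and generated pairs 0."""
    real = bce(discriminator(weights, bias, source, real_target), 1.0)
    fake = bce(discriminator(weights, bias, source, fake_target), 0.0)
    return real + fake


def generator_loss(weights, bias, source, fake_target):
    """The generator wants its output, paired with the input, scored 1."""
    return bce(discriminator(weights, bias, source, fake_target), 1.0)


# Hypothetical two-pixel "images" (assumed values, not real data).
source = [0.0, 1.0]        # e.g. a sketch
real_target = [0.2, 0.9]   # the matching photo
fake_target = [0.9, 0.1]   # a poor generated attempt
weights = [0.5, -0.5, -1.0, 1.0]  # two weights for source, two for target
bias = 0.0

d_loss = discriminator_loss(weights, bias, source, real_target, fake_target)
g_loss = generator_loss(weights, bias, source, fake_target)
```

In an actual training loop the two losses would be minimized in alternation, with the generator producing `fake_target` from `source`. Here the weights are fixed by hand, so the sketch only shows what each network is scored on.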


  • Reply Quenz March 5, 2017 at 5:38 pm

    "Ice cream for my eyes!" 😀

  • Reply Dave Jacob March 5, 2017 at 6:04 pm

    this is simply awesome.

  • Reply CGPacifica March 5, 2017 at 6:12 pm

    And here I've been thinking "Well at least creative jobs will be safe when autonomy takes over the rest of the jobs…" FML

  • Reply icecaloric March 5, 2017 at 6:24 pm

    If they get something like this to work for videos, then it would be difficult for people to determine what's real and what's fake. People have already been fooled by photoshopped images. How long until people start manufacturing "evidence" for courts or generating "fake news" footage?

  • Reply BjarkeDuDe March 5, 2017 at 6:30 pm

    Khajiit is not entertained by your shenanigans

  • Reply Bishshoy Das March 5, 2017 at 9:07 pm

    Fucking mind blown.

  • Reply Yongliang Qin March 5, 2017 at 9:55 pm

    just read this a month ago

  • Reply allinonemovie March 5, 2017 at 9:56 pm

    Considering the amount of work you have to do for each video, it's incredible how many videos you upload. Just two words: Thank you!

  • Reply m ・ ́ω・ March 5, 2017 at 11:23 pm

    Isn't this how human imagination works? Are we on the dawn of creating thinking machines?

  • Reply Mr Tomato March 6, 2017 at 2:34 am

    love your videos keep uploading

  • Reply Eyesonly - 目だけ March 6, 2017 at 8:05 am

    Been here since 1k subs. Your content is amazing. Thanks for the video

  • Reply Shaul Kedem March 6, 2017 at 11:14 am

    What is this cat on the first slide, I thought we got to that level of generative images 🙁

  • Reply Hyunsung Go March 7, 2017 at 2:36 pm

    Have you ever heard of HTM(Hierarchical Temporal Memory)?
    It's kind of like a neural network but much better.
    The creators of HTM tried to mimic the neocortex and it works really well.
    They claim that it's the real way of true intelligence.
    I just think it's incredibly awesome and I don't think it doesn't get much attention it deserves.
    After all, isn't the neocortex that makes human intelligent, right?
    p.s. Sorry for my bad English ;(

  • Reply Japan is Sinking March 8, 2017 at 1:44 am

    So I was sitting in Spanish class the other day when I got an idea, and I want another take on it.
    You know how sites like Google translate always churn out characteristically faulty and unreliable translations? Well, would it be possible to improve machine translations with the aid of a general AI?
    How I imagine it would work would be similar to the other general AIs discussed on this channel. It would start out with the broken machine translation first, make its evolved changes to it, and then see how close it is to the same sentence translated by a professional translator (there are hundreds of books translated from English to Spanish every year, so a large training data set wouldn't be that hard to acquire I imagine.) The closer the edited bad translation is to the proper translation, the more fitness points the AI is rewarded with.
    Would this even work, or does language have too many subtle complexities for human-made code to be improved upon? I watch the papers featured here and especially in the image generation ones the neural networks' understanding of how RGB values come together to make recognizable images seems impossibly deep for a computer, surely this same understanding can be reached in regards to language? Or maybe I'm missing something in my ignorance?
    Anyway, fantastic channel man, it isn't often that someone finds such a small niche and pours so much effort into filling it!

  • Reply krzysztofnatalicz March 8, 2017 at 12:50 pm

    You can try it as an app now.

  • Reply dewinmoonl March 11, 2017 at 2:34 am

    cool recap. Isola actually came to MIT yesterday to give a talk, it was cool 😀

  • Reply apolotary March 12, 2017 at 6:52 am

    Time to open an artisanal handcrafted game studio

  • Reply Christopher March 19, 2017 at 4:13 am

    I like two minute papers, but some of these papers may end up strictly becoming fun apps on smart phones; never truly lifting off as with the case of VR.

  • Reply VoidMoth March 22, 2017 at 9:33 am

    they look like they about to solve the riemann hypothesis, and I look like Ive just figured out how to draw phallic objects in the sand, by writing a temperature plotting thing. awesome vid tho

  • Reply Pacdev March 23, 2017 at 1:57 pm

    4 minutes paper

  • Reply SUV Tropics March 24, 2017 at 5:05 pm

    I wonder if there will be a human massacre run by robots.

  • Reply TedRobotBuilder March 27, 2017 at 4:15 pm

    If it works for video/animation, this will be huge. I don't see why it wouldn't.

  • Reply Hristo Vrigazov March 27, 2017 at 4:25 pm

    Best channel ever

  • Reply Craig Wall April 18, 2017 at 8:56 pm

    I'd hate it if AI replaced artists and designers.

  • Reply Vinay Seth April 19, 2017 at 2:51 pm

    Just tell me when I'm going to become obsolete :/

  • Reply Marius Langeland June 16, 2017 at 4:12 am

    Why are we still here, just to suffer?

  • Reply TwizzlyTwist June 20, 2017 at 12:05 pm

    This ruins the purpose of actual artists.

  • Reply AlphaCore June 22, 2017 at 2:02 pm

    this is my new favorite channel

  • Reply Chris Walsh June 24, 2017 at 11:11 pm

    ice cream for my eyes LOL!

  • Reply Overkin July 4, 2017 at 1:14 pm

    we'll soon be able to "translate" stick-figure drawings to porn. The golden age of erotica!

  • Reply Constant Throwing July 11, 2017 at 10:00 pm

    You computer people are smart fellers.

  • Reply ksztyrix July 12, 2017 at 2:18 pm

    Those are not photographs, but simulacrums.

  • Reply Bipin Oli July 13, 2017 at 8:27 am

    Where do you find all these information about the latest papers?

  • Reply Alex Taylor Barratt July 23, 2017 at 8:00 pm

    I love your channel so much!!!!!!

  • Reply Kade Blad July 23, 2017 at 11:38 pm

    This is amazing!
    And it's written in Python!!! 😀

    SO COOL!!!!!!!!!!!!!!!!!!

  • Reply Ivan Damico August 5, 2017 at 12:09 am

    Thanks for the evaluation AND for supplying the links! I'm already on Github and have set up my account and learning code branches and repositories (sounds like a bank!) and as the process of learning and sharing information is growing exponentially, my hats off to you for sharing the latest of your discoveries and insights, and not waiting until your patent comes through to share what you have learned like so many others do. To create a new method is inspired, to share with the world, divine.

  • Reply Bannicus August 8, 2017 at 4:41 am

    Maybe it could be used for mouth movements in animation so eg. Pixar can dub content in each language and have the software animate mouths and have them work properly in all languages.

  • Reply Player_1 August 23, 2017 at 6:20 pm

    We need 3D models of those horrifying animals

  • Reply Adrian Salamunovic September 1, 2017 at 5:56 pm

    What does "ground truth" mean between input and output?

  • Reply Erik X September 7, 2017 at 2:45 am

    You should really start to give credit to the actual authors and keep the original title.

  • Reply Swift Fox October 31, 2017 at 2:37 pm

    "Bespoke, artisinal hand-crafted games". I can imagine somebody using this as their sales pitch.

  • Reply Frank Anzalone November 6, 2017 at 10:38 pm

    Four and a half minute paper

  • Reply Limitless 1 November 20, 2017 at 9:19 pm

    wow this is soo cool
    i just played with the website
    thanks 🙂

  • Reply LORDE 2729 December 2, 2017 at 5:43 am

    lol i like how he says his name. "karow zohor…..blaballab"

  • Reply Codex Group December 8, 2017 at 9:41 pm

    Is there an app or service that will let us summarize documents using AI? Like a two minute summary of a PDF?

  • Reply Solve Everything December 16, 2017 at 6:55 pm

    Maybe this can be used to turn low-res images into high-res images? And make resolution in games go up?

  • Reply ello propello January 26, 2018 at 10:04 pm

    i am not able to find this tool. no link i found so far contained a working online demonstration

  • Reply 4Dm8ion March 5, 2018 at 12:54 pm

    Wish I could UL my drawings of my cat instead of redrawing w a mouse.

  • Reply Isai Karnadhi March 20, 2018 at 8:42 am

    This is how the Chinese designed knockoffs…

  • Reply AnteConfig April 12, 2018 at 10:38 pm

    imagine video game graphics like peoples faces in games being drawn in this fashion. That would be soo good.
    Or cartoons and if the voices are artificially crafted as well the cartoons can run for decades without needing to hire voice actors.
    Imagine TV shows like The Expanse or Stargate but with no actors.
    Why use your voice on the radio when you can have a machine generate another one.
    this is great stuff.

  • Reply Irene Plaster April 18, 2018 at 7:16 am

    Wheres the link to make your own? why do you refuse to post the link for that?

  • Reply adfdasfadfdaaaa April 29, 2018 at 8:12 am

    "Open the pod bay doors, HAL!" This is borderline terrifying to me. I don't want to sound backward minded, but I can see no way all this progress wouldn't go wrong. Society is not flexible enough and neither smart enough (as a whole) to keep the pace and adapt to AI, which will soon be able to evolve based on it's own decisions. The tool will become smarter than the user. Change my mind. Please.

  • Reply Sancarn May 19, 2018 at 5:56 pm

    Hello World

  • Reply Leo Zendo May 30, 2018 at 4:03 am

    What is the cat for?

  • Reply luis pacheco June 4, 2018 at 6:50 pm

    pix2pix: Turning new products into used products

  • Reply Zorn101 June 9, 2018 at 2:29 am


  • Reply YS June 21, 2018 at 4:17 am

    AI is still way far from artistic QC, direction and control. Needs more development..

  • Reply Yves Gomes June 29, 2018 at 10:28 pm

    I'm playing with it. My first result was terrible, probably mostly because I tried to draw the whiskers.

  • Reply Joe Siu July 3, 2018 at 12:31 am


  • Reply Surekha Sarode July 28, 2018 at 5:09 pm

    Its really awesome!!!
    Can it be used for human beings, so that it can be helpful for the crime branch to identify criminals by the sketch drawn?

  • Reply Gustavo Martinez August 21, 2018 at 8:20 am

    How can I become a patreon? I just have cash!!!, no card!!

  • Reply DunnickFayuro September 3, 2018 at 12:41 am

    At one point in the future, these sorts of AI will be embedded into graphics renderers of games.

  • Reply videolabguy September 3, 2018 at 6:52 pm

    WHEN the AI takes over and conquers the human race, we'll be lucky to be kept as pets instead of "a bundle of raw materials". This is a bad path to follow. I just hope the AI doesn't have this narrators horribly irritating voice.

  • Reply ChristianIce November 6, 2018 at 7:55 pm

    Yeah, tried the demo, doesn't do anything.
    You draw what you are supposed to, it elaborates and then a white rectangle appears.

  • Reply Rainbow Doodler209 December 19, 2018 at 2:37 am

    This is very very impressive, but AI hopefully won't rule the world. Remember, our robots are not like skynet.

  • Reply Le Wang January 16, 2019 at 6:15 pm

    Anyone able to generate good-looking picture from drawing on that website? For me, it looks terrible.

  • Reply Jason Hanson January 30, 2019 at 7:53 am

    Won't be for a long time. It looks like crap had a baby with a cat.

  • Reply Justin Saephan March 22, 2019 at 10:50 pm

    looks like a bad dream or some chinese knockoff.

  • Reply raintz randmaa April 21, 2019 at 3:09 pm

    With these, they can start making fake news, in other words, humans are gone soon…

  • Reply Beedy KH May 10, 2019 at 11:34 am

    Oh nice. Now governments can fake satellite images and hide secret projects.

  • Reply Big Fat Rat May 14, 2019 at 11:15 am

    This is kinda just pix2pix but if it was better.

  • Reply zillBoy May 25, 2019 at 7:39 pm


  • Reply José Leonardo Diaz Ordoñez June 27, 2019 at 7:15 pm

    2019 -> Deep nudes.

  • Reply Jeff Greenwade July 10, 2019 at 3:33 am

    AI will be able to automate creative jobs much faster than people think. We need to get ahead of the transformation before drastic changes cause damage in society. Andrew Yang is the only candidate running right now who is informed on automation and its effects.

  • Reply Dim tass July 16, 2019 at 9:55 pm

    Next step for entertainment AI is to put yourself and friends as the main character in movie in real-time.

  • Reply Artisan July 27, 2019 at 10:31 am

    This is just like the photoshop tool that clones a texture, the healing brush, but more advanced. I mean, this is not "artists died", it's more like "great, we artists can design more and better things in less time". It's just a tool.

  • Reply Artisan July 27, 2019 at 10:43 am

    Video games were never hand crafted, the computer was always there and most of the textures were generic and tiled textures. This will be more more close to hand crafted than the old games if you consider the old games somehow hand crafted. Also, there is things that doesn't exist so artists will be creating those things. Then the price factor, you might have the technology but if most studios is just cheaper to so something else even with many artists working on it, that will be better.

    For me in the end this will be like a book. The artists will be the writer. The book will be the story generated with images, a 3d game, a world full of life, but still a narrative of that world will be there and artist driven. In my opinion Machines will never replace artists because IA is not really possible and even if it was at that time no fucking one will need to work at that point so i'ts irrelevant, Africa would be rich like the rest of the world, if not that means artists are still needed for political reasons.

    We are 200 years later saying "machines will replace humans, be scared" this started even before French revolution and nothing happened, in the end machines are just tools to speed up things or make them better.

  • Reply TURRO27 August 9, 2019 at 4:14 pm


  • Reply Tiju John August 21, 2019 at 8:32 pm

    20:40 generated images feels like what you recall from a dream

  • Reply Chonnawit K September 26, 2019 at 4:41 am

    Data usage is not imagine.Mix and Match is not imagine. It can makes good pictures but Art is more than that .Art is "spirit" and now AI can not do that.

  • Reply Ojasvi Singh 786 October 14, 2019 at 8:37 am


  • Reply Weißbrot Waigmann October 27, 2019 at 3:45 pm

    In the future you don't do work yourself as a designer, you let an algorithm handle it for you

  • Reply Lupusregina October 28, 2019 at 1:44 pm

    I'm starting to believe there will not be a single job left when AI gets sufficiently advanced.

  • Reply Creepy Chris October 29, 2019 at 1:57 pm

    he always says fellow scholars which makes me feel smart but im a stoner watching cool ai videos

  • Reply Ugandan Knuckles October 29, 2019 at 3:40 pm

    Someone must have made one and used private parts as training data.

  • Reply Zetsuke4 October 30, 2019 at 6:17 pm

    So cool

  • Reply skudzer1985 November 2, 2019 at 2:52 am

    c'mon man, there's not "tons" of links in the description. There's a good bit, sure, but the way you emphasized "tons" was a bit of an exaggeration, wouldn't you say? I mean, if I clicked on the description and saw a wall of links, from the top of my screen to the bottom, then I would agree with your "tons" claim, but that wasn't the case. You should probably say something like "check the description, I put less than a ton but more than a few links in there for you" just to be safe. Let's not get bogged down with this "tons" business.

    Cool vid btw, I like cats.

  • Reply GraveUypo November 3, 2019 at 3:23 am

    why everyone's working with images but seemingly no one uses theses for sound? i want an algorithm that upsamples low quality sounds, another that is able to isolate specific sounds from a jumbled mess (like a voice from a crowd, or a single instrument from a song), or maybe real time conversion of input speech into an entirely different voice, not just altering the input with filters, but recreating the speech from scratch using an AI. so many cool things to do with sound and no ones does any of them :

    i mean, except for the synth voices that have gotten so convincing over time. but that's text-to-speech. i want speech-to-speech, preserving (or altering) intonation and all the nuances.

  • Reply MikesGameLab November 4, 2019 at 2:48 am

    This could be a big deal for 2d animation. Draw some line art and a detailed illustration, and you might get extremely detailed frames!

  • Reply nasragiel November 4, 2019 at 10:57 pm

    If they get this to work with 3D objects it would be awesome for game development 🙂

  • Reply Daniel Oliveira November 5, 2019 at 12:18 pm

    1 word: rule 34

  • Reply keYserSOze80 November 5, 2019 at 4:02 pm

    As an artist this breaks my heart.

  • Reply DaKussh November 6, 2019 at 3:17 am

    can you please write your full name?

  • Reply Sudipto Borun November 6, 2019 at 11:43 am

    "Dear scalar this is Karol Jorie Faherr"

  • Reply ガMingo - MGx November 8, 2019 at 3:47 pm

    Do NOT tell the weebs this

  • Reply featheredmusic November 9, 2019 at 8:21 am

    "Ice cream for my eyes" this is my new favourite line.

  • Reply marlino321 November 9, 2019 at 8:31 am


  • Reply Jacen Solo November 9, 2019 at 9:48 am

    I think I'm in love

  • Reply Ian November 10, 2019 at 3:48 am

    cant wait to use this on my hentai

  • Reply Mike Jenkins November 10, 2019 at 9:12 am

    Thanks for working towards putting creatives out of work.
