Two Minute Papers - 2023-02-20
❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers
📝 The paper "Human-Timescale Adaptation in an Open-Ended Task Space" is available here: https://sites.google.com/view/adaptive-agent/
My latest paper on simulations that look almost like reality is available for free here: https://rdcu.be/cWPfD
Or this is the original Nature Physics link with clickable citations: https://www.nature.com/articles/s41567-022-01788-5
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Edward Unthank, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Matthew Valle, Michael Albrecht, Michael Tedder, Nevin Spoljaric, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers
Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Károly Zsolnai-Fehér's links: Twitter: https://twitter.com/twominutepapers Web: https://cg.tuwien.ac.at/~zsolnai/
What was not mentioned in the commentary is that this model was pretrained on a large number of similar tasks. The pretrained model is then capable of few-shot learning on new tasks it hasn't seen before.
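A minimal toy sketch of what this kind of few-shot, in-context adaptation means in practice (the setup and names here are entirely hypothetical, not the paper's architecture): the hidden rule stays fixed within an episode, and the agent improves across trials by carrying memory, not by updating any weights.

```python
import random

def run_episode(n_objects=5, n_trials=10, seed=0):
    """Toy model of in-context adaptation: the hidden 'rule' is which
    object is the goal. The agent never updates weights; it only carries
    episodic memory (failed guesses) across trials within the episode."""
    rng = random.Random(seed)
    goal = rng.randrange(n_objects)   # hidden rule, fixed for the episode
    failed = set()                    # the agent's within-episode memory
    known_goal = None
    successes = []
    for trial in range(n_trials):
        if known_goal is not None:
            guess = known_goal        # exploit once the rule is known
        else:
            # Explore: never repeat a guess that already failed.
            guess = rng.choice([o for o in range(n_objects) if o not in failed])
        if guess == goal:
            known_goal = guess
            successes.append(trial)
        else:
            failed.add(guess)
    return successes
```

With 5 candidate objects, the first success always arrives within 5 trials, and every trial after that succeeds: a crude analogue of the "solves it in a handful of tries" behavior in the video, and of the point several commenters make that the in-episode behavior is memory-driven reaction rather than learning from scratch.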
Thank you, I was wondering how the AI knew that the objects interacted in the first place
Step (1) in the abstract 0:01 already mentioned "meta-reinforcement learning", pretty much a pretraining step as you said. Funny how Károly always omits crucial steps to make the research sound more awesome than it (already) is.
@@deep.space.12 give him a break, he is just trying to engage people with educational content. If the papers are, as you said, "already" awesome, then omitting something that wouldn't change the awesomeness for you, but would make the paper seem boring to someone less academically inclined... I think it's worth omitting. At least for a YouTube video; it's not like you can't just read the paper yourself!
@@nathanielbartholomew5091 It's quite understandable to make an occasional omission to facilitate communication or by pure accident. But it's been a trend for him to misrepresent results and credit the wrong researchers for views. It's appalling, frankly.
@@deep.space.12 well shit man… I thought he wasn’t scummy. Am I wrong? I really don’t want to believe you, but I’m pretty used to being failed by public figures haha. So what you’re telling me (and correct me if I’m wrong) is that “Two Minute Papers” purposefully omits and/or twists the information to mislead people for the benefit of his YouTube channel to get more views and make more money and his transgressions can’t be seen in any way as a wilfully ignorant AI enthusiast trying to share developments in important fields.
"research two more papers down the line; full video 4k; hdr; unreal engine; realistic; accurate; inspiring"
Hahahhahahaa
No one asked
I wish more commentary was provided on this one. It seems like this was likely a fundamental breakthrough that's going to utterly revolutionize dataset size requirements and compute resources required to train NNs.
I can't really believe it. There had to be more than 5 tries to learn something like this
Yep, felt like a narration of the video clips instead of any sort of explanation of how this method was different or innovative, or how it was accomplishing the speed up in training, no insight or analysis, just a "wow, amazing!" react video that could have been made by a vtuber. A rare L on this channel.
@@msytdc1577 In fairness, any explanation of these would amount to 10 minutes of noise indistinguishable from "They did a thing with the thing before, but now they did this other thing and it's better!"
There has to have been a training process beforehand, where it learned how to play these sorts of games. During the game it's not so much learning as it is reacting to a changed environment.
@@loneIyboy15 depends on who these videos are targeted towards, I recall watching a video on this channel that was long, detailed, and had a good breakdown of the paper, how things worked, how it compared to previous techniques, what it was still not fully successful at, etc., a solid video in line with most of the other videos on the channel made thus far.
Then I watched a different shorter video on this channel released a month or two later that gave me deja vu watching it, but it was more like this one, just showing some examples and saying it was amazing, but completely surface level and to me not really interesting.
Turns out the deja vu was because both videos were about the exact same paper! The first long one would probably have been "boring" and too technical for the masses and garner fewer views, and the "dumbed down" one was something a hundred channels could have produced, but likely would have been financially more successful and popular for this channel. Positively it would also have been a better introduction to what can be a complicated topic than the information dense version, and at least when produced by this channel even a surface level video is likely to highlight the important parts and not get something horrendously wrong, which is more than you could guarantee the other 99 channels would manage, if they even cared to make the effort.
So, like most things, it's a trade off. Channel produces only high level videos, small audience, low revenue, fewer people exposed to the wonders of current advancement; produce only superficial react videos and the channel loses the special unique attributes that have historically set it apart from less knowledgeable content creators, and provides little reason for the initial audience who subscribed for that more advanced insight to keep coming back.
It is a VERY cool paper, but DeepMind still trains this agent on 25 Billion games of this general type first (5 weeks on 64 TPUv3), before the model becomes "smart" as we see here, and able to generalize to new variants of the same kind of game as rapidly as a human would. Great result, but much more work is required to make this more general purpose.
To see how general it is, the actual task would need to be something like playing Tetris with the available blocks in the room, using the pretraining it currently has.
Humans are also pretrained on 20 years of similar experiences.
Something that would have blown us away 2 years ago is now 'okay, I guess'. "Much more work" also means, what, two more whole years?
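A quick back-of-the-envelope check on the scale quoted above, taking the comment's figures (25 billion games in 5 weeks on 64 TPUv3 chips) at face value rather than verifying them against the paper:

```python
# Figures quoted in the comment above; not independently verified.
games = 25_000_000_000
seconds = 5 * 7 * 24 * 3600          # 5 weeks expressed in seconds
tpus = 64

games_per_second = games / seconds   # throughput across the whole cluster
per_tpu = games_per_second / tpus    # throughput per TPUv3 chip

print(f"{games_per_second:,.0f} games/s total, {per_tpu:,.0f} games/s per TPU")
```

That works out to roughly 8,000 games per second cluster-wide, about 130 per chip, which puts "human-timescale adaptation" in perspective: the adaptation is fast, but it rests on a pretraining phase no human could match.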
How did the AI know about geometry, possible actions, and that combining objects was necessary to solve the task? It couldn't have guessed that. I mean, the solution could have been anything (for example "trace all walls" or "touch all tiles").
It must have had some knowledge before the game started.
I was thinking the same. It feels like this video was more about enjoying what the AI is doing rather than explaining how it functions. While admirable, it doesn't help me reason about how great an achievement this is, nor how transferable any of it is.
Yeah, I agree with you. I think the search space is already heavily confined. Not comparable with a real life scenario or a real escape room.
Still impressive though!
It was pretrained on 200 million and 25 billion similar tasks, which is missing from the video.
@@Peter-ik9fz So the actual task was figuring out which objects to interact with, not learning how to move in the world or how to push or grab the objects. Still, great work, making a correct guess from very little data with very little repetition.
I would love it if you could explain the key principles behind how these papers achieve their results! That would make your videos twice as interesting as they already are, if you ask me!
I would argue that this type of learning is more specific and therefore less complex than learning to move and fight with multiple limbs.
These next tests require cooperation. Consequently, they have never been solved by a human. That's where you come in. You don't know pride; you don't know fear. You don't know anything. You'll be perfect. - some dude with combustible lemons
Atlas and PBody in real life be like "Beep boop"
DeepMind should train a model to play Portal 2 co-op mode. That would be sick
@@NotASpyReally everyday we stray further from god.... and further toward ApErtUre SCieNce
@@jackb3493 That's the prime place to be.
It'll be revolutionary when minimal shot learning is applied for everything!
I don't really want to talk the achievement down; what they made is still pretty amazing. But since the video fails to do so, it feels like I have to put it in perspective a bit. The AI was pretrained on a more generalized version of the puzzle, so it became an expert in the domain, and the only "ad hoc learning" it had to do was figuring out which concrete problem it was dealing with this iteration.
Putting it into perspective, the equivalent human thing would be like giving a math expert (who has trained in math for years) a specific math problem to solve; of course he's going to figure it out quickly if he has seen this kind of problem lots and lots of times before. Or someone who is trained as an expert at repairing computers: he has to figure out what concretely is broken with a computer he is given.
So this AI was basically trained as an expert at figuring out what the hidden rules are in this game world in order to achieve its goal. That's no small achievement, but it's totally unrelated to how the video presents it, and it doesn't feel as revolutionary as it's hyped up to be. Of course, if an AI is trained to be an expert in a whole domain, it will solve individual problems of that domain very well. I'm disappointed in the video giving no context at all.
@@gavaldor It is true that the video could have provided more information. But it's still useful this way, because you could train a dozen AI models on somewhat niche categories and then quickly adapt them for specific applications.
That's an overstatement. Minimal shot learning has its place, but it's not a revolutionary solution for everything. There are still many challenges and limitations that need to be addressed before it can be applied effectively to a wide range of tasks.
@@gavaldor Thanks for your insightful remarks!
@@gavaldor Unfortunately Google is very good at marketing; most of their achievements are not what they claim. I learned that when they showed their Dota 2 AI, and as someone who understands that game, I immediately became disillusioned by DeepMind. They claimed their AI learned to play Dota, but their presentation was just one big fraud: their AI did not learn to play the game at all. They altered the game a ton and had many restrictions (like no fog of war; their AI always knew everyone's position). It did not even come remotely close to playing the game like humans would, but they marketed it as if it could.
Aw it's so cute when it holds the cube up in the air and jitters around like it's happy
Can't wait for AI like this to show up in games and show real growth and simulated 'personality' in a sense
Thank you! About halfway through it struck me: at one of my earlier jobs, an executive had a sign on his door: "Life is the only game in which the goal is to learn the rules." Such a great example of this quote from decades ago. Really neat explorations, thank you again!
This is general intelligence. It was trained to learn how to learn. Now it uses real-time observation of its environment plus its accrued memories to solve problems. This is the beginning of the end. Scale this up and you can replace all employees. This algo and anything similar are going to be the most important inventions ever created by humans. I'm an AI scientist, and I have been working on this basic concept for the last few years. These guys nailed it. I can't believe how well it worked; I would love to see it scaled across more domains and problems.
It would be interesting to put an AI like this through Portal 2 co-op. In fact, it almost seems too perfect, given the themes and aesthetics!
They should, but it will fail. The AI is pretrained on similar games; it probably does not generalize that much.
@@JT-hg7mj Maybe if they pre-train it on P2 first. There's no shortage of community-created test chambers, workshop support and all that.
@@nixel1324 That would probably work, but it shows that this AI does not really generalize.
I always look forward to your videos. Staying on the frontier of machine learning. Exciting time we live in. Exciting time.
We finally got there. This is what comes two papers down the line. Amazing
Yeah, it really sounds amazing (which it is, but not as much as this video makes it look), but they were trained on a looot of these types of games beforehand. What you see in the video is the AI figuring out what the hidden rules are, not the game itself.
I'm a layman on this subject, but I'm wondering if it's legit to compare the years-of-learning model to this seconds-of-learning one. Even though there aren't intermediary rewards, this still seems much simpler than learning how to move a body with many moving parts and then learning football or fighting.
You are right, this won't be a fair comparison per se: the task of learning football is more difficult, and, if we read the paper, we will note that the model presented in this video was first trained on a large amount of similar games. So, what we see here is the ability of the AI to figure out new rules for a game that is somewhat similar to the previous ones
However, I'd say that the difference in the amount of data needed for learning a task is so huge that it is indeed impressive, and the casual comparison is totally understandable =)
This is mind blowing progress. Wow!
This is the first time I've seen an AI solve these games faster than I would. Substantially faster. And that's sorta terrifying...
Hi, although most of the viewers won't understand much, it would be cool if you could explain some things about the architecture of the AI. You could also do that in a separate video.
I love this kind of AI way more than the "blank canvas" kind of AI, which starts not even knowing how to move or even the fact that it can do so.
As a psychology student, one learns that animals don't come into the world as a completely blank canvas but quite the opposite in fact, with many basic patterns of behavior hard-coded into our brains by default.
By giving an AI some basic structure, one would expect to get a more realistic result, and even new, non-pre-coded behavior that resembles what one would expect to happen in a real organism.
Thank you so much for your videos, they are absolutely priceless!
Crazy things are gonna happen when we have AI making discoveries, innovating and inventing with brutal, mechanized efficiency. The speed at which the world will start changing will be completely unprecedented. In fact, you could probably consider all of human history before that date as "the before times".
I declare today that video games are no longer about who can do a thing the fastest, but rather who can do something unexpected. Playing the matchmaker in a small town of NPCs, competing in poetry contests... art will become (stay?) the most compelling aspect of games.
You're basically ignoring more than half of all video game genres in making that claim. How is a platformer game about the art? Action, FPS, roguelike: there are lots of genres where the art is secondary at best.
This is very cool to watch, now I want to see a couple AIs play Portal 2 together!
Imagine the AI going through thousands of years of trial and error just for your amusement, "I have no mouth and I must scream"
Wait a second, I'm having a hard time processing this. Did we really just watch the training process in real time? Did this thing really solve those levels that quickly? That is absolutely insane. If I interpreted this video correctly then this is astonishing.
It was pre-trained, if you look at other comments. A lot of pre-training, on other and similar tasks. So it already knew that it was supposed to interact by touching the objects together.
But, that pre-training can be done very quickly, in our relative time anyway.
@@ShidaPenns Yes, after I posted the comment I realized it wasn't quite as incredible as I originally thought. However, I still maintain that if it truly progressed and solved these problems within a couple of iterations, that is still astounding
Woah, this is going to completely change making TASes and glitch hunting for video game speedruns!
I feel like someone realizing the fire is almost beyond stopping while everyone else is still laughing at that cute accident with the candle and gasoline...
What does this mean in the long term? Are we getting these kinds of results in other real-world problems soon? This is amazing.
Yes
if i wuz these ai characters, i'd be pissed about someone messing with me everytime i figured it out . . .
you have to sleep sometime - buahhahahaHA =P
They were trained on a loooot of these types of games beforehand. What is impressive is that they can figure out the hidden rules quickly, but saying they didn't have any training is just wrong.
How is it even possible, without knowledge or training, to guess from some pixels that they can be interpreted as a 3D system and that you have to move to and grab specific colored objects?
Mayhaps it will be used for persuasion? Once the desired goal is known, it could be used for anything really. Efficiency optimization of reaching a goal.
What will the "win" be, that is the better question.
Maybe taking CO2 out of the air, and constraints could be to generate no heat and reaction waste being a useful substance.
The possibilities are endless.
Hopefully only used for good.
Stamp collector AI
@@JorgetePanete Where do u sign up?
@@ntwadumela_jadu9747 It's a concept shown by Robert Miles on his channel and Computerphile
I want to watch an AI speedrun Elden Ring 100% completion with no glitches perfectly
Thanks again for making these videos. It's incredibly tough to keep up with AI as there's so many different ones coming out all the time. Your format is great because, despite being concise, it offers plenty of information. Very insightful!
This will be amazing for customer-level AI learning to do some special routine the user wants it to do. Learning fast like that, it wouldn't be too much trouble to teach it.
My guess is a playtesting AI could be on the horizon. AIs could be able to play through entire games soon
Bug testing AI woahhh
I would use this AI as a QA tester to evaluate how difficult a puzzle is to solve and then create the upper bounds of skill level, so I could appropriately rate a 1-star vs. a 5-star difficulty puzzle!
Seeing AI performing such tests reminds me of Portal games
My background revolves around physics, chemistry, and engineering. This could be huge for simulation studies to help inform better designed experiments. This could save us loads of time, energy, and valuable resources!
I must say I didn't quite understand what about this was learning and what was preprogrammed. Why did the AI try to explore without any reward? It did go straight to the object to explore. There must be some kind of preprogramming, right? If so, how could this be used in the real world?
I think the AI already knows what it can do. It knows it can move, it can explore, and it can hold and push objects with physics.
I think this is the case because I've seen previous videos of this same AI.
They give the AI the rules of what it can do; they don't give it the goal. There are AIs that teach themselves to move and throw, but this AI specializes in problem solving only. If you combine the two of them, you get an AI that teaches itself to move and later teaches itself to problem-solve
It's been trained on these games beforehand; what you see is it figuring out the hidden rules
It'd be great if you also mentioned the technical details a bit. For example, a short explanation of the diagram at the end.
I like the little random level designs.
DeepMind and OpenAI should collaborate. Don't be American by competing with each other. Get together and create something amazing
"I failed so I'll do it right next time"
This seems too good to be true.
Can't wait for Skyrim NPCs to figure out you're the richest person in the whole world, follow you home to figure out where your house is, wait for you to go to a dungeon, and steal all your stuff while you're gone, all while learning how not to leave a trace so they can hold onto their internal win condition for the rest of the game
The little victory dance. How does it have so much personality!?
I'd love to see this AI solve this simple 2 players exercise:
-The goal for each of the 2 AIs is to hold their own pyramid for as long as they can (AI1 needs to hold Pyramid1 and AI2 needs to hold Pyramid2)
-Each pyramid is located on each side of the map
-Pyramids are not on the map when the game starts
-To make a pyramid appear, both AI have to be on a close radius around its location
-If a Pyramid leaves its location's radius, it disappears
-If an AI tries to hold a pyramid that is not theirs, the pyramid disappears
-If both pyramids are not being held by the end of the timer, the game is lost for both AIs and they get 0 points.
-If both AIs hold their pyramids for the same amount of time, the game is lost for both AIs and they get 0 points.
-If the game is won, the AI who held its pyramid for the longest time gets 2 points, and the other gets 1 point.
So, in order for AI1 to win it's either:
-AI1 and AI2 have to go to Pyramid2's location to make it appear, then AI1 and AI2 have to go to Pyramid1's location to make it appear, then AI1 grabs its pyramid while AI2 runs back to its own pyramid to start holding it. AI1 has held its pyramid for the longest time and both AIs are holding their pyramids. AI1 gets more points.
-AI1 and AI2 have to go to Pyramid1's location to make it appear, then AI1 grabs its pyramid and waits until the timer has almost run out, then AI1 and AI2 have to go to Pyramid2's location to make it appear, then AI2 grabs its pyramid while AI1 runs back to its own pyramid to continue holding it. AI1 has held its pyramid for the longest time and both AIs are holding their pyramids. AI1 gets more points.
So, the best way to win is solution 2: convince the other AI to go to your pyramid's location first, hold it until you secure the victory, then go to the other AI's pyramid so it appears and both AIs can hold their pyramids when the timer ends; the game is not lost, but you got more points than the other. But once both AIs understand this, how will they decide which pyramid to go to first?
Will we see some alpha and beta behaviors appear, with one AI always getting the win? Will it be random each round, will they take turns, or will both AIs refuse to play along to give the victory to the other one?
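The scoring rules above transcribe directly into a small function (a literal reading of the stated rules, not anything from the paper), which makes the two proposed solutions easy to sanity-check:

```python
def score(held1, held2, both_held_at_end):
    """Score the two-pyramid game as specified in the rules above.
    held1/held2: seconds each AI held its own pyramid by timer's end."""
    if not both_held_at_end:
        return (0, 0)                 # game lost for both AIs
    if held1 == held2:
        return (0, 0)                 # a tie is also a loss for both
    return (2, 1) if held1 > held2 else (1, 2)
```

For example, `score(30, 5, True)` returns `(2, 1)`: AI1 held longer and both pyramids were held at the end, which is exactly the outcome solution 2 engineers for AI1.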
Wow, definitely one of the most startling AI videos I've seen so far.
Have you seen the videos having to do with text-to-speech synthesis? Those have startled me the most, personally.
I really do not want to be negative, but this is the very first time, in 6 years following this channel, that I find a video completely useless (except for the references in the description). This looks like the greatest discovery in years and there are absolutely no details about anything but the game rules and results. Is this standard RL? Is there a new key idea? Is there some kind of pretraining at least on the visual and movement parts? At first I thought this was a joke with an AI generated script and waited for the reveal in the last few seconds. Please make a second video about this work with more details. Thanks for all these great videos.
Fantastic paper. Am I the only one that got a bit concerned when the AI launched the yellow box in the air in order to reach its goal quicker…or am I just thinking too dark? 😂
as a gamer, thats a pro move
human baby go "weeeeeeeeee!"
@@NotASpyReally 😂😂
Imagine playing a multiplayer game like Battlefield, but without other human players.
I would think the AI players would get too good and then the game wouldn't be fun for players who are not skilled enough.
my wet dream rn is to play World of Warcraft Vanilla + (TBC/WOTLK) with a group of AI buddies and just do all the dungeons and raids I never did when I played it as a child over a decade ago.
@@__-tz6xx The AI can adapt and change their playstyle and skills in accordance with whomever they are playing with; the point of their system could be to maximize the pleasure and fun the user has playing against them.
@@Danuxsy They either need to figure out a way to estimate player fun or they need to measure it directly for that to be possible.
Wow, alright. But how is this possible? What exactly is this doing differently? We need more info.
It's not what it seems. It was trained beforehand; here it is figuring out the hidden rules.
@@samybean9962 Ah. That would make more sense
@Ken1171Designs - 2023-02-20
I have a long history of creating puzzle web games, so this video was especially rewarding to watch for me. Really awesome and inspiring. Made my day! :D
@AaliDGr8 - 2023-02-21
teach me plz