Gpt went like: U SAID COLUMNS, YOU GOT THEM.
“Ok go make a castle” Llama: “birthday cake got it”
9:33 the way Llama kept trying to flip the switch as if it’ll make the lamp light up somehow 😂
When are we getting their group speedrun attempt to beat the dragon?
10:28 that "does anyone know where i can find some" makes me really want to see these ais try to do something together. like just set them loose in survival mode and see what happens
the aliens gave egyptians creative mode? I see!
It's actually cute watching these AI models try
It is still interesting to see these LLMs do their best at understanding how to build in Minecraft. I wonder if more of them will ever get image-scanning abilities — you could let them take pictures of builds or the environment so they can see what they built and auto-correct.
The future of gaming looks amazing; imagine having multiple AI bots that help you in your world, all Dwarf Fortress-style base building.
4:08 In Llama's defence, I can see how those could be described as one-block columns spaced 'one block apart' as requested at 2:48, it's just included the column itself in the measurement of 'spaced'.
@Emergent Garden Recommendation from me: your prompts have no leverage. What I mean is that the LLM doesn't handle complex building tasks well because it's limited by the single-shot answer it needs to generate. Your template for "NewAction" is a great idea; my idea to improve its leverage is to add another template, "NewActionPlan", which it fills with a list of generated prompts that are then fed back into itself one after another (kind of like writing a todo list before getting started). My vision for it is something like this:
-You whisper "Build a bridge for me"
-<LLM> "Okay, let's plan this out" used NewActionPlan
-<LLM> "Okay, let's see what's first on the todo list..." used actionPlan[0]
-<LLM> "Sure, I will build the supporting pillars" used NewAction
...etc.
Getting a shared reference point for superimposed building actions is of course something to consider. Using plans recursively might also be interesting, like making a plan for planning multiple plans for even more abstracted tasks. Some way of sensing the world is also possible: maybe you could let it take screenshots of the game and feed the image into one of the multimodal, image-recognition-capable models.
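The plan-then-execute loop described above could be sketched roughly like this — a minimal, self-contained mock where `fake_llm`, `new_action_plan`, and the `PLAN:`/`ACTION:` prompt prefixes are all hypothetical stand-ins, not anything from the video's actual codebase:

```python
# Sketch of the "NewActionPlan" idea: the model first emits a todo list,
# then each item is fed back to it as its own single-shot action prompt.

def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; returns canned responses for the demo.
    if prompt.startswith("PLAN:"):
        return "build supporting pillars\nlay the deck\nadd railings"
    return f"// code that performs: {prompt}"

def new_action_plan(task: str, llm=fake_llm) -> list[str]:
    """Ask the model for a todo list, then run each step as a normal action."""
    steps = [s.strip() for s in llm(f"PLAN: {task}").splitlines() if s.strip()]
    results = []
    for step in steps:
        # Each step becomes an ordinary single-shot "NewAction"-style prompt.
        results.append(llm(f"ACTION: {step}"))
    return results

for action in new_action_plan("Build a bridge for me"):
    print(action)
```

With a real model behind `llm`, each step's result could also be appended back into the context, so later steps share a reference point with earlier ones — the "superimposed building actions" problem the comment mentions.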
Doing this without computer vision is interesting, and it really makes me appreciate how incredibly complex the human brain is to be able to do so much in real time. Imagine the resources needed to give a multimodal model with vision/language/action the ability to play in real time — the power requirements — when we can just eat for energy.
you know we are doomed when GPT doesn't know how to do an OR gate but loves TNT
GPT4 lighting the portal on itself and going in was absolutely whiplash inducing
11:12 its so cute how Gpt ran away 😭
I don’t know exactly how your system works but have you tried letting them use something like mathematical curves for building? Like vectors at positions pointing to positions with some formulas on top if required? Another thing you could do to help them out is allow them to write classes per object in a build. I think this would be great for things like columns because they then realise there would be spatial rules like spacing.
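The curve idea above could look something like this — a minimal sketch where `arch_positions` and its coordinates are purely illustrative, assuming the bot just needs a list of integer block positions to place:

```python
import math

# Instead of placing blocks one by one, the model emits a formula and we
# sample integer block positions from it. Here: a semicircular arch of
# radius r, centered at (cx, y0, cz) in the vertical plane z = cz.

def arch_positions(cx: int, y0: int, cz: int, r: int, samples: int = 64):
    """Sample a semicircle and snap each point to block coordinates."""
    seen = set()
    for i in range(samples):
        t = math.pi * i / (samples - 1)      # t in [0, pi] sweeps the arch
        x = round(cx + r * math.cos(t))
        y = round(y0 + r * math.sin(t))
        seen.add((x, y, cz))                 # dedupe after snapping to ints
    return sorted(seen)

blocks = arch_positions(0, 64, 0, 5)
print(len(blocks), "blocks, e.g.", blocks[:3])
```

The same pattern extends to columns with spacing rules: a per-object class could expose parameters like radius and spacing, and generate positions from them instead of asking the model to reason about every individual block.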
i think it would be cool if you let every iteration build a skyscraper and add them all to a single city, which would then grow with skyscrapers that slowly get better, so you can see the improvement in one place
1:28 llama(animal): who the heck are you? Llama (AI): im you but better.
Gemini 1.5 has been generally available via Vertex AI for a couple of days now. You can also create an API key via AI Studio — it's not just their chatbot interface, and it's a little easier to create an account there.