Earlier this week, Google introduced Gemini, a new AI model that now powers Bard and will do much more in the future. To show off Gemini’s capabilities, Google produced an impressive demo video that seemed too good to be true. In fact, it was, but Google has now explained exactly how it was made.
In the video, a person appears to be talking with Gemini live, and Gemini responds right away. But that’s not really what happened. Google admits the video was edited and condensed: the voice you hear is not actually Gemini’s, but was generated from the written text prompts rather than a real spoken conversation, and Gemini was shown still images, not the live video feed the demo implies.
Even though the demo was not a real-time interaction, Google’s blog post shows what Gemini actually did, which is still pretty amazing. It includes the still images and text prompts that were used. In the first few examples, Gemini recognizes the hand gestures for Rock Paper Scissors from individual pictures, then identifies the game when the gestures are shown together. Later, it follows the sequence of moves and recognizes who is winning.
The lead engineer from Google DeepMind also discussed the video on Twitter. He said it was made to give developers ideas for how people could interact with Gemini in the future.
Google is already using the mid-sized version of the model, called Gemini Pro, to power Bard. The smallest version, called Gemini Nano, will run on-device on the new Pixel phone. The most powerful version, called Gemini Ultra, will come out next year.
This helps explain how Google showed off Gemini’s abilities in its impressive, but not entirely genuine, first demo video. It’s nice to see what really happened behind the scenes.