r/Blind • u/janneroblind • 1d ago
Can Google Gemini describe live situations?
I have seen a video where someone used ChatGPT and asked what was in front of him. Does that work with Google Gemini as well?
1
u/VixenMiah NAION 1d ago
Judging by the adverts, it does everything but cook you breakfast. That will probably be in the next update.
1
u/Bachelor-pad-72 1d ago
No, as other people have said here, it's only taking a picture when you ask the question and describing it. Amazing, amazing stuff, but it's not yet describing a video feed.
1
u/1makbay1 1d ago
I have an iPhone, and it now has “live recognition” available through VoiceOver, which you can set to “scene” and it will constantly describe things in front of you as you move your phone around, but you can’t ask questions. I’m guessing this type of constant live description will probably happen for other AI interfaces as well, but I don’t know when there will be an interactive element added to a live description. It would be nice to be able to ask questions and get responses along with the live recognition feature.
1
u/janneroblind 22h ago
How do I do that with VoiceOver?
1
u/1makbay1 20h ago
The best way to get to it reliably is to go to Settings > Accessibility > VoiceOver, then go to the rotor and add Live Recognition to your rotor. When you turn the rotor to Live Recognition, swipe up and down and double-tap to choose the things you want the live recognition to find, such as doors, people, etc. The “scene” setting tells you all the things the camera is pointed at. If you have a recent enough iPhone, such as a 12 Pro or later, the LiDAR will also tell you how far away some of the things are.
1
u/Embarrassed-Bison767 1d ago
You constantly have to ask it again. It won't give you live updates unless you keep querying it. Do not rely on this in dangerous situations.
2
u/DHamlinMusic Bilateral Optic Neuropathy 1d ago
Well, of course; the ToS of every one of these things, both for video and for images, says this, as do the ToS for both BeMyEyes and Aira calls.
3
u/Ok-Virus-2198 1d ago
OpenAI's ChatGPT, Google's Gemini, and Meta AI live mode will only describe the moment when you ask about the scene in front of you. It doesn't actually stream a video; it only takes a snapshot when asked.
In the future, Google's Project Astra plans to take one snapshot per second to describe changes in the scene. Most likely that's coming in 2025, but currently it takes a photo and describes it when asked.
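If it helps to picture the difference, here's a rough Python sketch of the two behaviors. The `capture_frame` and `describe_image` functions are placeholders I made up, not real Gemini or ChatGPT API calls; they just stand in for "grab one camera frame" and "send it to a vision model for a description."

```python
import time

def capture_frame() -> bytes:
    """Placeholder: grab one JPEG frame from the camera."""
    return b""  # dummy value for the sketch

def describe_image(jpeg_bytes: bytes) -> str:
    """Placeholder: send one image to a vision-language model and get text back."""
    return "a desk with a laptop on it"  # dummy value for the sketch

# What today's assistants do: one snapshot, one description, only when you ask.
def describe_on_demand() -> str:
    return describe_image(capture_frame())

# What an Astra-style live mode would add: poll about once per second and
# only speak up when the description changes, approximating a live feed.
def describe_live(poll_seconds: float = 1.0) -> None:
    last = ""
    while True:
        current = describe_image(capture_frame())
        if current != last:
            print(current)  # a real app would speak this aloud
            last = current
        time.sleep(poll_seconds)
```

The second loop is also why the warnings above matter: even a "live" mode is just repeated snapshots, so anything that happens between frames gets missed.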