r/Blind 14d ago

Technology Have you seen the new feature offered by Gemini?

https://aistudio.google.com/live

I haven't seen this come up here yet, which surprised me. Google recently released a new feature for its AI. You can now show it your surroundings in real time, for free, and it runs fairly smoothly. It answers you as if it were a Be My Eyes volunteer, for example. It can find objects at short distances, and it can even read the expiration dates of certain products, the flavors of yogurts, etc. I'm really amazed, so here is the link if you want to test it. Enjoy, and see you soon.

18 Upvotes

5

u/jamestoh 14d ago

I have. I've been using it to describe game menus for me lol

3

u/highspeed_steel 14d ago edited 14d ago

Pretty dumb at tech here. Someone told me to go to AIstudio.google.com/live a few days ago and it doesn't work now. How do I access it?

2

u/ChipsAhoiMcCoy 14d ago

Yeah, it’s insane. It’s actually so much better than the real-time vision that OpenAI announced recently as well. I’m not sure what the deal is with OpenAI, but it seems like they’re quantizing all of their models once they release them to the public and they just end up being significantly worse. The vision model made so many mistakes today that it borderline isn’t worth using, in my opinion. This Gemini model though? It’s genuinely fantastic.

1

u/DHamlinMusic Bilateral Optic Neuropathy 14d ago

They rolled a scaled-down version into the Lookout app, and it seems like that's eventually going to be more full-featured.

1

u/Disastrous_Tap_3155 14d ago

So glad y'all already tried it. I was considering the subscription since the demos seemed helpful, but thanks, Google, for doing this for free.

1

u/janneroblind 14d ago

No, what is the name of it?

1

u/mi1ky_tea 13d ago

Okay, this may be a dumb question, but how do I get Gemini on my Galaxy S22+? So far it seems that I still have the regular ol' Google Assistant.

-9

u/razzretina ROP / RLF 14d ago

I don’t care how much I can’t find something, I’m not giving Google of all things access to my camera. The company famous for telling us to make chlorine gas, eat glue, and fail to spell strawberry is not likely to be able to describe anything in a worthwhile way.

God. I can’t wait for this planet killing crap to finish failing already. It’s not worth wasting our drinking water on.

4

u/Vic_3300 14d ago

Umm, I understand that most of us have some very reasonable reservations about AI, but it is hard to argue that it hasn't been beneficial to the blind community in so many ways. I've used it to describe pictures, convert messy PDFs to well-formatted text, and describe maps and geographical features. While it is not accurate 100% of the time, based on fact-checking and verification from my friends and family, it's overwhelmingly pretty damn right on.

1

u/Disastrous_Tap_3155 14d ago

Would love to hear the name of the tool you use for messy PDFs. I have to resort to Google Docs as a college student and it's not a good conversion process.

1

u/Vic_3300 14d ago

Uploading them to ChatGPT and asking it to give me all the text seems to work fine most of the time.

1

u/DHamlinMusic Bilateral Optic Neuropathy 14d ago

Gemini app can do this, up to 10 files at once.

2

u/Effective_Meet_1299 13d ago

I mean, I get it but really? Reddit aren't exactly angels you know. Also, AI has changed so much for the blind community. Look at apps like PiccyBot or even Seeing AI, how do you think that works?

2

u/ChipsAhoiMcCoy 14d ago edited 14d ago

It has been pretty phenomenal for me so far. Did you have a bad experience with accuracy?

Also FYI, I’m pretty sure the glue comment was proven to be false. From what I understand, someone just used inspect element to change what the results said and the screenshot spread like wildfire.

If not that particular comment, then there must’ve been another copycat incident where people just used inspect element to change the result. I don’t rely on these AI search engines when I need accuracy in my workflow, but if it’s a question that I’m passively curious about, it doesn’t really hurt anything to ask. Also, with the vision capabilities this particular system has, it’s incredibly useful. If you don’t want Google to screw with your information, then quite frankly, just don’t stream anything sensitive to it. You can choose to share a window rather than your entire screen, so if you’re just using it for describing a video game for example, just use window share. It’s not like anything private is going to be happening in your game that someone else around the world hasn’t done 1 million times over.

I should also note that this new model is the highest scoring image model on the planet right now, and this is just the flash model. There hasn’t really been a single instance I’ve had where it’s been horrendously inaccurate at all. It’s pretty mind blowing.

1

u/tymme legally blind, cyclops (Rb) 14d ago

The company famous for telling us to make chlorine gas, eat glue, and fail to spell strawberry

Posting on Reddit, which has said the same and far worse, totally fine.

But I get it. Just go back to Apple Maps telling you to turn left off the middle of a bridge, if their thousand-plus dollar hardware didn't fold in your pocket or completely melt from bad chips before you can use it.

-1

u/Vic_3300 14d ago

I have a feeling that person is one of those capitalism-hating, terminally online tech nihilist types lol. Again, I think most of us have some understandable concerns about the wider implications of AI, but that person just seemed sad.

1

u/Fuzzy-Identifier 13d ago

You’re cutting off your nose to spite your face