Hey everyone,
For a research project, I wrote a small Python script that scrapes random public screenshots from prnt.sc and uses the Moondream2 vision model (from Hugging Face) to generate descriptions of the images.
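The core loop is simple; here is a minimal sketch of the idea (not the repo's actual code). I'm assuming prnt.sc links use short random codes and expose the image via an og:image meta tag, and the Moondream2 calls follow the Hugging Face model card's trust_remote_code API:

```python
# Rough sketch: fetch one random prnt.sc screenshot and caption it with Moondream2.
# The 6-character code format and the og:image scraping step are assumptions
# about how prnt.sc pages are laid out, not taken from the repo.
import io
import random
import string

import requests
from bs4 import BeautifulSoup
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

HEADERS = {"User-Agent": "Mozilla/5.0"}  # prnt.sc tends to reject bare requests


def random_code(length: int = 6) -> str:
    """Generate a random prnt.sc-style code (assumed lowercase letters + digits)."""
    return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))


def fetch_screenshot():
    """Fetch the image behind a random prnt.sc page, or None if nothing usable is there."""
    page = requests.get(f"https://prnt.sc/{random_code()}", headers=HEADERS, timeout=10)
    og = BeautifulSoup(page.text, "html.parser").find("meta", property="og:image")
    if og is None or not og.get("content", "").startswith("http"):
        return None  # deleted screenshot or placeholder page
    img = requests.get(og["content"], headers=HEADERS, timeout=10)
    return Image.open(io.BytesIO(img.content)).convert("RGB")


# Moondream2 usage as documented on the model card (encode_image / answer_question).
model_id = "vikhyatk/moondream2"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

image = fetch_screenshot()
if image is not None:
    enc = model.encode_image(image)
    print(model.answer_question(enc, "Describe this screenshot.", tokenizer))
```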
Demo here:
https://youtu.be/xEsHRepfzks
It’s fascinating (and a bit concerning) to see what kind of information can be extracted from completely random screenshots. Let me know your thoughts!
Link to repo: https://github.com/sensahin/VisionShot