• by coldblues on 7/12/2024, 5:39:36 AM

  • by huac on 7/12/2024, 4:11:27 AM

    The samples were released a while back: https://google-research.github.io/seanet/stream_vc/

  • by judiisis on 7/12/2024, 5:41:19 AM

    What is the current best FOSS (or otherwise) implementation of a voice changer/anonymiser?

  • by udev4096 on 7/12/2024, 4:40:10 AM

  • by manishsharan on 7/12/2024, 1:20:57 PM

    Are there any use cases driving this? Is there a huge burning need for this technology?

    Are kidnappers and con men a huge under-served market that Google is hoping to serve? Are deepfake videos not convincing enough to meet the needs of fraudsters?

    I am totally against regulating AI but shit like this gives fodder to the other side.

  • by gnat on 7/12/2024, 4:03:19 AM

    From the poster:

    In this work, we propose a light-weight (~20M param.) causal voice conversion solution that can run in real-time with low latency on a commercially available mobile device. The key design elements are: (1) using a causal encoder to learn soft speech units; (2) injecting whitened f0 to improve pitch stability without leaking source speaker info.

    In our later V2 version, we found that f0 rescaling followed by an NSF-style harmonic-plus-noise conditioning (as is done in RVC) results in better quality.
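
    To make the two f0 ideas in that quote concrete, here is a rough sketch, not the authors' code. It assumes "whitening" means normalizing the voiced log-f0 contour to zero mean and unit variance, and that "rescaling" means shifting the contour so its geometric mean matches a target speaker's. The function names, the log domain, the whole-utterance statistics (a streaming system would presumably need running estimates), and the convention that f0 <= 0 marks an unvoiced frame are all my assumptions.

```python
import math
from statistics import fmean, pstdev


def whiten_f0(f0_hz, eps=1e-8):
    """Hypothetical helper: zero-mean, unit-variance log-f0 over voiced frames.

    Frames with f0 <= 0 are treated as unvoiced and mapped to 0.0.
    """
    voiced_log = [math.log(f) for f in f0_hz if f > 0]
    if not voiced_log:
        return [0.0] * len(f0_hz)
    mean = fmean(voiced_log)
    std = pstdev(voiced_log)
    return [(math.log(f) - mean) / (std + eps) if f > 0 else 0.0 for f in f0_hz]


def rescale_f0(f0_hz, target_mean_hz):
    """Hypothetical helper: scale voiced f0 so its geometric mean hits the target."""
    voiced_log = [math.log(f) for f in f0_hz if f > 0]
    if not voiced_log:
        return list(f0_hz)
    ratio = target_mean_hz / math.exp(fmean(voiced_log))
    return [f * ratio if f > 0 else 0.0 for f in f0_hz]


if __name__ == "__main__":
    low_voice = [100.0, 0.0, 200.0, 400.0]
    high_voice = [150.0, 0.0, 300.0, 600.0]
    # Same melody in two registers gives (nearly) the same whitened contour.
    print(whiten_f0(low_voice))
    print(whiten_f0(high_voice))
    # Move the low voice down so its geometric mean is 100 Hz.
    print(rescale_f0(low_voice, 100.0))
```

    The point of the whitened version is that the same intonation pattern sung in two different registers comes out as the same contour, so the absolute pitch of the source speaker is not passed through. The rescaled version keeps the contour in Hz but moves it into the target's range.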

  • by froglus on 7/12/2024, 3:45:44 PM

    is it like discord or just voice chat, because i like to have things twice!!

  • by neilk on 7/12/2024, 1:10:38 PM

    What are the anticipated use cases?

    I know of one: transgender people often would like to alter the timbre of their voice and spend a lot of time training their voice. At least for online scenarios, this can just do it.

    But other than that, AI voice-altering research seems like it mostly benefits scammers? I’m just wondering what they tell themselves they’re doing. I didn’t see this in the paper.