Can you tell which sentence was spoken by Google's AI? New audio samples are tough to distinguish from humans

Google's text-to-speech system, Tacotron 2, is made up of two deep neural networks. The first translates text into a spectrogram, a visual representation of audio frequencies over time. That spectrogram is then fed into WaveNet, a system from Alphabet’s AI research lab DeepMind, which reads the spectrogram and generates the matching audio.
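To make the idea of a spectrogram concrete, here is a minimal Python sketch that turns a short synthetic tone into one. It is a toy illustration only, not Tacotron 2 or WaveNet: the function name `spectrogram`, the 8,000 Hz sample rate, the frame sizes and the 1,000 Hz test tone are all assumptions chosen for the example, and real systems would use an optimized FFT library rather than this slow, dependency-free transform.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram as a list of frames.

    Each frame is a list of magnitudes, one per frequency bin,
    covering 0 Hz up to half the sample rate.
    """
    # A Hann window tapers the edges of each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive discrete Fourier transform of one frame (slow but simple).
        frames.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ])
    return frames


SAMPLE_RATE = 8000  # assumed sample rate for this toy example
FRAME_LEN = 64      # assumed frame length in samples

# A 0.1-second pure tone at 1,000 Hz stands in for recorded speech.
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(800)]
spec = spectrogram(tone, frame_len=FRAME_LEN)

# Find the frequency bin holding the most energy in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
print(len(spec), len(spec[0]), peak_hz)
```

Each row of `spec` is one slice of time and each column is one frequency band, so the strongest column for this tone sits at 1,000 Hz. A speech system works with the same kind of grid, only with far richer content than a single tone.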

Quartz has posted six one-sentence audio clips, three generated by AI and three spoken by a human hired by Google. Visit the Quartz site to play the clips and see whether you can distinguish the human from the AI.

Although answers are not provided, author Dave Gershgorn notes “if you reveal the ‘page source’ and look at the filenames of each on the Google research website, one is labeled ‘gen,’ ostensibly to mark the generated sample.”

The system is currently trained to mimic only one female voice. Google would need to retrain it to speak in a male voice or a different female voice, Quartz notes.
