AI Voices

Recording the voice

Follow these guidelines to maximise the quality of the AI voice:

Record at least 1 minute of audio
Keep the audio consistent
Replicate your performance
Find a good balance for the volume

How to prepare your voice training samples

When creating your custom voice, you will need one or more training samples (recordings). Record them with the same equipment (microphone, etc.) and on the same set as the rest of the video.

Must haves:

The samples need the same audio grading and mixing that the final video will have.
The samples need an emotional range that reflects the rest of your video; they can't be overly monotone.

Nice to haves:

If you intend to use the AI voice for name reading, the training data should include a small set of actual name readings, e.g. 5-10 greetings. This helps increase consistency.

<aside>

Training sample requirements

Accepted file types: wav, mp3
Minimum length (per sample): 10 seconds
Maximum length (per sample): 3 minutes

</aside>
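The training sample requirements can be checked before upload. The following is a minimal sketch in Python, not an official tool: the names `check_sample` and `wav_duration_seconds` are hypothetical, the limits are assumed to be inclusive, and the duration helper only covers wav files through Python's standard `wave` module (reading the length of an mp3 would need a third-party library).

```python
import wave
from pathlib import Path

# Limits taken from the training sample requirements above.
ACCEPTED_EXTENSIONS = {".wav", ".mp3"}
MIN_SECONDS = 10
MAX_SECONDS = 3 * 60


def wav_duration_seconds(path: str) -> float:
    """Return the length of a wav file in seconds (wav only)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_sample(filename: str, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the sample passes.

    Assumption: the minimum and maximum lengths are inclusive.
    """
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or 'none'}")
    if duration_seconds < MIN_SECONDS:
        problems.append(f"too short: {duration_seconds:.1f}s (minimum {MIN_SECONDS}s)")
    if duration_seconds > MAX_SECONDS:
        problems.append(f"too long: {duration_seconds:.1f}s (maximum {MAX_SECONDS}s)")
    return problems
```

For a wav recording, `check_sample(path, wav_duration_seconds(path))` returns an empty list when the file meets all three requirements, and otherwise one message per failed requirement.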

Ensuring high-quality voice generation

The quality of the generated audio depends mainly on two factors:

The quality of the original training data
The creative itself (e.g. how the generated audio is implemented in the script)

The more weaknesses there are in these two areas, the less lifelike the end result will be.