How to Adjust Speech Style

ViiTor AI's Text-to-Speech (TTS) feature generates natural, lifelike voices and lets you customize the voice style for different scenarios. This guide explains how to adjust voice styles in ViiTor AI so that you can create more expressive and engaging voice content.

1. Log in and Access the Text-to-Speech Interface

  1. After logging in, navigate to the dashboard.
  2. Click 'Text-to-Speech' in the navigation bar.

2. Configure Voice Settings

  1. Select Voice Tone
    • From the preset voice list, choose your preferred voice tone (either from the voice library or your own cloned voice). Voice tags can help you filter voices that support emotional configuration.
    • If you wish to use your own voice, please refer to the instructions on Voice Cloning.
  2. Adjust Voice Parameters
    Adjust the following parameters based on your needs:
    1. Translation Direction:
      • Function: Automatically translates the input text into the specified language.
      • Adjustment Method: Select the target language from the dropdown menu, or keep it as 'No Translation' if not needed.
    2. Emotion:
      • Function: Adjusts the tone and emotion of the synthesized voice (for example happiness, sadness, anger, or surprise) so that it matches the emotional content of the text. Note: For best results, choose an emotion that matches the emotion of the text.
      • Adjustment Method: Select the appropriate emotion type based on the content. For example, happiness is suitable for advertisements, sadness for stories, and anger for dramatic scenes.
      • Recommended Values:
        • Advertisements: Happiness
        • Stories: Sadness or Neutral
        • Drama: Anger or Surprise
    3. Speed:
      • Function: Controls the playback speed of the voice.
      • Adjustment Method:
        • Fast: Suitable for time-sensitive content (e.g., advertisements, short videos).
        • Slow: Suitable for content requiring emphasis or emotional expression (e.g., poetry, stories).
      • Recommended Values:
        • Fast: 1.2x-1.5x
        • Slow: 0.8x-1.0x
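
The recommended emotion and speed values above can be collected into a small lookup table, which may help if you produce many clips and want consistent settings. The following is a minimal Python sketch, not part of ViiTor AI: the names StylePreset, PRESETS, and suggest_settings are hypothetical, and the resulting values are still entered by hand in the Text-to-Speech interface. The guide gives no speed recommendation for drama, so the sketch leaves that speed unchanged.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical helper: it only records the values recommended in this guide.
# It does not call ViiTor AI in any way.

FAST = (1.2, 1.5)  # recommended range for time-sensitive content
SLOW = (0.8, 1.0)  # recommended range for emphasis or emotional expression


@dataclass(frozen=True)
class StylePreset:
    emotions: Tuple[str, ...]                   # recommended emotions, first is the default
    speed_range: Optional[Tuple[float, float]]  # (min, max) multiplier, or None if unspecified


PRESETS = {
    "advertisement": StylePreset(("Happiness",), FAST),
    "story": StylePreset(("Sadness", "Neutral"), SLOW),
    "drama": StylePreset(("Anger", "Surprise"), None),
}


def suggest_settings(content_type: str, speed: float = 1.0) -> dict:
    """Return the default emotion and a speed clamped to the recommended range."""
    preset = PRESETS[content_type]
    if preset.speed_range is not None:
        low, high = preset.speed_range
        speed = min(max(speed, low), high)
    return {"emotion": preset.emotions[0], "speed": speed}


if __name__ == "__main__":
    for kind in PRESETS:
        print(kind, suggest_settings(kind))
```

Clamping rather than rejecting an out-of-range speed is a design choice of this sketch; the recommended ranges are guidance, so you may prefer to treat them as defaults only.
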
    4. Volume:
      • Function: Adjusts the volume of the generated voice, making it louder or softer.