TTS Speech rework by Shivansps · Pull Request #7357 · scp-fs2open/fs2open.github.com

Shivansps · 2026-04-07T01:12:23Z

The objectives of this PR are:

Add TTS speech options to the in-game options settings, including voice selection, voice rate, volume settings, and selecting the places where TTS is used.
Add TTS speech support on Linux using speech-dispatcher/libspeechd-dev. This is done using dlopen and a small implementation of the lib types, so there is no additional dependency at compile time or at runtime; if speech-dispatcher is not installed on the host OS, the speech system just fails to init.
Separate the speech system cpps into different files for each platform, copying the way it was done for Mac. This is clearer and will make it easier to add other platforms, like Android, in the future.
Text sanitizing was moved to an earlier stage, in fsspeech.cpp, to avoid having to repeat this bit of code in every implementation.

Note:
I may have broken Mac TTS with these changes, and I have no way to test it.
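
For readers unfamiliar with the dlopen approach described above, here is a minimal sketch of the idea, not the code in this PR. It assumes the library soname is `libspeechd.so.2` and that only `spd_open` and `spd_close` are needed; `SpeechdLib` and its members are hypothetical names, and the forward-declared types are stand-ins for the real libspeechd declarations.

```cpp
#include <dlfcn.h>

// Stand-ins for the libspeechd types, so no libspeechd header is required at build time.
struct SPDConnection;
enum SPDConnectionMode { SPD_MODE_SINGLE = 0, SPD_MODE_THREADED = 1 };

using spd_open_fn = SPDConnection* (*)(const char* client_name, const char* connection_name,
                                       const char* user_name, SPDConnectionMode mode);
using spd_close_fn = void (*)(SPDConnection* connection);

// Hypothetical wrapper: resolves the needed symbols at runtime.
struct SpeechdLib {
    void* handle = nullptr;
    spd_open_fn p_spd_open = nullptr;
    spd_close_fn p_spd_close = nullptr;

    // Returns false when the library or one of its symbols is missing,
    // which lets the caller simply fail speech init.
    bool load(const char* soname = "libspeechd.so.2") {
        unload();
        handle = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
        if (handle == nullptr) {
            return false;
        }
        p_spd_open = reinterpret_cast<spd_open_fn>(dlsym(handle, "spd_open"));
        p_spd_close = reinterpret_cast<spd_close_fn>(dlsym(handle, "spd_close"));
        if (p_spd_open == nullptr || p_spd_close == nullptr) {
            unload();
            return false;
        }
        return true;
    }

    // Releases the library and clears every pointer so stale calls are impossible.
    void unload() {
        if (handle != nullptr) {
            dlclose(handle);
        }
        handle = nullptr;
        p_spd_open = nullptr;
        p_spd_close = nullptr;
    }
};
```

With a wrapper like this, a missing speech-dispatcher install shows up as `load()` returning false rather than as a link or startup error.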

Sessile-Nomad · 2026-04-07T07:48:59Z

Seconded. Also, options to increase/decrease the speed of TTS narration would be great if possible, since the default speed is slower than most people speak 🙂

Shivansps · 2026-04-08T23:19:56Z

Seconded. Also, options to increase/decrease the speed of TTS narration would be great if possible, since the default speed is slower than most people speak 🙂

you got it

notimaginative

I tested it on Mac and confirmed everything still works, apart from the issues noted. I'd like to check and test the Linux changes as well and will do that when I have a bit more time.

However, one desperately needed addition is the ability to actually test the voice changes before saving them.

notimaginative · 2026-04-14T14:55:12Z

+static SCP_vector<SCP_string> cached_voices;
+static bool voices_cached = false;


This really doesn't need to happen. It's a complete waste of memory. The voices should be cached of course, but that should happen when and where it's needed, not globally at startup.

I'm only marking that issue here, but it applies to win and linux versions as well.

This is needed for imgui, otherwise it would call getvoices() every frame.
Is there any way to detect when we have left the settings system so the memory can be cleared? Ideally you would call it once on entry to fill the cache and empty it on exit.

ingame_options_close() can be used for the F3 settings screen, but I'm not sure how that interacts with SCPUI. It's simple in theory, but what I didn't realize until now is that the options builder prints the current values to the log, which means it has to load all of this at startup without the in-game options screen ever being active. That also means we can't just initialize the voices in ingame_options_init().

We might be able to cache the name of the currently active voice and use that whenever it requests the id of the active voice. The options could be changed so that only ttsvoice_enumerator() gets the whole list of voices and builds a <int, SCP_string> pair vector and then display doesn't have to look it up. I believe the display resolution option does it like that so take a look in 2d.cpp for an example. You'd still have to get the list of voices every frame, but only once a frame, and only when on the settings screen, so you could probably get away with not caching it.

Hopefully something like that would work. If not then I think it's going to require changes beyond the scope of this PR. If that's the case then we'll just have to accept it as-is and have a look at improving OptionBuilder in the future to make dealing with this stuff easier.
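
A rough sketch of the suggestion above, under stated assumptions: only the active voice name is cached, and the enumerator builds the (id, name) pairs so that display does not need its own lookup into the platform voice list. All function names here are hypothetical, `std::string` and `std::vector` stand in for `SCP_string` and `SCP_vector`, and the real OptionBuilder callbacks may have different signatures.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using VoiceEntry = std::pair<int, std::string>;

// Enumerator: turns the platform voice list into (id, name) pairs in one pass.
std::vector<VoiceEntry> enumerate_voice_entries(const std::vector<std::string>& voice_names) {
    std::vector<VoiceEntry> entries;
    entries.reserve(voice_names.size());
    for (std::size_t i = 0; i < voice_names.size(); ++i) {
        entries.emplace_back(static_cast<int>(i), voice_names[i]);
    }
    return entries;
}

// Display: reads the name from the pair list instead of asking the TTS backend again.
std::string voice_display(const std::vector<VoiceEntry>& entries, int id) {
    for (const auto& entry : entries) {
        if (entry.first == id) {
            return entry.second;
        }
    }
    return "Unknown";
}

// Active voice lookup: only the cached name of the active voice is needed.
// Returns -1 when that voice is no longer installed.
int active_voice_id(const std::vector<VoiceEntry>& entries, const std::string& cached_active_name) {
    for (const auto& entry : entries) {
        if (entry.second == cached_active_name) {
            return entry.first;
        }
    }
    return -1;
}
```

The memory held between frames is then a single string rather than the full voice list, at the cost of one enumeration per frame while the settings screen is open.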

notimaginative · 2026-04-14T15:55:43Z

+    // 180 wpm = normal
+    float rate = 180.0f * (rate_percent / 100.0f);


The base speaking rate varies by the voice in use so you can't use a general default value and have it sound right. I recommend using something similar to the attached diff which gets the default voice rate when a voice is set and uses that as the value for these calculations.

speech-rework-rate.patch

This applies to the Mac voice stuff only, but the Linux and Windows versions may need something similar, along with the value cap for safety.
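
The attached patch is not reproduced here, so the following is only a sketch of the idea it describes: scale the voice's own default rate instead of a fixed 180 wpm, and cap the percentage for safety. The function name and the 50 to 200 percent cap are assumptions for illustration, not values taken from the patch.

```cpp
#include <algorithm>

// voice_default_wpm: the default rate queried from the voice when it is selected.
// rate_percent: the user setting, where 100 means "the voice's normal speed".
float scaled_rate_wpm(float voice_default_wpm, float rate_percent) {
    // Hypothetical safety cap so extreme settings cannot produce unusable rates.
    const float capped_percent = std::clamp(rate_percent, 50.0f, 200.0f);
    return voice_default_wpm * (capped_percent / 100.0f);
}
```

For a voice whose default is 150 wpm, a setting of 100 keeps 150 wpm, whereas the fixed-base formula would have forced it to 180 wpm.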

MjnMixael · 2026-04-14T16:56:30Z

Regarding testing; The options UI framework doesn't really have a nice way to set that up as a separate control.

I'd recommend carving out a special case for this option's selector to play a test whenever the value changes.

notimaginative · 2026-04-14T18:47:38Z

+
+    SPDConnection* connection = spd;
+    if ( !Speech_init ) {
+    	connection = p_spd_open("fso_voice_list", "client", nullptr, SPD_MODE_SINGLE);


This needs to use the same client and connection names as the p_spd_open() call in speech_init().

notimaginative · 2026-04-14T18:53:43Z

 	endif()
 elseif(APPLE)
 	# it should just work
+elseif(UNIX)


UNIX includes any Unix-like OS, such as BSD, macOS, Cygwin, and others. If Speech Dispatcher or any system calls/headers used by the code are not generally available outside of Linux then please narrow down the check here.

If the code is intended to be used or available on platforms other than just Linux then I'd recommend renaming speech_linux.cpp to support that notion.

notimaginative · 2026-04-14T18:55:54Z

-IF(WIN32 OR APPLE)
-	OPTION(FSO_USE_SPEECH "Use text-to-speach libraries" ON)
-ENDIF(WIN32 OR APPLE)
+OPTION(FSO_USE_SPEECH "Use text-to-speach libraries" ON)


If speech isn't available outside of Windows, Mac, and Linux then this should still be behind a guard, or at least default to OFF if it's not one of those three platforms.

notimaginative · 2026-04-14T18:57:07Z

+elseif (UNIX)
+	add_file_folder("Sound"
+		${file_root_sound}
+		sound/speech_linux.cpp


The same issue with using UNIX that I mentioned in the FindSpeech.cmake comments applies here as well.

notimaginative · 2026-04-14T19:02:20Z

+	auto rate = static_cast<signed int>(rate_percent - 100.0f);
+	if (rate < -100)
+		rate = -100;
+	if (rate > 100)


Should be else if (...), or preferably just use CAP(rate, -100, 100);.
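
As an illustration of the single-call form, the sketch below uses `std::clamp` (C++17) as a stand-in for the `CAP` macro; the function name is hypothetical. It assumes, as the diff implies, that a setting of 100 percent maps to a relative rate of 0 and that the accepted range is -100 to 100.

```cpp
#include <algorithm>

// Converts the percent setting into a relative rate and clamps it in one step.
int speechd_rate_from_percent(float rate_percent) {
    const int rate = static_cast<int>(rate_percent - 100.0f);
    return std::clamp(rate, -100, 100);
}
```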

notimaginative · 2026-04-14T19:14:45Z

+    SPDVoice** voices = p_spd_list_synthesis_voices(connection);
+
+    for (int i = 0; voices[i] != nullptr; i++) {
+    	SCP_string lang = voices[i]->language;


Why? There's no need to allocate memory for this. Just use strncmp() on voices[i]->language directly.

notimaginative · 2026-04-14T19:43:57Z

+    	}
+	}
+
+    SPDVoice** voices = p_spd_list_synthesis_voices(connection);


voices can be null so anything referencing it must be inside of an if (voices) {...} guard.
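
The two review points on this loop (compare the language in place, and guard against a null list) could be combined roughly as follows. This is a sketch rather than the PR's code: the `SPDVoice` struct is a minimal stand-in for the libspeechd type, and `collect_voices` is a hypothetical helper.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Minimal stand-in mirroring the fields of libspeechd's SPDVoice.
struct SPDVoice {
    char* name;
    char* language;
    char* variant;
};

// Collects the names of voices whose language starts with lang_prefix.
// voices is a null-terminated array and may itself be null.
std::vector<std::string> collect_voices(SPDVoice** voices, const char* lang_prefix) {
    std::vector<std::string> names;
    if (voices == nullptr) {
        return names;  // nothing to list; never dereference a null result
    }
    const std::size_t prefix_len = std::strlen(lang_prefix);
    for (int i = 0; voices[i] != nullptr; i++) {
        const SPDVoice* voice = voices[i];
        if (voice->name == nullptr || voice->language == nullptr) {
            continue;
        }
        // Compare in place; no temporary string is allocated for the language.
        if (std::strncmp(voice->language, lang_prefix, prefix_len) == 0) {
            names.emplace_back(voice->name);
        }
    }
    return names;
}
```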

notimaginative · 2026-04-14T19:47:59Z

+    	SCP_string lang = voices[i]->language;
+    	// There are too many we cant add them all
+    	// Only add English voices
+    	if(lang.find("en") == 0) {


It would probably be best if this filtered based on the game's current language index (in this case: "en", "de", "fr", "pl") rather than limiting it exclusively to English.

I was thinking about that, but it would not be correct: the DE version would load only DE voices, and those would not work with any mission on Knossos that isn't in German. Like, all of them.
This may need more thought, then. Getting the list of voice languages installed on the system, filtering on it, and adding some sort of selector based on that is more complicated. And it would need to be done for every OS.

Hmm, I guess I should have tested that. I'm sort of going off of how it works on Mac, since it will translate the text into the TTS language you've set (based on voice). And I don't know if the voice synthesis with Speech Dispatcher works the same way. If not then obviously the point is moot and how you're currently doing it is probably the safest option.

Ideally you would also want a language combo box to filter the voices. It is probably possible to populate it using the getvoices code. I'll look at this after fixing the other issues.

notimaginative · 2026-04-15T14:01:18Z

Regarding testing; The options UI framework doesn't really have a nice way to set that up as a separate control.

I'd recommend carving out a special case for this option's selector to play a test whenever the value changes.

I played around with this a bit, but it doesn't look like there's a way to do it. It's easy enough to add a test command to the change listener but that's only called when changes are saved. There doesn't appear to be any current mechanism to deal with control changes when they happen.

I assume we'll have to push the test feature down the road, extend options manager to have such functionality, then circle back and add it for TTS.

Shivansps · 2026-04-19T20:12:06Z

I made the requested changes. taylor, when you have the time, please take a look. Two things to add:

I removed the voice cache from Mac and Windows, but I had to re-add it for Linux: the sheer number of voices in the espeak-ng backend that is included by default makes it impossible to even open the settings. The problem is not the number of entries in the combo box, at least not on its own, but the time it takes to process them. With more than 100 iterations of that "for" loop you will notice the delay.
Regarding loading only English voices, adding another selector to act as a filter is not viable right now, for the same reason we can't do a voice test after changing a setting: there is no way to detect that the filter has changed so that the voice list can be updated.
This might also be overkill, because you don't normally deal with a number of voices so absurd that they can't all be listed.
So what I did is count them: if there are fewer than 600, load them all, and if there are more, load only the English ones.

Also, normally you are not going to want to use the espeak-ng voices, they are terrible, so Linux users should be able to remove the espeak-ng backend and install a neural one like PiperTTS for speech-dispatcher. I haven't tested that myself, but the point of using speech-dispatcher is that it should work with any TTS backend.
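
The count-based fallback described in this comment could look roughly like the sketch below. The names are hypothetical and the threshold is a parameter so it can be tested with small lists; the comparison treats exactly 600 voices as "too many", which is one reading of the description rather than a confirmed detail of the PR.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plain-data view of one installed voice.
struct VoiceInfo {
    std::string name;
    std::string language;
};

// Lists every voice when the total is below the threshold;
// otherwise falls back to voices whose language starts with "en".
std::vector<VoiceInfo> select_voices(const std::vector<VoiceInfo>& all, std::size_t threshold = 600) {
    if (all.size() < threshold) {
        return all;
    }
    std::vector<VoiceInfo> english;
    for (const auto& voice : all) {
        if (voice.language.compare(0, 2, "en") == 0) {
            english.push_back(voice);
        }
    }
    return english;
}
```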

Shivansps added 11 commits April 5, 2026 16:40

add imgui speech options

9ad34dc

adapt existing windows sapi speech implementation

839d6b6

adapt existing mac speech integration

618cf58

add speech linux stubs

5a94f87

add speech support in linux

1efe01b

Add array checks

ecacd4f

Use dlopen for speech-dispatcher

ae8e56b

corrrect lib name

191061d

missing includes and static cast

fc5a017

do not change mac file type

4d71c38

fix clang tidy warnings 1

0c3534c

wookieejedi added labels Apr 7, 2026: enhancement (A new feature or upgrade of an existing feature to add additional functionality), refactor (A cleanup/restructure of a feature for speed, simplicity, and/or maintainability)

Shivansps added 5 commits April 8, 2026 19:27

set tts rate

be08a77

set localization ids

5e564ad

Merge branch 'master' into speech-rework

550ea1e

fix clang tidy warnings 2

205eaef

Merge branch 'speech-rework' of https://github.com/Shivansps/fs2open.github.com into speech-rework

5b9d842

correct symbol name

5d47980

notimaginative requested changes Apr 14, 2026

Shivansps added 5 commits April 19, 2026 14:56

Remove voice cache and fix win enumerate_voices overriding voice selection

0c27de6

fix mac rate

127e55a

Done by notimaginative

requested changes

6338623

re-add voice cache for linux

191400f

Open connection for linux get flags

8470aa9

fix missing }

4c41528
