How To Add OpenAI Text-To-Speech To Your Bubble App (Complete Guide)
5th of March, 2024
Are you looking to add Text-to-Speech to your Bubble app? This step-by-step guide shares everything you need to know about building an integration with OpenAI.
You won’t need any previous coding experience to follow this guide. We’ll provide clear instructions and insights throughout the process.
The steps to integrating OpenAI’s Text-to-Speech model with Bubble include:
1. Connecting to the OpenAI API
2. Reviewing the OpenAI API documentation
3. Creating an API call from Bubble to OpenAI
4. Designing the UI of your Bubble app
5. Building an audio player in Bubble
6. Building the workflows
1. Connecting to the OpenAI API
The first thing we’ll need to do is create a connection between our Bubble app and our OpenAI account. This connection will allow us to send text from Bubble, then have OpenAI turn this into a spoken audio file.
When creating a connection between two platforms, you’ll need to use what’s known as an API. If you’re brand new to Bubble, the concept of APIs can seem overwhelming, but we assure you, you don’t need any coding experience to be able to use these.
The beautiful thing about working with APIs in this day and age is that if you run into any issues along the way, you can always troubleshoot errors using ChatGPT itself. It is an incredibly helpful resource.
But for our tutorial today, we’re going to jump straight into a brand new Bubble editor. Once here, we’ll open the plugins tab, then select the plugins library. The plugin we’ll need to install is the free API Connector plugin.
After installing the plugin, we can create our first API. For this API, we’ll name this after the service we’re going to connect with. So this, of course, will be ‘OpenAI’. And then this is where all of the real work is going to start.
In order to set up and configure our API correctly, we need to reference some documentation written by OpenAI, which tells us exactly how to do that. You can find a link to that documentation here.
When you click on this, it’s going to take us to their text-to-speech model, and this includes all of the steps that we need to follow.
Now, what we’re interested in finding right now is the quick start instructions. So this little bit of code here contains absolutely everything we need to follow.
When reading this code, we’d always recommend making sure it’s set as the ‘Curl’ option. The curl format can easily be interpreted by Bubble. Today, however, we’re going to manually connect this with Bubble just so we can explain a few core concepts along the way.
2. Reviewing the OpenAI API documentation
Before setting up any API calls, we first need to ensure we generate an API key for OpenAI.
An API key allows you to securely connect multiple services together. You can think of it like a key that opens a door between each platform.
In order to source your OpenAI API key, you’ll first need to ensure you’ve created an OpenAI account. Then from here, you can open the ‘API keys’ tab, and you’ll see an option to generate a new API key.
Once you’ve created a new secret key, make a copy of this, then re-open your Bubble editor.
Under the API you’ve created, you’ll see a dropdown that provides multiple ways to authorize your API. From this dropdown, select the ‘private key in header’ option. You’ll then see a field that allows you to paste in your OpenAI API key.
Before pasting in your API key, it’s important to first add the word ‘Bearer’, followed by a space, at the beginning of the field. This prefix tells OpenAI that the value that follows is a key identifying you as its rightful bearer.
If you review the OpenAI documentation, you’ll also notice that under the field called ‘Authorization’, it prompts you to type in the word ‘Bearer’ before the key.
Within the documentation page, you might also notice that there’s an additional ‘header’ field we need to configure. This header is known as the ‘Content-type’.
The purpose of this value is to tell OpenAI what format the content of our API call is in. You see, OpenAI has a whole library of different API services. There are things like text generation, image generation, audio generation, and many more. Each of these services can send and receive data in a different way, which is why it’s important to label what type of content each call contains.
For the sake of this guide today, we’re not going to add this ‘Content-type’ header to our overall API, but instead, we’ll add this individually to each API call.
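If you’re curious what these two headers look like outside of Bubble, here is a minimal Python sketch. The function name `build_headers` and the example key are our own placeholders for illustration. They aren’t part of Bubble or OpenAI.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return the two headers described above for an OpenAI API call."""
    return {
        # The word "Bearer", a single space, then the secret key.
        "Authorization": f"Bearer {api_key}",
        # Tells OpenAI that the body of the call is formatted as JSON.
        "Content-Type": "application/json",
    }


# "sk-example" is a made-up placeholder, not a real key.
headers = build_headers("sk-example")
print(headers["Authorization"])
```

In Bubble, you never write this by hand. The API Connector fields do the same job.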
3. Creating an API call from Bubble to OpenAI
Within our overall API, what we’re going to do is create our first call. A call, as the name suggests, is a bit of information that you’re going to send through a message. It’s kind of like the action you’re going to perform within the API. Now, we’re going to expand this here, and the first thing we’d like to do is give this API call a name. We’re going to call it “Text-to-Speech” because that’s exactly what we’re going to do.
Then, from here, we’re just going to need to make a few minor tweaks to some of these settings in the dropdown menus. When it comes to APIs, you’ll notice that there are two options you can reference: there’s something known as “data” or there’s an “action”.
If you’re connecting to a third-party service, you’re most likely going to want to receive data or send data to it. So, if you use an API call as data, that means you’re receiving data. You might use that for an application where you’re displaying things like stock or share prices or the price of crypto. In this case, you’re constantly just pulling information from a service to display inside your own app.
On the other hand, with the action option, what that means is that you want to perform an action on this API. So, you want to send something to that API service and you want it to do something, which is exactly what we want to do today. We want to send text from our Bubble app, then have OpenAI receive this, then turn it into an audio file.
So for our API call, we’ll need to select the action option in the ‘Use as’ dropdown. Then, from here, we need to add the URL of this specific API endpoint. If we go back to our documentation, what you’ll see at the top of this curl code is a URL. This URL highlights where this API service lives on the internet.
Within this API call, what we essentially want to do is write a letter to this API service and say, “Here’s a bunch of text. We’ll post it to you. It’s going to go to your home address, which is this URL. When you receive this letter, we want you to read this text, then turn it into an audio file.”
So what we need to do is copy that URL and paste it into the available field in Bubble. We also need to set the request method to ‘POST’. As we mentioned, we want to use this address to post information.
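To make the ‘address’ and ‘post’ ideas a little more concrete, here is a small Python sketch that prepares (but does not send) a request to the text-to-speech endpoint. The variable names are our own, and the URL is the speech endpoint as we understand it from OpenAI’s documentation, so double-check it against the curl example you’re copying from.

```python
import urllib.request

# The address of OpenAI's text-to-speech service (assumed from the docs).
SPEECH_URL = "https://api.openai.com/v1/audio/speech"

# Building the request does not send anything yet. It only records
# the address (URL) and the method (POST) that will be used.
request = urllib.request.Request(SPEECH_URL, method="POST")

print(request.get_method(), request.full_url)
```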
Now, we can move down to the header field of our API call. Do you remember in the OpenAI documentation that there was a header value called ‘Content-type’? This is exactly where we’ll add this field.
If we go back to the OpenAI documentation page, we can see that the ‘Content-type’ value should be listed as ‘application/json’. This just determines that we’ll structure the content of our API call in the JSON format.
Start by copying the ‘application/json’ text, then paste this into the content field in Bubble. When adding this field, you’ll see a checkbox named ‘private’ beside this. Leave this checked by default as we don’t want to expose this option within a workflow.
For the final part of our API call, we now need to configure the data parameters. Parameters determine what information you’ll be sending through each API call. This is the data that OpenAI will receive.
If you’re new to working with APIs, this is the part that can seem the most overwhelming. What you’ll find, however, is that the whole process is relatively straightforward.
If we review the OpenAI documentation again, we can see that the JSON code highlights all of the parameters we need to add. In fact, we can simply copy this snippet of code, then paste this into the parameters field of our API call.
When reviewing the JSON code, you’ll see there are three parameters for this API call. These include:
- The OpenAI model we’ll be using
- The text we’ll send through. This is the text that will be turned into audio
- The voice. This is the style of the AI voice you wish to use for the audio file
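As a rough illustration of those three parameters, here is a Python sketch that builds the same kind of JSON body. The values shown (`tts-1`, the sample sentence, and `alloy`) are example values we’ve assumed for this sketch, so use whatever appears in the documentation snippet you copied.

```python
import json

# The three parameters: the model, the text to speak, and the voice.
body = {
    "model": "tts-1",  # assumed example model name
    "input": "The quick brown fox jumped over the lazy dog.",
    "voice": "alloy",
}

# json.dumps turns the dictionary into the text that is sent to OpenAI.
payload = json.dumps(body, indent=2)
print(payload)
```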
When it comes to these parameters, this is probably the most confusing part of working with APIs. Because these parameters are structured in actual code, it can often seem overwhelming. But please just try to look past the curly brackets and all of the quotation marks. We assure you, it’s nothing too complex. In fact, if we were to run this API call right now, it would work. It would send through this information exactly as written, including the exact model we want to use.
Before we can run this model, however, we’ll need to make a few minor customisations. These customisations will allow us to make the parameter fields dynamic.
Right now, the parameters are static values. This means that each time the model runs, it will send through the exact values you see here. Now, this will become a problem when our users want to customise the audio file they can generate. You see, each user will want to send through their own unique text, as well as select a specific AI voice. This is why it’s important to make some of these parameters dynamic – so we can modify the data sent to OpenAI in every API call.
In order to make these parameters dynamic, you’ll just need to add angle brackets ‘<>’ inside your JSON field. In between your angle brackets, you’ll need to give a name to each parameter.
For our API call today, we’ll create two dynamic parameters: one for the input text and one for the AI voice. After adding these parameters, Bubble will display an input field for each option. Once this is visible, take the time to add an example value for each of these fields. You’ll need to do this in order to initialise your API call.
Although the values you add inside your API call look static, we’ll soon show you how these can be dynamically changed once we build a workflow.
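If it helps to picture what a dynamic parameter does, here is a minimal Python sketch of the same idea: a JSON template with named placeholders in angle brackets that get swapped for real values. The placeholder names `input_text` and `voice`, and the function `fill_template`, are hypothetical names we’ve made up for this example. This only illustrates the concept and isn’t how Bubble implements it behind the scenes.

```python
import json
import re

# A JSON body where two values are placeholders rather than fixed text.
# "input_text" and "voice" are hypothetical placeholder names.
TEMPLATE = '{"model": "tts-1", "input": "<input_text>", "voice": "<voice>"}'


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each <name> placeholder with the matching value."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # json.dumps escapes quotes and line breaks so the result stays
        # valid JSON. The outer quotes are dropped because the template
        # already provides them.
        return json.dumps(values[name])[1:-1]

    return re.sub(r"<(\w+)>", substitute, template)


filled = fill_template(TEMPLATE, {"input_text": "this is test text", "voice": "nova"})
print(filled)
```

Each time the call runs, only the placeholder values change. The rest of the body stays exactly the same.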
At this point, we’re now ready to initialise our API call. This will just provide us with a confirmation of our connection. If the API call is successful, Bubble will show a popup that displays all of the values OpenAI returns after it receives your API call.
As you can see, in our popup, there’s an audio file. If we select this, it will open a new browser tab and allow you to play the content. In my example today, this will speak the words “this is test text”.
And just like that, you’ve successfully completed all of the steps to setting up your custom API call.
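For anyone who’d like to see the whole API call in one place, here is a hedged Python sketch of what Bubble is doing on our behalf. The function names, the `tts-1` model value, and the endpoint URL are assumptions based on our reading of the OpenAI documentation, so treat this as a sketch rather than a definitive implementation.

```python
import json
import urllib.request

SPEECH_URL = "https://api.openai.com/v1/audio/speech"  # assumed endpoint


def build_speech_request(api_key, text, voice="alloy", model="tts-1"):
    """Prepare the POST request: URL, headers, and JSON body."""
    body = json.dumps({"model": model, "input": text, "voice": voice})
    return urllib.request.Request(
        SPEECH_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_speech(api_key, text, voice="alloy", send=urllib.request.urlopen):
    """Send the request and return the raw audio bytes OpenAI responds with."""
    request = build_speech_request(api_key, text, voice)
    with send(request) as response:
        return response.read()


def save_speech(path, audio_bytes):
    """Write the returned audio to a file, for example 'speech.mp3'."""
    with open(path, "wb") as audio_file:
        audio_file.write(audio_bytes)


# Example usage. It needs a real key and account credit, so it is
# left commented out:
# audio = synthesize_speech("YOUR_API_KEY", "this is test text")
# save_speech("speech.mp3", audio)
```

The `send` argument exists only so the function can be tried out with a stand-in instead of a real network call.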
One thing to note is that you may receive an error when trying to initialise your API call. This error will often state that you don’t have enough tokens or credits within your OpenAI account. If you see this popup, you’ll need to add a payment method to your account, then pre-pay for credit.
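When an error like this comes back, the useful part is usually the message inside the response. Here is a small Python sketch for pulling that message out. It assumes the error arrives as JSON shaped like `{"error": {"message": "..."}}`, and the function name `describe_api_error` is our own. The status code and wording in the example are illustrative only.

```python
import json


def describe_api_error(status_code: int, response_text: str) -> str:
    """Turn an error response into a short, readable sentence."""
    try:
        # Assumed shape: {"error": {"message": "..."}}
        message = json.loads(response_text)["error"]["message"]
    except (ValueError, KeyError, TypeError):
        # Not JSON, or not the shape we expected: show the raw text instead.
        message = response_text or "no details returned"
    return f"OpenAI returned HTTP {status_code}: {message}"


example = '{"error": {"message": "You exceeded your current quota"}}'
print(describe_api_error(429, example))
```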
4. Designing the UI of your Bubble app
After configuring your API, this is where the fun begins. We can now build our own application that makes use of this new feature.
If we open our Bubble editor, we’ll head to the design tab. In our example today, you’ll see we’ve already built a demo application. For the sake of our guide, we’ll review how this page is set up, then also take a moment to review the database structure.
On this page, things are pretty basic. We don’t have too much happening. We’ve got some text elements, a multi-line input field where a user can type in their text, and a drop-down menu that allows someone to select an AI voice. We also have a button that is going to trigger a workflow.
But if we scroll down, we also have a repeating group on the page. If we open the property editor of this repeating group, it’s displaying a data type known as ‘AI audio’. For the data source, it’s simply performing a search in our database for all of the ‘AI audio’ entries.
If we quickly digress and open our data tab, we can review how this data type has been configured.
Inside of this data type, there are two data fields:
- Audio file – set as a file type
- Input text – set as a text type
Of course, your database might look a little bit different. For instance, if you’re building a blogging platform and you want to be able to turn someone’s blog into an audiobook, all you need to do is add these two fields under your main data type.
If we then go back to the design tab, we’ll review the elements inside our main repeating group. As this repeating group displays a list of all the AI audio files in our app, it contains a text element and an audio player.
The text element allows us to display the input text that a user has added. The audio player, on the other hand, allows us to play the audio that OpenAI generates.
5. Building an audio player in Bubble
When it comes to adding an audio player, there are multiple ways you can achieve this in Bubble. You could, of course, install a plugin. Within our tutorial today, however, we’re going to use a simple snippet of HTML to create our own custom player.
Even if you don’t have any experience working with HTML or CSS, this process is incredibly easy. Simply follow this link to W3 Schools. This link provides you with a snippet of HTML that automatically formats as an audio player within any browser.
Make a copy of this HTML, then open your Bubble editor. Inside your editor, add an HTML element into your repeating group. Inside of this element, paste in the code you copied from W3 Schools. This should look like:
<audio controls>
  <source src="horse.mp3" type="audio/mpeg">
  Your browser does not support the audio element.
</audio>
In order to make this HTML audio player functional, you just need to swap out the reference to the file source.
Highlight the code that refers to “horse.mp3”, then remove this. You’ll need to replace this with a dynamic value of the audio file you wish to play from your own Bubble database. To do this, simply insert dynamic data, then reference the ‘current cell’s AI Audio, the audio file field’.
Like any element in Bubble, you can also customise the dimensions of the HTML audio player, including any margin around it.
If you were to now run a preview of your app, you’ll see this element will automatically format like an audio player.
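To show what the finished HTML looks like once the file source has been swapped in, here is a short Python sketch that builds the player for any audio URL. The function name `audio_player_html` and the example URL are our own placeholders. In Bubble, the dynamic data expression fills in the `src` value for you.

```python
from html import escape


def audio_player_html(audio_url: str) -> str:
    """Build the HTML audio player with the given file as its source."""
    # Escaping keeps characters like quotes from breaking the src attribute.
    safe_url = escape(audio_url, quote=True)
    return (
        "<audio controls>\n"
        f'  <source src="{safe_url}" type="audio/mpeg">\n'
        "  Your browser does not support the audio element.\n"
        "</audio>"
    )


print(audio_player_html("https://example.com/my-audio.mp3"))
```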
6. Building the workflows
One of the last key things we’ll need to build is the workflows that power our API call. Thankfully, this process is relatively straightforward.
On our page, select the button element we referenced earlier. In our example app, this button displays the words ‘generate audio’.
Now, we can choose to trigger a workflow when this button is clicked.
When it comes to this workflow, there are only a few steps we need to build. With the first step, we’ll reference the API call we’d previously created. To access this, open the plugins menu, then choose the ‘Text-to-Speech’ API.
Now, you’ll see a series of input fields where we can update the dynamic text for our model. If you remember, the two fields we can change are the input text, as well as the AI voice.
We’ll need to match the input text to the multiline input field on our page. When it comes to the AI voice, we’ll also match this to the dropdown menu that displays a list of all the available options. These options include:
- Alloy
- Echo
- Fable
- Onyx
- Nova
- Shimmer
After configuring this step, we’ll then need to store the data that OpenAI returns once the model has successfully run. To do this, add an additional step in your workflow that ‘creates a new thing’. The type of thing you’ll want to create is an ‘AI audio’.
When creating this new entry, simply match the relevant data fields with the returned values from OpenAI. In this case, it means we’ll take the newly generated audio file, then save this under the ‘audio file’ field.
Finally, we’ll add one last step to our workflow that resets the values of the input fields on our page.
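To recap the three workflow steps, here is a rough Python sketch of the same logic: call the text-to-speech service, save the result as a new ‘AI audio’ entry, then reset the inputs. Everything here (`Page`, `AIAudio`, `generate_audio`, and the stand-in speech function) is a hypothetical model of the Bubble page for illustration. We’ve also assumed the input text is saved alongside the audio file.

```python
from dataclasses import dataclass, field
from typing import Callable

# The voice options listed in the dropdown.
VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")


@dataclass
class AIAudio:
    """Mirrors the 'AI audio' data type: an audio file and its input text."""
    audio_file: bytes
    input_text: str


@dataclass
class Page:
    """A stand-in for the page: two inputs and the list shown in the repeating group."""
    input_text: str = ""
    voice: str = "alloy"
    saved: list[AIAudio] = field(default_factory=list)


def generate_audio(page: Page, text_to_speech: Callable[[str, str], bytes]) -> AIAudio:
    """Run the three workflow steps when the button is clicked."""
    if page.voice not in VOICES:
        raise ValueError(f"Unknown voice: {page.voice}")
    # Step 1: send the input text and voice to the text-to-speech API call.
    audio = text_to_speech(page.input_text, page.voice)
    # Step 2: create a new 'AI audio' thing from the returned file.
    record = AIAudio(audio_file=audio, input_text=page.input_text)
    page.saved.append(record)
    # Step 3: reset the inputs on the page.
    page.input_text = ""
    page.voice = "alloy"
    return record


def fake_text_to_speech(text: str, voice: str) -> bytes:
    """A stand-in for the real API call, so the sketch runs offline."""
    return f"{voice}:{text}".encode("utf-8")


page = Page(input_text="this is test text", voice="nova")
record = generate_audio(page, fake_text_to_speech)
print(len(page.saved), repr(page.input_text))
```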
And that is absolutely everything we need to build within our workflow tab. If we were to now run a preview of the app, you can see the entire end product in a functional state.
At this point, you now know how to build your very own Text-to-Speech feature using both OpenAI and Bubble. Today, we’ve highlighted just one use case, but the possibilities are truly endless.