Demos
Audio to Text using OpenAI's Whisper
Integrating OpenAI models with PROCESIO is something anyone can do, thanks to the simplicity of the Call API action. If you don't believe us, we're going to prove how easy it is with this brand-new use case. At the bottom of this article you'll find an import file containing this ready-to-use use case. Following the import, the credentials will need to be configured.

Scenario

Let's say you work for a company that specializes in providing accessibility services for people with hearing impairments. You want to generate transcripts from audio files such as podcasts or radio theater, so that anyone who suffers from hearing loss or a similar impairment can enjoy this type of content.

We're going to use the Whisper model from OpenAI to generate transcripts from audio files. First, we retrieve the audio file from our Google Drive, and then we generate the transcript. The audio file used in our example is an MP3 recording from Marian: https://archbee-doc-uploads.s3.amazonaws.com/pd-o7nzlwdisbwglsha9q/hixlbaz8ny4gexwfdqaga-thanks-from-marian.mp3

Retrieve the file from Google Drive

We'll start by creating the credential to access Google Drive. Notice that the file is publicly accessible, so no authentication is required to download it.

We define some variables to be used in our flow:

fileUrl ➜ String: string containing the audio file location; to create the file URL for your own files, you can follow this tutorial
endpoint ➜ String: string representing the Google Drive endpoint to download from
downloadStatusCode ➜ Integer: integer representing the status code of the download request
audioFile ➜ File: file representing the downloaded audio file

Then we build the actual flow: we extract the endpoint by removing the base URL from fileUrl using String Replace, and we download the file using Call API with our Google Drive credential. (A code-level sketch of this process is included after the Transcript section below.)

For Call API we will have to use the following headers:

Content-Type ➜ application/force-download
Content-Disposition ➜ attachment

Generate the transcript

We create the credential for OpenAI. Make sure to use your own API key instead of $OPENAI_API_KEY when configuring the credential.

We define some variables to be used in our flow:

fileUrl ➜ String: string containing the audio file location
audioFile ➜ File: file representing the downloaded audio file
queryResponse ➜ JSON: JSON holding the model's response
queryStatusCode ➜ Integer: integer representing the model response status code
transcript ➜ String: string containing the transcript for the audio file

Then we build the actual flow: we download the audio file using Call Subprocess with our first process, we query the Whisper model using Call API with a form-data body, and we extract the transcript from the model's response using JSON Mapper. (A code-level sketch of this process is also included after the Transcript section below.)

For Call API we will use the following form data:

file [File] ➜ insert the audioFile variable
model [Text] ➜ whisper-1
response_format [Text] ➜ json

Make sure to select the right type (File or Text) when using Call API with form data.

Transcript

Hello, hello, smalllings! This is Marian from PROCESIO, the technology that uncomplicates your automation life. I want to say you rock! And thank you for trusting and being with us for two days already. Don't forget, PROCESIO is a proven technology with use cases at enterprise level. So, if you have use cases that you want to discuss or just need help, join our Discord community and we will be more than happy to help. Happy automation with PROCESIO!
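For readers curious about what the first process boils down to outside PROCESIO, here is a minimal Python sketch of the String Replace and Call API steps, using the requests library. The Google Drive link, base URL, and output filename are illustrative placeholders, not values taken from the original flow.

```python
import requests

# fileUrl: placeholder public Google Drive download link
file_url = "https://drive.google.com/uc?export=download&id=YOUR_FILE_ID"
base_url = "https://drive.google.com"

# String Replace: strip the base URL to obtain the endpoint
endpoint = file_url.replace(base_url, "")

# Call API: download the file, mirroring the headers configured in the tutorial
headers = {
    "Content-Type": "application/force-download",
    "Content-Disposition": "attachment",
}
response = requests.get(base_url + endpoint, headers=headers)

download_status_code = response.status_code  # downloadStatusCode
audio_bytes = response.content               # audioFile (raw MP3 content)

with open("audio.mp3", "wb") as f:
    f.write(audio_bytes)
```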
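Similarly, here is a minimal sketch of the second process: querying the Whisper model with a multipart form-data body and pulling the transcript out of the JSON response, which is what the Call API and JSON Mapper steps do in the flow. It assumes the requests library, an OpenAI API key stored in the OPENAI_API_KEY environment variable, and the placeholder filename from the previous sketch.

```python
import os
import requests

api_key = os.environ["OPENAI_API_KEY"]  # use your own OpenAI API key

with open("audio.mp3", "rb") as audio_file:
    # Call API with a form-data body: file, model=whisper-1, response_format=json
    response = requests.post(
        "https://api.openai.com/v1/audio/transcriptions",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": ("audio.mp3", audio_file, "audio/mpeg")},
        data={"model": "whisper-1", "response_format": "json"},
    )

query_status_code = response.status_code  # queryStatusCode
query_response = response.json()          # queryResponse

# JSON Mapper: the transcript is in the "text" field of the response
transcript = query_response["text"]
print(transcript)
```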
Action pool

Call API
String Replace
Call Subprocess
JSON Mapper

Import file

Use the PROCESIO file below to import this use case directly into one of your workspaces (feel free to create a new workspace dedicated to this example):

https://archbee-doc-uploads.s3.amazonaws.com/pd-o7nzlwdisbwglsha9q/hbno21k8jut055kb0kcce-audiotranscripts.procesio