Scalable audio REST API to convert, trim, concatenate, optimize, and compress audio files.
1. Your audio file must be uploaded or accessible to Bytescale:
   • Use the Bytescale Dashboard to upload a file manually.
   • Use the Upload Widget, Bytescale SDKs or Bytescale API to upload a file programmatically.
   • Use our external storage options to process external audio.
2. Build an audio processing URL:
   a. Get the raw URL for your file:
      https://upcdn.io/W142hJk/raw/example.mp3
   b. Replace "raw" with "audio":
      https://upcdn.io/W142hJk/audio/example.mp3
   c. Add querystring parameters to control the output:
      https://upcdn.io/W142hJk/audio/example.mp3?br=96
3. Play your audio by navigating to the URL from step 2.
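The URL-building step above can be sketched in JavaScript. This is a minimal illustration only: `buildAudioUrl` is a hypothetical helper (not part of any Bytescale SDK), and it assumes the raw URL contains a `/raw/` path segment as in the example.

```javascript
// Sketch: derive an audio processing URL from a raw file URL.
// `buildAudioUrl` is a hypothetical helper, not a Bytescale SDK function.
function buildAudioUrl(rawUrl, params = []) {
  const url = new URL(rawUrl);
  if (!url.pathname.includes("/raw/")) {
    throw new Error("Expected a raw file URL containing '/raw/'");
  }
  // Swap the first "/raw/" path segment for "/audio/".
  url.pathname = url.pathname.replace("/raw/", "/audio/");
  // Append parameters in the order given, since parameter order can matter.
  for (const [name, value] of params) {
    url.searchParams.append(name, String(value));
  }
  return url.toString();
}

console.log(buildAudioUrl("https://upcdn.io/W142hJk/raw/example.mp3", [["br", 96]]));
// https://upcdn.io/W142hJk/audio/example.mp3?br=96
```

Parameters are passed as an ordered list of pairs rather than an object so that the same name can appear more than once.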
By default, your audio will be encoded to AAC.
The default HTTP response will be an HTML webpage with an embedded audio player. This is for debug purposes only: developers are expected to override this behavior by specifying an f option when embedding audio into their webpages and apps.
To embed audio in a webpage using Video.js:
    <!DOCTYPE html>
    <html>
    <head>
      <link href="https://unpkg.com/video.js@7/dist/video-js.min.css" rel="stylesheet">
      <script src="https://unpkg.com/video.js@7/dist/video.min.js"></script>
      <style type="text/css">
        .audio-container {
          height: 316px;
          max-width: 600px;
        }
      </style>
    </head>
    <body>
      <div class="audio-container">
        <video-js class="vjs-fill vjs-big-play-centered" controls preload="auto">
          <p class="vjs-no-js">To play this audio please enable JavaScript.</p>
        </video-js>
      </div>
      <script>
        var vid = document.querySelector('video-js');
        var player = videojs(vid, {responsive: true});
        player.on('loadedmetadata', function() {
          // Begin playing from the start of the audio. (Required for 'f=hls-aac-rt'.)
          player.currentTime(player.seekable().start(0));
        });
        player.src({
          src: 'https://upcdn.io/W142hJk/audio/example.mp3!f=hls-aac-rt&br=80&br=256',
          type: 'application/x-mpegURL'
        });
      </script>
    </body>
    </html>
The f=hls-aac-rt output format is designed to reduce the wait time for your listeners when the given audio has not been transcoded before. Like the other output formats, this audio format incurs an initial delay while transcoding starts. However, unlike the other formats, once transcoding begins the audio will be streamed to listeners during transcoding. As with the other formats, once transcoded, the resulting audio will be cached and will not need to be transcoded again.
To create an MP3 file:
Upload an input file (e.g. an audio or video file) or create an external file source.
Replace /raw/ with /audio/ in the file's URL, and then append ?f=mp3 to the URL.
Navigate to the URL (i.e. request the URL using a simple GET request).
Wait for status: "Succeeded" in the JSON response.
The result will contain a URL to the MP3 file:
https://upcdn.io/W142hJk/audio/example.mp3?f=mp3
    {
      "jobUrl": "https://api.bytescale.com/v2/accounts/W142hJk/jobs/ProcessFileJob/01H3211XMV1VH829RV697VE3WM",
      "jobDocs": "https://www.bytescale.com/docs/job-api/GetJob",
      "jobId": "01H3211XMV1VH829RV697VE3WM",
      "jobType": "ProcessFileJob",
      "accountId": "W142hJk",
      "created": 1686916626075,
      "lastUpdated": 1686916669389,
      "status": "Succeeded",
      "summary": {
        "result": {
          "type": "Artifact",
          "artifact": "/audio.mp3",
          "artifactUrl": "https://upcdn.io/W142hJk/audio/example.mp3!f=mp3&a=/audio.mp3"
        }
      }
    }
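The "request, then wait for status: Succeeded" steps above could be automated with a small polling loop. The sketch below rests on assumptions the text does not confirm: that repeating the GET request returns the job's latest state, and that a fixed polling interval is acceptable. `waitForArtifactUrl`, `intervalMs` and `maxAttempts` are hypothetical names; only the `status` and `summary.result.artifactUrl` fields come from the example response.

```javascript
// Sketch: poll a transcode URL until the job JSON reports "Succeeded".
// Assumption: repeating the GET request returns the job's latest state.
async function waitForArtifactUrl(audioUrl, options = {}) {
  const {
    fetchImpl = globalThis.fetch, // injectable, e.g. for tests
    intervalMs = 2000,            // hypothetical polling interval
    maxAttempts = 30              // hypothetical upper bound on polls
  } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchImpl(audioUrl);
    const job = await response.json();
    if (job.status === "Succeeded") {
      // Field names taken from the example response above.
      return job.summary.result.artifactUrl;
    }
    if (attempt < maxAttempts) {
      await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job did not succeed after ${maxAttempts} attempts`);
}
```

A caller might use it as `const mp3Url = await waitForArtifactUrl("https://upcdn.io/W142hJk/audio/example.mp3?f=mp3")`. The same pattern would presumably apply to the other formats that return job JSON.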
To create an AAC file:
Upload an input file (e.g. an audio or video file) or create an external file source.
Replace /raw/ with /audio/ in the file's URL, and then append ?f=aac to the URL.
Navigate to the URL (i.e. request the URL using a simple GET request).
Wait for status: "Succeeded" in the JSON response.
The result will contain a URL to the AAC file:
https://upcdn.io/W142hJk/audio/example.mp3?f=aac
    {
      "jobUrl": "https://api.bytescale.com/v2/accounts/W142hJk/jobs/ProcessFileJob/01H3211XMV1VH829RV697VE3WM",
      "jobDocs": "https://www.bytescale.com/docs/job-api/GetJob",
      "jobId": "01H3211XMV1VH829RV697VE3WM",
      "jobType": "ProcessFileJob",
      "accountId": "W142hJk",
      "created": 1686916626075,
      "lastUpdated": 1686916669389,
      "status": "Succeeded",
      "summary": {
        "result": {
          "type": "Artifact",
          "artifact": "/audio.aac",
          "artifactUrl": "https://upcdn.io/W142hJk/audio/example.mp3!f=aac&a=/audio.aac"
        }
      }
    }
To create a WAV file:
Upload an input file (e.g. an audio or video file) or create an external file source.
Replace /raw/ with /audio/ in the file's URL, and then append ?f=wav-riff to the URL.
Navigate to the URL (i.e. request the URL using a simple GET request).
Wait for status: "Succeeded" in the JSON response.
The result will contain a URL to the WAV file:
https://upcdn.io/W142hJk/audio/example.mp3?f=wav-riff
    {
      "jobUrl": "https://api.bytescale.com/v2/accounts/W142hJk/jobs/ProcessFileJob/01H3211XMV1VH829RV697VE3WM",
      "jobDocs": "https://www.bytescale.com/docs/job-api/GetJob",
      "jobId": "01H3211XMV1VH829RV697VE3WM",
      "jobType": "ProcessFileJob",
      "accountId": "W142hJk",
      "created": 1686916626075,
      "lastUpdated": 1686916669389,
      "status": "Succeeded",
      "summary": {
        "result": {
          "type": "Artifact",
          "artifact": "/audio.wav",
          "artifactUrl": "https://upcdn.io/W142hJk/audio/example.mp3!f=wav-riff&a=/audio.wav"
        }
      }
    }
To create an HTTP Live Streaming (HLS) file:
Upload an input file (e.g. an audio or video file) or create an external file source.
Replace /raw/ with /audio/ in the file's URL, and then append ?f=hls-aac to the URL.
Add parameters from the Audio Transcoding API or Audio Compression API.
You can create adaptive bitrate (ABR) audio by specifying multiple groups of bitrate and/or sample rate parameters. The end-user's audio player will automatically switch to the most appropriate variant during playback. By default, a single 96 kbps variant is produced.
You can specify up to 10 variants. Each variant's parameters must be adjacent on the querystring. For example: br=80&sr=24&br=256&sr=48 specifies 2 variants, whereas br=80&br=256&sr=24&sr=48 specifies 3 variants (which would most likely be a mistake). You can add next=true between groups of parameters to forcefully split them into separate variants.
Navigate to the URL (i.e. request the URL using a simple GET request).
Wait for status: "Succeeded" in the JSON response.
The result will contain a URL to the HTTP Live Streaming (HLS) file:
https://upcdn.io/W142hJk/audio/example.mp3?f=hls-aac&br=80&br=256
    {
      "jobUrl": "https://api.bytescale.com/v2/accounts/W142hJk/jobs/ProcessFileJob/01H3211XMV1VH829RV697VE3WM",
      "jobDocs": "https://www.bytescale.com/docs/job-api/GetJob",
      "jobId": "01H3211XMV1VH829RV697VE3WM",
      "jobType": "ProcessFileJob",
      "accountId": "W142hJk",
      "created": 1686916626075,
      "lastUpdated": 1686916669389,
      "status": "Succeeded",
      "summary": {
        "result": {
          "type": "Artifact",
          "artifact": "/audio.m3u8",
          "artifactUrl": "https://upcdn.io/W142hJk/audio/example.mp3!f=hls-aac&br=80&br=256&a=/audio.m3u8"
        }
      }
    }
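The adjacency rule for adaptive bitrate variants can be easier to follow with a small model. The sketch below is an inferred reading of the two examples given in the steps above (a repeated parameter name starts a new variant, and next=true forces a split). It is not Bytescale's actual parser: `splitVariants` is a hypothetical helper, and the set of per-variant parameters (br, sr, rt) is an assumption.

```javascript
// Sketch: group querystring parameters into HLS variants using the adjacency
// rule described above. This is an inferred model, not Bytescale's parser.
const VARIANT_PARAMS = new Set(["br", "sr", "rt"]); // assumed per-variant parameters

function splitVariants(querystring) {
  const variants = [];
  let current = {};
  const flush = () => {
    if (Object.keys(current).length > 0) variants.push(current);
    current = {};
  };
  for (const [name, value] of new URLSearchParams(querystring)) {
    if (name === "next" && value === "true") {
      flush(); // explicit split between groups
    } else if (VARIANT_PARAMS.has(name)) {
      if (name in current) flush(); // a repeated parameter starts a new variant
      current[name] = value;
    }
  }
  flush();
  if (variants.length > 10) throw new Error("At most 10 variants are allowed");
  return variants;
}

console.log(splitVariants("br=80&sr=24&br=256&sr=48").length); // 2
console.log(splitVariants("br=80&br=256&sr=24&sr=48").length); // 3
```

Running a querystring through a check like this before requesting the URL is one way to catch an accidental third variant.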
Real-time transcoding allows you to return HLS manifests (.m3u8 files) while they're being transcoded, rather than having to wait for the full transcode job to complete.
To create HTTP Live Streaming (HLS) audio with real-time transcoding:
Complete the steps from creating HLS audio.
Replace f=hls-aac with f=hls-aac-rt.
The result will be an M3U8 file that's dynamically updated as new segments finish transcoding:
https://upcdn.io/W142hJk/audio/example.mp3?f=hls-aac-rt
    #EXTM3U
    #EXT-X-VERSION:3
    #EXT-X-INDEPENDENT-SEGMENTS
    #EXT-X-STREAM-INF:BANDWIDTH=2038521,AVERAGE-BANDWIDTH=2038521,CODECS="mp4a.40.2"
    example.mp3!f=hls-aac-rt&a=/0f/manifest.m3u8
The Audio Metadata API allows you to extract the audio file's duration, codec, and more.
To extract an audio file's duration using JavaScript:
    <!DOCTYPE html>
    <html>
    <body>
      <p>Please wait, loading audio metadata...</p>
      <script>
        async function getAudioDuration() {
          const response = await fetch("https://upcdn.io/W142hJk/audio/example.mp4?f=meta");
          const jsonData = await response.json();
          const audioTrack = (jsonData.tracks ?? []).find(x => x.type === "Audio");
          if (audioTrack === undefined) {
            alert("Cannot find audio metadata.")
          } else {
            alert(`Duration (seconds): ${audioTrack.duration}`)
          }
        }
        getAudioDuration().then(() => {}, e => alert(`Error: ${e}`))
      </script>
    </body>
    </html>
The Audio Processing API can transcode audio from video and audio files:
The Audio Processing API can transcode audio from the following audio inputs:
File Extension(s) | Audio Container | Audio Codecs |
---|---|---|
.wma, .asf | Advanced Systems Format (ASF) | WMA, WMA2, WMA Pro |
.fla, .flac | FLAC | FLAC |
.mp3 | MPEG-1 Layer 3 | MP3 |
.ts, .m2ts | MPEG-2 TS | MP2, PCM |
.aac, .mp4, .m4a | MPEG-4 | AAC |
.mka | Matroska Audio Container | Opus, FLAC |
.oga | OGA | Opus, Vorbis, FLAC |
.wav | Waveform Audio File | PCM |
The Audio Processing API can transcode audio from the following video inputs:
File Extension(s) | Video Container | Video Codecs |
---|---|---|
.gif | No Container | GIF 87a, GIF 89a |
.m2v, .mpeg, .mpg | No Container | AVC (H.264), DV/DVCPRO, HEVC (H.265), MPEG-1, MPEG-2 |
.3g2 | 3G2 | AVC (H.264), H.263, MPEG-4 part 2 |
.3gp | 3GP | AVC (H.264), H.263, MPEG-4 part 2 |
.wmv | Advanced Systems Format (ASF) | VC-1 |
.flv | Adobe Flash | AVC (H.264), Flash 9 File, H.263 |
.avi | Audio Video Interleave (AVI) | Uncompressed, Canopus HQ, DivX/Xvid, DV/DVCPRO, MJPEG |
.mxf | Interoperable Master Format (IMF) | Apple ProRes, JPEG 2000 (J2K) |
.mxf | Material Exchange Format (MXF) | Uncompressed, AVC (H.264), AVC Intra 50/100, Apple ProRes (4444, 4444 XQ, 422, 422 HQ, LT, Proxy), DV/DVCPRO, DV25, DV50, DVCPro HD, JPEG 2000 (J2K), MPEG-2, Panasonic P2, SonyXDCam, SonyXDCam MPEG-4 Proxy, VC-3 |
.mkv | Matroska | AVC (H.264), MPEG-2, MPEG-4 part 2, PCM, VC-1 |
.mpg, .mpeg, .m2p, .ps | MPEG Program Streams (MPEG-PS) | MPEG-2 |
.m2t, .ts, .tsv | MPEG Transport Streams (MPEG-TS) | AVC (H.264), HEVC (H.265), MPEG-2, VC-1 |
.dat, .m1v, .mpeg, .mpg, .mpv | MPEG-1 System Streams | MPEG-1, MPEG-2 |
.mp4, .mpeg4 | MPEG-4 | Uncompressed, DivX/Xvid, H.261, H.262, H.263, AVC (H.264), AVC Intra 50/100, HEVC (H.265), JPEG 2000, MPEG-2, MPEG-4 part 2, VC-1 |
.mov, .qt | QuickTime | Uncompressed, Apple ProRes (4444, 4444 XQ, 422, 422 HQ, LT, Proxy), DV/DVCPRO, DivX/Xvid, H.261, H.262, H.263, AVC (H.264), AVC Intra 50/100, HEVC (H.265), JPEG 2000 (J2K), MJPEG, MPEG-2, MPEG-4 part 2, QuickTime Animation (RLE) |
.webm | WebM | VP8, VP9 |
Some codec profiles are not supported by Bytescale. Notably, AVC (H.264) High 4:4:4 Predictive is currently not supported. We aim to provide a full list of supported profiles in the near future.
Use the Audio Metadata API to extract the duration, codec, and other information from an audio file.
Instructions:
Replace raw with audio in your audio URL.
Append ?f=meta to the URL.
The result will be a JSON payload describing the audio's tracks (see below).
Example audio metadata JSON response:
    {
      "tracks": [
        {
          "bitRate": 159980,
          "bitRateMode": "VBR",
          "channels": 2,
          "codec": "AAC",
          "codecId": "mp4a-40-2",
          "frameCount": 35875,
          "frameRate": 46.875,
          "samplingRate": 48000,
          "title": "Stereo",
          "type": "Audio"
        }
      ]
    }
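As an illustration of consuming this payload, the sketch below summarises the first audio track. `summarizeAudioTrack` is a hypothetical helper. The duration estimate (frameCount divided by frameRate) is an assumption derived from the fields shown in the example, not a documented formula, so treat it as a fallback for responses that carry no explicit duration.

```javascript
// Sketch: summarise the first audio track of an f=meta response.
// The duration estimate (frameCount / frameRate) is an assumption based on
// the fields in the example response, not a documented formula.
function summarizeAudioTrack(metadata) {
  const track = (metadata.tracks ?? []).find(t => t.type === "Audio");
  if (track === undefined) return null;
  return {
    codec: track.codec,
    channels: track.channels,
    bitRateKbps: track.bitRate / 1000,
    sampleRateKhz: track.samplingRate / 1000,
    estimatedDurationSec: track.frameCount / track.frameRate
  };
}

const example = {
  tracks: [{
    bitRate: 159980, bitRateMode: "VBR", channels: 2, codec: "AAC",
    codecId: "mp4a-40-2", frameCount: 35875, frameRate: 46.875,
    samplingRate: 48000, title: "Stereo", type: "Audio"
  }]
};
console.log(summarizeAudioTrack(example));
```

For the example response, the estimate works out to roughly 765 seconds (35875 frames at 46.875 frames per second).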
Use the Audio Transcoding API to transcode your audio to a specific format.
Use the f parameter to change the output format of the audio:
Format | Transcoding | Compression | Browser Support |
---|---|---|---|
f=mp3 | medium | good | all |
f=aac recommended | medium | excellent | all |
f=wav-riff | medium | none | none |
f=wav-rf64 | medium | none | none |
f=hls-aac | medium | excellent | requires SDK |
f=hls-aac-rt | fast | excellent | requires SDK |
f=mp3
Transcodes the audio to MP3 (.mp3).
Response: JSON for an asynchronous transcode job. The JSON will contain the URL to the MP3 file on job completion.
f=aac
Transcodes the audio to AAC (.aac).
Response: JSON for an asynchronous transcode job. The JSON will contain the URL to the AAC file on job completion.
f=wav-riff
Transcodes the audio to Waveform (.wav) using the RIFF wave format.
Response: JSON for an asynchronous transcode job. The JSON will contain the URL to the WAV file on job completion.
f=wav-rf64
Transcodes the audio to Waveform (.wav) using the RF64 wave format (to support output audio larger than 4 GB).
Response: JSON for an asynchronous transcode job. The JSON will contain the URL to the WAV file on job completion.
f=hls-aac
Transcodes the audio to HLS AAC (.m3u8).
Response: JSON for an asynchronous transcode job. The JSON will contain the URL to the M3U8 file on job completion.
Browser support: all browsers (requires an audio player SDK with HLS support, like Video.js)
f=hls-aac-rt
Transcodes the audio to HLS AAC (.m3u8) and returns the audio while it's being transcoded.
This output format is designed to reduce the wait time for your listeners when the given audio has not been transcoded before. Like the other output formats, this audio format incurs an initial delay while transcoding starts. However, unlike the other formats, once transcoding begins the audio will be streamed to listeners during transcoding. As with the other formats, once transcoded, the resulting audio will be cached and will not need to be transcoded again.
Caveat: This format introduces challenges for some audio players and audio SDKs due to the use of a live M3U8 playlist during transcoding. As such, we generally recommend using one of the asynchronous formats (which don't end with -rt) for a simpler implementation.
Response: M3U8
Browser support: all browsers (requires an audio player SDK with HLS support, like Video.js)
f=html-aac
Returns a webpage with an embedded audio player that's configured to play the requested audio in AAC.
Useful for sharing links to audio files and for previewing/debugging audio transformation parameters.
Response: HTML
This is the default value.
f=meta
Returns metadata for the audio file (duration, codec, etc.).
See the Audio Metadata API docs for more information.
Response: JSON (audio metadata)
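Because the response type depends on the f value, client code often needs to branch on it. The lookup below is a sketch of the response types listed above. The kind labels ("job-json", "m3u8", "html", "metadata-json") and the `responseKindFor` helper are hypothetical, and the function only reads URLs that use the ? querystring form.

```javascript
// Sketch: map each f value to the kind of response described above.
// The kind labels are hypothetical names chosen for this example.
const RESPONSE_KIND = {
  "mp3": "job-json",
  "aac": "job-json",
  "wav-riff": "job-json",
  "wav-rf64": "job-json",
  "hls-aac": "job-json",
  "hls-aac-rt": "m3u8",
  "html-aac": "html",
  "meta": "metadata-json"
};

function responseKindFor(audioUrl) {
  // When f is omitted, the HTML player (f=html-aac) is the default.
  const format = new URL(audioUrl).searchParams.get("f") ?? "html-aac";
  const kind = RESPONSE_KIND[format];
  if (kind === undefined) throw new Error(`Unknown f value: ${format}`);
  return kind;
}

console.log(responseKindFor("https://upcdn.io/W142hJk/audio/example.mp3?f=mp3")); // job-json
console.log(responseKindFor("https://upcdn.io/W142hJk/audio/example.mp3"));       // html
```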
rt=auto
If this flag is present, the audio variant expressed by the adjacent parameters on the querystring (e.g. br=80&rt=true&br=256&rt=auto) will be returned to the user while it's being transcoded only if the transcode rate is faster than the playback rate.
Only supported by f=hls-aac-rt and f=html-aac.
This is the default value.
rt=false
If this flag is present, the audio variant expressed by the adjacent parameters on the querystring (e.g. br=80&rt=true&br=256&rt=false) will never be returned to the user while it's being transcoded.
Use this option as a performance optimization (instead of using rt=auto) when you know the variant will always transcode at a slower rate than its playback rate:
•When rt=auto is used, the initial HTTP request for the M3U master manifest blocks until the first few segments of every rt=auto and rt=true variant have been transcoded; only then is the initial M3U playlist returned.
•In general, you want to exclude slow-transcoding HLS variants to reduce this latency.
If none of the HLS variants have rt=true or rt=auto then the fastest variant to transcode will be returned during transcoding.
Only supported by f=hls-aac-rt and f=html-aac.
rt=true
If this flag is present, the audio variant expressed by the adjacent parameters on the querystring (e.g. br=80&rt=true&br=256&rt=auto) will always be returned to the user while it's being transcoded.
Only supported by f=hls-aac-rt and f=html-aac.
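The three real-time modes can be modelled as a small decision function. This sketch only restates the rules above and is not how Bytescale implements them. `transcodeRate` and `playbackRate` are hypothetical inputs measured in seconds of audio per second, and the fallback is applied only in the literal case the text names (no variant has rt=true or rt=auto).

```javascript
// Sketch of the rt semantics described above. Rates are hypothetical inputs
// measured in seconds of audio per wall-clock second.
function isStreamedDuringTranscode({ rt = "auto", transcodeRate, playbackRate = 1 }) {
  switch (rt) {
    case "true":
      return true;  // always returned during transcoding
    case "false":
      return false; // never returned during transcoding
    case "auto":
      return transcodeRate > playbackRate; // only if transcoding outpaces playback
    default:
      throw new Error(`Unsupported rt value: ${rt}`);
  }
}

function streamedVariants(variants) {
  if (variants.length === 0) return [];
  const allFalse = variants.every(v => (v.rt ?? "auto") === "false");
  if (allFalse) {
    // Fallback described above: the fastest-transcoding variant is returned.
    const fastest = variants.reduce((a, b) => (b.transcodeRate > a.transcodeRate ? b : a));
    return [fastest];
  }
  return variants.filter(v => isStreamedDuringTranscode(v));
}
```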
Use the Audio Compression API to control the file size of your audio.
br
Sets the output audio bitrate (kbps).
Supported values for f=aac, f=hls-aac, f=hls-aac-rt and f=html-aac:
•16
•20
•24
•28
•32
•40
•48
•56
•64
•80
•96
•112
•128
•160
•192
•224
•256
•288
•320
•384
•448
•512
•576
Supported values for f=mp3:
•16
•24
•32
•40
•48
•56
•64
•72
•80
•88
•96
•104
•112
•120
•128
•136
•144
•152
•160
•168
•176
•184
•192
•200
•208
•216
•224
•232
•240
•248
•256
•264
•272
•280
•288
•296
Not applicable to f=wav-riff or f=wav-rf64 (Waveform audio files do not have a bitrate).
Default: 96
sr
Sets the output audio sample rate (kHz).
Supported values for f=aac, f=hls-aac, f=hls-aac-rt and f=html-aac:
•8
•12
•16
•22.05
•24
•32
•44.1
•48
•88.2
•96
Supported values for f=mp3:
•22.05
•32
•44.1
•48
Supported values for f=wav-riff and f=wav-rf64:
•8
•16
•22.05
•24
•32
•44.1
•48
•88.2
•96
•192
Note: the sample rate will be automatically adjusted if the provided value is unsupported by the requested bitrate for the requested audio format (for example, AAC only supports sample rates between 32 kHz and 48 kHz when a bitrate of 96 kbps is used).
Default: 48
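The bitrate and sample rate lists above lend themselves to a client-side check before a URL is requested. The sketch below encodes those lists. `validateCompressionParams` is a hypothetical helper, mapping both WAV formats to the Waveform sample rate list is an assumption, and the automatic sample rate adjustment mentioned in the note is not modelled.

```javascript
// Sketch: check br and sr against the supported values listed above.
const AAC_BITRATES = [16, 20, 24, 28, 32, 40, 48, 56, 64, 80, 96, 112, 128,
  160, 192, 224, 256, 288, 320, 384, 448, 512, 576];
// The MP3 list above is every multiple of 8 from 16 to 296.
const MP3_BITRATES = Array.from({ length: 36 }, (_, i) => 16 + i * 8);

const SUPPORTED = {
  aac: { br: AAC_BITRATES, sr: [8, 12, 16, 22.05, 24, 32, 44.1, 48, 88.2, 96] },
  mp3: { br: MP3_BITRATES, sr: [22.05, 32, 44.1, 48] },
  wav: { br: null, sr: [8, 16, 22.05, 24, 32, 44.1, 48, 88.2, 96, 192] }
};

// Assumption: both WAV formats share the Waveform sample rate list.
const FAMILY = {
  "aac": "aac", "hls-aac": "aac", "hls-aac-rt": "aac", "html-aac": "aac",
  "mp3": "mp3", "wav-riff": "wav", "wav-rf64": "wav"
};

function validateCompressionParams(format, { br, sr } = {}) {
  const family = FAMILY[format];
  if (family === undefined) throw new Error(`Unknown format: ${format}`);
  const { br: bitrates, sr: sampleRates } = SUPPORTED[family];
  const problems = [];
  if (br !== undefined) {
    if (bitrates === null) problems.push("br is not applicable to this format");
    else if (!bitrates.includes(br)) problems.push(`br=${br} is not a supported bitrate`);
  }
  if (sr !== undefined && !sampleRates.includes(sr)) {
    problems.push(`sr=${sr} is not a supported sample rate`);
  }
  return problems; // an empty array means no problems were found
}

console.log(validateCompressionParams("mp3", { br: 20, sr: 44.1 }));
// [ 'br=20 is not a supported bitrate' ]
```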
Use the Audio Trimming API to remove parts of the audio from the start and/or end.
ts
Sets the start position of the audio, and removes all audio before that point.
If ts exceeds the length of the audio, then an error will be returned.
Supports numbers between 0 and 86399 with up to two decimal places. To provide frame accuracy for audio inputs, decimals will be interpreted as frame numbers, not milliseconds.
te
Sets the end position of the audio, and removes all audio after that point.
If te exceeds the length of the audio, then no error will be returned, and the parameter effectively does nothing.
Supports numbers between 0 and 86399 with up to two decimal places. To provide frame accuracy for audio inputs, decimals will be interpreted as frame numbers, not milliseconds.
Applies the trim specified by ts and/or te after the rp parameter is applied.
Applies the trim specified by ts and/or te before the rp parameter is applied.
This is the default value.
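A trimmed URL can be assembled with a small helper that enforces the documented range (0 to 86399, at most two decimal places) for ts and te. `withTrim` and `toTrimValue` are hypothetical names. The sketch validates the format of the values only and does not interpret the decimals.

```javascript
// Sketch: validate ts/te against the documented limits and add them to a URL.
function toTrimValue(name, value) {
  if (!Number.isFinite(value) || value < 0 || value > 86399) {
    throw new RangeError(`${name} must be a number between 0 and 86399`);
  }
  const text = String(value);
  // Accept whole numbers or numbers with one or two decimal places.
  if (!/^\d+(\.\d{1,2})?$/.test(text)) {
    throw new RangeError(`${name} supports at most two decimal places`);
  }
  return text;
}

function withTrim(audioUrl, { ts, te } = {}) {
  const url = new URL(audioUrl);
  if (ts !== undefined) url.searchParams.set("ts", toTrimValue("ts", ts));
  if (te !== undefined) url.searchParams.set("te", toTrimValue("te", te));
  return url.toString();
}

console.log(withTrim("https://upcdn.io/W142hJk/audio/example.mp3?f=mp3", { ts: 5, te: 20.5 }));
// https://upcdn.io/W142hJk/audio/example.mp3?f=mp3&ts=5&te=20.5
```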
Use the Audio Concatenation API to append additional audio files to the primary audio file's timeline.
append
Appends the audio from another media file (video or audio file) to the output.
You can specify this parameter multiple times to append multiple media files.
If you specify append multiple times, then the media files will be concatenated in the order of the querystring parameters, with the primary input audio (specified on the URL's file path) playing first.
To use: specify the "file path" attribute of another media file as the query parameter's value.
rp
Number of times to play the audio file.
If this parameter appears after an append parameter, then it will repeat the appended audio file only.
If this parameter appears before any append parameters, then it will repeat the primary audio file only.
Default: 1
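Because the position of the repeat parameter relative to each append parameter changes its meaning, it can help to generate the querystring from a structured description. The sketch below assumes the repeat parameter is named rp (the name used in the trimming section) and that percent-encoded file paths are accepted. `buildConcatQuery` and the file paths are hypothetical.

```javascript
// Sketch: build an ordered querystring for concatenation and repeats.
// Assumptions: the repeat parameter is named "rp" and percent-encoded file
// paths are accepted. The file paths used below are made up.
function buildConcatQuery({ primaryRepeat, appended = [] } = {}) {
  const parts = [];
  // A repeat placed before any append applies to the primary file.
  if (primaryRepeat !== undefined) parts.push(["rp", primaryRepeat]);
  for (const { filePath, repeat } of appended) {
    parts.push(["append", filePath]);
    // A repeat placed after an append applies to that appended file only.
    if (repeat !== undefined) parts.push(["rp", repeat]);
  }
  return parts.map(([name, value]) => `${name}=${encodeURIComponent(value)}`).join("&");
}

console.log(buildConcatQuery({
  primaryRepeat: 2,
  appended: [
    { filePath: "/uploads/outro.mp3", repeat: 3 },
    { filePath: "/uploads/credits.mp3" }
  ]
}));
// rp=2&append=%2Fuploads%2Foutro.mp3&rp=3&append=%2Fuploads%2Fcredits.mp3
```

In this example the primary audio would play twice, followed by the first appended file three times and the second appended file once.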
The Audio Processing API is available on all Bytescale Plans.
Your processing quota (see pricing) is consumed by the output audio file's duration multiplied by a "processing multiplier": the codec of your output audio file determines the "processing multiplier" that will be used.
Audio files can be played an unlimited number of times.
Your processing quota will only be deducted once per URL: for the very first request to the URL.
There is a minimum billable duration of 10 seconds per audio file.
Audio billing example:
A 60-second audio file encoded to AAC would consume 45 seconds (60 × 0.75) from your monthly processing quota.
If the audio file is first played in January 2025 and is then played 100k times over the following 2 years, you would be billed 45 seconds in January 2025 and 0 seconds in all subsequent months. (This assumes you never clear your permanent cache.)
Codec | Processing Multiplier |
---|---|
AAC | 0.75 |
MP3 | 0.75 |
WAV | 1.15 |
When using f=hls-aac, f=hls-aac-rt or f=html-aac (which uses f=hls-aac-rt internally) your processing quota will be consumed per HLS variant.
When using f=hls-aac-rt each real-time variant (rt=true or rt=auto) will have an additional 10 seconds added to its billable duration.
The default behavior for HLS outputs is to produce one HLS AAC variant.
You can change this behavior using the querystring parameters documented on this page.
HLS pricing example:
Given an input audio file of 60 seconds and the querystring ?f=hls-aac-rt&br=64&br=128&br=256&rt=false, you would be billed:
3×60 seconds for 3× HLS variants (br=64&br=128&br=256).
2×10 seconds for 2× HLS variants using real-time transcoding.
The first two variants on the querystring (br=64&br=128) do not specify rt parameters, so will default to rt=auto.
Per the pricing above, real-time variants incur an additional 10 seconds of billable duration.
200 seconds total billed duration: 3×60 + 2×10
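The two billing examples above can be reproduced with a short calculation. This is a sketch under stated assumptions, not a billing reference: it assumes the 10-second minimum is applied to the duration before the multiplier, and it reports the HLS figure as a billed duration without a multiplier because that is how the HLS example totals it. `quotaSeconds` and `hlsBillableSeconds` are hypothetical helpers.

```javascript
// Sketch: reproduce the billing examples above.
const PROCESSING_MULTIPLIER = { AAC: 0.75, MP3: 0.75, WAV: 1.15 };
const MIN_BILLABLE_SECONDS = 10;
const REAL_TIME_SURCHARGE_SECONDS = 10;

// Quota consumed by a single (non-HLS) output file.
// Assumption: the 10-second minimum applies before the multiplier.
function quotaSeconds(durationSec, codec) {
  const multiplier = PROCESSING_MULTIPLIER[codec];
  if (multiplier === undefined) throw new Error(`Unknown codec: ${codec}`);
  return Math.max(durationSec, MIN_BILLABLE_SECONDS) * multiplier;
}

// Billed duration for an HLS output, totalled as in the HLS example above
// (per variant, plus 10 seconds per real-time variant, no multiplier applied).
function hlsBillableSeconds(durationSec, variants, { realTime = true } = {}) {
  const perVariant = Math.max(durationSec, MIN_BILLABLE_SECONDS);
  const realTimeCount = realTime
    ? variants.filter(v => (v.rt ?? "auto") !== "false").length
    : 0;
  return variants.length * perVariant + realTimeCount * REAL_TIME_SURCHARGE_SECONDS;
}

console.log(quotaSeconds(60, "AAC")); // 45
console.log(hlsBillableSeconds(60, [{ br: 64 }, { br: 128 }, { br: 256, rt: "false" }])); // 200
```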