November 7, 2023

OpenAI Unveils New APIs: DALL-E 3, Text-to-Speech, and More

OpenAI has launched a series of new APIs and model updates, spanning image generation, text-to-speech, and speech recognition

DALL-E 3, OpenAI’s text-to-image model, is now available through an API, following its earlier integration with ChatGPT and Bing Chat. The API keeps built-in moderation features designed to guard against misuse. It offers several format and quality options, with resolutions from 1024×1024 to 1792×1024 and pricing starting at $0.04 per generated image. Compared with its predecessor, however, the DALL-E 3 API has some limitations. Unlike DALL-E 2, it cannot modify existing images or create variations. Generation requests sent to DALL-E 3 are also automatically rewritten "for safety reasons" and to add more detail, which can make the output less faithful to the exact wording of the prompt.
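As a rough illustration of the options described above, the sketch below builds (but does not send) a JSON body for an image generation request. The field names (`model`, `prompt`, `size`, `quality`, `n`), the `"standard"`/`"hd"` quality values, and the portrait size are assumptions based on OpenAI's public API reference rather than on this article, and the helper `build_image_request` is hypothetical.

```python
import json

# Sizes assumed from the range quoted in the article plus a portrait
# counterpart; check the current API reference before relying on them.
ALLOWED_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
ALLOWED_QUALITIES = {"standard", "hd"}  # assumed quality options


def build_image_request(prompt: str, size: str = "1024x1024",
                        quality: str = "standard") -> dict:
    """Build (without sending) a request body for one DALL-E 3 image."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")
    # One image per request is assumed here.
    return {"model": "dall-e-3", "prompt": prompt,
            "size": size, "quality": quality, "n": 1}


body = build_image_request("A lighthouse at dusk, watercolor", size="1792x1024")
print(json.dumps(body, indent=2))
```

Because the API may rewrite the prompt before generating, an application that needs to know what was actually rendered should inspect whatever revised prompt the response returns rather than assume the original text was used.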

OpenAI has also introduced an Audio API that provides access to six preset voices (Alloy, Echo, Fable, Onyx, Nova, and Shimmer) and two generative AI model variants. The API is available today, with pricing starting at $0.015 per 1,000 input characters. OpenAI's CEO, Sam Altman, emphasized how natural the generated audio sounds, noting its potential to improve app interactions, accessibility, language learning, and voice assistance. OpenAI does not, however, offer direct control over the emotional affect of the generated audio, though features of the text such as capitalization and grammar may influence a voice's tone.
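To put the quoted price in concrete terms, here is a small cost estimate. It assumes billing scales linearly with character count at the $0.015 starting rate; the function name `estimate_tts_cost` is hypothetical, and actual billing rules (rounding, higher-priced model variants) may differ.

```python
from decimal import Decimal

# Starting price quoted in the article: $0.015 per 1,000 input characters.
PRICE_PER_1K_CHARS = Decimal("0.015")


def estimate_tts_cost(text: str) -> Decimal:
    """Estimate the cost in USD of synthesizing `text`, assuming linear pricing."""
    return Decimal(len(text)) / Decimal(1000) * PRICE_PER_1K_CHARS


sample = "a" * 5000  # stand-in for a 5,000-character script
print(estimate_tts_cost(sample))  # 0.075
```

At this rate, a 5,000-character script would cost about seven and a half cents to synthesize.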

Furthermore, OpenAI announced the latest version of its open-source automatic speech recognition model, Whisper large-v3. This update boasts enhanced performance across various languages and is accessible via GitHub under a permissive license.

These API releases reflect OpenAI's commitment to expanding the accessibility and utility of advanced AI technologies. As the technology landscape continues to evolve, OpenAI is at the forefront of empowering developers and creators with innovative tools.

Neil Hodgson-Coyle
Editorial chief at TechNews180