How To Use the OpenAI API

What is OpenAI?

OpenAI is a research company that focuses on artificial intelligence (AI) and machine learning. Founded in 2015, OpenAI’s goal was to “advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.”

In 2019 they pivoted from a non-profit to what they call a “capped-profit” structure. The for-profit company, named OpenAI LP, was created to secure an additional $1 billion in funding from Microsoft. The for-profit is still controlled by the original non-profit, OpenAI Inc.

What OpenAI APIs are available?

GPT-3

The GPT-3 language models are adept at a range of text generation and transformation use cases, including copywriting, summarization, parsing unstructured text, classification, and translation.

Pricing has recently dropped as their engineers find ways to run their model more efficiently and pass those savings on to us! The model is pay-as-you-go and ranges from $0.0004 to $0.0200 per 1k tokens (~750 words) based on the quality of the model you select.
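
As a rough back-of-envelope example (my own numbers, just to give a sense of scale), a 1,500-word output is roughly 2,000 tokens, so:

2,000 tokens × ($0.0200 / 1k tokens) = $0.04 with the most expensive model
2,000 tokens × ($0.0004 / 1k tokens) = $0.0008 with the cheapest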

New accounts receive $18 in free credit that can be used during the first 3 months, which is great for experimenting. I’ve been using it for over a month regularly and have yet to reach $1 in usage fees!

Update — Nov. 28th, 2022
OpenAI has just released their latest GPT-3 model: text-davinci-003, which can do any task the other models can do, often with higher quality, longer output, and better instruction-following. It also supports inserting completions within text, and is now trained on data up to June 2021 (the previous text-davinci-002 model was trained on data up to October 2019).

DALL·E

In January 2021, OpenAI introduced DALL·E, their state-of-the-art text-to-image model. In November 2022 they finally made the DALL·E API available as a public beta.

Pricing ranges from $0.016/image for 256×256 resolution to $0.020/image for 1024×1024 images. These calls can also take advantage of the $18 in free credit for new accounts mentioned above.

Codex

Released in August 2021, Codex is a descendant of GPT-3 that translates natural language to code in over a dozen programming languages, including JavaScript, TypeScript, Go, Perl, PHP, Ruby, Swift, SQL, and even Shell.

Some example uses include turning comments into code (magic!), contextual next-line and full-function completion, discovery of applicable APIs and libraries, and rewriting code to improve performance.

API access is still in private beta, but you can join the waitlist. It also powers GitHub Copilot, so you can try it there today. You can also watch a pretty impressive demo of it in use.

Who is already building on OpenAI’s API?

OpenAI’s release of APIs to access these machine learning models has resulted in a wave of new startups and service offerings. The AI Writing Assistants category alone has over 100 companies, with more launching every month.

Here are a few highlights of companies and products using these APIs in some form:

Notion has announced that they are integrating AI text generation features into their note-taking and productivity software. This functionality is still in private alpha and will bring the power of artificial intelligence directly into your Notion workspace.

GitHub Copilot is built on Codex and uses the context in your editor to automatically generate whole lines of code or even entire functions.

Microsoft is bringing DALL·E to a new graphic design app called Designer, allowing users to create professional-quality social media posts, invitations, digital postcards, graphics, and more. They are also integrating it into Bing and Microsoft Edge with Image Creator, enabling users to create new images if web results don’t return what they’re looking for.

Duolingo uses GPT-3 to provide French grammar correction functionality. An internal study they conducted showed that the use of this feature leads to improved second-language writing skills.

Viable uses language models, including GPT-3, to analyze customer feedback and generate summaries and insights, helping businesses better and more quickly understand what customers are telling them.

Keeper Tax helps freelancers find tax-deductible expenses automatically by using GPT-3 to parse and interpret their bank statements into usable transaction information.

How can you use the OpenAI APIs?

Getting started with OpenAI is straightforward:

Step 1 — Create a new OpenAI account

Go to https://beta.openai.com/signup and create your account.

Step 2 — Generate an API key

Visit https://beta.openai.com/account/api-keys and click “Create new secret key”. Make sure to save this newly generated key somewhere safe, since you won’t be able to see the full key again once you close the modal.

Reminder: your API key is a secret, so don’t share it with anyone, use it in client-side code, include it in blog posts, or check it into any public GitHub repositories!
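
One simple way to honor that is to keep the key in an environment variable rather than pasting it into commands and scripts. A minimal sketch, assuming a Unix-like shell (the OPENAI_API_KEY variable name is just a convention I’m using here, not something the API requires):

# store the key once per shell session (or in your shell profile)
export OPENAI_API_KEY="sk-..."

# then reference it wherever the examples below show YOUR_API_KEY, e.g.:
#   -H "Authorization: Bearer $OPENAI_API_KEY"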

Step 3 — Make a test call!

For simplicity I’ll use curl in these examples, but you can obviously make these calls using Postman, or in code using any of the language-specific libraries offered by OpenAI or the community: https://beta.openai.com/docs/libraries

Make this simple test request to get a specific model’s details, using your API key:

curl https://api.openai.com/v1/models/text-davinci-003 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY"

If everything is working, you should get back a 200 response with JSON results like this:

{
  "id": "text-davinci-003",
  "object": "model",
  "created": 1669599635,
  "owned_by": "openai-internal",
  "permission": [
    {
      "id": "modelperm-TULcdyRYjFvZGfo1snLsisCV",
      "object": "model_permission",
      "created": 1669678785,
      "allow_create_engine": false,
      "allow_sampling": true,
      "allow_logprobs": true,
      "allow_search_indices": false,
      "allow_view": true,
      "allow_fine_tuning": false,
      "organization": "*",
      "group": null,
      "is_blocking": false
    }
  ],
  "root": "text-davinci-003",
  "parent": null
}
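
If you’re not sure which model IDs your account can use, you can also list them all with the same key. A quick sketch: the jq filter at the end is optional (and assumes you have jq installed); it just pulls the id field from each entry, assuming the list comes back under a data array of objects shaped like the one above:

curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY" \
  | jq -r '.data[].id'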

GPT-3

To prompt the model to respond with “Hello World”, use the completions endpoint like this:

curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "text-davinci-003",
    "prompt": "Say Hello World",
    "temperature": 0,
    "n": 1,
    "max_tokens": 5
  }'

You’ll get a response with the output text in the choices array, in this case as the only element:

{
    "id": "cmpl-6DeO...",
    "object": "text_completion",
    "created": 1669700523,
    "model": "text-davinci-003",
    "choices": [
        {
            "text": "\n\nHello World!",
            "index": 0,
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 3,
        "completion_tokens": 5,
        "total_tokens": 8
    }
}
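
If all you want is the generated text, the same jq trick works here too. A sketch that repeats the request above and pulls out just the first choice:

curl -s https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"model": "text-davinci-003", "prompt": "Say Hello World", "temperature": 0, "max_tokens": 5}' \
  | jq -r '.choices[0].text'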

A few parameters of note (a combined example follows the list):

model – There are several models to pick from, some being less expensive per token than others. You can read more about them and their strengths and weaknesses here: https://beta.openai.com/docs/models/gpt-3

prompt – This is the important bit, and where your creativity is required to guide the language model toward the output you expect. Here are some great examples to get you started in learning what is possible and how to achieve your desired results: https://beta.openai.com/examples

temperature – In AI writing assistant products you’ll often see this called “creativity”. On a scale of 0 to 1, 0 keeps the output deterministic and tracking the prompt closely, while 1 tells the model to “take lots of risks”. I used 0 in my prompt because I wanted it to return exactly what I asked for; with higher values I found that I’d be less likely to get “Hello World” back. For most uses, I’d recommend 0.7 as a good starting point. You can read more about sampling temperature if you want to dig into the concept.

n – This controls how many distinct completions to return for the prompt, hence the choices array in the JSON response. Note: this number acts as a multiplier for token use, so keep an eye on it to manage your budget.

max_tokens – This caps how many tokens the model will generate in the response. There is an upper limit for each model (covering prompt plus completion), 2,048 tokens for most, with newer models allowing 4,096. I usually keep it around 256 or 512 for most things, but for long-form content you’ll need to bump it up.
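
Putting a few of those parameters together, here’s a sketch of a request that asks for two alternative completions with a bit more creative freedom (the prompt itself is just an illustration):

curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "text-davinci-003",
    "prompt": "Write a tagline for a coffee shop aimed at programmers",
    "temperature": 0.7,
    "n": 2,
    "max_tokens": 64
  }'

The choices array should come back with two elements (index 0 and 1), and completion_tokens in the usage object will count tokens across both, which is why n acts as a cost multiplier.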

BTW, instead of making direct API calls, you can also use the GPT-3 Playground which has a simple web interface for working with the API, using your existing account. It also enables some other neat features like colorized highlighting of token probabilities.

DALL·E

The primary endpoint for creating images is, not surprisingly, images/generations.

curl https://api.openai.com/v1/images/generations \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{
    "prompt": "An illustration of a computer programmer writing code for a tutorial blog post",
    "n": 1,
    "size": "512x512"
  }'

The response will contain an array of generated image URLs, which you can then open to view. In this case I’ve set the n parameter to 1, which asks for just a single generation.

{
    "created": 1668720514,
    "data": [
        {
            "url": "https://oaidalleapiprodscus.blob.core.../img-odP0XBN8rHaARDz7TdJMrXnJ.png?..."
        }
    ]
}
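
The returned URLs don’t stay valid forever, so if you want to keep an image, download it while the link still works. The filename here is arbitrary; substitute the url value from your own response:

curl -o illustration.png "IMAGE_URL_FROM_RESPONSE"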

My first prompt asked for “a pencil sketch…”, and then I asked for “an illustration…”. I wouldn’t say these are exactly selfies, but not too far off!

You might notice a couple of the weaknesses of the current generation of text-to-image AI models, which also affect Midjourney and Stable Diffusion: trouble reproducing hands and text.

A few parameters of note:

prompt – The text description of the image you’d like to generate. Again, your creativity is required here to guide the model toward the output you want. For some excellent example prompts and the resulting images, check out PromptHero.

n – Like the GPT-3 API, this controls how many variations to return for the prompt. Again, this number acts as a multiplier for image generations, so be mindful of it to manage your budget.

size – Dimensions of the generated images. Currently the options are limited to 256x256, 512x512, or 1024x1024.

Codex

I’m still on the private beta waitlist, so I haven’t been able to use this myself directly. I’ll refer you to the documentation for now.

What are some benefits of using OpenAI APIs?

While there are plenty of software tools and services you can subscribe to that essentially wrap the OpenAI APIs and add domain-specific concepts and value on top, sometimes you might want to use the APIs directly yourself for experimentation, bulk processing, custom workflows, or integration into your product (new or existing)… or simply to take advantage of the incredible cost savings!

And if working with the raw APIs directly isn’t something that works for your needs, interests, or skill level… you can always use one of those higher-level tools that have already integrated it for you:

GPT-3

There’s a new and growing list of GPT-3 AI writing tools maintained by Last Writer: AI Writing Assistant Directory

DALL·E 2

Join the waitlist for Microsoft Designer

Try the Bing Image Creator Preview (if it’s available in your region yet)

OpenAI also has its own DALL·E preview app that you can use as a simple UI on top of their model; it works with your existing OpenAI account.

Codex

Being in a waitlist-gated private beta, your best option currently is to use GitHub Copilot.
