Getting Started With OpenAI APIs

Adarsh M

1 year ago

ChatGPT has been the talk of the social media world recently, and the buzz continues. Within just five days, ChatGPT gained over one million users. The company behind ChatGPT is OpenAI, a research lab dedicated to artificial intelligence.

The tech giant Microsoft recently invested $1 billion to build AGI (artificial general intelligence) with OpenAI. OpenAI has provided several API endpoints for its artificial intelligence models, making it easier for developers like us to build on top of them.

This in-depth blog will go over all of the AI models and endpoints you can use in your next project, which could be the next million-dollar idea ;).

So let’s get started!

Understanding What Makes OpenAI Stand Out

OpenAI is considered one of the best AI research organizations in the world due to several factors, including its partnership with Microsoft and its advancements in NLP technology. The organization is dedicated to developing AI in a responsible and ethical manner, with a focus on preventing its models from promoting harmful content.

The easy-to-use API allows developers to access and utilize the latest AI technology in their projects. In this detailed blog, we will look into the OpenAI API and understand how to build cool stuff using this AI API.

Overview of OpenAI API

The OpenAI API provides a suite of pre-trained machine learning models that developers can use to perform various natural language processing (NLP) tasks. The API offers a range of models with varying levels of accuracy and performance, so developers can choose the model that is most suitable for their particular use case. The API is designed to be simple to use and integrate into existing applications.

The API is built around three key concepts:

Prompt
Tokens
Models

Prompt

Prompts are the actual user inputs. The API uses the prompt to understand the context of what the user wants, and the model outputs a result for that particular prompt.

The OpenAI API allows you to use a powerful language model to generate text that matches a certain context or pattern. You input a prompt, and the model generates a text completion that follows that prompt. This makes it very flexible and can be used for a variety of tasks, like creating content, writing code, summarizing text, having a conversation, and more. Basically, you give the model a prompt, and it does its best to generate text that fits what you asked for.

Tokens

The OpenAI API is powered by complex AI models that have been trained with vast amounts of data. These models use a token-based system to understand and respond to user inputs. Tokens represent the smallest units that the model can understand, which can be either individual words or chunks of characters.

The number of tokens used in a request is determined by the length of the input and output text. There is a limit on the maximum number of tokens allowed in a single request, with 2048 tokens being the typical limit for most models and 4000 tokens for the text-davinci-003 model. This token-based system allows for a flexible and powerful interface to the AI models offered by the OpenAI API.
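
To get a feel for these limits, here is a rough sketch that estimates token usage before a request is sent. It assumes the common rule of thumb of about four characters of English text per token, which is only an approximation of what the real tokenizer does. The function names and the MAX_TOKENS table are made up for this example; the limits themselves are the ones quoted above.

```python
import math

# Token limits quoted in this post; the table itself is just for illustration.
MAX_TOKENS = {
    "text-davinci-003": 4000,
    "default": 2048,
}


def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming about 4 characters of English text per token."""
    return math.ceil(len(text) / 4)


def fits_in_request(prompt: str, max_output_tokens: int, model: str = "default") -> bool:
    """Check whether the prompt plus the requested output stays within the limit."""
    limit = MAX_TOKENS.get(model, MAX_TOKENS["default"])
    return estimate_tokens(prompt) + max_output_tokens <= limit
```

Because both the input and the output count toward the limit, a long prompt leaves less room for the completion.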

Models

The OpenAI API is a powerful tool that utilizes various models to generate outputs based on user inputs. These models have a wide range of capabilities, such as code completion, text completion, and image generation.

The models are trained with large amounts of data, allowing them to respond to user inputs in a way that resembles human communication. This field is called “natural language processing” (NLP). The output of the models is influenced by the data they have been trained on, so more (and more varied) training data generally leads to more accurate and diverse outputs.

How to interact with these APIs?

OpenAI has launched both JavaScript and Python libraries to call these APIs. All you need is an OpenAI API key, and you are good to go. You can get your API key from the official website itself.

In Python, a request comes down to sending your API key, a model name, and a prompt to the API.
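
As a minimal sketch of such a request, the snippet below talks to the completions endpoint directly with Python's standard library instead of the official openai package, so it runs without installing anything. The helper names (build_completion_request, complete) and the default values are assumptions made for this example, not part of any official library.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/completions"


def build_completion_request(prompt, api_key, model="text-davinci-003", max_tokens=100):
    """Build (but do not send) an HTTP request for a text completion."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def complete(prompt, api_key):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, api_key)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example (needs a real key and network access):
# print(complete("Say hello in French.", os.environ["OPENAI_API_KEY"]))
```

Reading the key from an environment variable, as in the commented example, keeps the secret out of your source code.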

OpenAI also has a Node.js library, which you can install by running the following command in your Node.js project directory:

$ npm install openai

After you install the library, you can call the API from your Node.js code using your secret key.

Apart from the official libraries, there are numerous open-source community libraries for various languages that can be used to interact with the API. These API wrappers make it easy to use the API.

Choosing the Right Model for Your Use Case

OpenAI has a large number of models, which are divided into three categories:

Generative models: These models have the capacity to produce original text, pictures, audio, and other kinds of content. Some examples of generative models include GPT-3 and DALL·E.
Discriminative models: These models are created to carry out a particular task, such as question-answering, sentiment analysis, or language translation. With the OpenAI API, such tasks are usually handled by prompting GPT-3, which can summarize text, respond to questions, and carry out a variety of other linguistic tasks.
Hybrid models: These models can be used for a variety of tasks, including content creation, semantic search, and classification. They combine elements of both generative and discriminative models. The OpenAI API allows you to fine-tune hybrid models to better suit your specific needs.

Let’s now look into each of the specific models in depth.

Text completion models

Text completion models, as their name suggests, are used to generate and manipulate text. They can produce sentences or whole paragraphs from the prompt that we give them.

The GPT-3 models, which are at work here, are able to comprehend and produce natural language. Currently, OpenAI offers four main models, each with a different level of power appropriate for a different task. They are:

text-davinci-003: This is the most capable of the four text models. It can do all the tasks that the other text generation models can do, with better quality. A request can have up to 4000 tokens, and it is the only one of these four models that supports this many.

Another area where Davinci shines is in understanding the intent of the text. Davinci is quite good at solving many logical problems and explaining the characters’ motives. Davinci has been able to solve some of the most challenging AI problems involving cause and effect.

text-curie-001: This model is also very capable, but its maximum number of tokens is just 2048. Curie is also effective as a general service chatbot, answering questions and performing Q&A.
text-babbage-001: This model is capable of straightforward tasks, is very fast, and costs less. Babbage is good at moderate classification and semantic search.

text-ada-001: With its fast performance and lower cost, this model is capable of carrying out very simple tasks.

All of these models are upward compatible, which means any task that can be done by Ada can also be done by Babbage, Curie, or Davinci.

It is recommended that we use the text-davinci-003 model for the best and most accurate results, but the OpenAI team does encourage us to try the other models to see if we can achieve the same results with lower latency and cost.
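
One way to act on that advice is to try the models from cheapest to most capable and stop at the first one whose output is good enough. The sketch below does exactly that. The complete and is_good_enough callables are hypothetical: you would plug in your own API call and your own quality check.

```python
from typing import Callable, Optional

# The four text models from this post, from least to most capable.
MODELS_CHEAPEST_FIRST = [
    "text-ada-001",
    "text-babbage-001",
    "text-curie-001",
    "text-davinci-003",
]


def smallest_adequate_model(
    prompt: str,
    complete: Callable[[str, str], str],
    is_good_enough: Callable[[str], bool],
) -> Optional[str]:
    """Return the first (cheapest) model whose output passes the check, else None."""
    for model in MODELS_CHEAPEST_FIRST:
        output = complete(model, prompt)  # hypothetical: your own API call
        if is_good_enough(output):
            return model
    return None
```

In practice you would run this over a handful of representative prompts rather than a single one before settling on a model.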

Code completion model

The Codex models are code completion models that can understand and generate code, and they are a variation of GPT-3. These models are proficient in various languages, including Python, JavaScript, Go, Perl, and Ruby. They are trained on billions of lines of code from GitHub.

Currently, Codex offers two models:

code-davinci-002: The most capable Codex model, with support for a maximum of 8000 tokens. The model is good at converting natural language to code.
code-cushman-001: Almost as powerful as Davinci Codex, but faster. Because of the speed advantage, it may be preferable for real-time applications.

As of now, the Codex models are free to use, but OpenAI is expected to introduce paid plans eventually.

You can use Codex for various tasks, such as:

Using Codex for comment-to-code conversion
Completing your code in context with Codex
Bringing relevant knowledge to your code with Codex
Adding comments to your code with Codex
Improving code efficiency with Codex
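
To make the first of these tasks concrete, here is a small sketch of a comment-to-code request. It only builds the JSON payload you would send to the completions endpoint: a comment describing the function, followed by its signature, so the model continues with the body. The helper name and the chosen stop sequences are assumptions for this example.

```python
def comment_to_code_payload(description: str, signature: str, max_tokens: int = 150) -> dict:
    """Build a completion payload asking Codex to fill in a Python function body."""
    # The prompt is a comment plus a signature; the model continues from there.
    prompt = f"# {description}\n{signature}\n"
    return {
        "model": "code-davinci-002",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0,  # keep the generated code as deterministic as possible
        "stop": ["\ndef ", "\nclass "],  # stop before the model starts a new definition
    }


payload = comment_to_code_payload("Return the n-th Fibonacci number", "def fib(n):")
```

The other tasks in the list follow the same pattern: only the prompt changes, for example by pasting existing code and ending with a comment that asks for an explanation.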

This snowflake animation was built entirely by AI; all I did was copy and paste. You can do almost all basic to medium code completion tasks with this state-of-the-art model.

Image generation model

OpenAI has an image generation model that is really popular in the tech world. The model is named DALL-E, and it can generate or manipulate images with text-based prompts.

The Images API provides three methods for interacting with the images:

Creating images from scratch based on a text prompt.
Creating an edited version of an existing image based on a text prompt.
Creating variations of an existing image to give it an artistic touch.
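
As a rough sketch of the first method, the snippet below builds the JSON body for an image generation request and lists the three endpoints. The helper name and the validation are my own additions for this example; the edit and variation endpoints take an uploaded image file rather than plain JSON, so they are listed here but not built.

```python
# The three Images API endpoints, one per method listed above.
IMAGE_ENDPOINTS = {
    "generate": "https://api.openai.com/v1/images/generations",
    "edit": "https://api.openai.com/v1/images/edits",
    "variation": "https://api.openai.com/v1/images/variations",
}

ALLOWED_SIZES = {"256x256", "512x512", "1024x1024"}


def image_generation_payload(prompt: str, n: int = 1, size: str = "512x512") -> dict:
    """Build the JSON body for generating n images from a text prompt."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {sorted(ALLOWED_SIZES)}")
    if n < 1:
        raise ValueError("n must be at least 1")
    return {"prompt": prompt, "n": n, "size": size}
```

The payload is sent with the same Authorization header as any other request to the API.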

DALL-E 1 vs DALL-E 2

DALL-E 2 has the ability to comprehend both images and text with remarkable precision. This enables it to not only generate images from text, but also produce alternative versions of famous paintings.

Additionally, it can comprehend various art styles, take into account lighting conditions, and produce shadows accordingly. Furthermore, it can perform inpainting and generate images for virtually any text prompt.

An oil painting of greenery and nature beauty

Sometimes the images produced do not seem very natural and appear to be cartoonish or pixelated, but in the future, we will have advanced AI image models that will make it difficult to distinguish between those created by humans and those created by AI.

an image that is the combination of Monalisa and scream

With the availability of these APIs, you don’t have to be an AI engineer to build cool things. You can leverage these APIs to create products on top of them.

Fine-tuning OpenAI models

Fine-tuning lets you enhance the performance of AI models from the OpenAI API. It results in more accurate and faster responses. The pre-trained models of GPT-3 can understand your intentions with just a few examples in the prompt and generate an output.

Fine-tuning further improves this by training on a larger set of examples. The process includes:

Preparing and uploading training data,
Training a new fine-tuned model, and
Using that fine-tuned model.
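
The first step, preparing the training data, can be sketched as follows. This assumes the JSONL format used for GPT-3 fine-tuning, where each line is a JSON object with a prompt and a completion. The helper name and the two sentiment examples are invented for illustration.

```python
import json


def to_jsonl(examples):
    """Serialize (prompt, completion) pairs as one JSON object per line."""
    return "".join(
        json.dumps({"prompt": prompt, "completion": completion}) + "\n"
        for prompt, completion in examples
    )


# Made-up training examples for a tiny sentiment classifier.
training_data = to_jsonl([
    ("I loved this movie ->", " positive"),
    ("Worst purchase ever ->", " negative"),
])
```

The resulting text is what you would save to a file and upload before starting the training job.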

Because a fine-tuned model no longer needs examples in every prompt, prompts get shorter, which saves tokens (and therefore cost) and speeds up requests. As of now, fine-tuning is available for davinci, curie, babbage, and ada.

Content moderation

The moderation endpoint is used to check if content follows OpenAI’s content policy. It helps developers identify prohibited content and take action, such as filtering it.

The API can filter content based on the following categories:

Hate: Content that promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.

Hate/Threatening: Content that promotes hate and also includes violence or harm towards a targeted group.

Self-harm: Content that encourages or depicts acts of self-harm, such as suicide, cutting, and eating disorders.

Sexual: Content meant to arouse sexual excitement or that promotes sexual services (excluding sex education and wellness).

Sexual/Minors: Sexual content that includes someone under 18 years old.

Violence: Content that promotes or glorifies violence or celebrates the suffering or humiliation of others.

Violence/Graphic: Content that depicts death, violence, or serious physical injury in extreme detail.
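
To show how these categories surface in practice, here is a small sketch that pulls the flagged categories out of a moderation response. The response shape (a results list whose first entry has a categories mapping and a flagged boolean) follows the moderation endpoint; the sample_response below is hand-written for illustration and is not real API output.

```python
def flagged_categories(moderation_response: dict) -> list:
    """Return the names of all categories marked true, in alphabetical order."""
    result = moderation_response["results"][0]
    return sorted(name for name, hit in result["categories"].items() if hit)


# Hand-written sample in the shape of a moderation response (not real output).
sample_response = {
    "results": [
        {
            "flagged": True,
            "categories": {
                "hate": True,
                "hate/threatening": False,
                "self-harm": False,
                "sexual": False,
                "sexual/minors": False,
                "violence": True,
                "violence/graphic": False,
            },
        }
    ]
}

flagged = flagged_categories(sample_response)
```

An application could use the returned list to decide whether to block the content outright or send it for human review.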

Pricing

OpenAI offers generous pricing, starting with a free tier and progressing in a pay-as-you-go fashion. You will receive $18 in free credit for the first three months, after which you will need to pay for each use.

For the DALL-E image model, there are three prices, classified according to the resolution of the image:

1024×1024 images: $0.020 per image
512×512 images: $0.018 per image
256×256 images: $0.016 per image

The prices for the language models are categorized based on the model being used:

Ada (fastest): $0.0004 per 1,000 tokens.
Babbage: $0.0005 per 1,000 tokens.
Curie: $0.0020 per 1,000 tokens.
Davinci (most powerful): $0.0200 per 1,000 tokens.
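
Since pricing is per 1,000 tokens, a quick estimate is simple arithmetic. The sketch below uses the prices listed above; the function name and the table are just for illustration, and prices may change over time.

```python
# Prices in US dollars per 1,000 tokens, as listed in this post.
PRICE_PER_1K_TOKENS = {
    "ada": 0.0004,
    "babbage": 0.0005,
    "curie": 0.0020,
    "davinci": 0.0200,
}


def estimate_cost(model: str, total_tokens: int) -> float:
    """Estimate the cost in dollars of processing total_tokens (input plus output)."""
    return PRICE_PER_1K_TOKENS[model] * total_tokens / 1000
```

For example, a single Davinci request that uses the full 4,000 tokens costs about $0.08, while the same number of tokens on Ada costs well under a cent.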

For fine-tuning models, each language model has training as well as a usage price, which goes like this:

For the Ada model, the training cost is $0.0004 per 1,000 tokens and the usage cost is $0.0016 per 1,000 tokens.
For the Babbage model, the training cost is $0.0006 per 1,000 tokens and the usage cost is $0.0024 per 1,000 tokens.
For the Curie model, the training cost is $0.0030 per 1,000 tokens and the usage cost is $0.0120 per 1,000 tokens.
For the Davinci model, the training cost is $0.0300 per 1,000 tokens and the usage cost is $0.1200 per 1,000 tokens.
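
A fine-tuned model therefore has two cost components: a one-time training cost and an ongoing usage cost. Here is a hedged sketch that adds them up from the prices above. The function name is made up, and training_tokens is assumed to be the total number of tokens billed for training.

```python
# (training price, usage price) in US dollars per 1,000 tokens, from the list above.
FINE_TUNE_PRICES = {
    "ada": (0.0004, 0.0016),
    "babbage": (0.0006, 0.0024),
    "curie": (0.0030, 0.0120),
    "davinci": (0.0300, 0.1200),
}


def fine_tune_cost(model: str, training_tokens: int, usage_tokens: int) -> float:
    """Total cost in dollars: one-time training plus later usage."""
    training_price, usage_price = FINE_TUNE_PRICES[model]
    return (training_price * training_tokens + usage_price * usage_tokens) / 1000
```

Training Curie on 100,000 tokens and then using it for 50,000 tokens would come to roughly $0.30 + $0.60 = $0.90.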

Wrapping it up

Congratulations on reaching this far! You’re a fantastic reader!

We’ve talked about a lot of different things in this blog, like how OpenAI models work, which API is best suited for your use case, and best practices for using these AI models with different easy-to-understand examples.

AI will be an interesting topic in the upcoming years, and it has the potential to revolutionize numerous industries and change the way we live and work. Start using these AI tools, or someone else using them will replace you.

Happy developing!