Version: 0.0.0

Frequently Asked Questions

Pricing and Billing

What Is A Token?

A token is the basic unit of text or code that a large language model (LLM) uses to process and generate language. The number of tokens per word varies, but averages approximately 1.3 tokens per word. You can see the number of tokens that a given query consumed in the results the API returns.
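As a rough illustration of the 1.3 tokens-per-word average, the sketch below estimates a token count from a word count and reads a token total out of a response body. The `usage` and `total_tokens` field names are an assumption (an OpenAI-style response layout), and `estimate_tokens` and `tokens_used` are hypothetical helper names; check the actual API response for the real fields.

```python
import math

# Assumption: ~1.3 tokens per word is only a rough average;
# real counts depend on the model's tokenizer.
TOKENS_PER_WORD = 1.3


def estimate_tokens(text: str) -> int:
    """Rough token estimate based on a whitespace word count."""
    words = len(text.split())
    return math.ceil(words * TOKENS_PER_WORD)


def tokens_used(response: dict) -> int:
    """Read the token total from a response body.

    Assumes a hypothetical OpenAI-style `usage` object; returns 0 if
    the field is absent.
    """
    usage = response.get("usage", {})
    return usage.get("total_tokens", 0)


# Hypothetical response body, for illustration only.
sample_response = {
    "usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}
}

print(estimate_tokens("How many tokens is this sentence"))
print(tokens_used(sample_response))
```

The estimate is only useful for budgeting ahead of time; the count reported in the API results is the one to rely on.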

How Do I Get Charged If I Use Less Than a Million Tokens?

Fractions of a million tokens are billed proportionally (e.g., for 100k tokens at $1 per million tokens, you would be charged $0.10).
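The proportional charge can be sketched as a one-line calculation. This is a minimal illustration, not the billing system's actual logic: `token_cost` is a hypothetical helper, the price is the example figure from the text, and any rounding applied on real invoices is not specified here.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def token_cost(tokens: int, price_per_million: str) -> Decimal:
    """Charge for `tokens` at a price quoted per million tokens.

    Decimal avoids binary floating-point error in currency arithmetic.
    The price is passed as a string, e.g. "1.00" for $1 per million tokens.
    """
    return Decimal(tokens) / TOKENS_PER_MILLION * Decimal(price_per_million)


# The example from the text: 100k tokens at $1 per million tokens.
print(token_cost(100_000, "1.00"))
```

For the example in the text, the result equals $0.10.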

How Can I Track My Usage?

You can monitor your token usage in near real time on the Usage page under your account. Note that the data may be delayed by up to 5 minutes.

Where Can I Find My Invoices?

You can find your invoices on the Billing page under your account.

Other FAQs

What Is the Context Length for the Base Models?

See the documentation pages for the text generation, fine-tuning, and embedding models.

Are the Models the Unmodified Open-Source Models?

Unless otherwise indicated, the models provided are the unmodified open-source models.

How Can I Delete My Account?

Please reach out to us. We will be happy to help you close your account if you no longer wish to use it.

How Long Do You Hold Onto My Queries? My Results?

Anyscale may securely retain your queries or results for up to 30 days to help prevent security or technical problems.

Do Users Share GPUs?

Endpoints may use shared resources. If you require dedicated GPUs, reach out to us.

Do Users Share Computing Nodes?

Endpoints may use shared resources. If you require dedicated computing resources, reach out to us.

Which Cloud Service Provider Do You Use to Host the Endpoints? Can I Choose Where My Endpoints Are Hosted?

The available Cloud Service Providers may be updated from time to time. If you require the ability to choose which Cloud Service Provider hosts your endpoints, you should use Private Endpoints; please reach out to us to get started.

What Rights Do You Claim in My Queries? My Results? Does Anyscale Use My Queries or Results to Train or Fine-Tune Its Own Models?

Anyscale treats your Queries and Results as “Customer Data” under our terms of service. Anyscale doesn't use your Queries or Results to train or fine-tune a model, unless you use the fine-tuning service, in which case Anyscale tunes the model only for you. If you are interested in fine-tuning a model, reach out to us.

For Fine-Tuning, What Data Do You Store? Can I Delete This Data?

If you use the fine-tuning feature, Anyscale stores both the fine-tuning data and the fine-tuned model (including parameters) within the Anyscale account to provide the fine-tuning feature. You may delete the fine-tuning data with the API. See the Endpoints Docs for more information. Don't rely on this storage as a backup: keep your own copy of any fine-tuning data you wish to retain. If you wish to delete the fine-tuned model, reach out to us.