

Anyscale Endpoints offers the best open-source large language models (LLMs) as fully managed API endpoints, so you can focus on building LLM-powered applications without managing the underlying infrastructure.

  • Ease of use: Our platform provides simple APIs to query and, soon, fine-tune LLMs.
  • Fully managed: With auto-scaling and pay-as-you-go pricing, we keep the models up and running so you don't have to.

Get started

  1. Register an account. (If you are viewing this from Anyscale Endpoints, you can skip this step.)
  2. Generate an API key.
  3. Run your first query:

See the Query a model page for more details on querying models.

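import openai

query = "Write a program to load data from S3 with Ray and train using PyTorch."

client = openai.OpenAI(
    base_url="",  # set this to the Anyscale Endpoints API base URL
    api_key="esecret_ANYSCALE_API_KEY",  # replace with your API key
)

# Note: not all arguments are currently supported; unsupported arguments are ignored by the backend.
chat_completion = client.chat.completions.create(
    model="MODEL_NAME",  # placeholder: replace with a model ID supported by Anyscale Endpoints
    messages=[{"role": "system", "content": "You are a helpful assistant."},
              {"role": "user", "content": query}],
    stream=True,  # stream the response so it can be printed chunk by chunk
)

# Print each streamed chunk as it arrives; the final chunk may carry no content.
for message in chat_completion:
    print(message.choices[0].delta.content or "", end="", flush=True)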
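
If you would rather keep the API key out of your source code and work with the full response as a single string, the query above can be wrapped in two small helpers. The following is only a minimal sketch: the environment variable name ANYSCALE_API_KEY and the helper names load_api_key, collect_stream, and fake_chunk are illustrative assumptions, not part of the Anyscale Endpoints API. The stand-in chunks only mimic the choices[0].delta.content shape used in the loop above, so the sketch runs without a network call.

```python
import os
from types import SimpleNamespace


def load_api_key(env_var="ANYSCALE_API_KEY"):
    """Return the API key from an environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key


def collect_stream(chunks):
    """Concatenate the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        delta = chunk.choices[0].delta.content
        if delta:  # skip None or empty deltas, such as a final chunk
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (offline demo only)."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


if __name__ == "__main__":
    # Hypothetical chunks standing in for a real streamed response.
    demo = [fake_chunk("Hello"), fake_chunk(", "), fake_chunk("Ray"), fake_chunk(None)]
    print(collect_stream(demo))
```

With a real client, you would pass api_key=load_api_key() to openai.OpenAI and hand the streamed chat_completion object to collect_stream instead of printing each chunk.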