Chat: `google/gemma-7b-it`

See the Hugging Face model page for more model details.

About this model

Model name to use in API calls:

google/gemma-7b-it

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well suited to a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources, such as a laptop, a desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

Model Developers: Google

Variations: this model is the 7B instruction-tuned version of Gemma. The family also includes a 2B base model, a 7B base model, and a 2B instruction-tuned model.

Input: text only.

Output: generated text only.

Context Length: 8192 tokens

License: a custom commercial license is available at: https://ai.google.dev/gemma/terms
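
As a rough illustration of what the 8192-token context length means in practice, the sketch below computes how many tokens remain for a reply once the prompt is counted. It assumes the window covers the prompt and the generated reply together, which is usual for decoder-only models but is not stated on this page. The `estimate_tokens` helper and its 4-characters-per-token ratio are hypothetical and only a crude stand-in for the real Gemma tokenizer.

```python
CONTEXT_LENGTH = 8192  # tokens, from the model details above


def estimate_tokens(text, chars_per_token=4.0):
    """Very rough token estimate.

    The characters-per-token ratio is an assumption of this sketch,
    not a property of Gemma's tokenizer.
    """
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def completion_budget(prompt_tokens, context_length=CONTEXT_LENGTH):
    """Tokens left for the reply after the prompt is counted against the window."""
    return max(0, context_length - prompt_tokens)


if __name__ == "__main__":
    prompt = "Write a program to load data from S3 with Ray and train using PyTorch."
    used = estimate_tokens(prompt)
    print(f"~{used} prompt tokens, ~{completion_budget(used)} left for the reply")
```

For exact counts, the prompt would need to be tokenized with the model's own tokenizer.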

Get started

1. Register an account. (If you are viewing this from Anyscale Endpoints, you can skip this step.)
2. Generate an API key.
3. Run your first query.

See Query a model for more model query details.
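
Once an API key exists, one option is to keep it out of source code and read it from the environment. The sketch below is one way to do that, not an official recommendation: `anyscale_client_kwargs` is a hypothetical helper, and reusing the `ANYSCALE_API_KEY` variable name from the cURL example for the Python SDK is an assumption.

```python
import os

# Base URL taken from the examples on this page.
ANYSCALE_BASE_URL = "https://api.endpoints.anyscale.com/v1"


def anyscale_client_kwargs(env=None):
    """Return keyword arguments for openai.OpenAI(...), reading the key from the environment.

    ANYSCALE_API_KEY is the variable name used in the cURL example;
    using it for the Python SDK is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("ANYSCALE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set ANYSCALE_API_KEY before creating the client.")
    return {"base_url": ANYSCALE_BASE_URL, "api_key": api_key}
```

With the `openai` package installed, the client can then be created as `openai.OpenAI(**anyscale_client_kwargs())`.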

Examples follow for the Python SDK (streaming and non-streaming), Node (streaming and non-streaming), and cURL.

Python SDK streaming:

import openai

query = "Write a program to load data from S3 with Ray and train using PyTorch."

client = openai.OpenAI(
    base_url = "https://api.endpoints.anyscale.com/v1",
    api_key = "esecret_ANYSCALE_API_KEY"
)
# Note: not all arguments are currently supported; unsupported ones are ignored by the backend.
chat_completion = client.chat.completions.create(
    model="google/gemma-7b-it",
    messages=[{"role": "system", "content": "You are a helpful assistant."},
              {"role": "user", "content": query}],
    temperature=0.1,
    stream=True
)
for message in chat_completion:
    # delta.content can be None (for example on the final chunk), so fall back to "".
    print(message.choices[0].delta.content or "", end="", flush=True)

Python SDK:

import openai

query = "Write a program to load data from S3 with Ray and train using PyTorch."

client = openai.OpenAI(
    base_url = "https://api.endpoints.anyscale.com/v1",
    api_key = "esecret_ANYSCALE_API_KEY"
)
# Note: not all arguments are currently supported; unsupported ones are ignored by the backend.
chat_completion = client.chat.completions.create(
    model="google/gemma-7b-it",
    messages=[{"role": "system", "content": "You are a helpful assistant."},
              {"role": "user", "content": query}],
    temperature=0.1
)
print(chat_completion.choices[0].message.content)

Node streaming:

import OpenAI from "openai";
const anyscale = new OpenAI({
  baseURL: "https://api.endpoints.anyscale.com/v1",
  apiKey: "esecret_ANYSCALE_API_KEY",
});

async function chat_complete(prompt) {
  const completion = await anyscale.chat.completions.create({
    model: "google/gemma-7b-it",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: prompt },
    ],
    temperature: 0.1,
    stream: true,
  });
  for await (const chunk of completion) {
    process.stdout.write(chunk.choices[0]?.delta?.content || "");
  }
}

const query =
  "Write a program to load data from S3 with Ray and train using PyTorch.";
chat_complete(query);

Node:

import OpenAI from "openai";
const anyscale = new OpenAI({
  baseURL: "https://api.endpoints.anyscale.com/v1",
  apiKey: "esecret_ANYSCALE_API_KEY",
});

async function chat_complete(prompt) {
  const completion = await anyscale.chat.completions.create({
    model: "google/gemma-7b-it",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: prompt },
    ],
    temperature: 0.1,
  });
  // Fall back to "" so write() is never called with undefined or null.
  process.stdout.write(completion.choices[0]?.message?.content ?? "");
}

const query =
  "Write a program to load data from S3 with Ray and train using PyTorch.";
chat_complete(query);

cURL:

export ANYSCALE_BASE_URL="https://api.endpoints.anyscale.com/v1"
export ANYSCALE_API_KEY="YOUR_ANYSCALE_ENDPOINT_API_KEY"

curl "$ANYSCALE_BASE_URL/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ANYSCALE_API_KEY" \
  -d '{
    "model": "google/gemma-7b-it",
    "messages": [{"role": "system", "content": "You are a helpful assistant."},
                 {"role": "user", "content": "Write a program to load data from S3 with Ray and train using PyTorch."}],
    "temperature": 0.7
  }'
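
The cURL call above is a plain HTTP POST, so the same request can be assembled without an SDK. The sketch below builds it with Python's standard library and does not send it. `build_chat_request` is a hypothetical helper, and the payload mirrors the cURL example.

```python
import json
import urllib.request


def build_chat_request(api_key, prompt,
                       base_url="https://api.endpoints.anyscale.com/v1"):
    """Build (but do not send) the POST request shown in the cURL example."""
    payload = {
        "model": "google/gemma-7b-it",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Sending it would look like `urllib.request.urlopen(build_chat_request(key, prompt))`. Assuming the response has the same shape the SDK examples read, the reply text would be at `choices[0].message.content` in the returned JSON.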
