Bring Your Own Model

Predibase allows you to bring your own models and adapters from your local machine or external repositories like HuggingFace.

Upload a custom adapter from local

Use the Predibase SDK to upload your adapter to Predibase. The snippets on this page assume pb is an already initialized Predibase client:

pb.adapters.upload("/path/to/adapter", repo="my_repo", base_model="llama-3-1-8b")

Note that the base_model here should ideally map to one of Predibase's officially supported models for the best experience. If your adapter's base model is not on this list, you can provide the HuggingFace model ID instead.

The adapter path on your local machine is expected to follow the PEFT format containing the following files:

/path/to/adapter/
    adapter_config.json
    adapter_model.safetensors
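Before uploading, it can help to confirm that the local directory actually contains both files. The following is a minimal sketch using only the Python standard library; the helper name missing_adapter_files is hypothetical and is not part of the Predibase SDK.

```python
from pathlib import Path

# File names taken from the PEFT layout listed above.
REQUIRED_ADAPTER_FILES = ("adapter_config.json", "adapter_model.safetensors")


def missing_adapter_files(adapter_dir):
    """Return the required PEFT files that are absent from adapter_dir.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that their contents are valid.
    """
    root = Path(adapter_dir)
    return [name for name in REQUIRED_ADAPTER_FILES if not (root / name).is_file()]
```

An empty list means both files are present; otherwise the returned names tell you what to add before calling pb.adapters.upload.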

To run inference on your uploaded adapter:

The instructions for running inference are the same for either base model deployment method (shared or private serverless). You'll need the following:

  • deployment name (e.g. for a fine-tuned mistral-7b model, the deployment name is "mistral-7b" from our shared models, or the name of your private serverless deployment)
  • adapter repo and version in Predibase (e.g. "my_repo/1" for the example above)

lorax_client = pb.deployments.client("llama-3-1-8b")

print(lorax_client.generate("...",
    adapter_id="my_repo/1",
    max_new_tokens=256
).generated_text)

Prompt an adapter from HuggingFace

You can prompt a custom fine-tuned adapter from HuggingFace (e.g. tldr_headline_gen) as follows:

lorax_client = pb.deployments.client("mistral-7b-instruct")

print(lorax_client.generate("The following passage is content from a news report. Please summarize this passage in one sentence or less. Passage: Jeffrey Berns, CEO of Blockchains LLC, wants the Nevada government to allow companies like his to form local governments on land they own, granting them power over everything from schools to law enforcement. Berns envisions a city based on digital currencies and blockchain storage. His company is proposing to build a 15,000 home town 12 miles east of Reno. Nevada Lawmakers have responded with intrigue and skepticism. The proposed legislation has yet to be formally filed or discussed in public hearings. Summary: ",
    adapter_id="predibase/tldr_headline_gen",
    adapter_source="hub",
    max_new_tokens=256
).generated_text)

Private adapters

To run inference on a private HuggingFace adapter, you'll additionally need to pass your HuggingFace API token as api_token:

lorax_client = pb.deployments.client("mistral-7b-instruct")

print(lorax_client.generate("The following passage is content from a news report. Please summarize this passage in one sentence or less. Passage: Jeffrey Berns, CEO of Blockchains LLC, wants the Nevada government to allow companies like his to form local governments on land they own, granting them power over everything from schools to law enforcement. Berns envisions a city based on digital currencies and blockchain storage. His company is proposing to build a 15,000 home town 12 miles east of Reno. Nevada Lawmakers have responded with intrigue and skepticism. The proposed legislation has yet to be formally filed or discussed in public hearings. Summary: ",
    adapter_id="predibase/tldr_headline_gen",
    adapter_source="hub",
    api_token="<HUGGINGFACE API TOKEN>",
    max_new_tokens=256
).generated_text)
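The three cases shown so far (an adapter in a Predibase repo, a public HuggingFace adapter, and a private HuggingFace adapter) differ only in the keyword arguments passed to generate. As a rough sketch, a small helper could assemble them; build_generate_kwargs and its from_hub and hf_token parameters are hypothetical names introduced here, while adapter_id, adapter_source, api_token, and max_new_tokens are the arguments used in the examples above.

```python
def build_generate_kwargs(adapter_id, from_hub=False, hf_token=None, max_new_tokens=256):
    """Assemble keyword arguments for lorax_client.generate.

    Hypothetical helper for illustration only; it is not part of the
    Predibase SDK.
    """
    if hf_token is not None and not from_hub:
        # A HuggingFace token is only meaningful for hub-hosted adapters.
        raise ValueError("hf_token only applies to adapters loaded from the HuggingFace hub")

    kwargs = {"adapter_id": adapter_id, "max_new_tokens": max_new_tokens}
    if from_hub:
        kwargs["adapter_source"] = "hub"
        if hf_token is not None:
            # Only needed for private adapters.
            kwargs["api_token"] = hf_token
    return kwargs
```

With this in place, a call might look like lorax_client.generate(prompt, **build_generate_kwargs("predibase/tldr_headline_gen", from_hub=True)).generated_text.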

Deploy a custom model from HuggingFace

To serve a custom base model from HuggingFace, you will need to use a private serverless deployment. Verify that it is supported as a "Best-effort LLM" and then deploy the model.

  1. Deploy a custom base model as a private serverless deployment.
  2. Prompt as normal.

Note that private serverless deployments are billed by $/gpu-hour. (See pricing)
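To get a feel for GPU-hour billing, the arithmetic can be sketched as rate times number of GPUs times hours of uptime. The function below is illustrative only: the rate is a placeholder you supply from the pricing page, and the actual billing granularity and rounding are not specified here.

```python
def estimate_deployment_cost(rate_per_gpu_hour, hours, num_gpus=1):
    """Estimate cost as rate x GPUs x hours.

    Hypothetical helper; rate_per_gpu_hour is whatever $/gpu-hour rate
    applies to your deployment, not a value taken from this page.
    """
    if rate_per_gpu_hour < 0 or hours < 0 or num_gpus < 1:
        raise ValueError("rate and hours must be non-negative, and num_gpus at least 1")
    return rate_per_gpu_hour * num_gpus * hours
```

For example, a deployment on 2 GPUs running for 4 hours at a placeholder rate of 2.5 per GPU-hour would come to 2.5 × 2 × 4 = 20.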

If you would like to have a private serverless deployment of a custom model we don't yet support, we'll get it working for you -- reach out to support@predibase.com.