Host Llama 2 for free using Cloudflare AI

In this guide, we’ll explore how you can deploy and host Llama 2, a powerful language model, for free using Cloudflare Workers.

Large Language Models (LLMs) and AI technologies are advancing rapidly, and with Cloudflare’s generous pricing model, you can start developing your own AI applications at little or no cost.

Follow these steps to set up your application:

1. Create a Cloudflare Account
Start by signing up or logging into your Cloudflare account.

2. Navigate to Workers & Pages
Within your dashboard, find the section for “Workers & Pages” to begin setting up your new application.

3. Create Your Application
Click on the “Create Application” button to initiate the setup process.

4. Create a Worker
Choose “LLM App” from the templates under the Workers tab. A worker runs your JavaScript on Cloudflare’s servers, and the LLM App template gives you a head start by including the packages needed to run such an application.

5. Deploy Your Worker
After creating your worker, click on “Deploy”. Don’t worry; you can update your worker’s code later as needed.
Hurrah! Our worker is now live. Make a request like the one below to see the magic happen, replacing the URL with your worker’s URL.

curl --location 'https://worker-white-tooth-29e9.2000-aman-sinha.workers.dev/' \
  --header 'Content-Type: application/json' \
  --data '{"prompt": "say a joke"}'
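If you would rather call the worker from JavaScript than from curl, the same request can be sketched with fetch. This is only a minimal sketch: buildPromptRequest and askWorker are hypothetical helper names, and WORKER_URL is a placeholder that you need to replace with your own worker’s URL.

```javascript
// Placeholder: replace with your own worker URL.
const WORKER_URL = 'https://your-worker.your-subdomain.workers.dev/';

// Hypothetical helper: builds the same request that the curl command sends
// (a POST with a JSON body containing the prompt).
function buildPromptRequest(prompt) {
    return {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt }),
    };
}

// Hypothetical helper: sends the prompt and returns the parsed JSON reply.
// Needs a runtime with a global fetch (modern browsers, Node 18+).
async function askWorker(prompt, url = WORKER_URL) {
    const res = await fetch(url, buildPromptRequest(prompt));
    if (!res.ok) {
        throw new Error(`Worker responded with HTTP ${res.status}`);
    }
    return res.json();
}

// Usage: askWorker('say a joke').then(console.log);
```

Keeping the request construction in its own function makes it easy to check what will be sent before any network call happens.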

6. Edit Your Worker’s Code
Now it’s time to customize your worker. The worker is already up and running with the template code, so click on “Edit code” and edit index.js to contain the content below.

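// The Ai helper is bundled with the LLM App template.
import { Ai } from './vendor/@cloudflare/ai.js';

export default {
    async fetch(request, env) {
        // Expects a JSON body such as {"prompt": "say a joke"}.
        const body = await request.json();
        // env.AI is the Workers AI binding.
        const ai = new Ai(env.AI);
        // Run the Llama 2 13B chat model with the request body as input.
        const response = await ai.run("@hf/thebloke/llama-2-13b-chat-awq", body);
        return new Response(JSON.stringify(response), {
            headers: { "Content-Type": "application/json" },
        });
    },
};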

[Image: the final project]
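The worker code in this step calls request.json() on every request, so a GET request or a malformed body will surface as an unhandled error. One way to guard against that is to validate the input before calling the model. The following is a minimal sketch, assuming the {"prompt": "..."} body shape used in this guide; validatePromptBody is a hypothetical helper, not part of Cloudflare’s API.

```javascript
// Hypothetical helper: checks that the parsed body carries a usable prompt.
// Returns { ok: true, prompt } or { ok: false, error }.
function validatePromptBody(body) {
    if (body === null || typeof body !== 'object' || Array.isArray(body)) {
        return { ok: false, error: 'Body must be a JSON object' };
    }
    if (typeof body.prompt !== 'string' || body.prompt.trim() === '') {
        return { ok: false, error: 'Field "prompt" must be a non-empty string' };
    }
    return { ok: true, prompt: body.prompt.trim() };
}

// Sketch of how it could be used inside the fetch handler:
//   const body = await request.json().catch(() => null);
//   const check = validatePromptBody(body);
//   if (!check.ok) {
//       return new Response(JSON.stringify({ error: check.error }), { status: 400 });
//   }
//   const response = await ai.run("@hf/thebloke/llama-2-13b-chat-awq", { prompt: check.prompt });
```

Returning a 400 with a short error message tells the caller what to fix, instead of leaving them with a generic worker exception.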

7. Click “Save and Deploy”
Our LLM is now up and running and accepting prompts from us. Let’s try it by making the request below to get a joke. Replace the URL with your worker’s URL for it to work.

curl --location 'https://worker-white-tooth-29e9.2000-aman-sinha.workers.dev/' \
  --header 'Content-Type: application/json' \
  --data '{"prompt": "say a joke"}'
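The worker replies with the model’s output serialized as JSON. To use the generated text in your own code, you can pull it out of that reply. The sketch below assumes the reply looks like {"response": "..."}; check the actual output of your worker and adjust accordingly. extractText is a hypothetical helper, and the sample string is made up for illustration, not real model output.

```javascript
// Hypothetical helper: pulls the generated text out of the worker's reply.
// Assumes a reply shaped like { "response": "..." }; adjust if yours differs.
function extractText(replyJson) {
    const reply = JSON.parse(replyJson);
    if (reply && typeof reply.response === 'string') {
        return reply.response.trim();
    }
    throw new Error('Unexpected reply shape: ' + replyJson);
}

// Example with a made-up reply string (not real model output).
const sample = '{"response": " Why did the chicken cross the road? "}';
console.log(extractText(sample));
```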

Well done! Our app is live and we can now make use of Llama 2. You can check out the various models available on Cloudflare AI here:
https://developers.cloudflare.com/workers-ai/models/text-generation/

Try out various models, see what fits best for your use case, and move forward.
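To compare models without redeploying each time, one option is to let the request name the model and fall back to a default. This is a minimal sketch under stated assumptions: pickModel and ALLOWED_MODELS are hypothetical names, and the second model ID is only an example that you should confirm against the models page linked above.

```javascript
// Model used in this guide.
const DEFAULT_MODEL = '@hf/thebloke/llama-2-13b-chat-awq';

// Illustrative allow-list. The second ID is an assumption; confirm current
// model IDs on the Cloudflare models page before relying on it.
const ALLOWED_MODELS = new Set([
    DEFAULT_MODEL,
    '@cf/meta/llama-2-7b-chat-int8',
]);

// Hypothetical helper: returns the requested model if it is on the
// allow-list, otherwise falls back to the default.
function pickModel(requested) {
    return ALLOWED_MODELS.has(requested) ? requested : DEFAULT_MODEL;
}

// Sketch of use in the fetch handler:
//   const { model, ...inputs } = body;
//   const response = await ai.run(pickModel(model), inputs);
```

An allow-list keeps callers from passing arbitrary model names straight through to ai.run.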

———————————————

I’m deeply involved with AI and LLMs. Follow me on Medium for more insights. Feel free to say hi or connect via Twitter and LinkedIn.




Made by

Aman Kumar

©2024 amankumar.ai
