Researchers at Stanford University in the United States managed to clone ChatGPT's behavior for just US$ 600 (just over R$ 3,000 at the current exchange rate). Called Alpaca GPT, the technology shows that software like OpenAI's can be simpler to replicate than you might think.

The scholars belong to Stanford's Center for Research on Foundation Models. As an academic exercise, they built their own language model on top of Meta's LLaMA 7B and used OpenAI's GPT API to generate its training data.

Since they didn't have to build the model from scratch or invest in powerful machines, the entire cost went to the two companies' services: roughly US$ 500 in OpenAI API usage to generate training data, and about US$ 100 in compute to fine-tune Meta's model, whose weights are made available to researchers.

The amount is paltry compared to the millions, perhaps billions, the two Big Techs have invested in creating their AI tools. Alpaca GPT behaves much like GPT-3.5, the model that powered the initial version of ChatGPT.

Alpaca GPT surprised researchers with its capability

The Stanford researchers said they were very surprised when they compared Alpaca with other models on the market. In some cases, they reported, the technology was even superior, giving more direct and accurate answers than ChatGPT itself.

Even with this advance, Stanford's AI still suffers from several shortcomings common to language models, such as "hallucination, toxicity, and stereotypes." These are problems that GPT-4 and its successors aim to combat; after all, nobody wants to talk to an AI that "freaks out" out of nowhere, giving harsh, incoherent, or completely out-of-context answers.

To generate the training data, the team fed GPT 175 human-written instruction/output pairs and asked it to produce more in the same style and format, 20 at a time. The process was automated through one of OpenAI's APIs and, in a short time, yielded more than 52,000 examples.

These examples became a considerable sample used to fine-tune the LLaMA model. The result combined Meta's base model with data produced by OpenAI's solution, yielding a third product (Alpaca) for less than R$ 3,100 that performs comparably to far more expensive systems.
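The generation loop described above can be sketched roughly as follows. This is an illustrative reconstruction, not Stanford's actual code: the function names, prompt wording, and parser are assumptions, and `call_model` stands in for any text-completion client (for example, a wrapper around OpenAI's API).

```python
import random

BATCH_SIZE = 20       # the article says 20 new instructions were requested per call
TARGET_TOTAL = 52000  # stop once ~52,000 examples have been collected

def build_prompt(seed_tasks, n_examples=3):
    """Sample a few seed instruction/output pairs and ask the model to
    continue the list in the same style and format (hypothetical wording)."""
    sampled = random.sample(seed_tasks, min(n_examples, len(seed_tasks)))
    lines = [f"Below are example tasks. Write {BATCH_SIZE} more in the same format."]
    for i, task in enumerate(sampled, 1):
        lines.append(f"{i}. Instruction: {task['instruction']}")
        lines.append(f"   Output: {task['output']}")
    return "\n".join(lines)

def parse_pairs(text):
    """Very loose parser for 'Instruction: ... / Output: ...' lines."""
    pairs, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if "Instruction:" in line:
            current = {"instruction": line.split("Instruction:", 1)[1].strip()}
        elif "Output:" in line and current:
            current["output"] = line.split("Output:", 1)[1].strip()
            pairs.append(current)
            current = {}
    return pairs

def generate_dataset(seed_tasks, call_model, target=TARGET_TOTAL):
    """Repeatedly prompt the model until enough instruction/output
    pairs have been collected; seeds are kept in the dataset."""
    dataset = list(seed_tasks)
    while len(dataset) < target:
        completion = call_model(build_prompt(dataset))
        dataset.extend(parse_pairs(completion))
    return dataset
```

The resulting list of instruction/output pairs would then serve as the fine-tuning corpus for the base LLaMA model; the fine-tuning step itself (a standard supervised training run) is not shown here.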

AIs on the rise and a busy market

The last few weeks have been busy in the field of artificial intelligence. OpenAI released its API to the market, Meta's LLaMA was leaked on the Web, and Microsoft ended the waitlist for Bing Chat.

Soon afterwards came GPT-4 and Ernie Bot, from China's Baidu, whose launch disappointed with its simplicity. Last week, the Midjourney text-to-image generator also received an updated version that creates even better photos and art.

The fact is that while companies like Microsoft, OpenAI, and Meta spend millions of dollars developing AI models, scientists are managing to create far cheaper alternatives. Could this kind of "technology piracy" be a threat to the development of generative AI? That can be a good discussion for the future.

California18

