reader comments
63 with
On Tuesday, Meta announced Llama 2, a new source-available family of AI language models notable for its commercial license, which means the models can be integrated into commercial products, unlike its predecessor. They range in size from 7 to 70 billion parameters and reportedly “outperform open source chat models on most benchmarks we tested,” according to Meta.
“This is going to change the landscape of the LLM market,” tweeted Chief AI Scientist Yann LeCun. “Llama-v2 is available on Microsoft Azure and will be available on AWS, Hugging Face, and other providers.”
According to Meta, its Llama 2 “pretrained” models (the bare-bones models) are trained on 2 trillion tokens and have a context window of 4,096 tokens (fragments of words). The context window determines the length of the content the model can process at once. Meta also says that the Llama 2 fine-tuned models, developed for chat applications similar to ChatGPT, have been trained on “over 1 million human annotations.”
While it can’t match OpenAI’s GPT-4 in performance, Llama 2 apparently fares well for a source-available model. According to Jim Fan, senior AI scientist at Nvidia, “70B is close to GPT-3.5 on reasoning tasks, but there is a significant gap on coding benchmarks. It’s on par or better than PaLM-540B on most benchmarks, but still far behind GPT-4 and PaLM-2-L.” More details on Llama 2’s performance, benchmarks, and construction can be found in a research paper released by Meta on Tuesday.
In February, Meta released the precursor of Llama 2, LLaMA, as source-available with a non-commercial license. Officially only available to academics with certain credentials, someone soon leaked LLaMA’s weights (files containing the parameter values of the trained neural networks) to torrent sites, and they spread widely in the AI community. Soon, fine-tuned variations of LLaMA, such as Alpaca, sprang up, providing the seed of a fast-growing underground LLM development scene.
potential licensees with “greater than 700 million monthly active users in the preceding calendar month” must request special permission from Meta to use it, potentially precluding its free use by giants the size of Amazon or Google.
The power and peril of the open approach
While open AI models with weights available have proven popular with hobbyists and people seeking uncensored chatbots, they have also proven controversial. Meta is notable for standing alone among the tech giants in supporting major openly-licensed and weights-available foundation models, while those in the closed-source corner include OpenAI, Microsoft, and Google.
Critics say that open source AI models carry potential risks, such as misuse in synthetic biology or in generating spam or disinformation. It’s easy to imagine Llama 2 filling some of these roles, although such uses violate Meta’s terms of service. Currently, if someone performs restricted acts with OpenAI’s ChatGPT API, access can be revoked. But with the open approach, once the weights are released, there is no taking them back.
However, proponents of an open approach to AI often argue that openly-available AI models encourage transparency (in terms of the training data used to make them), foster economic competition (not limiting the technology to giant companies), encourage free speech (no censorship), and democratize access to AI (without paywall restrictions).
Perhaps getting ahead of potential criticism for its release, Meta also published a short “Statement of Support for Meta’s Open Approach to Today’s AI” that reads, “We support an open innovation approach to AI. Responsible and open innovation gives us all a stake in the AI development process, bringing visibility, scrutiny and trust to these technologies. Opening today’s Llama models will let everyone benefit from this technology.”
pointed out on Twitter. Lack of training data transparency is still a sticking point for some LLM critics because the training data that teaches these LLMs what they “know” often comes from an unauthorized scrape of the Internet with little regard for privacy or commercial impact. Meta says it “made an effort to remove data from certain sites known to contain a high volume of personal information about private individuals” in the Llama 2 research paper, but it did not list what those sites are.
Currently, anyone can request access to download Llama 2 by filling out a form on Meta’s website.
[Update (July 19, 2023): Some industry observers dispute Meta’s characterization of Llama 2 as “open source” software, pointing out that its license does not fully comply with the Open Source Initiative’s definition of the term. These critics highlight that Meta’s license places usage restrictions on Llama 2, excluding licensees with over 700 million active daily users (mentioned above) and restricting the use of its outputs to improve other LLMs.
In a tweet responding to Yann LeCun’s announcement of Llama 2, the OSI clarified, “The [Llama 2] license only authorizes some commercial uses. The term Open Source has a clear, well-understood meaning that does not allow for restrictions on commercial use.” They also highlighted Section 2 of the Llama 2 license, titled “Additional Commercial Terms.”
In light of these clarifications, we have updated this article to use terms such as “source-available,” “openly licensed,” and “weights available” to more accurately describe Llama 2.]