Why bigger is not always better in AI 

This story originally appeared in The Algorithm, our weekly newsletter on AI.

In AI research, everyone seems to think that bigger is better. The idea is that more data, more computing power, and more parameters will lead to models that are more powerful. This thinking started with a landmark paper from 2017, in which Google researchers introduced the transformer architecture underpinning today’s language model boom and helped embed the “scale is all you need” mindset into the AI community. Today, big tech companies seem to be competing over scale above everything else.

“It’s like, how big is your model, bro?” says Sasha Luccioni, the AI and climate lead at the AI startup Hugging Face. Tech companies just add billions more parameters, which means an average person couldn’t download the models and tinker with them, even if they were open-source (which they mostly aren’t). The AI models of today are just “way too big,” she says.  

With scale come a slew of problems, such as invasive data-gathering practices and child sexual abuse material in data sets, as Luccioni and coauthors detail in a new paper. To top it off, bigger models also have a far bigger carbon footprint, because they require more energy to run. 

Another problem that scale brings is the extreme concentration of power, says Luccioni. Scaling up costs tons of money, and only elite researchers working in Big Tech have the resources to build and operate models like that. 

“There’s this bottleneck that’s created by a very small number of rich and powerful companies who use AI as part of their core product,” she says. 

It doesn’t have to be like this. I just published a story on a new multimodal large language model that is small but mighty. Researchers at the Allen Institute for Artificial Intelligence (Ai2) built an open-source family of models called Molmo, which achieve impressive performance with a fraction of the resources used to build state-of-the-art models. 

The organization claims that its biggest Molmo model, which has 72 billion parameters, outperforms OpenAI’s GPT-4o, which is estimated to have over a trillion parameters, in tests that measure things like understanding images, charts, and documents.  

Meanwhile, Ai2 says a smaller Molmo model, with 7 billion parameters, comes close to OpenAI’s state-of-the-art model in performance, an achievement it ascribes to vastly more efficient data collection and training methods. Molmo shows we don’t need massive data sets and massive models that take tons of money and energy to train.

Breaking out of the “scale is all you need” mindset was one of the biggest challenges for the researchers who built Molmo, says Ani Kembhavi, a senior director of research at Ai2. 

“When we started this project, we were like, we have to think completely out of the box, because there has to be a better way to train models,” he says. The team wanted to prove that open models can be as powerful as closed, proprietary ones, and that required them to build models that were accessible and didn’t cost millions of dollars to train. 

Molmo shows that “less is more, small is big, open [is as good as] closed,” Kembhavi says. 

There’s another good case for scaling down. Bigger models tend to be able to do a wider range of things than end users actually need, says Luccioni. 

“Most of the time, you don’t need a model that does everything. You need a model that does a specific task that you want it to do. And for that, bigger models are not necessarily better,” she says.

Instead, we need to change the ways we measure AI performance to focus on things that actually matter, says Luccioni. For example, in a cancer detection algorithm, instead of using a model that can do all sorts of things and is trained on the internet, perhaps we should be prioritizing factors such as accuracy, privacy, or whether the model is trained on data that you can trust, she says. 

But that would require a higher level of transparency than is currently the norm in AI. Researchers don’t really know how or why their models do what they do, and don’t even really have a grasp of what goes into their data sets. Scaling is a popular technique because researchers have found that throwing more stuff at models seems to make them perform better. The research community and companies need to shift the incentives so that tech companies will be required to be more mindful and transparent about what goes into their models, and help us do more with less. 

“You don’t need to assume [AI models] are a magic box and going to solve all your issues,” she says. 


Now read the rest of The Algorithm

Deeper Learning

An AI script editor could help decide what films get made in Hollywood

Every day across Hollywood, scores of people read through scripts on behalf of studios, trying to find the diamonds in the rough among the many thousands sent in every year. Each script runs up to 150 pages, and it can take half a day to read one and write up a summary. With only about 50 of these scripts selling in a given year, readers are trained to be ruthless. 

Lights, camera, AI: Now the tech company Cinelytic, which works with major studios like Warner Bros. and Sony Pictures, aims to offer script feedback with generative AI. It launched a new tool called Callaia that analyzes scripts. Using AI, it takes Callaia less than a minute to write its own “coverage,” which includes a synopsis, a list of comparable films, grades for areas like dialogue and originality, and actor recommendations. Read more from James O’Donnell here.

Bits and Bytes

California’s governor has vetoed the state’s sweeping AI legislation
Governor Gavin Newsom vetoed SB 1047, a bill that would have required pre-deployment safety testing of large AI systems and given the state’s attorney general the right to sue AI companies for serious harm. He said he thought the bill focused too much on the largest models without considering broader harms and risks. Critics of AI’s rapid growth have expressed dismay at the decision. (The New York Times)

Sorry, AI won’t “fix” climate change
OpenAI’s CEO Sam Altman claims AI will deliver an “Intelligence Age,” unleashing “unimaginable” prosperity and “astounding triumphs” like “fixing the climate.” But tech breakthroughs alone can’t solve global warming. In fact, as it stands, AI is making the problem much worse. (MIT Technology Review) 

How turning OpenAI into a real business is tearing it apart
In yet another organizational shakeup, the startup lost its CTO Mira Murati and other senior leaders. OpenAI is riddled with chaos that stems from its CEO’s push to transform it from a nonprofit research lab into a for-profit organization. Insiders say this shift has “corrupted” the company’s culture. (The Wall Street Journal)

Why Microsoft made a deal to help restart Three Mile Island
A once-shuttered nuclear plant could soon be used to power Microsoft’s massive investment in AI development. (MIT Technology Review) 

OpenAI released its advanced voice mode to more people. Here’s how to get it.
The company says the updated version responds to your emotions and tone of voice, and allows you to interrupt it midsentence. (MIT Technology Review) 

The FTC is cracking down on AI scams
The agency launched “Operation AI Comply” and says it will investigate AI-infused frauds and other types of deception, such as chatbots giving “legal advice,” AI tools that let people create fake online reviews, and false claims of huge earnings from AI-powered business opportunities. (The FTC)

Want AI that flags hateful content? Build it.
A new competition promises $10,000 in prizes to anyone who can track hateful images online. (MIT Technology Review)