Image generated by DALL-E.
Note: The bulk of this article was written in February 2023. Since then, several open-source replicas of GPT3 have been released, most notably by FAIR and Stanford. While these have improved accessibility to these models, issues mentioned here are still at play: first, these are replicas that are several versions behind the most recent, state-of-the-art models, and second, resource constraints continue to limit who gets to use these models in practice. In the long run, open-source efforts cannot outcompete closed ones, and will always be playing catch-up – the playing field is inherently unequal.1 Instead, the question is fundamentally about rebalancing political power – and as a community, we should think about pushing back against this trend towards closed science in a more organized way.2
Being a graduate student in NLP in 2023 is an exciting yet frustrating experience. On one hand, we are faced with a massive amount of hype, attention, and financial investment entering the field – new papers are being written every day at an unprecedented rate (one only has to look at the exploding number of submissions to *ACL conferences in the past few years). On the other hand, a small number of companies (in particular OpenAI) are seizing this opportunity to quickly monopolize the field as their personal pot of gold. They do so by, essentially, closing off access to fundamental research problems. In this new paradigm of research, companies release demos and press releases to draw in credit, clout, and additional funding. They take from the community, building off the backs of decades of open-source research, without offering any of their own information, ideas, or resources in return. Building off of this inequity, these corporations work to concentrate an increasing amount of wealth, resources, and investment in their own hands.
This has immediate consequences for the research community: increasingly, there are insurmountable barriers to entry to certain types of research. Many PhD students have started to ask: what is even possible for us to do? As companies purposefully erect these barriers, the rest of the research community is left to shape its research agenda entirely around whatever access these corporations elect to grant. The rate of innovation is artificially suppressed.
Beyond this, however, there are vast and far-reaching ethical consequences: Which communities are benefited and harmed by these models, and who gets to decide how important each of these communities is? Whose labor gets used and credited when building these models? Whose voice is privileged in the building of these models, and who gets to dictate the priorities of the field?
The first section focuses on how the privatization of ideas harms the ability of the field to innovate. The second section examines some of the other (ethical) consequences of this trend. The final section presents potential alternatives for us going forward. This is an open problem, of course, and I welcome additional ideas or insights into how we may address them in the future.
I. The Stagnation of Innovation
There are two main barriers to accessing large language model (LLM) research: the first being the lack of resources and economic means to train these models, and the second being the lack of access to these models in the first place.
The former is not new in AI research, and certainly not in scientific research as a whole. Though arguably any means of distributing funding privileges certain ideas over others, there are ways to increase accessibility even at this stage, e.g. through sharing and pooling of resources.
We’ll mainly focus on the second barrier, which has emerged in recent years as corporations move increasingly to safeguard and profit off of their scientific ideas. OpenAI and, more recently, Google AI have significantly restricted, or even entirely closed off, access to their largest language models. For example, when GPT3 was released, OpenAI (citing safety concerns) closed off access to almost everything except a (paid) API through which users have limited access to only the top-K predictions. This was a purposeful rejection of the (at the time) predominant practice of open-sourcing code, model weights, data, etc., and has proven to be a lucrative business decision, helping consolidate OpenAI as the leading brand in LLM research.
On the other hand, this has essentially closed off the exploration of fundamental language model research to a small circle of researchers within the direct vicinity of OpenAI. By restricting access to model weights, representations, controls, and data, OpenAI limits the type of innovation (on LLMs) that can be produced outside its walls. This is reflected in the type of research that has emerged and been popularized recently, much of which can be described either as various forms of (or supplements to) prompting, or as various ways of using LLM generations for different tasks. In fact, these are the only types of research that can be conducted with access to the OpenAI API. Research that requires access to model logits and weights, such as interpretability studies, model editing, and controlled generation studies, cannot be done except on much smaller-scale language models, which can display very different behavior from large language models. Thus, a few powerful entities get to close off entire research avenues for the larger community by deciding what forms of research can be conducted and who does or doesn’t get to conduct them.
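To make the restriction concrete, here is a toy, purely illustrative sketch (not OpenAI's actual API, and using a made-up six-token "vocabulary") of what is lost when only the top-K probabilities are exposed: with the full distribution, one can compute quantities like predictive entropy over the whole vocabulary, whereas a top-K view hides the tail mass entirely.

```python
import math

def softmax(logits):
    """Full probability distribution over the vocabulary (full model access)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k):
    """What a restricted API exposes: only the k largest probabilities."""
    return sorted(probs, reverse=True)[:k]

def entropy(probs):
    """Predictive entropy; requires the *full* distribution to be exact."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Toy logits for a hypothetical six-token vocabulary.
logits = [2.0, 1.5, 0.5, 0.1, -1.0, -2.0]
full = softmax(logits)
visible = top_k(full, k=2)

# With full access we can compute the exact entropy; with only top-k
# probabilities we cannot, because the tail of the distribution is hidden.
print(f"full entropy:         {entropy(full):.3f}")
print(f"visible (top-2) mass: {sum(visible):.3f}")  # < 1.0: the rest is invisible
```

The same asymmetry applies to gradients, hidden states, and attention maps: interpretability and model-editing methods need them, and no amount of cleverness at the API surface can recover them.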
Despite great training, knowledge, and creativity, the rest of the community simply cannot contribute to certain fundamental problems due to these barriers. In fact, a great deal of attention and work gets diverted into simply recreating versions of these systems: GPT3 vs. OPT, ChatGPT vs. Bard. This is duplicated work that could be avoided entirely if such systems were accessible and open to all.
To further drive this point home, imagine if Google had decided to operate by this same policy and had never released publicly accessible versions of the Transformer and BERT, instead requiring everyone to pay for API access to these models. Researchers would have to submit a formatted data file, pay Google to fine-tune the model on it, and then feed in every single prediction one at a time.3 What innovation would we have missed out on in the past few years? Would researchers at OpenAI even have been able to conceptualize, much less build, GPT2, GPT3, and ChatGPT? Aside from the inherent unfairness of profiting off of years of open science that has brought the field to this state, closing off access to information may also curtail the overall rate of scientific advancement in the future. And although OpenAI justifies these restrictions based on the “safety concerns” of releasing their models, it seems too much of a coincidence that the type of access they decide to provide is one that will maximize attention and usage while minimizing scientific competition.4
Final note: though I’ve spent most of this section pointing the finger at OpenAI, they are just one player in an environment of broader systemic incentives. The approach OpenAI has taken seems like an all-reward, no-cost approach that, at some point, was ripe to be abused by someone. The problem is, of course, the profit model by which AI research operates, which is itself a reflection of the broader incentives under capitalism. Principles of open science come into conflict with the competitive requirements of capitalism precisely at the point where there is sufficient profit to be earned. Because of these inherent contradictions, we are at an inflection point where research ideas are increasingly siloed, privatized, and safeguarded by individual corporations. These ideas are made into products to be consumed and marketed to the public, but their recipes are kept secret.
II. Why does broader access to innovation opportunities matter?
One counterargument that could be posed: maybe we don’t need any more new architectures, algorithms, and ideas, and none of this discussion really matters. We’ve already seen that by simply scaling up existing models, we’re able to accomplish new and extraordinary feats. And indeed, this has been the predominant mode of progress in the field in the past few years.
However, as many others have asserted, there are still fundamental problems to solve within AI: the lack of systematicity, interpretability, and factuality, the inability to understand meaning and intent, the inability to deal with real-world environments and low-resource languages, etc. Arguably, some of these may be addressable, to a certain degree, by scaling. Still, it is unclear whether scaling is an absolute solution, and certain hard problems (e.g. of understanding/possessing communicative intent) seem to require fundamentally orthogonal innovations.
I’m also not going to sit here and say that scaling hasn’t been incredibly useful. This is not intended to be an indictment of scaling in and of itself, but rather the incentives driving modern scaling research. Why have we been able to scale so much in the first place? A lot of this can be attributed to the concentration of wealth and resources in the hands of tech companies. Scaling and capitalist growth are undeniably intertwined; in many ways, scaling is just a straightforward application of capitalist growth to NLP, built upon the premise that simple, unfettered, and relatively thoughtless accumulation of resources, data, and labor will somehow solve AI. And as corporations accumulate more wealth, resources, and access to low-wage labor, techniques which can take the greatest advantage of this (e.g. large-scale training of connectionist networks) are naturally rewarded.
It should come as no surprise, then, that many of the critiques of LLM scaling also mirror critiques of capitalism: it’s bad for the environment, requires invisible/low-wage labor, replicates societal stereotypes and harms, and is unable to serve already underprivileged communities (e.g. groups that speak low-resource or signed languages). Furthermore, the few privileged corporations with access to the most resources can monopolize this technology.
Rather than suggest we stop scaling research, I believe we should try to curb the worst elements of capitalism (and its excesses) that have woven their way into the field and manifested themselves largely in scaling research. So far, it’s the profit incentive that has dictated when corporations decide to think about these harms and risks. This showed itself most evidently in the ousting of Google AI ethics researchers in 2020, after a team of these researchers decided to publish a paper criticizing LLMs and scaling research. Compare this to OpenAI’s safety policy, or to corporate legal teams which flag and review each paper before it is released. These policies and legal teams similarly have the potential to (and certainly have, in practice) greatly inhibit scientific progress, yet only one group (the ethics researchers) has been systematically sidelined and accused of inhibiting progress. It just so happens that the latter groups are constructed to protect the interests of these corporations. Risks and ethics are important to consider only to the degree that they bring in profits and evade lawsuits; when it’s no longer profitable to consider them, these things are quickly pushed to the wayside.
Finally, it’s worth examining the ethical consequences themselves of the current developments in LLMs. I’ll give only a cursory treatment, as many before me have studied this in great depth. In practice, these LLMs and scaling approaches have worked to serve predominantly the privileged English-speaking population, and other populations well-represented on the internet. Scaling at all costs heavily exploits labor and resources – building ChatGPT entailed using underpaid Kenyan workers to annotate large amounts of toxic data. LLM training also comes at a great environmental cost through energy consumption, and though people have pushed for corporations to recognize the climate consequences of their training methods, scaling research continues to proliferate, while very few corporations conduct serious research (at the same rate and scale) on new, climate-efficient innovations.
It’s also worth thinking about the ethical consequences of the monopolization of these technologies by a few corporations: Whose voices are privileged by the system? Who gets a say in how LLMs are used, built, and deployed? Which communities and issues matter to the model builders and which communities and issues don’t? The increased concentration of resources, wealth, and power means corporations themselves currently have unilateral decision-making power about who gets to use these systems and in what capacity. This is not to say that some sort of filtration is not necessary: clearly, there are malicious actors in the world who wish to use these systems to enact harm on the world. But are these corporations truly filtering bad actors? Or are they promoting even more harm by closing off access to communities who might benefit from these technologies (and whose labor went into building them)? Policing who has access should not be under the jurisdiction of a single corporation, and furthermore, a blanket closed access policy is far more punitive and distrustful than necessary — the end result is simply, once again, privileging an elite group of paid and institutionally-accredited researchers, which in turn means that these systems get further tailored to their needs.
III. Possible answers going forward
I think that scaling research can absolutely be conducted in an ethical, thoughtful, and deliberate way. The issue is when it’s privatized and driven by profit. In addition to the suppression of innovation by closed science, the profit motive means that corporations are now blindly clambering over each other to train the biggest, baddest model without regard for who is served or harmed, and without regard for whether this is even the optimal solution to begin with. And because of the lack of open access and resources for those outside a select few big tech companies, other researchers must accept this at face value. Regardless of how smart and innovative you may be, there is no choice but to be at the whims of whoever has the most resources to train the biggest model.
I believe NLP research works best when driven by a community working collectively for their own needs. In the recent Turkish-Syrian earthquake, a large group of engineers and researchers came together to build technologies to extract the locations of survivors and the resources they required. Māori people have led the effort to build technologies for their language, te reo, a low-resource language which has received less attention from mainstream LLM research. Of course, arguments can be made that these efforts operated at a smaller scale and were dedicated to very specific issues, and that fundamental NLP advances cannot be made in the same way. However, I believe this is simply because we don’t have the correct societal incentives; people want to work on fundamental problems, but there are insurmountable barriers preventing researchers and communities from doing so.
There are some things we can also do in the immediate short term to rebalance the incentive structure. For example, the trend of closing off LLM access can be discouraged by the community. Currently, the cost of closed science is near-zero, while journalists and sometimes researchers themselves participate in hyping up, advertising, and using models that are, essentially, entirely closed to the community (consequently generating more private data for these corporations to use to improve these models). LLM success has been partly built off the backs of this unprecedented hype, and people seem largely resigned to (or even openly embracing) the lack of open access. As a community, I believe we should think more critically about what sorts of trends we tout, and whether or not they’re healthy for the research ecosystem as a whole.
In the longer term, we can imagine a future where scaling NLP can operate as a publicly-funded, open-source, international effort, where all contributors of data and labor benefit from the system and are credited. We can and should aim to make LLM development a more democratic effort, with representation from all sectors of society. Ethical guidelines and tradeoffs should also be democratically decided by a representative group of the population, not unilaterally by a single corporation. Imagine the power of a global community of NLP researchers working openly and collectively to advance better technologies for all humans.
Finally, I believe achieving this future is inherently a political question – a question of how we can shift power (specifically, decision-making power about who gets access to technologies) into the hands of the democratic majority rather than a small number of corporations. There is clearly a lot of discontent (and existential dread) among NLP researchers about the recent trend towards closed science. But other than lamenting on Twitter and in private discussions, very few concrete actions have been taken. One solution, to simply build open-source replicas, is potentially useful in the short term but does not address the root of the issue – it seems inefficient to be constantly stuck in a wasteful cycle of trying to reproduce what corporations have already built, and certainly less efficient than simply open-sourcing original source code.5 One may also be tempted to put forth a sound rhetorical argument for open-sourcing, though it seems difficult to persuade a company against pursuing profit in a capitalist society. I am personally a major proponent of organizing and collective action as a means of building power.6 The organizing must be directed not just against OpenAI, or even the single issue of closed science, but against the root cause of these problems – the capitalist incentives that currently dictate the priorities of the field, and of society in general. Historically, achieving social change has taken mass movements. Perhaps it’s time to think of an organized way to push back against these forces, and I welcome ideas for doing so.
This blog post was inspired by countless conversations with my labmates in the LINGO lab, and fellow MIT GSU organizers. The following people in particular were of instrumental help in the writing of this blog post: Daniel Shen, Alexis Ross, Tianxing He, and Jacob Andreas.
An analogous scenario: Suppose two people are playing a card game but one player can look at their opponent’s cards, while the other cannot. ↩
In particular, we should organize to rebalance power and ownership of these models into the hands of a democratic majority. The idea of putting a moratorium on LLM research, endorsed by a small number of powerful individuals who wish to continue to maintain the status quo of who gets to access, build, and deploy these technologies, will only continue to perpetuate the problems mentioned here. This is in addition to the many other problems with the moratorium letter, articulated in this response. ↩
Perhaps unrealistically costly, but let’s just suppose this is plausible. ↩
This is, of course, not to say that LLMs don’t have safety concerns (in fact, almost all technologies do!). The question is whether OpenAI’s stated policy actually aligns with what is optimal for community safety, or whether it is simply a veil for being able to close off and monopolize scientific ideas. After all, it seems hardly safe for a single entity to unilaterally decide what safety means for an entire community. ↩
Though reproducibility studies are important, these open-source replicas aren’t built with a primary goal of studying replicability. Furthermore, releasing open-source code and training details dramatically aids reproducibility. ↩
One manifestation of such organizing: labor unions, which can (incrementally) shift political power against large corporations. However, these are by no means the only way to organize, and it’s unclear whether they are best suited for this scenario. ↩