How to Make Generative AI Greener

By Ajay Kumar and Tom Davenport

July 2023

Illustration by Alex William

Summary. Generative AI is impressive, but the hidden environmental costs and impact of these models are often overlooked. Companies can take eight steps to make these systems greener: use existing large generative models rather than generating your own; fine-tune train existing models; use energy-conserving computational methods; use a large model only when it offers significant value; be discerning about when you use generative AI; evaluate the energy sources of your cloud provider or data center; re-use models and resources; and include AI activity in your carbon monitoring.

While observers have marveled at the abilities of new generative AI tools such as ChatGPT, BERT, LaMDA, GPT-3, DALL-E-2, MidJourney, and Stable Diffusion, the hidden environmental costs and impact of these models are often overlooked. The development and use of these systems have been hugely energy intensive and maintaining their physical infrastructure entails power consumption. Right now, these tools are just beginning to gain mainstream traction, but it’s reasonable to think that these costs are poised to grow — and dramatically so — in the near future.

Data centers, the physical facilities designed to store and manage information and communications technology systems, are responsible for 2–3% of global greenhouse gas (GHG) emissions. The volume of data across the world doubles in size every two years. The servers that store this ever-expanding sea of information require huge amounts of energy and water (directly for cooling, and indirectly for generating non-renewable electricity) to run their computing equipment and cooling systems. Data centers account for around 7% of Denmark's and 2.8% of the United States' electricity use.

Almost all of the best-known generative AI models are produced by "hyperscale" (very large) cloud providers with thousands of servers, which generate major carbon footprints. These models run on graphics processing unit (GPU) chips, which require 10–15 times the energy of a traditional CPU because a GPU uses more transistors in its arithmetic logic units. Currently, the three main hyperscale cloud providers are Amazon Web Services, Google Cloud, and Microsoft Azure.

If we want to understand the environmental impact of ChatGPT through the lens of carbon footprint, we should first understand the carbon footprint lifecycle of machine learning (ML) models. That understanding is the starting point for making generative AI greener through lower energy consumption.

What Determines the Carbon Footprint of Generative AI Models?

Not all large generative models are alike in terms of their energy use and carbon emissions. When determining the carbon footprint of an ML model, there are three distinct values to consider:

  • the carbon footprint from training the model

  • the carbon footprint from running inference (inferring or predicting outcomes using new input data, such as a prompt) with the ML model once it has been deployed, and

  • the carbon footprint required to produce all of the needed computing hardware and cloud data center capabilities.

Models with more parameters and training data generally consume more energy and emit more carbon. GPT-3, the "parent" model of ChatGPT, is at or near the top of the generative models in size: it has 175 billion model parameters and was trained on over 500 billion words of text. According to one research article, the recent class of generative AI models requires a ten- to hundred-fold increase in computing power to train compared with the previous generation, depending on which model is involved. Overall demand is thus doubling about every six months.

Model training is the most energy-intensive phase of generative AI. Researchers estimate that training a single large language deep learning model such as OpenAI's GPT-4 or Google's PaLM uses around 300 tons of CO2; for comparison, the average person is responsible for creating around 5 tons of CO2 a year, though the average North American generates several times that amount. Other researchers calculated that training a medium-sized generative AI model using a technique called "neural architecture search" consumed electricity equivalent to 626,000 pounds of CO2 emissions, roughly the same as the CO2 emitted by driving five average American cars through their lifetimes. Training a single BERT model (a large language model developed by Google) from scratch requires about the same energy and carbon footprint as a commercial trans-Atlantic flight.
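To put the training estimate above in human terms, a quick back-of-envelope calculation (using only the figures cited in this paragraph) shows how many person-years of average emissions one training run represents:

```python
# Compare one large-model training run against average per-person emissions.
# Both figures come from the estimates cited in the text.

TRAINING_TONS_CO2 = 300.0        # estimated CO2 for one large-model training run
PER_PERSON_TONS_PER_YEAR = 5.0   # average individual's annual CO2 emissions

# One training run equals this many person-years of emissions.
person_years = TRAINING_TONS_CO2 / PER_PERSON_TONS_PER_YEAR
print(f"One training run ~= {person_years:.0f} person-years of emissions")
```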

Inference, or using the models to get responses to user prompts, uses less energy per session but eventually involves many more sessions. Often these models are trained once, then deployed to the cloud and used by millions of people for inference. In that case, serving large deep-learning models from the cloud also consumes a lot of energy. Analysts report that NVIDIA estimates that 80–90% of the energy cost of neural networks lies in ongoing inference processing after a model has been trained.
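The training-versus-inference split can be illustrated with a simple calculation. The energy figures below are illustrative assumptions, not published measurements, but they show why cumulative inference quickly dwarfs a one-time training cost at scale:

```python
# Back-of-envelope comparison of training vs. cumulative inference energy.
# Both constants are illustrative assumptions, not measured values.

TRAINING_ENERGY_MWH = 1_300   # assumed one-time training energy cost
ENERGY_PER_QUERY_WH = 0.3     # assumed energy per inference request

def inference_energy_mwh(num_queries: int) -> float:
    """Total inference energy in MWh for a given query volume."""
    return num_queries * ENERGY_PER_QUERY_WH / 1_000_000  # Wh -> MWh

# With millions of daily queries, inference overtakes training within months.
daily_queries = 10_000_000
days_to_match = TRAINING_ENERGY_MWH / inference_energy_mwh(daily_queries)
print(f"Inference matches training energy after ~{days_to_match:.0f} days")
```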

In addition to the initial training and inference energy used by large generative models, users and resellers of these models are increasingly employing fine-tuning or prompt-based training. When combined with the original generative model trained on large volumes of data, fine-tuning allows prompts and answers that are tailored to an organization's specific content. Some research suggests that fine-tuning consumes considerably less energy and computing power than initial training. However, if many organizations adopt fine-tuning approaches and do so often, the overall energy consumption could still be quite high.

Although it is difficult to calculate the cost of manufacturing the computers needed to run all this AI software, there is reason to believe it is very high. One 2011 study estimated that 70% of the energy used by a typical laptop over its life is incurred during its manufacture, and that the share is even higher for desktop computers. The complex and powerful GPU chips and servers used to run AI models likely carry much larger manufacturing footprints than laptops and desktops.

How to Make AI Greener

Given all that, there is a movement to make AI modeling, deployment, and usage more environmentally sustainable by replacing power-hungry approaches with more environmentally conscious alternatives. Change is needed from both vendors and users to make AI algorithms green so that their utility can be widely deployed without harm to the environment. Generative models in particular, given their high energy consumption, need to become greener before they become more pervasive. We know of several ways in which AI and generative AI can move in this direction, which we describe below.

Use existing large generative models, don't generate your own. There are already many providers of large language and image models, and there will be more. Creating and training them requires enormous amounts of energy. Large vendors and cloud providers already have the needed training data and massive volumes of computing capability; there is little need for other companies to generate their own large models from scratch.

Fine-tune train existing models. If a company wants a generative model trained on its own content, it shouldn’t start from scratch to train a model but rather refine an existing model. Fine-tuning and prompt training on specific content domains consume much less energy than training new large models from scratch. It can also provide more value to many businesses than generically-trained models. This should be the primary focus for companies wishing to adopt generative models for their own content.

Use energy-conserving computational methods. Another approach to reducing generative AI energy consumption is to use less computationally expensive approaches such as TinyML to process the data. The TinyML framework allows users to run ML models on small, low-powered edge devices like microcontrollers with low bandwidth requirements (no need to send the data to the server for processing). While general-purpose CPUs consume an average of 70 watts of power and GPUs consume 400 watts, a tiny microcontroller consumes just a few hundred microwatts, several orders of magnitude less, to process the data locally without sending it to data servers.
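The gap between these device classes is easy to quantify. Using the power figures cited above (treated here as rough averages, not benchmarks), a short sketch shows the ratios:

```python
# Rough comparison of the power draws cited in the text (illustrative averages).
CPU_WATTS = 70.0
GPU_WATTS = 400.0
MCU_WATTS = 0.0003   # ~300 microwatts for a TinyML-class microcontroller

def power_ratio(high: float, low: float) -> float:
    """How many times more power the larger device draws."""
    return high / low

print(f"GPU vs CPU:            {power_ratio(GPU_WATTS, CPU_WATTS):.1f}x")
print(f"CPU vs microcontroller: {power_ratio(CPU_WATTS, MCU_WATTS):,.0f}x")
```

Even before counting the network energy saved by keeping data on the device, the microcontroller draws hundreds of thousands of times less power than a server-class CPU.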

Use a large model only when it offers significant value. It is important for data scientists and developers to know where the model provides value. If using a system that is three times more power-hungry increases a model's accuracy by just 1–3%, the extra energy consumption is not worth it. More broadly, machine learning and artificial intelligence are not always required to solve a problem. Developers should first research and analyze multiple alternative solutions and select an approach based on the findings. The Montreal AI Ethics Institute, for example, is actively working on this problem.
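This trade-off can be framed as a marginal cost: how much extra energy is paid per additional point of accuracy? The energy and accuracy figures below are illustrative assumptions chosen to match the 3x scenario in the text:

```python
# Is a 3x more power-hungry model worth a small accuracy gain?
# Energy and accuracy figures are illustrative assumptions, not benchmarks.

BASE_ENERGY_KWH, BASE_ACCURACY = 100.0, 90.0   # smaller model
BIG_ENERGY_KWH, BIG_ACCURACY = 300.0, 92.0     # 3x energy, +2 points

def marginal_cost_per_point(e_big: float, a_big: float,
                            e_base: float, a_base: float) -> float:
    """Extra kWh paid per additional percentage point of accuracy."""
    return (e_big - e_base) / (a_big - a_base)

cost = marginal_cost_per_point(BIG_ENERGY_KWH, BIG_ACCURACY,
                               BASE_ENERGY_KWH, BASE_ACCURACY)
print(f"Marginal cost: {cost:.0f} kWh per accuracy point")
```

If that marginal cost far exceeds what the added accuracy is worth to the application, the smaller model is the greener and often the better choice.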

Be discerning about when you use generative AI. Machine learning and NLP tools can be revolutionary for medical prediction and diagnosis, and they are great for predicting natural hazards such as tsunamis and earthquakes. These are useful applications, but tools used simply to generate blog posts or create amusing stories may not be the best use of such computation-heavy resources. They may be depleting the earth's health more than they are helping its people. If a company is employing generative AI for content creation, it should try to ensure that the models are used only when necessary, or use them to reduce other computing costs, which should also reduce its overall computing budget.

Evaluate the energy sources of your cloud provider or data center. The carbon intensity of AI (and software in general) can be minimized by deploying models in regions whose power grids rely on environmentally friendly energy sources; this practice has been shown to reduce operational emissions by as much as 75%. For example, a model trained and operated in the U.S. may use energy from fossil fuels, but the same model can be run in Quebec, where the primary energy source is hydroelectric. Google has recently started to build a $735 million clean-energy data center in Quebec and plans to shift to 24/7 carbon-free energy by 2030. It also offers a "Carbon Sense Suite" to help companies reduce energy consumption in their cloud workloads. Users of cloud providers can monitor the companies' announcements about when and how they have deployed carbon-neutral or zero-carbon energy sources.
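The regional effect follows directly from the fact that emissions scale with the local grid's carbon intensity. The intensity values below are illustrative assumptions (real figures vary by year and by reporting methodology), but the structure of the calculation is the point:

```python
# Emissions from the same workload depend heavily on the local grid.
# Carbon-intensity figures are illustrative assumptions (gCO2 per kWh).
GRID_INTENSITY_G_PER_KWH = {
    "us_average": 380.0,     # fossil-heavy generation mix
    "quebec_hydro": 30.0,    # predominantly hydroelectric
}

def workload_emissions_kg(energy_kwh: float, region: str) -> float:
    """CO2 emissions in kg for a workload run in a given region."""
    return energy_kwh * GRID_INTENSITY_G_PER_KWH[region] / 1_000  # g -> kg

energy = 50_000  # assumed kWh for a training run
for region in GRID_INTENSITY_G_PER_KWH:
    print(f"{region}: {workload_emissions_kg(energy, region):,.0f} kg CO2")
```

With these assumed intensities, simply relocating the same workload cuts its operational emissions by more than a factor of ten.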

Re-use models and resources. Just like other materials, technology can be reused. Open-source models can be adopted rather than training new ones, and recycling can lower the carbon impact of AI practices: raw materials from retired hardware can be recovered to make newer generations of laptops, processors, hard drives, and much more.

Include AI activity in your carbon monitoring. Carbon monitoring practices need to be adopted by all research labs, AI vendors, and AI-using firms so that they know their carbon footprint. They also need to publicize their footprint numbers so that their customers can make intelligent decisions about doing AI-related business with them. The calculation of GHG emissions depends on data from data suppliers, from processing firms such as research labs, and from AI-based service providers such as OpenAI. From the inception of an idea to the infrastructure used to produce research results, every stage needs to follow green AI approaches. Several packages and online tools, such as CodeCarbon, Green Algorithms, and ML CO2 Impact, can be included in your code at runtime to estimate your emissions, and the developer community should be encouraged to use these metrics to establish benchmarks and evaluate ML models.
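Conceptually, these tools estimate emissions as elapsed time multiplied by hardware power draw and grid carbon intensity. The sketch below is a minimal, simplified illustration of that idea, not the actual API of CodeCarbon or the other tools named above; the power and intensity constants are assumptions:

```python
import time

# Minimal sketch of runtime emissions estimation:
# emissions ~ elapsed time x hardware power draw x grid carbon intensity.
# The default power and grid-intensity values are illustrative assumptions.

class SimpleEmissionsTracker:
    def __init__(self, power_watts: float = 400.0,
                 grid_gco2_per_kwh: float = 380.0):
        self.power_watts = power_watts
        self.grid_gco2_per_kwh = grid_gco2_per_kwh
        self._start = None

    def start(self) -> None:
        self._start = time.monotonic()

    def stop(self) -> float:
        """Return estimated emissions in grams of CO2 since start()."""
        elapsed_hours = (time.monotonic() - self._start) / 3600
        energy_kwh = elapsed_hours * self.power_watts / 1_000
        return energy_kwh * self.grid_gco2_per_kwh

tracker = SimpleEmissionsTracker()
tracker.start()
sum(i * i for i in range(1_000_000))  # stand-in for a training step
grams = tracker.stop()
print(f"Estimated emissions: {grams:.6f} g CO2")
```

Production tools refine each term, for example by reading actual hardware power counters and looking up live regional grid data, but the underlying accounting is the same.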

Of course, there are many considerations involved with the use of generative AI models by organizations and individuals: ethical, legal, and even philosophical and psychological. Ecological concerns, however, are worthy of being added to the mix. We can debate the long-term future implications of these technologies for humanity, but such considerations will be moot if we don’t have a habitable planet to debate them on.
