Sustainable AI: a contradiction in terms?


Does Artificial Intelligence (AI) help tackle the climate crisis or is it an environmental sin worse than flying? We explain how much energy AI systems really consume, why we need better measurements, how AI can become more sustainable, and what all this has to do with the EU’s AI Act.
There is a wide range of potential applications for AI systems: They are supposed to make resource consumption more efficient, solve complex social problems such as the energy and mobility transition, create a more sustainable energy system, and facilitate research into new materials. AI is even seen as an essential tool for tackling the climate crisis. However, such optimism ignores the fact that the use of AI also causes a considerable amount of CO2 emissions, which are a major cause of the climate crisis.
In general, ecological sustainability aims to preserve nature in order to leave a planet worth living on for future generations. AI systems are, in most cases, the opposite of ecologically sustainable: They often rely on exploiting social and ecological resources. Nevertheless, they are currently often given the benefit of the doubt, the assumption being that technology will surely sort everything out in the end. In fact, AI has great social potential, but its use also entails dangers and harmful consequences.
There is hardly any information available on AI systems’ energy consumption and the emissions they cause as a result. This makes it difficult to develop political solutions to reduce emissions. It is well known that data centers as well as the production and operation of all hardware contribute heavily to global carbon dioxide emissions. They form the infrastructure necessary for the operation of AI systems. The emissions from the use of AI systems must be added to those caused by their infrastructure.
In technical jargon, this is called the “inference” phase. Each utilization of an AI system during the inference stage usually consumes relatively little energy. However, inference can take place extremely frequently. In late 2022, Facebook AI researchers concluded in a scientific paper that Facebook data centers performed trillions of inference operations each day. Between the beginning of 2018 and mid-2019, the number of servers devoted specifically to inference at Facebook’s data centers increased by 2.5 times, according to the study. At a company like Facebook, this volume of inference comes from things like recommendations and ranking algorithms, for example – algorithms that are used each time Facebook’s nearly 3 billion users worldwide access the platform and view content in their newsfeed.
Other typical applications that contribute to high inference rates on online platforms include image classification, object recognition in images, and translation and speech recognition services based on large language models. Scientists have concluded that the emissions produced in the inference phase of AI models are likely to be significantly higher than those produced during the development and training phases. This assumption is supported by internal figures from Facebook, which confirm that for in-house systems, resource consumption during the inference phase can be, depending on the application, far higher than during development and training.
Consider, for example, the training of the BLOOM model. Energy consumption during the training phase corresponds to the emission of around 24.7 tons of CO₂ equivalents. But if we factor in hardware production and operational energy, the emissions value doubles. Training alone is therefore not a sufficient reference variable when calculating the emissions produced by AI systems. Measurements and methodologically rigorous calculations must cover systems' entire life cycles, both to raise awareness among companies, developers, and researchers and to enable targeted political regulation.
BLOOM contains 176 billion parameters. Parameters are values that a Machine Learning model learns during the training process, and they form the basis for the outcomes the model then produces. The number of parameters also determines the number of computing operations that must be performed and, thus, the amount of energy consumed. An AI language model comprising 540 billion parameters, such as PaLM, which Google published in 2022, is likely to exceed BLOOM's energy consumption. A single training run of PaLM at a Google data center in Oklahoma, which obtains 89 percent of its energy requirements from carbon-free sources, resulted in 271.43 tons of CO₂ emissions. That is roughly equivalent to the emissions produced by a fully occupied commercial jet during 1.5 flights across the United States. Mind you, such processes take place thousands of times, every day.
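To make the relationship between parameter count, computing operations, and energy more tangible, here is a rough back-of-envelope sketch. It uses the common approximation that training a transformer takes about 6 FLOPs per parameter per training token; the token count, hardware throughput, utilization, and power draw below are illustrative assumptions, not figures from this article.

```python
# Back-of-envelope sketch (not from the article): relating parameter count
# to training compute and energy via the common ~6 * N * D FLOPs rule of thumb.
# Token count, throughput, utilization, and power draw are assumed values.

def training_energy_kwh(params: float, tokens: float,
                        flops_per_second: float = 312e12,   # assumed A100-class peak throughput
                        utilization: float = 0.4,           # assumed fraction of peak actually achieved
                        watts_per_device: float = 400.0) -> float:
    """Rough energy estimate for one training run."""
    total_flops = 6 * params * tokens                       # ~6 FLOPs per parameter per token
    device_seconds = total_flops / (flops_per_second * utilization)
    return device_seconds * watts_per_device / 3.6e6        # joules -> kWh

# Illustrative comparison: a BLOOM-sized vs. a PaLM-sized model,
# both assumed to be trained on the same number of tokens.
for name, n_params in [("~176B parameters", 176e9), ("~540B parameters", 540e9)]:
    kwh = training_energy_kwh(n_params, tokens=400e9)
    print(f"{name}: ~{kwh:,.0f} kWh for a single training run")
```

The point of the sketch is only the scaling: with everything else held constant, energy grows roughly in proportion to the number of parameters.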
One can only assume that such emission values represent a relative improvement for a system as large as PaLM, as the associated data center is geared towards sustainability and reduces emissions. But the question remains why more efficient hardware and new training methods were only deployed to make models even larger, rather than to improve the energy efficiency of smaller, yet still quite substantial, models. That isn't just irresponsible from the perspective of resource conservation. Such vast models also make it more difficult to detect and remove discriminatory, misogynistic, and racist content from the data used in training.
Presumably, all major online platforms rely on the support of AI for content moderation. One reason is that deploying AI systems is cheaper than relying exclusively on human moderation. Moreover, the job is psychologically challenging: Moderators are constantly exposed to disturbing content circulating on the internet. But AI moderation systems also rely on human decision-making. Moderators provide the systems with training data and are thus a prerequisite for the systems to be developed in the first place.
Nevertheless, moderators work under extremely poor conditions. Given the psychologically stressful nature of the work, large platforms have an extra obligation to take care of their moderators. Instead, there are constant reports of subcontractors of Facebook, TikTok, or OpenAI not paying their moderators and clickworkers enough, not offering them adequate psychological support, and exerting extreme pressure on them through constant monitoring and threats aimed at preventing unionization. These working conditions are socially unsustainable, as they do not address basic needs such as financial security and mental health. People are exploited for AI systems.
Exploitation for AI technology is not limited to moderation. Certain hardware is required to train and use algorithms. This hardware is housed in data centers, which in turn require minerals contained in batteries and microprocessors. The conditions under which humans have to work in order to extract these minerals are horrific. They are commonly referred to as “blood minerals.” And the electronic waste that the servers eventually turn into is dumped in Asian countries, where people have to suffer the environmental consequences.
AI technology consumes computing power and therefore energy. Finding patterns in data sets during training, and checking during inference whether the predictions based on these patterns are accurate, are both computationally very intensive. When the corresponding servers are in operation, electrical energy is converted into heat. To prevent the servers from overheating, they need to be cooled, and water is typically used for this purpose.
Training a large language model like GPT-3 or LaMDA can easily evaporate millions of liters of fresh water for cooling the power plants and AI servers. This is all the more concerning as water becomes increasingly scarce due to rapid population growth and/or outdated water infrastructure, especially in drought-prone areas. The exponential growth in demand is resulting in an ever-increasing water footprint. For example, Google’s direct water consumption increased by 20 percent between 2021 and 2022, and even doubled in certain drought-hit areas. Microsoft saw a 34-percent increase in its direct water consumption over the same period. ChatGPT needs about 500 ml of water for a simple conversation of 20 to 50 questions and answers. Since the chatbot has more than 100 million active users, each of whom engages in multiple conversations, ChatGPT’s water consumption is staggering. And it’s not only the application’s operational mode: Training GPT-3 in Microsoft’s state-of-the-art U.S. data centers would directly consume 700,000 liters of clean freshwater (enough to produce 370 BMW cars or 320 Tesla electric vehicles).
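To illustrate how quickly the per-conversation figure adds up, here is a trivial scaling calculation; the assumed number of daily conversations is purely hypothetical and only serves to show the order of magnitude.

```python
# Illustrative scaling of the ~500 ml-per-conversation figure cited above.
# The number of daily conversations is an assumption, not a reported statistic.
ML_PER_CONVERSATION = 500                   # ~500 ml per 20-50 question exchange (cited estimate)
assumed_daily_conversations = 10_000_000    # hypothetical volume, for illustration only

liters_per_day = ML_PER_CONVERSATION * assumed_daily_conversations / 1000
print(f"~{liters_per_day:,.0f} liters of fresh water per day "
      f"at {assumed_daily_conversations:,} conversations")   # ~5,000,000 liters/day
```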
The harmful consequences AI products have on the environment initially went unaddressed politically. When the European Commission published its draft AI Act in April 2021, it did not include any mandatory environmental requirements for manufacturers and/or users of AI models. Critics from the research community and civil society argued that companies developing AI systems should be required to provide reliable data on their environmental impact. Such a requirement is essential for assessing the extent of this impact and taking appropriate measures to mitigate harmful consequences. The collective fight against the climate crisis alone demands this. The European Parliament took up the demand in its own draft and included it in the final trilogue negotiations with the EU Commission and the EU Council.
The draft AI Act that resulted from the trilogue negotiations in December 2023 included a comprehensive environmental protection policy. The environment is explicitly referred to as one of the legal interests to be protected. The European Commission is now to task the European standardization bodies with drawing up mandatory reporting and documentation procedures to improve AI systems’ resource performance. These procedures are intended to help reduce high-risk AI systems’ energy and other resource consumption during their life cycle and to promote the energy efficiency of general-purpose AI (GPAI) models.
Two years after the regulation comes into force, the Commission must submit a report on whether the new documentation standards have made GPAI models more energy-efficient. In this report, it must evaluate the measures already implemented and assess what further measures are necessary. It must then file such a report every four years.
According to the final draft, providers of GPAI models that are trained with large amounts of data and consume a lot of energy must document their energy consumption precisely. The Commission had completely neglected this aspect in its first draft, which is why research organizations repeatedly called for AI models’ energy consumption to be made transparent. The Commission must now develop a suitable methodology for assessing the amount of energy consumed. GPAI models that pose a systemic risk must meet more stringent requirements: For example, their providers must develop internal risk management measures and testing procedures, which must be approved by a dedicated authority to ensure compliance with the requirements.
Protecting the EU population from AI systems’ harmful environmental impact is an important legislative step forward. However, one has to wonder why the transparency requirements on energy and resource consumption only apply to high-risk systems. All AI systems have an impact on the environment. For this reason, all systems should be required to disclose the ecological damage they might cause.
Critics often claim that the obligation to measure the environmental impacts of AI systems is too complicated and places too great a burden on small and medium-sized enterprises in particular – and ultimately hinders innovation. But easy-to-use measurement methods already exist for monitoring energy consumption, CO₂-equivalent emissions, water consumption, the use of minerals for hardware, and the generation of electronic waste. Tools for assessing the sustainability of AI systems are already available. Companies just have to make use of them. 
The AI startup Hugging Face has set out to make AI models more sustainable by tackling problems like emissions, bias, and discrimination and by supporting open-source approaches in the ML community. Open source helps with the recycling of models: Instead of training transformer models from scratch each time, you can reuse existing ones. All the pretrained models on Hugging Face can be fine-tuned for specific use cases, which is more environmentally friendly than creating a model from scratch. Several years ago, the main approach was to accumulate as much data as possible to train a model, which would then not be shared. Now, data-intensive models are shared after training, and people can reuse and retune them for their particular use cases.
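As a minimal sketch of this reuse pattern, the following example fine-tunes a small pretrained checkpoint from the Hugging Face Hub on a downstream task instead of training a model from scratch. The model and dataset names are just examples; any compatible checkpoint and task would do.

```python
# Minimal sketch of reusing a pretrained model instead of training from scratch.
# Model and dataset names are illustrative; any compatible checkpoint works.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "distilbert-base-uncased"          # small pretrained model, reused rather than retrained
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

dataset = load_dataset("imdb")                  # example downstream task

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),  # small slice: only fine-tuning consumes compute
)
trainer.train()
```

Only the comparatively short fine-tuning step consumes compute here; the energy spent on pretraining the checkpoint is amortized across everyone who reuses it.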
Hugging Face has set up a database that can be used to search specifically for low-emission models. The emissions numbers indicated there are from training, as it is often impossible to determine the emissions that will result from an AI system’s inference. A lot of companies are interested in how much CO₂ will be emitted during inference, but this depends on a number of factors, including the hardware used and where the computing is being done. Without knowing those factors, it is impossible to provide information on the emissions. Still, a lot of people would find such information extremely useful.
If companies started using tools to measure their ML models’ emissions and disclosing that information, AI models could be assessed based on facts and figures. Tools like Code Carbon calculate a model’s carbon footprint in real-time. It’s a program that runs in parallel to any code and will estimate the carbon emissions at the end. Hugging Face runs a website allowing you to enter information like training hours and the type of hardware used. It then provides an estimate of the system’s carbon footprint. It is less precise than Code Carbon, but works as a rough estimate.
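A minimal sketch of what such a measurement could look like in practice with the CodeCarbon Python package, assuming the actual training code is wrapped in a placeholder train() function:

```python
# Sketch of wrapping arbitrary training code with CodeCarbon's EmissionsTracker.
# train() is a stand-in for any real training or inference workload.
from codecarbon import EmissionsTracker

def train():
    # placeholder for the actual training loop
    return sum(i * i for i in range(10_000_000))

tracker = EmissionsTracker(project_name="demo-model")  # measures power draw while the code runs
tracker.start()
try:
    train()
finally:
    emissions_kg = tracker.stop()                      # estimated kg of CO2 equivalents
print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```

The tracker samples the hardware’s power consumption while the wrapped code runs and converts it into a CO₂-equivalent estimate based on the local energy mix, which is exactly the kind of figure that could be disclosed alongside a model.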
Data-minimalist approaches are one way of reducing energy consumption in the training and application phases of AI models. Minimalism here refers to the amount of data processed with AI: With data minimalism, the data sets used for training and application are kept small. AI applications are supposed to work as efficiently and effectively as possible, so the less data needed for the same procedure, the better. There is a widespread tendency to throw in all the available data, particularly in industrial environments, where there is a lot of data, sometimes far more than needed. Using all of it might yield an incremental improvement of half a percent.
In order to make an algorithm really efficient, harmful data needs to be removed. Data is harmful if it distorts results.
For example, if an AI is supposed to determine an average purchase interest for a given product range, data that only reflects purchasing behavior during marketing campaigns can distort the result: People are not necessarily interested in the item itself, but only in the discount. The data then does not capture natural purchasing behavior but a reaction provoked by the marketing campaign, so no useful conclusions can be drawn from the AI’s output. Such data should ideally not be used for training the AI. And the less data is processed, the less energy is demonstrably consumed.
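A hypothetical sketch of such a filtering step, using made-up column names and campaign dates, might look like this:

```python
# Hypothetical illustration of data minimalism: drop purchase records logged during
# marketing campaigns before training, so the model only sees "natural" behavior.
# Column names and the campaign calendar are invented for this sketch.
import pandas as pd

purchases = pd.DataFrame({
    "product_id": [1, 1, 2, 2, 3],
    "bought":     [1, 0, 1, 1, 0],
    "date":       pd.to_datetime(["2024-03-01", "2024-03-02", "2024-11-29",
                                  "2024-11-30", "2024-12-01"]),
})
campaigns = [("2024-11-25", "2024-12-02")]   # e.g. a discount week

mask = pd.Series(False, index=purchases.index)
for start, end in campaigns:
    mask |= purchases["date"].between(start, end)

training_data = purchases[~mask]             # smaller, less distorted training set
print(f"Kept {len(training_data)} of {len(purchases)} records for training")
```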
Hardware development is making rapid progress when it comes to computing efficiency. If you compare a GPU from this year to one built two or three years ago, there’s a significant difference: It’s 10 times faster. But this efficiency leap leads to people doing more computing, a phenomenon known as the rebound effect. If the models’ size and the amount of computation needed were kept at a constant level, it would be a step towards sustainability. But both are growing fast. The “the bigger, the better” mentality in AI modeling is getting out of hand.
Before the AI Act, there was little discussion about whether the CO2 emissions caused by AI systems should be measured. This was partially due to the fact that there were hardly any suitable instruments available in the past. Without reliable data on emissions, political decision-makers were unable to exert the necessary pressure on the industry. Now, however, such tools are available. Finally, political guidelines can oblige companies to use them to determine their products’ emission values.
Read more on our policy & advocacy work on ADM and sustainability.
