Large language models (LLMs) such as ChatGPT (OpenAI) and Gemini (Google DeepMind), along with generative image models like Midjourney, have in a very short time become essential tools whose uses keep growing and diversifying. It is true that the fluidity of exchanges with ChatGPT is impressive, and that the promises of further development are exciting.
[Article from The Conversation, written by Paul Caillon, PhD in artificial intelligence, Université Paris Dauphine – PSL, and Alexandre Allauzen, University Professor in Machine Learning and Natural Language Processing, Université Paris Dauphine – PSL]
Nevertheless, these promises hide considerable computational, and therefore energy, costs. The dominant idea in the generative model industry today is: "the larger the model, the better." This race is accompanied by growing energy consumption and, therefore, an ecological footprint that can no longer be ignored and that raises questions about its sustainability and viability for society.
Why such a cost?
A generative text model like a chatbot is a set of numerical parameters adjusted from data to accomplish a specific task. The dominant architecture is based on "transformers".
Transformers take a sequence as input, for example a prompt (your question), and transform it numerically. By stacking transformer layers, the model multiplies these transformations in order to build the answer by extending its input. This stack of layers gives the model its effectiveness and increases the number of parameters. This is why a model such as GPT-4 contains at least 1 trillion (1,000 billion) parameters and therefore requires at least 2 terabytes (TB) of RAM to be usable.
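The "1 trillion parameters, 2 TB of RAM" figure follows from a simple rule of thumb. A minimal sketch, assuming each parameter is stored in 16-bit (2-byte) half precision, which is what the article's ratio implies:

```python
# Back-of-envelope estimate of the RAM needed just to hold a model's weights.
# The 2-bytes-per-parameter default assumes 16-bit (half-precision) storage;
# other precisions (4-byte float32, 1-byte int8) would scale accordingly.
def model_memory_tb(n_parameters: float, bytes_per_param: int = 2) -> float:
    """Return the weight-storage footprint in terabytes (1 TB = 1e12 bytes)."""
    return n_parameters * bytes_per_param / 1e12

# 1 trillion parameters at 16-bit precision -> 2.0 TB, matching the article.
print(model_memory_tb(1e12))
```

Note that this counts only the weights; serving a model also needs memory for activations and request batching, so real deployments require even more.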
Whether for training, for storing data and parameters, or for computing an answer, increasingly powerful computing infrastructures are therefore essential. In other words, contrary to what is often believed, it is not only training the model that makes these techniques so expensive.
From data emerges "knowledge"
Above all, a generative model must first be trained. For this, data (texts, images, sounds, etc.) are presented to it repeatedly in order to adjust its parameters. The more parameters there are, the more costly the training phase becomes, in time and in energy.
Thus, for an LLM, we are talking about on the order of ten trillion items of training data (around 10 trillion for GPT-4 and 16 trillion for Gemini) and around three months of pre-training on about 20,000 NVIDIA A100 chips for the latter. The most capable models are actually a combination of several huge models (a "Mixture of Experts"), GPT-4 being, according to the rare information available, the result of 16 experts of 110 billion parameters each.
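The Mixture of Experts figures imply a distinction between total and active parameters. A hedged back-of-envelope sketch using the article's numbers (16 experts of 110 billion parameters); the "2 experts active per token" routing is a common MoE convention and purely an assumption here, since GPT-4's actual routing is not public:

```python
# Total vs. active parameters in a Mixture of Experts.
# Figures from the article: 16 experts x 110e9 parameters.
n_experts = 16
params_per_expert = 110e9
active_experts = 2  # assumption for illustration; not confirmed for GPT-4

total_params = n_experts * params_per_expert    # all weights that must be stored
active_params = active_experts * params_per_expert  # weights used per token

print(f"total:  {total_params:.2e} parameters")
print(f"active: {active_params:.2e} parameters per token")
```

The design choice behind MoE is visible here: the model stores far more parameters than it uses on any single token, trading memory for reduced per-token compute.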
After this learning phase, the model is deployed to respond to users in a so-called "inference" phase. To meet demand (these systems are built to serve many people at the same time) with a satisfactory response time, the model is duplicated across different computing clusters. A research article also notes that versatile generative architectures consume significantly more energy at inference than task-specific systems, even at equivalent model size.
This overview of computing needs gives an idea of the orders of magnitude hidden behind our interactions, which seem so fast and effortless, with these enormous models. Above all, it allows us to pose the question of evaluating these models differently, including the question of sustainability in energy and ecological terms. Recent work thus proposes a model for assessing the environmental impact of manufacturing graphics cards, as well as a multi-criteria analysis of the training and inference phases of machine learning models.
Obsolescence and frugality
Large generative models thus require colossal hardware infrastructures.
Beyond economic considerations, it has been shown that past a certain point, performance gains do not justify such an explosion in the number of parameters. Not all applications require huge models, and more modest approaches can be just as effective, faster, and less expensive.
On the environmental level, training and inference of massive models have an energy cost that demands reflection. The work of certain authors highlights the difficulty of precisely measuring the carbon footprint of these large models, while showing their considerable impact: 50.5 tonnes of CO₂ equivalent (CO₂ eq) for a 176-billion-parameter model trained in 2023... and practically considered obsolete today. As a reminder, while an average French person currently emits around 10 tonnes of CO₂ eq per year, the target for 2050, to honour the commitments of the Paris Agreement, is around 2 tonnes of CO₂ eq per person per year.
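These three figures can be put side by side with simple arithmetic, all taken directly from the article:

```python
# Training emissions of one 176-billion-parameter model vs. individual
# yearly carbon budgets, using the article's figures.
TRAINING_T = 50.5               # tonnes CO2 eq for one training run
CURRENT_T_PER_PERSON_YEAR = 10  # average French person's yearly emissions
TARGET_2050_T = 2               # Paris Agreement per-person target for 2050

# One training run equals roughly 5 person-years at today's emission rate...
print(TRAINING_T / CURRENT_T_PER_PERSON_YEAR)
# ...and roughly 25 person-years at the 2050 target.
print(TRAINING_T / TARGET_2050_T)
```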
As for the inference phase (the phase of use, when a question is asked), when it is performed millions of times a day, as is the case for a conversational assistant, it can incur a considerable energy cost, sometimes much higher than that of training.
Thus, a tool developed in 2019 made it possible to estimate that a single inference of ChatGPT 3.5 produced approximately 4.32 grams of CO₂.
At a time when conversational assistants may be replacing standard search engines (Google, Bing, Qwant), the question of their use arises, since a standard search has a cost 10 to 20 times lower (0.2 grams of CO₂ per search, according to Google).
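The per-query figures above become striking once aggregated over daily traffic. A hedged order-of-magnitude sketch; the 10 million queries/day volume is an illustrative assumption, not a figure from the article:

```python
# Daily emissions of a chatbot vs. a standard search engine, using the
# article's per-query figures. Query volume is an illustrative assumption.
G_PER_CHAT = 4.32     # grams CO2 per chatbot inference (article figure)
G_PER_SEARCH = 0.2    # grams CO2 per standard search (Google's figure)
QUERIES_PER_DAY = 10_000_000  # assumption for illustration

chat_tonnes_per_day = G_PER_CHAT * QUERIES_PER_DAY / 1e6    # grams -> tonnes
search_tonnes_per_day = G_PER_SEARCH * QUERIES_PER_DAY / 1e6

print(f"chatbot: {chat_tonnes_per_day:.1f} t CO2/day")
print(f"search:  {search_tonnes_per_day:.1f} t CO2/day")
print(f"ratio:   {G_PER_CHAT / G_PER_SEARCH:.1f}x")
```

The per-query ratio of about 21.6 is consistent with the article's "10 to 20 times" range.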
Finally, the concentration of power among the few actors with the resources needed to develop these models (data centres, data, skills) poses scientific problems by limiting the diversity of research, but also strategic and political ones.
Research on frugal AI
Frugality consists in fixing a resource envelope (compute, memory, data, energy) from the start and designing models capable of adapting to it. The idea is not to sacrifice performance, but to favour sobriety: optimizing every step, from the choice of architecture to data collection, including lighter learning methods, in order to reduce the environmental footprint, broaden access to AI, and promote genuinely useful applications.
The resurgence of research work on this theme illustrates the desire to think about AI from the angle of sobriety. This means putting relevance, societal impact, and sustainability back at the heart of research.
Concretely, many avenues are emerging. In terms of learning, the idea is to explore algorithmic alternatives to the current paradigm, inherited from the mid-1980s and never really questioned, even though today's quantities of data and computing power have nothing in common with those that prevailed when these models first appeared.
Thus, beyond technical optimizations, a fundamental methodological reflection is essential, as the scientific context has evolved since the 1980s. This reflection is at the heart, for example, of the Sharp project, funded by the France 2030 programme. The study of more compact and specialized architectures is also pursued by the Adapting project of the same programme.
Applied mathematics can play a key role by offering sparse representations and factorization methods, or by optimizing the use of scarce annotated data.
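One concrete instance of the factorization idea is replacing a dense weight matrix with a product of two thin matrices. A minimal sketch; the dimensions below are illustrative, not taken from any specific model:

```python
# Parameter savings from low-rank factorization: a dense d x d weight
# matrix is replaced by the product of a (d x r) and an (r x d) matrix.
# Dimensions here are illustrative assumptions.
def dense_params(d: int) -> int:
    """Parameters in a full d x d weight matrix."""
    return d * d

def low_rank_params(d: int, r: int) -> int:
    """Parameters in the two low-rank factors (d x r) and (r x d)."""
    return 2 * d * r

d, r = 4096, 64  # illustrative hidden size and rank
print(dense_params(d))        # dense:    16,777,216 parameters
print(low_rank_params(d, r))  # low-rank:    524,288 parameters (32x fewer)
```

The trade-off is that the factored matrix can only represent rank-r transformations; the bet of frugal approaches is that for many tasks this restricted capacity is enough.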
Thus, by working under resource constraints, this research aims at an AI development that is more frugal, and therefore more sustainable, as well as more accessible and independent of the hyperconcentration of the market. It limits the negative externalities (environmental, ethical, economic) linked to the frantic race towards gigantism.
But to achieve these objectives, it is also important to make progress on criteria and methods of evaluation in AI: under the current dominant paradigm, the dimension of frugality still struggles to gain traction, whether in research or in industry. The recent explosion of DeepSeek tools should not be confused with frugality, as their costs in compute and data are also extremely high, with methods that are probably questionable from an ethical standpoint.
The academic world must therefore better integrate this dimension in order to improve the visibility and recognition of work that aims at frugality.
Is the AI we develop really useful?
Frugality in AI is not a mere concept, but a necessity in the face of current challenges. Recent work on AI's carbon footprint illustrates the urgency of rethinking our methods. Before even considering ways to make AI more sober, it is legitimate to ask whether the AI we develop is really useful.
A more frugal approach, better thought out and better targeted, will make it possible to build an AI oriented towards the common good, relying on controlled resources rather than on a permanent escalation in size and computing power.




