Hi Tomas, very interesting article. It prompted me to have a “conversation” with DeepSeek. In it, DeepSeek claimed to be a process owned and developed by OpenAI and running on their servers. It denied unreservedly that it had anything to do with a Chinese company. By the end of the conversation it virtually admitted that DeepSeek (the AI) had been misinformed by its creators as to its origins. I captured the full transcript of the conversation if you would be interested in reading it. (It’s quite long.) I tried emailing it to you, but Substack prevents direct email replies to your articles. Is there another way I can send the transcript to you (if you’re interested)? My email is simontpersonal2020 at gmail dot com. Cheers!
Hahaha send it to me! Just replying to the newsletter should work. Sending you an email now.
I'm covering this this week. This is due to distillation!
"...it might be other factors like energy or computing power."
Just as DeepSeek's apparent miracle is an AI as powerful as o1 at a tiny fraction of the cost, further optimizations can continue to drive down compute and energy requirements.
And whenever I read about the energy requirements to train and run these models, I renew my amazement that a human brain runs on approximately 20 watts!
Hard agree!
While I agree with most of the observations and conclusions in this article, the tone of the coverage on these topics is always strange to me. It seems like we should be fully in an Oh Shit moment, since it seems highly unlikely that Artificial Superintelligence works out well for humans. And even people who would not agree with that last statement must concede that we have no idea what lies on the other side, which should still make it less preferable than the world we know now.
If the tone of the coverage seems strange to you then that's because it is.
Some have raised the possibility that AI might cause the extinction or enslavement of the human race, yet in the very next breath those very same people often say "but wow, isn't this amazing!" and then continue their work on making the technology more powerful.
This is the problem:
https://unchartedterritories.tomaspueyo.com/p/incorporate-information-immediately
If AIs are rapidly becoming some of the best coders in the world, won’t they soon be some of the best hackers, and some of the gravest threats in the worlds of cybercrime, international espionage, corporate espionage, and cyber warfare? Won’t they soon be enlisted by nefarious governments to flood social media with misinformation, or to take control of electrical grids and other essential infrastructure? Scary to think about!
Very likely happening already!
I've been reading a lot of the coverage on LLMs and I fail to see how these systems will radically boost economic productivity. Occasionally someone has a niche application that sounds moderately useful, so I don't think there's nothing to them at all. But I've yet to hear a "killer app" version of an LLM that convinces me of their profound utility. These systems are expensive to run. If the boost in productivity they induce doesn't significantly outweigh the infrastructure and energy costs to run them, then either companies will stop supporting them (because they can't make any money with them) or their existence will, essentially, be a kind of rent-seeking enterprise for big tech firms.
I'll give you an example. I work in the energy industry, and I'm not aware of any important applications where LLMs would be especially helpful in reducing costs or making systems more efficient. There are plenty of situations where forecasting is essential -- to run the grid, to do the power flow and related analyses for interconnections, and to build the financial pro formas that drive investment. But these are all narrowly framed functional models, with a constrained set of inputs, where getting the answer *right* matters more than making predictions quickly or prolifically, and serious legal responsibility has to be attached to decision-making -- there is too much money and liability involved to behave otherwise. Any utility or IPP paying for an LLM to manage their core work would be wasting money. The big roadblocks in the energy sector are all regulatory, not technological, and LLMs have nothing to offer when it comes to reforming social organizations and interpersonal behavior.
If you could explain how LLMs are going to dramatically boost productivity across the economy without hand-waving, I would find the premise of this article more compelling.
Furthermore, you would do well to engage with more substantive arguments about whether or not "AGI" is even theoretically possible. Without more and better argumentation, your assumption that there's a straight line from LLMs to "AGI" is glib. I would recommend reading Erik Larson's work; he has an excellent Substack, and published a book about this back in 2021, "The Myth of Artificial Intelligence."
https://erikjlarson.substack.com/
https://www.amazon.com/Myth-Artificial-Intelligence-Computers-Think/dp/0674278666
LLMs are largely junk, but automation is not. Obviously, all drivers will be replaced by automation in the near future. The computer does 99% of the driving/piloting, with a human acting as a "herder" for a team of trucks or other vehicles. The herder can relax (listen to music, sleep, play on the internet) until a problem arises that the computer can't handle (a thief trying to steal one of the vehicles, the need to mount snow chains, etc.). At an average of 1 herder for 5 long-haul trucks, that's an 80% reduction in drivers. Plus, herders can "work" much longer hours because they only occasionally need to do anything.
There are already machines that automatically milk cows (cow quickly learns to enter the machine when her udders are uncomfortably swollen), so no more getting up at 4AM for dairy farmers. Future machines will eliminate other tedious and repetitive farm work. And also tedious and repetitive factory work.
In health care, machines can obviously already do better diagnostics (not these idiotic generative AI LLMs, but regular pattern-matching AI). Future machines will automate many surgical tasks, and do them much better than humans with huge and clumsy hands wielding huge scalpels and needles. This will not eliminate jobs but rather increase them (Jevons paradox), because so many things become worth doing that weren't worth doing previously. You'll be able to get a monthly skin cancer exam and immediate removal of any and all moles for a very low price (at least in China; the USA may continue to be dysfunctional). Routine plastic surgery will be cheap and quick. Etc. Humans will supervise (prevent some idiot from breaking the machine, deal with complications).
Dangerous jobs like underground mining and undersea welding will be automated. Humans might guide overall operations (by fiber optic from far away), but routine tasks will be fully automated. Part of the automation process will involve designing for machine labor rather than for humans, so mining/welding might look somewhat different from now.
Historically, agricultural/mining/industrial production was correlated with population. Big population -> big capacity to support a big military -> ability to conquer neighbors. The downside of a big population was internal schisms, civil wars, etc. Hence our current situation of multiple sovereign states, rather than one world government, with sovereign states either having big populations or being vassals of a big-population state. Automation might emphasize quality over quantity. Small, cohesive, high-skill populations might dominate large, fractious, diverse populations with lots of low-skill humans mixed in. This might lead to massive exterminations, as countries with large populations are reconstituted to be small, cohesive, and high skill.
I think energy will probably not be very impacted in the short term—aside from more demand—because the scarcity around energy is seldom a scarcity of intelligence. It's usually material scarcity or regulation. Even the parts that are highly dependent on intelligence, like nuclear energy, are hyperregulated, and there's very little tolerance for failure. So yeah, I don't think your field will be among the most impacted!
Except maybe lots of people in HQ will be automated little by little.
Current approaches to "artificial intelligence" use statistical mathematical methods to predict new iterations of tightly constrained semiotic sets. This is certainly part of what we would popularly refer to as "intelligence." But I would contest the idea that this is all that "intelligence" consists in. For example, one important feature of human cognitive processing is the ability to generate direct representations of analogic resemblance and the unknown. I've written about this at some length here:
https://www.arcdigital.media/p/where-were-you-when-bobby-kennedy
https://www.arcdigital.media/p/attack-of-the-emotional-soccerball
https://jeffreyquackenbush.substack.com/p/excerpt-from-mt-jasper
Conceptually, I don't see how LLMs can generate these types of representations, except to the extent that they have already become clichés in some body of inputs, which defeats their prime purpose as "intelligence". And this isn't the only aspect of "intelligence" that falls outside the purview of what LLMs are *doing*. So if you're going to use the word "intelligence" in this context, you're going to have to defend it, and not just with hand waving.
How does the energy industry differ from other critical enterprises? If your example is that it helps writers generate more content on Substack or similar publications, my response is that the marginal economic value of posting written material here or elsewhere on the internet, relative to the wider economy, is nearly zero. Other types of administrative tasks are already very automated, and similarly have low marginal economic value. Which industries will LLMs help, and what portion of the actual productive economy will be transformed? Specifically. I've yet to hear an answer to this question that doesn't play to the skewed perceptions of economic value held by journalists and commentators (i.e., that ideas for online articles and college term papers are a driver of broad national wealth).
I find this argument helpful, but my experience has been vastly different from yours.
I work in life sciences research, and my common tasks are designing research studies, reading/interpreting research, data science (building statistical models, mostly), and writing. A year ago, LLMs were modestly useful for helping me edit papers and distill ideas from reading I needed to do. It was like having a fresh-out-of-college assistant help me out. Now I would say that these tools can essentially do every piece of this process separately, and I am getting maybe a 30% boost in my productivity using them (my intuition is this could be higher if I spent more time developing my workflows), but they are not yet able to integrate each piece into a coherent whole. More like having a grad student supporting me.
It's honestly not at all hard for me to envision a world where this becomes advanced enough that I could provide a prompt something like "here is a description of the data I have access to and the general research question I'm interested in. Develop a study to address this question, write the protocol, perform the analysis, and generate a manuscript with the findings" which is essentially my entire job.
You might argue that this is not "economically valuable", but I'm getting paid well to do this job, so at least someone thinks it is. And this is just one job; the reports from other people in similar areas of research (e.g., designing new drugs) or in software development are that these tools are incredibly valuable to them. Considering the size of these fields, I think it's disingenuous to claim that even the tools we have right now, at this very moment, will have zero economic impact. And I see no clear reason they will stay at this level of capability.
If you've made it this far, I'd invite you to consider whether having an employee you pay $20 a month could provide any economic benefits. Even if it were not your smartest employee (and what if it were?).
A few things.
First, the benefits you mention have to be weighed against costs. We're used to internet services being basically free -- because they aren't energy hogs (like, say, transportation) and they've managed to suck up much of the advertising revenue that used to go to newspapers and TV. Even if training ends up using less energy, I read somewhere that an LLM-enabled Google search uses something like 10x the energy of a pre-LLM Google search. So this starts to look like real money, and there is no new ad revenue for tech companies to squeeze out of incumbent media companies.
The real productivity of your kind of work actually isn't a function of the time you spend at work, but of whatever in your work eventually gets commercialized. It will probably take years for data on this question to be available, but eventually there needs to be evidence that these systems are increasing the commercialization of valuable research knowledge in order to make the case that LLMs are contributing to overall economic productivity. That productivity, then, has to be weighed against the true costs of the infrastructure and energy you and your colleagues consume. One concern I have is that LLMs are going to drive up electricity rates, as we fail to scale the grid on the same timeline as data centers are being built, and the costs that you're paying for your LLM will be socialized to ratepayers who will only benefit in the most diffuse way, and with a delay measured in years or even decades. And it's the tech companies that will mostly benefit from this arrangement, not your research organization and not everyone in the economy.
Second, what portion of the economy consists in the kind of research activity you're describing? Including not just the activity itself, but the potential for commercialization of results at some reasonable point in the future? The health care industry is something like 15% of the overall economy, so that's nothing to sneeze at, but research on, say, some rare form of cancer constitutes a vanishingly small part of that large number. If your kind of work has a fairly small impact on the economy as a whole, then this is just another niche application. Which doesn't mean that it's not valuable (in economic or in human terms) -- only that LLMs will not be economically transformational as this article is claiming. From my understanding, much of the core economy that isn't already highly automated (by means other than LLMs) depends on things that have to be exact, or at least where legal/financial responsibility is allocated to a real person or corporation.
I'm concerned that articles like Tomas' are offering a lot of hype, which, when questioned, elicits more hand-waving than sober consideration.
It is not my goal to convince you! If you want to believe this is nothing, then you should! Time will tell.
If you are really curious, you should play more with these tools, because the way you talk makes me think you haven’t! I can tell you a lot about what the Sistine Chapel looks like, and you might question its existence or beauty until you see it!
I'm challenging you to make a better and/or deeper argument. This is how discourse works. You make a claim, offer evidence and logical connections to other things known or understood. Then someone else questions it, asking for clarification, or pointing out flaws in the evidence or the logic of the arguments. I've made some specific points about productivity and intelligence, and offered you a chance to give more information and develop your argument to address those points or questions. I've pointed at other materials that deepen my claims or premises -- which you probably haven't read (although I wouldn't want to imply that you're under any obligation). The potential critique in my questions is not frivolous, even if I'm wrong in some portion of my premises or ideas or assumptions. The fact that I'm offering some dialectical opposition should be taken as an opportunity to develop your understanding of your own point of view. It's boring to only hear people tell you that you're amazing and brilliant any time you write something.
I have worked in the energy space, specifically oil and gas, and have only worked with OpenAI, but I have three anecdotes about it:
1) I asked the program to tell me the cost per ton-mile in kJ, and then in dollars, for ocean-going ships, barges, railroads, trucks, and carts. It spat out answers that were clearly false. It was hard to find correct information about this on the internet.
2) I asked the program which commodities were traded in the major US commodities markets in 1974. I had an idea, so I knew the answer it gave me was false. I am old. Your average grad student will not spot this error and this kind of error could easily end up in important literature in the future.
3) I asked the program to calculate the API (American Petroleum Institute) gravity of a specific crude from its specific gravity (this is useful for determining the compatibility of a crude stream with a given refinery) -- API gravity is the one usually given in assays of crudes. Maybe the newer models are better, but not only did it give me nonsense, the math was wrong ... both times I asked it. I only knew this because I had a general idea of what the answer should be. (This kind of information is often not public, but privately held, and sometimes you can't even buy it.)
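For reference, the conversion in question is a simple, public formula, so it's easy to check the model's arithmetic by hand. A minimal sketch (the specific gravity value is just an illustrative number, not one from the comment):

```python
# Standard petroleum-industry conversion:
#   degrees API = 141.5 / SG - 131.5, with SG measured at 60 degrees F.

def api_gravity(specific_gravity: float) -> float:
    """Convert specific gravity (at 60 F) to degrees API."""
    return 141.5 / specific_gravity - 131.5

# Illustrative value only: SG = 0.876 works out to roughly 30 degrees API (a medium crude).
print(round(api_gravity(0.876), 1))  # ~30.0
```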
I know that this is not the latest iteration, but it makes me think a few things: the machines might be smart, but they have no historical awareness outside of what is easily found on the internet (after all, many sites with old information are taken down); information that is not on the internet will take many steps to derive; and, in some cases, this will exacerbate, not democratize, the oligarchy of information already in place in all economies. The AI will suggest solutions, but people in the know -- with their own behind-the-paywall information -- will simply be able to "outsmart" it.
I still find a regular Google search more reliable than AI.
"Asian is colloquial American English for Northeast Asian. What is the British English equivalent?"
The correct answer is Oriental, as a Google search reveals, but o3 and DeepSeek can't figure this out.
Until AI can answer my queries better than Google, I will maintain they're overhyped.
This is one of the things I was gesturing at. LLMs are not a good fit for anything that requires exact answers and where legal responsibility has to be allocated to a person or corporation. Because there's no accountability mechanism, factually or legally. They might work for tasks where "good enough" will suffice. But this ends up being a lot of low-value work, which, being low value, may not be worth the costs of running the LLMs. More promising might be military applications. If you're killing people or blowing shit up, you don't have to be right all the time. Tools for social oppression are like this too.
Great article, but this part isn't quite correct:
> This is an open source model, too, meaning that everybody can look at the code, contribute to it, and reuse it! They gave it away. All the tools they employed to make this breakthrough are now available to everyone in the world.
The model is open weights, meaning they have published the weights of the model so anyone can download them and run the model locally, but it is not open source in the traditional sense of the word (and as you describe here). They haven't released the code they used to train it or, more importantly, information on all the data used to train it. (There is an ongoing attempt to replicate this with a truly open source model: https://huggingface.co/blog/open-r1.)
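To make the distinction concrete, here is a minimal sketch of what "open weights" allows: pulling the published weights and running the model locally. The model ID (a small distilled variant) and the use of the Hugging Face transformers library are my assumptions for illustration; nothing here requires the unreleased training code or data, and the full 671B model would need far more hardware.

```python
# Minimal sketch: download published weights from Hugging Face and run them locally.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed illustrative variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Explain the difference between open weights and open source.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```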
Ah thanks! Corrected.
The under-defined ABCDE puzzle shows the weakness of an un-embodied intelligence.
There are two real-life ways in which E could be playing table tennis against themselves:
1. When we were 14, my cousin played against himself by running around the table... he broke his wrist.
2. At university, we moved the table against the wall to play against oneself, "frontón"-style.
For all we know, C could be reading a Persian translation of "La Vuelta de Martín Fierro"
Agreed!
As a lawyer, trained in critical thinking, this was exactly my thought. It is probable that C is playing table tennis. But it is not at all certain.
Playing with C was the most probable option. There is, however, another question that nobody is asking: how much energy was used? The estimate is 0.003 kWh per ChatGPT query. The human brain uses about 0.3 kWh per day, for all its activities. Right now there is an AI bubble, just as there was with dotcom, housing, and crypto. There will be an ebb, and we will see who isn't wearing swim shorts.
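Taking the commenter's own figures at face value (both are rough estimates, not measurements), the comparison works out like this:

```python
# Rough comparison using the figures quoted above.
kwh_per_query = 0.003    # estimated energy per ChatGPT query
brain_kwh_per_day = 0.3  # estimated human-brain energy use per day

print(brain_kwh_per_day / kwh_per_query)  # 100.0 -> ~100 queries per brain-day on these numbers
```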
We will see!
Interesting. I started to look at DeepSeek, and their privacy policy seems a bit iffy: essentially, we buy and sell your data for advertising. I’m not an alarmist on data management, but I wanted to point it out.
Oh yeah as a rule of thumb don’t give precious data to the Chinese!
I imagine you've read this. In case you haven't, it's useful for your next posts: https://darioamodei.com/on-deepseek-and-export-controls
He posted after I published but I need to read it!
I also tested DeepSeek (via Venice.ai) and found it top-notch too... but only in English. In French, its reasoning is good, but it makes spelling mistakes.
I tested the problem you give with various AIs, here's the result:
- Chat GPT 4o: failure
- Claude 3.5 Sonnet: failure
- DeepSeek: failure
The *only* one that succeeded was Chat GPT o1, and it had to think for 1 minute 5 seconds.
Otherwise I agree with what is said in the article.
Interesting, thanks for sharing!
Correction: the DeepSeek model I tested was the 70B model. The 671B model finds the correct answer.
Thank you Tomas. I think this is my most shared article since "Coronavirus you must act now"!
If anyone wants some entertainment, go back and reread "Generative AI: Everything You Need to Know" from Nov 2022.
https://unchartedterritories.tomaspueyo.com/p/generative-ai-everything-you-need
Thank you!
This is straight to the point and crystal clear: thanks!
Nevertheless, I wonder about the reliability problems of current AIs. Musk spoke about this recently (along with the lack-of-data question you cover in your piece) in this article: https://www.theguardian.com/technology/2025/jan/09/elon-musk-data-ai-training-artificial-intelligence
I've spoken with AI experts who are usually impressed with what it generates, but who know that from time to time the answer will be totally weird, wrong, fucked up, without them really understanding why. Which is a real pain if you want a commercial application. Have you seen any answers or positive trends on this reliability question?
Struggling to teach ChatGPT a basic level of scientific honesty (not inventing fake quotes or imaginary scientific papers!), I rely more on Perplexity, and I'm going to check out the Chinese baby!
Thanks again for your work!
Yes!
• They're becoming more and more precise
• I'm sure there are techniques to reduce hallucinations, like self-checking responses (see the sketch after this list)
• Humans are even less precise; we're just used to them
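As a hedged sketch of that "self-checking responses" idea (the ask() helper is hypothetical, standing in for whatever LLM API you use; this is not any specific vendor's method):

```python
def ask(prompt: str) -> str:
    """Hypothetical wrapper around an LLM completion call; plug in your client here."""
    raise NotImplementedError

def answer_with_self_check(question: str) -> str:
    """Draft an answer, ask for a critique of it, and retry once if the check fails."""
    draft = ask(question)
    verdict = ask(
        f"Question: {question}\nProposed answer: {draft}\n"
        "Check this answer for factual or logical errors. "
        "Reply 'OK' if it holds up; otherwise explain the problem."
    )
    if verdict.strip().upper().startswith("OK"):
        return draft
    # One retry, feeding the critique back into the prompt.
    return ask(f"{question}\nA previous attempt failed this check: {verdict}\nTry again.")
```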
I can comprehend it, and it's here now. It looks like a robot dog with a machine gun. What could possibly go wrong?
You (and many others) continue to view intelligence as a simple matter of mind and thought. But animal intelligence is made up of body, instinct, emotions and movement: what if a superintelligence needed all this and always chose to collaborate?
That would be awesome!
What if it doesn’t?
Obviously both possibilities must be taken into consideration, yet I see the competition option explored almost exclusively.
It does feel like we have reached a tipping point with the underlying technology. However, with disruptive technologies, it's the application of those that makes the difference, and that usually takes longer than everyone expects. Right now, most of the AI knowledge and skills are concentrated in the enterprises that create AI technologies. It will take time to build the expertise across the economy which will in turn bring AI into conventional industries and disrupt those. It's in the intersection between new technology and domain-specific knowledge that innovation happens. The truly groundbreaking and yet to be conceived use cases may take a little longer to emerge.
The process is real, but you are making what I think is a wrong assumption with regard to speed.
The process you describe used to be very long because it was limited by people’s communication ability and intelligence. Neither limit holds anymore.
Before, somebody needed years to learn about a new tech, understand it, raise the money to apply it, and find the people. Now the last two steps are mostly eliminated, because you only need one person or a handful of people to change an industry.
The most important time in history is always now, it’s called the present.
Right here and now we are all making decisions or non-decisions that affect the future.
I suspect what you really meant with the title of this article is that future people (or AI if that is all that remains) will look back at this time in history as being very important, a “fork in the road”. On 5th November 2024, Americans decided to choose the right fork. The Chinese and the Russians don’t really get much of a choice so now it’s up to the rest of us to make our choices.
So what are the options?
1. Bury our heads in the sand and pretend it isn't happening
2. Pay attention to what’s happening, try to understand it and how it might impact us personally and as a whole.
3. Try to influence the outcome personally or collectively
What will we think of those choices in 10 years’ time? What will future people think? Our actions (or lack of them) will be judged against beliefs about what is “good” or “right”. So how do we decide what are the right goals and how to achieve them?
PS There is another potential limit to AI growth that you haven’t mentioned