After the camera was developed, it was used to verify reality, as individual testimony could be biased.
Now the word can become “flesh” and imagination “real”. Walking shadows in AI
I am focusing on the upsides here. But like any new technology, there are also serious downsides. The biggest of all is potentially the singularity.
Honestly, this was as timely and impactful as what put you on the map - Flatten The Curve. No one has put together the full story of the state of Gen AI, right at the moment it begins the steep R0.
Well done!
Thank you.
In the grand scheme of things, COVID was more urgent, but this is more important!
WRT cinema and dubbing, the missing piece is to AI-edit the voices so that they (a) sound like the original actors and (b) feel like they happened during the actual action, instead of read off from the script in front of a microphone in a random sound studio.
Agreed. But do you think that’s impossible? It sounds quite possible to me to get to the 80-90% level of fidelity, which sounds like enough to me, since today dubbing is just spoken over the original version, which is terrible.
I see AI as a useful tool for humans, but still relatively primitive. What Roman soldier would plow a field in his military regalia? He would set that aside, and put on his grubby field working dress. AI can't figure that out. A human will have to program that into AI. I've recently read two articles about self-driving taxis in San Francisco. Both authors reported that after getting most of the way to their destinations, the AI got confused over something, pulled over and ended the trip. At least the humans lived to report it!
I’m the one who directed the Romans to be in military regalia, to be fair. If I used the field working dress, they would have looked like normal farmers.
And you’re right on self-driving! It’s a much harder one than people thought, like brick-laying.
Other fields are much easier, and even driving will be at some point automated.
Funny. Because I'm behind on some Uncharted Territories and never delete them from my inbox, I just read this article the day before reading "The Most Important Time in History Is Now".
https://unchartedterritories.tomaspueyo.com/p/the-most-important-time-in-history-agi-asi
It's absolutely insane to see the progress in just 2 years and 2 months. Strap yourself in for what happens next!
Maybe I should use this for contrast!
One application I see with promise: I volunteer with the local fire district's Emerg. Med. Techs. (I have an Emerg. Med. Responder license, which makes me an assistant to EMTs.) We have training sessions where we show up and the patient exhibits certain symptoms. We keep gathering data to narrow down what may be the medical cause of the symptoms. The hospital is 25 miles away on a narrow road. Lives may be saved if we can diagnose what is happening quickly. I can envision a portable AI medical assistant that can observe what we're doing and make relevant suggestions. Also, some of our patients don't speak English or a language we know, so that's another obvious help.
That AI could also be on the phone with the patient while you arrive, so that by the time you're there you're 70% done.
I was 15 when I first saw color TV, I was 23 when I learned about satellite TV, I was 35 when I wrote my first email, and at 50, I bought my iPhone 4s. Now I'm 60, trying to fit into my seat belts.
The Internet changed the stock image industry, in which I was involved. AI will kill (transform) it.
Thanks for this excellent recap.
… and then there's the downsides. Besides the obvious: Who, for one, decides which images are or are not appropriate? OpenAI requires that we "Do not attempt to create, upload, or share images that are not G-rated or cause harm". This sentence happens to include half of modern art, as well as a heap of other topics that may or may not be related to arousing prurient interest. It's quite telling that not being even PG-rated appears to be more important than, say, not generating a glorifying image of a suicide.
This is possible if there are monopolies in the models. If there’s competition, the market will decide what’s acceptable.
Excellent article and I'm only half-way through! Will be utilizing a lot from this, so thank you. However, I still think we have hard work to do on AI alignment.
Indeed. Luckily none of these things will get us close to an AGI, but this moment is coming and we’re certainly not prepared.
Mind-blowing Tomas! I'd already seen some of this of course, particularly the image-related stuff. Short-term and rather mundanely the transcription software will help me a lot. Even as recently as a couple years ago I found human-captioning services worse than me just correcting AI-generated captions/transcriptions, and Adobe Premiere has improved even more recently, but still needs a lot of proofreading/correcting. I film/edit a lot of medical industry videos, so if recognising technical language is getting more accurate that will save a huge amount of time. If Whisper or Descript are better... off to experiment...
Before writing the article, I was scared of getting into this because it seemed daunting. But plenty of tools are user-ready, and many others are engineer-ready. As you say, time to explore!
Descript you totally can. I couldn’t find a way to try Whisper. Could you?
I haven't tried yet, I saved the Harry Ramsay link you included to go through with Whisper when I have more time - need to use Google Colab apparently and maybe it will finally make me dabble in coding... This article of yours itself I will have to go through a few more times! I opened several tabs with all the links and realised it was exponential in time required to read... :D
It took me months to gather the links and about 40h to write the 2 articles. I’m not surprised!
I may need to also re-read Asimov and Philip K. Dick...
Great framing around diffusion curves and where we are on the adoption S-curve. One practical angle I’d add is evaluation: teams need lightweight, task-specific evals (clarity, factuality, brand voice) instead of generic benchmarks. For visual work, I keep a small gallery of “golden outputs” and regenerate against them; Createimg.ai (https://createimg.ai) helps me test variations across models to see which stays closest to spec. Would love a follow-up on measuring drift as models update.
We made a market map that includes some companies beyond those in the Sequoia one. The gen AI shift could produce even more value than the shift to cloud. https://base10.vc/post/generative-ai-mission-critical/
I think you’d be interested in this market map of gen AI companies we produced. The gen AI shift could produce even more value than the shift to cloud. https://base10.vc/post/generative-ai-mission-critical/
I’m not sure what you mean