Pluralistic: "Humans in the loop" must detect the hardest-to-spot errors, at superhuman speed (23 Apr 2024)

doctorow · 23 April 2024 07:11

Originally published at: Pluralistic: “Humans in the loop” must detect the hardest-to-spot errors, at superhuman speed (23 Apr 2024) – Pluralistic: Daily links from Cory Doctorow

Today's links

"Humans in the loop" must detect the hardest-to-spot errors, at superhuman speed: The particular torments of reverse-centaurs are drastically under-theorized.
Hey look at this: Delights to delectate.
This day in history: 2009, 2014, 2019, 2023
Upcoming appearances: Where to find me.
Recent appearances: Where I've been.
Latest books: You keep readin' em, I'll keep writin' 'em.
Upcoming books: Like I said, I'll keep writin' 'em.
Colophon: All the rest.

"Humans in the loop" must detect the hardest-to-spot errors, at superhuman speed (permalink)

If AI has a future (a big if), it will have to be economically viable. An industry can't spend 1,700% more on Nvidia chips than it earns indefinitely – not even with Nvidia being a principal investor in its largest customers:

https://news.ycombinator.com/item?id=39883571

A company that pays $0.36-$1/query for electricity and (scarce, fresh) water can't indefinitely give those queries away by the millions to people who are expected to revise those queries dozens of times before eliciting the perfect botshit rendition of "instructions for removing a grilled cheese sandwich from a VCR in the style of the King James Bible":

https://www.semianalysis.com/p/the-inference-cost-of-search-disruption

Eventually, the industry will have to uncover some mix of applications that will cover its operating costs, if only to keep the lights on in the face of investor disillusionment (this isn't optional – investor disillusionment is an inevitable part of every bubble).

Now, there are lots of low-stakes applications for AI that can run just fine on the current AI technology, despite its many – and seemingly inescapable – errors ("hallucinations"). People who use AI to generate illustrations of their D&D characters engaged in epic adventures from their previous gaming session don't care about the odd extra finger. If the chatbot powering a tourist's automatic text-to-translation-to-speech phone tool gets a few words wrong, it's still much better than the alternative of speaking slowly and loudly in your own language while making emphatic hand-gestures.

There are lots of these applications, and many of the people who benefit from them would doubtless pay something for them. The problem – from an AI company's perspective – is that these aren't just low-stakes, they're also low-value. Their users would pay something for them, but not very much.

For AI to keep its servers on through the coming trough of disillusionment, it will have to locate high-value applications, too. Economically speaking, the function of low-value applications is to soak up excess capacity and produce value at the margins after the high-value applications pay the bills. Low-value applications are a side-dish, like the coach seats on an airplane whose total operating expenses are paid by the business class passengers up front. Without the principal income from high-value applications, the servers shut down, and the low-value applications disappear:

https://locusmag.com/2023/12/commentary-cory-doctorow-what-kind-of-bubble-is-ai/

Now, there are lots of high-value applications the AI industry has identified for its products. Broadly speaking, these high-value applications share the same problem: they are all high-stakes, which means they are very sensitive to errors. Mistakes made by apps that produce code, drive cars, or identify cancerous masses on chest X-rays are extremely consequential.

Some businesses may be insensitive to those consequences. Air Canada replaced its human customer service staff with chatbots that just lied to passengers, stealing hundreds of dollars from them in the process. But the process for getting your money back after you are defrauded by Air Canada's chatbot is so onerous that only one passenger has bothered to go through it, spending ten weeks exhausting all of Air Canada's internal review mechanisms before fighting his case for weeks more at the regulator:

https://bc.ctvnews.ca/air-canada-s-chatbot-gave-a-b-c-man-the-wrong-information-now-the-airline-has-to-pay-for-the-mistake-1.6769454

There's never just one ant. If this guy was defrauded by an AC chatbot, so were hundreds or thousands of other fliers. Air Canada doesn't have to pay them back. Air Canada is tacitly asserting that, as the country's flagship carrier and near-monopolist, it is too big to fail and too big to jail, which means it's too big to care.

Air Canada shows that for some business customers, AI doesn't need to be able to do a worker's job in order to be a smart purchase: a chatbot can replace a worker, fail to do that worker's job, and still save the company money on balance.

I can't predict whether the world's sociopathic monopolists are numerous and powerful enough to keep the lights on for AI companies through leases for automation systems that let them commit consequence-free fraud by replacing workers with chatbots that serve as moral crumple-zones for furious customers:

https://www.sciencedirect.com/science/article/abs/pii/S0747563219304029

But even stipulating that this is sufficient, it's intrinsically unstable. Anything that can't go on forever eventually stops, and the mass replacement of humans with high-speed fraud software seems likely to stoke the already blazing furnace of modern antitrust:

https://www.eff.org/de/deeplinks/2021/08/party-its-1979-og-antitrust-back-baby

Of course, the AI companies have their own answer to this conundrum. A high-stakes/high-value customer can still fire workers and replace them with AI – they just need to hire fewer, cheaper workers to supervise the AI and monitor it for "hallucinations." This is called the "human in the loop" solution.

The human in the loop story has some glaring holes. From a worker's perspective, serving as the human in the loop in a scheme that cuts wage bills through AI is a nightmare – the worst possible kind of automation.

Let's pause for a little detour through automation theory here. Automation can augment a worker. We can call this a "centaur" – the worker offloads a repetitive task, or one that requires a high degree of vigilance, or (worst of all) both. They're a human head on a robot body (hence "centaur"). Think of the sensor/vision system in your car that beeps if you activate your turn-signal while a car is in your blind spot. You're in charge, but you're getting a second opinion from the robot.

Likewise, consider an AI tool that double-checks a radiologist's diagnosis of your chest X-ray and suggests a second look when its assessment doesn't match the radiologist's. Again, the human is in charge, but the robot is serving as a backstop and helpmeet, using its inexhaustible robotic vigilance to augment human skill.

That's centaurs. They're the good automation. Then there's the bad automation: the reverse-centaur, when the human is used to augment the robot.

Amazon warehouse pickers stand in one place while robotic shelving units trundle up to them at speed; then, the haptic bracelets shackled around their wrists buzz at them, directing them to pick up specific items and move them to a basket, while a third automation system penalizes them for taking toilet breaks or even just walking around and shaking out their limbs to avoid a repetitive strain injury. This is a robotic head using a human body – and destroying it in the process.

An AI-assisted radiologist processes fewer chest X-rays every day, costing their employer more, on top of the cost of the AI. That's not what AI companies are selling. They're offering hospitals the power to create reverse centaurs: radiologist-assisted AIs. That's what "human in the loop" means.

This is a problem for workers, but it's also a problem for their bosses (assuming those bosses actually care about correcting AI hallucinations, rather than providing a figleaf that lets them commit fraud or kill people and shift the blame to an unpunishable AI).

Humans are good at a lot of things, but they're not good at eternal, perfect vigilance. Writing code is hard, but performing code-review (where you check someone else's code for errors) is much harder – and it gets even harder if the code you're reviewing is usually fine, because this requires that you maintain your vigilance for something that only occurs at rare and unpredictable intervals:

https://twitter.com/qntm/status/1773779967521780169

But for a coding shop to make the cost of an AI pencil out, the human in the loop needs to be able to process a lot of AI-generated code. Replacing a human with an AI doesn't produce any savings if you need to hire two more humans to take turns doing close reads of the AI's code.
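The break-even logic here can be sketched as simple arithmetic. This is a minimal illustration, not a real cost model: the function name `net_savings`, the salary figure, and the AI license fee are all hypothetical numbers chosen only to show how the sign of the result flips when reviewers can't keep up.

```python
def net_savings(workers_replaced, worker_cost, reviewers_hired, reviewer_cost, ai_cost):
    """Yearly savings from swapping workers for an AI plus human reviewers.

    All inputs are hypothetical; a negative result means the swap loses money.
    """
    before = workers_replaced * worker_cost
    after = reviewers_hired * reviewer_cost + ai_cost
    return before - after


# Hypothetical figures, chosen only to illustrate the argument.
SALARY = 100_000      # assumed yearly cost of one programmer or reviewer
AI_LICENSE = 20_000   # assumed yearly cost of the AI tool

# One reviewer keeps up with the output that used to take five programmers.
optimistic = net_savings(5, SALARY, 1, SALARY, AI_LICENSE)

# Two reviewers are needed to check the output that replaced one programmer.
pessimistic = net_savings(1, SALARY, 2, SALARY, AI_LICENSE)

print(optimistic)   # 380000
print(pessimistic)  # -120000
```

The savings only materialize in the first case, which is exactly the case that demands the most sustained vigilance from the lone reviewer.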

This is the fatal flaw in robo-taxi schemes. The "human in the loop" who is supposed to keep the murderbot from smashing into other cars, steering into oncoming traffic, or running down pedestrians isn't a driver, they're a driving instructor. This is a much harder job than being a driver, even when the student driver you're monitoring is a human, making human mistakes at human speed. It's even harder when the student driver is a robot, making errors at computer speed:

https://pluralistic.net/2024/04/01/human-in-the-loop/#monkey-in-the-middle

This is why the doomed robo-taxi company Cruise had to deploy 1.5 skilled, high-paid human monitors to oversee each of its murderbots, while traditional taxis operate at a fraction of the cost with a single, precaratized, low-paid human driver:

https://pluralistic.net/2024/01/11/robots-stole-my-jerb/#computer-says-no

The vigilance problem is pretty fatal for the human-in-the-loop gambit, but there's another problem that is, if anything, even more fatal: the kinds of errors that AIs make.

Foundationally, AI is applied statistics. An AI company trains its AI by feeding it a lot of data about the real world. The program processes this data, looking for statistical correlations in that data, and makes a model of the world based on those correlations. A chatbot is a next-word-guessing program, and an AI "art" generator is a next-pixel-guessing program. They're drawing on billions of documents to find the most statistically likely way of finishing a sentence or a line of pixels in a bitmap:

https://dl.acm.org/doi/10.1145/3442188.3445922
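To make "next-word-guessing" concrete, here is a toy sketch that counts which word follows which in a tiny made-up corpus and then guesses the most frequent follower. It is an assumption-laden simplification: real chatbots use neural networks trained on vastly more data, and the corpus and the function names `train` and `guess_next` are invented for illustration.

```python
from collections import Counter, defaultdict


def train(text):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def guess_next(follows, word):
    """Return the statistically most likely next word, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical miniature "training data".
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train(corpus)

print(guess_next(model, "the"))  # cat
print(guess_next(model, "sat"))  # on
```

Note that the guesser answers "cat" after "the" whether or not a cat is actually involved: it reports what is most common in its data, not what is true.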

This means that AI doesn't just make errors – it makes subtle errors, the kinds of errors that are the hardest for a human in the loop to spot, because they are the most statistically probable ways of being wrong. Sure, we notice the gross errors in AI output, like confidently claiming that a living human is dead:

https://www.tomsguide.com/opinion/according-to-chatgpt-im-dead

But the most common errors that AIs make are the ones we don't notice, because they're perfectly camouflaged as the truth. Think of the recurring AI programming error that inserts a call to a nonexistent library called "huggingface-cli," which is what the library would be called if developers reliably followed naming conventions. But due to a human inconsistency, the real library has a slightly different name. The fact that AIs repeatedly inserted references to the nonexistent library opened up a vulnerability – a security researcher created an (inert) malicious library with that name and tricked numerous companies into compiling it into their code because their human reviewers missed the chatbot's (statistically indistinguishable from the truth) lie:

https://www.theregister.com/2024/03/28/ai_bots_hallucinate_software_packages/
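One way to avoid relying on a reviewer's eyes alone for this class of error is a mechanical check of suggested dependencies against a list the team has already vetted. The sketch below is only an illustration of that idea: the function `unvetted_packages`, the vetted list, and the suggested requirements are all hypothetical, and it says nothing about whether a name exists on any package registry.

```python
def unvetted_packages(requirements, vetted):
    """Return requirement names that are not on the team's vetted list.

    Names are compared case-insensitively, treating '-', '_' and '.' alike.
    """
    def normalize(name):
        return name.strip().lower().replace("_", "-").replace(".", "-")

    allowed = {normalize(name) for name in vetted}
    return [req for req in requirements if normalize(req) not in allowed]


# Hypothetical vetted list and hypothetical AI-suggested requirements.
VETTED = {"requests", "numpy", "huggingface_hub"}
suggested = ["requests", "huggingface-cli", "NumPy"]

print(unvetted_packages(suggested, VETTED))  # ['huggingface-cli']
```

A check like this catches a plausible-looking name precisely because it does not care how plausible the name looks, only whether someone has approved it.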

For a driving instructor or a code reviewer overseeing a human subject, the majority of errors are comparatively easy to spot, because they're the kinds of errors that lead to inconsistent library naming – places where a human behaved erratically or irregularly. But when reality is irregular or erratic, the AI will make errors by presuming that things are statistically normal.

These are the hardest kinds of errors to spot. They couldn't be harder for a human to detect if they were specifically designed to go undetected. The human in the loop isn't just being asked to spot mistakes – they're being actively deceived. The AI isn't merely wrong, it's constructing a subtle "what's wrong with this picture"-style puzzle. Not just one such puzzle, either: millions of them, at speed, which must be solved by the human in the loop, who must remain perfectly vigilant for things that are, by definition, almost totally unnoticeable.

This is a special new torment for reverse centaurs – and a significant problem for AI companies hoping to accumulate and keep enough high-value, high-stakes customers on their books to weather the coming trough of disillusionment.

This is pretty grim, but it gets grimmer. AI companies have argued that they have a third line of business, a way to make money for their customers beyond automation's gifts to their payrolls: they claim that they can perform difficult scientific tasks at superhuman speed, producing billion-dollar insights (new materials, new drugs, new proteins) at unimaginable speed.

However, these claims – credulously amplified by the non-technical press – keep on shattering when they are tested by experts who understand the esoteric domains in which AI is said to have an unbeatable advantage. For example, Google claimed that its DeepMind AI had discovered "millions of new materials," "equivalent to nearly 800 years’ worth of knowledge," constituting "an order-of-magnitude expansion in stable materials known to humanity":

https://deepmind.google/discover/blog/millions-of-new-materials-discovered-with-deep-learning/

It was a hoax. When independent material scientists reviewed representative samples of these "new materials," they concluded that "no new materials have been discovered" and that not one of these materials was "credible, useful and novel":

https://www.404media.co/google-says-it-discovered-millions-of-new-materials-with-ai-human-researchers/

As Brian Merchant writes, AI claims are eerily similar to "smoke and mirrors" – the dazzling reality-distortion field thrown up by 17th-century magic lantern technology, which millions of people ascribed wild capabilities to, thanks to the outlandish claims of the technology's promoters:

https://www.bloodinthemachine.com/p/ai-really-is-smoke-and-mirrors

The fact that we have a four-hundred-year-old name for this phenomenon, and yet are still falling prey to it, is frankly a little depressing. And, unlucky for us, it turns out that AI therapybots can't help us with this – rather, they're apt to literally convince us to kill ourselves:

https://www.vice.com/en/article/pkadgm/man-dies-by-suicide-after-talking-with-ai-chatbot-widow-says

Hey look at this (permalink)

Tokyo flood tunnels https://www.reddit.com/r/Damnthatsinteresting/comments/1ca336g/tokyo_flood_tunnels/
Ghost is federating over ActivityPub https://activitypub.ghost.org (h/t Waxy)
The Coddling of the American Parent https://www.thedailybeast.com/the-coddling-of-the-american-parent

A Wayback Machine banner.

This day in history (permalink)

#15yrsago EU Parliament passes copyright term extension, rejects proposal to give the additional funds to artists https://www.openrightsgroup.org/blog/parliament-buckles-copyright-extension-goes-through-to-council-of-ministers/

#10yrsago How science fiction influences thinking about the future https://www.smithsonianmag.com/arts-culture/how-americas-leading-science-fiction-authors-are-shaping-your-future-180951169/?no-ist

#10yrsago Obama official responsible for copyright chapters of TPP & ACTA gets a job at MPAA; his replacement is another copyright lobbyist https://www.vox.com/2014/4/22/5636466/hollywood-just-hired-another-white-house-trade-official

#10yrsago Having leisure time is now a marker for poverty, not riches https://www.economist.com/finance-and-economics/2014/04/22/nice-work-if-you-can-get-out

#10yrsago Eternal vigilance app for social networks: treating privacy vulnerabilities like other security risks https://freedom-to-tinker.com/2014/04/21/eternal-vigilance-is-a-solvable-technology-problem-a-proposal-for-streamlined-privacy-alerts/

#10yrsago How the Russian surveillance state works https://web.archive.org/web/20140206154124/http://www.worldpolicy.org/journal/fall2013/Russia-surveillance

#5yrsago Political candidate’s kids use his election flyers to fool his laptop’s facial recognition lock https://twitter.com/mattcarthy/status/1120641557886058496

#5yrsago Fool me twice: New York State commutes Charter’s death sentence after Charter promises to stop breaking its promises https://arstechnica.com/tech-policy/2019/04/charter-avoids-getting-kicked-out-of-new-york-agrees-to-new-merger-conditions/

#5yrsago Greta Thunberg attributes her ability to focus on climate change to her Asperger’s https://www.youtube.com/watch?v=hKMX8WRw3fc

#5yrsago A Sanders candidacy would make 2020 a referendum on the future, not a referendum on Trump https://www.theguardian.com/us-news/2019/apr/22/bernie-sanders-democrats-trump-2020

#5yrsago EU to create 350m person biometric database for borders, migration and law enforcement https://www.zdnet.com/article/eu-votes-to-create-gigantic-biometrics-database/

#1yrsago A Collective Bargain https://pluralistic.net/2023/04/23/a-collective-bargain/

Upcoming appearances (permalink)

A photo of me onstage, giving a speech, holding a mic.

The Bezzle at Book Passage Corte Madera (Marin County), April 27
https://www.bookpassage.com/event/cory-doctorow-bezzle-martin-hench-novel-corte-madera-store
Canadian Centre for Policy Alternatives (Winnipeg), May 2
https://www.eventbrite.ca/e/cory-doctorow-tickets-798820071337
Wordfest (Calgary), May 3
https://wordfest.com/2024/event/wordfest-presents-cory-doctorow-2/
Massy Arts (Vancouver), May 4
https://www.eventbrite.ca/e/solo-reading-cory-doctorow-the-bezzle-tickets-876989167207
Tartu Prima Vista Literary Festival, May 5-11
https://tartu2024.ee/en/kirjandusfestival/
Tim O’Reilly and Cory Doctorow on “Enshittification” and the Future of AI, May 14
https://www.oreilly.com/live-events/tim-oreilly-and-cory-doctorow-on-enshittification-and-the-future-of-ai/0642572001651/
"Finding the Money" screening (LA), May 15
https://www.laemmle.com/film/finding-money?date=2024-05-15
Media Ecology Association keynote (Amherst, NY), Jun 6-9
https://media-ecology.org/convention
American Association of Law Libraries keynote (Chicago), Jul 21
https://www.aallnet.org/conference/agenda/keynote-speaker/

A screenshot of me at my desk, doing a livecast.

Recent appearances (permalink)

Come sfuggire al “Merdocene” e costruire un Internet migliore (Torino Biennale Tecnologia)
https://www.youtube.com/watch?v=Z5NC2EZCYBg
Show Me The Money Club (The Rideshare Guy)
https://www.youtube.com/watch?v=ZCETi3XqSds
NUOVO BARETTO UTOPIA
https://videoteca.kenobit.it/w/azRmQBCenVwjSRz9WCp8JS

A grid of my books with Will Stahle covers.

Latest books (permalink)

The Bezzle: a sequel to "Red Team Blues," about prison-tech and other grifts, Tor Books (US), Head of Zeus (UK), February 2024 (the-bezzle.org). Signed, personalized copies at Dark Delicacies (https://www.darkdel.com/store/p3062/Available_Feb_20th%3A_The_Bezzle_HB.html#/).
"The Lost Cause:" a solarpunk novel of hope in the climate emergency, Tor Books (US), Head of Zeus (UK), November 2023 (http://lost-cause.org). Signed, personalized copies at Dark Delicacies (https://www.darkdel.com/store/p3007/Pre-Order_Signed_Copies%3A_The_Lost_Cause_HB.html#/)
"The Internet Con": A nonfiction book about interoperability and Big Tech (Verso) September 2023 (http://seizethemeansofcomputation.org). Signed copies at Book Soup (https://www.booksoup.com/book/9781804291245).
"Red Team Blues": "A grabby, compulsive thriller that will leave you knowing more about how the world works than you did before." Tor Books http://redteamblues.com. Signed copies at Dark Delicacies (US): and Forbidden Planet (UK): https://forbiddenplanet.com/385004-red-team-blues-signed-edition-hardcover/.
"Chokepoint Capitalism: How to Beat Big Tech, Tame Big Content, and Get Artists Paid, with Rebecca Giblin", on how to unrig the markets for creative labor, Beacon Press/Scribe 2022 https://chokepointcapitalism.com
"Attack Surface": The third Little Brother novel, a standalone technothriller for adults. The Washington Post called it "a political cyberthriller, vigorous, bold and savvy about the limits of revolution and resistance." Order signed, personalized copies from Dark Delicacies https://www.darkdel.com/store/p1840/Available_Now%3A_Attack_Surface.html
"How to Destroy Surveillance Capitalism": an anti-monopoly pamphlet analyzing the true harms of surveillance capitalism and proposing a solution. https://onezero.medium.com/how-to-destroy-surveillance-capitalism-8135e6744d59?sk=f6cd10e54e20a07d4c6d0f3ac011af6b (signed copies: https://www.darkdel.com/store/p2024/Available_Now%3A__How_to_Destroy_Surveillance_Capitalism.html)
"Little Brother/Homeland": A reissue omnibus edition with a new introduction by Edward Snowden: https://us.macmillan.com/books/9781250774583; personalized/signed copies here: https://www.darkdel.com/store/p1750/July%3A__Little_Brother_%26_Homeland.html
"Poesy the Monster Slayer" a picture book about monsters, bedtime, gender, and kicking ass. Order here: https://us.macmillan.com/books/9781626723627. Get a personalized, signed copy here: https://www.darkdel.com/store/p2682/Corey_Doctorow%3A_Poesy_the_Monster_Slayer_HB.html#/.

A cardboard book box with the Macmillan logo.

Upcoming books (permalink)

Picks and Shovels: a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books, February 2025
Unauthorized Bread: a graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2025

Colophon (permalink)

Today's top sources:

Currently writing:

A Little Brother short story about DIY insulin PLANNING
Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS JAN 2025
Vigilant, Little Brother short story about remote invigilation. FORTHCOMING ON TOR.COM
Spill, a Little Brother short story about pipeline protests. FORTHCOMING ON TOR.COM

Latest podcast: Capitalists Hate Capitalism https://craphound.com/news/2024/04/14/capitalists-hate-capitalism/

This work – excluding any serialized fiction – is licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net.

https://creativecommons.org/licenses/by/4.0/

Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution.

How to get Pluralistic:

Blog (no ads, tracking, or data-collection):

Pluralistic.net

Newsletter (no ads, tracking, or data-collection):

https://pluralistic.net/plura-list

Mastodon (no ads, tracking, or data-collection):

https://mamot.fr/@pluralistic

Medium (no ads, paywalled):

https://doctorow.medium.com/

Twitter (mass-scale, unrestricted, third-party surveillance and advertising):

https://twitter.com/doctorow

Tumblr (mass-scale, unrestricted, third-party surveillance and advertising):

https://mostlysignssomeportents.tumblr.com/tagged/pluralistic

"When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla

Keerthi · 23 April 2024 10:36

Thank you for your thoughtful article! Always enjoyed reading your articles. I agree with most of the points here. Yet what could be the status of the AI nonprofit research labs (who are less incentivized towards commercializing) after the AI bubble bursts?

As you have said, this whole generative AI hype, trying to automate creativity and rolling out immature technologies that still need to be in labs, seems disingenuous. I liked reading the articles by authors like you, who take a neutral stand, ignoring both the AI hype utopia and AI doomerism. But it is still unclear to me which problems are worth trying to solve using AI in the future, the ones that society really needs solved. It would be really helpful if you wrote an article providing a bigger picture of what AI could really be used for in society. I would like to read such thoughts from authors who are unbiased towards this tech. This could give young readers who are interested in this field an idea of how to align their efforts accordingly.

For instance, I think a visual explainer that describes the surroundings through audio for blind people, using AI to automate scientific workflows, fixing some flaws in peer review, etc., might be worthwhile research-based applications to pursue regardless of the bubble.

And I might be wrong, but current top AI researchers seem to be trying to anthropomorphize AI. I couldn’t understand why. Is this some sort of existential crisis, do they think that we could understand our “the why behind existence and intelligence” if we create a similar one to us? Or is this some effort for an intellectual accomplishment.

I always thought that technology in any form is created to solve a problem or aid people in the society. But it feels like, current AI and generative AI tech is getting developed without any concrete aim and thrown at every problem there is.

Karellen · 23 April 2024 11:17

That’s not quite making sense there at the end.

At first, I thought it was saying it was about shifting blame to an unpunishable AI. I had the idea that the reverse centaur actually works as a target for shifting the blame from the unpunishable AI, but still away from the boss who put the AI in place. i.e. “Hey, we put a low-paid human in the loop to catch the mistakes, and they’re the one who failed. We’ve fired them as punishment to appease disgruntled customers/regulators, and will hire someone else to take the blame and get fired next time.”

But then I realised maybe that was what you meant instead?

waterfoodearthcosmos · 4 May 2024 19:18

Did not see that story in the link below the paragraph. Also a search only shows one Reddit post of that. Most search results have instructions for removing a peanut butter sandwich from a VCR in the style of the King James Bible.

Where did you get that from?

system · 8 May 2024 07:12

This topic was automatically closed after 15 days. New replies are no longer allowed.