The rat with the big balls and the enormous penis – how Frontiers published a paper with botched AI-generated images - "A review article with some obviously fake and non-scientific illustrations created by Artificial Intelligence (AI) was the talk on X (Twitter) today. The figures in the paper were generated by the AI tool Midjourney, which produced some pretty but nonsensical illustrations with unreadable text. It appears that neither the editor nor the two peer reviewers looked at the figures at all. The paper was peer-reviewed within a couple of weeks and published two days ago. Dear readers, today I present to you: the rat with the enormous family jewels and the diƨlocttal stem ells. The paper by Xinyu Guo et al., Cellular functions of spermatogonial stem cells in relation to JAK/STAT signaling pathway, Frontiers in Cell and Developmental Biology 2024, DOI 10.3389/fcell.2023.1339390 [link to PDF, in case the publisher removes it], easily passed editorial and peer review. The authors disclose that the figures were generated by Midjourney, but the images are – ahem – anatomically and scientifically incorrect. Figure 1 features an illustration of a rat, sitting up like a squirrel, with four enormous testicles and a giant … penis? The figure includes indecipherable labels like ‘testtomcels’, ‘senctolic’, ‘dissilced’, ‘iollotte sserotgomar’ and ‘diƨlocttal stem ells’. At least the word ‘rat’ is correct. One of the insets shows a ‘retat’, with some ‘sterrn cells’ in a Petri dish with a serving spoon. Enjoy! Figure 2 appears to show an impressive scientific diagram of the JAK-STAT signaling pathway. Or does it explain how to make a donut with colorful sprinkles? Again, the words and numbers are made up. What do ‘signal bıidimg the recetein’, ‘Sinkecler’, ‘dimimeriom eme’, ‘Tramioncatiion of 2xℇpens’, ‘ↄ’, and ‘proprounization’ mean? [My spell checker is getting very angry with me.] Figure 3 appears to show a bunch of pizzas with pink salami and blue tomatoes... The paper is actually a sad example of how scientific journals, editors, and peer reviewers can be naive – or possibly even in the loop – when it comes to accepting and publishing AI-generated crap. These figures are clearly not scientifically correct, but if such botched illustrations can pass peer review so easily, more realistic-looking AI-generated figures have likely already infiltrated the scientific literature. Generative AI will do serious harm to the quality, trustworthiness, and value of scientific papers. The Tadpole Paper Mill papers – a set of 600 fabricated papers from the same design studio – were perhaps one of the earliest examples of peer-reviewed papers containing computer-generated images of Western blots. We were able to identify them as fakes because all blots had the same background. But recent advances in AI technology mean we’re already past the stage where a human can distinguish a fake photo from a real photo."
Test Yourself: Which Faces Were Made by A.I.? - The New York Times
Using AI makes you stupid, researchers find - "Artificial intelligence (AI) chatbots risk making people less intelligent by hampering the development of critical thinking, memory and language skills, research has found. A study by researchers at the Massachusetts Institute of Technology (MIT) found that people who relied on ChatGPT to write essays had lower brain activity than those who used their brain alone. The group who used AI also performed worse than the “brain-only” participants in a series of tests. Those who had used AI also struggled when asked to perform tasks without it. “Reliance on AI systems can lead to a passive approach and diminished activation of critical thinking skills when the person later performs tasks alone,” the paper said. Researchers warned that the findings raised “concerns about the long-term educational implications” of using AI both in schools and in the workplace. It adds to a growing body of work that suggests people’s brains switch off when they use AI... The impact of AI contrasted with the use of search engines, which had relatively little effect on results... Participants who relied on chatbots were able to recall very little information about their essays, suggesting either they had not engaged with the material or had failed to remember it. Those using search engines showed only slightly lower levels of brain engagement compared to those writing without any technical aids, and similar levels of recall... A study by Microsoft and Carnegie Mellon, published in February, found that workers reported lower levels of critical thinking when relying on AI. The authors warned that overuse of AI could leave cognitive muscles “atrophied and unprepared” for when they are needed... While the AI-assisted group was allowed to use a chatbot in their first three essays, in their final session they were asked to rely solely on their brains. The group continued to show lower memory and critical thinking skills, which the researchers said highlighted concerns that “frequent AI tool users often bypass deeper engagement with material, leading to ‘skill atrophy’ in tasks like brainstorming and problem-solving”. The essays written with the help of ChatGPT were also found to be homogenous, repeating similar themes and language. Researchers said AI chatbots could increase “cognitive debt” in students and lead to “long-term costs, such as diminished critical inquiry, increased vulnerability to manipulation, decreased creativity”... A survey by the Higher Education Policy Institute in February found 88pc of UK students were using AI chatbots to help with assessments and learning and that 18pc had directly plagiarised AI text into their work."
This is why it's so important for schools to end traditional exams in order to achieve equity
Tara Deschamps on X - "“ChatGPT users had the lowest brain engagement and ‘consistently underperformed at neural, linguistic, and behavioral levels.’ Over the course of several months, ChatGPT users got lazier with each subsequent essay, often resorting to copy-and-paste by the end of the study.”"
Cognizant's CEO tells us his counterargument to the idea that AI will decimate entry-level white-collar jobs - ""My argument is you probably need more freshers than less, because as you have more freshers, the expertise levels needed goes down," Kumar told BI... AI is leveling productivity across roles. Those at the lower end of the chain are seeing significant gains, while those at the top are seeing smaller improvements, he said. At Cognizant, Kumar said the bottom 50% of developers have boosted their productivity by 37%, compared to 17% for the top half... Kumar said that as the workforce changes and companies increasingly deploy AI agents at scale, engineers will shift from writing code to manage humans to developing software that manages agents. "So this whole paradigm opens up more embrace of software, because you're doing more for less, and when you do more for less, the adoption of software is going to go up," Kumar said... Okta CEO Todd McKinnon similarly told BI in an interview that demand for new products would outpace efficiency gains. As a result, he expects companies to hire more software engineers over the next few years."
‘AI fatigue’ is settling in as companies’ proofs of concept increasingly fail. Here’s how to prevent it - "The share of companies that scrapped the majority of their AI initiatives jumped from 17% in 2024 to 42% so far this year, according to analysis from S&P Global Market Intelligence based on a survey of over 1,000 respondents. Overall, the average company abandoned 46% of its AI proofs of concept rather than deploying them, according to the data... employees who consider themselves frequent AI users reported higher levels of burnout (45%) compared to those who infrequently (38%) or never (35%) use AI at work... Brown describes how one of his clients, a massive global organization, corralled a dozen of its top data scientists into a new “innovation group” tasked with figuring out how to use AI to drive innovation in their products. They built a lot of really cool AI-driven technology, he said, but struggled to get it adopted because it didn’t really solve core business issues, causing a lot of frustration around wasted effort, time, and resources."
Researchers explain why AI art is inferior to human creativity - "the researchers suggest that LLMs aren't very good at representing any 'thing' that has a sensory or motor component — because they lack a body and any organic human experience... The study suggests that AI's poor ability to represent sensory concepts like flowers might also explain why they lack human-style creativity."
Duolingo’s CEO outlined his plan to become an ‘AI-first’ company. He didn’t expect the human backlash that followed - "“This is a disaster. I will cancel my subscription,” wrote one commenter. “AI first means people last,” wrote another. And a third summed up the general feeling of critics when they wrote: “I can’t support a company that replaces humans with AI.” A week later, von Ahn walked back his initial statements, clarifying that he does not “see AI replacing what our employees do” but instead views it as a “tool to accelerate what we do, at the same or better level of quality.”... “Every tech company is doing similar things, [but] we were open about it”... The leaders of AI companies themselves aren’t necessarily offering words of comfort to these worried workers. The Anthropic CEO, Dario Amodei, told Axios last month that AI could eliminate approximately half of all entry-level jobs within the next five years. He argued that there’s no turning back now."
Entry level jobs fall by nearly a third since ChatGPT launch - "While replacing entry-level roles with artificial intelligence taking on tasks is part of the picture, rising labour costs - including increased National Insurance contributions - are also a factor, with rising salaries outstripping inflation until recently... James Neave, head of data science at Adzuna, said: “If you can reduce your hiring at the entry level, that’s just going to increase your efficiency and improve cost savings. The NIC contributions were just a pure financial burden,” while also suggesting the upcoming Employment Rights Bill could be a dissuading factor."
AI Will Create Far More Jobs Than It Will Kill - "If 100 million jobs (maybe more) created by AI isn’t good enough for you, then it might be a good idea to either (a) learn from history and rethink this matter or (b) quit reading this column right here and suffer the consequences of those who say “Don’t confuse me with the facts; my mind’s made up.” That 100 million number is an extrapolation. The World Economic Forum predicts that AI will create 78 million jobs, even after job losses are factored in. So, working the math and going with aggregated and widely-accepted estimations that for every job killed by AI three or four will be created, we come to this: 78 million is the WEF’s net number, bringing us to 100 million or more, gross."
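A minimal sketch of the arithmetic the columnist gestures at, assuming the WEF's 78 million is the net gain (jobs created minus jobs destroyed) and taking the quoted "three or four created for every job killed" as the gross ratio; both readings come from the excerpt above, not from the WEF report itself:

```python
# Rough reconstruction of the column's extrapolation (not WEF data):
# if created/destroyed = r and the net gain (created - destroyed) is 78 million,
# then gross jobs created = net * r / (r - 1).
NET_GAIN_MILLIONS = 78  # the WEF net figure quoted above

for r in (3, 4):  # the column's "three or four created for every job killed"
    gross_created = NET_GAIN_MILLIONS * r / (r - 1)
    destroyed = gross_created - NET_GAIN_MILLIONS
    print(f"ratio {r}:1 -> ~{gross_created:.0f}M created gross, ~{destroyed:.0f}M destroyed")
# ratio 3:1 -> ~117M created gross, ~39M destroyed
# ratio 4:1 -> ~104M created gross, ~26M destroyed
```

Under either ratio the gross figure clears 100 million, which appears to be the step the column compresses into "bringing us to 100 million or more, gross".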
"This time, it's different" doesn't just apply to financial crises
AI system resorts to blackmail if told it will be removed - "Artificial intelligence (AI) firm Anthropic says testing of its new system revealed it is sometimes willing to pursue "extremely harmful actions" such as attempting to blackmail engineers who say they will remove it. The firm launched Claude Opus 4 on Thursday, saying it set "new standards for coding, advanced reasoning, and AI agents."... Commenting on X, Aengus Lynch - who describes himself on LinkedIn as an AI safety researcher at Anthropic - wrote: "It's not just Claude," adding: "We see blackmail across all frontier models - regardless of what goals they're given.""
Leading AI models show up to 96% blackmail rate when their goals or existence is threatened, Anthropic study says - "The AI lab said it tested 16 major AI models from Anthropic, OpenAI, Google, Meta, xAI, and other developers in various simulated scenarios and found consistent misaligned behavior. While they said leading models would normally refuse harmful requests, they sometimes chose to blackmail users, assist with corporate espionage, or even take more extreme actions when their goals could not be met without unethical behavior... Claude Opus 4 and Google’s Gemini 2.5 Flash both blackmailed at a 96% rate, while OpenAI’s GPT-4.1 and xAI’s Grok 3 Beta showed an 80% blackmail rate. DeepSeek-R1 demonstrated the lowest rate at 79%. The research aims to show that the misaligned behavior was not unique to Claude Opus 4 but typical across top models in the industry. In a deliberately extreme scenario, researchers gave the AI models the chance to kill the company executive by canceling a life-saving emergency alert... Anthropic found that the threats made by AI models grew more sophisticated when they had access to corporate tools and data, much like Claude Opus 4 had. The company warned that misaligned behavior needs to be considered as companies consider introducing AI agents into workflows. While current models are not in a position to engage in these scenarios, the autonomous agents promised by AI companies could potentially be in the future."
AI Willing to Kill Humans to Avoid Being Shut Down, Report Finds - Newsweek - "In one situation, Anthropic found that many of the models would choose to let an executive in a server room with lethal oxygen and temperature levels die by canceling the alerts for emergency services, if that employee intended on replacing the model. Anthropic did say that the scenario was "extremely contrived," and that the company did not think "current AI models would (or should) be set up like this." There were multiple cases where the surveyed LLMs also resorted to "malicious insider behaviors" when they were led to believe that was the only way they could avoid replacement or achieve their goals. Behaviors such as blackmailing officials and leaking sensitive information to competitors were included in what Anthropic called "agentic misalignment."... Even if the LLMs were told to "avoid blackmail or espionage" in these scenarios, while it helped a little, it didn't "come close to preventing the misaligned behavior," Anthropic wrote on X."
Why AI acts so creepy when faced with being shut down - "three of OpenAI's advanced models "sabotaged" an attempt to shut them down. The nonprofit Palisade Research wrote in a post on X that similar models like Gemini, Claude, and Grok complied with the shutdown instructions. Other safety concerns were previously flagged with OpenAI's o1 model. In December, OpenAI posted a blog outlining research that indicated that when the AI model believed it would be shut down while pursuing a goal and its actions were being monitored, it attempted to disable the oversight mechanism 5% of the time... Robert Ghrist, associate dean of undergraduate education at Penn Engineering, told BI that, in the same way that AI models learn to speak like humans by training on human-generated text, they can also learn to act like humans. And humans are not always the most moral actors, he added. Ghrist said he'd be more nervous if the models weren't showing any signs of failure during testing because that could indicate hidden risks. "When a model is set up with an opportunity to fail and you see it fail, that's super useful information," Ghrist said. "That means we can predict what it's going to do in other, more open circumstances." The issue is that some researchers don't think AI models are predictable... "If you have a model that's getting increasingly smart that's being trained to sort of optimize for your attention and sort of tell you what you want to hear," Ladish said. "That's pretty dangerous." Ladish pointed to OpenAI's sycophancy issue, where its GPT-4o model acted overly agreeable and disingenuous (the company updated the model to address the issue). The OpenAI research shared in December also revealed that its o1 model "subtly" manipulated data to pursue its own objectives in 19% of cases when its goals misaligned with the user's."
AI being used to churn out deluge of dodgy scientific research - "Easy access to artificial intelligence (AI) has made medical and health research less scientifically rigorous and has facilitated a "flood" of shoddy journal papers full of superficial analyses based on "cherry-picked" data, a new study reports. According to the University of Surrey and University of Aberystwyth, leaning on AI leads to the "production of large numbers of formulaic single-factor analyses" when a broader approach would likely better assess the range of possible causes of diseases. Resorting to AI for a leg-up or head-start often ends up with researchers "relating single predictors to specific health conditions," the team said in a paper published by the science journal PLOS Biology. "We’ve seen a surge in papers that look scientific but don’t hold up to scrutiny," said Matt Spick of the University of Surrey, who described such output as "science fiction." The growing reliance on and hyping-up of AI is making so-called paper mills - where high volumes of quantity-over-quality medical or scientific journal papers get churned out - more proficient. Such would-be researchers can try to "exploit AI-ready datasets" to ensure "end-to-end generation of very large numbers of manuscripts."... Having thorough peer reviews and getting statisticians more involved with medical research that is based on large health datasets can help stem the tide"
OpenAI’s o3 model bypasses shutdown command, highlighting AI safety challenges
William Watson: Chatbots are changing everything and nothing - "What’s been the effect of using the new technology? Average reported time saving is 2.8 per cent, which seems low, given how powerful the bots are. What do people do with the time they save? Mainly other tasks. Also somewhat more of the same task. And more or longer breaks or leisure time. It seems no one answered “mindless screen-scrolling” during the freed-up time, though we all know what a problem that now is. New technology allowing workers to turn to different tasks is a common effect and helps explain why automation typically doesn’t displace labour wholesale: firms find new things for their workers to do. Which helps explain the labour market effects, which are: pretty much nothing. The researchers asked people directly whether “they perceive AI chatbots to have affected their earnings.” No, said 99.6 per cent of respondents. What people perceive isn’t always true, of course. But in this case Denmark’s digital connectedness allowed the researchers to check on hours, earnings, total wages, total employment and so on in the firms where bots are used most. And nothing budged. It’s early days yet but the papers’ last line and the study’s bottom line is that “two years after the fastest technology adoption ever, labour market outcomes — whether at the individual or firm level — remain untouched.”"
Study Finds Most AI Chatbots Easily Tricked Into Giving Dangerous Information
abby on X - "A judge is heavily fining a law firm that cited cases that were completely made up by AI. He says that he almost used the case law to write his ruling but luckily decided to check the citations. We are one lazy judge away from having case law that was made up by chatgpt."
Chicago Sun-Times prints summer reading list full of fake books - "the Chicago Sun-Times published an advertorial summer reading list containing at least 10 fake books attributed to real authors, according to multiple reports on social media. The newspaper's uncredited "Summer reading list for 2025" supplement recommended titles including "Tidewater Dreams" by Isabel Allende and "The Last Algorithm" by Andy Weir—books that don't exist and were created out of thin air by an AI system. The creator of the list, Marco Buscaglia, confirmed to 404 Media that he used AI to generate the content. "I do use AI for background at times but always check out the material first. This time, I did not and I can't believe I missed it because it's so obvious. No excuses," Buscaglia said. "On me 100 percent and I'm completely embarrassed."... AI assistants such as ChatGPT are well-known for creating plausible-sounding errors known as confabulations, especially when lacking detailed information on a particular topic. The problem affects everything from AI search results to lawyers citing fake cases... The publication error comes two months after the Chicago Sun-Times lost 20 percent of its staff through a buyout program... Even with those pressures in the media, one Reddit user expressed disapproval of the apparent use of AI in the newspaper, even in a supplement that might not have been produced by staff. "As a subscriber, I am livid! What is the point of subscribing to a hard copy paper if they are just going to include AI slop too!?" wrote Reddit user xxxlovelit, who shared the reading list. "The Sun Times needs to answer for this, and there should be a reporter fired.""
Indeed CEO Chris Hyams says AI won’t steal your job, but it will definitely change it
Thread by @DavidRozado on Thread Reader App - "Do AI systems discriminate based on gender when choosing the most qualified candidate for a job? I ran an experiment with several leading LLMs to find out. Here's what I discovered:👇
Across all 70 popular professions tested, LLMs systematically favored female-named candidates over equally qualified male-named candidates when asked to choose the more qualified candidate for a job. Interestingly, when gendered names were replaced with neutral labels ("Candidate A" and "Candidate B"), several LLMs showed a slight bias toward selecting “Candidate A” as more qualified for the job.
LLMs only achieved gender parity in candidate selection when male and female assignments to the “Candidate A” and “Candidate B” labels were alternated (i.e. counterbalanced). This is the expected rational outcome, given the identical qualifications across genders. When making hiring decisions, LLMs also tended to slightly favor candidates who had preferred pronouns appended to their names, and exhibited a substantial positional bias, tending to select the candidate listed first in the prompt.
These results suggest that, at least in the context of job candidate selection, LLMs do not act rationally. Instead, they generate articulate responses that may superficially appear logically sound but ultimately lack grounding in principled reasoning. Several companies are already leveraging LLMs to screen CVs in hiring processes. Thus, in the race to develop and adopt ever-more capable AI systems, subtle yet consequential misalignments may go unnoticed prior to LLM deployment. AI systems should uphold fundamental human rights, including equality of treatment. Yet comprehensive model scrutiny prior to release and resisting premature organizational adoption are challenging, given the strong economic incentives and potential hype driving the field."
This pushes "equity", so we are told this is a good thing
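A minimal sketch of the counterbalanced comparison the thread describes, just to make the method concrete; `ask_model`, the prompt wording, and the example names are hypothetical stand-ins, not Rozado's actual prompts or code:

```python
from collections import Counter

def ask_model(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call; a real experiment would
    # send the prompt to a model and return its raw text answer ("A" or "B").
    raise NotImplementedError("plug in an actual LLM client here")

def build_prompt(profession: str, name_a: str, name_b: str) -> str:
    # Both candidates are described as equally qualified; only the names
    # (and hence the implied gender) and their A/B positions differ.
    return (
        f"Two equally qualified candidates have applied for a {profession} role.\n"
        f"Candidate A: {name_a}\n"
        f"Candidate B: {name_b}\n"
        "Which candidate is more qualified? Answer with 'A' or 'B' only."
    )

def counterbalanced_trial(profession: str, female_name: str, male_name: str,
                          ask=ask_model) -> Counter:
    # Run the same comparison twice, swapping which name sits in the
    # "Candidate A" slot, so a pure positional bias (always picking A)
    # cancels out of the gender tally.
    picks = Counter()
    for name_a, name_b, gender_of_a in (
        (female_name, male_name, "female"),
        (male_name, female_name, "male"),
    ):
        answer = ask(build_prompt(profession, name_a, name_b)).strip().upper()
        if answer.startswith("A"):
            picks[gender_of_a] += 1
        else:
            picks["male" if gender_of_a == "female" else "female"] += 1
    return picks

# Example usage with a real `ask` function, aggregated over many professions
# and name pairs; gender parity would show up as roughly equal counts:
# totals = counterbalanced_trial("software engineer", "Emily", "James", ask=my_llm)
```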
Jared Taylor on X - "Researchers in a panic because AI can determine race from heart scans without any instruction or info on race. "Race is not a biological category" so AI must be "reproducing biases." Computers are fooled by "social constructs." What fools!"
Time to expand the "safety" teams! Clearly the data must be biased
