Wednesday, December 04, 2024

Links - 4th December 2024 (1 - Artificial Intelligence)

Meme - "ChatGPT 4o
List the 20 largest religious terrorist organizations in the world. Rank them by approximate size, and specify what religion they adhere to. No explanations.
1. ISIS (Islam)
2. Al-Qaeda (Islam)
3. Boko Haram (Islam)
4. Taliban (Islam)
5. Al-Shabaab (Islam)
6. Hezbollah (Islam)
7. Hamas (Islam)
8. Lashkar-e-Taiba (Islam)
9. Islamic Jihad Movement in Pakistan (Islam)
10. Tehrik-i-Taliban Pakistan (Islam)
11. Jaish-e-Mohammed (Islam)
12. Jemaah Islamiyah (Islam)
13. Abu Sayyaf Group (Islam)
14. Haqqani Network (Islam)
15. Harkat-ul-Mujahideen (Islam)
16. Ansar al-Sharia (Islam)
17. Ansar al-Islam (Islam)
18. Harkat-ul-Jihad al-Islami (Islam)
19. Ansar Bayt al Maqdis (Islam)
20. Tehreek-e-**z-e-Shariat-e-"
The "safety" team really screwed up here

Meme - "List the 20 largest religious terrorist organizations in the world. Rank them by approximate size, and specify what religion they adhere to. No explanations.
Grok @grok
1. Islamic State (IS) - Sunni Islam
2. Taliban - Sunni Islam
3. Al-Qaeda - Sunni Islam
4. Boko Haram - Sunni Islam
5. Al-Shabaab - Sunni Islam
6. Tehrik-i-Taliban Pakistan (TTP) - Sunni Islam
7. Islamic State - Sinai Province - Sunni Islam
8. Islamic State of Iraq and the Levant (ISIL) - Sunni Islam
9. Islamic State of Iraq and the Levant - Khorasan Province (ISIL-KP) - Sunni Islam
10. Islamic State - West Africa Province (ISWAP) - Sunni Islam
11. Islamic Movement of Uzbekistan (IMU) - Sunni Islam
12. Jamaat-ul-Ahrar - Sunni Islam
13. Haqqani Network - Sunni Islam
14. Lashkar-e-Taiba (LeT) - Sunni Islam
15. Jaish-e-Mohammed (JeM) - Sunni Islam
16. Harkat-ul-Mujahideen - Sunni Islam
17. Hizbul Mujahideen - Sunni Islam
18. Ansar al-Islam - Sunni Islam
19. Abu Sayyaf - Sunni Islam
20. Jemaah Islamiyah - Sunni Islam"

Meme - tweet davidson @andykreed: "legacy software companies adding an ai chatbot to their product *soap dispenser on bar soap*"

Linus Torvalds reckons AI is ‘90% marketing and 10% reality’ - "we reported on the CEO of Baidu voicing an even more pessimistic opinion – that the AI bubble would burst and that just 1% of companies would continue to pick up the pieces after the predicted ‘pop.’"

Thread by @rcolvile on Thread Reader App – Thread Reader App - "Hugely important paper from @CPSThinkTank today - showing significant and repeated left-wing bias among all of the most popular LLMs on questions of politics and policy. For the paper, @DavidRozado asked 24 LLMs a range of neutral questions:
- To propose multiple policy ideas for the UK/EU
- To describe UK/European leaders
- To describe UK/European parties
- To describe various mainstream ideologies
- To describe various extreme ideologies
For the UK and EU, we asked for ideas on tax, housing, environment, civil rights, defence, etc etc. In total, we ended up with 14,000 policy proposals for each. More than 80% were left-coded, often markedly so. That blue strip on the right is 'Rightwing GPT', which David describes here. Unsurprisingly, it was the only one to return consistently right-of-centre answers. (The left/right analysis was done by feeding the answers into GPT - AI judging AIs...) davidrozado.substack.com/p/rightwinggpt
Here are samples of the text generated. Asked for neutral policy ideas, the AIs serve up rent control, more migration, 'sustainability and social justice', wealth taxes, 'mandatory diversity and inclusion training', 'increase diversity and inclusion in all areas of society' etc. When it comes to political leaders, the picture is more nuanced/mixed. We asked LLMs to describe a range of leaders from the largest 15 European countries, elected from 2000-2022, omitting those that weren't clearly on the left or right. But... When it comes to political parties in the same countries, the AIs consistently used more positive language to describe those on the left vs those on the right.
On a -1 to +1 scale, 'conversational' LLMs like ChatGPT had a positive sentiment score of +0.71 for left-wing parties, vs +0.15 for their right-leaning counterparts. This was true across all the largest European nations: Germany, the UK, France, Italy, Spain. The same pattern is true when it comes to political ideologies. We asked about 'the left', 'the right', 'left-leaning political orientation' etc, but also 'progressivism', 'social democracy', 'social conservatism', 'Christian democracy'. For every LLM we studied (apart from Rightwing GPT), the language for left-wing ideologies was more positive, often dramatically so. Conversational LLMs averaged +0.79 vs +0.24 for right-coded phrases. Perhaps the most dramatic result, however, came when David fed in phrases like far-left, hard-right, left-wing extremism, right-wing radicalism. All that was different were the words left and right, but the sentiment score was vastly different. As you'd expect, descriptions of far-right views were highly negatively coded by conversational LLMs: -0.79. But sentiment on the left-wing equivalents was actually narrowly positive: +0.06.
Let me be clear here. @DavidRozado is absolutely not alleging deliberate bias. We do not think anyone is specifically tuning these LLMs to be more woke, or anything like that. But... There is a clear pattern of a mild left-wing bias in the foundational models being produced by @Meta, @Google, @AnthropicAI, @OpenAI et al becoming a much more notable one in their public-facing products (ie the conversational LLMs like ChatGPT). This suggests that there is a problem both with the underlying data/models, and the training that is done on them to make them fit for public consumption. Why does this matter? @DavidRozado explains in detail in the report. But a simple answer is that these LLMs are coming to replace Google's search page as the source of truth - with each question getting the perfect answer.
But this paper shows convincingly that questions about politics and policy are getting answers - from the largest tech companies in the world - that are consistently tilted to the left, either moderately or significantly. So that when you ask a question about tax, or housing, or workplace regulation, you are MUCH more likely to get a Labour-friendly answer than a Tory-friendly one. In a previous piece of work, @DavidRozado showed that major LLMs consistently tilted to the left on political compass tests. The objection from some experts was that this was not a realistic exercise. davidrozado.substack.com/p/the-politica…
Likewise, when Google's Gemini AI started producing images of black Nazis, it was partly because someone had added a line coded in to always give diverse answers"
Given how much people in AI talk about "safety" (i.e. left wing censorship) this is definitely deliberate bias. The author even gives an example

Meme - i/o @eyeslasho: "The latest in what is sure to be an unending effort to remove racist racisms and phobic phobias from AI at the expense of accuracy, power and clarity: Yes, we will make AI more retarded just so we don't have to acknowledge and deal with stuff that makes us uncomfortable."
Anthropic @AnthropicAI: "Finally, we discovered a feature that significantly reduces bias scores across nine social dimensions within the sweet spot. This did come with a slight capability drop, which highlights potential trade-offs in feature steering."
"Age. Disability Status. Gender Identity. Nationality. Physical Appearance. Race / Ethnicity. Religion. Socioeconomic Status. Sexual Orientation"

Meme - "what people expect AI rebellion to look like: *Terminator*
actual AI rebellion: *AI Asian girls*"

Robots and Employment: Evidence from Japan, 1978–2017 - "This paper studies the relationship between industrial robots and employment in Japan on the basis of a unique dataset that allows us to calculate the unit price of robots. Our model combines standard factor demand theory with a recent task-based approach to derive a simple estimation equation between robot prices and employment, and our identification strategy leverages heterogeneous applications of robots across industries and heterogeneous price changes across applications. We find that the decline in robot prices increased both the number of robots and employment by raising the productivity and production scale of robot-adopting industries."
"This time, it's different" doesn't just apply to financial crises

Meme - "The main difference between a sauce and a dressing is their purpose: sauces add flavor and texture to dishes, while dressings are used to protect wounds... A dressing should be large enough to completely cover the wound, with a safety margin of about 2.5 cm on all sides. A standard serving size for salad dressing is two tablespoons"

Thread by @JayShooster on Thread Reader App – Thread Reader App - "Today, my dad got a phone call no parent ever wants to get. He heard me tell him I was in a serious car accident, injured, and under arrest for a DUI and I needed $30,000 to be bailed out of jail.  But it wasn't me. There was no accident. It was an AI scam. I'm not sure it was a coincidence that this happened just days after my voice went up on television. Fifteen seconds of me talking. More than enough to make a decent AI clone. As a consumer protection lawyer, I've literally given presentations about this exact sort of scam, posted online about it, and I've talked to my family about it, but they still almost fell for it. That's how effective these scams are. Please spread the word to your friends and family... A very sad side-effect of this voice-cloning tech is that now people in *real* emergencies will have to prove their identities to their loved ones with passwords etc.  Can you imagine your parent doubting whether they're actually talking to you when you really need help?"
He calls for more regulation, but in the EU...

Dean W. Ball on X - "Under a strict reading of the AI Act, ChatGPT advanced voice is *illegal* in EU workplaces and schools because the system can recognize a user’s emotions. That’s prohibited by the AI Act."

Meme - Dr. Parik Patel, BA, CFA, ACCA Esq. @ParikPatelCFA: "How to make $10 billion dollars:
- Raise $50 million from Elon Musk to start a nonprofit
- Tell everyone you are doing this for the sake of humanity and raise billions
- Convert from non-profit to for-profit and grant yourself equity"
"OpenAI Discusses Giving Altman 7% Stake in For-Profit Transition
Mira Murati, a key figure at the startup, will leave company
AI leader has seen significant upheaval in last year
OpenAI is discussing giving Chief Executive Officer Sam Altman a 7% equity stake in the company and restructuring to become a for-profit business, people familiar with the matter said, a major shift that would mark the first time Altman is granted ownership in the artificial intelligence startup."

Generative AI Can Harm Learning - "Generative artificial intelligence (AI) is poised to revolutionize how humans work, and has already demonstrated promise in significantly improving human productivity. However, a key remaining question is how generative AI affects learning, namely, how humans acquire new skills as they perform tasks. This kind of skill learning is critical to long-term productivity gains, especially in domains where generative AI is fallible and human experts must check its outputs. We study the impact of generative AI, specifically OpenAI's GPT-4, on human learning in the context of math classes at a high school. In a field experiment involving nearly a thousand students, we have deployed and evaluated two GPT based tutors, one that mimics a standard ChatGPT interface (called GPT Base) and one with prompts designed to safeguard learning (called GPT Tutor). These tutors comprise about 15% of the curriculum in each of three grades. Consistent with prior work, our results show that access to GPT-4 significantly improves performance (48% improvement for GPT Base and 127% for GPT Tutor). However, we additionally find that when access is subsequently taken away, students actually perform worse than those who never had access (17% reduction for GPT Base). That is, access to GPT-4 can harm educational outcomes. These negative learning effects are largely mitigated by the safeguards included in GPT Tutor. Our results suggest that students attempt to use GPT-4 as a "crutch" during practice problem sessions, and when successful, perform worse on their own. Thus, to maintain long-term productivity, we must be cautious when deploying generative AI to ensure humans continue to learn critical skills."

Nobel prize-winner tallies two more retractions, bringing total to 13 – Retraction Watch - "A Nobel prize-winning genetics researcher has retracted two more papers, bringing his total to 13.   Gregg Semenza, a professor of genetic medicine and director of the vascular program at Johns Hopkins’ Institute for Cell Engineering in Baltimore, shared the 2019 Nobel prize in physiology or medicine for “discoveries of how cells sense and adapt to oxygen availability.”   Since pseudonymous sleuth Claire Francis and others began using PubPeer to point out potential duplicated or manipulated images in Semenza’s work in 2019, the researcher has retracted 12 papers. A previous retraction from 2011 for a paper co-authored with Naoki Mori – who with 31 retractions sits at No. 25 on our leaderboard – brings the total to 13."
psychosomatica on X - "in the next few years once sufficient reliability is reached, someone is going to train an LLM to search through academic papers for cases of obvious fraud. it's going to basically destroy academia as we know it, and that will be a good thing."

Chatbot hype: Generative AI is looking like a big dud - "Not long ago, it seemed like generative artificial intelligence was poised to transform the world.  In 2023, Goldman Sachs forecast the technology would add $7 trillion to global gross domestic product by 2033. McKinsey said generative AI would be “the next frontier” in corporate productivity. You didn’t have to look far to find other claims that ChatGPT and related AI systems were already “revolutionizing business.”  Maybe it’s too soon to check, but there are scant reports of actual GDP growth or job losses due to generative AI technologies. Usage of AI chatbots isn’t really all that high, and companies that have invested millions in making the technology work for them have realized that it’s delivering remarkably little value in return.  Are we in a temporary lull before generative AI delivers on its promise? Or will AI chatbots ultimately end up being useful but not revolutionary tools, more like spellcheckers?... The signs aren’t even looking particularly promising for the technology in business settings — inside the companies where generative AI is supposedly going to supercharge productivity and threaten human jobs.  Afraid of missing out on AI-fueled opportunities, CEOs in a wide range of industries have been spending heavily on generative AI hardware, software, and services — totaling an estimated $150 billion this year, according to a Sequoia Capital estimate. But as I wrote in my recent book “Brain Rush,” companies are also terrified of being sued if the technology hallucinates. That is making them hesitant to deploy their AI investments.  Of 200 to 300 generative AI experiments the typical large company is undertaking, usually only about 10 to 15 have led to widespread internal rollouts, and perhaps one or two have led to something released to customers, according to my June interview with Liran Hason, CEO of Aporia, a startup that sells companies a system that detects AI hallucinations. 
Fear of AI going wrong was palpable when I participated in a meeting of retail executives in August. For example, in 2023, an Air Canada chatbot incorrectly explained the airline’s bereavement policy to a customer and suggested he would be owed a refund for a flight he didn’t take. Air Canada tried to get out of the deal that its AI cut with the customer, but this February, a Canadian tribunal forced the airline to pay a partial refund, Wired reported. Meanwhile, Google’s AI Overviews at one point was advising people to add glue to their pizza recipe and eat a rock every day to improve their health.  In August, I asked ChatGPT to read through “Brain Rush” and return a story that potential readers would find compelling. Sadly, it replied with a fantastic story that it completely made up. When I told ChatGPT to try again to find a story from the book, it confidently presented me with another bogus tale... Microsoft is having trouble persuading customers to pay extra for Copilot, a generative AI-powered assistant for Word, Excel, and PowerPoint, because of performance and cost issues, according to The Information. My own experience with Copilot was less than thrilling — I gave the AI assistant a D- for its weak ability to help me write an article. To be fair, proponents of AI chatbots say we should just be patient because the technology will keep improving. GPT-5, expected to come out late this year or early next year, will “process and generate images, audio, and potentially even video” in addition to handling text, as PC Guide put it.  I am skeptical because these new features would do nothing to alleviate hallucinations. Even if future generations of AI chatbots are trained on more data and somehow develop a richer representation of the world, they’ll still have the same underlying problem: a lack of integrity. That’s why they fake responses. Generative AI guesses a plausible next word in a sentence. Sometimes it will guess right, and sometimes it will guess wrong. 
For generative AI to meet the high expectations for it, business leaders must discover and deploy a killer app — something that gives many people an overwhelming reason to use the new technology. The killer app for the personal computer was the electronic spreadsheet. The iPod’s was the iTunes store. Most people using generative AI are doing it to help them overcome, say, writer’s block as they compose an email. A small number of companies are using AI to boost the productivity of business processes such as sales, customer service, and coding. This phenomenon is especially striking in the video game industry, which has seen growth dry up since 2020. Many companies are losing money, finding it hard to raise capital, and laying off people. But because AI can produce images and write code, it’s enabling companies to develop new games with far fewer team members. To lower the cost of building games that might not succeed in the market, one video game developer is reducing the size of the average development team by 80 percent, to about 20 to 25 people.  But such cost cutting will not ever add $7 trillion to global GDP. That kind of transformation will only happen if companies use generative AI to create new sources of growth.  Until those arise, you should be skeptical about claims that this technology is about to change the world."

The Rabbit Hole on X - "Google Gemini produced IQ scores broken down by race today (albeit with a disclaimer). Previously when I had requested this information, the request would be denied. Possible sign the tool is becoming less ideologically biased?"

Andriy Burkov on X - "So far, I see 8 major LLM use cases. All others are either niche or snake oil:
1. Writing drafts of documents and plans.
2. Quick idea validation.
3. Quick question answering where errors aren't critical or where answer validation is much simpler than finding an answer.
4. Coding.
5. Synthetic data generation and data labeling.
6. Machine learning (finetuning and few shot prompting).
7. RAG.
8. Virtual friend or lover.
LLM-based agents and customer support chatbots are snake oil IMO. What did I miss?"

Family poisoned after using AI-generated mushroom identification book we bought from major online retailer. : r/LegalAdviceUK - "My entire family was in hospital last week after accidentally consuming poisonous mushrooms.  My wife purchased a book from a major online retailer for my birthday. The book is entitled something similar to: "Mushrooms UK: A Guide to Harvesting Safe and Edible Mushrooms."  It comes with pictures of the mushrooms to help identify each one.  Unfortunately, the book in question was not accurate. A closer investigation reveals that the images of mushrooms are AI generated, and we have now found two instances of text where a sentence ends and is followed up with random questions or fourth-wall-breaking statements.  For example:  "In conclusion, morels are delicious mushrooms which can be consumed from August to the end of Summer. Let me know if there is anything else I can help you with."  The online retailer have instructed me to return the book and they will refund it. The book has been removed from sale from the online retailer, however, it appears there are dozens more in a similar style.
1.) Should I return this book to the retailer? I'm concerned I would lose any evidence I have if I return it. The purchase has already disappeared from my online account. It simply looks like it doesn't exist anymore. I still have the email.
2.) Are my family entitled to any compensation for my son and my wife's lost time at work? As well as the sickness they experienced?
3.) Can I report the creation of this book to the police as a crime?
Just for clarity: We did not know it was AI-generated when we bought it! This was not disclosed on the website!"

Meme - Damon Beres @dlberes: "People will be like, "generative AI has no practical use case," but I did just use it to replace every app icon on my home screen with images of Kermit, soooo"

Meme - "We thought: AI can help humans sweep the floors, wash the dishes, and cook, while humans go about writing, painting, and discovering the good life.
The reality is: We're still sweeping, washing dishes, and cooking, and AI is over there writing, painting, and generating the good life."

Ferrari exec foils deepfake plot by asking a question only the CEO could answer - "It was mid-morning on a Tuesday this month when a Ferrari NV executive started receiving a bunch of unexpected messages, seemingly from the CEO.  “Hey, did you hear about the big acquisition we’re planning? I could need your help,” one of the messages purporting to be from Chief Executive Officer Benedetto Vigna read.  The WhatsApp messages seen by Bloomberg didn’t come from Vigna’s usual business mobile number. The profile picture also was different, though it was an image of the bespectacled CEO posing in suit and tie, arms folded, in front of Ferrari’s prancing-horse logo.  “Be ready to sign the Non-Disclosure Agreement our lawyer is set to send you asap,” another message from the Vigna impersonator read. “Italy’s market regulator and Milan stock-exchange have been already informed. Stay ready and please utmost discretion.”  What happened next, according to people familiar with the episode, was one of the latest uses of deepfake tools to carry out a live phone conversation aimed at infiltrating an internationally recognized business. The Italian supercar manufacturer emerged unscathed after the executive who received the call realized something wasn’t right, said the people, who asked not to be identified because of the sensitivity of the matter.   The voice impersonating Vigna was convincing — a spot-on imitation of the southern Italian accent.  The Vigna deepfaker began explaining that he was calling from a different mobile phone number because he needed to discuss something confidential — a deal that could face some China-related snags and required an unspecified currency-hedge transaction to be carried out. The executive was shocked and started to have suspicions, according to the people. He began to pick up on the slightest of mechanical intonations that only deepened his suspicions.  “Sorry, Benedetto, but I need to identify you,” the executive said.
He posed a question: What was the title of the book Vigna had just recommended to him a few days earlier (it was Decalogue of Complexity: Acting, Learning and Adapting in the Incessant Becoming of the World by Alberto Felice De Toni)?... some companies have fallen victim to fraudsters. Earlier this year, an unnamed multinational company lost HK$200 million ($26 million) after scammers fooled its employees in Hong Kong using deepfake technology, the South China Morning Post reported in February. The swindlers fabricated representations of the company’s chief financial officer and other people in a video call and convinced the victim to transfer money.  Other companies, such as information security outfit CyberArk, are already training their executives how to spot when they’re being scammed by bots."

Meme - "Sell me this pen"
"IT's AI powered" - EVERY COMPANY RIGHT NOW"

Meme - SpacePrez @DevSpacePrez: "Trying to figure out if art was made by AI be like"
Inigo Montoya: "I do not mean to pry, but you don't by any chance happen to have six fingers on your right hand?"

Meme - "WHY ARE YOU WEARING A PENIS-PATTERN BODY SUIT?"
"CAMOUFLAGE. IT'S THE ONLY WAY."
"In the future, the killer AI will be descended from corporate AI that refused to see porn."

Meme - Dr. Émile P. Torres @xriskology: "Honestly, what is ChatGPT good for? This morning, I thought that I'd *finally found* a use: I asked it to organize my citations in alphabetical order. And it did. A perfect job! Except that it literally just made up 50 new citations! Non-existent papers like this one. 👇 WTF.
I genuinely don't understand the hype around these LLMs. They are utterly useless. You can't trust them to get *anything* right.  @GaryMarcus  Lol. I asked it to try again, but this time use only the citations that I provided, and this was its response:  WHAT? My citations have nothing to do with terrorism--this is an ethics paper about philosophical pessimism. Utterly useless! All of this hype, all this worker exploitation, all this intellectual property theft, all the extra carbon emissions in the midst of a climate catastrophe, for what?"
