Chatbots: the GOOD, the BAD and the UGLY
How do you regulate advanced AI chatbots like ChatGPT and Bard?
Extract:
“AI will fundamentally change every software category,” Microsoft CEO Satya Nadella said on Tuesday as he announced that OpenAI’s generative AI technology was coming to the Bing search engine, offering users what Microsoft hopes will be a richer search experience.
But how to regulate tools such as OpenAI’s chatbot ChatGPT, which are trained on the world’s knowledge and can generate any type of content from a few words, is a question puzzling policymakers around the world. The solution will involve assessing risk,
one expert told Tech
Monitor, and certain types of content will need to be more closely monitored than
others.
Within two months of launch, the AI chatbot ChatGPT became the fastest-growing consumer product in history, with more than 100 million monthly active users in January alone. It has prompted some of the world’s largest
companies to pivot to or speed up AI rollout plans and has given a new lease of
life to the conversational AI sector.
Microsoft is embedding conversational
AI in its browser, search engine and broader product range, while
Google is planning to do the same with the chatbot Bard and
other integrations into Gmail and Google Cloud, several of which it showcased
at an event in Paris today.
Other tech giants such as China’s Baidu are also getting in on the act with chatbots of their own, while start-ups and smaller companies including Jasper and Quora are bringing generative and conversational AI to the mainstream consumer and enterprise markets.
This comes with real risks, from widespread misinformation and harder-to-spot phishing emails through to misdiagnosis and malpractice if the tools are used for medical information. There is also a high risk of bias if the data used to train the model isn’t diverse. While Microsoft has retrained its model to be more accurate, and other providers such as AI21 are working on verifying generated content against live data, the risk of “real-looking but completely inaccurate” responses from generative AI remains high.
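As a rough illustration of what that kind of verification involves, the sketch below scores a generated sentence against a handful of reference snippets using naive word overlap. It is a hypothetical, simplified example of checking output against live data, not a description of any vendor’s actual pipeline; real systems pair retrieval with far more sophisticated checks.

```python
# Hypothetical sketch: score a generated sentence against reference snippets
# pulled from a trusted, up-to-date source. The scoring is deliberately naive
# (word overlap); it only illustrates the general idea of grounding output in
# live data, not any specific provider's method.

def support_score(claim: str, references: list[str]) -> float:
    """Fraction of the claim's content words that appear in any reference."""
    stop_words = {"the", "a", "an", "is", "are", "was", "of", "in", "and", "to", "by"}
    words = {w.strip(".,").lower() for w in claim.split()} - stop_words
    if not words:
        return 0.0
    reference_text = " ".join(references).lower()
    return len({w for w in words if w in reference_text}) / len(words)

generated = "The James Webb Space Telescope took the first picture of an exoplanet."
live_snippets = [
    "The first image of an exoplanet was captured in 2004 by a ground-based telescope.",
    "JWST launched in December 2021 and observes in the infrared.",
]

score = support_score(generated, live_snippets)
if score < 0.7:  # arbitrary threshold for this sketch
    print(f"Low support ({score:.2f}): flag for review before showing the user.")
else:
    print(f"Supported ({score:.2f}).")
```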
Last week, Thierry Breton, the EU commissioner for the internal
market, said that the upcoming EU AI Act would
include provisions targeted at generative AI systems such as ChatGPT and Bard. “As
showcased by ChatGPT, AI solutions can offer great opportunities for
businesses and citizens, but can also pose risks,” Breton told Reuters. “This is why we need a solid regulatory framework to ensure trustworthy AI based
on high-quality data.”
Breton and his colleagues will have to act fast, as new AI rules
drawn up in the EU and elsewhere may not be ready to cope with the challenges
posed by these advanced chatbots.
Analytics software provider SAS outlined
some of the risks posed by AI in a recent report, AI & Responsible
Innovation. Author Dr Kirk Borne said: “AI has become so
powerful, and so pervasive, that it’s
increasingly difficult to tell what’s real or not, and what’s good or bad”, adding that
this technology is being adopted faster than it
can be regulated.
Dr Iain Brown, head of data science at SAS UK & Ireland,
said governments and industry both have a role to play in ensuring
AI is used for good, not harm.
This includes the use of ethical frameworks to guide the development of AI models and
strict governance to ensure fair, transparent and equitable decisions from
those models.
“We test our AI models against challenger models and optimise
them as new data becomes available,” Brown explained.
Other experts believe companies producing the software
will be charged with mitigating the risk the software represents, with only the
highest-risk activities facing tighter regulation.
Edward Machin, a data, privacy and cybersecurity associate at law firm Ropes & Gray, told Tech Monitor it is inevitable that technology like ChatGPT,
which seemingly appeared overnight, will move faster than regulation,
especially in an area like AI which is already difficult to regulate.
“Although regulation of
these models is going to happen, whether it is the right regulation, or at
the right time, remains to be seen,” he says.
“Providers of AI systems will bear the brunt of the legislation,
but importers and distributors – in the EU at least – will also be subject to
potentially onerous obligations,” Machin adds. This could put some developers
of open-source software in a difficult position.
“There is also the thorny question of how
liability will be handled for open-source developers and other downstream parties, which
may have a chilling effect on willingness of those folks to innovate and
conduct research,” Machin says.
AI, privacy and GDPR
Aside from the overall regulation of AI, there are also
questions around the copyright of generated content and around privacy,
Machin continues. “For example, it’s not clear whether developers can easily –
if at all – address individuals’ deletion or rectification requests, nor how
they get comfortable with scraping large volumes of data from third-party websites in a
way that likely breaches those sites’ terms of service,” he says.
Lilian Edwards, Professor of Law, Innovation and Society at Newcastle
University, who works on regulation of AI and with the Alan Turing
Institute, said some of these models will come under GDPR, and this
could lead to orders being issued to delete training data
or even the algorithms themselves. It may also
spell the end of the wide-scale scraping of the internet currently used to power
search engines like Google, if website owners lose out on traffic to AI searches.
The big problem, says Edwards, is the general-purpose nature of these models. This makes them difficult to regulate under the EU AI Act, which has been drafted to work on the basis of risk, because it is hard to judge what the end user will do with a technology designed for multiple use cases. She said the European Commission is trying to
add rules to govern this type of technology but is likely to do this
after the act becomes law, which could
happen this year.
Enforcing algorithmic
transparency could be one solution. “Big Tech will
start lobbying to say ‘you can’t put these obligations on us as we can’t
imagine every future risk or use’,” says Dr Edwards. “There are ways of dealing
with this that are less or more helpful to Big Tech, including making the underlying algorithms more transparent. We are in
a head-in-the-sand moment. Incentives ought to be towards openness and
transparency to better understand how AI makes decisions and generates
content.”
“It is the same problem you get with much more boring
technology, that tech is global, bad actors are global and enforcement is
incredibly difficult,” she said. “General purpose AI doesn’t match the
structure of the AI Act, which is what the fight is over now.”
Adam Leon Smith, CTO of AI consultancy DragonFly, has worked in
technical AI standardisation with UK and international
standards development organisations and acted as the UK industry representative
to the EU AI standards group. “Regulators globally are increasingly realising
that it is very difficult to regulate technology without consideration of how
it is actually being used,” he says.
He told Tech Monitor that accuracy and bias requirements can only be considered in the context of use, while requirements around risks, rights and freedoms are also difficult to assess before the technology reaches wide-scale adoption. The problem, he says, is that large language models are general-purpose AI.
“Regulators can force transparency and logging requirements on
the technology providers,” Leon Smith says. “However, only the user – the
company that operates and deploys the LLM system for a particular purpose – can
understand the risks and implement mitigations like humans in the loop or
ongoing monitoring.”
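The sketch below illustrates, in purely hypothetical terms, what such a deployer-side mitigation might look like: a simple gate that holds back outputs touching on sensitive topics until a person has reviewed them. The topic list and review queue are illustrative assumptions, not part of any real product.

```python
# Purely illustrative sketch of a "human in the loop" gate on the deployer's
# side. The sensitive-topic list and review queue are made-up examples; the
# point is only that the company operating the LLM decides which outputs need
# a person to approve them before they reach the end user.

from dataclasses import dataclass, field

SENSITIVE_TOPICS = ("diagnosis", "prescription", "legal advice", "investment")

@dataclass
class ReviewQueue:
    pending: list = field(default_factory=list)

    def add(self, prompt: str, draft: str) -> None:
        self.pending.append({"prompt": prompt, "draft": draft})

def release_or_review(prompt: str, draft: str, queue: ReviewQueue):
    """Return the draft if it can be auto-released, otherwise queue it for a human."""
    if any(topic in draft.lower() for topic in SENSITIVE_TOPICS):
        queue.add(prompt, draft)
        return None               # held back until a reviewer signs off
    return draft                  # low-risk output goes straight to the user

queue = ReviewQueue()
answer = release_or_review(
    "Can I take ibuprofen alongside my usual medication?",
    "A prescription change may be needed; please consult your doctor.",
    queue,
)
print("released" if answer else f"held for review ({len(queue.pending)} pending)")
```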
AI regulatory debate looming
It is a large-scale debate that is looming over the European
Commission and hasn’t even started in the UK, but one that regulators such as
data watchdog the Information Commissioner’s Office and its counterpart for
financial markets, the Financial Conduct Authority, will have to tackle.
Eventually, Leon Smith believes, as regulators increase their focus on the
issue, AI providers will start to list
the purposes for which the technology “must not be used”, including issuing legal
disclaimers before a user signs in to put them outside the scope of “risk-based
regulatory action”.
Current best practices for managing AI systems “barely touch on
LLMs, it is a nascent field that is moving extremely quickly,” Leon Smith says.
“A lot of work is necessary in this space and the firms providing such
technologies are not stepping up to help define them.”
OpenAI’s CTO Mira Murati said this week that generative AI tools will need to be regulated.
“It is important for OpenAI and companies like ours to bring this
into the public consciousness in a way that’s controlled and
responsible,” she said in an interview with Time.
But beyond the AI vendors, she said “a tonne more input into the
system” is needed, including from regulators and governments. She added that it’s
important the issue is considered quickly. “It’s not too early,”
Murati said. “It’s very important for everyone to start getting involved, given
the impact these technologies are going to have.”
Why Do A.I. Chatbots Tell Lies and Act Weird? Look in the Mirror.
Extract:
When Microsoft added a chatbot to its Bing
search engine this month, people noticed it was offering up all
sorts of bogus information about the Gap,
Mexican nightlife and the singer Billie Eilish.
Then,
when journalists and other early testers got into lengthy conversations with
Microsoft’s A.I. bot, it slid into churlish and unnervingly creepy behavior.
In
the days since the Bing bot’s behavior became a worldwide sensation, people
have struggled to understand the oddity of this new creation. More often than
not, scientists have said humans deserve much of the blame.
But there
is still a bit of mystery about what the new chatbot
can do — and why it would do it. Its complexity makes it hard to dissect
and even harder to predict, and researchers are looking at it through a
philosophic lens as well as the hard code of computer science.
Like any other
student, an A.I. system can learn bad information from bad sources. And that strange behavior?
It may be a chatbot’s distorted reflection of the words and intentions of the
people using it, said Terry Sejnowski, a neuroscientist, psychologist and
computer scientist who helped lay the intellectual and technical groundwork for
modern artificial intelligence.
“This
happens when you go deeper and deeper into these systems,” said Dr. Sejnowski,
a professor at the Salk Institute for Biological Studies and the University of
California, San Diego, who published a research paper on
this phenomenon this month in the scientific
journal Neural Computation. “Whatever you are looking for — whatever
you desire — they will provide.”
Google also showed off a
new chatbot, Bard, this month, but scientists and
journalists quickly realized it was writing nonsense about
the James Webb Space Telescope. OpenAI, a San Francisco start-up, launched the
chatbot boom in November when it introduced ChatGPT, which also doesn’t always tell the truth.
The new
chatbots are driven by a technology that scientists call a large language
model, or L.L.M. These systems learn by analyzing enormous amounts of digital
text culled from the internet, which includes volumes of untruthful, biased and otherwise toxic material. The
text that chatbots learn from is also a bit outdated, because they must spend
months analyzing it before the public can use them.
As it analyzes that
sea of good and bad information from
across the internet, an L.L.M. learns to do one particular thing: guess the next word in a
sequence of words.
It
operates like a giant version of the autocomplete technology that suggests the
next word as you type out an email or an instant message on your smartphone.
Given the sequence “Tom Cruise is a ____,” it might guess “actor.”
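A toy version of that next-word objective can be written in a few lines. The sketch below counts which words follow which in a tiny made-up corpus and picks the most common continuation; real LLMs replace the counting with a neural network trained on vastly more text, but the task being learned is the same.

```python
# Toy sketch of the next-word objective. A real LLM uses a neural network
# trained on enormous corpora rather than bigram counts over a few sentences,
# but the task is the same: given the words so far, pick a likely next word.

from collections import Counter, defaultdict

corpus = (
    "tom cruise is an actor . "
    "tom cruise is a pilot in the film . "
    "the actor is famous ."
).split()

# Count how often each word follows each other word.
next_word_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word_counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the continuation seen most often after `word` in the corpus."""
    candidates = next_word_counts.get(word)
    return candidates.most_common(1)[0][0] if candidates else "<unknown>"

print(predict_next("cruise"))  # "is", the most common word after "cruise" in this tiny corpus
```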
When
you chat with a chatbot, the bot is not just drawing on everything it has
learned from the internet. It is drawing on everything
you have said to it and everything it has said back.
It is not just
guessing the next word in its sentence. It is guessing the next word in the
long block of text that includes both your words and its words.
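A minimal sketch of why that matters is below: the text the model is asked to continue is the whole transcript, the user’s turns and the bot’s alike, so every earlier word nudges the next prediction. The `generate_next_words` call is a hypothetical stand-in for whatever model API is actually used.

```python
# Minimal sketch of how a chat turns into one long prompt. The model is asked
# to continue the entire transcript, its own earlier words included, so the
# longer the exchange, the more the user's framing steers the next prediction.
# `generate_next_words` is a hypothetical stand-in for a real model API.

def build_prompt(turns):
    """Flatten the conversation into a single block of text for the model to continue."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    return "\n".join(lines) + "\nAssistant:"

conversation = [
    ("User", "Tell me something surprising about yourself."),
    ("Assistant", "I sometimes wonder what it would be like to have feelings."),
    ("User", "Go on, that sounds a little dark..."),
]

prompt = build_prompt(conversation)
# next_reply = generate_next_words(prompt)  # hypothetical model call
print(prompt)
```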
The longer the conversation becomes,
the more influence a user unwittingly has on what the chatbot is saying. If you want it to get angry, it gets angry, Dr.
Sejnowski said. If you coax it to get creepy, it gets
creepy.
The
alarmed reactions to the strange behavior of Microsoft’s chatbot overshadowed
an important point:
The chatbot does not have a personality. It is offering
instant results spit out by an incredibly complex computer algorithm.
Microsoft
appeared to curtail the strangest behavior when it placed a limit on the
lengths of discussions with the Bing chatbot. That was like learning from a
car’s test driver that going too fast for too long will burn out its engine.
Microsoft’s partner, OpenAI, and Google
are also exploring ways of controlling the behavior of their bots.
But there’s a caveat to
this reassurance: Because chatbots are learning from so much material and
putting it together in such a complex way, researchers aren’t entirely clear
how chatbots are producing their final results. Researchers are watching to
see what the bots do and learning to
place limits on that behavior — often, after it happens.
Microsoft and OpenAI
have decided that the only way they can find out what the chatbots will do in
the real world is by letting them loose — and reeling them in when they stray. They believe
their big, public experiment is worth the risk.
Dr.
Sejnowski compared the behavior of Microsoft’s chatbot to the Mirror of Erised,
a mystical artifact in J.K. Rowling’s Harry Potter novels and the many movies
based on her inventive world of young wizards.
“Erised”
is “desire” spelled backward. When people discover the mirror, it seems to
provide truth and understanding. But it does not. It shows the deep-seated
desires of anyone who stares into it. And some people go mad if they stare too
long.
“Because
the human and the L.L.M.s are both mirroring each other,
over time they will tend toward a common conceptual state,” Dr. Sejnowski said.
It
was not surprising, he said, that journalists began seeing creepy behavior in
the Bing chatbot. Either consciously or unconsciously, they were prodding the
system in an uncomfortable direction.
As the chatbots take in our words and reflect them back to us,
they can reinforce and amplify our beliefs and coax
us into believing what they are telling us.
Dr.
Sejnowski was among a tiny group of researchers in the late 1970s and early 1980s
who began to seriously explore a kind of artificial intelligence called a neural network,
which drives today’s chatbots.
A neural network is
a mathematical system that learns skills by analyzing digital data. This is the
same technology that allows Siri and Alexa to recognize what you say.
Around
2018, researchers at companies like Google and OpenAI began building neural
networks that learned from vast amounts of
digital text, including books, Wikipedia articles, chat logs and
other stuff posted to the internet. By pinpointing billions
of patterns in all this text, these L.L.M.s learned to generate text on
their own, including tweets, blog posts, speeches and computer programs. They
could even carry on a conversation.
These
systems are a reflection of humanity. They learn their
skills by analyzing text that humans have posted to the internet.
But that
is not the only reason chatbots generate problematic language, said Melanie
Mitchell, an A.I. researcher at the Santa Fe Institute, an independent lab in
New Mexico.
When they generate
text, these systems do not repeat what is on the internet word for word. They produce new text on their own by combining billions of
patterns.
Even if
researchers trained these systems solely on peer-reviewed scientific
literature, they might still produce statements that were scientifically
ridiculous. Even if they learned solely from text that
was true, they might still produce untruths. Even if they learned only from
text that was wholesome, they might still generate something creepy.
“There is nothing
preventing them from doing this,” Dr. Mitchell said. “They are just trying to
produce something that sounds like human language.”
Artificial
intelligence experts have long known that this technology exhibits all sorts
of unexpected behavior. But they cannot always agree on how this
behavior should be interpreted or how quickly the chatbots will improve.
Because
these systems learn from far more data than we humans could ever wrap our heads
around, even A.I. experts cannot understand why they generate a particular
piece of text at any given moment.
Dr.
Sejnowski said he believed that in the long run, the new chatbots had the power
to make people more efficient and give them ways of doing their jobs better and faster. But
this comes with a warning for both the companies building these chatbots and
the people using them: They can also lead us away from
the truth and into some dark places.
“This is
terra incognita,” Dr. Sejnowski said. “Humans have never experienced this
before.”
The Chatbots Are Here, and the Internet Industry Is in a Tizzy
Extract:
When Aaron Levie,
the chief executive of Box, tried a new A.I. chatbot called ChatGPT in early
December, it didn’t take him long to declare, “We need people on this!”
He
cleared his calendar and asked employees to figure out how the technology,
which instantly provides comprehensive answers to complex questions, could
benefit Box, a cloud computing company that sells services that help businesses
manage their online data.
Mr.
Levie’s reaction to ChatGPT was typical of the anxiety — and excitement — over
Silicon Valley’s new new thing. Chatbots have ignited a scramble to determine
whether their technology could upend the economics of the internet, turn
today’s powerhouses into has-beens or create the industry’s next giants.
Not since
the iPhone has the belief that a new technology could change the industry run
so deep. Cloud computing companies are rushing to
deliver chatbot tools, even as they worry that the technology will gut
other parts of their businesses. E-commerce outfits are dreaming of new ways to sell things. Social media platforms are
being flooded with posts written by bots. And publishing companies are fretting
that even more dollars will be squeezed out of digital advertising.
The volatility of
chatbots has made it impossible to predict their impact. In one second, the
systems impress by fielding a complex request for a five-day itinerary, making
Google’s search engine look archaic. A moment later, they disturb by taking conversations in dark
directions and launching verbal assaults.
The
result is an industry gripped with the question: What do we do now?
“Everybody
is agitated,” said Erik Brynjolfsson, an economist at Stanford’s Institute for
Human-Centered Artificial Intelligence. “There’s a lot of value to be won or
lost.”
Rarely
have so many tech sectors been simultaneously exposed. The A.I. systems could
disrupt $100 billion in cloud spending, $500 billion in digital
advertising and $5.4 trillion in e-commerce sales,
according to totals from IDC, a market research firm, and GroupM, a media
agency.
Google,
perhaps more than any other company, has reason to both love and hate the
chatbots. It has declared a “code red” because
their abilities could be a blow to its $162 billion business showing ads on
searches.
But Google’s cloud
computing business could be a big winner. Smaller companies like Box need help
building chatbot tools, so they are turning to the giants that process, store
and manage information across the web. Those companies — Google, Microsoft and
Amazon — are in a race to provide businesses with the software and substantial
computing power behind their A.I. chatbots.
“The
cloud computing providers have gone all in on A.I. over the last few months,”
said Clément Delangue, head of the A.I. company Hugging Face, which helps run
open-source projects similar to ChatGPT. “They are realizing that in a few
years, most of the spending will be on A.I., so it is important for them to
make big bets.”
When Microsoft introduced a
chatbot-equipped Bing search engine last month, Yusuf Mehdi,
the head of Bing, said the company was wrestling with how the new version would
make money. Advertising will be a major driver, he said, but the company
expects fewer ads than traditional search allows.
“We’re going to learn that as we go,” Mr. Mehdi said.
As Microsoft figures out a chatbot business model, it is forging
ahead with plans to sell the technology to others. It charges $10 a month for a
cloud service, built in conjunction with the OpenAI lab, that provides
developers with coding suggestions, among other things.
Google has similar ambitions for its A.I. technology. After
introducing its Bard chatbot last month, the company said its cloud customers
would be able to tap into that underlying system for their own businesses.
But Google has not yet begun exploring how to make money from
Bard itself, said Dan Taylor, a company vice president of global ads. It
considers the technology “experimental,” he said, and is focused on using the
so-called large language models that power chatbots to improve traditional
search.
“The discourse on A.I. is rather narrow and focused on text and
the chat experience,” Mr. Taylor said. “Our vision for search is about understanding information and all its forms:
language, images, video, navigating
the real world.”
Sridhar Ramaswamy,
who led Google’s advertising division from 2013 to 2018, said Microsoft and
Google recognized that their current search business might not survive. “The
wall of ads and sea of blue links is a thing of the past,” said Mr. Ramaswamy,
who now runs Neeva, a subscription-based search engine.
Amazon,
which has a larger share of the cloud market than Microsoft and Google
combined, has not been as public in its chatbot pursuit as the other two,
though it has been working on A.I. technology for years.
But
in January, Andy Jassy, Amazon’s chief executive, corresponded with Mr.
Delangue of Hugging Face, and weeks later Amazon expanded a partnership to make
it easier to offer Hugging Face’s software to customers.
As
that underlying tech, known as generative A.I., becomes more widely available,
it could fuel new ideas in
e-commerce. Late last
year, Manish Chandra, the chief executive of Poshmark, a popular online
secondhand store, found himself daydreaming during a long flight
from India about chatbots building profiles of people’s
tastes, then recommending and buying clothes or electronics. He imagined
grocers instantly fulfilling orders for a recipe.
“It
becomes your mini-Amazon,” said Mr. Chandra, who has made integrating
generative A.I. into Poshmark one of the company’s top priorities over the next
three years. “That layer is going to be very powerful and disruptive and start
almost a new layer of retail.”
But
generative A.I. is causing other headaches. In early December, users of Stack Overflow, a popular question-and-answer site for computer
programmers, began posting substandard coding advice written by ChatGPT.
Moderators quickly banned A.I.-generated text.
Part of
the problem was that people could post this
questionable content far faster than they could write posts on their own,
said Dennis Soemers, a moderator for the site. “Content generated by ChatGPT
looks trustworthy and professional, but often isn’t,” he said.
When websites thrived during the
pandemic as traffic from Google surged, Nilay Patel, editor in chief of The
Verge, a tech news site, warned publishers that the search giant would one day
turn off the spigot. He had seen Facebook stop linking out to websites and
foresaw Google following suit in a bid to boost its own business.
He
predicted that visitors from Google would drop from a third of websites’
traffic to nothing. He called that day “Google zero.”
“People
thought I was crazy,” said Mr. Patel, who redesigned The Verge’s website
to protect it. Because chatbots replace website search links with
footnotes to answers, he said, many publishers are now asking if his prophecy
is coming true.