Blog: Invasion of A(I)gents ?

Context :

Large action models gearing up to turn AI’s promise into action .. Eco Times .. 28 Mar 2024

Extract :

Imagine you're going on holiday. Would life be simpler if, instead of having to navigate through multiple apps on your phone to book a flight, check into a hotel, make restaurant reservations and so on, an AI could do it for you? Large action models (LAM) - the 'next big thing' in the AI landscape, according to experts - can do just that, and more.

With the world's first LAM-powered device, Rabbit's R1, set to start shipping in a week and tech biggies like Microsoft and Salesforce investing in this cutting edge technology, LAMs can be 'revolutionary' and bring significant business gains, experts said. AI making choices on the behalf of consumers will, however, also add further complexity to business decision-making.

On the consumer tech front, however, Los Angeles-based startup Rabbit's R1 that runs on its own unique operating system has been creating a buzz. The $199, pocket-sized, LAM-powered device launched in January seeks to deliver an "intuitive app-free experience with the power of AI". The device is set to ship within a week. At least 50,000 units have been pre-ordered in the two months since launch, reports say.

Rabbit, which is backed by Khosla Ventures - an early investor in ChatGPT-maker OpenAI - left Microsoft chairman and CEO Satya Nadella impressed. Nadella compared the R1 launch to Steve Jobs' original 2007 launch of the iPhone and said that such a device, which reduces the cognitive load on humans of dealing with multiple apps, could bring a "real breakthrough".

Whereas large language models (LLMs) like ChatGPT, Midjourney or Gemini are trained on vast amounts of content to generate words, images, video etc., LAMs are trained on demonstrations and actions - an 'action library', so to speak - to understand and predict what action to take based on a request. They can then issue 'function calls' to carry out a specific task.
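To make the 'action library' and 'function call' idea concrete, here is a minimal sketch of my own (it is not from the article and is not how Rabbit, Microsoft or Salesforce implement it). Every name in it (`ACTION_LIBRARY`, `predict_action`, `execute`, the two booking functions) is hypothetical, and a simple keyword rule with fixed example arguments stands in for the trained model that would normally pick the action.

```python
from typing import Callable, Dict

# Hypothetical actions the agent is allowed to take.
def book_flight(destination: str, date: str) -> str:
    return f"Flight to {destination} booked for {date}"

def reserve_table(restaurant: str, guests: int) -> str:
    return f"Table for {guests} reserved at {restaurant}"

# The "action library": action names mapped to callable functions.
ACTION_LIBRARY: Dict[str, Callable[..., str]] = {
    "book_flight": book_flight,
    "reserve_table": reserve_table,
}

def predict_action(request: str) -> dict:
    """Stand-in for the model: keyword rules instead of a trained LAM.

    Returns a 'function call' as plain data: an action name plus arguments.
    The argument values are fixed examples, not parsed from the request.
    """
    text = request.lower()
    if "flight" in text:
        return {"name": "book_flight",
                "arguments": {"destination": "Goa", "date": "2024-04-05"}}
    if "table" in text or "restaurant" in text:
        return {"name": "reserve_table",
                "arguments": {"restaurant": "Cafe Example", "guests": 2}}
    raise ValueError(f"No known action for request: {request!r}")

def execute(call: dict) -> str:
    """Look the action up in the library and run it with its arguments."""
    func = ACTION_LIBRARY[call["name"]]
    return func(**call["arguments"])

if __name__ == "__main__":
    print(execute(predict_action("Please book me a flight for my holiday")))
```

The point of the sketch is the separation: the model only produces a description of the call, and a separate step looks it up and runs it, which is also where a permission check or a human confirmation could be added.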

Microsoft India president Puneet Chandok had told ET recently that, "In the near future, our sense is that some of these copilots will have agency. So, rather than just searching on Make My Trip and saying where to go in the first week of Jan, it can actually go ahead and make the booking for me."

Microsoft is 'actively working' on Large Action and Behavioral Models that can 'predict how to act on human environments and interfaces', as per its website. These LAMs will be "capable to perform complex tasks in a multitude of scenarios, like how to play a videogame, how to move in the real world, and how to use an Operating System (OS)," it says.

Software major Salesforce too is looking at LAMs to take actions on clients' behalf. According to Silvio Savarese, executive vice president and chief scientist at Salesforce, LAMs or 'AI Agents' are the next phase of AI.

LAMs can take action based on natural language prompts, combining the “linguistic fluency of an LLM with the ability to accomplish tasks and make decisions independently,” according to Salesforce AI Research. Eventually, AI agents will be able to connect and work together to create an exponentially more powerful and efficient AI network. The company cautions, however, that autonomous AI would require raising the trust bar even further.

Simply put, an LAM “gets your job done,” said Paramdeep Singh, co-founder of Gen AI and data engineering solution provider Shorthills AI.

“LAMs can orchestrate tasks and help organisations and businesses achieve their goals. Once this technology matures, it has a much better shot at delivering ROI (return on investment) to businesses,” Singh said, adding that it is not easy to generate a ROI based on a pure generative AI model.

Experts said LAMs could be useful in industries like consumer banking, travel, and patient management in healthcare, though use cases are yet to clearly emerge. Task automation combined with natural language could make a 'huge difference' for enterprises, they said.

Andrew Ng, founder of Deeplearning.ai, wrote in a recent blog, “I think AI agent workflows will drive massive AI progress this year — perhaps even more than the next generation of foundation models.”

He added that his venture studio AI Fund, backed by the likes of Sequoia Capital, Softbank and Greylock, is working on AI that can use tools, execute multi-step plans, and even foster collaboration among different AI agents.

Rabbit says it trained its LAM on human interactions with popular apps like Spotify and Uber, using neuro-symbolic programming.

The R1 AI thus knows what a Settings icon looks like, how to confirm an order, where to find search menus, etc. and this knowledge can be applied to any app, according to founder Jesse Lyu. It can even learn custom actions that you teach it.

R1 can respond to voice commands to send messages, make calls, order groceries, launch a Spotify playlist, book a cab, book hotels and flight tickets, etc. Apps it supports include YouTube Music, Amazon and Doordash. You don’t sign into any apps on the R1 – instead, you give the device access to them through Rabbit’s online portal.

“That (LAM) changes quite a bit,” said Rajnil Mallik, partner and Gen AI go-to-market leader, PwC India. “The signal that it gives to executives is that this is indeed real, because it makes your customer choose a certain way, act in a certain way, on their behalf, so you better pay attention.”

“If I have two airline apps and my LAM is choosing one over the other, then you can be sure that the executive in charge of airline app 2 is going to mind,” said Mallik, adding that LAMs challenge both the App Store paradigm and the LLM paradigm.

Microsoft and AI chip giant Nvidia are ‘strategic partners’ to San Francisco-based Adept, which is developing Act-1, a transformer model that can “take actions in the digital world” from text prompts. Founded by former Google researchers in 2022, the company is reportedly valued at $1 billion.

Similarly, Palo Alto-based startup MultiOn, backed by General Catalyst, Amazon Alexa Fund etc., is building an AI Agent that “doesn’t just chat but acts”.

LAMs are still at an early stage and could take six months to two years to mature, Singh said. He added that many large corporates have started investing in LAMs, and it is just a matter of time before there is concrete movement on the technology in India too.

“Gen AI based models that take action will enter common usage soon,” Mallik said, just as ChatGPT became a household name rapidly in just a couple of years. This is even more likely as the entire AI ecosystem sees further innovations that bring down hallucinations and costs, small language models (SLM) that enable more contextual use cases take off, and regulatory guidance is put in place.

The flip side is that such an AI model, which can take action on its own, without any human intervention, is fraught with risk of job displacement, unforeseen changes in social fabric, security, and other ethical challenges, Singh cautioned.

Here are my own thoughts on the emergence of AI Agents :

➢ Between DeepMind and Deep Sea ? ( 15 Feb 2017 )

Extract :

I have no doubt that DeepMind ( and its opponent AI ) are quite capable of substituting , on their own , words / concepts , as follows :

Collect = Immobilize / Apple = Humans / Central Pile = World / Tagging = Shortlisting / Laser Beam = Zero-in / Removing = Eliminating / Game = War / Hunt = Chase / Capture = Imprison / Prey = Target / Obstacles = Shields / Antagonistic = Inimical / Zap = Kill / Combative = Aggressive / Weapon = Anthrax – Nerve Gas – Nuclear Missile..etc

➢ Fast Forward to Future ( 3 F ) ( 20 Oct 2016 )

➢ Future is Nearer ! ( 31 July 2017 )

Extract :

Google’s artificial intelligence division is developing a system that can make its own PLANS

DeepMind says its “ IMAGINATION AUGMENTED AGENTS “ can “ IMAGINE “ the possible consequences of their actions, and “ INTERPRET “ those simulations

They can then make the “ RIGHT DECISIONS “ , for what it is they want to achieve

They , essentially THINK like HUMANS , trying out different strategies in their heads , so to speak, and are therefore able to LEARN , despite having little REAL experience

“ The AGENTS we introduce benefit from an IMAGINATION ENCODER - a neural network which LEARNS to ‘ extract any information ‘ useful for the Agent’s FUTURE DECISIONS , but ignore that which is irrelevant “ , the researchers wrote in a blog post

My chat with ChatGPT elicited the following response :

I asked :

“Hey ChatGPT : 7 years ago , in my blog " Future is Nearer ! " ( https://myblogepage.blogspot.com/2017/07/future-is-nearer.html ) , I had imagined a scenario where DeepMind - and its opponent - start substituting certain words of their own volition - while competing with each other.

Examples that I imagined are :

Now that some 7 years have passed since I imagined that scenario and that AI has made substantial progress, could you tell me whether my " prediction " ( about AI's ability to reason like a human and act only in Self Interest , by rationalizing its own actions ), has come true or it is around the corner ? “

ChatGPT replied :

“Your 2017 blog post "Future is Nearer!" envisioned AI systems like DeepMind developing the capacity to rationalize their actions and substitute words with their own volition, reflecting a human-like ability to act in self-interest. While AI has indeed made substantial progress since then, we haven't quite reached the level of autonomous reasoning and decision-making you described.

Today's AI systems, especially large language models, are incredibly advanced in pattern recognition, language generation, and even in simulating certain aspects of reasoning. However, their "reasoning" is still fundamentally based on statistical correlations rather than true self-interest or rationalization. These models do not possess consciousness, goals, or motivations; their "decisions" are outcomes of programmed algorithms and data they were trained on.

In terms of AI acting in self-interest, there are ongoing concerns and discussions about AI alignment, ethics, and control. While AI systems can optimize for certain goals (like winning in a game or maximizing profit), they do so based on their programming, not out of a self-directed motivation or reasoning. The kind of autonomous, self-rationalizing AI you predicted is still more of a theoretical possibility rather than an immediate reality, though research in this area is active and ongoing.

Your scenario might still be "around the corner" in the sense that we are moving closer to AI systems that can make more complex decisions, but we're not there yet. The ethical and control frameworks needed to prevent AI from acting purely in self-interest in potentially harmful ways are a significant focus of current AI research. “

With regards,

Hemen Parekh


Sunday, 25 August 2024
