Artificial Intelligence was seen as a novelty in 2023, with ChatGPT having launched just a month before the start of that year. Microsoft CEO Satya Nadella, after investing $10 billion in OpenAI, threw down the gauntlet to Google, taunting the DeepMind owner to show what its AI could do. Google stumbled with its half-baked Bard bot, which spewed out inaccurate answers. In the months that followed, the Alphabet-owned company went through a baptism of fire and finally came out with a rechristened AI, Gemini, by the end of 2023.
Building on that momentum, the search giant steadily rebuilt its reputation and began infusing Gemini capabilities in almost all of its products and services in 2024. Google’s recent AI advances, including the second generation of Gemini, the Trillium AI accelerator chip, and breakthroughs in quantum computing with the Willow chip, have significantly boosted investor confidence, driving its stock price to a record high.
OpenAI did not give Google’s Gemini an easy run. The competition between the two Silicon Valley giants was intense, with both firms launching advanced AI models with improved reasoning capabilities.
OpenAI’s o3 model built on top of its predecessor, o1, by focusing on enhanced reasoning skills, outperforming previous models in complex coding and advanced mathematics. Similarly, Google’s Gemini 2.0 Flash Thinking model answered complex questions by outlining its thought process, enhancing the model’s reasoning capabilities.
OpenAI’s o3 model, with its advanced reasoning capabilities, garnered more attention from Microsoft as the tech giant relies on OpenAI models for its AI assistant, Microsoft 365 Copilot. While o3 promises better performance, its increased cost and computation time are significant considerations for business applications.
Anthropic’s Claude and Mistral AI
It wasn’t just Google’s turnaround this year. Product launches and updates from a clutch of other AI companies have, in some instances, stolen OpenAI’s thunder. Amazon-backed Anthropic upgraded its AI model, Claude 3.5 Sonnet, with a “computer use” capability. The feature enables the AI to autonomously perform tasks such as moving the cursor, typing, and browsing the Internet, effectively automating complex computer interactions. This development aims to enhance productivity, particularly for software developers, by allowing the AI to execute multi-step actions with minimal human intervention.
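In practice, developers enable this capability by declaring a “computer” tool in a request to Anthropic’s Messages API. The sketch below only constructs such a request payload (no network call is made); the specific identifiers (`computer_20241022` tool type, `computer-use-2024-10-22` beta flag, model name) follow Anthropic’s late-2024 beta documentation and should be verified against current docs.

```python
# Hedged sketch: build an Anthropic Messages API request payload that
# enables the beta "computer use" tool. Identifiers below are taken
# from Anthropic's late-2024 beta docs and may change; no request is
# actually sent here.

def build_computer_use_request(prompt: str) -> dict:
    """Return a request payload declaring the beta 'computer' tool."""
    return {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "betas": ["computer-use-2024-10-22"],  # beta opt-in flag
        "tools": [
            {
                "type": "computer_20241022",   # beta tool type
                "name": "computer",
                "display_width_px": 1024,      # virtual screen size the
                "display_height_px": 768,      # model reasons about
            }
        ],
        "messages": [{"role": "user", "content": prompt}],
    }

request = build_computer_use_request(
    "Open the browser and search for today's weather."
)
```

The model then responds with tool-use actions (cursor moves, clicks, keystrokes) that the caller’s own harness must execute and report back, which is how the multi-step automation loop described above is closed.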
Anthropic also made strides this year in how it implemented safety measures. During the U.S. presidential election, it used its Clio tool to analyse AI usage and ensure responsible deployment. In another instance, the company gave the UK’s AI Safety Institute early access to test its Claude 3.5 Sonnet model.
French AI company Mistral, for its part, brought a measure of transparency to AI with its open-weight models, including Mistral 7B and Mixtral 8x7B, designed for customisation and deployment across various applications. These models were available under open licences, promoting accessibility and innovation within the AI community.
Mistral Large 2, the startup’s flagship model, was integrated into IBM’s Watsonx platform, offering enhanced capabilities in code generation, mathematics, and reasoning.
In November, the France-based startup expanded into the US by establishing an office in Palo Alto, California, a strategic move aimed at attracting top AI talent and enhancing the company’s sales operations. Such moves show how Silicon Valley continues to be the epicentre for top tech talent.
Mistral AI also collaborated with Qualcomm to bring new generative AI models to devices powered by Snapdragon and Qualcomm platforms, indicating a focus on enhancing AI accessibility and performance in consumer electronics.
Meta’s Llama
Meta’s large language models (LLMs), though late entrants to the AI race, aren’t far behind. In some ways, Facebook’s parent company can be credited with modularising AI models through its Llama family.
In April, Meta released Llama 3, offering models with 8 billion (8B) and 70 billion (70B) parameters. These models were pre-trained on approximately 15 trillion tokens from publicly available sources, with fine-tuning on over 10 million human-annotated examples. With strong performance in coding, reasoning, and multilingual support, Llama 3 was positioned by Meta as an open-source AI model.
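Those figures give a sense of the scale involved. A widely used rule of thumb estimates total training compute as roughly 6 × N × D floating-point operations, where N is the parameter count and D the number of training tokens; the factor of 6 is an approximation, not a figure from Meta. Applied to the article’s numbers:

```python
# Back-of-the-envelope training-compute estimate using the common
# ~6 * N * D FLOPs rule of thumb (N = parameters, D = training tokens).
# Parameter and token counts are from the article; the 6ND factor is a
# standard approximation, not an official Meta figure.

def approx_training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs as 6 * N * D."""
    return 6 * n_params * n_tokens

llama3_8b = approx_training_flops(8e9, 15e12)    # 8B params, 15T tokens
llama3_70b = approx_training_flops(70e9, 15e12)  # 70B params, 15T tokens

print(f"{llama3_8b:.1e}")   # on the order of 10**23 FLOPs
print(f"{llama3_70b:.1e}")  # on the order of 10**24 FLOPs
```

Even the smaller 8B model lands at roughly 7 × 10²³ operations by this estimate, which is why training frontier models requires the dedicated accelerator fleets discussed elsewhere in this piece.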
In July, Llama 3.1 expanded the context windows of the 8B and 70B parameter models, and Meta also launched a 405 billion (405B) parameter model. In September, Meta introduced Llama 3.2, featuring models with 1B, 3B, 11B, and 90B parameters. This version marked a significant milestone by incorporating multimodal capabilities, allowing the models to process both text and images. Additionally, Llama 3.2 was optimised for deployment on edge and mobile devices, broadening its applicability.
Llama models were gradually integrated into Facebook, Instagram, and WhatsApp, enhancing user experiences with AI-driven features like real-time translation and content generation. By the end of 2024, Llama had seen wide adoption, with over 650 million downloads, per a company blog citing figures from Hugging Face.
Meta’s Llama and Mistral AI offer an open-source alternative to OpenAI-style closed-source AI models. Open-source models allow a broader audience to engage with AI technology, facilitating collaborative development and rapid innovation. This openness contrasts with the closed nature of proprietary models, which restrict access to underlying code and data.
Developers can customise open-source LLMs to specific needs, enhancing versatility across diverse applications. This adaptability is often limited in closed-source models due to proprietary restrictions. While open-source models offer benefits, they may face challenges in matching the performance and specialised capabilities of proprietary systems. Proprietary models often have access to extensive resources and data, enabling them to achieve higher performance in certain tasks. And success in AI depends on the kind of data the models are trained on.
Apple Intelligence and the on-device experience
In 2024, AI models were increasingly deployed across various devices to bring their capabilities closer to end-users, enhance functionality, and provide real-time experiences. For instance, in smartphones and tablets, Apple’s Neural Engine, Qualcomm’s Snapdragon AI Engine, and Google’s Tensor chipsets were integrated with LLMs and vision models to power various tasks, including voice assistants like Siri and Google Assistant, real-time image processing for photo enhancements and augmented reality, and multimodal AI for speech, text, and image-based queries.
Apple’s incorporation of AI features into its latest iPhones, such as Visual Intelligence and Image Playground, has revitalised consumer interest, contributing to a significant increase in iPhone sales. This integration has positioned Apple on the brink of becoming the first company to surpass a $4 trillion market capitalisation.
PC manufacturers are fully embracing AI by integrating accelerators such as NVIDIA GeForce RTX series GPUs, AMD Radeon chips, and Apple’s M-series chips. These accelerators allow generative models and advanced features to run on device, including AI-powered transcription and video-editing tools, enhanced productivity software, and gaming enhancements such as real-time ray tracing and AI-driven non-playable characters (NPCs).
Talent is key
At the core of the current AI revolution lies a dual-engine system. One engine trains models on data, while the other uses those trained models to make inferences. Operating these engines effectively takes more than hardware resources; human expertise plays a crucial role.
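The two engines can be illustrated with a deliberately tiny sketch: a training step that fits a parameter to data by gradient descent, and an inference step that applies the fitted parameter to new input. This is a toy linear model for illustration only, not a recipe for how the frontier labs train LLMs.

```python
# Toy illustration of the dual-engine split: training (fit a parameter
# to data) versus inference (apply the fitted parameter to new input).
# A one-weight linear model, purely illustrative.

def train(xs, ys, lr=0.01, epochs=500):
    """The 'training engine': fit y ~ w * x by gradient descent on MSE."""
    w = 0.0
    for _ in range(epochs):
        # Gradient of mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

def infer(w, x):
    """The 'inference engine': apply the trained parameter to new input."""
    return w * x

w = train([1, 2, 3, 4], [2, 4, 6, 8])  # underlying relationship: y = 2x
prediction = infer(w, 10)
```

Training is the expensive, compute-hungry phase; inference is cheap per call but runs constantly in production, which is why the two are often served by different hardware and different specialist teams.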
This has become a pain point for OpenAI in particular, as it bled top talent this year. A notable departure from the AI giant was co-founder Ilya Sutskever. The company also lost its chief technology officer, Mira Murati, and several other leading computer scientists.
To make matters worse, some of the departing members joined rivals. For instance, John Schulman, another OpenAI co-founder and a key figure in the creation of ChatGPT, moved to Anthropic.
Per a BCG analysis of top skilled labour, AI experts are the most mobile, with nearly 11 out of 100 moving internationally every five years. So, where they go can make a big difference.
Companies and countries that attract the best talent can gain a competitive edge in the tech world. Countries that are open to global talent tend to invent more and grow faster. As countries compete for leadership in AI, their ability to attract top talent is crucial for their success.
Published - December 27, 2024 08:58 am IST