AI Weekly: 10/02/23

Amazon shakes up LLMOps and commits $4B to Anthropic, Sam Altman and Jony Ive meet to create an AI-first consumer device, and OpenAI integrates web browsing into ChatGPT

October 02, 2023

Good morning and welcome to this week’s edition of AI Weekly!

This week, Amazon made some major moves with its $4B commitment to Claude-creator Anthropic and its launch of five different tools to aid in the LLM development process.

In OpenAI news, Sam Altman met with renowned iPhone designer Jony Ive to discuss the creation of a new consumer hardware device tailored for the evolving AI era. SoftBank's CEO, Masayoshi Son, has also been involved in the discussions, advocating for its subsidiary, chip designer Arm, to have a central role in this project.

OpenAI has also enabled web browsing capabilities for ChatGPT Plus, providing users with an enormous new set of exciting use cases that come with the ability to leverage the internet’s sea of real-time data.

Keep reading to learn about Mistral AI’s new 7B-parameter LLM, Adobe’s web launch of Photoshop, and one startup that’s building a unified platform for trip discovery, planning, booking, and sharing with friends. Happy AIing!

- ZG

Here are the most important stories of the week:

TEXT

OpenAI enhanced ChatGPT's capabilities, enabling web browsing for updated information through a Microsoft Bing integration, enriching its responses with current, authoritative data. Link.

The browsing functionality is accessible to ChatGPT Plus and Enterprise users, selectable from a drop-down menu in the GPT-4 selector within the application.
Initially launched in March, the browsing feature was temporarily disabled due to users bypassing news paywalls, but has been reinstated with exclusions following "robots.txt" guidelines.
Microsoft's Bing Chat, powered by a newer OpenAI model, had earlier incorporated web browsing, offering a similar functionality.
The updated browsing feature in ChatGPT allows users to explore the web without leaving the ChatGPT interface, maintaining a seamless user experience.
The enhancement follows recent upgrades like image scanning, audio conversation capabilities, and the integration with the new image generation model, DALL-E 3, demonstrating OpenAI's continuous efforts to augment ChatGPT's functionality.
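
For readers curious what "following robots.txt" looks like in practice, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt content, the example.com URLs, and the "ChatGPT-User" user-agent string are illustrative assumptions, and the helper is_allowed is hypothetical; this is not a description of how OpenAI actually implements the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a news site might publish (an assumption
# for illustration, not taken from any real site).
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /premium/

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network call is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/premium/story",
        "https://example.com/free/story",
    ):
        print(url, is_allowed(ROBOTS_TXT, "ChatGPT-User", url))
```

A well-behaved crawler would run a check like this before each fetch and skip any URL the site has disallowed for its user agent.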

Amazon plans to invest up to $4 billion in Anthropic, with an initial commitment of $1.25 billion. Link.

The hefty investment in Anthropic signals Amazon’s major stride into the AI arena, aligning with tech giants who are already advancing in this sector.
Anthropic's AI models excel in safety by autonomously revising their responses, a feature that sets them apart in a competitive field focused on reliable AI.
Through this investment, Amazon secures a stake in Anthropic, paving the way for integrating pioneering AI technology into its diverse product range.
Despite the new alliance with Amazon, Anthropic sustains its relationship with Google, ensuring its innovative AI technology remains accessible on Google Cloud.
This strategic move by Amazon reflects the intensifying race among industry leaders to leverage and invest in AI, a testament to the technology's transformative promise.

Mistral AI, the French startup, unveiled its free large language model, Mistral 7B, aiming to support the open generative AI community. Link.

Users can download Mistral 7B through various channels, including a 13.4-gigabyte torrent, with a GitHub repository and Discord channel for collaborative development and troubleshooting.
The model is distributed under the Apache 2.0 license, permitting unrestricted use provided attribution is maintained.
Mistral 7B, while smaller than certain models, offers comparable capabilities but at a lower computational cost.
The release marks a milestone after three months of rigorous development by a team with experience from Meta and Google DeepMind.
While the model is free, more extensive access and certain features are part of Mistral's commercial offering.

Meta unveils Llama 2 Long, an advanced AI model, improving response generation for long user prompts. Link.

Llama 2 Long outperforms competitors like OpenAI’s GPT-3.5 Turbo and Claude 2 at handling long contexts.
Meta enhanced Llama 2 by continuing its training on 400 billion more tokens from longer text data sources.
A key modification to the Rotary Positional Embedding (RoPE) encoding improves the model's handling of longer sequences.
Reinforcement Learning from Human Feedback (RLHF) and synthetic data generated by Llama 2 were used to improve performance.
The community has shown excitement about Llama 2 Long's results, validating Meta's open-source approach to generative AI.
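
To give a feel for the RoPE point above, here is a small pure-Python sketch of rotary embeddings with a configurable base frequency. It is a simplified illustration rather than Meta's implementation: the function names are hypothetical, and the two base values (10,000 and 500,000) are assumptions chosen for the example, not figures from the story.

```python
import math


def rope_angles(position: int, dim: int, base: float) -> list[float]:
    """Rotation angle for each of the dim // 2 dimension pairs at a position."""
    return [position * base ** (-2 * i / dim) for i in range(dim // 2)]


def apply_rope(vec: list[float], position: int, base: float = 10_000.0) -> list[float]:
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by their RoPE angles."""
    out = []
    for i, theta in enumerate(rope_angles(position, len(vec), base)):
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q, k = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]
    # The score depends only on the distance between positions (2 in both cases).
    print(dot(apply_rope(q, 5), apply_rope(k, 3)))
    print(dot(apply_rope(q, 105), apply_rope(k, 103)))
    # A larger base gives smaller angles in the non-leading pairs at a far position.
    print(rope_angles(4096, 8, 10_000.0))
    print(rope_angles(4096, 8, 500_000.0))
```

The intuition is that a larger base makes most dimension pairs rotate more slowly, so positions far apart in a long sequence stay easier to tell apart.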

Google updated its AI chatbot Bard, integrating it with popular products like Gmail, Docs, Drive, aiming to outshine ChatGPT by leveraging its extensive user base across various applications. Link.

Bard Extensions theoretically allow live personalized data retrieval from Google services, enhancing AI assistance beyond static information.
Despite promising features, Bard disappoints in practice with poor integration, inaccurate responses, and a lack of creativity compared to OpenAI's GPT-4.
Bard's underlying model, PaLM 2, with a reported 340 billion parameters, falls short of GPT-4 and its reported 1.8 trillion parameters, affecting its response relevance and quality.
Initial tests reveal Bard's inability to efficiently handle tasks like document summarization, flight deal searches, or creative requests, failing to meet advertised capabilities.
Bard's sole redeeming feature is a built-in verification via Google Search to cross-check answers, highlighting its responses' unreliability and necessitating a significant improvement to keep Google competitive in AI-driven assistance.

IMAGE/VIDEO

Meta unveils 28 AI characters modeled after celebrities to engage users across its platforms, hinting at a resolution to the ongoing conflict in Hollywood regarding AI’s role in entertainment and actor compensation. Link.

These AI personas, built on the Llama 2 language model, interact with users in text chats offering expertise and entertainment aligned with their real-world counterparts’ personas.
The characters, derived from filmed animations using generative techniques, provide a unique user experience while retaining the “unique personality and tone” of each celebrity.
Current interaction is text-based with plans for audio integration in the future, aiming to surpass previous attempts like Amazon’s celebrity Alexa voices.
The business model behind celebrity compensation for this initiative remains undisclosed, but parallels to existing creator payment models on Instagram and other Meta platforms are noted.
The launch emphasizes the rising capability of AI in replicating human-like interactions, paving the way for enhanced user engagement and potential new revenue streams in the digital entertainment sphere.

Getty Images, in collaboration with Nvidia, launches Generative AI by Getty Images, a tool enabling users to create images using Getty's licensed photo library and Nvidia's Edify model, ensuring legal protection for commercial usage of generated images. Link.

The tool performed impressively in generating realistic human figures in testing, showcasing its capacity to be trained on actual photos rather than just illustrated art.
The generated images will not be included in Getty's standard content libraries; however, creators will be compensated if their AI-generated images are utilized to train current and future models, with revenue-sharing mechanisms in place.
The tool imposes restrictions on the type of images users can create, disallowing the generation of images involving real individuals or mimicking real-life events to prevent potential manipulation or misrepresentation.
Users will have perpetual, worldwide, and unlimited rights to their created images, although the technical copyright status of AI-generated images remains a complex issue.
Getty’s move into the AI image generation domain helps secure its vast image library against unauthorized use by other AI developers, amidst ongoing copyright concerns within the creative community.

Adobe officially launched the web version of Photoshop for all users on paid plans, after nearly two years in beta testing. Link.

The web version includes Firefly-powered AI tools like generative fill and generative expand aimed at enhancing image editing capabilities.
Tools on the web version are organized based on workflows such as image reproduction or object selection, and tool names are fully displayed for beginner ease.
Collaboration is facilitated as users can share file links with others, regardless of their subscription status.
The web version retains most of the desktop version's tools, excluding some like the patch tool, pen tool, smart object support, and polygonal lasso, though Adobe is working to integrate these missing tools.
Currently, Adobe does not plan to offer a free or freemium version of Photoshop on the web, restricting access to users with paid plans.

Tubi is testing "Rabbit AI," utilizing OpenAI's GPT-4 for enhanced content discovery. Link.

The tool aims to provide highly personalized recommendations, enhancing user experience by navigating Tubi's vast content library and allowing users to inquire about specific content genres.
Features include bookmarking recommendations and saving search history termed "Rabbit Holes" for easy revisit.
The tool is initially available on Tubi's iOS app for beta testing, with plans for broader availability.
OpenAI subscribers will have access to the Rabbit AI plug-in for ChatGPT, indicating a shared utility beyond Tubi.
This initiative reflects a growing trend among media platforms, like YouTube and Amazon, investing in AI for improved user engagement.

SPEECH/AUDIO

Spotify's CEO, Daniel Ek, will not entirely ban AI-generated content but opposes its use to impersonate real artists without consent. Link.

Ek has identified three AI use categories in music: acceptable auto-tuning, unacceptable artist impersonation, and a controversial middle ground of AI-inspired music.
Ek foresees a long-term debate on AI's role in music, particularly concerning artist impersonation and AI-inspired music.
In the past, Spotify has experienced attempts to game their system by users uploading impersonated or falsely attributed music.
Spotify prohibits the use of its platform’s content to train AI models, reflecting concerns over AI's impact on music creation.
The upcoming UK-hosted AI safety summit in November reflects broader global scrutiny of AI's implications for jobs, copyright, and various sectors.

CODE/DEVTOOLS

AWS unveiled five generative AI innovations to facilitate the development of generative AI applications, aimed at boosting employee productivity and business transformation. Link.

Amazon Bedrock, now generally available, simplifies access to foundation models, helping businesses experiment with and customize top foundation models securely with their proprietary data.
Amazon Titan Embeddings, a large language model, has been launched to ease the implementation of Retrieval-Augmented Generation (RAG) for organizations, supporting an array of languages and a longer context length for better understanding of text.
Meta’s Llama 2 model, coming soon to Amazon Bedrock, offers enhanced training and a longer context length for improved dialogue use cases, giving customers a wider choice of models for generative AI.
A new feature in Amazon CodeWhisperer enables secure customization using private code bases, boosting developer productivity by providing tailored code suggestions.
Generative BI authoring capabilities in Amazon QuickSight have been introduced to streamline the creation and customization of visuals for business analysts using natural-language commands, saving them time and letting them focus on higher-value tasks.
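
For anyone new to RAG, the retrieval step mentioned above can be sketched in a few lines of Python: embed the documents and the question, rank documents by cosine similarity, and place the best match in the prompt. The toy_embed function below is a hypothetical bag-of-words stand-in for a real embedding model such as Titan Embeddings; no AWS API is called, and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, toy_embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Prepend the retrieved context to the question for a downstream model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    docs = [
        "refund policy: refunds are issued within 30 days",
        "shipping takes five business days",
        "our office is closed on public holidays",
    ]
    print(build_prompt("how many days until refunds are issued", docs))
```

In a real system the embeddings would come from a trained model and live in a vector store, but the ranking logic follows the same shape.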

Zapier introduced Canvas, a new tool designed to aid users in planning and diagramming business processes with the assistance of AI. Link.

The company also announced the general availability of Tables, an automation-centric database service, to all users.
Zapier has grown over the years from a basic tool for connecting web services into a platform for complex integrations and workflow automations, and Canvas and Tables are part of its response to the challenges that have emerged along the way.
Canvas serves as a visual diagramming tool enabling users to map out processes end-to-end, with the flexibility to edit components connected to Zapier within Canvas, aiming for an all-encompassing editing ability in the future.
An AI component within Canvas allows users to input the problem they are facing, with the service generating a process solution, enhancing problem-solving with automated process suggestions.
In conjunction with launching Canvas and Tables, Zapier is also rolling out several smaller feature updates including a new interactive editor, more administrative controls, and additional integrations, broadening its service offering and usability.

POLICY/LAW/ETHICS

The Writers Guild of America (WGA) ended a nearly five-month strike by reaching an agreement with Hollywood studios, allowing writers to resume work under new contract terms. Link.

A significant issue during the strike was the potential use of AI tools like ChatGPT by studios, possibly impacting writers' earnings.
The new contract stipulates that AI cannot be utilized for writing or rewriting scripts, preventing loss of writing credits to AI.
Although writers have the choice to use AI tools individually, studios cannot enforce the use of certain AI tools during a project.
The contract also reserves WGA's right to challenge the use of writers' material for AI training.
This agreement potentially sets a standard for defining AI usage limits in creative fields, amidst ongoing discussions on AI's role in such industries.

OTHER

OpenAI is in advanced discussions with Jony Ive, former Apple designer, and SoftBank for a $1 billion project aimed at creating a groundbreaking AI hardware device, described as the "iPhone of artificial intelligence." Link.

OpenAI's CEO, Sam Altman, is reportedly engaging with SoftBank's Masayoshi Son regarding the funding of this venture, with Jony Ive's company LoveFrom spearheading the development of OpenAI's first consumer device.
The collaboration seeks to enhance AI-user interaction, drawing inspiration from the transformative user experience of iPhone's touchscreen.
The negotiations involve a significant investment from SoftBank, potentially creating a new company harnessing the talent and technology from all three firms.
SoftBank is also advocating for its subsidiary, chip designer Arm, to have a central role in this project.
Although the discussions are serious, a formal agreement has not been reached, indicating that a product release may take several years.

Blend, a UK-based startup, leverages AI to provide personalized online shopping recommendations for users, aiding them in finding products aligned with their style, size, and budget amidst the overwhelming online fashion market. Link.

Differing from traditional personalization based on historic purchase data, Blend's AI evolves with user interaction, trends, and style changes over time, ensuring a more tailored shopping experience.
Blend aims to create a seamless user-centric platform, amassing a waiting list of 2,000 users and partnerships with over 250 retailers, including luxury retailer Net-a-Porter.
Targeting digitally savvy 18 to 34-year-olds, Blend's go-to-market strategy is to start in the UK, with intentions to penetrate the US market.
Blend’s business model is threefold: engaging shoppers with personalized recommendations, collaborating with influencers for content and brand promotion, and partnering with brands to offer a diversified, style-specific advertising platform, all while planning to monetize through commission on sales, and potentially, a subscription service for premium features.
Through user-generated content, like images and reviews, Blend also aims to address the common online shopping challenge of finding the right fit, aiming to reduce return rates for retailers and improve overall customer satisfaction.

Vancouver-based startup, Pilot, aims to enhance the global travel experience by creating a unified platform using AI for trip discovery, planning, booking, and sharing with friends. Link.

The brainchild of serial entrepreneur Connor Wilson, Pilot quickly amassed over 20,000 users with no proactive marketing, hinting at a significant market need for socially integrated travel planning solutions.
Unlike typical social networks or AI travel agencies, Pilot emphasizes facilitating connections and collaborations among existing friends, family, and partners, ensuring users have full control over their shared travel plans.
The platform's AI-driven feature, Quickstart, produces personalized trip itineraries that users can modify, supports accommodation and flight bookings, and subsequently lets users share their experiences on its blog.
Operating on an affiliate model, Pilot monetizes by earning commissions from vendor bookings made through the platform, although its current focus is on community growth and platform development rather than immediate revenue.
With a web app available globally and a mobile app on the way, alongside about $650,000 in angel funding, Pilot is seeking to raise $4 million to enhance the platform's social functionality and expand to Latin American and Asian markets.

DeepMind's large language model (LLM) Chinchilla 70B achieved significant lossless compression, reducing ImageNet image patches to 43.4% of their original size and outperforming the PNG algorithm, which compressed them to 58.5%. Link.

For audio, Chinchilla compressed LibriSpeech data to 16.4% of raw size, better than FLAC compression at 30.3%.
Lossless compression, unlike lossy techniques like JPEG, retains all data during compression; the results highlight Chinchilla 70B's efficiency across different data types despite its being trained primarily on text.
Effective data compression is viewed by some as a form of general intelligence, as it involves identifying patterns and making sense of complexity, akin to understanding the world.
The study explored generating new data with compression algorithms like gzip and Chinchilla; Chinchilla performed better in generative tasks due to its design for language processing.
While not peer-reviewed, the DeepMind paper suggests large language models could serve in novel applications, with the relationship between compression and intelligence continuing to be a subject of research and debate.
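
The link between prediction and compression can be made concrete with a short Python sketch. With arithmetic coding, a model that assigns probability p to the next symbol can encode it in roughly -log2(p) bits, so better predictions mean smaller files. The code below compares zlib against the ideal code length under a very simple adaptive byte-frequency model, which is a hypothetical stand-in for a language model and not DeepMind's setup; the sample data is made up.

```python
import math
import zlib


def compression_ratio(raw_size: int, compressed_size: float) -> float:
    """Compressed size as a percentage of the raw size (lower is better)."""
    return 100.0 * compressed_size / raw_size


def ideal_code_length_bytes(data: bytes) -> float:
    """Bytes an ideal arithmetic coder would need under an adaptive model.

    The model predicts each byte from counts of the bytes seen so far,
    and each byte costs -log2(probability) bits.
    """
    counts = [1] * 256  # add-one smoothing so no byte has zero probability
    total = 256
    bits = 0.0
    for b in data:
        bits += -math.log2(counts[b] / total)
        counts[b] += 1
        total += 1
    return bits / 8


if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog " * 200
    zlib_size = len(zlib.compress(data, 9))
    print("zlib ratio (%):", compression_ratio(len(data), zlib_size))
    print("model ratio (%):", compression_ratio(len(data), ideal_code_length_bytes(data)))
```

Swapping the byte-frequency model for a stronger predictor is, in spirit, what using a large language model as a compressor amounts to.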