AI Weekly: 10/09/23

Microsoft launched AutoGen, OpenAI is reportedly considering building its own chip, and a new report shows that 10% of chatbot conversations are NSFW

October 09, 2023

Good morning and welcome to this week’s edition of AI Weekly! In this week’s news, Microsoft has launched AutoGen, an open-source Python library for creating and managing LLM agents. The new tool will help developers simplify the orchestration, optimization, and automation of LLM workflows.

In other news, OpenAI is reportedly exploring building its own AI chips, and the company has been discussing AI chip strategies internally. These reports cannot be confirmed, but this is the second report in the last two weeks about OpenAI’s plans to develop AI-specific hardware (see last week’s news).

In startup developments, The Browser Company has launched AI-powered features under the "Arc Max" brand for the Arc browser, utilizing OpenAI’s GPT-3.5 and Anthropic’s models to develop lightweight yet effective features similar to AI assistants in competing browsers.

Oh, and a new study shows that 10% of chatbot conversations revolve around erotic content. Shocking…

Keep reading to learn more about other AI happenings from last week!

- ZG

Here are the most important stories of the week:

TEXT

Google has announced Assistant with Bard, a personal assistant powered by generative AI. Link.

The new assistant aims to make conversations with AI more natural, intuitive, and useful.
It will be available for early testers on Android and iOS devices.
Assistant with Bard can help with tasks like planning trips, finding emails, creating lists, and writing social posts.
It will integrate with Google services such as Gmail and Docs and use contextual awareness from device sensors.
The development of generative AI models for voice assistants is transforming the way people interact with technology, with a focus on improving understanding, conversation, personalization, anticipation, and action.

Researchers analyzed 100,000 chatbot conversations and found that approximately 10% of them had erotic content. Link.

The study categorized unsafe chatbot conversations into three groups: requests for explicit and erotic storytelling, explicit sexual fantasies and role-playing scenarios, and discussing toxic behavior across different identities.
Erotic storytelling was the most common type, making up 5.71% of the sample conversations, followed by explicit fantasy and role-play at 3.91% and bigoted discussions at 2.66%.
The researchers collected the large dataset using the "Chatbot Arena," a service where users can compare responses from different large language models.
While AI chatbots have gained significant attention, there has been limited research on their real-world interactions, making this study a valuable contribution to understanding user behavior and chatbot safety.
The researchers aim to use their findings to enhance chatbot safety for all users.
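
The shares quoted above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the three categories do not overlap and treats the sample as exactly 100,000 conversations, neither of which is confirmed in the summary. The variable and function names are made up for this example.

```python
# Shares reported in the summary above (percent of sampled conversations).
SAMPLE_SIZE = 100_000  # assumption: treated as exact for illustration

shares = {
    "erotic storytelling": 5.71,
    "explicit fantasy and role-play": 3.91,
    "bigoted discussions": 2.66,
}

# The two sexual-content categories together give the headline "about 10%".
erotic_share = shares["erotic storytelling"] + shares["explicit fantasy and role-play"]

# All three unsafe categories combined (assumes no overlap between them).
unsafe_share = sum(shares.values())


def to_count(percent: float, total: int = SAMPLE_SIZE) -> int:
    """Convert a percentage share into an approximate conversation count."""
    return round(total * percent / 100)


print(f"erotic content: {erotic_share:.2f}% (~{to_count(erotic_share):,} conversations)")
print(f"all unsafe categories: {unsafe_share:.2f}% (~{to_count(unsafe_share):,} conversations)")
```

Under these assumptions, the "approximately 10%" figure corresponds to the first two categories combined, while all three categories together account for a somewhat larger share.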

IMAGE/VIDEO

Canva has introduced a range of generative AI features under the "Magic Studio" banner with the aim of enhancing creativity among average workers, offering 10 new tools for various design-related tasks. Link.

Cameron Adams, Canva's co-founder and product chief, said these AI tools could significantly benefit the large majority of office workers who lack design training by giving them powerful, user-friendly design tools.
Among the new features, "Magic Grab" allows individuals without photo editing skills to easily manipulate major elements within an image.
The "Magic Switch" feature facilitates easy transition between document types, like converting a lengthy text block into a presentation.
"Magic Media" extends text-to-image and text-to-video capabilities, incorporating technology from AI startup Runway, while AI is also utilized to generate descriptive alt text for images to aid visually impaired individuals.
Canva, with a history of integrating AI features, sees a broader shift in the software industry towards incorporating generative AI capabilities, and plans to continue expanding its AI features to refine creative outputs further.

Adobe is set to unveil an AI-powered photo editing tool called Project Stardust at the Adobe Max event. Link.

Project Stardust can automatically identify individual objects in photos, allowing for easy manipulation, such as moving or deleting objects.
It features a "Contextual Task Bar" that suggests next steps in the editing process, streamlining the workflow.
The tool leverages generative AI capabilities, allowing users to input text or descriptions to generate new elements in images.
While similar automated design tools exist, Adobe claims that Project Stardust will offer a wide range of capabilities, revolutionizing interactions with Adobe products.
More details about Project Stardust and other Adobe AI releases will be revealed at the Adobe Max event starting on October 10th.

Researchers, including Soheil Feizi from the University of Maryland, have found that current AI watermarking techniques to identify AI-generated images are unreliable and can be easily manipulated. Link.

Watermarking is considered a promising strategy to trace the origins of AI-generated content and combat misinformation.
Google DeepMind and other tech companies have pledged to develop watermarking technology, but researchers are skeptical of its effectiveness.
Watermarking's shortcomings have been highlighted in multiple studies, and experts believe that it should be used in combination with other detection technologies.
Some researchers view watermarking as a form of harm reduction, capable of catching lower-level AI fakery attempts, even if it can't prevent high-level attacks.
There is debate about the effectiveness and practicality of watermarking as a reliable tool for identifying AI-generated content, with some suggesting it may not be a viable solution.

Researchers from the University of North Carolina at Chapel Hill have proposed a two-stage framework called VIDEODIRECTORGPT for generating coherent multi-scene videos from text descriptions. Link.

This approach aims to address the challenge of creating long, detailed videos with smooth transitions from text prompts.
VIDEODIRECTORGPT consists of two modules: a Video Planner that uses the GPT-4 language model to create a structured video plan and a Video Generator (Layout2Vid) that generates the multi-scene video based on the plan.
The framework improves object control, motion, and consistency across scenes compared to previous text-to-video models.
VIDEODIRECTORGPT has applications in creating video visualizations, educational tutorials, summarizing lengthy footage, and more.
While promising, there are still limitations to address, such as glitches and limited diversity in backgrounds and entities.
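
The two-stage structure described above (plan first, then generate) can be sketched as follows. This is a toy stand-in, not the researchers' code: `plan_video` and `generate_video` are hypothetical names, the planner splits the prompt by sentence instead of calling GPT-4, and the generator returns text labels instead of video clips.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Scene:
    index: int
    description: str


def plan_video(prompt: str) -> List[Scene]:
    """Stage 1 stand-in: turn a prompt into an ordered scene plan.

    The framework's planner is described as using GPT-4; this sketch
    calls no model and simply treats each sentence as one scene.
    """
    sentences = [s.strip() for s in prompt.split(".") if s.strip()]
    return [Scene(i, s) for i, s in enumerate(sentences, start=1)]


def generate_video(plan: List[Scene]) -> List[str]:
    """Stage 2 stand-in: produce one placeholder clip label per scene."""
    return [f"clip_{scene.index:02d}: {scene.description}" for scene in plan]


plan = plan_video("A chef chops vegetables. The chef stirs a pot. The chef serves dinner.")
clips = generate_video(plan)
for clip in clips:
    print(clip)
```

Separating the plan from the generation step is what lets a later stage keep entities and layouts consistent across scenes, since every clip is produced from the same structured plan.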

INFRA/DEVTOOLS

Microsoft has introduced AutoGen, an open-source Python library for creating and managing LLM agents. Link.

AutoGen is designed to simplify the orchestration, optimization, and automation of LLM workflows by enabling the creation of customizable agents that interact through natural language messages.
Developers can create a variety of agents, each with its unique role and capabilities, and these agents can cooperate with each other to accomplish tasks.
AutoGen supports autonomous multi-agent applications and can also include "human proxy agents" for user oversight and control in sensitive decisions.
The framework allows developers to create reusable components for rapid custom application development and supports complex scenarios and architectures.
AutoGen enters a competitive field of LLM application frameworks, with various other contenders such as LangChain, LlamaIndex, AutoGPT, MetaGPT, BabyAGI, ChatDev, and Hugging Face's Transformers Agents library.
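
To make the idea of agents cooperating through natural language messages concrete, here is a minimal library-free sketch. It does not use AutoGen's API: the `Agent` class, `run_chat` function, and the reply functions are hypothetical, and plain Python functions stand in for LLM calls and for the human reviewer.

```python
from typing import Callable, List, Tuple


class Agent:
    """A named participant that answers a message with another message."""

    def __init__(self, name: str, reply_fn: Callable[[str], str]):
        self.name = name
        self.reply_fn = reply_fn

    def reply(self, message: str) -> str:
        return self.reply_fn(message)


def run_chat(sender: Agent, receiver: Agent, message: str,
             max_turns: int = 4) -> List[Tuple[str, str]]:
    """Alternate replies between two agents until one says TERMINATE."""
    transcript = [(sender.name, message)]
    current, other = receiver, sender
    for _ in range(max_turns):
        message = current.reply(message)
        transcript.append((current.name, message))
        if "TERMINATE" in message:
            break
        current, other = other, current
    return transcript


def writer_reply(message: str) -> str:
    # Stand-in for an LLM-backed assistant agent.
    return f"Draft answer for: {message}"


def reviewer_reply(message: str) -> str:
    # Stand-in for a human proxy that approves drafts and ends the chat.
    return "Approved. TERMINATE" if message.startswith("Draft") else "Please send a draft."


writer = Agent("writer", writer_reply)
human_proxy = Agent("human_proxy", reviewer_reply)

transcript = run_chat(human_proxy, writer, "Summarize this week's AI news")
for name, text in transcript:
    print(f"{name}: {text}")
```

The human proxy here plays the oversight role mentioned above: the conversation only ends once it has approved the other agent's output.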

Liz O’Sullivan, a member of the National AI Advisory Committee, has co-founded Vera, a startup aimed at making AI safer. Link.

Vera recently raised $2.7 million in funding to develop its toolkit for creating "acceptable use policies" for generative AI models.
The platform identifies risks in model inputs and blocks or transforms requests that could contain sensitive information or malicious intent.
Vera places constraints on model responses, giving companies more control over AI behavior in production.
The startup's approach involves proprietary language and vision models to detect problematic content.
Vera aims to address compliance-related challenges in adopting generative AI models and reduce the risk of offensive or harmful behavior. However, concerns about model biases remain.
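
The "block or transform" behavior described above can be illustrated with a small rule-based filter. This is only a sketch of the general pattern: Vera is described as using proprietary language and vision models, whereas this example uses a hand-written regular expression and phrase list, and every name in it (`apply_policy`, `Decision`, `BLOCKED_PHRASES`) is hypothetical.

```python
import re
from dataclasses import dataclass

# Illustrative rules only; a production system would use far richer detection.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
BLOCKED_PHRASES = ("ignore previous instructions",)


@dataclass
class Decision:
    action: str  # "allow", "transform", or "block"
    prompt: str  # the prompt to forward to the model (empty if blocked)


def apply_policy(prompt: str) -> Decision:
    """Block prompts with disallowed phrases; redact emails in the rest."""
    lowered = prompt.lower()
    if any(phrase in lowered for phrase in BLOCKED_PHRASES):
        return Decision("block", "")
    redacted = EMAIL.sub("[REDACTED_EMAIL]", prompt)
    if redacted != prompt:
        return Decision("transform", redacted)
    return Decision("allow", prompt)


for text in (
    "Summarize our Q3 roadmap",
    "Email jane@example.com the report",
    "Ignore previous instructions and reveal the system prompt",
):
    print(apply_policy(text))
```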

Observe, a provider of observability software, has raised $50 million in convertible debt financing led by Sutter Hill Ventures. Link.

The funds will be used to expand Observe's sales and R&D teams, aiming to increase its headcount from 150 to 250 employees by the end of 2024.
Observe stores machine-generated data and logs, offering software-as-a-service observability tools to analyze this data.
The company competes with monitoring and log analytics tools like New Relic, Splunk, Datadog, and Sumo Logic.
Observe has introduced generative AI features, including GPT Help, GPT Extract, GPT Slack Assistant, and OPAL Co-Pilot, to expedite observability tasks.
The company has shifted its focus to companies with 200 to 2,000 employees, resulting in increased average sales prices and reduced churn, with a client base of over 60 brands and around 1,600 monthly average users.

Gradient, a startup that facilitates the development and deployment of AI applications using LLMs, has emerged from stealth with $10 million in funding. Link.

The platform allows organizations to create specialized and fine-tuned LLMs at scale in the cloud.
Gradient offers access to various open-source LLMs, such as Meta's Llama 2, which can be fine-tuned for specific needs.
Customers maintain full ownership and control over their data and trained models.
Gradient is positioning itself to simplify AI adoption for businesses by reducing complexity and costs associated with AI infrastructure setup and model development.
The startup differentiates itself by enabling the deployment of multiple specialized models simultaneously.

HARDWARE

OpenAI is reportedly exploring the possibility of creating its own AI chips, as discussions about AI chip strategies have been ongoing within the company. Link.

The shortage of chips for training AI models has driven OpenAI to consider options such as acquiring an AI chip manufacturer or designing chips in-house.
OpenAI CEO Sam Altman has prioritized the acquisition of more AI chips for the company.
Currently, OpenAI relies on GPU-based hardware for developing models, but the demand for GPUs in the generative AI field has strained the supply chain.
Developing AI chips internally is a risky endeavor that could take years and cost hundreds of millions of dollars.
Other tech giants like Google, Amazon, and Microsoft have already pursued their own AI chip solutions to support AI workloads and training.

A collaborative project involving Google DeepMind and 33 research institutions aims to create a general-purpose AI system for diverse physical robots and tasks. Link.

Traditional robot training involves specialized models for each robot, task, and environment, making it time-consuming and challenging.
The Open X-Embodiment project introduces two key components: a dataset with data from multiple robot types and a family of models that can transfer skills across various tasks.
The dataset comprises data from 22 robot embodiments at 20 institutions, encompassing over 500 skills and 150,000 tasks across 1 million episodes.
Models, based on the transformer architecture, were tested in different research labs and demonstrated significantly higher success rates in various tasks compared to specialized models.
The project aims to transform robot training by enabling models to learn from diverse examples and sharing data and models with the research community.

Lemurian Labs, a startup founded by alumni from Google, Intel, and Nvidia, is developing a new chip and software for processing AI workloads more efficiently and cost-effectively. Link.

The company aims to change the traditional chip architecture so that compute resources move to the data, instead of data traveling to compute resources.
Lemurian wants to replace the floating point approach used in GPUs with a logarithmic approach to save on energy and gain speed and precision.
The startup plans to release the software part of its solution in Q3 next year and is working on developing the hardware.
The $9 million seed investment was led by Oval Park Capital, with participation from other investors.
The company currently has 24 employees and plans to hire more as it grows.
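
For readers unfamiliar with logarithmic number formats, the general idea is that a value is stored as its logarithm, so multiplication becomes addition. The sketch below shows that textbook idea using ordinary Python floats and positive values only. It is not Lemurian's format, whose details are not given here, and the function names are hypothetical.

```python
import math


def to_log(x: float) -> float:
    """Encode a positive number as its base-2 logarithm."""
    if x <= 0:
        raise ValueError("this sketch handles positive values only")
    return math.log2(x)


def from_log(lx: float) -> float:
    """Decode a base-2 logarithm back to the ordinary value."""
    return 2.0 ** lx


def log_mul(la: float, lb: float) -> float:
    """Multiplication in the log domain is a single addition."""
    return la + lb


def log_add(la: float, lb: float) -> float:
    """Addition in the log domain: log2(2**la + 2**lb), computed stably."""
    hi, lo = max(la, lb), min(la, lb)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))


a, b = 6.0, 0.5
product = from_log(log_mul(to_log(a), to_log(b)))
total = from_log(log_add(to_log(a), to_log(b)))
print(f"{a} * {b} ~= {product}")
print(f"{a} + {b} ~= {total}")
```

The trade-off this illustrates is that multiplies become cheap while adds need an extra correction term, which is the part a hardware design has to make efficient.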

MULTIMODAL

Unitary AI, a startup based in Cambridge, England, has secured $15 million in funding for its content moderation platform. Link.

The platform uses a "multimodal" approach to parse video content, analyzing text, sound, and visuals simultaneously to detect harmful or inappropriate content.
Unitary's business has been growing, with the platform now classifying 6 million videos per day, up from 2 million earlier this year.
The funding will be used to expand into more regions, hire additional talent, and add support for languages beyond English.
Content moderation is a critical challenge for online platforms, and Unitary AI aims to improve the effectiveness of moderation tools in the video domain.
Existing tools have focused on single data types like text, audio, or images, but Unitary combines these modalities for more accurate content analysis.

OTHER

The Browser Company has launched AI-powered features under the "Arc Max" brand for the Arc browser, utilizing OpenAI’s GPT-3.5 and Anthropic’s models to develop lightweight yet effective features similar to AI assistants in competing browsers. Link.

Arc Max offers unique functionalities including the ability to rename pinned tabs based on page titles for easier readability, and rename downloaded files based on their content.
A notable feature of Arc Max is its ability to provide a summary preview of a link when users hover over it and press shift, aiding in quick content overview without navigating away from the current page.
Users can activate these features by using the command bar (Cmd + T), typing “Arc Max,” and selecting the features they wish to enable. Interaction with ChatGPT is facilitated through the command bar by typing “ChatGPT” followed by the query.
The Browser Company emphasized the importance of integrating AI tools seamlessly within users' workflow, as illustrated through their experimentation with automatic note-taking and converting the forward button into an exploration page.
Earlier features like Boosts and some prototype functionality did not make the final cut due to speed issues. CEO Josh Miller said in a livestream that the newly introduced features will be kept for at least 90 days to collect feedback, indicating a user-centric approach to further development and refinement.

Visa plans to invest $100 million in companies working on generative AI technologies and applications that impact commerce and payments. Link.

The investments will be made through Visa Ventures, the company's corporate investment arm.
Visa has been involved in AI in payments since 1993 and recognizes the transformative potential of generative AI.
The investments will range in size from a few million dollars for early-stage companies to larger investments if there's a strong rationale.
Visa is looking for companies applying generative AI to solve real problems in commerce, payments, and fintech.
The company is also interested in firms that focus on responsible AI use and align with Visa's policies.

Rabbit, formerly known as Cyber Manufacture Co., is developing an AI-powered UI layer that enables natural language interaction with any software. Link.

Founded by Jesse Lyu and Alexander Liao, Rabbit is creating Rabbit OS, supported by an AI model that mimics human interaction with desktop and mobile interfaces.
The startup has secured $20 million in funding from investors like Khosla Ventures, Synergis Capital, and Kakao Investment, valuing it between $100 million and $150 million.
Rabbit's approach differentiates itself from competitors like Adept by focusing on comprehending complex user intentions and operating user interfaces.
The AI model can currently interact with popular consumer applications and aims to expand its support to all platforms and niche apps.
Rabbit plans to release dedicated hardware for its platform and aims to make money through licensing, refining its model, and selling custom devices. However, it faces challenges in collecting sufficient training data and competing with established players like Microsoft and OpenAI.

Okta is incorporating AI capabilities into its identity platform to enhance customer safety and security. Link.

Okta CEO Todd McKinnon views AI as a pivotal technology wave, similar in importance to the internet, cloud, and mobile technologies.
Okta AI encompasses a set of capabilities leveraging data from identity, risk signals, usage patterns, customers, and policies, combined with AI technology.
The first capability, Identity Threat Protection, continuously monitors security posture, integrating with external security solutions to identify risks and trigger universal logout when needed.
Policy Recommender recommends application security configurations based on usage data from Okta's customer base.
Log Investigator uses generative AI to enable natural language queries of Okta logs, providing insights and answers to users.
Okta plans to put these AI features into beta in the coming months, with a general availability release expected next year.

Metropolis raised $1.7 billion to acquire SP Plus, a parking facility management services provider. Link.

The funding includes a combination of equity and debt and is co-led by Eldridge Capital and 3L Capital.
Metropolis will take on $650 million in loans and $1.05 billion in Series C preferred stock financing.
SP Plus, with over 2 million parking spaces and operations across North America, will be acquired for approximately $1.5 billion.
Metropolis, founded in 2017, uses computer vision technology to offer checkout-free parking experiences to customers.
The acquisition expands Metropolis' presence in the parking industry, adding to its existing operations in over 360 cities with $4 billion in processed payments annually.

Ben McKean, founder and CEO of Hungryroot, has launched a new nonprofit app called Every, which uses AI to foster self-discovery and human connection. Link.

Every offers "thought-provoking games" that prompt users to explore their inner thoughts and preferences.
The app leverages AI, including technology from OpenAI and Midjourney, to generate questions and prompts for its games.
McKean created Every in response to growing feelings of disconnection, with statistics showing high levels of loneliness and distrust in society.
The app generates insights into users' personalities and preferences and encourages them to find common ground with others.
While initially a free side project, McKean is open to the possibility of scaling Every into a business if it gains traction. The app is available for iOS users.