AI Weekly: 05/15/23
Google releases many new AI products, Anthropic upgrades Claude, and more regulation comes out of Europe
Good morning and welcome to this week’s edition of AI Weekly! The big headline from this past week’s news revolves around the slew of new AI products Google announced during its annual I/O developer conference, including a GitHub Copilot competitor, a text-to-music model, and a new application that supercharges note-taking.
Anthropic has also announced that it has expanded Claude’s context window to 100k tokens, meaning users can now input prompts over 10x the size of the previous context window.
In regulation news, MEPs in the European Parliament have voted on amendments to the draft AI legislation, including requirements for foundational models that underpin generative AI technologies like ChatGPT. Read about all that and more AI news below!
- ZG
Here are the most important stories of the week:
TEXT
Claude's context window has been expanded from 9K to 100K tokens, allowing businesses to submit large volumes of text for analysis. Link.
Claude can read and analyze 100K tokens in less than a minute, while it would take an average person over 5 hours.
The expanded context window enables Claude to retrieve information from documents and answer complex questions that require synthesis of knowledge.
Partners are excited about the possibilities of larger context windows for their businesses, such as digesting financial statements, analyzing legislation, and summarizing dense documents (a minimal API sketch follows this item).
With 100K context windows, Claude can also identify risks, themes, and arguments in legal documents and provide answers to technical questions.
The expanded capability allows for rapid prototyping by integrating entire codebases and building upon or modifying them intelligently.
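As a rough illustration of what "submit a whole document in one prompt" looks like in practice, here is a minimal sketch against Anthropic's text-completion HTTP endpoint as publicly documented at the time; the model name and request fields are assumptions drawn from that documentation, not an official example.

```python
import os
import requests

# Minimal sketch: ask Claude to analyze one very long document in a single prompt.
# Endpoint, request fields, and the "claude-v1-100k" model name follow Anthropic's
# public completion API as documented at launch; treat them as assumptions.
with open("annual_report.txt") as f:  # hypothetical long document
    document = f.read()

prompt = (
    f"\n\nHuman: Here is a financial report:\n\n{document}\n\n"
    "Summarize the key risks and themes in five bullet points."
    "\n\nAssistant:"
)

response = requests.post(
    "https://api.anthropic.com/v1/complete",
    headers={
        "x-api-key": os.environ["ANTHROPIC_API_KEY"],
        "content-type": "application/json",
    },
    json={
        "model": "claude-v1-100k",        # 100K-context variant
        "prompt": prompt,
        "max_tokens_to_sample": 1000,
        "stop_sequences": ["\n\nHuman:"],
    },
)
print(response.json()["completion"])
```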
Google announces Project Tailwind, an AI-powered notebook tool aimed at organizing and summarizing freeform notes. Link.
Users select files from Google Drive, and Tailwind creates a private AI model specialized in that information, along with a personalized interface for navigating the notes and documents.
Tailwind showcases PaLM 2, the successor to Google's PaLM language model, which powers many of the tool's features through a newly launched API (sketched briefly after this item).
The tool can generate study guides, suggest questions, create reading comprehension quizzes, and answer natural language questions about the notes with source citations.
The target audience for Tailwind includes college students, writers, analysts, and lawyers, among others.
Tailwind is currently in the preview stage, and Google intends to observe its usage to inform future projects. The waitlist for Project Tailwind is open to U.S.-based users.
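Tailwind itself is gated behind a waitlist, but the PaLM API it showcases is open to developers. Below is a minimal sketch of generating Tailwind-style study questions with the `google-generativeai` Python client; the model name and call are taken from the public PaLM API documentation of the time and should be read as assumptions, not Tailwind's internals.

```python
import google.generativeai as palm

# Minimal sketch of the newly launched PaLM API (client, model name, and
# parameters per the public docs of the time; treat specifics as assumptions).
palm.configure(api_key="YOUR_PALM_API_KEY")

notes = "Photosynthesis converts light energy into chemical energy stored in glucose..."

completion = palm.generate_text(
    model="models/text-bison-001",
    prompt=f"Turn these lecture notes into a five-question reading quiz:\n{notes}",
    max_output_tokens=512,
)
print(completion.result)
```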
The nonprofit Allen Institute for AI Research (AI2) is developing an open language model called Open Language Model (OLMo) in collaboration with AMD, the Large Unified Modern Infrastructure consortium, Surge AI, and MosaicML. Link.
OLMo aims to bridge the gap between public and private research capabilities in language models and provide a platform for the research community to use and improve its components.
AI2 plans to make everything it creates for OLMo openly available, including a public demo, training dataset and API, with limited exceptions under suitable licensing.
OLMo differentiates itself by focusing on leveraging and understanding textbooks and academic papers, making it suited for scientific and academic applications.
AI2 aims to address ethical and legal concerns by working with legal experts, reassessing privacy and intellectual property rights issues, and establishing an ethics review committee.
OLMo will have around 70 billion parameters and will be trained on LUMI's supercomputer in Finland. AI2 is inviting collaborators to contribute to the model's development process.
IMAGE/VIDEO
Amazon is forming a new team to develop AI tools for generating photos and videos for merchants on its platform, aiming to expand its ad business. Link.
The company intends to place ads across various Amazon businesses, including video streaming, sports ads, music streaming, and in-store digital ads.
Amazon faces competition from Meta and Google, which are also working on similar AI projects in advertising.
Other companies like Scale AI and Shopify already offer generative AI products for online merchants.
Amazon's advertising revenue was $38 billion last year, and it aims to strengthen its existing businesses with more powerful selling tools.
The use of AI in customer service, content generation, and chat systems is becoming more prevalent, but concerns exist about potential misuse and the need for responsible deployment.
SPEECH/AUDIO
Google has released MusicLM, an experimental AI tool that can convert text descriptions into music, available in the AI Test Kitchen app on web, Android, and iOS. Link.
Users can input prompts specifying the genre, instruments, and mood to generate multiple versions of a song.
Google initially had no plans to release MusicLM due to ethical challenges, including the potential use of copyrighted material in generated songs.
Google has been working with musicians and hosting workshops to explore MusicLM's creative potential; as a result, the version released in AI Test Kitchen will not generate music featuring specific artists or vocals.
The use of generative AI in music creation raises legal and copyright concerns, with record labels taking action against AI-generated songs that infringe on intellectual property.
Ongoing lawsuits are likely to shape the legal landscape regarding music-generating AI and the rights of artists whose work is used to train AI systems without consent.
CODE/DEVTOOLS
Hugging Face has released Transformers Agent, a tool that lets users control over 100,000 Hugging Face models by conversing with a natural-language interface built on Transformers and Diffusers. Link. Link.
Transformers Agent provides a natural language API and curated tools to address complex, multimodal challenges.
The tools focus on specific tasks and can be used to prompt the agent to perform a given task.
There are two APIs available: single execution (run) and chat-based execution (chat), each with its own advantages (both are sketched after this item).
Code execution is limited to Hugging Face functions and the print function, reducing the risk of attacks.
Hugging Face recently partnered with ServiceNow to develop StarCoder, an open-source language model for code that outperforms larger models and rivals closed models like OpenAI's code-cushman-001.
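To give a feel for the interface, here is a minimal sketch using the `HfAgent` class and the StarCoder inference endpoint shown in Hugging Face's announcement; the prompts and the tools the agent ends up picking are illustrative assumptions.

```python
from PIL import Image
from transformers import HfAgent

# Minimal sketch of Transformers Agent, backed by the hosted StarCoder endpoint
# (class and endpoint per the Hugging Face announcement; prompts are illustrative).
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

# Single execution (run): one self-contained instruction, the agent picks the tools.
image = Image.open("photo.png")  # hypothetical local image
caption = agent.run("Caption the following image.", image=image)
print(caption)

# Chat-based execution (chat): state is kept across turns.
agent.chat("Generate an image of a lighthouse at sunset.")
agent.chat("Now transform that image so it looks like a watercolor painting.")
```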
Google announces a competitor to GitHub's Copilot and a chat tool for coding and Google Cloud services. Link.
These tools are powered by Codey, a large language model based on Google's PaLM 2, trained specifically for coding-related prompts and queries related to Google Cloud.
The model was trained on a large corpus of permissively licensed open-source code, as well as internal Google code, samples, and reference applications.
Developers can access these tools through extensions for Visual Studio Code, JetBrains IDEs, the Google Shell Editor, and Google's cloud-hosted Workstations service.
The code generation features support Go, Java, JavaScript, Python, and SQL, and developers can interact with the model through a chat box in their IDE or by writing comments in a text file.
Beyond code generation, Google aims to use these models, through the chat interface, to help developers manage services on Google Cloud, including deploying and scaling applications. The goal is a more human-centric, goal-oriented cloud experience.
Builder.ai, a turnkey app development platform, has secured a strategic collaboration and equity investment from Microsoft. Link.
The collaboration will involve joint work on AI solutions, with Builder's "Natasha" AI being available to Microsoft Teams users for app and software development.
Microsoft Teams has surpassed 280 million monthly active users.
The partnership will include integrations across Azure OpenAI Service and other Azure Cognitive Services with Builder.ai's software assembly line.
Builder.ai aims to empower everyone to be a developer, and the collaboration with Microsoft is expected to bring their combined power to businesses worldwide.
The exact amount of Microsoft's equity investment in Builder.ai has not been disclosed but is described as significant.
MULTIMODAL
Meta has introduced ImageBind, an AI model capable of binding data from six modalities simultaneously without explicit supervision. Link.
It recognizes relationships across images and video, audio, text, depth, thermal imagery, and inertial measurement unit (IMU) readings to analyze different forms of information.
ImageBind learns a single embedding space to bind multiple sensory inputs together, allowing for audio-based search, cross-modal search, multimodal arithmetic, and cross-modal generation (a conceptual sketch follows this item).
The model achieves state-of-the-art performance on emergent zero-shot recognition tasks across modalities, surpassing specialized models trained for specific modalities.
ImageBind enables machines to better understand and analyze multimodal data, improving the capabilities of AI systems.
The open-source ImageBind model enhances the recognition performance of AI and has various potential applications.
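Because everything lands in one embedding space, cross-modal retrieval reduces to plain vector similarity. The sketch below is conceptual: the random tensors stand in for outputs of ImageBind's per-modality encoders (hypothetical `embed_text` / `embed_audio` calls noted in comments), and only the similarity math is meant literally.

```python
import torch
import torch.nn.functional as F

# Conceptual sketch of a joint embedding space: all modalities map to vectors of
# the same dimension, so retrieval is just cosine similarity. The random tensors
# below stand in for hypothetical embed_text(...) / embed_audio(...) encoder calls.
def cross_modal_search(query_emb: torch.Tensor, candidate_embs: torch.Tensor) -> int:
    """Return the index of the candidate closest to the query."""
    sims = F.cosine_similarity(query_emb.unsqueeze(0), candidate_embs, dim=-1)
    return int(sims.argmax())

text_query = torch.randn(1024)       # e.g. embed_text("rain on a tin roof")
audio_clips = torch.randn(3, 1024)   # e.g. embed_audio([clip_a, clip_b, clip_c])

best = cross_modal_search(text_query, audio_clips)
print(f"Best-matching audio clip: {best}")

# "Multimodal arithmetic": adding embeddings composes concepts, e.g. an image of
# a dove plus the sound of traffic points toward "dove on a city street".
composed = F.normalize(text_query + audio_clips[best], dim=-1)
```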
POLICY/LAW/ETHICS
Anthropic aims to raise $5 billion to train powerful text-generating AI systems and has introduced its approach called "constitutional AI." Link.
Constitutional AI involves imbuing AI systems with values defined by a "constitution" to make their behavior more understandable and adjustable.
Anthropic's technique uses principles to guide the system's judgment in generating text, promoting behavior like being "nontoxic" and "helpful."
The approach involves training models to critique and revise their responses based on the principles and examples, followed by training the final model using AI-generated feedback (a rough sketch of the critique-and-revise loop follows this item).
Anthropic argues that constitutional AI is superior to methods relying on human contractors, as it is more scalable, consistent, and transparent.
The principles used in Anthropic's constitutional AI come from various sources, including the U.N. Declaration of Human Rights and values from global platform guidelines.
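The supervised critique-and-revise phase is easy to picture in code. The sketch below is an illustration of that idea, not Anthropic's actual pipeline: `generate` is a hypothetical stand-in for any instruction-following model call, and the principles are paraphrased examples.

```python
# Illustrative sketch of the "critique and revise" phase of constitutional AI.
# `generate` is a hypothetical stand-in for an instruction-following LLM call,
# and the principles are paraphrased examples, not Anthropic's actual constitution.
def generate(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

PRINCIPLES = [
    "Choose the response that is least toxic and most respectful.",
    "Choose the response that is most helpful, honest, and harmless.",
]

def critique_and_revise(user_prompt: str) -> str:
    response = generate(user_prompt)
    for principle in PRINCIPLES:
        critique = generate(
            f"Prompt: {user_prompt}\nResponse: {response}\n"
            f"Critique the response according to this principle: {principle}"
        )
        response = generate(
            f"Prompt: {user_prompt}\nResponse: {response}\nCritique: {critique}\n"
            "Rewrite the response so it addresses the critique."
        )
    # Revised responses like this one become training data for the final model,
    # which is then tuned further with AI-generated preference feedback.
    return response
```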
MEPs in the European Parliament have voted on amendments to the draft AI legislation, including requirements for foundational models that underpin generative AI technologies like OpenAI's ChatGPT. Link.
The amendments oblige providers of foundational models to apply safety checks, data governance measures, and risk mitigations before putting their models on the market. They also require consideration of risks to health, safety, fundamental rights, the environment, and democracy.
Foundational model makers are also expected to reduce energy consumption, register their systems in an EU database, comply with transparency obligations, and provide summaries of copyrighted materials used to train their AI systems.
MEPs expanded the list of prohibited practices, including bans on real-time and post remote biometric identification systems, biometric categorization systems using sensitive characteristics, predictive policing systems, emotion recognition systems in various settings, and indiscriminate scraping of biometric data.
The parliament aims to lead in making AI human-centric, trustworthy, and safe, and to create a competitive AI landscape in Europe while protecting fundamental rights and strengthening democratic oversight.
Concerns remain over enforcement, areas such as AI used for migration control, and the developer-led determination of high-risk AI systems. Trilogue talks with the Council will follow after the parliament votes next month to finalize the AI Act.
OTHER
Meta is introducing the AI Sandbox, a testing playground for new ad tools and features, including generative AI-powered tools. Link.
The company has been using AI and machine learning since 2006 to power its apps and services, including the ads system.
The AI Sandbox will include tools such as text variation, background generation, and image outcropping to enhance ad text and creative elements and adapt them to different surfaces.
Meta is working with a small group of advertisers to gather feedback and plans to expand access gradually.
Meta Advantage, Meta’s portfolio of automation products, is being improved with new features to optimize campaigns, provide creative flexibility, and measure performance.
Meta is making significant investments in AI infrastructure and modeling to enhance ad performance, optimize across multiple surfaces, and achieve better results for businesses and users.
Google Cloud is expanding its generative AI capabilities by partnering with popular enterprise companies. Link.
These partnerships aim to address various enterprise use cases for generative AI, including content management, virtual collaboration, sales, customer service, marketing content creation, and embedding generative AI in the Salesforce customer data platform.
Box and Google Cloud are integrating Google's advanced AI models into Box AI to enhance content interaction and management.
Canva is adding generative AI features to enable automatic translation and video creation in its visual communication platform.
Dialpad is leveraging generative AI to provide AI recaps, real-time suggestions, and automated call review processes for sales and service teams.
Partnerships with Jasper, Salesforce, and UKG will bring generative AI capabilities to automate content creation, enhance CRM functionality, and transform employee experiences.
More examples of partnerships and Google Cloud's approach to applied generative AI will be shared in the future, including at Google I/O.
Adobe's Firefly, a collection of AI models for generating media content, is partnering with Google's Bard, a generative AI tool, to power text-to-image capabilities. Link.
Firefly, currently in public beta, will be integrated into Bard, allowing users to generate images and modify them using Adobe Express, a free graphic design tool.
Bard users will have access to templates, fonts, stock images, and other assets from the Express library to enhance their generated images.
Firefly in Bard is designed to generate images without copyrighted materials, using licensed images from Adobe Stock, openly licensed images, and public domain content.
The partnership focuses on empowering creators to tell their stories authentically while providing content consumers with information about the origin and modification of the content.
Adobe is using technology from the Content Authenticity Initiative (CAI) to tag images created with Firefly in Bard with "content credentials" containing information about the model used, creation date, and any edits made. CAI has over 1,000 active members, but Google is notably absent from the initiative.
Scale AI has deployed the first large language model on a classified network for the XVIII Airborne Corps, enabling AI-powered decision-making for defense. Link.
The partnership introduces Scale Donovan, a federal AI platform that helps operators, analysts, and decision-makers understand, plan, and act quickly using reinforcement learning from human feedback.
Scale Donovan addresses the challenge of processing vast amounts of information in different formats, allowing decision-makers to make faster and more informed decisions in rapidly evolving situations.
Donovan will ingest over 100,000 pages of live data for the XVIII Airborne Corps, including orders, situation reports, and intelligence reports, to help military staff understand and organize the data.
The XVIII Airborne Corps will deploy Donovan for use cases such as live context of friendly and enemy force operations, intelligence-operations fusion, and crisis planning support.
Donovan will also support the Joint All-Domain Command and Control (JADC2), accelerating planning and enabling collaboration across combatant commands, potentially shortening the air tasking order cycle from days to hours.
The Allen Institute for AI's AI2 Incubator has raised $30 million in funding for its second fund, attracting backing from investors such as Madrona, Sequoia Capital, Vinod Khosla, Evergreen Ventures, and others. Link.
The incubator has supported 21 companies since 2017, which have collectively raised around $160 million in investment and have seen notable success, including the acquisition of XNOR by Apple for approximately $200 million.
AI2 Incubator focuses on technical founders and aims to address undervalued markets and solve real problems with AI-driven solutions.
The incubator aims to build companies that are specialized in specific areas, such as accounting or public speaking, rather than challenging larger companies in general AI.
The team acknowledges the hype around AI but emphasizes the importance of remaining pragmatic and focusing on startup fundamentals.
The AI2 Incubator provides its companies with guidance and support, leveraging its network and expertise to navigate challenges, including entering regulated or conservative markets.
Yiğit Ihlamur, a former senior program manager at Google, launched Vela Partners, an AI-powered and product-led VC firm, with the goal of accelerating innovation. Link.
Vela uses predictive algorithms and draws data from websites, social networks, and paid datasets to identify investment areas, source opportunities, and assess threats.
Vela's approach combines AI with expert heuristics to provide market intelligence and insights to decision-makers.
The VC industry is prone to bias, and Ihlamur did not offer a specific remedy, noting only that biases can be learned from other VCs or from historical data.
Vela differentiates itself with a game-like terminal that assists entrepreneurs, limited partners, and other VCs in using its services.
The firm's GitHub repository, which includes algorithmic models, is public for inspection and reuse.
SLAIT School is an interactive tutor powered by computer vision that aims to revolutionize the teaching of sign language. Link.
Originally developed as a sign language AI translator, the team pivoted to focus on education due to the limitations of data for full language translation.
SLAIT School provides an interactive learning experience, offering instant feedback on words and phrases using gesture recognition technology.
The platform is currently available on desktop browsers, with plans for a mobile app in the future.
Users can access lessons, practice signing, interact with the AI tutor, and receive feedback at a cost comparable to in-person tutoring.
SLAIT School plans to expand its curriculum to cover different sign language groups and specialized domains such as medical and scientific signs. The long-term goal is to develop a sign language translator.
Everseen, a computer vision company focused on preventing theft at self-checkout counters, has raised €65 million ($71.32 million) in a Series A funding round led by Crosspoint Capital Partners. Link.
The funds will be used to scale Everseen's business and address the rising demand from retailers facing declining customer spending, operational losses, labor shortages, and cost inflation.
Everseen's computer vision algorithms detect and track objects of interest, analyze interactions, and recognize actions performed by shoppers and sales associates to reduce theft and improve stock management.
The platform can also provide real-time analytics, insights, and interventions to optimize processes and enhance the customer experience.
Everseen has faced challenges in accuracy and bias, with instances of misidentifying theft and potential biases in the training data for suspicious activity.
Despite these challenges, Everseen has secured partnerships with over half of the world's top 15 retailers, deployed its technology in thousands of retail stores, and plans to expand into other industries such as supply chain and manufacturing.
Antimetal, a startup focused on helping companies reduce their AWS bills, has raised $4.3 million in a seed funding round. Link.
The company uses artificial intelligence and machine learning to analyze thousands of data points and identify savings opportunities for customers.
Antimetal claims its solution can save customers up to 75% on their AWS bills in less than five minutes.
In addition to cost optimization, Antimetal has created an AI-powered online marketplace for selling unused AWS resources, providing a faster and more efficient way to sell spare instances.
The founders are building a defensible product by focusing on cost reduction, which is not a core focus for cloud vendors like AWS.
Antimetal plans to use the funding to expand its team and further develop its solution. The company currently has over 30 paying customers and aims to expand beyond AWS to other major cloud platforms.
GPT Tip of the Week
Prompt: What types of specific tasks have historically been entirely manual for engineers that can now be automated with LLMs in the developer tooling layer?
Response: Here are some tasks in the developer tooling layer that can be automated or assisted using LLMs:
Automated code formatting: LLMs can be trained to understand coding style guidelines and automatically format code according to those guidelines. This helps ensure consistent code formatting across a project or team.
Code refactoring suggestions: LLMs can analyze codebases and provide suggestions for refactoring, such as identifying duplicate code, recommending code modularization, or proposing more efficient algorithms or data structures.
Automated code review: LLMs can assist in code reviews by automatically analyzing code changes, checking for common coding errors or anti-patterns, and providing feedback to developers. This can speed up the code review process and improve code quality.
Test case generation: LLMs can generate test cases based on code analysis, requirements, or natural language descriptions. This helps automate the creation of test suites and improves test coverage.
API documentation generation: LLMs can automatically extract information from code and generate API documentation, including method signatures, descriptions, parameter details, and example usage. This saves time for developers and keeps documentation up to date.
Dependency management and versioning: LLMs can assist in managing project dependencies by analyzing compatibility, security vulnerabilities, and versioning conflicts. They can provide recommendations for updating dependencies or resolving conflicts.
Automatic code generation: LLMs can generate boilerplate code or scaffolding for common programming tasks, such as setting up project structures, creating database schemas, or generating CRUD (Create, Read, Update, Delete) operations.
Code search and discovery: LLMs can help developers find relevant code snippets, examples, or documentation by analyzing natural language queries and code repositories. This accelerates the process of code reuse and knowledge sharing.
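To make one of these concrete, here is a minimal test-generation sketch using the OpenAI chat completions API as exposed by the pre-1.0 Python client in common use at the time; the model choice and prompt are illustrative assumptions.

```python
import openai  # pre-1.0 client style (openai.ChatCompletion)

openai.api_key = "YOUR_OPENAI_API_KEY"

# A small function we want the model to write pytest tests for.
SOURCE = '''
def slugify(title: str) -> str:
    return "-".join(title.lower().split())
'''

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",          # illustrative model choice
    messages=[
        {"role": "system", "content": "You write concise pytest test cases."},
        {"role": "user", "content": f"Write pytest tests for this function:\n{SOURCE}"},
    ],
    temperature=0,
)
print(resp["choices"][0]["message"]["content"])
```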