AI Weekly: 08/28/23
Hugging Face lands new capital, OpenAI partners with Scale, and a handful of new startups are emerging to take some of Nvidia’s market share
Good morning and welcome to this week’s edition of AI Weekly! This week is headlined by Hugging Face’s fundraise announcement, as the open-source juggernaut just raked in $235 million from big strategic investors like Salesforce, Google, Amazon, Nvidia, Intel, AMD, Qualcomm, IBM, and…us.
In infra news, OpenAI is partnering with Scale AI to make it easier for developers and enterprises to fine-tune its AI models using custom data, and Meta has open sourced a machine learning system for generating and explaining code in natural language. Also, Modular, a startup that enhances AI model inferencing performance on CPUs and GPUs, just raised $100 million in funding led by General Catalyst.
In the image/video world, Ideogram, a new startup that was founded by former Google Brain researchers, launched with $16.5 million in seed funding led by a16z and Index Ventures.
Continue reading below for more on this past week’s AI happenings!
- ZG
Here are the most important stories of the week:
TEXT
Khan Academy is testing an AI tutor called Khanmigo with more than 8,000 teachers and students, aiming to provide individualized guidance in math, science, humanities, and more. Link.
Khanmigo offers interactive features, including debates on topics like student debt and AI's impact, as well as conversations with AI-powered historical figures and literary characters.
The Chief Learning Officer of Khan Academy emphasizes the need for AI tools that address individualized learning needs beyond what a single teacher can provide.
Khanmigo users can provide real-time feedback to improve its responses, and the tool is designed to engage students in meaningful conversations.
While some educators express concerns about AI, Khan Academy's efforts include an AI literacy course for teachers and safeguards in Khanmigo that maintain an encouraging tone and keep it from giving direct answers to questions.
Lex, an AI-powered writing tool, has raised a $2.75 million seed round led by True Ventures. Link.
Lex is a modern writing platform that integrates AI into its tools to extend and smooth out users' writing workflow.
The platform provides formatting tools, markdown-based shortcuts, and AI-generated content suggestions to help writers in their writing process.
The AI roadmap includes features like sentence rephrasing, summarization, and more.
Lex aims to maintain user privacy and is not using user content for training its AI models.
Lex began as a side project, signed up around 25,000 users in its first 24 hours, and intends to start charging for its product soon, with pricing expected to be modest.
IMAGE/VIDEO
Ideogram, a new generative AI image startup, founded by former Google Brain researchers, launched with $16.5 million in seed funding led by a16z and Index Ventures. Link.
Ideogram distinguishes itself by solving a significant issue in AI image generation – reliable text generation within images, including lettering on signs and logos.
The startup offers preset image generation styles like typography, 3D rendering, cinematic, painting, fashion, and more on its web app.
While Ideogram's examples of text generation are impressive, it lacks some features present in rival image generators, such as zoom out/outpainting, and its results are less consistent in tests.
The company's launch and beta release highlighted its text generation feature through a mission statement posted on X (formerly Twitter).
Ideogram's investors include AIX Ventures, Golden Ventures, Two Small Fish Ventures, and notable industry experts, and it has garnered attention from AI figures like David Ha and Margaret Mitchell.
Bellevue-based AI startup Irreverent Labs has secured a strategic investment from Samsung Next, indicating its ambitions to transform video content production. Link.
Founded in 2021 by Rahul Sood and David Raskino, the startup has developed AI models to generate 3D animated videos from text prompts.
The partnership with Samsung Next aims to bring Irreverent Labs' AI video generation technology to Samsung device users globally.
The startup's "video foundation model" can translate text prompts into high-quality 3D videos, potentially lowering barriers for producing video entertainment.
While initially focusing on gaming and creative professionals, Irreverent Labs intends to release a developer preview later this year, aiming to expand the technology's use cases.
Experts caution that the technology is early and unproven, especially in generating longer, coherent video content safely.
Wand.app, an AI-powered creative tool, has raised $4.2 million in seed funding led by O’Shaughnessy Ventures. Link.
Unlike other AI creative tools that limit artists' control, Wand combines visual tools and personalization to bridge the gap between AI-generated content and artists' specific visions.
Wand allows artists to teach their personal AI their own style to achieve consistent results with their desired aesthetic.
The tool is being tested with beta users and plans to launch publicly in the coming months.
Wand aims to cater to artists and illustrators, with an emphasis on professionals in fields like game studios, concept artists, branding firms, and architects.
The funding will be used to develop collaborative team features, expand creative tooling to desktop, improve model architectures, and explore methods for artists to share or sell their fine-tuned models.
SPEECH/AUDIO
ElevenLabs, an AI-powered platform for creating synthetic voices, has launched its platform out of beta, now supporting over 30 languages. Link.
With a new in-house AI model, ElevenLabs can automatically identify languages and generate emotionally rich speech in those languages.
Users of ElevenLabs can utilize the voice-cloning tool to speak across different languages without typing text.
The company aims to offer high-quality AI voices in various dialects, with the goal of making content universally accessible in any language and voice.
ElevenLabs has faced both positive and negative attention, with high-quality generated voices, but also concerns about misuse for generating hateful messages and threats.
The company has raised $19 million from investors including Andreessen Horowitz and DeepMind co-founder Mustafa Suleyman, with plans to expand into voice dubbing and emotion transfer across languages.
Meta has released a new speech-to-text model called SeamlessM4T, which can translate speech-to-text and text-to-text for nearly 100 languages. Link.
The model is designed to function as a universal language translator and is released under a Creative Commons CC BY-NC 4.0 license for research iteration.
SeamlessM4T recognizes 100 input languages and converts them into 35 output languages for speech-to-speech and text-to-speech tasks.
The model performs the entire translation in one step, unlike other translation models that divide the work across separate systems.
SeamlessM4T is designed to identify code-switching and gender bias in translations and can detect toxic or sensitive words, striving to improve translation quality.
Meta has been releasing various AI models, such as AudioCraft and Llama 2, in an open-source manner to researchers and developers.
CODE/DEVTOOLS/INFRA
OpenAI is partnering with Scale AI to make it easier for developers and enterprises to fine-tune its AI models using custom data. Link.
This partnership will integrate Scale AI's fine-tuning tools with OpenAI's GPT-3.5 text-generating model.
Developers will be able to tailor GPT-3.5 for specific tasks, brand voice, language, and more using Scale's tools and expertise.
Scale's Data Engine platform will be used to prep and enhance data, followed by fine-tuning GPT-3.5 with the custom data.
Fine-tuned models will be reviewed by human experts to ensure performance and safety.
OpenAI COO Brad Lightcap sees this collaboration as a way for companies to benefit from Scale AI's expertise in addition to OpenAI's capabilities.
Meta has open sourced Code Llama, a machine learning system for generating and explaining code in natural language. Link.
Code Llama can complete and debug code across various programming languages including Python, C++, Java, and more.
Code Llama is based on the Llama 2 text-generating model, fine-tuned for code generation and understanding.
The model has different versions optimized for Python and for understanding instructions.
Code Llama models range from 7 billion to 34 billion parameters, trained on code-related data, and can insert code into existing projects.
While promising, Code Llama and similar AI coding tools raise concerns about code quality, security vulnerabilities, intellectual property issues, and malicious use.
Modular, a startup that enhances AI model inferencing performance on CPUs and GPUs, has secured $100 million in funding led by General Catalyst, joined by GV, SV Angel, Greylock, and Factory. Link.
By improving model inferencing performance, Modular is able to provide cost savings and compatibility with popular frameworks.
Modular's total funding now stands at $130 million.
The funding will be used for product expansion, hardware support, and the growth of their programming language, Mojo.
Modular was co-founded by ex-Googler Chris Lattner and Tim Davis to simplify AI system development and optimization.
Modular's programming language, Mojo, aims to combine Python's usability with caching and adaptive compilation techniques. It has gained significant developer interest and is nearing general availability.
Ikigai raised $25 million in Series A funding led by Premji Invest, bringing its total funding to $38.2 million. Link.
Ikigai's graphical models excel for tabular and time-stamped enterprise data, providing a cost-effective alternative to LLMs.
Organizations struggle with utilizing and analyzing vast amounts of data, with only 13% succeeding in their analytics and data strategies.
Devavrat Shah, founder of Celect and director of MIT's statistics and data science center, believes AI is crucial for effective forecasting and scenario-based planning in enterprises.
Shah established Ikigai Labs to empower enterprises with AI, offering a no-code platform based on proprietary graphical models for prediction, sparse data reconciliation, and optimization.
Despite competition from companies like C3.ai, Anaplan, Dataiku, and Hugging Face, Ikigai's unique approach with large graphical models is expected to make it stand out in the market.
OpenAI has announced that businesses can now fine-tune GPT-3.5 Turbo using their own data, allowing them to create more focused AI models for specific tasks. Link.
The fine-tuning process enables companies to create unique AI models that provide reliable responses in specific languages or with concise wording.
Previously, business customers were limited to using GPT-3 variants like davinci-002 or babbage-002 for this purpose.
GPT-3.5 Turbo, introduced earlier this year, can handle 4,000 tokens at a time, double what previous models could process.
The fine-tuned AI models can be used for mimicking brand voices, generating routine code, formatting and completing code snippets, and other applications.
Pricing for GPT-3.5 Turbo's fine-tuning is $0.0080 per 1,000 tokens for training, $0.0120 per 1,000 tokens for input usage, and $0.0160 per 1,000 tokens for the chatbot's output.
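To put the per-token prices above in context, here is a minimal cost sketch. The three rates come from the item above; the function names, the token counts, and the assumption that training is billed once per epoch over the training file are ours, so treat the output as a rough estimate rather than a quote.

```python
# Prices in USD per 1,000 tokens, as quoted in the item above.
TRAINING_PRICE = 0.0080
INPUT_PRICE = 0.0120
OUTPUT_PRICE = 0.0160


def training_cost(training_tokens: int, epochs: int = 1) -> float:
    """Estimate the one-off training cost.

    Assumption: every token in the training file is billed once per epoch.
    """
    return training_tokens / 1000 * TRAINING_PRICE * epochs


def usage_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of calling the fine-tuned model."""
    return (input_tokens / 1000 * INPUT_PRICE
            + output_tokens / 1000 * OUTPUT_PRICE)


if __name__ == "__main__":
    # Hypothetical workload: a 100,000-token training file run for 3 epochs,
    # then 1,000,000 input tokens and 250,000 output tokens of usage.
    print(f"Training: ${training_cost(100_000, epochs=3):.2f}")
    print(f"Usage:    ${usage_cost(1_000_000, 250_000):.2f}")
```

Under these assumptions, the training run is a small one-off cost, while ongoing usage scales with traffic, so the input and output rates are the ones to watch.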
HARDWARE
Nvidia's second-quarter earnings report reveals substantial profits from the generative AI boom, driven by the demand for AI chips like A100 and H100 for building and running AI applications. Link.
Nvidia's Q2 revenue of $13.51 billion far exceeded Wall Street expectations, doubling its $6.7 billion revenue from the same period the previous year.
The company's data center business generated $10.32 billion in revenue, up 141% from the previous quarter and 171% from a year ago, outshining its gaming unit's $2.49 billion revenue.
Nvidia's success is attributed to its focus on AI-powered image processing, like ray tracing and intelligent upscaling, which has significantly contributed to its growth.
Nvidia has forecasted further growth, projecting revenue of $16 billion for the third quarter.
CEO Jensen Huang noted the transition from general-purpose to accelerated computing and the rise of generative AI as major factors propelling the company's success.
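The growth figures above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch: the revenue numbers and percentages are taken from the item above, while the helper names are ours and the "implied" prior-period figures are derived values, not numbers stated in the report summary.

```python
def growth_multiple(current: float, previous: float) -> float:
    """How many times larger the current figure is than the previous one."""
    return current / previous


def implied_base(current: float, pct_increase: float) -> float:
    """Back out the earlier figure from the current one and a % increase."""
    return current / (1 + pct_increase / 100)


if __name__ == "__main__":
    # Total revenue in billions of USD: Q2 this year vs. the same period last year.
    print(f"Revenue multiple: {growth_multiple(13.51, 6.7):.2f}x")

    # Data center revenue of $10.32B, up 171% year over year
    # and up 141% from the previous quarter.
    print(f"Implied year-ago data center revenue: ${implied_base(10.32, 171):.2f}B")
    print(f"Implied prior-quarter data center revenue: ${implied_base(10.32, 141):.2f}B")
```

The first line confirms that "doubling" is a fair description of the year-over-year change; the other two show roughly how large the data center business would have been in the earlier periods for the quoted percentages to hold.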
Startups like d-Matrix, Rain Neuromorphics, Tiny Corp, Modular, MatX, Qyber, SiMa.ai, and Lightmatter are entering the market to challenge Nvidia's dominance in AI chip technology. Link.
These startups are aiming to offer alternative designs for AI chips that they claim work more efficiently and cost-effectively than Nvidia's GPUs.
Some startups are also targeting Nvidia's app-writing software, such as CUDA, with alternatives that can be used for running and training machine-learning models.
The startups are focused on reducing the costs of training and running machine-learning models compared to Nvidia's products.
While some of these startups haven't yet brought their products to market, companies from an earlier wave, like Cerebras and SambaNova Systems, are better positioned to challenge Nvidia.
These startups face challenges like competing against Nvidia's entrenched position, high technical complexity, and obtaining funding in a more competitive environment.
DESIGN
Microsoft Designer, a free AI-powered design tool, is now widely available to Edge users in the US. Link.
The integration allows users to access Designer from Edge's sidebar, enabling them to generate designs without opening a separate tab or program.
Designer uses AI to suggest designs for various purposes, such as social media posts, fliers, greeting cards, and more, and allows users to customize them.
The tool is powered by DALL-E, a text-to-image generator, which lets users create pictures to add to their designs.
While Designer is already available as a standalone app on the web, its integration with Edge makes it more convenient for users to bring their designs to various platforms without switching between windows.
Microsoft Designer is still in preview, and users can access it by updating Edge and enabling "Designer (Preview)" in the sidebar. Additionally, Microsoft has introduced updates to Bing Chat in the browser, expanding its capabilities.
Sydney-based startup Relume, a web design platform, has chosen to rely on bootstrapping rather than venture capital investments since its launch in November 2021. Link.
Initially, Relume started as a component library for web design tools Webflow and Figma, offering customizable blocks for building websites.
The company introduced a generative AI twist in August, enabling users to input text prompts that AI uses to quickly sketch out editable sitemaps and wireframes for websites.
Relume's AI is trained in-house using OpenAI's large language model for prompt interpretation, and the text output is converted into real-time visual wireframes.
The platform has gained around 54,000 users, with a retention rate of around 90%, and it targets individuals, freelancers, and web design agencies.
Relume's strategy aims to empower designers by automating heavy lifting, retaining human decision-making for style guide, font, color, and other choices.
OTHER
Hugging Face has raised $235 million in Series D funding led by Salesforce, while investors including Google, Amazon, Nvidia, Intel, AMD, Qualcomm, IBM, and Sound Ventures also participated. Link.
This funding round values Hugging Face at $4.5 billion, twice its valuation from May 2022.
Hugging Face offers data science hosting and development tools, including a hub for AI code repositories, models, and datasets, web apps for AI-powered applications, and libraries for dataset processing and model evaluation.
Its paid features include AutoTrain for automating model training, Inference API for hosting models without managing infrastructure, and Infinity for accelerating in-production model processing.
Hugging Face's platform has attracted 10,000 customers and over 50,000 organizations, hosting more than 1 million repositories on its model hub.
The company's focus spans MLOps, open-source language models like BigScience's Bloom, partnerships with cloud providers, and expansion efforts across research, enterprise, and startups.
Crate, a new AI-powered app, functions as an AI-fueled version of Pinterest, analyzing user-curated content to provide personalized product recommendations. Link.
The app offers customized activities and content based on the user's saved content, known as "crates." It auto-generates summaries and cover photos for these content folders.
Crate employs OpenAI's GPT-3.5 model for text generation and Stable Diffusion for image generation, along with its proprietary in-house models.
Founded by Anna Bofa, who previously worked on partnerships at Pinterest, the LA-based startup recently raised $5 million in seed funding at a $25 million post-money valuation.
Crate aims to monetize by charging brands to suggest their products to users, similar to Pinterest's revenue model.
Consumer AI startups often face challenges including cloud computing costs and access to GPU resources. Crate was accepted into Amazon Web Services’ Generative AI Accelerator program, providing up to $300,000 in AWS credits and technical support.
Edo Liberty, founder of Pinecone, warns that AI startups should not assume they can recklessly spend billions of dollars on training AI models. He points to instances of excessive spending, such as an intern who accidentally spent half a million dollars on AI model testing. Link.
Liberty suggests that AI startups can improve cost-efficiency by storing their company's data in a vector database that models can access when needed, instead of constantly retraining models with large amounts of data.
AI-generated fakes are a concern, and it is difficult to distinguish between AI-suggested grammar changes and fully AI-generated text.
Open source AI models like Meta's Llama 2 raise concerns about security and privacy for large enterprises, according to Aidan Gomez, CEO of Cohere, a closed model provider.
Gomez and Liberty dismiss claims that top AI models like GPT-4 have struggled recently, explaining that models can improve in certain areas while declining in others, and users might be asking more challenging questions, giving the impression of reduced performance.
The shortage of graphics processing units (GPUs), needed for running AI software, is acknowledged as a challenge. Gomez recommends specialized cloud providers for GPU supply and suggests looking into other chip options like those from AMD or Google.
Caution is advised to avoid repeating the wild spending and unprofitability seen in the 2020-2021 tech bull market.
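Liberty's suggestion in this item, storing company data in a vector database that a model consults when needed instead of retraining, can be illustrated with a toy example. This is a minimal sketch, not how Pinecone or any real vector database is implemented: the class name, the documents, and the three-dimensional vectors are all made up for illustration, and a real system would get its vectors from an embedding model and use an approximate index rather than a full scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: a list of (text, vector) pairs."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, text: str, vector: list[float]) -> None:
        self._items.append((text, vector))

    def search(self, query_vector: list[float], k: int = 1) -> list[str]:
        """Return the k stored texts most similar to the query vector."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


# Hand-made vectors standing in for real embeddings (an assumption).
store = TinyVectorStore()
store.add("Refund policy: 30 days", [0.9, 0.1, 0.0])
store.add("Office hours: 9 to 5", [0.1, 0.9, 0.0])
store.add("Shipping takes 3 days", [0.0, 0.2, 0.9])

# A query vector that points mostly in the "refund" direction.
top_match = store.search([0.8, 0.2, 0.1], k=1)
print(top_match)
```

The retrieved text would then be passed to the model as context at query time, which is why new company data can be added to the store without touching the model's weights.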
The New York Times has blocked OpenAI's web crawler, GPTBot, preventing OpenAI from using content from the publication to train its AI models. Link.
The NYT's robots.txt page shows that the block was put in place as early as August 17th.
This change follows the NYT's updated terms of service at the beginning of the month, which prohibit the use of its content to train AI models.
The NYT is reportedly considering legal action against OpenAI for potential intellectual property rights violations.
Other individuals and entities, including authors and a programmer-lawyer, have previously sued OpenAI over its data scraping practices and use of copyrighted material.
The New York Times spokesperson declined to comment, and OpenAI has not yet responded to requests for comment.
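For readers curious how a robots.txt block like the one described above works, here is a minimal sketch using Python's standard library. The two-line rule set is an illustrative example of blocking the GPTBot user agent, not a copy of the NYT's actual file, and the example.com URL and the helper name are placeholders of ours.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only: block GPTBot from the whole site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://www.example.com/section/technology"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", page))
    print("Other crawler allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

Note that robots.txt is a request that well-behaved crawlers honor, not an enforcement mechanism, which is part of why publishers also rely on terms of service.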