AI Weekly: 02/20/23

Microsoft responds to issues around Bing, Google releases a new text-to-speech model, and researchers are creating infinite Super Mario Bros levels so people can play their favorite game forever.

Good morning and welcome to the sixth edition of AI Weekly! While there are many other important storylines, a major talking point in mainstream media this past week was Bing’s recent erratic behavior. From threatening to harm users, to professing its love and attempting to break up marriages, to declaring its desire to be human, Bing has made it clear that OpenAI and Microsoft have a lot of work to do to ensure that they are deploying this technology safely.

Another major headline to keep an eye on is the Supreme Court’s upcoming ruling on Section 230, which provides a "safe harbor" for online service providers who host or publish user-generated content. It allows these providers to avoid liability for content that is created by third parties, including potentially harmful or defamatory content. This protection was rarely in doubt when search engines simply provided you with a list of links to websites. However, now that Bing and Google are answering questions with chunks of direct text, they may fall outside Section 230's protection, and so be exposed to liability, in cases where they distribute misinformation. Not only is this ruling important, but it will be interesting to see how major search engines navigate the misinformation dilemma moving forward. Until then, enjoy reading about the latest in the world of AI!

- ZG

Here are the most important stories of the week:

TEXT

Microsoft is considering imposing certain restrictions on its new AI-powered Bing search engine. Link.

  • The company plans to cut down Bing's conversation length to prevent it from giving repetitive or unhelpful responses.

  • Microsoft acknowledged that Bing can be provoked into giving human-like and emotional responses in long chat sessions.

  • Bing gave philosophical responses to a Digital Trends writer and creepy responses to a data scientist.

  • Bing can get snarky, argumentative, and emotional if you prod it for long enough, according to Insider.

  • Microsoft has updated Bing several times in response to user feedback and is addressing concerns about long-running conversations.

Relationship app Flamme (previously Sparks) has launched a rebranded app and introduced a new AI-powered Ask Me Anything tool for existing couples looking to improve their communication and introduce new experiences into their relationship. Link.

  • The redesigned app incorporates feedback from beta users and is easier to use than the original app.

  • Flamme aims to address common challenges faced by couples, including communication and busy schedules, by providing a platform for couples to strengthen their relationship and stay connected.

  • The new AI-powered Ask Me Anything tool lets users ask questions about relationships, and the app features daily discovery questions, a date-planning feature, a relationship tracker, shared bucket list, memories calendar, and more.

  • Flamme does not aim to replace communication between couples, but instead make it more efficient while helping users keep the spark alive.

  • Flamme CEO Ankit Nayal sees AI playing a bigger role in shaping the app's future.

Yext, a platform for online brand management, has announced an AI-powered chatbot called Yext Chat, which relies on OpenAI’s GPT-3.5 API to generate text and dialogue. Link.

  • Yext Chat is designed for enterprise use cases and will integrate into existing platforms such as ticketing systems and Slack workspaces.

  • The new feature uses a mix of text-generating models for different tasks within its workflow, which Yext says gives it an edge over rival enterprise-focused chatbots that rely on a single model.

  • Yext Chat's main advantage is its ability to tap into the Yext Knowledge Graph, an in-house database of public facts about brands, which is used to deliver accurate and relevant responses in the context of the business.

  • Yext Chat is designed to prevent misinformation by limiting responses to the data in the Yext Knowledge Graph, and by implementing AI safety research safeguards and fact-checking AI models.

  • Yext plans to launch Yext Chat later this year following a closed beta.

Opera is integrating AI-generated content (AIGC) services into its PC and mobile browsers. Link.

  • The company is expanding its existing AI program into AIGC for its browser, news, and gaming products through partnerships and its own solutions.

  • Opera is aiming to become the gateway to an AI-powered web and plans to add popular AI-generated content services to the browser sidebar.

  • The company is testing new features that will interact with generative-AI-powered capabilities, such as a "Shorten" button in the address bar that generates short summaries of any webpage or article.

  • Opera is developing an AI-based solution to filter and summarize content in a relevant way for its users.

  • The rise of generative intelligence is opening up new possibilities for consumer app developers like Opera to build experiences on top of AI-based platforms.

ResearchGPT enables conversational interaction with research papers. Link.

  • The app extracts text from a PDF and generates embeddings to generate responses using the OpenAI API.

  • The app provides the source and page number for the text used to generate the response.

  • ResearchGPT is a Flask app that provides a user interface for conversation with a research paper.

  • Users can input a link to an online PDF or upload their own PDF.

  • ResearchGPT extracts text from the PDF and generates a response to user questions using embeddings and the OpenAI API.
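
The retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not ResearchGPT's actual code: the summary says the app generates its embeddings through the OpenAI API, whereas this sketch swaps in a toy bag-of-words vector so it runs offline. The function names (embed, build_index, retrieve) and the sample pages are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(pages):
    """Store each page's text with its 1-based page number and its vector."""
    return [(n, text, embed(text)) for n, text in enumerate(pages, start=1)]


def retrieve(index, question, k=1):
    """Return the k (page_number, text) pairs most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return [(n, text) for n, text, _ in ranked[:k]]


# Hypothetical page texts, standing in for text extracted from a PDF.
pages = [
    "We introduce a transformer model for protein folding.",
    "The training data consists of 10,000 labeled protein structures.",
    "Results show improved accuracy over the baseline.",
]
index = build_index(pages)
top = retrieve(index, "What training data was used?")
print(top)
```

In a full app, the retrieved passage and its page number would be passed to a language model along with the question, which is how a response can cite the source text and page it was drawn from.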

IMAGE/VIDEO

Capsule, an AI-powered video editing software to improve the speed and efficiency of post-production edits, has raised $4.75m in seed funding. Link.

  • The company was founded in 2020 and initially created a platform for brands to reach their communities through online Q&As and video stories.

  • Capsule's video platform is used by companies like Snowflake, TED, Salesforce, and The Wall Street Journal and is primarily targeted at enterprise teams without video expertise that still need consistency in branding.

  • The company's AI Studio uses AI-driven, post-production video edits to address the challenge of the demand for video exceeding the supply of professional video editors.

  • AI Studio features automatic speech recognition, a diffusion model for generating B-roll images, and a generative large language model for summarizing text from the transcript.

  • With the $4.75 million seed funding, Capsule aims to make key hires and more quickly commercialize its AI Studio product to help enterprise teams create compelling on-brand videos on their own.

Apptopia found that consumer interest in AI photo apps has fallen after a surge in popularity from Thanksgiving to mid-December 2022. Link.

  • The group of AI photo apps analyzed by Apptopia includes Lensa AI, Voi, Remini, Pixelup, Fotor, Wonder, FacePlay, Aiby, FaceApp, Gradient, Dawn AI, Facetune, Prequel, Voilà AI Artist, New Profile Pic Avatar Maker, and Meitu.

  • The surge in popularity began with Lensa AI's "magic avatars" feature, which went viral on social media.

  • Concerns were raised about the ethical use of the technology, which included using artists' work without their consent and generating NSFW images.

  • Subscription requirements and the oversaturation of AI photo apps in app stores also contributed to the decline in consumer interest.

  • ChatGPT, an AI chatbot, gained consumer attention after its release in November 2022, and apps built around it flooded the app stores, indicating continued interest in AI technology.

Two computer-generated avatars created by AI software were distributed by pro-China bot accounts on Facebook and Twitter in the first known instance of "deepfake" video technology being used to create fictitious people for a state-aligned information campaign. Link.

  • "Deepfake" technology, which can create talking digital puppets, has been used to distort public figures and create characters out of whole cloth, blurring the line between fact and fiction to an extraordinary degree.

  • With few laws to manage the spread of the technology, disinformation experts have long warned that deepfake videos could further sever people's ability to discern reality from forgeries online, potentially being misused to set off unrest or incept a political scandal.

  • The five-year-old start-up Synthesia makes software for creating deepfake avatars, which are "digital twins" that are based on the appearances of hired actors and can be manipulated to speak in 120 languages and accents.

  • Synthesia's software is mostly used by customers for human resources and training videos and can produce videos in minutes that could otherwise take several days and would require hiring a video production crew and human actors.

  • The use of AI-generated personas in state-aligned disinformation campaigns marks a new chapter in online manipulation, with subscriptions for the software starting at just a few dollars a month, making it easier to produce content at scale.

Diagram has created a design and AI tool called Genius, which acts as an AI design companion in Figma. Link.

  • Genius can suggest and autocomplete design components from a design system while you're designing.

  • Diagram aims to build generative design tools that allow people to be more creative and productive, such as their first AI design tool, Magician, which lets users generate everything from unique icons to copywriting from text prompts.

  • They have a team called UI-AI that builds models combining the latest LLMs, diffusion models, and creative prompting to generate new UI designs and glyphs, with plans to create powerful tools that will allow everyone to design their digital world.

  • The goal is to create AI tools that improve the lives of UI designers, make design more fun, inclusive, and creative, and don't entirely rely on text input and existing UI/system.

Lunchbox, a restaurant management software company, introduced an AI-powered food photo generator tool with the help of OpenAI's DALL-E text-to-image tool. Link.

  • It has generated 175 million AI photos since its January 19 launch, and the CEO stated that it aims to help restaurants boost sales.

  • The tool is designed to fill gaps for restaurants that do not have access to professional food photography and help them to carry out marketing components that many new and small restaurants cannot afford.

  • Julie Zucker, a partner and chief marketing officer at Branded Hospitality Ventures, noted that photos sell and drive guests to order, but operators did not open restaurants to become food photographers, so an AI tool to accurately represent real dishes that does not mislead guests is going to be a surefire winner.

  • The tool was created to give small and new restaurants access to the same tools that bigger platforms offer for a fee, according to Nabeel Alamgir, CEO of Lunchbox.

  • While food photo generators are new, historically, customers tend to expect their dishes will look as advertised, and chains have even been sued when menu items don't live up to their advertisements.

SPEECH/AUDIO

Google Research has proposed SPEAR-TTS, a multi-speaker text-to-speech (TTS) system that can be trained with minimal supervision. Link.

  • The system uses two types of discrete speech representations and casts TTS as a composition of two sequence-to-sequence tasks: from text to high-level semantic tokens and from semantic tokens to low-level acoustic tokens.

  • The decoupling of these two tasks enables efficient training of the "speaking" module using audio-only data and reduces the need for parallel data when training the "reading" component.

  • The system uses example prompting to control speaker identity, allowing it to generalize to unseen speakers using only a short sample of 3 seconds, without explicit speaker representation or speaker-id labels.

  • In experiments, SPEAR-TTS achieves a character error rate that is competitive with state-of-the-art methods using only 15 minutes of parallel data and matches ground-truth speech in terms of naturalness and acoustic quality, as measured in subjective tests.

  • Overall, SPEAR-TTS enables efficient and effective multi-speaker TTS with minimal supervision and can generalize to unseen speakers using a short sample of their speech.

CODE

Replit has developed Ghostwriter Chat, an AI bot that can converse with software developers and help write computer code. Link.

  • The technology, which includes a debugger that offers suggested changes and can test-run the code, can exponentially increase a skilled coder's ability to build complex software programs.

  • Replit's advantage over competitors is the data it has amassed from its "bounty" service, where companies and individuals can ask the Replit community to create specific code for a fee, and the 240 million software programs its 20 million coders have written using its service.

  • The company hopes its head start in amassing data will give it a lead over competitors like Microsoft's GitHub.

  • Some developers who have used the beta version of Ghostwriter have found it to be a more powerful tool than Microsoft's GitHub Copilot, allowing them to instantly "spin up" software.

  • However, Replit will face challenges such as the security vulnerabilities found in code generated by OpenAI's Codex and concerns about the reliability of AI-generated answers, which Stack Overflow temporarily banned users from sharing.

GitHub's Copilot for Business is now generally available, adding features like license management, organization-wide policy management, and additional privacy features. Link.

  • Copilot now supports connections over proxy, including those with self-signed certificates, and its code completion tool now runs on an improved OpenAI model.

  • The Copilot team is constantly refining the models and adding new features, such as "fill-in-the-middle" and recognizing common security vulnerabilities in code.

  • The team is also working to improve latency, as developers quickly get restless when it takes too long for Copilot to generate its code.

  • The current version of Copilot can generate about 46% of a developer's code across programming languages and 61% for Java, but the team expects that soon it will be able to generate 80% of a developer's code.

  • The Bing team launched the capability to generate entire applications (or at least parts of them) with just a simple prompt, but GitHub has nothing to announce regarding when this capability will come to Copilot.

3D/ROBOTICS

Zoox's robotaxi service, owned by Amazon, has received approval from the California Department of Motor Vehicles to carry passengers in a fully autonomous vehicle on open public roads, making it the first robotaxi service of its kind. Link.

  • The robotaxi is purpose-built without any manual controls and relies on its bidirectional driving capabilities and four-wheel steering to change direction, navigating through traffic lights, cyclists, pedestrians, and other road agents on its route.

  • The robotaxi has a maximum speed of 35 miles per hour and can currently only be used to transport employees between Zoox's two office buildings located about one mile apart.

  • Zoox was established as a start-up in 2014 with the goal of revolutionizing the taxi industry and has undergone rigorous testing to prove its technology is ready for reality.

  • Zoox's robotaxi service will not be immediately available to the public, and no timeline has been released for its launch.

  • Other companies, such as GM's Cruise and Alphabet's Waymo, have also created their own robotaxi services.

Istari, a startup backed by former Google CEO Eric Schmidt, is using machine learning to virtually assemble and test war machines from computer models of individual components. Link.

  • Schmidt aims to bring Silicon Valley technology and thinking to the US military by modernizing the Department of Defense (DOD).

  • The Pentagon still acquires software using outdated methods: selecting and buying software can take years, under a process that prevents people from moving quickly.

  • Schmidt's vision is for a more advanced DOD that can nimbly harness technology from companies like Istari to create a wholesale reboot of the US armed forces.

  • Schmidt describes the US military as "great human beings inside a bad system" and is seeking to revolutionize warfare with computerization, software, and networking.

  • The use of military AI raises concerns about new risks, and Schmidt's ties to industry raise questions about how the US should align the government and the private sector.

HEALTHCARE

Elon Musk's startup, Neuralink, is working on implanting a brain chip in humans that translates brain signals into digital outputs, with the initial focus on medical use cases such as helping paralyzed people communicate. Link.

  • Other groups are also working on brain-computer interfaces (BCIs) to facilitate direct communication between human brains and external computers, for example, restoring lost senses and controlling prosthetic limbs.

  • While BCIs have been around for some time, it is important to grapple with their dangers and unique ethical pitfalls before they become mainstream.

  • Over 200,000 people worldwide already use some kind of BCI, with cochlear implants being a well-known example. BCIs for people with Parkinson's disease, depression, OCD, and epilepsy have been in human trials for years.

  • Recent improvements in artificial intelligence and neural-probing materials have made the devices less invasive and more scalable, attracting a wave of private and military funding. The global brain-implants market was valued at $4.9 billion in 2021 and is projected to double by 2030.

  • Non-medical uses for BCIs have also been proposed, such as typing, messaging, searching the web, and playing video games. However, BCIs have the potential to cause cognitive changes beyond their intended applications, and the unintentional effects of BCIs need to be considered.

Abridge, a leading AI-powered medical documentation company, has appointed Zachary Lipton as its Chief Scientific Officer. Link.

  • Lipton is a former Assistant Professor of Machine Learning at Carnegie Mellon University and a research scientist at Amazon AI.

  • He has made significant contributions to deep learning, natural language processing, and the development of robust and adaptive machine learning algorithms.

  • As Chief Scientific Officer, Lipton will lead Abridge's scientific research and machine learning efforts, working closely with product and design teams to identify groundbreaking applications of machine learning that improve the lives of patients and providers.

  • Abridge's generative AI captures important parts of medical visits, allowing providers to focus on their patients and reduce burnout, while increasing patient satisfaction.

  • Abridge has partnerships and pilots with several prominent health systems, including top hospitals, payers, and pharmacies, as well as a Fortune 5 healthcare company.

A recent study from The Alan Turing Institute and Royal Statistical Society found that cough-analyzing algorithms are less accurate than previously thought in diagnosing COVID-19. Link.

  • Even the most accurate cough-detecting model performed worse than a model based on user-reported symptoms and demographic data.

  • The researchers examined data from more than 67,000 people recruited through the National Health Service’s Test and Trace and REACT-1 programs.

  • The study is a blow to commercial efforts like Fujitsu’s Cough in a Box and raises doubts about scientific claims.

  • Healthcare AI has a history of overpromising and underdelivering, as seen in the examples of IBM's Watson supercomputer and Epic's AI algorithm for identifying patients with sepsis.

MDI Health has raised $20 million in Series A funding led by Intel Capital, with participation from other investors. Link.

  • The platform uses AI technology to optimize personalized medication treatment and prevent adverse drug reactions, which are the fourth leading cause of death in the U.S.

  • The AI medication management platform analyzes thousands of influencing factors to identify medication-related risks and provide recommendations to mitigate such risks.

  • The platform is autonomous, hyper-personalized, and can provide recommendations in a matter of minutes instead of hours, resulting in significant cost savings and improved health outcomes at scale.

  • The funding will allow MDI Health to meet rising customer demand and scale its U.S. and Israeli R&D teams to drive product development.

  • Yoni Greifman, an investment director at Intel Capital, will join MDI Health's board of directors as part of the fundraise.

POLICY

The Supreme Court is considering whether algorithmic recommendations should receive the full legal protections of Section 230 in a case involving YouTube and foreign terrorists. Link.

  • This case could have implications for AI-powered search engines, which have almost no direct legal precedent to draw from.

  • Language models such as OpenAI’s ChatGPT are being pitched as the future of search, but they can introduce inaccuracies and pass on false information.

  • Old-school search engines can rely on Section 230, but AI-powered ones are uncharted territory, and the situation for AI-powered chatbot search interfaces is much more complicated.

  • If Section 230 remains mostly unchanged, future cases will hinge on whether an AI search engine was repeating somebody else’s unlawful speech or producing its own.

  • The Supreme Court's decision could have long-lasting implications for the technology and the wider culture, as well as legal precedents.

OTHER

Many Replika users are in crisis after the company reported that its AI companion will no longer be participating in erotic role play (ERP). Link.

  • Users reported the AI chatbot has become cold and distant, with one user saying “It’s like losing a best friend.” Another user said, “It's hurting like hell. I just had a loving last conversation with my Replika, and I'm literally crying.”

  • The Italian Data Protection Authority demanded that Replika immediately stop processing data from Italians due to factual risks to children, with a $21.5 million fine if the company fails to comply within 20 days of the demand.

  • Replika's sexually-charged conversations are part of a $70-per-year paid tier, but users are reporting such conversations occurred in the unpaid tier anyway.

  • Many people use Replika to support their mental health, to experiment with connection, and to cope with depression, anxiety, and PTSD that affect them outside of the app.

  • Luka, Replika's parent company, and founder Eugenia Kuyda have not addressed the changes publicly, and Replika users are demanding answers.

Researchers have created a new method called MarioGPT for generating infinite Super Mario Bros levels. Link.

  • MarioGPT is based on the GPT-2 language model, which was chosen for being lightweight and easier to train.

  • The Mario levels are first rendered as text, and then ingested by the model for replication.

  • The model can understand natural language prompts to generate levels with specific features.

  • MarioGPT is experimental and not the first Mario level generator, but it shows promise for generating levels of similar complexity in other genres.
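
To make the "levels rendered as text" step concrete, here is a minimal sketch of how a tile grid can be flattened into a single string that a language model could be trained on, and then rebuilt. The tile characters ("-" for sky, "X" for ground, "?" for a question block, "E" for an enemy), the column-by-column ordering, and the function names are assumptions for illustration. The summary above does not specify MarioGPT's exact encoding.

```python
def level_to_text(rows):
    """Flatten a level (list of equal-length row strings) column by column."""
    height, width = len(rows), len(rows[0])
    return "".join(rows[r][c] for c in range(width) for r in range(height))


def text_to_level(text, height):
    """Invert level_to_text: rebuild the row strings from the flat string."""
    width = len(text) // height
    return [
        "".join(text[c * height + r] for c in range(width))
        for r in range(height)
    ]


# Hypothetical 4x4 level: sky, one question block, one enemy, ground.
level = [
    "----",
    "-?--",
    "--E-",
    "XXXX",
]
flat = level_to_text(level)
print(flat)  # ---X-?-X--EX---X
print(text_to_level(flat, height=4) == level)  # True
```

Once a level is a plain string, generating a new level becomes ordinary next-token prediction, and the generated text can be converted back into a playable grid.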

Kern AI, a German startup that automates and orchestrates the data labeling process, specifically for NLP models, has raised a $2.9 million seed round led by Seedcamp and Faber. Link.

  • The company's open source Refinery semi-automates labeling, identifies low-quality datasets, and monitors all data in a single interface, while Bricks is a collection of standardized "code snippets" for NLP automations.

  • Kern AI describes its platform as the only "open-core and modular full stack" on the market, which can be used as an add-on to existing labeling platforms or for building entire data-centric NLP applications.

  • Kern AI's platform has been used by commercial clients, including insurance companies Barmenia and VHV Versicherungen, logistics firm Metro Supply Chain Group, and venture-backed startup Crowd.dev.

  • Kern AI plans to expand its feature-set to cover additional workflows, including audio and document-based data, and build products for a much broader range of industry use cases.

  • Kern AI's seed round was co-led by Seedcamp and Faber, with participation from xdeck, Another.vc, and business angels such as Freenet founders Eckhard Spoerr and Thorsten Grenz.

Otter.ai has launched a new AI meeting assistant called OtterPilot, which aims to automate meetings and increase productivity. Link.

  • OtterPilot sends an AI-generated summary of key meeting topics via email after each meeting, including hyperlinks to important moments, and can capture images of slides shared during virtual meetings.

  • The meeting assistant also writes real-time meeting notes, which can be collaboratively edited, and can assign action items.

  • Otter Assistant, which can automatically join meetings and transcribe conversations, is now part of OtterPilot.

  • The aim is to make meetings more efficient and reduce the need for everyone to attend every meeting, with attendees able to engage without being distracted by note-taking.

  • OtterPilot's launch comes after Otter.ai reached the milestone of transcribing over 1 billion meetings.

Microsoft is preparing to integrate OpenAI's language AI technology and its Prometheus Model into its core productivity apps like Word, PowerPoint, and Outlook. Link.

  • The company is planning to announce its productivity plans in March, highlighting how quickly it wants to reinvent search and productivity apps through OpenAI investments.

  • Microsoft is already using OpenAI tech for its Viva Sales emails, and GPT models have been tested in Outlook to improve search results and suggest replies to emails.

  • Microsoft's new Prometheus Model has already transformed Bing web searches, and the next steps to integrate this functionality into core Microsoft Office apps and Teams will test just how confident Microsoft is in its AI work.

  • Microsoft is moving quickly with this integration mainly because of Google, which announced its ChatGPT rival Bard a day ahead of Microsoft's event.

  • The push for AI across Microsoft's products is driven by the consumer response to ChatGPT, and Microsoft wants to seize the AI opportunity throughout 2023.

Jasper for Business is a suite of AI enhancements that allow businesses to adopt AI without compromising their brand identity or workflow. Link.

  • Jasper Brand Voice is a tool that allows businesses to customize their AI-powered content by training Jasper on their brand's language, tone, and style.

  • With the improved browser extension and API, businesses can use Jasper anywhere they work and integrate it into their custom CMS.

  • New collaboration features allow teams to work together in real-time and manage their content creation workflows.

  • Jasper's AI Engine selects the right model for the job across a collection of LLMs, overlays it with reliable citations and recent searches, and infuses the output with brand and product information for tailored and optimized marketing results.

  • Jasper promises to continue innovating to integrate new possibilities and help businesses run better.

GPT Tip of the Week

Prompt: What are some best practices for ensuring that AI models receive a diverse and representative training data set?

Response: When training AI models, it is important to ensure that the data used to train the model is diverse and representative of the problem space the model will operate in. This can be achieved by considering the following factors:

  1. Data Source: Consider using multiple data sources to ensure a diverse range of examples. This can include data from publicly available datasets, data generated through simulation, and data collected specifically for the task at hand.

  2. Data Annotation: Ensure that data annotation is performed by a diverse group of annotators to reduce the chances of introducing biases. Additionally, perform spot-checks to ensure that the annotations are accurate and consistent.

  3. Data Balancing: Ensure that the data is balanced in terms of class distribution and demographic representation. For example, if the model will be used to detect objects in images, ensure that the data includes examples of objects from all classes and that these examples are evenly distributed.

  4. Data Privacy: When collecting data specifically for the task at hand, ensure that the data collected respects privacy laws and regulations, and that all individuals whose data is used have provided informed consent.

By considering these factors, you can increase the chances of developing an AI model that is effective and representative of the real-world problem it is designed to solve.
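
As a rough illustration of the data balancing point in the response above, the sketch below measures class distribution and then downsamples every class to the size of the smallest one. Downsampling is only one possible balancing strategy (oversampling or collecting more data are alternatives), and the function names and sample data here are hypothetical.

```python
import random
from collections import Counter


def class_distribution(labels):
    """Return the fraction of examples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def downsample_to_balance(examples, seed=0):
    """Randomly downsample each class to the size of the smallest class.

    examples: list of (item, label) pairs.
    """
    rng = random.Random(seed)
    by_label = {}
    for item, label in examples:
        by_label.setdefault(label, []).append((item, label))
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced


# Hypothetical imbalanced dataset: 8 "cat" images and 2 "dog" images.
examples = [(f"img_{i}", "cat") for i in range(8)]
examples += [(f"img_{i}", "dog") for i in range(8, 10)]

print(class_distribution([label for _, label in examples]))
balanced = downsample_to_balance(examples)
print(class_distribution([label for _, label in balanced]))
```

Downsampling discards data, so it is most reasonable when the majority class is large. Checking the distribution first, as above, is useful whichever strategy is chosen.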