Hello Engineering Leaders and AI Enthusiasts!

Another eventful week in the AI realm. Lots of big news from huge enterprises.

In today’s edition:

🚀 OpenAI reveals new models, drops prices, and fixes ‘lazy’ GPT-4
🌌 Prophetic company wants AI to enter your dreams
📊 The recent advances in Multimodal LLM
🔝 Meta released Code Llama 70B, rivals GPT-4
🧠 Neuralink implants its brain chip in the first human
🌐 Alibaba announces Qwen-VL; beats GPT-4V and Gemini
🔍 Microsoft’s annual report predicts AI’s impact on jobs
✨ Amazon’s new AI helps enhance virtual try-on experiences
🤖 Cambridge’s braille-reading robot is 2x faster than humans
🛍️ Shopify boosts its commerce platform with AI enhancements
🚫 OpenAI explores how good GPT-4 is at creating bioweapons
💡 LLaVA-1.6: Improved reasoning, OCR, and world knowledge
🔥 Google bets big on AI with huge upgrades
🛒 Amazon launches an AI shopping assistant for mobile app
💼 Meta to deploy in-house chips to reduce dependence on NVIDIA

Let’s go!

OpenAI reveals new models, drops prices, and fixes ‘lazy’ GPT-4

OpenAI announced a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and lower pricing on GPT-3.5 Turbo.

The new models include:

2 new embedding models

An updated GPT-4 Turbo preview model 

An updated GPT-3.5 Turbo model

An updated text moderation model

Source 

Also:

Updated text moderation model

Introducing new ways for developers to manage API keys and understand API usage

Quietly implemented a new ‘GPT mentions’ feature to ChatGPT (no official announcement yet). The feature allows users to integrate GPTs into a conversation by tagging them with an ‘@.’ 

Source 

Prophetic wants AI to enter your dreams

Prophetic introduces Morpheus-1, the world’s 1st ‘multimodal generative ultrasonic transformer’. This innovative AI device is crafted with the purpose of exploring human consciousness through controlling lucid dreams. Morpheus-1 monitors sleep phases and gathers dream data to enhance its AI model. 

The device is set to be accessible to beta users in the spring of 2024. 

You can Sign up for their beta program here.

Source

The recent advances in Multimodal LLM

This paper ‘MM-LLMs’ discusses recent advancements in MultiModal LLMs which combine language understanding with multimodal inputs or outputs. The authors provide an overview of the design and training of MM-LLMs, introduce 26 existing models, and review their performance on various benchmarks.

(Above is the timeline of MM-LLMs)

They also share key training techniques to improve MM-LLMs and suggest future research directions. 

Source

Meta released Code Llama 70B, rivals GPT-4

Meta released Code Llama 70B, a new, more performant version of its LLM for code generation. It is available under the same license as previous Code Llama models–

CodeLlama-70B

CodeLlama-70B-Python

CodeLlama-70B-Instruct

CodeLlama-70B-Instruct achieves 67.8 on HumanEval, making it one of the highest-performing open models available today. CodeLlama-70B is the most performant base for fine-tuning code generation models.

Source

Neuralink implants its brain chip in the first human

In a first, Elon Musk’s brain-machine interface startup, Neuralink, has successfully implanted its brain chip in a human. In a post on X, he said “promising” brain activity had been detected after the procedure and the patient was “recovering well”. In another post, he added:

The company’s goal is to connect human brains to computers to help tackle complex neurological conditions. It was given permission to test the chip on humans by the FDA in May 2023.

Source

Alibaba announces Qwen-VL; beats GPT-4V and Gemini

Alibaba’s Qwen-VL series has undergone a significant upgrade with the launch of two enhanced versions, Qwen-VL-Plus and Qwen-VL-Max. The key technical advancements in these versions include

Substantial boost in image-related reasoning capabilities;

Considerable enhancement in recognizing, extracting, and analyzing details within images and texts contained therein;

Support for high-definition images with resolutions above one million pixels and images of various aspect ratios.

Compared to the open-source version of Qwen-VL, these two models perform on par with Gemini Ultra and GPT-4V in multiple text-image multimodal tasks, significantly surpassing the previous best results from open-source models.

Source

Microsoft’s annual report predicts AI’s impact on jobs

Microsoft released its annual ‘Future of Work 2023’ report with a focus on AI. It highlights the 2 major shifts in how work is done in the past three years, driven by remote and hybrid work technologies and the advancement of Gen AI. This year’s edition focuses on integrating LLMs into work and offers a unique perspective on areas that deserve attention.

Source

Amazon’s new AI helps enhance virtual try-on experiences

Amazon researchers have developed the “Diffuse to Choose” AI tool. It’s a new image inpainting model that combines the strengths of diffusion models and personalization-driven models; It allows customers to virtually place products from online stores into their homes to visualize fit and appearance in real-time.

Source

Cambridge researchers develop braille-reading robots 2x faster than humans

Cambridge researchers developed a robotic sensor reading braille 2x faster than humans. The sensor, which incorporates AI techniques, was able to read braille at 315 words per minute with 90% accuracy. 

While the robot was not developed as an assistive technology, the high sensitivity required to read braille makes it ideal for testing the development of robot hands or prosthetics with comparable sensitivity to human fingertips. The researchers used ML algorithms to train the robotic sensor to quickly slide over lines of braille text and accurately recognize the letters.

Source

Enjoying the weekly updates?

Refer your pals to subscribe to our newsletter and get exclusive access to 400+ game-changing AI tools.

Refer a friend

When you use the referral link above or the “Share” button on any post, you’ll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.

Shopify boosts its commerce platform with AI enhancements

Shopify unveiled over 100 new updates to its commerce platform, with AI emerging as a key theme. The new AI-powered capabilities are aimed at helping merchants work smarter, sell more, and create better customer experiences.

The headline feature is Shopify Magic, which applies different AI models to assist merchants in various ways. This includes automatically generating product descriptions, FAQ pages, and other marketing copy.

On the marketing front, Shopify is infusing its Audiences ad targeting tool with more AI to optimize campaign performance. Its new semantic search capability better understands search intent using natural language processing. 

Source

OpenAI explores how good GPT-4 is at creating bioweapons

OpenAI is developing a blueprint for evaluating the risk that a large language model (LLM) could aid someone in creating a biological threat. 

In an evaluation involving both biology experts and students, it found that GPT-4 provides at most a mild uplift in biological threat creation accuracy. While this uplift is not large enough to be conclusive, the finding is a starting point for continued research and community deliberation.

Source

LLaVA-1.6: Improved reasoning, OCR, and world knowledge

LLaVA-1.6 releases with improved reasoning, OCR, and world knowledge. It even exceeds Gemini Pro on several benchmarks. Compared with LLaVA-1.5, LLaVA-1.6 has several improvements:

Increasing the input image resolution to 4x more pixels. 

Better visual reasoning and OCR capability with an improved visual instruction tuning data mixture.

Better visual conversation for more scenarios, covering different applications.Better world knowledge and logical reasoning.

Efficient deployment and inference with SGLang.

Along with performance improvements, LLaVA-1.6 maintains the minimalist design and data efficiency of LLaVA-1.5. The largest 34B variant finishes training in ~1 day with 32 A100s.

Source

Google rolls out huge AI upgrades:

1. Launches an AI image generator – ImageFX

It allows users to create and edit images using a prompt-based UI. It offers an “expressive chips” feature, which provides keyword suggestions to experiment with different dimensions of image creation. Google claims to have implemented technical safeguards to prevent the tool from being used for abusive or inappropriate content.

Additionally, images generated using ImageFX will be tagged with a digital watermark called SynthID for identification purposes. Google is also expanding the use of Imagen 2, the image model, across its products and services.

(Source)

2. Google has released two new AI tools for music creation: MusicFX and TextFX

MusicFX generates music based on user prompts but has limitations with stringed instruments and filters out copyrighted content.

TextFX, conversely, is a suite of modules designed to aid in the lyrics-writing process, drawing inspiration from rap artist Lupe Fiasco. 

(Source)

3. Google’s Bard is now Gemini Pro-powered globally, supporting 40+ languages
The chatbot will have improved understanding and summarizing content, reasoning, brainstorming, writing, and planning capabilities. Google has also extended support for more than 40 languages in its “Double check” feature, which evaluates if search results are similar to what Bard generates. 

(Source)

4. Google’s Bard can now generate photos using its Imagen 2 text-to-image model
Bard’s image generation feature is free, and Google has implemented safety measures to avoid generating explicit or offensive content.

(Source)

5. Google Maps introduces a new AI feature to help users discover new places
The feature uses LLMs to analyze over 250M locations and contributions from over 300M Local Guides. Users can search for specific recommendations, and the AI will generate suggestions based on their preferences. It’s currently being rolled out in the US.

(Source)

Amazon launches an AI shopping assistant for product recommendations

Amazon has launched an AI-powered shopping assistant called Rufus in its mobile app. Rufus is trained on Amazon’s product catalog and information from the web, allowing customers to chat with it to get help with finding products, comparing them, and getting recommendations.

The AI assistant will initially be available in beta to select US customers, with plans to expand to more users in the coming weeks. Customers can type or speak their questions into the chat dialog box, and Rufus will provide answers based on their training.

Source

Meta to deploy custom in-house chips to reduce dependence on costly NVIDIA

Meta plans to deploy a new version of its custom chip aimed at supporting its AI push in its data centers this year, according to an internal company document. The chip, a second generation of Meta’s in-house silicon line, could help reduce the company’s dependence on Nvidia chips and control the costs associated with running AI workloads. The chip will work in coordination with commercially available graphics processing units (GPUs).

Source

That’s all for now!

Subscribe to The AI Edge and gain exclusive access to content enjoyed by professionals from Moody’s, Vonage, Voya, WEHI, Cox, INSEAD, and other esteemed organizations.

Subscribe now

Thanks for reading, and see you on Monday. 😊

Read More in  The AI Edge