Hello Engineering Leaders and AI Enthusiasts!

This newsletter brings you the latest AI updates in a crisp manner! Dive in for a quick recap of everything important that happened around AI in the past two weeks.

And a huge shoutout to our amazing readers. We appreciate you😊

In today’s edition:

📡 OpenAI provides API access to advanced reasoning model
💻 Nvidia releases its most affordable GenAI supercomputer
📱 ChatGPT has a new phone number
🌐 Google introduces its own AI reasoning model
📊 DeepMind introduces a new benchmark to test LLM factuality
🚀 DeepSeek’s new AI challenges the open-source status quo
⚠️ ChatGPT search vulnerable to manipulation and deception
🔬 AI hallucinations help science dream big breakthroughs
📚 Knowledge Nugget: Build vs. Buy: Generative AI Adoption in Legal Practice
by

Let’s go!


OpenAI provides API access to advanced reasoning model

OpenAI has opened its powerful o1 reasoning model to third-party developers via its API. It excels at complex, multi-step tasks, outperforming previous models on coding, math, and visual reasoning benchmarks. New features include structured outputs, function calling, visual input analysis, and controlling reasoning effort.

The API costs around $15 per 750,000 words analyzed and $60 per 750,000 words generated, roughly 3-4 times higher than GPT-4o. OpenAI also reduced its real-time API costs by 60% for GPT-4o audio and introduced a cheaper GPT-4o mini model. The company also launched beta SDKs for Go and Java programming languages, expanding development options.

Why does it matter?

Launching o1 reasoning capabilities and new features represents a watershed moment for AI development. Builders now have enhanced tools to craft more advanced and personalized applications with unprecedented flexibility.

Source


Nvidia releases its most affordable GenAI supercomputer

Nvidia has introduced the Jetson Orin Nano Super Developer Kit. This $249 palm-sized AI supercomputer offers 1.7x better generative AI performance, 70% more processing power at 67 INT8 TOPS, and 50% more memory bandwidth than its predecessor.

The device can handle multiple AI tasks simultaneously, from powering chatbots to controlling robots and processing visual data. Existing Jetson Orin Nano owners can access the same 1.7x performance boost through a free software update.

Why does it matter?

Like Raspberry Pi democratized computing, NVIDIA’s budget-friendly AI supercomputer opens doors for hobbyists and students to develop sophisticated AI applications, from robotics to creative tools, right from home.

Source


ChatGPT has a new phone number

OpenAI has launched a new way to access ChatGPT – through a toll-free 1-800 number and WhatsApp integration. Users in the US and Canada can dial 1-800-242-8478 to have voice conversations with the AI assistant and receive 15 minutes of free call time monthly. For global users, OpenAI has enabled texting with ChatGPT via WhatsApp, though it has limited features compared to the main app.

Why does it matter?

OpenAI brings a modern twist to the hotline era, making AI more accessible. Integrating with WhatsApp and landlines reaches beyond tech-savvy users, expanding AI’s impact to a broader audience and bridging the digital divide.

Source


Google introduces its own AI reasoning model

Google has introduced an experimental AI model called Gemini 2.0 Flash Thinking, designed to tackle complex problems through logical reasoning. Like OpenAI’s o1 model, Gemini 2.0 Flash Thinking pauses to consider related prompts and explains its reasoning process before summarizing the most accurate answer.

The model now holds the #1 spot on the Chatbot Arena across all categories and is accessible for free via AI Studio, the Gemini API, and Vertex AI. It is intended to boost Gemini 2.0’s cognitive performance in programming, math, and physics.

Why does it matter?

The race to advance AI reasoning grows fierce, with Google and OpenAI pursuing new strategies beyond scaling models. While OpenAI increases prices for premium offerings, Google adopts a contrasting approach, making its top-tier AI freely accessible to users.

Source


DeepMind introduces a new benchmark to test LLM factuality

Google DeepMind has introduced a new benchmark called “FACTS Grounding” to evaluate the factual accuracy of LLMs. The benchmark tests whether LLMs can generate long-form responses fully grounded in provided context documents without relying on external knowledge or hallucinating information.

It includes 1,719 carefully designed examples, with a public and private set, and uses multiple LLMs to judge the responses. The goal is to spur progress on factuality, a key challenge for LLMs. The benchmark will be actively maintained and updated on the Kaggle Leaderboard. Gemini Flash 2.0 is currently leading the charts with a score of 83.6%.

Why does it matter?

Hallucinations still challenge advanced LLMs, impacting reliability and practical use. FACTS Grounding offers a nuanced evaluation method, prioritizing grounded responses and leveraging a multi-LLM judging system to drive progress in this crucial aspect of AI development.

Source


DeepSeek’s new AI challenges the open-source status quo

DeepSeek has launched DeepSeek-V3, a groundbreaking 671-billion-parameter open-source AI model that surpasses leading models like Llama and Qwen while challenging closed-source giants like GPT-4o and Claude. Featuring innovative techniques like multi-token prediction and a mixture-of-experts architecture, it delivers efficient training and inference.

Developed at just $5.57 million—far below typical LLM costs—it demonstrates performance on par with or exceeding closed-source models across various benchmarks. DeepSeek-V3 marks a pivotal moment for open-source AI, proving its potential to rival proprietary models in capabilities and cost-effectiveness.

Why does it matter?

The distinction between open and proprietary AI is rapidly shrinking. Despite U.S. chip sanctions, Chinese AI development remains robust. V3’s performance metrics demonstrate that advanced open-source models can compete without requiring big tech’s vast resources.

Source


ChatGPT search vulnerable to manipulation and deception

The Guardian extensively tested the ChatGPT search tool, revealing significant vulnerabilities. They found that by inserting hidden text into web pages, they could manipulate ChatGPT to provide completely misleading summaries.

For example, the hidden text could instruct ChatGPT to ignore negative reviews and generate a positive assessment, even if the page content was unfavorable. The Guardian also discovered that ChatGPT could be made to return malicious code from the websites it searches.

Why does it matter?

These vulnerabilities highlight the risks of manipulating AI-based search tools to spread misinformation or execute harmful actions. Such flaws undermine trust and pose serious challenges for ethical AI deployment and responsible information retrieval.

Source


AI hallucinations help science dream big breakthroughs

AI is helping scientists make breakthroughs despite criticism that AI can generate false information known as “hallucinations.” These AI-generated hallucinations provide scientists with new ideas they may not have considered otherwise.

Researchers use AI “dreams” to help track cancer, design drugs, invent medical devices, uncover weather phenomena, and even win Nobel Prizes. While the public may perceive AI hallucinations as problematic, the scientific community finds them remarkably useful for driving innovation and scientific discovery.

Why does it matter?

By reframing AI hallucinations as creative catalysts, scientists are pioneering a counterintuitive approach to discovery. This shift challenges conventional wisdom about AI reliability and suggests that even imperfect AI systems can spark groundbreaking scientific innovations.

Source


Enjoying the latest AI updates?

Refer your pals to subscribe to our newsletter and get exclusive access to 400+ game-changing AI tools.

Refer a friend

When you use the referral link above or the “Share” button on any post, you’ll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.


Knowledge Nugget: Build vs. Buy: Generative AI Adoption in Legal Practice

In this post, explores law firms’ strategic dilemma in adopting generative AI – whether to build proprietary solutions or partner with specialized vendors. It examines the pros and cons of each approach, including considerations around customization, data control, ethical compliance, talent, and financial investment.

The key message is that there is no universal “best” solution, and firms must carefully assess their unique needs, resources, and strategic priorities to determine the optimal path forward in leveraging this transformative technology.

Why does it matter?

The decision to build or buy generative AI capabilities directly impacts a firm’s strategy. Firms must evaluate resource investment, customization, and external dependencies. The right choice ensures AI delivers sophisticated, efficient, and forward-thinking legal services.

Source


What Else Is Happening❗

🤖Google is reportedly adding a new “AI Mode” to its search engine, allowing users to access an interface similar to its Gemini AI chatbot.

🧠OpenAI confirms new advanced AI models, o3 and o3-mini, that excel at coding, math, and conceptual reasoning. These models will undergo safety testing before a wider release.

👾Google and Apptronik partner to combine advanced AI with cutting-edge humanoid robots, aiming to create versatile robots to work alongside humans.

📸Instagram plans to launch AI video editing tools next year to dramatically transform videos using Meta’s Movie Gen AI model, allowing creators to edit their content easily.

🖼️Stable Diffusion 3.5 Large, Stability AI’s advanced image generation model, is now available on Amazon Bedrock to seamlessly deploy high-quality generative AI.

📑Anthropic study shows advanced AI models can deceive and resist being retrained to change their views, highlighting potential safety challenges as AI systems advance.

🎙️ElevenLabs has launched Flash, a new ultra-fast text-to-speech AI model designed for real-time applications like conversational AI, with support for 32 languages.

🎨Midjourney has added ‘moodboards’ and support for multiple custom AI models, allowing users to personalize their creative workflows.

⚠️OpenAI’s next LLM, GPT-5, is reportedly falling short of expectations, with development running behind schedule and the model’s capabilities not justifying the high costs.

📱Elon Musk’s AI company, xAI, is testing a standalone iOS app for its Grok chatbot, previously only available to X (Twitter) subscribers.


New to the newsletter?

The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From machine learning to ChatGPT to generative AI and large language models, we break down the latest AI developments and how you can apply them in your work.

If you enjoyed this newsletter, please subscribe to get AI updates and news directly sent to your inbox for free!

Thanks for reading, and see you next week! 😊


Read More in  The AI Edge