#207 - GPT 4.1, Gemini 2.5 Flash, Ironwood, Claude Max

By: Skynet Today
Publisher: Skynet Today

Episode: 247 of 259
Length: 1h 42m
Language: English
Category: Non-fiction

Our 207th episode with a summary and discussion of last week's big AI news! Recorded on 04/14/2025

Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Read our text newsletter and comment on the podcast at https://lastweekin.ai/.

Join our Discord here! https://discord.gg/nTyezGSKwP

In this episode:

• OpenAI introduces GPT-4.1 with optimized coding and instruction-following capabilities, featuring variants like GPT-4.1 Mini and Nano, and a million-token context window.

• Concerns arise as OpenAI reduces resources for safety testing, sparking internal and external criticisms.

• xAI's newly launched API for Grok 3 showcases significant capabilities comparable to other leading models.

• Meta faces allegations of aiding China in AI development for business advantages, with potential compliance issues and public scrutiny looming.

Timestamps + Links:

• Tools & Apps

• (00:03:13) OpenAI’s new GPT-4.1 AI models focus on coding
• (00:08:12) ChatGPT will now remember your old conversations
• (00:11:16) Google’s newest Gemini AI model focuses on efficiency
• (00:14:27) Elon Musk’s AI company, xAI, launches an API for Grok 3
• (00:18:35) Canva is now in the coding and spreadsheet business
• (00:20:31) Meta’s vanilla Maverick AI model ranks below rivals on a popular chat benchmark

• Applications & Business

• (00:25:46) Ironwood: The first Google TPU for the age of inference
• (00:34:15) Anthropic rolls out a $200-per-month Claude subscription
• (00:37:17) OpenAI co-founder Ilya Sutskever’s Safe Superintelligence reportedly valued at $32B
• (00:40:20) Mira Murati’s AI startup gains prominent ex-OpenAI advisers
• (00:42:52) Hugging Face buys a humanoid robotics startup
• (00:44:58) Stargate developer Crusoe could spend $3.5 billion on a Texas data center. Most of it will be tax-free.

• Projects & Open Source

• (00:48:14) OpenAI Open Sources BrowseComp: A New Benchmark for Measuring the Ability for AI Agents to Browse the Web

• Research & Advancements

• (00:56:09) Sample, Don't Search: Rethinking Test-Time Alignment for Language Models
• (01:03:32) Concise Reasoning via Reinforcement Learning
• (01:09:37) Going beyond open data – increasing transparency and trust in language models with OLMoTrace
• (01:15:34) Independent evaluations of Grok-3 and Grok-3 mini on our suite of benchmarks

• Policy & Safety

• (01:17:58) OpenAI countersues Elon Musk, calls for enjoinment from ‘further unlawful and unfair action’
• (01:24:33) OpenAI slashes AI model safety testing time
• (01:27:55) Ex-OpenAI staffers file amicus brief opposing the company’s for-profit transition
• (01:32:25) Access to future AI models in OpenAI’s API may require a verified ID
• (01:34:53) Meta whistleblower claims tech giant built $18 billion business by aiding China in AI race and undermining U.S. national security
