Episode 123

I spoke with Suhail Doshi about:

Why benchmarks aren’t prepared for tomorrow’s AI models

How he thinks about artists in a world with advanced AI tools

Building a unified computer vision model that can generate, edit, and understand pixels

Suhail is a software engineer and entrepreneur known for founding Mixpanel, Mighty Computing, and Playground AI (they’re hiring!).

Reach me at editor@thegradient.pub for feedback, ideas, and guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

(00:00) Intro

(00:54) Ad read — MLOps conference

(01:30) Suhail is *not* in pivot hell but he *is* all-in on 50% AI-generated music

(03:45) AI and music, similarities to Playground

(07:50) Skill vs. creative capacity in art

(12:43) What we look for in music and art

(15:30) Enabling creative expression

(18:22) Building a unified computer vision model, underinvestment in computer vision

(23:14) Enhancing the aesthetic quality of images: color and contrast, benchmarks vs. user desires

(29:05) “Benchmarks are not prepared for how powerful these models will become”

(31:56) Personalized models and personalized benchmarks

(36:39) Engaging users and benchmark development

(39:27) What a foundation model for graphics requires

(45:33) Text-to-image is insufficient

(46:38) DALL-E 2 and Imagen comparisons, FID

(49:40) Compositionality

(50:37) Why Playground focuses on images vs. 3D, video, etc.

(54:11) Open source and Playground’s strategy

(57:18) When to stop open-sourcing?

(1:03:38) Suhail’s thoughts on AGI discourse

(1:07:56) Outro

Links:

Playground homepage

Suhail on Twitter

Read More in The Gradient