Today in AI – 11-29-2025

KEY STORIES (past one to two days)

DeepSeek drops open‑weight math model that hits IMO gold — and publishes the playbook

Chinese lab DeepSeek released DeepSeekMath‑V2, an open‑weights reasoning model for which it reports gold‑medal‑level performance on the 2025 International Mathematical Olympiad (IMO) and a near‑perfect Putnam 2024 score (118/120). Unusually, DeepSeek published a detailed “generator–verifier–meta‑verifier” approach aimed at self‑verifiable theorem proving and posted weights under an Apache‑2.0 license on Hugging Face, along with a technical report on GitHub. For practitioners, this is a notable proof point that open‑weights models can approach state‑of‑the‑art reasoning, and a chance to study, replicate, or adapt the training recipe.

Why it matters:

Open research signal: DeepSeek shared architecture, evaluation details (IMO‑ProofBench), and code/outputs, giving applied teams concrete ideas to test (e.g., verifier‑driven reward shaping and scaled test‑time compute).
Enterprise implication: If results generalize beyond math, verifier‑guided workflows could harden agent reliability for regulated domains (e.g., finance, engineering), where auditable reasoning is required.
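
For teams that want to try the idea, a verifier‑guided workflow can be sketched in a few lines. The Python below is a generic generate, verify, revise loop under stated assumptions, not DeepSeek's training or inference code: `critique_and_fix`, `toy_generate`, and `toy_verify` are hypothetical names, and the two toy callables stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Attempt:
    answer: str
    score: float
    critique: str


def critique_and_fix(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    verify: Callable[[str, str], Tuple[float, str]],
    threshold: float = 0.9,
    max_rounds: int = 3,
) -> Attempt:
    """Draft an answer, score it with a verifier, feed the critique back, repeat."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    best: Optional[Attempt] = None
    critique: Optional[str] = None
    for _ in range(max_rounds):
        answer = generate(task, critique)        # the critique steers the revision
        score, critique = verify(task, answer)   # verifier returns score + feedback
        if best is None or score > best.score:
            best = Attempt(answer, score, critique)
        if score >= threshold:                   # good enough: stop spending compute
            break
    return best


# Toy stand-ins (assumptions): the first draft is wrong, the revision is right.
def toy_generate(task: str, critique: Optional[str]) -> str:
    return "408" if critique else "400"


def toy_verify(task: str, answer: str) -> Tuple[float, str]:
    if answer == str(17 * 24):
        return 1.0, "looks correct"
    return 0.0, "product is wrong; recompute"


result = critique_and_fix("What is 17 * 24?", toy_generate, toy_verify)
print(result.answer, result.score)
```

Raising `max_rounds` is the simplest way to trade extra test‑time compute for reliability, and returning the best‑scoring attempt (with its critique) leaves an audit trail even when no draft clears the threshold.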

X hands its Following feed to Grok AI by default — here’s what changes

Full story on ThinkAutomated.io

X (formerly Twitter) began rolling out a change that lets Grok — xAI’s model — rank posts in the “Following” timeline. Users can still switch back to an unfiltered chronological feed, but AI ranking becomes the default after app updates. For brands and media, expect shifts in organic reach dynamics; for power users, relevance‑weighted timelines may surface different conversations than pure recency.

Why it matters:

Platform governance: This is a mainstream deployment of an AI model as the default curator of a public timeline, which is a material policy and UX change for distribution.
Measurement: Social teams should re‑baseline engagement analytics and test post formats/topics that Grok appears to favor in ranking.
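
One way to re‑baseline is to compare engagement rates per topic before and after the ranked default reaches your audience. The sketch below assumes a hypothetical analytics export in which each post carries `topic`, `period` ("baseline" or "ranked"), `engagements`, and `impressions` fields; the field names and sample numbers are illustrative only, and a before/after split is a proxy, since the source does not say whether X reports which feed served an impression.

```python
from collections import defaultdict
from statistics import median


def per_topic_lift(posts, baseline="baseline", ranked="ranked"):
    """Relative change in median engagement rate per topic, ranked vs. baseline."""
    rates = defaultdict(lambda: defaultdict(list))
    for post in posts:
        if post["impressions"] > 0:
            rate = post["engagements"] / post["impressions"]
            rates[post["topic"]][post["period"]].append(rate)

    lift = {}
    for topic, by_period in rates.items():
        before, after = by_period.get(baseline), by_period.get(ranked)
        if not before or not after:
            continue  # need data from both periods to compare
        base = median(before)
        if base > 0:
            lift[topic] = median(after) / base - 1.0
    return lift


# Illustrative sample data (made up for the example).
posts = [
    {"topic": "product", "period": "baseline", "engagements": 20, "impressions": 1000},
    {"topic": "product", "period": "baseline", "engagements": 30, "impressions": 1000},
    {"topic": "product", "period": "ranked", "engagements": 40, "impressions": 1000},
    {"topic": "product", "period": "ranked", "engagements": 60, "impressions": 1000},
    {"topic": "hiring", "period": "baseline", "engagements": 50, "impressions": 1000},
    {"topic": "hiring", "period": "ranked", "engagements": 25, "impressions": 1000},
    {"topic": "memes", "period": "ranked", "engagements": 90, "impressions": 1000},
]

lift = per_topic_lift(posts)
for topic, value in sorted(lift.items()):
    print(f"{topic}: {value:+.0%}")
```

Medians are used rather than means so that a single viral post does not dominate a topic's baseline; topics with data from only one period are skipped rather than guessed at.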

xAI adds text‑to‑video to Grok Imagine on mobile

Full story on ThinkAutomated.io

xAI is rolling out a Grok Imagine update that generates short videos directly from a text prompt on iOS and Android, with quick switching between text‑to‑image and text‑to‑video. Early user demos show short, stylized clips, suggesting consumer chat apps are fast becoming entry points for everyday generative video. For growth teams, this widens the top‑of‑funnel for video creation without dedicated tools.

Why it matters:

Feature parity race: As video generation diffuses from specialist tools to general assistants, teams should anticipate rising expectations for multimodal outputs in customer‑facing experiences.

EMERGING TRENDS

Verifier‑guided reasoning goes open‑weights: DeepSeek’s release is a concrete signal that generator–verifier architectures (plus heavy test‑time sampling) are moving from papers to downloadable weights. Expect more agents to embed formal “critique and fix” loops (not just chain‑of‑thought), especially for code, finance, and safety‑critical workflows. Early evidence: IMO‑ProofBench, Putnam scores, and public repo artifacts. Impact: higher reliability and auditability for enterprise agents.
AI as default curator of social feeds: X’s shift makes AI ranking the default in a previously chronological space. This accelerates a pattern: recommender agents will increasingly arbitrate distribution on public platforms, with opt‑out controls retained but de‑emphasized. Impact: content strategies pivot from posting cadence to “relevance signals” tuned to model objectives.
Generative video moves into chat UX: Grok’s text‑to‑video on mobile underscores an industry drift: video generation landing inside messaging/assistants, not just creative suites. Expect short, template‑driven clips to proliferate in support, commerce, and UGC flows. Evidence: app‑level feature rollout and user demos in the wild.

CONVERSATIONS & INSIGHTS

“Open weights, open playbook?”: community reaction to DeepSeekMath‑V2
Where: Techmeme threads, X, Hacker News, Reddit.
Voices: The Decoder’s analysis, open‑source maintainers, and researchers debating self‑verifiable reasoning vs. final‑answer RL and whether the results generalize beyond math.
Takeaway: The combination of claimed state‑of‑the‑art results and open artifacts is fueling optimism around fast‑follower innovation, along with scrutiny of benchmark design and reproducibility.

“Do we trust AI to rank the public square?”: X’s Grok default sparks debate
Where: Tech and mainstream press coverage, user posts.
Voices: Platform watchers and social leads weighing better relevance against concerns over opacity and newsfeed manipulation; operationally, many point to the still‑available chronological toggle.
Takeaway: Teams active on X should re‑test publishing patterns, diversify channels, and monitor how Grok treats replies, links, and media.

“Text‑to‑video everywhere”: early Grok Imagine trials
Where: User demos and community chats.
Voices: Creators sharing iOS/Android clips; others noting staggered rollout and short clip limits.
Takeaway: Lightweight video prompts inside assistants could shift experimentation from specialist apps to daily messaging contexts. Expect rapid iteration on length, style controls, and licensing.

QUICK TAKEAWAYS

If you ship agent workflows, study DeepSeek’s verifier loop; consider adding an internal “critique‑and‑fix” step before final output for sensitive tasks.
For social reach on X, assume an AI‑ranked default. Test post structure (hook, media, link placement) and measure per‑topic lift under Grok ranking vs. chronological.
Prepare for “video by prompt” asks from stakeholders; map where short AI clips can accelerate onboarding, help, or marketing experiments inside your app.