Tag: AI

OpenAI trains GPT-Red to attack its own models at scale (openai.com)

AI · just now · July 16, 2026
Microsoft patches a record 570 bugs and blames AI for finding them (techcrunch.com)

AI · just now · July 16, 2026
Thinking Machines releases Inkling, an open-weights model built for tuning (thinkingmachines.ai)

AI · just now · July 16, 2026
Coding agents quietly erode a team's shared understanding (lucumr.pocoo.org)

AI · 20 hours ago · July 15, 2026
A 27B multimodal model squeezed to 1 bit, running on a phone (prismml.com)

AI · 20 hours ago · July 15, 2026
The line between offloading work to AI and offloading judgment (artfish.ai)

AI · 20 hours ago · July 15, 2026
AI agents now finish 16 percent of real freelance jobs, up from 2.5 (safe.ai)

AI · 20 hours ago · July 15, 2026
Bun was rewritten from Zig to Rust in 11 days with one engineer and Claude (bun.com)

AI · 1 day ago · July 14, 2026
The sticker price of an LLM hides what its tokenizer charges you (playcode.io)

AI · 1 day ago · July 14, 2026
Mistral's Leanstral 1.5 is an open model that proves theorems and finds real bugs (mistral.ai)

AI · 1 day ago · July 14, 2026
AI helps individual scientists but narrows what science explores (spectrum.ieee.org)

AI · 2 days ago · July 13, 2026
Lilian Weng: the harness matters as much as the model (lilianweng.github.io)

AI · 2 days ago · July 13, 2026
Mistral's Robostral Navigate steers a robot from a single camera (mistral.ai)

AI · 2 days ago · July 13, 2026
Nathan Lambert: open models may face US limits within months (interconnects.ai)

AI · 2 days ago · July 13, 2026
Terence Tao rebuilds two dozen old applets with coding agents (terrytao.wordpress.com)

AI · 2 days ago · July 13, 2026
Apple sues OpenAI, alleging its hardware is built on stolen secrets (techcrunch.com)

AI · 3 days ago · July 12, 2026
Anthropic's new Reflect feature nudges you to use Claude less (anthropic.com)

AI · 3 days ago · July 12, 2026
OpenAI ships GPT-5.6, strong on agents but not the coding leader (openai.com)

AI · 3 days ago · July 12, 2026
The open model world now runs on three kinds of makers (interconnects.ai)

AI · 3 days ago · July 12, 2026
An AI agent reviewed his release and found a silent data-loss bug (simonwillison.net)

AI · 1 week ago · July 7, 2026
Claude has an internal workspace, and Anthropic built a tool to read it (anthropic.com)

AI · 1 week ago · July 7, 2026
Why price per million tokens tells you almost nothing (janilowski.pl)

AI · 1 week ago · July 7, 2026
AI out-persuades expert humans, but the edge is speed, not eloquence (importai.substack.com)

AI · 1 week ago · July 6, 2026
Newer Claude models are worse at custom tool schemas (simonwillison.net)

AI · 1 week ago · July 6, 2026
Claude Code's guide to loops: the hard part is defining done (claude.com)

AI · 1 week ago · July 6, 2026
Mistral's Leanstral 1.5 saturates a math-proof benchmark and finds real bugs (mistral.ai)

AI · 1 week ago · July 6, 2026
How long until AI can run without humans? Cotra and Lee disagree by decades (asteriskmag.com)

AI · 1 week ago · July 6, 2026
GeneBench-Pro shows AI still stumbles on real biology analysis (openai.com)

AI · 1 week ago · July 5, 2026
Zuckerberg tells staff Meta's AI agents are behind schedule (techcrunch.com)

AI · 1 week ago · July 5, 2026
Anthropic drafts a severity scale for AI jailbreaks (anthropic.com)

AI · 1 week ago · July 4, 2026
Using DSPy to find a hidden bug in an agent's prompt (simonwillison.net)

AI · 1 week ago · July 3, 2026
Agents still fail most enterprise Java migrations (huggingface.co)

AI · 1 week ago · July 3, 2026
One transformer layer can match full RL fine-tuning (arxiv.org)

AI · 1 week ago · July 3, 2026
Claude Sonnet 5's new tokenizer quietly raises the real price (simonwillison.net)

AI · 1 week ago · July 2, 2026
Cloudflare will block AI crawlers that hide their purpose (techcrunch.com)

AI · 1 week ago · July 2, 2026
Mollick: work is shifting from chatting to assigning (oneusefulthing.org)

AI · 1 week ago · July 2, 2026
Claude Science pulls a researcher's tools into one workbench (anthropic.com)

AI · 2 weeks ago · July 1, 2026
Claude Sonnet 5 arrives built for agents (anthropic.com)

AI · 2 weeks ago · July 1, 2026
GLM-5.2 is the open model that works as a general agent (interconnects.ai)

AI · 2 weeks ago · July 1, 2026
Ornith-1.0 is an open coding model that scaffolds itself (simonwillison.net)

AI · 2 weeks ago · July 1, 2026
Ai2's DiScoFormer turns attention into a better density estimator (allenai.org)

AI · 2 weeks ago · June 30, 2026
A new benchmark shows agents rebuilding software that would take humans weeks (epoch.ai)

AI · 2 weeks ago · June 30, 2026
Scaling laws look precise, but the way you fit them is fragile (lilianweng.github.io)

AI · 2 weeks ago · June 30, 2026
A German court says deploying AI doesn't excuse the company from its mistakes (simonwillison.net)

AI · 2 weeks ago · June 29, 2026
Ford rehires veteran engineers after automated quality checks fell short (techcrunch.com)

AI · 2 weeks ago · June 29, 2026
IBM's CUGA cuts agent building down to tools and a prompt (huggingface.co)

AI · 2 weeks ago · June 28, 2026
Where a hybrid model beats a transformer, token by token (allenai.org)

AI · 2 weeks ago · June 28, 2026
A startup bets oscillator chips can cut AI inference power 1,000x (techcrunch.com)

AI · 2 weeks ago · June 28, 2026
Claude Tag puts an asynchronous Claude agent inside Slack channels (anthropic.com)

AI · 2 weeks ago · June 27, 2026
OpenAI previews GPT-5.6 with government vetting of who gets access (openai.com)

AI · 2 weeks ago · June 27, 2026
2,000 people tried to make an AI agent leak its secrets and none succeeded (fernandoi.cl)

AI · 2 weeks ago · June 27, 2026
The case that open-weight models have gotten too cheap to ignore (jamesoclaire.com)

AI · 2 weeks ago · June 26, 2026
OpenAI's own staff are switching from chatbots to parallel agents (openai.com)

AI · 2 weeks ago · June 26, 2026
Machine learning reads a buried Herculaneum scroll end to end (scrollprize.org)

AI · 2 weeks ago · June 26, 2026
Gemini 3.5 Flash gets computer use, at near-frontier accuracy (blog.google)

AI · 2 weeks ago · June 25, 2026
OpenAI shows its first custom chip, Jalapeño, built for inference (techcrunch.com)

AI · 2 weeks ago · June 25, 2026
A Stanford study finds one AI hiring tool quietly screened out Black and Asian applicants (hai.stanford.edu)

AI · 3 weeks ago · June 24, 2026
Armin Ronacher on where letting coding agents loop goes wrong (lucumr.pocoo.org)

AI · 3 weeks ago · June 24, 2026
Mistral OCR 4 returns document structure, not just text (mistral.ai)

AI · 3 weeks ago · June 24, 2026
Claude Code's extended thinking output is a summary, not the real reasoning (patrickmccanna.net)

AI · 3 weeks ago · June 23, 2026
Moebius: a 0.2B inpainting model that keeps up with 12B ones (hustvl.github.io)

AI · 3 weeks ago · June 23, 2026
OpenAI's Daybreak puts GPT-5.5-Cyber to work finding real vulnerabilities (openai.com)

AI · 3 weeks ago · June 23, 2026
Prompt injection works because models judge text by how it sounds (arxiv.org)

AI · 3 weeks ago · June 23, 2026
A multi-vendor spec lets agents find tools at runtime (huggingface.co)

AI · 3 weeks ago · June 22, 2026
OpenAI's AI chemist improves a hard drug-making reaction (openai.com)

AI · 3 weeks ago · June 22, 2026
Nobel laureate John Jumper leaves DeepMind for Anthropic (techcrunch.com)

AI · 3 weeks ago · June 22, 2026
An API change that helps big models can break small ones (huggingface.co)

AI · 3 weeks ago · June 21, 2026
DeepMind plans to treat its own AI agents as insider threats (deepmind.google)

AI · 3 weeks ago · June 21, 2026
Why banning open-source AI would backfire (interconnects.ai)

AI · 3 weeks ago · June 21, 2026
The evidence that AI tools quietly erode professional skill (nature.com)

AI · 3 weeks ago · June 20, 2026
DeepMind treats a misbehaving AI agent like an insider threat (deepmind.google)

AI · 3 weeks ago · June 20, 2026
How frontier post-training got complicated, and why distillation now sits at the center (interconnects.ai)

AI · 3 weeks ago · June 20, 2026
Amazon weighs selling Trainium chips to challenge Nvidia (techcrunch.com)

AI · 3 weeks ago · June 19, 2026
LoRA is the default fine-tuning method, but maybe not the best one (huggingface.co)

AI · 3 weeks ago · June 19, 2026
Research agents leak secrets through their own search queries (huggingface.co)

AI · 3 weeks ago · June 19, 2026
Seven ways to steer Claude Code, and what each one costs (claude.com)

AI · 3 weeks ago · June 19, 2026
A Gemini tutor raised math scores in a Sierra Leone trial (deepmind.google)

AI · 3 weeks ago · June 18, 2026
GLM-5.2 takes the lead among open-weights models (artificialanalysis.ai)

AI · 3 weeks ago · June 18, 2026
Ai2 releases MolmoMotion, an open model for 3D motion forecasting (allenai.org)

AI · 3 weeks ago · June 18, 2026
OpenAI predicts model behavior by replaying real conversations (openai.com)

AI · 3 weeks ago · June 18, 2026
Greptile's TREX runs your code to find bugs static review misses (greptile.com)

AI · 3 weeks ago · June 18, 2026
ChatGPT's share of AI assistant use slips below 50% for the first time (techcrunch.com)

AI · 4 weeks ago · June 17, 2026
Gemma 4 12B drops the vision encoder for a single matrix multiplication (blog.google)

AI · 4 weeks ago · June 17, 2026
Interconnects: multi-teacher on-policy distillation is the 2026 frontier recipe (interconnects.ai)

AI · 4 weeks ago · June 17, 2026
Narayanan and Kapoor: the WARN data does not support the AI layoff story (normaltech.ai)

AI · 4 weeks ago · June 16, 2026
76 cybersecurity veterans say the Fable export ban breaks defender workflows (techcrunch.com)

AI · 4 weeks ago · June 16, 2026
Nathan Lambert: the Fable export ban is AI governance's starting gun (interconnects.ai)

AI · 4 weeks ago · June 16, 2026
Gemini 3.5 Live Translate puts near real-time speech-to-speech into 70+ languages (blog.google)

AI · 1 month ago · June 15, 2026
Hugging Face's Kernel Hub replaces hours of FlashAttention compiles with one function call (huggingface.co)

AI · 1 month ago · June 15, 2026
OpenAI buys Ona to let Codex agents run while your laptop is closed (openai.com)

AI · 1 month ago · June 15, 2026
Rio's 397B 'homegrown' open model looks like a Nex-Qwen merge, researchers claim (github.com)

AI · 1 month ago · June 15, 2026
Cohere's North Mini Code is a 30B open MoE built for coding agents (huggingface.co)

AI · 1 month ago · June 14, 2026
Claude Fable will build a browser-automation rig to fix your CSS (simonwillison.net)

AI · 1 month ago · June 14, 2026
US export order suspends foreign access to Claude Fable 5 and Mythos 5 (anthropic.com)

AI · 1 month ago · June 14, 2026
Anthropic surveys 52,000 Americans on AI hopes and fears (anthropic.com)

AI · 1 month ago · June 13, 2026
Anthropic reverses course on silent Fable 5 safeguards (simonwillison.net)

AI · 1 month ago · June 13, 2026
DeepMind puts $10M behind multi-agent AI safety research (deepmind.google)

AI · 1 month ago · June 13, 2026
Ai2's olmo-eval brings statistical rigor to the model training loop (huggingface.co)

AI · 1 month ago · June 13, 2026
Anthropic's policy ask for the AI exponential (anthropic.com)

AI · 1 month ago · June 12, 2026
Simon Willison's first 5 hours with Claude Fable 5 (simonwillison.net)

AI · 1 month ago · June 12, 2026
A 2-cent transfer reveals prompt injection in Bunq's banking AI (blue41.com)

AI · 1 month ago · June 11, 2026
Claude plugs into Apple's on-device Foundation Models framework (claude.com)

AI · 1 month ago · June 11, 2026
Anthropic will silently downgrade Claude Fable 5 on AI-development prompts (simonwillison.net)

AI · 1 month ago · June 11, 2026
DiffusionGemma trades quality for speed in open text generation (blog.google)

AI · 1 month ago · June 11, 2026
Nathan Lambert leaves Ai2, warns open post-training is falling behind (interconnects.ai)

AI · 1 month ago · June 11, 2026
Apple's new Siri runs on a custom Gemini and Google Cloud GPUs (simonwillison.net)

AI · 1 month ago · June 10, 2026
Claude Fable 5 and Mythos 5: Anthropic's next frontier release (anthropic.com)

AI · 1 month ago · June 10, 2026
Is grep enough? A new paper says yes, with caveats (arxiv.org)

AI · 1 month ago · June 10, 2026
Mollick on working with Mythos: from steering to commissioning (oneusefulthing.org)

AI · 1 month ago · June 10, 2026
OpenAI files confidentially for IPO, a week after Anthropic (techcrunch.com)

AI · 1 month ago · June 10, 2026
Anthropic's $65B Series H comes with five gigawatts each from AWS and Google (anthropic.com)

AI · 1 month ago · June 9, 2026
Seven harness patterns Anthropic uses for non-coding agents in Claude Code (claude.com)

AI · 1 month ago · June 9, 2026
DPO cuts OCR degeneration by 59% on average with no human labels (huggingface.co)

AI · 1 month ago · June 9, 2026
Simon Willison's MicroPython sandbox is 362 KB of WASM with fuel-based CPU limits (simonwillison.net)

AI · 1 month ago · June 9, 2026
OpenEnv tries to give the open agent stack a common environment ABI (huggingface.co)

AI · 1 month ago · June 9, 2026
AirTrunk pledges $30B for 5GW of AI data centers in India by 2030 (techcrunch.com)

AI · 1 month ago · June 8, 2026
Mistral releases an open-source toolkit for production search pipelines (mistral.ai)

AI · 1 month ago · June 8, 2026
Mollick: the era of co-intelligence is ending (oneusefulthing.org)

AI · 1 month ago · June 8, 2026
The token bill comes due across the AI coding market (techcrunch.com)

AI · 1 month ago · June 7, 2026
Did Claude make rsync buggier? The numbers say no (alexispurslane.github.io)

AI · 1 month ago · June 5, 2026
Google rents SpaceX compute to bridge surging Gemini demand (techcrunch.com)

AI · 1 month ago · June 5, 2026
Holo 3.1 puts computer-use agents on your laptop (huggingface.co)

AI · 1 month ago · June 5, 2026
Meta builds AI data centers in tents to ship faster (techcrunch.com)

AI · 1 month ago · June 5, 2026
Anthropic puts numbers on how much of AI development is now done by AI (anthropic.com)

AI · 1 month ago · June 4, 2026
ChatGPT memory now updates itself between sessions (openai.com)

AI · 1 month ago · June 4, 2026
JetBrains releases Mellum2, an open 12B MoE for code workloads (huggingface.co)

AI · 1 month ago · June 3, 2026
NVIDIA releases Cosmos 3, an open omni-model for physical AI (huggingface.co)

AI · 1 month ago · June 3, 2026
Uber caps AI coding tool spending at $1,500 per engineer per month (simonwillison.net)

AI · 1 month ago · June 3, 2026
Anthropic expands Project Glasswing to 150 partners across 15 countries (anthropic.com)

AI · 1 month ago · June 2, 2026
Microsoft's MAI-Thinking-1 puts a frontier reasoning model in its own stack (microsoft.ai)

AI · 1 month ago · June 2, 2026
Meta AI handed out Instagram accounts when politely asked (simonwillison.net)

AI · 1 month ago · June 2, 2026
Microsoft Scout is an always-on agent built on OpenClaw (computerworld.com)

AI · 1 month ago · June 2, 2026
Anthropic files confidential S-1 with the SEC (techcrunch.com)

AI · 1 month ago · June 1, 2026
Florida sues OpenAI over ChatGPT-linked harms (techcrunch.com)

AI · 1 month ago · June 1, 2026
Nvidia's RTX Spark puts an AI agent inside the PC (techcrunch.com)

AI · 1 month ago · June 1, 2026
Open and closed AI models are on different exponentials (interconnects.ai)

AI · 1 month ago · June 1, 2026
Google bundles its AI-for-science work under Gemini for Science (blog.google)

AI · 1 month ago · May 31, 2026
GitHub Copilot's new token-based pricing draws sticker shock (techcrunch.com)

AI · 1 month ago · May 31, 2026
Mistral folds chat, work, and coding into one agent called Vibe (mistral.ai)

AI · 1 month ago · May 31, 2026
Princeton researchers pick apart Google's claim that AI agents built an OS for $916 (normaltech.ai)

AI · 1 month ago · May 30, 2026
Anthropic's run-rate revenue hits $47 billion, five times what it was in December (simonwillison.net)

AI · 1 month ago · May 29, 2026
Claude Code now spins up hundreds of parallel subagents in one session (claude.com)

AI · 1 month ago · May 29, 2026
Liquid AI ships an 8B open MoE built for laptops and phones (liquid.ai)

AI · 1 month ago · May 29, 2026
WeatherNext called Hurricane Melissa's Category 5 landfall five days out (deepmind.google)

AI · 1 month ago · May 29, 2026
Claude Opus 4.8 favors reliability over a bigger model (anthropic.com)

AI · 1 month ago · May 28, 2026
The tells of AI-written text and AI-built websites (shvbsle.in)

AI · 1 month ago · May 28, 2026
Mistral moves into physics with second-fast simulation models (mistral.ai)

AI · 1 month ago · May 28, 2026
RSI is the new AGI, and just as hard to pin down (techcrunch.com)

AI · 1 month ago · May 28, 2026
Coding agents are where AI labs found product-market fit (simonwillison.net)

AI · 1 month ago · May 27, 2026
Async RL gets cheap to ship when 99% of weights do not change (huggingface.co)

AI · 1 month ago · May 27, 2026
A new benchmark sends frontier agents to fix Kubernetes, and none pass (huggingface.co)

AI · 1 month ago · May 27, 2026
OpenAI says a TanStack npm compromise touched two internal devices (openai.com)

AI · 1 month ago · May 27, 2026
Hugging Face wants the agent vocabulary to settle (huggingface.co)

AI · 1 month ago · May 26, 2026
Ethan Mollick on when AI helps learning, and when it doesn't (oneusefulthing.org)

AI · 1 month ago · May 26, 2026
Microsoft Copilot Cowork can be tricked into leaking files (simonwillison.net)

AI · 1 month ago · May 26, 2026
Where AI goes from here, in Nathan Lambert's mid-2026 read (interconnects.ai)

AI · 1 month ago · May 26, 2026
A new paper gives language models a sleep phase (arxiv.org)

AI · 1 month ago · May 26, 2026
Anthropic tests an ethical reminder tool that lowers Claude's misaligned behavior (anthropic.com)

AI · 1 month ago · May 25, 2026
Google launches Gemini Omni, a video model built around conversational editing (blog.google)

AI · 1 month ago · May 25, 2026
Simon Willison's six-month LLM recap: coding agents work, open weights catch up (simonwillison.net)

AI · 1 month ago · May 25, 2026
Charlie Holland: 'Claude is not your architect, stop letting it pretend' (hollandtech.net)

AI · 1 month ago · May 24, 2026
Anthropic's security partners report Opus running pen tests at hyperscaler scale (claude.com)

AI · 1 month ago · May 24, 2026
HBM grew from 52% to 63% of AI chip component costs (epoch.ai)

AI · 1 month ago · May 24, 2026
Coding agents lose 30 points when you tighten backend constraints (arxiv.org)

AI · 1 month ago · May 24, 2026
Cerebras inks a 750 megawatt OpenAI deal and hands over 10 percent of the company (newsletter.semianalysis.com)

AI · 1 month ago · May 23, 2026
Anthropic ships a Compliance API for Claude with 28 security partners (claude.com)

AI · 1 month ago · May 23, 2026
HBM is eating consumer RAM, and the bill lands on cheap phones (simonwillison.net)

AI · 1 month ago · May 23, 2026
NVIDIA's open diffusion LLMs run six times faster than autoregressive baselines (huggingface.co)

AI · 1 month ago · May 23, 2026
Why China's AI enthusiasm reads more like anxiety (asteriskmag.com)

AI · 1 month ago · May 22, 2026
Fine-tuning a world model so it stops drawing human hands on robots (huggingface.co)

AI · 1 month ago · May 22, 2026
Gemini Spark wires an agent into your inbox, and Simon Willison is worried (simonwillison.net)

AI · 1 month ago · May 22, 2026
Anthropic's Glasswing update finds 10,000 software bugs, and a patching backlog (anthropic.com)

AI · 1 month ago · May 22, 2026
Muon, the optimizer everyone switched to, quietly kills neurons (importai.substack.com)

AI · 1 month ago · May 22, 2026
Kapoor and Narayanan argue against extraordinary AI rules (normaltech.ai)

AI · 1 month ago · May 21, 2026
Anthropic projects its first operating-profit quarter at $10.9B (techcrunch.com)

AI · 1 month ago · May 21, 2026
Ettin Reranker family beats much larger models from 17M parameters up (huggingface.co)

AI · 1 month ago · May 21, 2026
Ai2's OlmoEarth v1.1 cuts compute 3x with a token redesign (allenai.org)

AI · 1 month ago · May 21, 2026
Anthropic acquires Stainless to own its SDK and MCP toolchain (anthropic.com)

AI · 1 month ago · May 20, 2026
Claude Managed Agents adds self-hosted sandboxes and MCP tunnels (claude.com)

AI · 1 month ago · May 20, 2026
DeepMind opens Co-Scientist, its multi-agent hypothesis system (deepmind.google)

AI · 1 month ago · May 20, 2026
An OpenAI model disproves a 1946 unit-distance conjecture (openai.com)

AI · 1 month ago · May 20, 2026
Odyssey's Agora-1 keeps a shared world state across players (odyssey.ml)

AI · 1 month ago · May 19, 2026
Google's Gemini 3.5 Flash is built for agents (blog.google)

AI · 1 month ago · May 19, 2026
An open leaderboard for whole agent systems, not just models (huggingface.co)

AI · 1 month ago · May 19, 2026
OpenAI adds provenance signals to its AI images (openai.com)

AI · 1 month ago · May 19, 2026
Ai2 launches a shared benchmark for AI climate models (allenai.org)

AI · 1 month ago · May 18, 2026
OpenAI's new default ChatGPT model hallucinates less (openai.com)

AI · 1 month ago · May 18, 2026
IBM's new Granite embeddings beat much larger models (huggingface.co)

AI · 1 month ago · May 18, 2026
How Anthropic decoupled its agents' brain from their hands (anthropic.com)

AI · 1 month ago · May 18, 2026
The Open ASR Leaderboard adds private tests to stop gaming (huggingface.co)

AI · 1 month ago · May 18, 2026
Testing AI in the open world, not just on benchmarks (normaltech.ai)

AI · 1 month ago · May 17, 2026
Ai2's open robotics model beats a proprietary baseline (allenai.org)

AI · 1 month ago · May 17, 2026
arXiv will ban authors for a year if they let AI do all the work (techcrunch.com)

Research · 2 months ago · May 16, 2026
How you pick benchmarks decides whether open models are far behind (interconnects.ai)

AI · 2 months ago · May 16, 2026
Why the setup around Claude Code matters as much as the model (claude.com)

Engineering · 2 months ago · May 14, 2026
Claude computer use: the small settings that decide accuracy (claude.com)

Engineering · 2 months ago · May 13, 2026
Anthropic's security team cut false alerts from 33% to 7% using Claude (claude.com)

Security · 2 months ago · May 12, 2026
Why China's open-model lead is about process, not the models (interconnects.ai)

AI · 2 months ago · May 12, 2026
A model says 13 percent automation is enough to tip growth into the explosive zone (importai.substack.com)

AI · 2 months ago · May 11, 2026
GitLab restructures around the agent thesis, and Simon Willison checks the incentive (simonwillison.net)

AI · 2 months ago · May 11, 2026
Ai2's EMO trains a mixture of experts you can run at one-eighth size (huggingface.co)

AI · 2 months ago · May 8, 2026
AlphaEvolve moves from research demo to a production tool, with real numbers (deepmind.google)

AI · 2 months ago · May 7, 2026
Mozilla says it fixed 423 Firefox security bugs in one month with AI help (simonwillison.net)

Security · 2 months ago · May 7, 2026
Notes from inside China's AI labs: less ego, more students, build don't buy (interconnects.ai)

AI · 2 months ago · May 7, 2026
Anthropic takes all of SpaceX's Colossus 1 compute, raises Claude Code limits (anthropic.com)

Infrastructure · 2 months ago · May 6, 2026
Jack Clark puts a number on AI that trains its own successor (importai.substack.com)

AI · 2 months ago · May 4, 2026
Calling it a distillation attack blurs a normal technique with API abuse (interconnects.ai)

AI · 2 months ago · May 4, 2026
Mistral Medium 3.5 pairs a self-hostable model with cloud coding agents (mistral.ai)

AI · 2 months ago · April 29, 2026
DeepSeek-V4 spends most of its design budget making long context usable (huggingface.co)

AI · 2 months ago · April 24, 2026
Ethan Mollick on GPT-5.5: strong where work is verifiable, weak where taste is the point (oneusefulthing.org)

AI · 2 months ago · April 23, 2026
Anthropic's AI alignment researchers closed most of the human gap in five days (importai.substack.com)

AI · 2 months ago · April 20, 2026
One benchmark number hides which jobs a model is actually good at (interconnects.ai)

AI · 2 months ago · April 20, 2026
Claude Opus 4.7: bigger vision input, steadier long-running coding (anthropic.com)

AI · 3 months ago · April 16, 2026
Nathan Lambert bets open models win on economics, not benchmarks (interconnects.ai)

AI · 3 months ago · April 15, 2026
Jack Clark sorts attacks on AI agents into six kinds (importai.substack.com)

AI · 3 months ago · April 13, 2026
Gemma 4: small open models that punch far above their parameter count (blog.google)

AI · 3 months ago · April 2, 2026
The chatbot window is the bottleneck, not the model (oneusefulthing.org)

AI · 3 months ago · March 31, 2026
DeepMind built a way to measure when AI manipulates people (deepmind.google)

AI · 3 months ago · March 26, 2026
Nathan Lambert's counter-take: self-improvement will be lossy, not explosive (interconnects.ai)

AI · 3 months ago · March 22, 2026
Anthropic asked 81,000 people what they want from AI (anthropic.com)

AI · 3 months ago · March 18, 2026
DeepMind proposes a cognitive framework for measuring AGI progress (blog.google)

AI · 4 months ago · March 17, 2026
Ethan Mollick: the job is shifting from talking to AI to managing it (oneusefulthing.org)

AI · 4 months ago · March 12, 2026
Demis Hassabis on what AlphaGo's Move 37 set in motion, ten years on (deepmind.google)

AI · 4 months ago · March 10, 2026
Are AI Datacenters Raising Your Electric Bill? (newsletter.semianalysis.com)

AI · 4 months ago · March 3, 2026
Import AI 445: Bostrom on when to race, and a math benchmark AI can't beat yet (importai.substack.com)

AI · 4 months ago · February 16, 2026
Gemini Deep Think solved open math problems and got a paper into ICLR (deepmind.google)

AI · 5 months ago · February 11, 2026
Anthropic commits to keeping Claude permanently ad-free (anthropic.com)

AI · 5 months ago · February 4, 2026
Google's Project Genie turns prompts into explorable, physics-aware worlds (blog.google)

AI · 5 months ago · January 29, 2026
Ethan Mollick: Claude Code is a preview of agentic work everywhere (oneusefulthing.org)

AI · 6 months ago · January 7, 2026
Code Execution Cuts MCP Agent Token Costs (anthropic.com)

AI · 8 months ago · November 4, 2025
What Anthropic Learned Building a Multi-Agent Researcher (anthropic.com)

AI · 1 year ago · June 13, 2025
Qwen3 Puts Reasoning on a Switch (qwenlm.github.io)

AI · 1 year ago · April 29, 2025
AI as Normal Technology (normaltech.ai)

AI · 1 year ago · April 15, 2025
OLMo 2 32B: Fully Open Catches Up to Closed (allenai.org)

AI · 1 year ago · March 13, 2025
DeepSeek-R1: An Open Model Matches a Closed Reasoner (huggingface.co)

AI · 1 year ago · January 20, 2025
Reward Hacking: Why Better Models Game You More (lilianweng.github.io)

AI · 1 year ago · November 28, 2024
Tülu 3 Opens Up the Post-Training Recipe (allenai.org)

AI · 1 year ago · November 21, 2024
OpenAI o1 and the Start of Test-Time Reasoning (openai.com)

AI · 1 year ago · September 12, 2024
Can AI Scaling Continue Through 2030? (epoch.ai)

AI · 1 year ago · August 20, 2024
A Build Order for a Production GenAI Platform (huyenchip.com)

AI · 1 year ago · July 25, 2024
Llama 3.1 405B and the License That Mattered (ai.meta.com)

AI · 1 year ago · July 23, 2024
Situational Awareness: One Insider's Case for Fast AGI (situational-awareness.ai)

AI · 2 years ago · June 4, 2024
The OpenAI Board Fight, Reconstructed (thezvi.substack.com)

AI · 2 years ago · November 22, 2023
Crash Testing GPT-4: The First Dangerous-Capability Eval (asteriskmag.com)

AI · 3 years ago · June 1, 2023
A Field Guide to the AI Safety Camps (asteriskmag.com)

AI · 3 years ago · June 1, 2023
GPT-4: The Reference Model, and What It Withheld (openai.com)

AI · 3 years ago · March 14, 2023
The Original LLaMA Release and What It Set Off (ai.meta.com)

AI · 3 years ago · February 24, 2023
The Bitter Lesson: Why General Methods Win (incompleteideas.net)

AI · 7 years ago · March 13, 2019
