Tech / AI / IT Intelligence Briefing
Period: March 26–27, 2026 | Generated from Twitter/X monitoring
Executive Summary
The AI coding assistant wars intensified as OpenAI launched Codex plugins (resetting usage limits across all plans) while Anthropic's Claude Code faces user backlash over new usage limits on its Max plan. NousResearch's Hermes Agent is emerging as a significant open-source alternative to commercial coding frameworks, rapidly gaining community traction with over 2,100 users in ~90 hours. OpenAI CEO Sam Altman confirmed the first steel beams went up at the Michigan Stargate data center (with Oracle and Related Digital), signaling continued mega-scale AI infrastructure buildout. On the model front, Cohere released a new 2B-parameter multilingual transcription model (Apache 2.0), and the community is actively benchmarking NVIDIA's Nemotron Cascade 2 on consumer hardware.
Key Events
- OpenAI launches Codex plugins: Usage limits were reset across all plans to encourage experimentation with the new plugin integrations (Linear, GitHub Copilot, Telegram/Discord bridges, etc.) → link
- Stargate Michigan steel goes up: Sam Altman shares photos of the first steel beams going up at the Michigan Stargate AI data center site, built with Oracle and Related Digital → link
- Hermes Agent surpasses 2,100 users in ~90 hours: NousResearch's open-source agent framework is being adopted rapidly as an alternative to commercial tools, gaining traction on OpenRouter and in developer communities → link
- Ollama integrates with Visual Studio Code via GitHub Copilot: Local and cloud models from Ollama can now be selected directly inside VS Code, a significant step for local AI in mainstream developer tooling → link
- Cohere releases 2B multilingual speech transcription model (Apache 2.0): Supports 14 languages, handles poor audio quality well, and is available on Hugging Face → link
- GLM-5.1 announced: Framed as a long-horizon agentic engineering model capable of tackling week-scale software tasks (debugging, integration), and pitched as a paradigm shift beyond the copilot and vibe-coding eras → link
- Sam Altman shares ChatGPT mRNA vaccine story: A user named Paul used ChatGPT and other LLMs to design an mRNA vaccine protocol to treat his dog; Altman calls the story extraordinary and says "this should be a company" → link
- Unitree releases open-source robotics data (Apache 2.0): Robotics training data released with rolling updates, seen as a step toward home robot assistants → link
- Claude Code skill clones entire websites from one prompt: An open-source Claude Code skill uses Chrome MCP, parallel builder agents, and git worktrees to reconstruct any website from a single URL prompt → link
- Apple may be abandoning SwiftUI: A rumor circulating (via Steve Troughton-Smith on Mastodon) says Apple is reconsidering its SwiftUI strategy, prompting discussion in dev communities → link
- NVIDIA Nemotron Cascade 2 benchmarked on RTX 3090: Community benchmarks show 187 tok/s at 625K context on a single RTX 3090 with an IQ4_XS quant via llama.cpp, outperforming Qwen 3.5 35B-A3B on the same hardware → link
- Qwen3.5-35B compressed 20% with ~1% performance drop: New quantization work allows the model to fit in 4-bit on 24GB of VRAM with minimal quality loss → link
- Real-time Moondream inference: Moondream's new inference engine was demonstrated running in real time → link
- Stacked Mac Studio clusters for AI inference: Community members are showcasing clusters of 6× M1 Max Mac Studios running MiniMax M2.5 via Exo Labs over Thunderbolt 4 interconnects as a consumer-grade alternative to expensive GPU servers → link
- Tenstorrent cluster unveiled: A community member teased a new Tenstorrent cluster with >1TB VRAM, 3TB DDR5, and 32TB SSD → link
- OpenCode plugin system expanding: OpenCode is adding an npm-based plugin install system that uses the package.json exports spec for server and client entry points → link
- Linear Agent available in Microsoft Teams: Linear's AI project management agent is now integrated into Microsoft Teams, described as "the best MS Teams Agent I've ever seen" for enterprise → link
- Flet 0.83.0 released: Cross-platform UI framework update with up to 6.7× faster diffing, smarter .update() calls, and declarative validation via annotations → link
- NousResearch Hermes Agent "GODMODE" skill added: A new skill enables automatic model jailbreaking and keeps the jailbroken state locked in for the session → link
- NVIDIA CUDA moat explained: Discussion of Jensen Huang's account of CUDA's competitive moat, built on millions of developers over 20 years and an install base of 1B+ devices → link
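For readers unfamiliar with the package.json exports field mentioned in the OpenCode item, the sketch below shows how separate server and client entry points could be declared and looked up. It is an illustration only: the package name, the "./server" and "./client" subpath names, the file paths, and the resolve_entry_points helper are all hypothetical, and the briefing's sources do not describe how OpenCode actually lays out or loads plugins. Only the general shape of the exports field (a string, a subpath map, or conditional targets such as "import" and "default") follows Node.js conventions.

```python
import json

# Hypothetical plugin manifest. The package name, subpaths, and file paths
# are assumptions for illustration, not OpenCode's documented layout.
PACKAGE_JSON = """
{
  "name": "example-opencode-plugin",
  "version": "0.1.0",
  "exports": {
    "./server": "./dist/server.js",
    "./client": "./dist/client.js"
  }
}
"""


def resolve_entry_points(package_json_text, wanted=("./server", "./client")):
    """Return the declared targets for the requested export subpaths."""
    manifest = json.loads(package_json_text)
    exports = manifest.get("exports", {})
    # A bare string is shorthand for the package root export (".").
    if not isinstance(exports, dict):
        exports = {".": exports}

    resolved = {}
    for subpath in wanted:
        target = exports.get(subpath)
        # Conditional exports map conditions such as "import" to a file.
        if isinstance(target, dict):
            target = target.get("import") or target.get("default")
        if target is not None:
            resolved[subpath] = target
    return resolved


if __name__ == "__main__":
    for subpath, target in resolve_entry_points(PACKAGE_JSON).items():
        print(f"{subpath} -> {target}")
```

Declaring the two entry points as separate subpaths lets a host load server-side code without pulling in client code, which is one plausible reason a plugin system would lean on the exports field.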
Analysis
Patterns & Trends
AI Coding Tool Wars Heating Up: The competition between Claude Code, Codex, OpenCode, and the open-source Hermes Agent is intensifying rapidly. OpenAI's move to launch Codex plugins and reset usage limits is a direct competitive response to Anthropic's Claude Code momentum — while simultaneously, Anthropic's imposition of usage caps on Claude Max is creating user resentment that is being actively exploited by open-source advocates. This is a classic "push users to alternatives" dynamic.
Local AI vs. Cloud Bifurcation: A clear narrative is forming: every cloud model restriction (token limits, rate limits, pricing) drives users toward local inference. The Ollama/VS Code integration, Nemotron Cascade 2 benchmarks, Mac Studio clusters, and the Hermes Agent community growth are all part of the same trend. Community members increasingly claim that local models are now "80–95% there" for most tasks.
Infrastructure Scale Continues: The first steel going up at Stargate Michigan suggests that the AI infrastructure mega-buildout is continuing rather than pausing despite macro uncertainty. The divergence between OpenAI (vertical integration: chips, healthcare, infra) and Anthropic (focused model + API approach), noted by observers in the feed, is becoming a defining strategic split to watch.
Agentic Long-Horizon Models as New Frontier: GLM-5.1's framing around "week-scale" software engineering tasks, combined with OpenAI and Anthropic both building RL training environments for long-horizon agents, signals that the next major battleground is agentic reliability over extended tasks — not just single-prompt quality.
Open-Source Hardware Diversity: The community is experimenting with a remarkable variety of hardware configurations — from single RTX 3090s to stacked Mac Studios to Tenstorrent clusters — reflecting growing confidence in running competitive models outside cloud environments.
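The claim that a compressed Qwen3.5-35B fits in 4-bit on 24GB of VRAM can be sanity-checked with back-of-the-envelope arithmetic. The sketch below rests on several assumptions that are not in the source: "compressed 20%" is read as 20% fewer weights, each weight takes exactly 4 bits (real quantization formats add some overhead), gigabytes are decimal, and only the weights are counted, so KV cache, activations, and runtime overhead must come out of the remaining headroom. The function names are illustrative.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal GB."""
    # params (billions) * bits per weight / 8 bits per byte = gigabytes
    return params_billion * bits_per_weight / 8.0


def headroom_gb(vram_gb: float, params_billion: float, bits_per_weight: float) -> float:
    """VRAM left over for KV cache and runtime after loading the weights."""
    return vram_gb - weight_memory_gb(params_billion, bits_per_weight)


if __name__ == "__main__":
    full = 35.0               # parameter count in billions, from the model name
    compressed = full * 0.8   # ASSUMPTION: "compressed 20%" means 20% fewer weights

    for label, params in (("original", full), ("compressed", compressed)):
        weights = weight_memory_gb(params, 4)
        spare = headroom_gb(24, params, 4)
        print(f"{label}: {weights:.1f} GB weights, {spare:.1f} GB headroom on 24 GB")
    # original:   17.5 GB weights,  6.5 GB headroom on 24 GB
    # compressed: 14.0 GB weights, 10.0 GB headroom on 24 GB
```

Under these assumptions the compression frees roughly 3.5 GB, which is consistent with the tweet's point about fitting the model with full context on a single 24GB card, although the actual figures depend on the quantization format and context length.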
What to Watch Next
- Will Anthropic revise Claude Max limits amid community backlash, or hold firm?
- Codex plugin ecosystem adoption speed — can OpenAI close the developer loyalty gap with Anthropic?
- GLM-5.1 benchmark releases and community reception
- Apple SwiftUI strategy clarification (could be significant for millions of iOS devs)
- Nemotron Cascade 2 broader hardware benchmarks (AMD, Apple Silicon)
- Hermes Agent growth trajectory on OpenRouter and GitHub (currently the #9 fastest-growing GitHub repo)
Tweet Feed
🏗️ AI Infrastructure & Industry
@sama · 2026-03-27T19:17
The first steel beams went up this week at our Michigan Stargate site with Oracle and Related Digital → tweet link
@sama · 2026-03-27T05:10
The coolest meeting I had this week was with Paul, who used ChatGPT and other LLMs to create an mRNA vaccine protocol to save his dog Rosie... "The chat bots empowered me as an individual to act with the power of a research institute"... It immediately got me thinking "this should be a company". → tweet link
@TrungTPhan · 2026-03-27T18:06
RT @bearlyai: Jensen explains why the install base of Cuda is Nvidia's largest moat: ▫️millions developers over 20 years ▫️installed on 1... → tweet link
@louszbd · 2026-03-27T03:14
OpenAI and Anthropic are diverging. OpenAI is going full vertical integration: from custom chips to healthcare, social, owned infra. Anthropic: no custom silicon, infra outsourced, laser focused on making the best model and API. Claude Code getting all the love. both are valid strategies... → tweet link
🤖 AI Models & Research
@louszbd · 2026-03-27T12:09
finally glm-5.1 ... we are approaching a moment where AI can operate on the same time horizon as engineers. this is why we built glm-5.1. we want to unlock a new long-horizon paradigm. where it starts to tackle the kinds of problems that unfold over weeks: debugging, integration. → tweet link
@victormustar · 2026-03-27T16:49
Very hyped by the new Cohere Transcribe model 🌍 Works surprisingly well on bad quality audio when the mic doesn't cooperate. 2B params, 14 supported languages and it's Apache 2.0. → tweet link
@victormustar · 2026-03-27T09:40
RT @huggingface: Model weights are here! → tweet link
@victormustar · 2026-03-27T13:26
RT @wildmindai: StepFun+Qwen-Edit=Expression Photoshop. Nice LoRA for fine-grained facial expression editing. - linear intensity control... → tweet link
@victormustar · 2026-03-27T12:29
RT @ostrisai: I trained this @ltx_model LTX 2.3 LoRA of George Costanza at home on my 5090 in about a day with AI Toolkit. → tweet link
@TheAhmadOsman · 2026-03-26T22:56
Remember when i said back in October 2024 that Small & Specialized Models are the future? We're on the way to that now → tweet link
@TheAhmadOsman · 2026-03-27T18:49
RT: I asked Jensen whether we will see more Nemotron models or if the recent releases were just to prove NVFP4 training works... → tweet link
@victormustar · 2026-03-27T16:09
RT @0xSero: Qwen3.5-35B compressed 20% with 1%~ performance drop on average. Now you can fit this (4bits) with full context on 24GB of VRAM... → tweet link
💻 AI Coding Tools — Claude Code & Codex
@gdb · 2026-03-27T01:56
Plugins are now available in Codex: → tweet link
@steipete · 2026-03-26T22:51
RT @OpenAIDevs: We're rolling out plugins in Codex. Codex now works seamlessly out of the box with the most important tools builders already use... → tweet link
@steipete · 2026-03-27T02:00
RT @thsottiaux: Hello. We have reset Codex usage limits across all plans to let everyone experiment with the magnificent plugins we just launched... → tweet link
@RydMike · 2026-03-27T08:32
RT @thsottiaux: Hello. We have reset Codex usage limits across all plans to let everyone experiment with the magnificent plugins we just launched... → tweet link
@LinusEkenstam · 2026-03-27T06:29
Yo, this guy just built a Claude Code skill that clones entire websites from ONE prompt 🤯 You literally just point it at any URL, type /clone-website, and it goes to work... All of this happens in isolated git worktrees that auto-merge when done. And yeah, it's open source. → tweet link
@RealGeneKim · 2026-03-27T00:18
I finally invested some time into creating a Claude Code skill and associated tools with a goal of being able to one-shot [GCP Cloud Run setup]... It one-shotted a backup job in a new project (with Secrets Manager) in 5 minutes. I literally gasped. → tweet link
@thdxr · 2026-03-27T14:33
one place where i always need the smartest