Generated 2026-03-29 19:06 UTC

Tech / AI / IT Monitor

March 29, 2026 · Based on tweets from the last 24 hours · 173 tweets analyzed · model: ollama-cloud/minimax-m2.7:cloud

Executive Summary

The most significant development this period is the rapid adoption of Hermes Agent v0.5.0 from Nous Research, which has attracted over 2,600 community members in under a week with no marketing. Multiple users report superior performance compared to alternatives like OpenClaw, particularly for its memory management and tool-calling capabilities. In local AI infrastructure, discussions highlight the performance advantages of llama.cpp compiled from source over Ollama, with benchmarks showing 35 tok/s on Qwen 27B dense using a single RTX 3090. Developer tools continue advancing, including MCPorter v0.8.0 for CLI integration and new Claude Code extensions for privacy protection and skill creation.

Analysis

Patterns Observed:

- Strong community-driven momentum for open-source local AI solutions, with vocal criticism of "bloat" in commercial wrappers
- Hermes Agent emerging as the recommended alternative to OpenClaw, specifically for its per-model tool call parsers and memory handling
- Continued emphasis on owning AI infrastructure ("buy a GPU, compile from source") versus renting API access
- Growing frustration with token-count metrics as developer KPIs, shifting toward "token smarter" approaches

Emerging Trends:

- Continuous batching enabled by default in EXO v1.0.69 for distributed inference
- Privacy-focused extensions for AI coding agents (PII auto-detection)
- Skill creation frameworks with built-in evaluation capabilities for Claude Code
- Vibecoding gaining traction for rapid prototyping of apps and circuits

What to Watch:

- Hermes Agent community growth trajectory and open dataset release for training
- StepFun 3.5 performance comparisons against established models
- Continued adoption of llama.cpp over Ollama for production local AI deployments

Tweet Feed

Hermes Agent / Nous Research

@Teknium · 2026-03-29T16:06

Would people be interested in a large data dump of agentic trajectories in hermes agent to train any sized model to perform extremely well in hermes agent? 😳🤗 → tweet link


@sudoingX · 2026-03-29T17:23

2,600 people joined the hermes agent community in under a week. no ads, no giveaways. just builders helping builders.

if you're new to hermes agent or migrating away from tools that charge you per token to think, this is where you start. people in here share configs, debug setups, report bugs, and push open source forward together. → tweet link


@Teknium · 2026-03-29T16:07

RT @gkisokay: The founder of Hermes just confirmed something I missed.

'hermes skills browse' pulls directly from ClawHub and 7 other majo… → tweet link


@sudoingX · 2026-03-29T17:13

are you still paying per token to think? → tweet link


@sudoingX · 2026-03-29T06:41

this guy owes an apology to every builder who spends their time helping strangers run models on their own hardware for free while he charges you for an API wrapper.

the local model community debugs configs in comments and DMs for free. we share exact flags and tested benchmarks so people don't waste hours guessing. → tweet link


@sudoingX · 2026-03-29T06:23

the concept is right. local models replacing API bills is the future and i've been saying this for months. but the stack recommendation here will frustrate you before you even get started.

openclaw is 120K+ lines of typescript bloat that can't parse tool calls correctly on most local models. i've tested it extensively and have DMs from people whose "broken" models started working instantly after switching harnesses. the model was never the problem. the harness was.

ollama is convenient but it wraps llama.cpp with overhead and worse defaults. if you want real performance compile llama.cpp from source with CUDA. i get 35 tok/s on qwen 27B dense on a single 3090 with 262K context. flat speed, zero degradation. ollama won't really give you that.

the stack that actually works in march 2026 is llama.cpp compiled from source + hermes agent by nousresearch. per-model tool call parsers, fully open source, no corporation behind it mining your thinking, and the fastest growing agent community in local AI right now. → tweet link
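The recommended setup above can be sketched in two steps; this is a minimal, hedged example assuming a CUDA toolchain is already installed, with flag names taken from llama.cpp's CLI (the GGUF model path is a placeholder, not the author's actual file):

```shell
# Build llama.cpp from source with CUDA support (assumes git, cmake,
# and the CUDA toolkit are installed).
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Serve a model with a long context window and all layers on the GPU.
# -c sets context length, -ngl the number of GPU-offloaded layers;
# the model path is a placeholder.
./build/bin/llama-server -m /models/qwen-27b-q4_k_m.gguf -c 262144 -ngl 99
```

Whether a 262K context at full speed fits depends on the quantization and the card's VRAM; the 3090 numbers quoted are the tweet author's, not reproduced here.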


Local AI / GPU Infrastructure

@sudoingX · 2026-03-29T14:20

hey if you're considering nvidia's nemotron cascade 2 for agent coding on your 3090 this might save you time. here's what a few days of testing taught me.

speed settled. 187 tok/s flat from 4K to 625K context. 67% faster than qwen 3.5 35B-A3B on the same card. mamba2 is context independent and needs zero flags to get there. for chat, bash scripting, API calls, simple tool use, this model at this speed is unmatched in the 3B active class.

but i pushed it harder. gave it the same autonomous coding test i give every model. octopus invaders, a full space shooter game, pixel art enemies, particle systems, audio, HUD, game states. the kind of build that tests whether a model can hold architectural coherence across thousands of lines.

i ran it five times. multi file, single file, thinking mode on. broken imports, blank screens, skeleton code that never rendered a single frame. on the same 3090 qwen's 9B dense built 2,699 lines and was playable on its first iteration. cascade 2 at 3B active never got there.

3 billion active parameters winning gold at the international math olympiad is real. but math competitions and autonomous coding are different problems. the speed is there. the reasoning is there for structured tasks. but holding coherence across thousands of lines of game logic, particle systems, audio, and collision detection? 3B active MoE hits a ceiling.

cascade 2 is the fastest local model i've tested in its class. for complex agentic coding it's not ready at this size. test before you commit. → tweet link


@TheAhmadOsman · 2026-03-29T03:02

Continual learning is solved btw

It just happens that it requires running the model on your own hardware and the labs don't want that

Ultimately, your AI won't be first-class, in terms of quality, if it is not running on your own hardware → tweet link


@TheAhmadOsman · 2026-03-29T00:48

Running Claude Code w/ local models on my own GPUs at home

SGLang serving MiniMax on 8x RTX 3090s
nvtop showing live GPU load
Claude Code generating code + docs end-2-end on a single node from my AI cluster → tweet link
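A hedged sketch of what a launch like this could look like, based on SGLang's documented server launcher; the model id and port are assumptions for illustration, not the author's actual command:

```shell
# Launch an SGLang server sharded across 8 GPUs with tensor
# parallelism (--tp). The model id below is a placeholder.
python -m sglang.launch_server \
  --model-path MiniMaxAI/MiniMax-M2 \
  --tp 8 \
  --port 30000

# Any OpenAI-compatible client (including coding agents that accept a
# custom base URL) can then be pointed at http://localhost:30000/v1
```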


@TheAhmadOsman · 2026-03-29T18:14

KVCache quantization is a no-no as well

I'd rather quantize the model to 2-bit rather than quantize the KVCache to 4-bit or even 8-bit → tweet link
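For context, in llama.cpp terms these are the knobs being argued about; the flag names are real llama.cpp CLI options, but the trade-off framing is the tweet author's position, not a benchmark:

```shell
# llama.cpp exposes KV-cache precision via -ctk / -ctv
# (--cache-type-k / --cache-type-v); the default is f16.
# Note: quantized V-cache types may additionally require
# flash attention (-fa) to be enabled.

# The setting the tweet argues against:
./build/bin/llama-server -m model.gguf -fa -ctk q8_0 -ctv q8_0

# The setting it prefers (f16 KV cache, quantize weights instead):
./build/bin/llama-server -m model.gguf
```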


@TheAhmadOsman · 2026-03-29T16:26

a lot of REAPs out there DO NOT work and are a waste of time

lots of lists for models that fit under the 16GB / 8GB are simply useless (I don't wanna say engagement farming…)

there is so much unwarranted hype and I don't want newcomers to local AI disappointed

let's do better → tweet link


@TheAhmadOsman · 2026-03-29T01:40

I have been sleeping on StepFun 3.5

WHAT A BEAST → tweet link


@TheAhmadOsman · 2026-03-28T21:06

This is the same stupid shit that happened in January 2025 when NVIDIA's stocks dropped after DeepSeek R1 came out, we all know how that ended lol

P.S. Buy what you need while it's at a discount → tweet link


@alexocheema · 2026-03-29T01:30

Own your intelligence, don't rent it. → tweet link


Developer Tools

@steipete · 2026-03-29T00:10

RT @upster: Just published openclaw-teams-setup: one command to get your OpenClaw/NemoClaw agent running in Teams.

Simply run: npx opencla… → tweet link


@steipete · 2026-03-29T00:21

RT @StefanFSchubert: While social media is polarizing, evidence suggests AI may nudge people towards the centre.

This holds true of all st… → tweet link


@badlogicgames · 2026-03-29T17:27

RT @micLivs: Today I'm introducing not 1, not 2, but 12 🫣 new pi extensions.

agent web search is a mess I'm not going to try to clean up.…


@nummanali · 2026-03-29T16:12

The upgraded Skill Creator from the Claude Code team has a built-in eval framework that runs tests with child agents to determine effectiveness and drive improvement

What you have in the skill is years of researcher experience encoded

Very surprised by effectiveness → tweet link


@RydMike · 2026-03-29T08:26

RT @adamlyttleapps: Screw it, I made it open source..

This is Notchy -_-

He stops you getting distracted when using Claude code by replac… → tweet link


Open Source / Frameworks

@badlogicgames · 2026-03-29T14:23

RT @0xSero: Here's a codebase ready for plugin into any agentic harness that will auto strip secrets, anonymise data, and push it to huggin… → tweet link


@alexocheema · 2026-03-29T00:45

RT @exolabs: EXO v1.0.69 is out.

Continuous batching is now on by default. EXO will automatically batch together requests sent to any node… → tweet link
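One way to see continuous batching in action is to send two requests to a node at once; with batching on by default, both should be decoded together rather than queued serially. The port, endpoint, and model name below are assumptions based on EXO's ChatGPT-compatible API, not verified against v1.0.69:

```shell
# Fire two requests at one EXO node concurrently; continuous batching
# should decode them together. Port/endpoint/model are assumptions
# from EXO's ChatGPT-compatible API, not verified against v1.0.69.
for prompt in "Explain KV caches" "Write a haiku about GPUs"; do
  curl -s http://localhost:52415/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"llama-3.2-3b\", \"messages\": [{\"role\": \"user\", \"content\": \"$prompt\"}]}" &
done
wait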


@tinygrad · 2026-03-29T12:32

RT @JamesTervic: I will get stuck into tuning and testing Tinygrad Mac M3 and RTX Pro 6000 Workstation 300w edition and a Razer eGPU TB5 e… → tweet link


Software Development

@MatejKnopp · 2026-03-29T14:30

Well, TIL: dart --snapshot actually runs the dart file, and hangs if that hangs for any reason. So when flutter_tool analytics gets stuck, the snapshot never gets built and no error message is printed. Spent some time debugging dartdev, just to find out that it's actually flutter_tool. → tweet link


@iamdevloper · 2026-03-29T10:00

Debugging code is like cleaning a room. You start enthusiastically, but as hours pass you're crying, surrounded by mess you didn't know existed, falling into the abyss of despair. Let's not even talk about the cobwebs, also known as 'legacy code.' → tweet link


@badlogicgames · 2026-03-29T09:35

RT @siddharthkp: i'm seeing a new problematic behavior emerge in medium-big companies because of ai

somebody notices a performance issue… → tweet link


@badlogicgames · 2026-03-29T09:32

RT @mikehostetler: AI didn't kill software engineering

It killed the illusion that shipping code and engineering a system were the same th… → tweet link


@badlogicgames · 2026-03-29T09:27

RT @dexhorthy: noticing the ai coding meta is starting to shift in some circles from "token harder" to "token smarter"

In August the manda… → tweet link


@badlogicgames · 2026-03-29T15:00

we all should do this. token count as a kpi is fucking insane. → tweet link