I spotted Yamak on Hacker News last weekend, wedged between a Skyvern 2.0 benchmark post and Steel.dev’s new AI Browser Agent Leaderboard. Three open-source browser agent projects on the front page in one week. And a year ago this category barely had a name.

Who’s scoring what, and does it matter

Browser Use sits at 78,000 GitHub stars with an 89.1% on WebVoyager. Skyvern 2.0 posted 85.85% using a new Planner-Actor-Validator architecture, which nearly doubled their v1 score of around 45%. Stagehand from Browserbase rewrote everything for v3, dropping Playwright entirely in favor of direct Chrome DevTools Protocol communication. And then there’s Yamak, six GitHub stars, built in Kotlin on JetBrains’ Koog agent framework, which is wild because the entire rest of the ecosystem runs on Python or TypeScript and apparently nobody told the Yamak developers.

I care less about the scores than about the architectural bets underneath them. Skyvern decided screenshots and computer vision were the right input layer, not DOM parsing. Browser Use stuck with Playwright and layered planning logic on top. Yamak ships as a desktop-native app that hooks into your existing Chrome installation. Those are three completely different theories about where the abstraction boundary should sit, and I find that more revealing than benchmark rankings, because most of these tests run against cooperative websites that never deploy bot detection, never throw CAPTCHAs, never rearrange their layouts mid-session.

Steel.dev’s leaderboard fills in some gaps. Magnitude leads at 93.9%, proprietary. But Browser Use and Skyvern dominate the open-source tier because they’ve had the most runway to grind on error recovery and multi-step sequencing. I wrote about what these benchmarks actually measure in a previous post.

A public repo is not an audit

People keep treating “open source” like it means “trustworthy” and I think this conflation is going to burn someone. OpenClaw crossed 100,000 GitHub stars, and security researchers at Wiz almost immediately found unauthenticated access to its production database. The hole exposed tens of thousands of email addresses and prompted China’s Ministry of Industry and Information Technology to issue a formal misconfiguration warning. All of that code was sitting right there on GitHub the whole time. And nobody caught it by reading the source because nobody reads the source, not at scale, not with the kind of sustained adversarial scrutiny that security actually requires. A public repo is a license model, not a security guarantee, and when the software in question watches you browse, fills your forms, and handles your sessions, I want to know where my data actually flows and who holds the keys.

Yamak

Six stars. Kotlin. Desktop-native. I have no idea if it goes anywhere.

Nobody’s installing Python for this

Most open-source browser agents require you to clone a repo, configure Python or Docker, wire up an LLM backend, and run tasks from a terminal. This is fine if you’re a developer and useless if you’re anyone else, which includes the overwhelming majority of people whose browser is their entire workplace and who will never open a terminal voluntarily. It’s a point I keep coming back to, including in an earlier post.

dassi runs in Chrome’s side panel. The extension calls the LLM provider directly with your own API key, so your data never routes through our infrastructure, because there is no intermediary server. An extension architecture built around your own key removes the vendor-side server-compromise threat surface in one move (the provider itself still sees whatever you send it). More projects in this space should be stealing that idea.
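
Here is a minimal sketch of that data flow. It is written in Python for readability and is not dassi's code, since a real extension would do this in JavaScript or TypeScript. The endpoint URL, the payload shape, and both function names are made up for illustration. The only point is that the single outbound request is addressed to the provider and carries the user's key, with no vendor server in between.

```python
# Sketch of a bring-your-own-key request path.
# PROVIDER_URL and the payload fields are hypothetical placeholders,
# not any real provider's API.
import json
from urllib.parse import urlparse
from urllib.request import Request

PROVIDER_URL = "https://api.llm-provider.example/v1/chat"  # hypothetical endpoint


def build_provider_request(user_api_key, page_text, instruction):
    """Build (but do not send) a request that goes straight to the provider."""
    body = json.dumps({"instruction": instruction, "page": page_text}).encode("utf-8")
    return Request(
        PROVIDER_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {user_api_key}",  # the user's own key
            "Content-Type": "application/json",
        },
        method="POST",
    )


def destination_host(request):
    """Host that would receive the page content and the key."""
    return urlparse(request.full_url).hostname


req = build_provider_request("sk-user-owned-key", "<page text>", "Summarize this page")
print(destination_host(req))
```

Anything you want to audit in a design like this comes down to one question: which hosts show up as destinations. In the sketch there is exactly one, and it belongs to the provider.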

The subscription math

Google launched Chrome Auto Browse in late January, powered by Gemini 3. Perplexity made Comet free. OpenAI’s Operator keeps improving. But people keep building open-source alternatives anyway, because $20 to $200 per month for a commercial browser agent is a lot of money when you could pay the LLM provider directly at API rates and spend pennies on the dollar for most use cases. Bring-your-own-key plus local-only data plus readable source code is a package the big providers cannot assemble without dismantling the subscription model that pays for their infrastructure. Sort of the whole reason these projects exist.
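
To make the subscription math concrete, here is a back-of-the-envelope sketch. Every number in it is an assumption I picked for illustration: the task count, the tokens per task, and the per-million-token rates are not quoted prices from any provider, and real browser-agent runs can burn far more tokens than this.

```python
# Rough monthly cost comparison: flat subscription vs. bring-your-own-key.
# All workload figures and rates below are illustrative assumptions.

def byok_monthly_cost(tasks_per_month, input_tokens_per_task, output_tokens_per_task,
                      usd_per_million_input, usd_per_million_output):
    """Estimated monthly API spend in USD for a given workload."""
    input_cost = tasks_per_month * input_tokens_per_task * usd_per_million_input / 1_000_000
    output_cost = tasks_per_month * output_tokens_per_task * usd_per_million_output / 1_000_000
    return input_cost + output_cost


def break_even_tasks(subscription_usd, cost_per_task_usd):
    """Tasks per month at which BYOK spend equals the subscription price."""
    return subscription_usd / cost_per_task_usd


# Hypothetical workload: 60 tasks a month, 20k input and 1k output tokens each.
# Hypothetical rates: $3 per million input tokens, $15 per million output tokens.
TASKS = 60
monthly = byok_monthly_cost(TASKS, 20_000, 1_000, 3.0, 15.0)
per_task = monthly / TASKS

print(f"BYOK estimate: ${monthly:.2f}/month (${per_task:.3f} per task)")
for subscription in (20, 200):
    tasks = break_even_tasks(subscription, per_task)
    print(f"${subscription}/month plan breaks even at about {tasks:.0f} tasks")
```

Under these made-up inputs the API bill comes to about $4.50 a month, and you would need a few hundred tasks before a $20 plan pays for itself. Push the tokens per task up by an order of magnitude and the picture changes, which is why the claim only holds for most use cases and not all of them.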