News

GLM Coding Plan Special Offer

We have talked a bunch the past few weeks about Z.ai and their GLM series of models and how it is the best deal for agentic coding right now at only $3 a month.

Now that deal gets even better; new users can use the Vector Lab invite code to get 10% off any GLM Coding plan.

More GLM performance benchmarks

GLM-4.6 outperforms claude-4-5-sonnet while being ~8x cheaper — from gum1hox on Twitter (note, this is a math benchmark)

OpenAI Dev Day

Chat with Apps

The first announcement of Dev Day was the ability to Chat with Apps. This feature lets you embed your website in the ChatGPT app. Users can interact with the app directly, and ChatGPT can also control the app and answer any questions the user may have, using the app’s current context to give better answers.

Right now, you can use it by directly mentioning one of the partner apps that have already been released (like Canva), or the model can suggest an app to use for a given request.

It’s very easy to build your own app for ChatGPT. They have built the SDK on top of MCP, so if you have an existing MCP server, all you need is a tool that returns a UI and it should work in ChatGPT.

Actually getting your app published is a whole other issue, however, as OpenAI seems to be allowing only select businesses to add their apps to the ChatGPT website. Right now there are 7, with 11 more on the way. OpenAI says they will assess more near the end of the year, but I wouldn’t hold my breath if you are a small startup.

AgentKit

The next major release is their agent builder platform. It is similar to n8n or ComfyUI: you get a set of nodes that you can string together to create a custom workflow for your agents.
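If you have never used one of these node-based builders, here is a rough sketch of the idea in plain Python. This is not AgentKit's API or anything OpenAI ships; the node names (`classify`, `draft_reply`) and the `run_workflow` helper are made up for illustration. Each node is a small step that takes the running state and passes an updated state to the next node.

```python
from typing import Callable, Dict, List

# A node is a step that takes the running state and returns an updated copy.
# Hypothetical types and names, for illustration only.
State = Dict[str, str]
Node = Callable[[State], State]


def classify(state: State) -> State:
    """Toy routing node: label the input as a question or a statement."""
    text = state["input"].rstrip()
    label = "question" if text.endswith("?") else "statement"
    return {**state, "label": label}


def draft_reply(state: State) -> State:
    """Toy output node: pick a canned reply based on the label."""
    reply = "Let me look that up." if state["label"] == "question" else "Noted."
    return {**state, "reply": reply}


def run_workflow(nodes: List[Node], state: State) -> State:
    """Run the nodes in order, feeding each node's output into the next."""
    for node in nodes:
        state = node(state)
    return state


result = run_workflow([classify, draft_reply], {"input": "What is AgentKit?"})
print(result["reply"])
```

A visual builder draws this same chain as boxes and arrows, with real nodes calling a model or a tool instead of the toy logic here.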

The UI

The OpenAI team claims that it was primarily vibe-coded using their Codex models over the course of six weeks. This is not necessarily a good thing, as many users have mentioned a lack of polish in the app as well as a complicated and confusing UI.

I personally don’t think these visual builders are all that useful. If you’re a non-technical user, you don’t want to have to worry about any of the logic at all; you just want to give a description of the task and have an agent build out the actual workflow or code for you. And if you’re a more technical user, you’re going to want the additional control that writing the code yourself gives you. I think visual workflow editors are good for debugging and understanding the general flow of what your agent is doing, but I don’t think they are the way to actually build these agents.

CodexSDK

Claude Code and the Codex CLI are the best agentic platforms out there right now since they were made by the model creators, and they will continue to be in the future since those labs will be able to train their models on these frameworks specifically.

Claude Code has the Claude Agent SDK (recently rebranded from the Claude Code SDK), which allows you to programmatically use Claude Code and build your own workflows with it. The Codex CLI was missing its own SDK (something I thought about building myself), but it now exists.

This unlocks a whole new set of problems that you can conquer, as GPT-5 does not get stuck or hallucinate nearly as much as Claude does, and also has a far greater attention to detail.

The library is only in TypeScript for now, unfortunately, but I expect a Python version to be released in the near future as well. If you want to play around with it now, you can check it out in the Codex GitHub repo.

Misc

Sora 2 via the API
- Good pricing, but much more severe restrictions than in the app.
GPT-5 Pro API access
- Not a model most people know of, since you could only use it on the $200/month plan. You still shouldn’t use it, as it’s only a few percent better than normal GPT-5 high while being 12x more expensive.
GPT realtime mini and GPT image mini
- Smaller, faster, and cheaper versions of their normal counterparts. Expect quality to take a bit of a hit, but if you can handle the blow, these models will be much more cost effective.

Releases

Qwen3 VL 30B

Two weeks ago I complained about how Qwen3-VL was only available at 235B parameters and how I would like to have a 30B version as well.

Well, my wish came true, as this week they released the Qwen3-VL-30B model.

Benchmarks

The model does very well in image and video benchmarks for its size, and it shows only negligible decreases in its text-only abilities.

Because of its multimodal ability and strong text performance, along with its fast inference speed (it’s an MoE model with only 3B active params), I am switching to it as my local daily driver LLM.

Liquid AI 8B

Liquid AI has recently been speccing heavily into the small, efficient model space, which has been ignored by pretty much all of the major labs up to this point, despite being wanted by many consumers and businesses alike.

This week they continued this trend, releasing LFM2-8B-A1B, which, as the name suggests, has 8B parameters with 1B active, making it very fast, even on edge devices. It benchmarks around the Qwen3 4B level, while being 3x faster.

3x faster than 4B models

This is an extremely attractive model for deployment on phones, since they have the available memory to load the model in 4-bit (~4GB) and the model can run at a very respectable 50 tokens per second on an iPhone 17, while also being smart enough to be usable for real world tasks.
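If you are wondering where the ~4GB figure comes from, it is just parameter count times bytes per weight. Here is the back-of-the-envelope version as a small sketch. It takes the 8B in the model name at face value and counts only the weights, so the KV cache and runtime overhead are not included; the helper name `weight_memory_gb` is my own.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return num_params * bytes_per_weight / 1e9


total_params = 8e9  # assumption: the "8B" in the model name, taken as exactly 8 billion

print(weight_memory_gb(total_params, 4))   # 4-bit quantized: 4.0
print(weight_memory_gb(total_params, 16))  # 16-bit weights: 16.0
```

Note that all 8B parameters have to sit in memory even though only 1B are active per token. The active count helps speed, not the memory footprint.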

NeuTTS

There is a new, small, high quality text to speech model that can do voice cloning. It’s a 600 million parameter model called NeuTTS Air.

There are a bunch of models like this that get released every week, but this one stood out, as it has very natural sounding voice cloning, something that most models struggle with a lot. They normally tend to be robotic, noisy, or choppy, but NeuTTS doesn’t have any of these issues.

You don’t have to take my word for it though; you can test it right now for free on Hugging Face.

Quick Hits

Do LLMs like to gamble too much?

Do LLMs internalize human-like cognitive biases, like gambling addictions? The answer seems to be yes, as researchers have recently discovered.

Abstract for the paper

Finish

I hope you enjoyed the news this week. If you want to get the news every week, be sure to join our mailing list below.

Dancing through the void — by me (Andrew) using Fluxmania Legacy and the SynthWave Lora
