Harness Engineering: The Core Engineering Discipline of the AI Agent Era

March 27, 2026 · 2253 words · 11 min

2025 was the year AI agents exploded. 2026’s defining theme is the Agent Harness — the runtime control framework that wraps around agents. OpenAI and Anthropic have both published dedicated articles on harness engineering, and systematic academic papers have emerged. This post breaks down this emerging discipline from an engineering perspective: what it is, why it matters, and how to do it.

Fixing OpenCode Prompt Cache Misses When Using GPT via Third-Party Proxy

March 26, 2026 · 539 words · 3 min

While using OpenCode with GPT 5.3 Codex for daily development, I noticed abnormally high token consumption — around 69K input tokens per request with virtually zero cache hits. The same model and proxy worked fine with Codex CLI, where caching functioned as expected. This post documents the full debugging and resolution process.

Build a Weather Query Agent

August 22, 2024 · 1078 words · 6 min

In artificial intelligence, agents are among the most advanced applications of large language models (LLMs).

Understanding the "Temperature" of Large Language Models

August 20, 2024 · 539 words · 3 min

Large language models (LLMs) have become sophisticated tools for generating human-like text. A key parameter for steering these models is the “temperature,” which controls how random, and therefore how varied, the generated text is. This post aims to demystify the temperature setting in LLMs.

Build Your AI Search Bot with LLM and Search Engines

August 15, 2024 · 938 words · 5 min

Combining large language models (LLMs) with search engines has brought major changes to information retrieval. This article explains in detail how to use these technologies to build an AI search bot, an intelligent information retrieval assistant. It covers the project background, core principles, code implementation, usage examples, and links to the open-source code.

Retrieval-Augmented Generation

April 24, 2024 · 759 words · 4 min

In an era of information overload, retrieving useful information from vast amounts of data is becoming increasingly difficult. To address this, researchers have proposed a technique called Retrieval-Augmented Generation (RAG).

RAG combines retrieval and generation to make extracting information from large-scale data more efficient and accurate.

This article introduces RAG's definition, how it works, and the problems it addresses.