Components of a Coding Agent

(magazine.sebastianraschka.com)

32 points by GN⁺, 24 days ago

A coding agent is a system built from an LLM-centered control loop and a software harness; it works in a repeated cycle of writing code, running it, and acting on feedback.
The agent harness handles context management, tool access, prompt composition, and state control, while a coding harness, specialized for coding tasks, manages the repository, tests, and error checking.
A coding agent rests on six components: live repo context, prompt cache, tool access, context management, session memory, and subagent delegation.
Depending on the quality of the harness design, performance and user experience can differ widely even with the same LLM; a well-designed harness provides a more persistent and context-aware development environment.
Mini Coding Agent is a minimal pure-Python example of this structure; it differs from OpenClaw in its coding specialization and operational scope.

Components of a Coding Agent

A coding agent is a system that combines an LLM-centered control loop with the software harness surrounding it, and works in a repeated cycle of writing, modifying, and running code and acting on feedback.
An LLM is fundamentally a next-token prediction model, while a reasoning model is an LLM trained to do more intermediate reasoning and verification.
The agent is the control loop that repeatedly runs model calls, tool use, state updates, and the termination decision in order to reach a goal.
The agent harness is the software structure around this loop; it handles context management, tool access, prompt composition, state control, and so on.
A coding harness is the form of this specialized for code work; it manages repository context, code execution, testing, and error checking.
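
The control loop described above can be sketched in a few lines. This is a minimal illustration, not the implementation used by any particular agent: `run_agent`, the `FINAL:` convention, and the scripted `fake_model` are all hypothetical names chosen for the example, and the "model" is a stand-in function so the sketch runs without any API.

```python
from typing import Callable


def run_agent(goal: str,
              call_model: Callable[[list[str]], str],
              tools: dict[str, Callable[[str], str]],
              max_steps: int = 5) -> str:
    """Repeat model call -> tool use -> state update until the model finishes."""
    history = [f"GOAL: {goal}"]              # state carried across iterations
    for _ in range(max_steps):
        reply = call_model(history)          # model call
        if reply.startswith("FINAL:"):       # termination decision
            return reply[len("FINAL:"):].strip()
        name, _, arg = reply.partition(" ")  # assumed format: "<tool> <argument>"
        tool = tools.get(name)
        result = tool(arg) if tool else f"unknown tool: {name}"
        history.append(f"{reply} -> {result}")  # state update
    return "stopped: step limit reached"


def fake_model(history: list[str]) -> str:
    """Scripted stand-in for an LLM: run the tests once, then finish."""
    return "run_tests ." if len(history) == 1 else "FINAL: tests pass"


answer = run_agent("fix the tests", fake_model,
                   {"run_tests": lambda path: "2 passed"})
```

The step limit is one simple way to guarantee termination even when the model never emits a final answer.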

The Relationship Between LLMs, Reasoning Models, and Agents

The LLM can be thought of as an engine, the reasoning model as an enhanced engine, and the agent harness as the system that controls that engine.
LLMs and reasoning models can perform coding tasks on their own, but a real development environment requires complex context management: repo exploration, function search, test execution, and error analysis.
A coding harness helps get the most out of the model's capability and offers a far more powerful coding experience than a plain chat interface.

The Role of the Coding Harness

It is the software layer around the model, responsible for prompt assembly, tool exposure, file state tracking, command execution, permission management, and cache and memory storage.
Even with the same LLM, performance and user experience differ widely depending on the harness design.
For example, even an open-weight model such as GLM-5 could deliver comparable performance if integrated into a harness at the level of Codex or Claude Code.
OpenAI is also cited as an example: it maintains separate harness-specific post-trained models such as GPT-5.3 and GPT-5.3-Codex.

The 6 Main Components of a Coding Agent

1. Live Repository Context
- The agent needs to know the current Git repo state, branch, documentation, test commands, and so on
- An instruction like “fix the tests” means different things depending on the repo structure and context, so repo summary information is collected before work begins
- This avoids starting from a blank state every time and provides a base of stable facts
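
Collecting such a repo summary could look roughly like the sketch below. The function names and the fields in the returned dictionary are assumptions made for illustration; only standard `git` subcommands (`rev-parse --abbrev-ref HEAD`, `status --porcelain`) are used, and every failure falls back to a neutral value so the summary can still be built outside a Git repo.

```python
import subprocess
from pathlib import Path


def git(args: list[str], cwd: Path) -> str:
    """Run a git command and return stripped stdout, or "" on any failure."""
    try:
        out = subprocess.run(["git", *args], cwd=cwd, capture_output=True,
                             text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return ""  # git missing, directory missing, or command hung
    return out.stdout.strip() if out.returncode == 0 else ""


def repo_summary(root: Path) -> dict[str, object]:
    """Gather a few stable facts about the workspace before work begins."""
    status = git(["status", "--porcelain"], root)
    return {
        "branch": git(["rev-parse", "--abbrev-ref", "HEAD"], root) or "unknown",
        "dirty_files": len(status.splitlines()),       # one line per changed file
        "docs": sorted(p.name for p in root.glob("*.md")),
        "has_tests_dir": (root / "tests").is_dir(),
    }


if __name__ == "__main__":
    print(repo_summary(Path(".")))
```

A real harness would likely collect more, for example the project's test command, but the idea is the same: gather the facts once and reuse them.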
2. Prompt Shape and Cache Reuse
- The repo summary, tool descriptions, and general instructions rarely change, so they are cached as a stable prompt prefix
- Instead of reassembling the entire prompt on every request, only the parts that changed are updated
- This reduces wasted compute across repeated sessions and keeps responses consistent
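
The split between a stable prefix and a changing suffix can be illustrated as follows. This is a sketch under assumptions: `PromptBuilder`, its section markers, and the `rebuilds` counter are hypothetical, and the cache here lives only inside the harness process. Whether a model provider also reuses work for an identical prefix depends on that provider and is not shown.

```python
import hashlib


class PromptBuilder:
    """Build prompts as a cached stable prefix plus a per-request suffix."""

    def __init__(self) -> None:
        self._prefix_cache: dict[str, str] = {}
        self.rebuilds = 0  # counts how often the prefix was actually assembled

    def _stable_prefix(self, instructions: str, tools: str, repo: str) -> str:
        # Key the cache on the content, so any change forces a rebuild.
        raw = "\x00".join((instructions, tools, repo)).encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        if key not in self._prefix_cache:
            self.rebuilds += 1
            self._prefix_cache[key] = (
                f"{instructions}\n\n# Tools\n{tools}\n\n# Repo\n{repo}\n"
            )
        return self._prefix_cache[key]

    def build(self, instructions: str, tools: str, repo: str,
              recent_turns: list[str]) -> str:
        prefix = self._stable_prefix(instructions, tools, repo)
        # Only this suffix changes from request to request.
        return prefix + "\n# Recent\n" + "\n".join(recent_turns)


builder = PromptBuilder()
first = builder.build("Be careful.", "read_file, run_tests", "branch: main", ["user: hi"])
second = builder.build("Be careful.", "read_file, run_tests", "branch: main",
                       ["user: hi", "tool: 2 passed"])
```

Keeping the changing material at the end means both prompts share a byte-identical beginning.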
3. Tool Access and Use
- The model does not merely suggest commands; it can actually run them through a tool set defined by the harness
- Each tool has clear input and output formats and boundaries, and calls go through validation and approval before execution
- Example checks: “Is this a known tool?”, “Are the arguments valid?”, “Is the working path inside the workspace?”
- This improves security and reliability; the model loses some freedom, but practical usability goes up
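
The three example checks can be written as a small validation function that runs before any tool executes. The tool registry `TOOLS` and the rule that every tool takes a `path` argument are assumptions made for this sketch, not part of any real harness.

```python
from pathlib import Path

# Hypothetical registry: tool name -> set of argument names it accepts.
TOOLS = {
    "read_file": {"path"},
    "run_tests": {"path", "pattern"},
}


def validate_call(tool: str, args: dict[str, str], workspace: Path) -> list[str]:
    """Return a list of problems; an empty list means the call may proceed."""
    if tool not in TOOLS:                              # is this a known tool?
        return [f"unknown tool: {tool}"]
    errors = []
    unexpected = sorted(set(args) - TOOLS[tool])       # are the arguments valid?
    if unexpected:
        errors.append(f"unexpected arguments: {', '.join(unexpected)}")
    if "path" not in args:
        errors.append("missing required argument: path")
    else:
        # Resolve ".." and symlinks first, then check containment.
        target = (workspace / args["path"]).resolve()
        if not target.is_relative_to(workspace.resolve()):  # inside the workspace?
            errors.append("path escapes workspace")
    return errors


ws = Path("/tmp/ws")
ok = validate_call("read_file", {"path": "src/main.py"}, ws)
escaped = validate_call("read_file", {"path": "../secrets"}, ws)
```

An approval step (asking the user before running a command) would sit after this check and is omitted here. `Path.is_relative_to` requires Python 3.9 or later.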
4. Minimizing Context Bloat
- In long sessions, repeated file reads, logs, and tool outputs can push the prompt past its length limit
- The harness handles this with two strategies
  - clipping: shortening long text, logs, and notes to a fixed length
  - summarization: converting older conversation history into a compressed summary
- Recent events are kept in detail, while older information is deduplicated and compressed
- As a result, context quality can affect real-world performance even more than model quality
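
Both strategies fit in a few lines. In this sketch the "summary" is a deliberately crude stand-in (deduplicated, truncated older turns joined into one line); a real harness might ask the model itself to write the summary. The function names and default limits are assumptions.

```python
def clip(text: str, limit: int = 200) -> str:
    """Clipping: cut long text to a fixed length and say how much was dropped."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"... [clipped {len(text) - limit} chars]"


def compact_history(turns: list[str], keep_recent: int = 3,
                    limit: int = 200) -> list[str]:
    """Keep recent turns in detail; deduplicate and compress older ones."""
    cut = max(len(turns) - keep_recent, 0)
    old, recent = turns[:cut], turns[cut:]
    compacted = []
    if old:
        unique_old = list(dict.fromkeys(old))  # deduplicate, preserving order
        # Placeholder for summarization: one short line covering all old turns.
        compacted.append("SUMMARY: " + "; ".join(t[:40] for t in unique_old))
    compacted.extend(clip(t, limit) for t in recent)
    return compacted


turns = ["read a.py", "read a.py", "ran tests", "edit a.py", "x" * 500]
compacted = compact_history(turns, keep_recent=2, limit=10)
```

Here the two identical file reads collapse into one entry in the summary line, while the two most recent turns are kept, with the long one clipped.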
5. Structured Session Memory
- The agent keeps its state in two separate places: working memory and a full transcript
- The full transcript contains all requests, responses, and tool outputs, which makes it possible to resume a session
- Working memory stores what matters right now, such as the current task, key files, and recent notes, in summary form
- The compact transcript is used to reconstruct the model prompt, while working memory is used to maintain task continuity
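
One way to keep the two kinds of state apart is an append-only transcript file next to a small in-memory structure. The class names, the field names, and the choice of one JSON object per line on disk are assumptions made for this sketch.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class WorkingMemory:
    """Small summary of what matters right now."""
    current_task: str = ""
    key_files: list[str] = field(default_factory=list)
    recent_notes: list[str] = field(default_factory=list)


class Session:
    def __init__(self, path: Path) -> None:
        self.path = path                # full transcript lives on disk
        self.memory = WorkingMemory()   # working memory lives in the process

    def record(self, role: str, content: str) -> None:
        """Append one event to the full transcript; nothing is ever rewritten."""
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"role": role, "content": content}) + "\n")

    def note(self, text: str, max_notes: int = 3) -> None:
        """Keep only the most recent notes in working memory."""
        self.memory.recent_notes = (self.memory.recent_notes + [text])[-max_notes:]

    def transcript(self) -> list[dict]:
        """Reload every recorded event, e.g. when resuming a session."""
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
```

Opening a new `Session` on the same file and calling `transcript()` returns everything recorded earlier, which is what makes resuming possible; working memory in this sketch is not persisted and would need to be saved or rebuilt separately.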
6. Delegation With Bounded Subagents
- The main agent creates subagents to run auxiliary tasks in parallel
- Example: splitting off the location of a specific symbol's definition, the contents of a config file, or the cause of a test failure as separate subtasks
- A subagent inherits only the context it needs and works within constraints such as read-only access and recursion depth limits
- Both Claude Code and Codex support subagents and set limits based on task scope and context depth
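
The constraints on subagents can be made explicit in the data structure that describes them. The sketch below is illustrative only and does not reflect how Claude Code or Codex implement delegation: `SubagentSpec`, `MAX_DEPTH`, and the placeholder `run_subagent` are hypothetical, and a thread pool stands in for whatever parallel execution a real harness uses.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

MAX_DEPTH = 1  # assumed limit: subagents may not spawn further subagents


@dataclass(frozen=True)
class SubagentSpec:
    task: str
    context: tuple[str, ...]   # only the context this subtask needs
    depth: int
    read_only: bool = True     # subagents cannot modify the workspace


def spawn(parent_depth: int, task: str, context: tuple[str, ...]) -> SubagentSpec:
    """Create a bounded subagent, refusing to exceed the recursion limit."""
    if parent_depth >= MAX_DEPTH:
        raise RuntimeError("recursion depth limit reached")
    return SubagentSpec(task, context, parent_depth + 1)


def run_subagent(spec: SubagentSpec) -> str:
    # Placeholder for a real model-driven loop restricted to read-only tools.
    return f"[depth {spec.depth}, read_only={spec.read_only}] {spec.task}: done"


def delegate(tasks: dict[str, tuple[str, ...]]) -> list[str]:
    """Run independent subtasks in parallel from the main agent (depth 0)."""
    specs = [spawn(0, task, ctx) for task, ctx in tasks.items()]
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_subagent, specs))  # results keep input order


results = delegate({
    "find symbol definition": ("src/",),
    "read config": ("pyproject.toml",),
})
```

Making the spec immutable (`frozen=True`) means a subagent cannot loosen its own limits after it has been created.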

Summary of the Components

The six components are closely interdependent, and the quality of the harness design determines how efficiently the model is used.
A well-designed coding harness provides a far more context-aware and persistent development support environment than plain LLM chat.
Mini Coding Agent (https://github.com/rasbt/mini-coding-agent) is a minimal pure-Python example of this structure.

Comparison with OpenClaw

OpenClaw is closer to a general agent platform than to a coding-only helper.
Similarities:
- Both use prompt and instruction files inside the workspace (AGENTS.md, TOOLS.md, etc.)
- Both include features such as JSONL session files, conversation compression, and session management
- Both can create auxiliary sessions and subagents
Differences:
- Coding agents are optimized for repo exploration, code editing, and local tool execution
- OpenClaw focuses more on multi-channel, cross-workspace, long-running agent operations

Appendix: New Book Announcement

Build A Reasoning Model (From Scratch) has been fully written, and an Early Access edition is available now.
The publisher is working on the layout, aiming for publication in the summer.
The book centers on understanding the reasoning mechanisms of LLMs by implementing them yourself.
