Building Effective AI Agents

(anthropic.com)

8 points by GN⁺ 2025-06-18 | 1 comment

According to Anthropic's field experience, successful LLM agents usually start from simple, composable patterns rather than complex frameworks
Agentic systems fall into two groups: workflows, which follow fixed code paths, and agents, in which the LLM dynamically decides its own process and tool usage
For many LLM applications, a single LLM call augmented with retrieval and in-context examples is enough; add complexity only when evaluation shows that it helps
Frameworks speed up getting started, but abstraction layers that hide prompts and responses can make debugging harder
Autonomous agents are strong on open-ended problems, but because costs rise and errors can compound, they need sandbox testing, guardrails, and clear tool design

Basic classification of agentic systems

The term "agentic systems" is used very broadly, covering everything from fully autonomous systems that run independently for long periods to implementations that follow predefined workflows
Anthropic treats all of these variations as agentic systems, but draws an architectural distinction between two kinds
- Workflow: LLMs and tools are orchestrated through predefined code paths
- Agent: the LLM dynamically directs its own process and tool usage, keeping control over how the task is accomplished

Criteria for deciding when to use agents

LLM applications should start with the simplest solution possible and add complexity only when needed
Agentic systems trade latency and cost for better task performance, so first check whether that trade-off is actually worthwhile
Even where complexity is warranted, the right choice depends on the task
- For well-defined tasks, workflows provide predictability and consistency
- For tasks that need flexibility and model-driven decision making at scale, agents are the better fit
For many applications, optimizing a single LLM call with retrieval and in-context examples is enough

Criteria for using frameworks

Claude Agent SDK, Strands Agents SDK by AWS, Rivet, and Vellum are mentioned as tools for implementing agentic systems
These frameworks speed up getting started by simplifying standard low-level tasks such as calling LLMs, defining and parsing tools, and chaining calls together
However, the extra abstraction layers can hide the underlying prompts and responses, which makes debugging harder
- They can also tempt developers to add unnecessary complexity where a simple setup would suffice
Developers are advised to start by using the LLM API directly
- Many patterns can be implemented in a few lines of code
- If you do use a framework, make sure you understand the code underneath
- Incorrect assumptions about what happens under the hood are a common source of customer errors
Sample implementations are available in the cookbook

Basic building block: the augmented LLM

The basic building block of agentic systems is the augmented LLM, an LLM enhanced with capabilities such as retrieval, tools, and memory
Current models can use these capabilities actively: they generate their own search queries, select appropriate tools, and decide what information to retain
Two things deserve attention during implementation
- Tailoring the capabilities to the specific use case
- Giving the LLM an easy-to-use, well-documented interface
The Model Context Protocol is mentioned as one way to implement these augmentations
- With a simple client implementation, developers can integrate with an ecosystem of third-party tools

Workflow patterns

Prompt chaining
- Prompt chaining decomposes a task into sequential steps, with each LLM call processing the output of the previous one
- Programmatic checks can be added at any intermediate step to confirm the process is still on track
- It fits tasks that break down cleanly into fixed subtasks
- The main trade-off is accepting more latency in exchange for higher accuracy, because each LLM call becomes an easier task
- Examples
  - Generating marketing copy, then translating it into another language
  - Writing a document outline, checking that it meets certain criteria, then writing the document from the outline
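The outline example can be sketched in a few lines. This is a minimal illustration, not the article's implementation: `LLM` is an assumed callable that takes a prompt and returns text, `chain_with_gate` is a hypothetical helper name, and `fake_llm` is a stand-in so the sketch runs without a real model.

```python
from typing import Callable

# Assumption: an "LLM" is any function that maps a prompt to a text reply.
LLM = Callable[[str], str]


def chain_with_gate(llm: LLM, topic: str, required_sections: int = 3) -> str:
    """Outline -> programmatic check -> full draft."""
    # Step 1: ask for an outline, one section per line.
    outline = llm(f"Write an outline for a document about {topic}, one section per line.")
    sections = [line.strip() for line in outline.splitlines() if line.strip()]

    # Gate: a plain-code check between the two calls, no LLM involved.
    if len(sections) < required_sections:
        raise ValueError(f"outline has {len(sections)} sections, need {required_sections}")

    # Step 2: the second call consumes the first call's output.
    return llm("Write the document following this outline:\n" + "\n".join(sections))


def fake_llm(prompt: str) -> str:
    """Stand-in model so the sketch runs offline."""
    if prompt.startswith("Write an outline"):
        return "Intro\nMethod\nResults"
    return "DRAFT based on: " + prompt.splitlines()[-1]


if __name__ == "__main__":
    print(chain_with_gate(fake_llm, "agents"))
```

Swapping `fake_llm` for a function that calls a real model API leaves the chaining logic unchanged; the gate is where a failed check stops the chain before the more expensive drafting call.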
Routing
- Routing classifies an input and directs it to a specialized follow-up task
- This separates concerns and allows more specialized prompts
- Without this structure, optimizing for one kind of input can hurt performance on other inputs
- It works well when distinct categories are better handled separately and an LLM or a traditional classification model/algorithm can classify them accurately
- Examples
  - Directing customer service queries such as general questions, refund requests, and technical support to different processes, prompts, and tools
  - Routing easy or common questions to a smaller, cost-efficient model such as Claude Haiku 4.5, and hard or unusual questions to a more capable model such as Claude Sonnet 4.5
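A rough sketch of the customer service example follows. The route labels, the prompts in `ROUTES`, and the function names are all assumptions made for illustration; the classifier is passed in as a callable, so it could be an LLM or a traditional model.

```python
from typing import Callable, Dict

# Assumption: an "LLM" is any function that maps a prompt to a text reply.
LLM = Callable[[str], str]

# Hypothetical specialized system prompts, one per category.
ROUTES: Dict[str, str] = {
    "refund": "You handle refund requests. Ask for the order number.",
    "technical": "You are a technical support specialist.",
    "general": "You answer general questions briefly.",
}


def route(classifier: LLM, query: str) -> str:
    """Return the route label for a query, falling back to 'general'."""
    label = classifier(
        "Classify this query as one of: " + ", ".join(ROUTES) + ".\nQuery: " + query
    ).strip().lower()
    # Never trust the label blindly: unknown labels go to the default route.
    return label if label in ROUTES else "general"


def handle(classifier: LLM, worker: LLM, query: str) -> str:
    """Classify first, then answer with the prompt specialized for that class."""
    label = route(classifier, query)
    return worker(ROUTES[label] + "\n\nCustomer: " + query)


def fake_classifier(prompt: str) -> str:
    """Keyword-based stand-in for a classification model."""
    query = prompt.split("Query: ", 1)[1].lower()
    if "refund" in query:
        return "Refund"
    if "error" in query:
        return "technical"
    return "something else"


if __name__ == "__main__":
    print(route(fake_classifier, "I want a refund for my order"))
```

The same shape covers model routing: replace the prompt table with a table of model names and let `handle` pick the worker instead of the prompt.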
Parallelization
- In parallelization, multiple LLM calls work on a task simultaneously and their outputs are aggregated programmatically
- It has two main variants
  - Sectioning: breaking a task into independent subtasks that run in parallel
  - Voting: running the same task multiple times to get diverse outputs
- It is effective when subtasks can be split up for speed, or when multiple perspectives or attempts are needed for higher-confidence results
- For complex tasks, handling each consideration in a separate LLM call lets each call focus on one specific aspect
- Examples
  - A guardrail in which one model instance processes the user query while another screens it for inappropriate content or requests
  - LLM performance evaluation in which each call evaluates a different aspect of the model's performance
  - Several different prompts reviewing code for vulnerabilities and flagging any problem they find
  - Evaluating whether content is inappropriate, using multiple prompts and a vote threshold to balance false positives and false negatives
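Both variants can be sketched with the standard library's thread pool. The names `run_parallel` and `vote_flagged`, the reviewer prompts, and the YES/NO reply convention are assumptions for illustration; threads are used here only because LLM calls are typically I/O-bound.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Assumption: an "LLM" is any function that maps a prompt to a text reply.
LLM = Callable[[str], str]


def run_parallel(llm: LLM, prompts: List[str]) -> List[str]:
    """Sectioning: run independent prompts concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        return list(pool.map(llm, prompts))


def vote_flagged(llm: LLM, content: str, reviewer_prompts: List[str], threshold: int) -> bool:
    """Voting: flag content when at least `threshold` reviewers answer YES."""
    answers = run_parallel(llm, [p + "\n\n" + content for p in reviewer_prompts])
    yes_votes = sum(1 for a in answers if a.strip().upper().startswith("YES"))
    return yes_votes >= threshold


def fake_reviewer(prompt: str) -> str:
    """Stand-in model: only the violence reviewer objects to the sample text."""
    return "YES" if prompt.startswith("Check for violence") else "NO"


if __name__ == "__main__":
    reviewers = ["Check for violence.", "Check for spam.", "Check for personal data."]
    print(vote_flagged(fake_reviewer, "sample text", reviewers, threshold=1))
```

Raising `threshold` trades false positives for false negatives: with one dissenting reviewer out of three, a threshold of 1 flags the content and a threshold of 2 does not.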
Orchestrator-worker
- In the orchestrator-worker pattern, a central LLM dynamically breaks down the task, delegates the pieces to worker LLMs, and synthesizes their results
- It suits complex tasks where the required subtasks cannot be predicted in advance
- It can look like parallelization, but the key difference is flexibility
  - In parallelization, the subtasks are predefined
  - In orchestrator-worker, the orchestrator determines the subtasks based on the input
- Examples
  - Coding products that make complex changes to multiple files each time
  - Search tasks that gather and analyze possibly relevant information from multiple sources
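The difference from parallelization shows up in code as a plan that is produced at run time. The sketch below assumes the orchestrator replies with a JSON array of subtask strings; that reply format, the prompts, and the `orchestrate` helper are illustrative choices, not something the article prescribes.

```python
import json
from typing import Callable

# Assumption: an "LLM" is any function that maps a prompt to a text reply.
LLM = Callable[[str], str]


def orchestrate(orchestrator: LLM, worker: LLM, task: str, max_subtasks: int = 5) -> str:
    """Plan subtasks dynamically, delegate each to a worker, then synthesize."""
    plan = orchestrator(
        "Break this task into subtasks. Reply with a JSON array of strings.\nTask: " + task
    )
    try:
        subtasks = json.loads(plan)
    except json.JSONDecodeError:
        subtasks = None
    # Fall back to a single worker call if the plan is unusable.
    if not isinstance(subtasks, list) or not subtasks:
        subtasks = [task]
    # Cap the fan-out so a bad plan cannot trigger unbounded work.
    subtasks = [str(s) for s in subtasks][:max_subtasks]

    results = [worker(s) for s in subtasks]

    return orchestrator(
        "Combine these results into one answer for the task: " + task + "\n" + "\n".join(results)
    )


def fake_orchestrator(prompt: str) -> str:
    """Stand-in model that plans two file edits and joins worker results."""
    if prompt.startswith("Break this task"):
        return '["edit a.py", "edit b.py"]'
    return "SUMMARY: " + " | ".join(prompt.splitlines()[1:])


def fake_worker(subtask: str) -> str:
    return "done " + subtask


if __name__ == "__main__":
    print(orchestrate(fake_orchestrator, fake_worker, "refactor"))
```

The workers run sequentially here to keep the sketch short; since the subtasks are independent once planned, they could equally be dispatched in parallel.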
Evaluator-optimizer
- Evaluator-optimizer is a loop in which one LLM call generates a response while another provides evaluation and feedback
- It is especially effective when there are clear evaluation criteria and iterative refinement provides measurable value
- Two signs indicate a good fit
  - LLM responses demonstrably improve when a human articulates feedback
  - The LLM can provide that kind of feedback itself
- It resembles the iterative writing process a human writer goes through to produce a polished document
- Examples
  - Literary translation, where an evaluator LLM critiques nuances the translator LLM may have missed at first
  - Complex search tasks, where the evaluator decides whether further searches are warranted
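The loop itself is short. In this sketch the evaluator is assumed to reply with the word PASS when the draft is acceptable and with free-text feedback otherwise; that convention, the prompts, and the `refine` helper are assumptions for illustration.

```python
from typing import Callable

# Assumption: an "LLM" is any function that maps a prompt to a text reply.
LLM = Callable[[str], str]


def refine(generator: LLM, evaluator: LLM, task: str, max_rounds: int = 3) -> str:
    """Generate a draft, then revise it until the evaluator passes it."""
    draft = generator(task)
    for _ in range(max_rounds):  # bounded, so the loop always terminates
        feedback = evaluator(f"Task: {task}\nDraft: {draft}\nReply PASS or give feedback.")
        if feedback.strip().upper().startswith("PASS"):
            break
        # Feed the critique back into the next generation call.
        draft = generator(f"{task}\nPrevious draft: {draft}\nFeedback: {feedback}")
    return draft


def fake_generator(prompt: str) -> str:
    """Stand-in model: improves only after it has received feedback."""
    return "formal greeting" if "Feedback:" in prompt else "hey"


def fake_evaluator(prompt: str) -> str:
    """Stand-in evaluator: accepts only the formal version."""
    return "PASS" if "Draft: formal" in prompt else "Too casual."


if __name__ == "__main__":
    print(refine(fake_generator, fake_evaluator, "Translate a greeting"))
```

The `max_rounds` cap matters in practice: if the evaluator never passes a draft, the loop returns the latest revision instead of running indefinitely.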

Autonomous agents

Agents are emerging in production as LLMs mature in key capabilities: understanding complex inputs, reasoning and planning, using tools reliably, and recovering from errors
A task begins with a command from, or a conversation with, the human user
- Once the task is clear, the agent plans and operates independently
- It can return to the human when it needs more information or a judgment call
During execution, the agent needs ground truth from the environment at every step
- For example: tool call results, code execution results
- This is how it assesses its progress
Agents can pause for human feedback at checkpoints or when they get stuck
Tasks usually end on completion, but stopping conditions such as a maximum number of iterations are also common to keep control
The implementation is often simple
- An agent is typically just an LLM using tools in a loop, based on environmental feedback
- That is why the toolset and its documentation must be designed clearly and carefully
When to use
- Open-ended problems where the number of required steps is hard or impossible to predict
- Tasks where a fixed path cannot be hardcoded
- Situations where the LLM may operate for many turns, which requires some level of trust in its decision making
Constraints
- Autonomy brings higher costs and the potential for compounding errors
- Extensive testing in a sandboxed environment and appropriate guardrails are recommended
Examples
- A coding agent that resolves SWE-bench tasks, which involve edits to many files
- The "computer use" reference implementation, in which Claude uses a computer to accomplish tasks
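"An LLM using tools in a loop" with a maximum-iterations stop can be sketched as follows. The JSON action format (`{"tool": ..., "args": ...}` or `{"final": ...}`), the transcript layout, and the `run_agent` helper are assumptions chosen to keep the example self-contained; real model APIs return tool calls in their own structured format.

```python
import json
from typing import Callable, Dict, List

# Assumption: an "LLM" is any function that maps a prompt to a text reply.
LLM = Callable[[str], str]
Tool = Callable[..., str]


def run_agent(llm: LLM, tools: Dict[str, Tool], goal: str, max_steps: int = 5) -> str:
    """Loop: ask the model for an action, run it, feed the result back."""
    transcript: List[str] = ["Goal: " + goal]
    for _ in range(max_steps):  # stopping condition keeps the loop under control
        reply = llm("\n".join(transcript))
        try:
            action = json.loads(reply)
        except json.JSONDecodeError:
            action = None
        if not isinstance(action, dict):
            transcript.append("Observation: reply must be a JSON object")
            continue
        if "final" in action:
            return str(action["final"])

        name = action.get("tool")
        tool = tools.get(name) if isinstance(name, str) else None
        if tool is None:
            observation = "unknown tool"
        else:
            try:
                observation = tool(**action.get("args", {}))
            except Exception as exc:  # report tool failures back to the model
                observation = f"tool error: {exc}"
        # Environment feedback becomes part of the next prompt.
        transcript.append("Action: " + reply)
        transcript.append("Observation: " + observation)
    return "stopped: step limit reached"


def add(a: int, b: int) -> str:
    return str(a + b)


def fake_llm(prompt: str) -> str:
    """Stand-in model: calls the tool once, then answers from the observation."""
    if "Observation: 4" in prompt:
        return '{"final": "2 + 2 = 4"}'
    return '{"tool": "add", "args": {"a": 2, "b": 2}}'


if __name__ == "__main__":
    print(run_agent(fake_llm, {"add": add}, "What is 2 + 2?"))
```

Errors are returned to the model as observations instead of crashing the loop, which is one simple way to give the agent a chance to recover; the step limit bounds cost when it does not.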

Combining and customizing patterns

These building blocks are not a fixed prescription but common patterns that developers can adapt and combine for their own use cases
The key to success, as with any LLM feature, is measuring performance and iterating on the implementation
Add complexity only when it demonstrably improves outcomes

Implementation principles

Success in the LLM space is not about building the most sophisticated system, but about building the right system for your needs
The recommended order is
- Start with simple prompts
- Optimize them with comprehensive evaluation
- Add multi-step agentic systems only when simpler solutions fall short
Three principles matter when implementing agents
- Keep the design simple
- Prioritize transparency by explicitly showing the agent's planning steps
- Carefully craft the agent-computer interface (ACI) through thorough tool documentation and testing
Frameworks help with a quick start, but on the way to production it may be necessary to reduce abstraction layers and build with basic components

Practical application areas

Customer support
- Customer support combines the familiar chatbot interface with capabilities extended through tool integration
- Several factors make it a natural fit for more open-ended agents
  - Support interactions follow a conversational flow while requiring access to external information and actions
  - Tools can be integrated to pull customer data, order history, and knowledge base articles
  - Actions such as issuing refunds or updating tickets can be handled programmatically
  - Success can be measured clearly through user-defined resolutions
- Several companies have demonstrated the viability of this approach through usage-based pricing models that charge only for successful resolutions
Coding agents
- Software development has shown great potential as LLM capabilities have evolved from code completion to autonomous problem solving
- Agents are effective here because
  - Code solutions can be verified with automated tests
  - The agent can use test results as feedback to iterate on a solution
  - The problem space is well-defined and structured
  - Output quality can be measured objectively
- In Anthropic's implementation, the agent can resolve real GitHub issues from the SWE-bench Verified benchmark based on the pull request description alone
- Although automated tests help verify functionality, human review remains crucial for confirming that solutions align with broader system requirements

Tool prompt engineering

Whatever agentic system you build, tools will likely be an important part of it
Tools let Claude interact with external services and APIs
- Their exact structure and definition are specified in the API
- When Claude plans to invoke a tool, the API response includes a tool use block
Tool definitions and specifications deserve as much prompt engineering attention as the overall prompt
Tool format selection
- The same action can often be specified in several ways
  - A file edit can be expressed as a diff or as a rewrite of the entire file
  - Structured output can return code inside Markdown or inside JSON
- From a software engineering perspective these formats can be converted into one another without loss, but some are much harder for an LLM to write than others
  - Writing a diff requires knowing how many lines will change in the chunk header before the new code is written
  - Writing code inside JSON requires extra escaping of newlines and quotes
- When choosing a tool format, avoid placing unnecessary format burdens on the model
  - Give the model enough tokens to think before it writes itself into a corner
  - Keep the format close to what the model has seen naturally occurring in text on the internet
  - Remove formatting overhead such as keeping an exact count of thousands of lines of code or string-escaping the code it writes
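The escaping burden is easy to see by serializing the same snippet both ways. This is only an illustration of the point above; `escape_overhead` is a hypothetical helper, and the two-line `code` sample is made up.

```python
import json

# Built programmatically so this example can itself sit inside a code fence.
FENCE = "`" * 3

# A made-up two-line snippet the model is asked to return.
code = 'def greet(name):\n    print("Hello, " + name)\n'


def escape_overhead(source: str) -> int:
    """Extra characters that JSON string escaping adds to `source`."""
    return len(json.dumps(source)) - len(source) - 2  # 2 = surrounding quotes


# Option A: code inside a JSON string; newlines and quotes must be escaped.
as_json = json.dumps({"code": code})

# Option B: code inside a Markdown fence; the code appears verbatim.
as_markdown = f"{FENCE}python\n{code}{FENCE}"

if __name__ == "__main__":
    print(as_json)
    print(as_markdown)
    print("escaped characters:", escape_overhead(code))
```

For this sample, two newlines and two double quotes each need a backslash, so the JSON form carries 4 extra characters; the count grows with every line and every quoted string, and the model has to get each escape right while it writes.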
ACI design
- Put as much effort into designing the agent-computer interface (ACI) as goes into human-computer interfaces (HCI)
- Good tool definitions often include example usage, edge cases, input format requirements, and clear boundaries from other tools
- Adjust parameter names and descriptions so they are easy for the model to understand
  - Think of it as writing a great docstring for a junior developer on the team
  - This matters most when there are many similar tools
- Test how the model uses the tools
  - Run many example inputs in the workbench to see what mistakes the model makes, then iterate
  - Poka-yoke the tools: change the arguments so that mistakes are harder to make
- While building an agent for SWE-bench, more time went into optimizing the tools than the overall prompt
  - The model made mistakes with tools that used relative file paths after the agent had moved out of the root directory
  - Once the tool was changed to always require absolute file paths, the model used it flawlessly
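The absolute-path fix can be sketched as a tool that rejects relative paths outright, paired with a description that states the requirement. The names `read_file_tool` and `READ_FILE_SPEC`, the example path, and the wording of the description are hypothetical; the dictionary follows the general name/description/JSON Schema shape used for tool definitions, and is shown only as an illustration.

```python
import os

# Hypothetical tool definition: the description states the constraint and
# gives an example, in the spirit of a docstring for a junior developer.
READ_FILE_SPEC = {
    "name": "read_file",
    "description": (
        "Read a UTF-8 text file and return its contents. "
        "The path MUST be absolute (for example /repo/src/main.py). "
        "Relative paths are rejected because the working directory may change."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Absolute path to the file."}
        },
        "required": ["path"],
    },
}


def read_file_tool(path: str) -> str:
    """Poka-yoke: refuse relative paths instead of guessing the base directory."""
    if not os.path.isabs(path):
        raise ValueError(f"path must be absolute, got {path!r}")
    with open(path, encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        read_file_tool("src/main.py")
    except ValueError as exc:
        print("rejected:", exc)
```

Rejecting the bad input with a clear message gives the model something concrete to correct on its next turn, whereas silently resolving a relative path against the wrong directory fails in ways that are much harder to diagnose.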

Hacker News opinions

I think this is still one of the better articles on the topic. I especially liked that it clearly defines up front what they mean by an AI agent.
Here it is defined as "a system where the LLM dynamically directs its own processes and tool usage, maintaining control over how it accomplishes tasks".
I also liked the distinction between "agents" and "workflows", and the explanation of several useful workflow patterns.
I wrote up notes on this article when it first came out: https://simonwillison.net/2024/Dec/20/building-effective-age...
Anthropic has another recent article, "How we built our multi-agent research system" (https://www.anthropic.com/engineering/built-multi-agent-rese...), which was also very interesting, so I compiled notes on it too: https://simonwillison.net/2025/Jun/14/multi-agent-research-s...
- One of the authors of Building Effective Agents came to AIE and gave a well-received talk based on this article: https://www.youtube.com/watch?v=D7_ipDqhtwk
- The multi-agent research system article is great. However, I disagree with the advice in the Building Effective AI Agents article to build the initial system without a framework.
  It sounds fine for learning purposes, but the first benefit of a good framework is that it makes it easy to test LLMs from different providers.
- I think the article's definition of a workflow is inaccurate. Modern workflow engines do not run only on predetermined code paths, and in those cases they are effectively the same as agents.
  It reads like an attempt to redefine "workflow" to draw a distinction, but most agents are just iterative workflows that dynamically call something based on the LLM response. Modern workflow engines are very dynamic.
- Does anyone know which AI agent framework Anthropic uses? It doesn't look like they have made their own framework public.
"They make it easy to get started by simplifying standard low-level tasks such as LLM calls, tool definition and parsing, and call chaining, but they often create an extra layer of abstraction that hides the underlying prompts and responses and makes debugging harder. They also make it tempting to add complexity when a simpler setup would suffice. Developers are advised to start by using the LLM API directly." I think this is the best advice in the whole article.
Really, it makes no sense to use a huge framework for what amounts to sending an array of strings to a web service.
We removed LangChain and LangGraph from our company project too; they added no real value and only increased complexity. We had to manage the framework's boilerplate, so we ended up writing more code than we would have without it.
- langflow probably falls into the same category. Still, it is useful for organizing several flows in a common format.
  You can run every step of Stable Diffusion image generation yourself, or write shader code by hand, but once you have more than one flow or task and are experimenting, using comfy-UI or a shader graph keeps things far more organized.
Half a year has passed, which feels like a long time in AI. A few months ago I read this article over and over, but now agent development seems to have clearly hit a bottleneck.
Even the latest Gemini seems to be regressing.
- Running many agents gets expensive and the return on investment drops. A DeepSearch agent for stocks uses 6 agents and costs about 2 dollars per query.
  Multi-agent orchestration is hard to control, and the better the models perform, the less multi-agent setups are needed. Conversely, the weaker the models, the more a narrow-scope AI makes business sense.
- What exactly is causing the regression? I'd like to know why it can't fork itself into a swarm and work in parallel around the clock, improving continuously while verifying its results.
- Prompt injection is proving hard to solve, and that is one of the bottlenecks.
Are there examples of agents in production that save a company money and do genuinely valuable work? I mean cases that are not like writing text to fill the empty space in a bag of chips.
- I liked ChatIPT. It solves real problems with biodiversity data. They don't use the word "agentic", but it clearly writes and runs Python code.
  https://www.gbif.org/news/6aw2VFiEHYlqb48w86uKSf/chatipt-sys...
  It is currently in beta.
  According to the press release, Rukaya Johaadien's chatbot gives interactive support to students and researchers who hold biodiversity data but are new to data publishing or do it only rarely. It cleans and standardizes spreadsheets, creates basic metadata, and guides users through publishing a well-structured dataset to GBIF.org as a Darwin Core Archive.
  Until now it has been hard to publish, at scale, high-quality data from PhD or master's research or from small biodiversity studies. The reason is that data standardization usually requires knowledge of programming languages, data management techniques, and specialized software.
  Getting access to the Integrated Publishing Toolkit (IPT), the core data-sharing application of the GBIF network, is also hard for beginners. Node managers reportedly have limited time and resources, and occasional users forget the right process and details from one year to the next, so training alone struggles to overcome the logistical and language barriers.
  They explained: "Data standardization is hard, and biologists did not become biologists because they love coding or Excel, so a lot of potentially valuable data goes to waste. Given how much better large language models have become at code generation and data tasks, we built a tool that guides non-technical users through everyday questions, processes messy data as far as possible, and then publishes it to GBIF quickly and automatically."
- At louie.ai, agents and agentic reasoning are used to automate users' day-to-day investigation work.
  For each incoming alert or ticket, an agent runs a preliminary investigation across the relevant APIs, databases, and so on, identifies false positives, and adds context to the real issues. This cuts human time and speeds up processing.
  The same agentic reasoning is used for exploration tasks, going beyond simple text-to-SQL: the LLM investigates systems such as Splunk and Databricks for 2 to 10 minutes.
  Internally they have tools such as a semantic layer on top of the databases and large-scale log/text/dataframe analyzers.
I tried an n8n workflow I built myself with almost the same setup as in this article. Getting the answer to one simple question cost 3 dollars and took at least 3 minutes.
For now I'll stick with regular search.
This article is a good reminder to start with the simplest thing that works and to add complexity only when it is really needed.
A few clearly defined LLM calls and some light glue logic usually give you a system that is more stable, easier to debug, and much cheaper to run. Flashy, feature-rich agents often create more problems than they solve.
As someone who works at a company running real agents, not workflows, in production, I strongly disagree with the first line here, "use an agent framework like LangGraph".
We did exactly that and had to throw everything away within a month; then we rebuilt from scratch, and the system now scales quite well.
To be fair, there may be a place for agent frameworks. But the agent field is still at such an early stage that it is hard to say a good-enough framework can emerge yet.
I also hold a somewhat contrarian view: the agent field is changing so fast that a good-enough framework may never arrive.
- That actually sounds like agreement with the article. The original also says that, after working over the past year with LLM agent teams across many industries, the most successful implementations were built not with complex frameworks or specialized libraries but with simple, composable patterns.
  Frameworks make it easier to start, but extra abstraction layers can hide prompts and responses, making debugging harder, and can add complexity even where a simpler setup would do. Since many patterns can be implemented in a few lines of code, the advice is to start by using the LLM API directly.
- I'm currently moving from a prototype built with N8N's agent tools to a real, self-hostable system.
  I've seen plenty of comments saying that pragmatic teams have mostly dropped things like LangChain, LangGraph, Haystack, and Crew in favor of simpler in-house code, but I still don't have a clear picture of how things like tool calling are actually implemented in practice.
  If you have links or docs that you used as a basis for your work, could you share them?
- What does that agent do?
This article is from December 2024, yet it strangely feels very old.
- Personally, though, I think it still holds up quite well today. I keep using this article as a reference and it does not feel outdated.
  It made me look at Anthropic again as a "practical partner" in AI tool development.
- "No, now I'll have to use my brain again and write 100% of the code myself, like a caveman from December 2024"
  https://news.ycombinator.com/item?id=44260988
It feels like the agent hype has calmed down a bit now.
The "use simple, composable patterns" message is oddly reassuring.
It's nice that the "do one thing well" maxim is still valid decades later. Composability is the best.
