civStation – a computer-use VLM-based agent that controls Civilization VI at the strategic level (including human in the loop)

ironman0722 · 2026-03-31T14:03:37+09:00


(github.com/NomaDamas)

7 points by ironman0722 2026-03-31 | 1 comment

A computer-use VLM harness for playing Civilization VI through natural-language commands.
You enter a high-level intent such as “expand to the east”, “focus on the economy”, or “science victory”, and the agent carries out the actual in-game operations.
A 3-layer structure (Strategy / Action / HITL) separates strategy from execution:
- Strategy Layer: converts natural language into structured goals, maintains the long-term strategy, and handles task decomposition
- Action Layer: screen-based (VLM) state recognition plus mouse/keyboard execution (no game API)
- HITL Layer: a controllable-autonomy structure that allows intervention, modification, and stopping during execution
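The Strategy Layer's job of turning a command into a structured goal and then into tasks can be pictured with a small sketch. This is illustrative only: `Goal`, `parse_intent`, `decompose`, the keyword table, and the task names are all hypothetical and not taken from civStation. A real Strategy Layer would presumably use a language model rather than keyword matching.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """Structured goal produced by the (hypothetical) Strategy Layer."""
    kind: str                                   # e.g. "expand", "economy", "victory"
    params: dict = field(default_factory=dict)  # optional details of the goal


# Hypothetical keyword table standing in for a model-based intent parser.
_INTENT_RULES = {
    "expand": lambda text: Goal(
        "expand", {"direction": "east" if "east" in text else "any"}
    ),
    "economy": lambda text: Goal("economy"),
    "science victory": lambda text: Goal("victory", {"type": "science"}),
}


def parse_intent(text: str) -> Goal:
    """Map a natural-language command to a structured goal."""
    lowered = text.lower()
    for keyword, build in _INTENT_RULES.items():
        if keyword in lowered:
            return build(lowered)
    # Nothing matched: keep the raw text so a human can clarify it.
    return Goal("unknown", {"raw": text})


def decompose(goal: Goal) -> list[str]:
    """Split a goal into bounded task names (assumed examples only)."""
    plans = {
        "expand": ["train_settler", "move_unit", "found_city"],
        "economy": ["manage_city_production", "assign_trade_route"],
        "victory": ["choose_research", "manage_city_production"],
    }
    return plans.get(goal.kind, [])
```

For example, `decompose(parse_intent("Expand to the east"))` yields a list of three task names that an Action Layer could then execute one at a time.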
A single strategy is split into multiple action sequences, with 2 to 16 model calls per task.
Execution is sub-agent based and runs in bounded task units such as city management and unit movement.
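A per-task bound on model calls suggests a loop that stops either when the task reports completion or when a call budget is exhausted. The following is a minimal sketch under that assumption. `run_bounded_task`, the `model` callable, and its `{"action": ..., "done": ...}` reply format are hypothetical, and the default budget of 16 simply mirrors the upper end of the range quoted above.

```python
from typing import Callable


def run_bounded_task(
    task: str,
    model: Callable[[str, int], dict],
    max_calls: int = 16,
) -> dict:
    """Run one bounded task, stopping when done or when the budget runs out.

    `model` is a stand-in for a VLM call: it receives the task name and the
    step index and returns a dict such as {"action": "click", "done": False}.
    """
    actions = []
    for step in range(max_calls):
        reply = model(task, step)            # one model call per iteration
        actions.append(reply.get("action"))
        if reply.get("done"):
            return {"task": task, "ok": True, "calls": step + 1, "actions": actions}
    # Budget exhausted without a "done" signal: report failure to the caller.
    return {"task": task, "ok": False, "calls": max_calls, "actions": actions}


def fake_model(task: str, step: int) -> dict:
    """Toy model for demonstration: reports completion on its third call."""
    return {"action": f"{task}-step{step}", "done": step == 2}
```

Capping the loop keeps a confused sub-agent from consuming calls indefinitely; the caller decides what to do with a task that returns `ok: False`.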
Rather than following existing RL/IL/script approaches, it experiments with an “intent → action” interface transition.
The approach is strategic delegation and agent orchestration, not direct manipulation.
Key technical issues:
- VLM perception errors
- execution drift
- difficulty verifying success
- rising latency and API cost in multi-step execution, and degraded quality of fallback strategies
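To see why latency and cost grow with multi-step execution, a back-of-the-envelope calculation helps. The sketch below assumes calls run sequentially; `estimate_overhead` is a hypothetical helper, and the inputs (10 tasks, 3 seconds and 1 cent per call) are placeholder values chosen for easy arithmetic, not measurements from the project. Only the 2 to 16 calls-per-task range comes from the post.

```python
def estimate_overhead(
    tasks: int, calls_per_task: int, latency_s: int, cents_per_call: int
) -> tuple[int, int]:
    """Return (total latency in seconds, total cost in cents) for sequential calls."""
    total_calls = tasks * calls_per_task
    return total_calls * latency_s, total_calls * cents_per_call


# Placeholder inputs: 10 tasks, 3 s per call, 1 cent per call.
best = estimate_overhead(10, 2, 3, 1)    # 2 calls per task  -> (60, 20)
worst = estimate_overhead(10, 16, 3, 1)  # 16 calls per task -> (480, 160)
```

Under these assumed numbers the same ten tasks range from one minute to eight minutes of model time, an 8x spread that comes purely from the calls-per-task range.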
It is not full automation: human-in-the-loop operation allows real-time strategy revision and control.
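Real-time intervention of this kind is often built by having the executor check a command channel between steps. The sketch below shows that general pattern using Python's standard `queue` module. `execute_with_hitl` and the `"stop"` / `"replace"` command names are hypothetical and are not civStation's actual interface.

```python
import queue


def execute_with_hitl(steps: list[str], commands: queue.Queue) -> list[str]:
    """Execute steps in order, applying operator commands between steps.

    Supported (hypothetical) commands, sent as (name, argument) tuples:
      ("stop", "")       -> abort the remaining steps
      ("replace", step)  -> swap the next pending step for `step`
    """
    pending = list(steps)
    done: list[str] = []
    while pending:
        # Drain operator commands before committing to the next step.
        while True:
            try:
                name, arg = commands.get_nowait()
            except queue.Empty:
                break
            if name == "stop":
                return done
            if name == "replace":
                pending[0] = arg
        # Stand-in for real mouse/keyboard execution of the step.
        done.append(pending.pop(0))
    return done
```

Because commands are only read between steps, a step that has already started always finishes; finer-grained interruption would need checks inside the step itself.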
It is an experimental system that tackles agent control and verification problems in a UI-only environment.
The focus is less on gameplay than on “raising the human-system interface to the strategy level”.

1 comment

bus710 2026-04-01

Just when you are racing flat out toward a domination/culture/science/diplomatic victory, a religious victory comes out of nowhere and hits you over the head.
