macOS 26.2 makes it possible to build fast AI clusters using RDMA over Thunderbolt

(developer.apple.com)

GN⁺ 2025-12-13

macOS Tahoe 26.2 adds a new Thunderbolt 5-based RDMA feature, enabling low-latency communication for workloads such as distributed AI inference with MLX
This means "a Mac can be treated as a high-speed distributed computing node", and macOS can grow beyond being only a desktop OS into a local AI and HPC experimentation platform

What is RDMA

RDMA (Remote Direct Memory Access) is a communication method in which one computer reads or writes another computer's memory directly, without involving the remote CPU in the data path
By bypassing the kernel network stack, intermediate kernel copies, and context switches, it sharply reduces latency and substantially raises throughput
It has mainly been used in datacenter networks such as InfiniBand and RoCE
It has become a standard technology in high-performance computing (HPC), distributed storage, and large-scale AI training and inference
Its key point is that "even though the communication goes over a network, it behaves as if a single memory were being used"
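
To make the latency argument above concrete, here is a minimal back-of-the-envelope sketch in Python. It models a one-way transfer as a fixed per-message software overhead plus wire time. The function name `transfer_time_us`, both overhead values, and the message size are illustrative assumptions, not measurements; 80 Gb/s is only the nominal Thunderbolt 5 link rate, not a measured RDMA throughput.

```python
def transfer_time_us(message_bytes, bandwidth_gbps, per_message_overhead_us):
    """Estimate one-way transfer time in microseconds.

    Model (an assumption, not a measurement):
        time = fixed per-message overhead + bits / link rate
    """
    bits = message_bytes * 8
    # 1 Gb/s = 1e9 bits/s = 1e3 bits per microsecond
    wire_time_us = bits / (bandwidth_gbps * 1e3)
    return per_message_overhead_us + wire_time_us


if __name__ == "__main__":
    message = 64 * 1024      # hypothetical 64 KiB tensor shard
    link_gbps = 80           # nominal Thunderbolt 5 link rate

    # Hypothetical overheads: a kernel network stack path vs. an RDMA-like path.
    stack_path = transfer_time_us(message, link_gbps, per_message_overhead_us=50)
    rdma_like = transfer_time_us(message, link_gbps, per_message_overhead_us=5)

    print(f"stack path : {stack_path:.2f} us")
    print(f"RDMA-like  : {rdma_like:.2f} us")
```

In this model the wire time is identical on both paths, so for small messages the fixed overhead dominates. That is why removing software from the data path can matter more than raw bandwidth.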

macOS 26.2 supports RDMA communication between Macs connected over Thunderbolt 5
RDMA used to be limited to server-grade network equipment, but it is now also possible in a local Mac cluster connected by nothing more than a cable
Thunderbolt's high bandwidth and very low latency can be used directly within the RDMA model
In other words, it opens "a way to connect several Macs on a desk the way a datacenter does"

In distributed AI inference or training, tensor exchange between nodes often becomes the bottleneck
RDMA provides a GPU ↔ GPU-like communication pattern for this exchange without using the CPU
The MLX-based distributed AI inference mentioned in the release notes is designed around this kind of low-latency, high-bandwidth communication
Splitting a model across several Macs and building an inference cluster that behaves like a single machine has become more feasible
For small teams or research environments, "building an AI cluster from Macs without servers" can now be a practical option
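
As a rough illustration of what "splitting a model across several Macs" can involve, the sketch below assigns contiguous layer ranges to nodes (pipeline-style partitioning) and estimates how many bytes of activations cross each node boundary per token. This is a sketch under stated assumptions, not how MLX itself partitions models: `split_layers` and `activation_bytes_per_token` are hypothetical helpers, and the layer count, hidden size, and 2-byte (fp16) element size are made-up example values.

```python
def split_layers(num_layers, num_nodes):
    """Assign contiguous layer ranges to nodes as evenly as possible."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    base, extra = divmod(num_layers, num_nodes)
    ranges = []
    start = 0
    for node in range(num_nodes):
        # The first `extra` nodes take one additional layer each.
        count = base + (1 if node < extra else 0)
        ranges.append(range(start, start + count))
        start += count
    return ranges


def activation_bytes_per_token(hidden_size, bytes_per_element=2):
    """Bytes sent across one node boundary for a single token's hidden state."""
    return hidden_size * bytes_per_element


if __name__ == "__main__":
    # Hypothetical model: 80 layers, hidden size 8192, fp16 activations, 4 Macs.
    for node, layers in enumerate(split_layers(80, 4)):
        print(f"node {node}: layers {layers.start}..{layers.stop - 1}")
    print("bytes per token per boundary:", activation_bytes_per_token(8192))
```

With this kind of split, each generated token crosses `num_nodes - 1` boundaries, so per-hop latency adds up directly in the time per token. That is the part of the pipeline a low-latency interconnect is meant to help.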

Building a local AI inference farm by connecting several Mac Studio / Mac Pro machines over Thunderbolt
Experimenting with model-partitioned inference when a large model is hard to run on a single GPU
Local distributed simulation, high-speed data pipelines, and experimental distributed systems research
A large reduction in the cost of building a prototype or PoC environment before moving to a datacenter
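
When planning a setup like the ones listed above, a first sanity check is how many machines are needed just to hold the model weights. The sketch below is a minimal estimate under stated assumptions: `min_nodes` is a hypothetical helper, the sizes are example values, and it ignores activations, KV cache, and OS memory use.

```python
def min_nodes(model_bytes, usable_bytes_per_node):
    """Smallest number of nodes whose combined usable memory holds the weights."""
    if usable_bytes_per_node <= 0:
        raise ValueError("usable_bytes_per_node must be positive")
    # Integer ceiling division; always at least one node.
    return max(1, -(-model_bytes // usable_bytes_per_node))


if __name__ == "__main__":
    GB = 10**9
    # Hypothetical: 400 GB of weights, 192 GB usable per Mac.
    print(min_nodes(400 * GB, 192 * GB))
```

A result above 1 means the model cannot fit on one machine under these assumptions, so it has to be partitioned across nodes.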