Private secure links: is public access really impossible?

(vin01.github.io)

2 points by GN⁺ 2024-03-08 | 1 comment

URL and malware analysis services such as urlscan.io, Hybrid Analysis, and Cloudflare Radar URL Scanner store links for threat intelligence sharing, but user error or misconfigured scanners can leave sensitive links behind as public data
Services where the URL itself acts as the access permission are particularly affected: Dropbox, iCloud, AWS S3, Zoom, OneDrive, Airtable, password reset links, and OAuth login links
Not every link exposes data immediately, but tax documents, invoices, photos, work communications, secrets shared through onetimesecret, smart home recordings, and meeting recordings were actually found
urlscan Pro shows paid customers Unlisted scans as well as Public ones, and settings such as the configuration of TheHive's Cortex-Analyzers can cause links to be submitted as unlisted unintentionally
In the previous 24 hours urlscan.io recorded 398,563 Public, 328,147 Unlisted, and 955,432 Private scans, and in a canary token experiment the links were accessed within 1 hour of submission, so scan visibility needs to be managed

Sensitive links left behind in URL analysis services

urlscan.io, Hybrid Analysis, and Cloudflare Radar URL Scanner store large numbers of links for URL and malware analysis
The problem is that this stored data can also include private and sensitive links
- a user submits a sensitive link for scanning without realizing that it may become public information
- a misconfigured scanner or extension submits private links found while scanning email as public data

Which links and data can be exposed

The links found include shared URLs from many services as well as authentication-related URLs
- cloud storage file sharing such as Dropbox, iCloud, Sync, Egnyte, Ionos Hidrive, and AWS S3
- cloud-connected NAS such as Western Digital Mycloud
- enterprise communication tools such as Slido, Zoom, OneDrive, and Airtable
- password reset links and OAuth login links
Such services often grant access through a single private link that contains a random identifier as its only protection
Some links are additionally protected by a password or passphrase, so merely opening the link does not immediately expose the data
The sensitive content actually found included
- private files such as tax documents, invoices, photos, and work communications
- secrets shared through onetimesecret
- smart home device recordings
- meeting recordings stored in the cloud

Submission paths and responsibility gaps

Many of the submissions checked on urlscan.io carried the falconsandbox tag, which is what extended the scope of the analysis to Hybrid Analysis
Cloudflare Radar may be used even more widely, and it already includes some private links as public data
Hybrid Analysis's terms state that user-submitted content may be analyzed, published, and shared, and that the service is not responsible for information incidentally included in submissions or auto-generated reports
urlscan.io's terms likewise accept no responsibility for user content or actions, and state that the user is responsible for content posted and activity carried out under their account
There does not appear to be a clear mechanism for reviewing existing content and flagging or removing sensitive links, and automating this may not be easy either
Positive Security's analysis used canary tokens on urlscan.io to detect automated sources, and suggests that security tools scanning email for malicious links may be the cause
The same behavior was also verified with canary links

Unlisted scans and urlscan Pro access

urlscan Pro gives paid users and enterprises broader scan access that covers Unlisted scans as well as Public ones
On urlscan.io, an Unlisted scan does not appear on the public page or in search results, but it remains visible to urlscan Pro platform customers
- urlscan Pro customers are said to be limited to verified security researchers or reputable companies
TheHive's Cortex-Analyzers are an example of an unintended exposure path
- the urlscan.io analyzer explicitly uses the public:on setting
- because of this setting, a link can show up as unlisted even when the urlscan account's visibility is set to Private
- the related code is in urlscan.py in Cortex-Analyzers
In this case, even if the data is not fully public it can still be visible to urlscan Pro users, so the possibility of exposing more sensitive information remains
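
To make the visibility point concrete, here is a minimal sketch of a submission helper that refuses to rely on a default and always sends an explicit visibility. It assumes the urlscan.io submission endpoint, the `API-Key` header, and the `visibility` JSON field as described in urlscan.io's public API documentation; check the current documentation before relying on it. The function name `build_scan_request` and the placeholder API key are hypothetical, and nothing is sent over the network here.

```python
import json
import urllib.request

# Assumed endpoint and field names; verify against urlscan.io's API docs.
URLSCAN_SUBMIT_ENDPOINT = "https://urlscan.io/api/v1/scan/"
ALLOWED_VISIBILITY = {"public", "unlisted", "private"}


def build_scan_request(url, api_key, visibility="private"):
    """Build (but do not send) a scan submission with explicit visibility."""
    if visibility not in ALLOWED_VISIBILITY:
        raise ValueError(f"unknown visibility: {visibility!r}")
    body = json.dumps({"url": url, "visibility": visibility}).encode("utf-8")
    return urllib.request.Request(
        URLSCAN_SUBMIT_ENDPOINT,
        data=body,
        headers={"API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical usage: inspect the payload before anything leaves the machine.
req = build_scan_request("https://example.com/", api_key="YOUR-API-KEY")
payload = json.loads(req.data)
print(payload["visibility"])
```

Defaulting to the most restrictive value means an integration has to opt in to wider sharing, rather than inherit it from a hard-coded setting.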

Scan counts and canary token results

urlscan.io scan counts over the previous 24 hours were as follows
- Public: 398,563
- Unlisted: 328,147
- Private: 955,432
Access results checked with canary tokens were as follows
- a link submitted to urlscan.io as unlisted was accessed 12 times within 1 hour of submission
- a link submitted to Hybrid Analysis through the API rather than the browser was accessed 10 times within 1 hour of submission
- some IP addresses accessed the unique links submitted to both services, and the source IPs used anonymization services
The list of IP addresses involved has been published in a separate file

Removing sensitive links and precautions when using these services

urlscan.io and Hybrid Analysis both provide a procedure for reporting links and having them removed
- urlscan.io FAQ
- Hybrid Analysis sensitive file removal guide
Removal and sharing scope are more complicated on Hybrid Analysis
- all files submitted to the Public Sandbox are searchable and available to the whole world
- even with the “Do not share my sample with the community” checkbox selected, screenshots and the actual reports remain available
- “do not share” applies only to the actual input sample
Check scan visibility before using a service
Opening links or files from these URL databases can lead to phishing attempts, real malicious files, and malicious links
If access is necessary, do it in a sandbox environment

1 comment

GN⁺ 2024-03-08

Hacker News opinions

The root problem is that links without access control are treated as private merely because there is no index of their public identifiers
Last month, finding AWS account IDs through buckets was also widely discussed on HN[0], and the consensus in the comments was that it is wrong for security to depend on the assumption that account identifiers are private
The same concept applies here; this is not a new security issue but another form of search-operator-based exploration (dorking)
[0]: https://news.ycombinator.com/item?id=39512896
- The problem is that links leak
  In theory a link with 256 hexadecimal characters, i.e. 1024 bits, is far harder to guess than a 32-character username plus a 32-character password
  https://site.com/[256chars] has 2^1024 combinations, so brute-forcing it is impossible in practice
  By contrast, https://site.com/[32chars] plus a 32-character password has 2^256 combinations (again assuming hexadecimal characters), which is also practically impossible but more likely than the former
  You can think of it as https://site.com/[32chars][32chars]
  Even though the former is harder to guess, URLs leak far more often than passwords
- I may be missing some details, but the root problem seems to be that private messages between people are assumed to be private, while in reality the platform delivering the message reads it and accesses the links
  Here "message" broadly covers email, DMs, and even links pasted into documents
- Slightly tangential, but a consultant recently advised that because every NAR file name contains a large hash, it is fine to put private Nix closures in a publicly accessible S3 bucket
  It felt uncomfortable, so in the end we chose another approach, but I kept wondering how much difference there really is between having a “secret” in the URL and having the secret in a token submitted when requesting the URL
  My conclusion is that tokens can be issued per customer, and that they can be revoked when monitoring of access logs shows suspicious behavior
  And as others have said, there is also a different mindset about how much importance to give to keeping the list of file names secret
  At the scale where Amazon could make a mistake, accidentally exposing the file-name list of a public bucket is something 99% of users would not care about, so it seems low priority
- A company I used to work at ran into an S3 bucket name collision while working with a customer company; it turned out both sides had considered hyphenated-company-name a good S3 bucket name
  Naturally, our company lost that contest
  Since then, the small lesson for working on AWS has been to generally name buckets in the form -
  If it really needs to be private, encrypt the project name as well, and provide a script that lists buckets by “friendly” names
  Hosting services always involve odd compromises, so the technically perfect approach (fully random identifiers) can add far more operational burden than the imperfect one, i.e. descriptive names
- I wonder whether there is any difference between a private link that contains a password and a site where you enter a password after following the link
  Bitwarden Send creates a link to send to others, with a long random string after the #
  I use it regularly, so I would like to know whether it has any security problem
  At least the link can be revoked and can auto-expire after a few days, whereas ordinary passwords usually do not behave that way
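The keyspace arithmetic in the thread above can be checked with a few lines. This is only a sketch that assumes every character is drawn uniformly from the 16 hexadecimal digits; the helper name `keyspace_bits` is made up for illustration.

```python
import math
import secrets


def keyspace_bits(length, alphabet_size):
    """Bits of entropy for `length` symbols drawn uniformly from an alphabet."""
    return length * math.log2(alphabet_size)


# 256 hex characters in the URL path.
link_bits = keyspace_bits(256, 16)
# 32-character username plus 32-character password, also hex.
credential_bits = keyspace_bits(32 + 32, 16)

# secrets.token_hex(n) returns 2*n hex characters from a CSPRNG.
example_link = "https://site.com/" + secrets.token_hex(128)

print(link_bits, credential_bits)
```

Both numbers are far beyond brute force, which is the commenter's point: the weakness of a link is how easily it leaks, not how easily it is guessed.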
If you need to build a private sharing link, you can store the private value in the hash part of the URL
The hash is not transmitted in the DNS query or the HTTP request
For example, when you visit links.com?token= the link is transmitted together with its search parameters and can be stored by intermediaries such as Cloudflare
When you visit links.com# instead, the hash part never leaves the browser
When handling data in the hash part, it is convenient to encode it as a URL Safe Base64 string
So the flow is JS Object ↔ JSON String ↔ URL Safe Base 64 String
- If you are using HTTPS, the parameter string and the path are encrypted too, so an intermediary would have to be able to decrypt the traffic in order to read that secret
  The rest is correct, I just wanted to add this nuance about HTTPS encryption
- There is a big caveat: JavaScript running on that page, even if it looks harmless, can send the fragment anywhere on the internet
  Putting the secret in the fragment helps, but it is not perfect
  This is not just a theoretical point; I have seen private tokens leak from the fragment this way several times in practice
- Is there some DNS feature I don't know about that queries more than the domain part?
  For https://example.com?token=<secret> the DNS query should only be for “example.com”
- Thinking about how to solve this problem, email-based login and account reset come to mind in particular
  Do bots that follow links inside email execute JavaScript? Is there a risk that a POST triggered by JavaScript activates the actual action?
- For reference, that part is called the fragment
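The object ↔ JSON ↔ URL-safe Base64 flow described in this thread can be sketched as follows, together with a helper that shows which part of a URL ends up in the HTTP request target. The names `encode_fragment`, `decode_fragment`, and `request_target` are hypothetical, and the token value is a placeholder. As the replies note, this only keeps the value out of the request itself; script running on the page can still read the fragment.

```python
import base64
import json
from urllib.parse import urlsplit


def encode_fragment(obj):
    """Object -> compact JSON -> URL-safe Base64 without padding."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_fragment(fragment):
    """Reverse of encode_fragment; restores the stripped '=' padding."""
    padded = fragment + "=" * (-len(fragment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def request_target(url):
    """Path and query as sent in the HTTP request line; no fragment."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


link = "https://links.com/#" + encode_fragment({"token": "s3cr3t"})
print(request_target(link))                           # fragment is absent
print(request_target("https://links.com/?token=s3cr3t"))  # query is present
```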
Links that are not part of a fast redirect loop are bound to be copied and pasted for sharing
That is what URLs are fundamentally for; they are universal and make it easy to access a resource served over any protocol
For anything that is not short-lived, access control should live outside the URL
When a link is shared over a channel that is not end-to-end encrypted, the first entity to access that URL is not the recipient but the channel service
This can be legitimate, like Bitwarden fetching a favicon for the sake of user experience, or malicious, like the Facebook Messenger crawler wanting to learn more about what is being shared in private messages
Such scanner tools will not improve the user experience
If they stated clearly that scan results become public, some users would think twice before using the service, and that would be bad for business whether they are free users or pro license holders
“Private” links that can be used without limit have always seemed a little suspect
In the end it is just security by obscurity
When sharing something like Google Docs there is at least an option that clearly says “anyone with the URL can get access”
In systems I have built, when something like this was needed I usually used signed URLs with a lifetime of only a few minutes
The URL is mostly an implementation detail and is not shown to the user directly, although it can still show up in the browser's debug screen
- If the key space is large enough, there is functionally no difference between a private link and a link protected by a username/password or an API key
- Google Docs sharing is unfortunately based on the document ID, so you cannot re-enable access under a new URL
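The short-lived signed URLs mentioned in this thread can be sketched with an HMAC over the path and an expiry timestamp. This is an illustration under stated assumptions, not any particular provider's scheme: the key, the parameter names `expires` and `sig`, and the function names are all made up, and the `now` argument exists only so the example is deterministic.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET_KEY = b"demo-key-change-me"  # placeholder; load from a secret store


def _signature(path, expires):
    message = f"{path}?expires={expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path, ttl_seconds=300, now=None):
    """Return path plus an expiry and an HMAC covering both."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url, now=None):
    """Accept only unexpired URLs whose signature matches."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(parts.path, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))


url = sign_url("/photos/42.jpg", ttl_seconds=300, now=1000)
print(verify_url(url, now=1100), verify_url(url, now=1301))
```

A leaked copy of such a URL, for instance in a scanner database, stops working once the expiry passes, which is what makes the short lifetime useful.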
On the internet, if something is protected by nothing other than a random string inside the URL, it is not really private
It is the same as the internet-connected webcams that turn up when you search for them
Didn't we already know this? I don't understand why the “who is responsible” section doesn't touch on this point at all
- Such links are very useful in contexts where “the security is sufficient for the use case”
  Not everything needs the highest level of security, and in some cases a single barrier that prevents broad sharing is enough
  For example, when you press “create share link” in a photo gallery and send someone the photo link, you don't want them to have to type a password
  The photo should appear as soon as the link is opened, and for that purpose this is fine
  One of the examples given here is exactly such a case, and it fits that use case
  From a privacy standpoint too, even with a login step the end user could still re-share a screenshot at that point
  The security matches the use case
  The situation is that the user now has the photo link and could share it again, but you trust that they will not do so deliberately
  The bigger issue here is not the link itself, but that a security analysis tool scans every link the user receives by email and makes them accessible to other users of that community as well
  When I send someone a photo, that is far more re-sharing than I intended
A workaround for this email-based authentication problem is to use a temporary one-time code, without going as far as creating an account with a password
Then it is not a big problem even if the URL is shared by mistake
1. The user visits the “private” link, or it can even be a public link where the email address is entered again
2. The site emails the user a time-limited one-time code
3. The user enters the temporary code to verify ownership of the email account
4. With the flow continued through HTTP cookies or session data, there is reasonable confidence that the owner of the email account was involved
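Steps 2 and 3 above can be sketched as a pair of functions. This is a minimal in-memory illustration: the 10-minute lifetime, the 6-digit code, the dictionary store, and the function names are assumptions, and a real service would email the code rather than return it.

```python
import hmac
import secrets
import time

CODE_TTL_SECONDS = 600   # assumed lifetime for the example
_pending = {}            # email -> (code, expires_at); stand-in for a real store


def issue_code(email, now=None):
    """Create a 6-digit code for this email; a real site would email it."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"
    _pending[email] = (code, now + CODE_TTL_SECONDS)
    return code


def redeem_code(email, submitted, now=None):
    """Return True once for a correct, unexpired code; any attempt consumes it."""
    now = time.time() if now is None else now
    entry = _pending.pop(email, None)
    if entry is None:
        return False
    code, expires_at = entry
    if now > expires_at:
        return False
    return hmac.compare_digest(code.encode("utf-8"), submitted.encode("utf-8"))


code = issue_code("user@example.com", now=0)
first_try = redeem_code("user@example.com", code, now=10)
second_try = redeem_code("user@example.com", code, now=11)
print(first_try, second_try)
```

Because the code travels separately from the link and is consumed on the first attempt, a link that later lands in a scanner database does not by itself grant access.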
Off topic, but the link goes to Cloudflare Radar, and this service apparently mines 1.1.1.1 data
I thought 1.1.1.1 did not use user data for any purpose
- Cloudflare does not sell that data or use it for marketing, but it was given that address in the first place because APNIC wanted to study the noise traffic arriving at 1.1.1.1
It would be great if someone smarter could explain this. What is the difference between these two?
1. domain.com/login user: John password: 5 char random password
2. domain.com/12 char random url
  Assuming both have the same brute-force protection or rate limiting, or neither does, why exactly is 1 more secure than 2?
- From an information theory standpoint there is no difference
  In practice there is
  Possession-based secrets and knowledge-based secrets are different
  A URL is something you have; if you leave it somewhere accessible, someone can take it
  A password is something you know, and if managed properly it cannot be taken. The lead pipe attack is the exception, of course
  Another category is biometric-based, which covers retina or fingerprint scans
- This article itself is the reason
  (1) requires information delivered over a separate path for authentication, and it is information people are used to storing safely
  whereas the URL in (2) gets handled like a URL
  URLs are frequently logged, recorded, shared, and passed around
  For example, a company firewall recording the username and password used to log in to a service is clearly bad, but recording the accessed URL can seem fine
  The latter is only an example; given TLS's end-to-end guarantees, neither should be accessible
- Two things
  1. The word “password” is a magic word that makes people less likely to paste it just anywhere
  2. A username and password are usually not copied and pasted together, and there is no standard way to store them side by side; they are two separate pieces of information
- In the context of this article, the difference is that security scanning software used by the company or the user indexes the 12-character part of the link found inside email and in some cases puts it into public scan results
  In addition, if domain.com/12-char-password is requested without HTTPS, the initial request is sent unencrypted even if it is then redirected, which makes a man-in-the-middle attack possible
  For a login page, by contrast, there are more ways to ensure that the password is only ever submitted over HTTPS
- Some time ago I researched whether it is acceptable to put an authentication token in a query parameter
  One of the big problems is that many logging applications record the full URL somewhere, and then you are effectively leaving the “password” in the logs
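One partial mitigation for the logging problem raised in this thread is to redact known secret-bearing query parameters before a URL is written to a log. The sketch below is an assumption-laden illustration: the parameter names in `SENSITIVE_PARAMS` and the function name `redact_url` are invented, and it does nothing for secrets embedded in the path, as in domain.com/12 char random url.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list; adjust to the parameters your own services use.
SENSITIVE_PARAMS = {"token", "sig", "pwd", "password", "key"}


def redact_url(url):
    """Mask sensitive query values and drop the fragment before logging."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name.lower() in SENSITIVE_PARAMS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), "")
    )


print(redact_url("https://domain.com/file?token=abc123&page=2#frag"))
```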
All media and photos uploaded to a private airtable.com app are public links
Anyone who knows the URL can access them without authentication
- Web developers who load images from a CDN or API face a dilemma
  A plain image tag cannot set an Authorization header carrying a token on the request, the way fetch() can for an API request
  The only available choices are to append a token to the URL or to use cookie authentication
  Cookie authentication only works when the CDN is on the same domain, and in many cases even a subdomain can be a problem
- In apps that use a CDN this approach is quite common, not limited to airtable
  I agree that it can be potentially problematic
Zoom meeting links often carry the password as a query parameter
Is such a link a “private secure” link? Is a link without a password a “private secure” link?
- If the password is random for each meeting, putting it in the URL is not that bad
  By the time the URL shows up somewhere else, the meeting will probably already be over and gone
  In reality, though, nobody pays attention to this, and people want to “click to join” without typing anything extra
  The earlier “meeting ID only” approach was very easy to guess
