Feedback Requested: Comprehensive Research PR - Includes Code Review, OPSEC, On-chain Analysis, etc.

:tada: Comprehensive Research Contribution: Web3 Privacy Projects

Date: October 25, 2025
PR: #1997


TL;DR

Just submitted a large research contribution to the Web3Privacy Explorer. 40 privacy projects now have comprehensive documentation including:

  • GitHub code analysis
  • Team details
  • Security assessments
  • Technical deep-dives

Total contribution: 265 files, 12,378 lines of analysis


What Was Submitted

Research Scope: 171 Analyzed β†’ 40 Submitted, 788 More in Pipeline

This submission represents the first wave of a comprehensive ecosystem analysis:

Initial Analysis Pool:

  • 171 projects analyzed (or research attempted) in the initial deep-dive
  • 40 projects passed quality threshold for submission (had sufficient verified data)
  • 131 projects had incomplete or insufficient data for submission

Research Pipeline:

  • 788 projects in research-required/ folder awaiting analysis
    • 705 from Web3Privacy Explorer database
    • 48 verified from original archive
    • 35 privacy ecosystem projects (funders, infrastructure, tools)

Total ecosystem coverage: 959 Web3 privacy projects (171 analyzed + 788 pipeline)

This PR includes the 40 projects where we found enough information to meet our quality standards through multi-source verification. The methodology that worked for these 40 will now be applied to the remaining 919 projects.

See examples of research depth:

What Each Submitted Project Includes

:bar_chart: CODE_REVIEW.md - Repository analysis (stars, forks, contributors, commit activity, languages)
:busts_in_silhouette: TEAM.md - Verified leadership information and organization structure
:locked: SECURITY.md - Security features, audit status, privacy mechanisms
:gear: TECHNICAL.md - Technology stack, architecture, capabilities

File Structure Example

web3privacy/explorer-data/src/projects/cake-wallet/
β”œβ”€β”€ index.yaml              ← Basic metadata + links to reports
β”œβ”€β”€ logo.png                ← Project logo
β”œβ”€β”€ README.md               ← Project overview
β”œβ”€β”€ project_metadata.json   ← Aggregated data
└── reports/
    β”œβ”€β”€ CODE_REVIEW.md      ← 160 lines of GitHub analysis
    β”œβ”€β”€ TEAM.md             ← 98 lines of team info
    β”œβ”€β”€ SECURITY.md         ← 117 lines of security analysis
    └── TECHNICAL.md        ← 172 lines of tech details

Research Methodology

This research follows a systematic, multi-phase approach based on best available information:

Information Flow Diagram

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    PHASE 1: BROAD SEARCH                        β”‚
β”‚  Automated discovery across official sources (parallel agents)  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Website    β”‚   GitHub     β”‚  Social      β”‚   News/Blog     β”‚
β”‚   Scraping   β”‚   API        β”‚  Media       β”‚   Aggregators   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚  verified_data  β”‚
                    β”‚  (confidence    β”‚
                    β”‚   scores 0-1)   β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    PHASE 2: DEEP DIVES                          β”‚
β”‚       Specialized analysis with domain-specific tools           β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   GitHub     β”‚   OSINT      β”‚  On-Chain    β”‚   Smart         β”‚
β”‚   Analysis   β”‚   Tools      β”‚  Analysis    β”‚   Contract      β”‚
β”‚   (commits,  β”‚ (Spiderfoot, β”‚  (APIs,      β”‚   Review        β”‚
β”‚   languages) β”‚  OPSEC)      β”‚  explorers)  β”‚  (Solidity)     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚  analysis/      β”‚
                    β”‚  (JSON files    β”‚
                    β”‚   with data)    β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                PHASE 3: REPORT GENERATION                       β”‚
β”‚    Clean markdown reports (internal methodology removed)        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   CODE_      β”‚   TEAM.md    β”‚  SECURITY.md β”‚   TECHNICAL.md  β”‚
β”‚   REVIEW.md  β”‚              β”‚              β”‚                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
                  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                  β”‚  Web3Privacy Explorer β”‚
                  β”‚  (Public Database)    β”‚
                  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Four-Layer Data Architecture

Our research maintains a strict separation between internal methodology and public presentation:

Layer 1: Internal Research (sources/)

  • verified_data.json - Raw data with confidence scores (0.0-1.0)
  • Multi-source verification metadata
  • Research methodology tracking
  • Never publicly exposed
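
To make the layering concrete, here is a minimal sketch of what a single Layer 1 entry could look like. The field names are illustrative assumptions, not the actual schema; the values mirror the founder example shown under Multi-Source Verification below.

```python
import json

# Hypothetical Layer 1 record -- field names are assumptions for
# illustration, not the project's real verified_data.json schema.
verified_entry = {
    "field": "founder",
    "value": "Vikrant Sharma",
    "confidence": 0.95,
    "sources": [
        {"type": "official_interview", "site": "changenow.io"},
        {"type": "profile", "site": "linkedin.com"},
    ],
}

print(json.dumps(verified_entry, indent=2))
```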

Layer 2: Analysis Data (analysis/)

  • github_analysis.json - Repository metrics, languages, activity
  • smart_contracts.json - Contract addresses, deployment info
  • osint_data.json - Infrastructure and team OSINT research
  • oso_data.json - Open Source Observer data
  • Intermediate processing layer

Layer 3: Public Reports (reports/)

  • CODE_REVIEW.md - Clean GitHub analysis
  • TEAM.md - Verified team information
  • SECURITY.md - Security features and audits
  • TECHNICAL.md - Technology and capabilities
  • Clean, professional markdown - no internal methodology exposed

Layer 4: Export Metadata (root)

  • project_metadata.json - Aggregated data for APIs
  • index.yaml - Web3Privacy Explorer format
  • README.md - Project landing page
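
To show how the layers connect, here is a minimal sketch of Layer 2 analysis files being aggregated into the Layer 4 export. File names follow the layout above; the actual pipeline likely differs.

```python
import json
from pathlib import Path

def build_project_metadata(project_dir: Path) -> dict:
    # Aggregate Layer 2 analysis JSON into the Layer 4 export
    # (a sketch: error handling and schema validation omitted).
    metadata = {"project": project_dir.name}
    for name in ("github_analysis", "smart_contracts", "osint_data", "oso_data"):
        path = project_dir / "analysis" / f"{name}.json"
        if path.exists():
            metadata[name] = json.loads(path.read_text())
    out = project_dir / "project_metadata.json"
    out.write_text(json.dumps(metadata, indent=2))
    return metadata
```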

Research Quality Standards

Every data point follows our β€œconstitutional research” methodology:

:white_check_mark: No Placeholders or Fabrication

Best available information with honest limitations

  • No placeholder text like β€œTeam of 5-10 developers”
  • No estimates like β€œApproximately $2M in funding”
  • If we don’t know it, we document it as a gap
  • Note: Sources themselves may contain errors, tools can hallucinate, experimental pipelines may introduce issues - we ask the community to submit corrections via PRs/issues

:white_check_mark: Multi-Source Verification

Critical facts require 2+ independent sources

Founder: Vikrant Sharma
β”œβ”€β”€ Source 1: Official interview (changenow.io)
β”œβ”€β”€ Source 2: LinkedIn profile
└── Confidence: 0.95

:white_check_mark: Confidence Scoring (0.0 - 1.0)

  • 1.0 - Official source (website, GitHub)
  • 0.9-0.95 - Secondary source (verified interview, LinkedIn)
  • 0.7-0.85 - Tertiary source (news, community)
  • < 0.7 - Not included in public reports
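
In code, the cutoff amounts to a one-line filter. A minimal sketch, assuming each data point carries a confidence score as described above:

```python
PUBLIC_THRESHOLD = 0.7  # scores below this never reach reports/

def publishable(data_points: list[dict]) -> list[dict]:
    # Keep only data points that clear the public-report cutoff;
    # everything else stays in the internal sources/ layer.
    return [p for p in data_points if p.get("confidence", 0.0) >= PUBLIC_THRESHOLD]
```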

:white_check_mark: Honest Gap Reporting

We explicitly document what we DON’T know.

This prevents others from wasting time on failed research approaches.


Decentralized Agent Swarms (Tool Orchestration)

To scale this methodology across 40 projects, we deployed parallel LLM agents for task coordination:

Batch Processing Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚            CONTROL AGENT (Task Orchestrator)                β”‚
β”‚  Assigns projects to worker agents, monitors progress       β”‚
β”‚  LLM manages which tools to run, not data generation        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                      β”‚
      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
      β–Ό               β–Ό               β–Ό             β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Agent 1 β”‚     β”‚ Agent 2 β”‚     β”‚ Agent 3 β”‚...β”‚ Agent 6 β”‚
β”‚ Batch 1 β”‚     β”‚ Batch 2 β”‚     β”‚ Batch 3 β”‚   β”‚ Batch 6 β”‚
β”‚ 7 proj  β”‚     β”‚ 7 proj  β”‚     β”‚ 7 proj  β”‚   β”‚ 7 proj  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
      β”‚               β”‚               β”‚             β”‚
      β–Ό               β–Ό               β–Ό             β–Ό
For each project (LLM decides which tools to run):
1. WebSearch official sources β†’ real-time data
2. GitHub API calls β†’ live repo data
3. Python scripts β†’ process/analyze results
4. Generate markdown reports β†’ format output
5. Verify quality β†’ cross-check sources
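
A minimal sketch of that batching logic, using Python threads as stand-ins for worker agents (function names and the validation step are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def research_project(slug: str) -> dict:
    # Placeholder: in the real pipeline an agent runs WebSearch,
    # GitHub API calls, Python scripts, and report generation here.
    return {"project": slug, "status": "researched"}

def run_batches(projects: list[str], agents: int = 6, batch_size: int = 7) -> list[dict]:
    # Split projects into batches of 7 and hand each batch to one of
    # 6 parallel worker agents, mirroring the diagram above.
    batches = [projects[i:i + batch_size] for i in range(0, len(projects), batch_size)]
    with ThreadPoolExecutor(max_workers=agents) as pool:
        per_batch = list(pool.map(lambda b: [research_project(p) for p in b], batches))
    # The control agent would validate each batch's output before acceptance.
    return [result for batch in per_batch for result in batch]
```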

What LLMs Did vs Didn’t Do

LLMs handle (coordination & transformation layers):

  • Deciding which tools to run (WebSearch, GitHub API, Python scripts)
  • Coordinating task sequences (search first, then analyze)
  • Extracting structured data from API responses :warning: hallucination risk
  • Formatting results into markdown reports :warning: hallucination risk
  • Cross-validating between different data sources

LLMs DIDN’T do:

  • Generate data from training knowledge
  • Make up facts or statistics
  • Fill in missing information with guesses

Where hallucination risk exists:

  • Coordination layer: LLM might choose wrong tool or misinterpret task requirements
  • Transformation layer: LLM might misextract data when parsing API responses or formatting markdown

Risk mitigation through validation layers:

  • Multi-agent cross-validation (different agents verify same data)
  • Confidence scoring on extracted data
  • Manual spot-checks of outputs
  • Source URLs preserved in internal files for verification
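
A sketch of what the multi-agent cross-validation could look like, assuming each agent emits simple field/value extractions (the record shape is an assumption):

```python
from collections import Counter

def cross_validate(field: str, extractions: list[dict]) -> dict | None:
    # Accept a value only when two or more independent agents extracted
    # the same thing for the same field (mirrors the 2+ source rule).
    counts = Counter(e["value"] for e in extractions if e["field"] == field)
    if not counts:
        return None
    value, votes = counts.most_common(1)[0]
    if votes >= 2:
        return {"field": field, "value": value, "agents_agreeing": votes}
    return None  # disagreement -> route to a manual spot-check
```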

All actual research data comes from live web searches, API calls, and real-time sources - not LLM training data.

Parallel Execution Benefits

Quality assurance: Each agent runs independent verification using real-time tools, and the control agent validates outputs before acceptance. Multiple agents working in parallel allow faster processing while cross-validation maintains research quality.


Quality Metrics

By The Numbers

Metric                          Value
──────────────────────────────  ──────────
Projects committed              40
Total research files            265
Lines of analysis               12,378
Average confidence              0.85-0.95
No intentional fabrication      Yes
Multi-source verification       100%
Community corrections welcome   Yes

Data Quality Breakdown

Tier 1 (Basic Info): 100% complete

  • Website, GitHub, description, status

Tier 2 (Detailed Data): 70-80% complete

  • Team information, technology stack, security features

Tier 3 (Advanced): 40-60% complete

  • Full funding details, complete team rosters, detailed on-chain metrics

ATTEMPTED Files: Honest Gap Reporting

18 projects have blockchain_metrics_ATTEMPTED.md files documenting research that found no data:

  • Attempted: Etherscan, DeFiLlama, block explorers
  • Result: No verifiable on-chain data found
  • Reason: Could be testnet-only, private chains, or insufficient documentation
  • Purpose: Prevent others from repeating failed approaches
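
A hedged sketch of how such a file could be generated so the failed path is recorded mechanically (the template wording is illustrative, not the actual file format):

```python
from datetime import date
from pathlib import Path

ATTEMPTED_TEMPLATE = """# Blockchain Metrics: ATTEMPTED ({day})

- Attempted: Etherscan, DeFiLlama, block explorers
- Result: no verifiable on-chain data found
- Possible reasons: testnet-only, private chains, insufficient documentation
"""

def write_attempted(project_dir: Path) -> None:
    # Record the failed research path so others don't repeat it.
    out = project_dir / "blockchain_metrics_ATTEMPTED.md"
    out.write_text(ATTEMPTED_TEMPLATE.format(day=date.today().isoformat()))
```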

Research Tools & Technology Stack

Data Collection Tools

  • WebFetch - Automated website content extraction (real-time data)
  • WebSearch - Multi-source verification and news discovery (real-time data)
  • GitHub API - Repository metrics and code analysis (live API calls)
  • Blockchain APIs - Etherscan, DeFiLlama, custom block explorers (on-chain data)
  • Playwright MCP - Browser automation for debugging and verification
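
For example, the live repository metrics behind CODE_REVIEW.md can come from a single GitHub REST call. A minimal, unauthenticated sketch (the example repo path is an assumption):

```python
import requests

def github_metrics(repo: str) -> dict:
    # Fetch live repository stats from the GitHub REST API
    # (unauthenticated, so rate limits apply).
    resp = requests.get(f"https://api.github.com/repos/{repo}", timeout=30)
    resp.raise_for_status()
    data = resp.json()
    return {
        "stars": data["stargazers_count"],
        "forks": data["forks_count"],
        "language": data["language"],
        "last_push": data["pushed_at"],
        "source_url": data["html_url"],  # preserved for later verification
    }

# Usage (repo path assumed for illustration):
# github_metrics("cake-tech/cake_wallet")
```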

Analysis Tools

  • Python scripts - Custom data processing and analysis
  • Spiderfoot - OSINT reconnaissance on team members and infrastructure
  • Smart Contract Review - Solidity analysis, security pattern detection

Agent Coordination

  • LLM agents - Used for tool orchestration and task management (NOT for data generation)
  • Important: LLMs coordinated which tools to run and how to process results
  • Data sources: All actual research data came from web APIs, searches, and live sources - not LLM training data

Quality Assurance

  • β€œConstitutional research” methodology - No intentional fabrication, best available sources
  • Multi-agent verification - Cross-checking between agents
  • Confidence scoring - Automated quality metrics
  • Manual spot-checks - Random sampling of 5+ projects per batch
  • Community review - Open to corrections via PRs and issues

What’s Next

Immediate Next Steps

  1. Tag all 40 project repositories on GitHub

    • Notify project teams that their research is available
    • Encourage them to review and submit corrections/updates
    • Invite them to fill data gaps (team info, funding, on-chain metrics)
  2. Research-required projects (47 remaining)

    • Work to obtain basic surface-level information (website, repo, description)
    • Many are promising projects but lack public documentation
    • Once basics are found, apply same comprehensive research methodology
  3. Regular updates as projects evolve

    • New releases, team changes, security audits
    • On-chain metric updates for active protocols
    • Community-submitted corrections and additions

Projects Awaiting Further Research (47 total)


These projects need basic surface-level information before comprehensive research can begin:

1inch-privacy, aleo, anoma, brave-browser, curve-privacy, curvy, dark-forest, dash, dusk-network, eth2-deposit-cli, farcaster, gitcoin-grants, hinkal, horizen, hurricane-core, inco, keep-network, lens-protocol, maci, mask, metamask-snaps, mina-protocol, night, nighthawk-wallet, nocturne, nuconstruct, nucypher, polygon-hermez, polygon-zero, pse-privacy-scaling-explorations, railway, rarime, ronin, samourai-wallet, snapshot-x, starknet, taceo, token-shielder, zama, zecrey, zkbob, zksync-era, zupass

If you’re involved with any of these projects - please submit basic info (website, GitHub, brief description) so we can conduct comprehensive research!

Long-term Vision: Decentralized Research Infrastructure

The goal is to scale this methodology across the entire Web3 privacy ecosystem through:

  1. Scale to 700+ projects - Apply this research methodology broadly
  2. Continuous monitoring - Detect new commits, releases, security advisories
  3. Automated updates - Generate PRs when significant changes detected
  4. Community-driven - Multiple contributors using various tools and approaches
  5. Open methodology - Anyone can apply this research framework with their own infrastructure
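
A sketch of the continuous-monitoring idea using the GitHub commits API's "since" parameter (the endpoint and parameter are real; the surrounding pipeline is hypothetical):

```python
import requests

def new_commits_since(repo: str, since_iso: str) -> list[dict]:
    # List commits pushed after the last research snapshot; a non-empty
    # result could trigger an automated update PR.
    resp = requests.get(
        f"https://api.github.com/repos/{repo}/commits",
        params={"since": since_iso},  # ISO 8601, e.g. "2025-10-25T00:00:00Z"
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```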

Why Code Review & OPSEC Matter for Privacy Projects

Privacy projects require deeper analysis beyond basic features. Here’s why:

:magnifying_glass_tilted_left: Code Review is Critical for Privacy Claims

  • Trust verification: β€œPrivacy-preserving” is a claim that requires code-level validation
  • Implementation quality: Privacy features in whitepaper β‰  privacy features in production code
  • Active development: Regular commits indicate ongoing security maintenance
  • Community involvement: Contributor count and diversity signal decentralization
  • Example: A mixer claiming β€œtrustless privacy” but with centralized admin keys visible in code

:detective: OPSEC Research Protects Users

  • Infrastructure analysis: Centralized servers for β€œdecentralized” privacy tools
  • Funding transparency: Who’s backing privacy infrastructure matters for trust
  • Attack surface: OSINT reveals potential vectors (DNS, hosting, dependencies, team member vulnerabilities)

These layers are especially important for privacy projects because:

  1. Users trust these tools with sensitive data
  2. Privacy failures can have severe real-world consequences
  3. Marketing claims often exceed technical reality
  4. Decentralization claims need verification
  5. Security research requires technical depth

Why This Matters

For Projects:

  • Professional, accurate documentation of their work
  • Technical validation of privacy claims
  • Regular updates as they evolve
  • Opportunity to correct errors and fill gaps

For Users:

  • Comprehensive, up-to-date information
  • Technical verification of privacy features
  • Honest assessment of limitations and risks
  • Verified, multi-source data

For Researchers:

  • Avoid duplicating failed research
  • Build on existing verified work
  • Contribute to growing knowledge base
  • Technical depth beyond surface-level analysis

Contributing & Feedback

How to Help

Project Teams: Found an error? Have updated information?

Researchers: Want to contribute?

Community: Questions or suggestions?

  • Reply to this thread
  • DM me on the forum
  • Open a discussion on GitHub

:light_bulb: What Should We Prioritize Next?

I’d love the community’s input on next steps:

Potential priorities:

  1. Tag all 40 projects on GitHub to notify teams?
  2. Focus on the 47 research-required projects to get basic info?
  3. Deeper OPSEC research on high-profile projects (infrastructure, dependencies)?
  4. On-chain analysis for protocols with smart contracts?
  5. Automated monitoring for security advisories and major updates?
  6. Something else entirely?

Specific questions:

  • Which projects from the research-required list are most important to the community?
  • What additional data layers would be most valuable? (Funding details? Token metrics? Audit history?)
  • Should we prioritize breadth (more projects) or depth (more detailed analysis)?
  • Any tools or data sources we should integrate?

Drop your thoughts below! This is community-driven research - your input shapes the roadmap.


Repository Links

Main Research Repo:

PR to Web3Privacy Explorer:

Forked Explorer (with all files):


Acknowledgments

Huge thanks to:

  • Web3Privacy team for maintaining this essential database
  • All project teams for building privacy-preserving infrastructure
  • Web3Privacy community for feedback and support
  • Future contributors who will help scale this research methodology

License

All research data submitted under Open Database License (ODbL) to match Web3Privacy Explorer licensing.


Important Disclaimers

This research is based on publicly available information and best-effort analysis. Limitations include:

  • Source reliability: We rely on official sources, but those sources themselves may contain errors or outdated information
  • Tool limitations: Research tools (APIs, OSINT tools, browser automation) can miss data, timeout, or introduce errors
  • LLM coordination & transformation risks: While research data comes from live sources (not LLM training data), LLMs can introduce errors in:
    • Coordination layer: Choosing wrong tools or misinterpreting tasks
    • Transformation layer: Misextracting data from API responses or incorrectly formatting outputs
    • Validation layers (multi-agent verification, manual spot-checks) mitigate but don’t eliminate this risk
  • Experimental pipeline: This methodology uses experimental agent swarms and novel approaches - bugs may exist
  • Point-in-time data: Information accurate as of October 2025 - projects evolve rapidly
  • Coverage gaps: Some areas (funding, complete team rosters, on-chain metrics) have limited public data

We actively encourage corrections! If you spot errors, outdated info, or have additional data:

  • Submit a PR to the Web3Privacy Explorer
  • Open an issue on our research repo
  • Comment on this thread

This is a living dataset that improves through community collaboration.


Questions? Comments? Let’s discuss below! :backhand_index_pointing_down:

I’m happy to explain the methodology in more detail, share specific examples, or discuss how we can scale this to cover more projects.

:locked: Privacy is a fundamental right.


Research conducted October 2025
Contributor: @M0nkeyFl0wer


Thanks for the thorough research and suggested additions for the explorer. See the end of this post for a TL;DR.

I’ll respond to or comment on each section individually:

What Each Submitted Project Includes:
CODE_REVIEW.md
TEAM.md
SECURITY.md
TECHNICAL.md

  • Code review is an important data point for users to see how active and well-maintained a project is.
  • Team OSINT research helps users learn more about the team behind a project.

Research Quality Standards

What categories were the research quality standards applied to?

Decentralized Agent Swarms (Tool Orchestration)

The Web3Privacy Explorer team is already looking into using LLM agents to scrape and report data for new projects.

Agents carry risk, and @BenW rightly notes the hallucination risk. I would like to abstract away any extra work we can and use agents where economically feasible, while reducing the risk of hallucinated or wrong information being populated.

Research Tools & Technology Stack

The tools outlined here are extremely helpful for us; we can configure and optimize the tools we use to increase the depth of the information we want to collect. We should also consider the resources needed to use tools like these optimally.

What’s Next
3. Regular updates as projects evolve

This is a tool that would be very useful for projects on the explorer, both for informing users of newly pushed project updates and for alerting project team members to edits/updates of explorer information. It could also include contacting projects when issues are discovered using the explorer research tools.

Why Code Review & OPSEC Matter for Privacy Projects

This explanation is very helpful in justifying why it’s important for this data to be collected and represented on the explorer.

@BenW, your resource listed here was great to review and gave me a better sense of what you were describing in your post.

Thanks, and I’m looking forward to getting this worked into the explorer, even if it’s incremental.

TL;DR

I personally would like to apply this code review, OPSEC, and on-chain analysis to the explorer. We may be able to include a more detailed breakdown in specific research reports for projects or teams. This was a sample of 40 projects from the explorer, and we can work to apply it to the rest of the explorer.

Generally I think this would be a great addition to the explorer for users. More information displayed about projects and their teams will give more data points. We would need to figure out where to include this data.

What’s needed for added OSINT results?

  • Token cost
  • Still needs a person in the loop to review
  • The simpler the OSINT test, the fewer resources the explorer uses
  • Cost of running infrastructure
  • Need to add form fields and have the code reviewed
  • A script that could run monthly, or whenever a new PR is pushed to a project’s main repo

Pros

  • More data for projects
  • Automated data collection

Cons

  • Need to run extra infrastructure
  • Cost/resource expense
  • Not fully automated