Founding AI Engineer

About Us:

We are building the world's first AI-powered proof-of-work blockchain, combining the security of Bitcoin with the intelligence of large language models. Our SVM-compatible Layer 1 enables verified inference at scale on 600B+ parameter models through our Proof of Logits (PoL) consensus mechanism.

Backed by a16z Crypto, Delphi Digital, and Amber Group with $7.2M in seed funding, we're on a mission to decentralize AI and create a viable alternative to centralized AI giants like OpenAI. Our network delivers verified AI inference at 100x the speed of legacy systems with just 0.1% overhead.


The Role:

We're seeking an exceptional Founding AI Engineer with deep CUDA expertise to architect and optimize the GPU compute infrastructure that powers our decentralized AI network. You'll be at the forefront of building the computational backbone that enables miners worldwide to participate in AI inference, fine-tuning, and training while maintaining cryptographic verification and blockchain consensus.


Key Responsibilities:

Core Infrastructure Development

  • Design and implement CUDA kernels for large-scale transformer inference on 600B+ parameter models (a minimal example of this kind of kernel follows this list)
  • Optimize memory management for distributed GPU workloads across heterogeneous hardware (consumer GPUs to enterprise cards)
  • Build parallel execution frameworks that enable verified inference with minimal computational overhead
  • Develop custom CUDA libraries for cryptographic verification of AI computations (Proof of Logits)
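
To give a flavor of the kernel work in the first bullet above, here is a minimal, illustrative sketch of a row-wise softmax of the kind used inside attention layers. It is a teaching example with our own naming, not code from our stack; production inference kernels run in FP16/BF16 and fuse this step into the surrounding matrix multiplies.

    // Illustrative sketch only: naive row-wise softmax (one block per row,
    // shared-memory reductions for the max and the sum).
    #include <cuda_runtime.h>
    #include <math.h>

    __global__ void softmax_rows(const float* __restrict__ in,
                                 float* __restrict__ out, int cols) {
        extern __shared__ float shm[];
        const float* row_in  = in  + (size_t)blockIdx.x * cols;
        float*       row_out = out + (size_t)blockIdx.x * cols;

        // 1) Block-wide max for numerical stability.
        float m = -INFINITY;
        for (int c = threadIdx.x; c < cols; c += blockDim.x)
            m = fmaxf(m, row_in[c]);
        shm[threadIdx.x] = m;
        __syncthreads();
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (threadIdx.x < s)
                shm[threadIdx.x] = fmaxf(shm[threadIdx.x], shm[threadIdx.x + s]);
            __syncthreads();
        }
        float row_max = shm[0];
        __syncthreads();

        // 2) Block-wide sum of exp(x - max), stashing the exponentials.
        float sum = 0.f;
        for (int c = threadIdx.x; c < cols; c += blockDim.x) {
            float e = expf(row_in[c] - row_max);
            row_out[c] = e;
            sum += e;
        }
        shm[threadIdx.x] = sum;
        __syncthreads();
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (threadIdx.x < s) shm[threadIdx.x] += shm[threadIdx.x + s];
            __syncthreads();
        }
        float row_sum = shm[0];

        // 3) Normalize.
        for (int c = threadIdx.x; c < cols; c += blockDim.x)
            row_out[c] /= row_sum;
    }

    // Launch: one block per row, power-of-two block size, dynamic shared memory.
    //   softmax_rows<<<rows, 256, 256 * sizeof(float)>>>(d_in, d_out, cols);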

Performance Optimization

  • Achieve 100x performance improvements over existing AI inference systems through low-level GPU optimization
  • Minimize verification overhead to sub-0.1% while maintaining cryptographic security guarantees
  • Optimize for diverse hardware, from RTX 4090s to A100s, to ensure broad miner participation (see the runtime-tuning sketch after this list)
  • Implement efficient batching strategies for handling multiple inference requests across network nodes
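
On the diverse-hardware point, one small but representative habit is to query the device and let the runtime suggest launch parameters instead of hard-coding them. A minimal sketch, where the kernel is just a stand-in:

    // Illustrative sketch: pick a launch configuration at runtime so the same
    // binary behaves sensibly on an RTX 4090 and an A100.
    #include <cuda_runtime.h>
    #include <stdio.h>

    __global__ void scale_kernel(float* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= 2.0f;
    }

    int main() {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        printf("Device: %s, SMs: %d, shared mem/block: %zu bytes\n",
               prop.name, prop.multiProcessorCount, prop.sharedMemPerBlock);

        // Let the runtime suggest a block size that maximizes occupancy
        // for this kernel on this particular GPU.
        int min_grid = 0, block = 0;
        cudaOccupancyMaxPotentialBlockSize(&min_grid, &block, scale_kernel, 0, 0);

        int n = 1 << 20;
        float* d;
        cudaMalloc(&d, n * sizeof(float));
        int grid = (n + block - 1) / block;
        scale_kernel<<<grid, block>>>(d, n);
        cudaDeviceSynchronize();
        cudaFree(d);
        return 0;
    }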

Blockchain Integration

  • Integrate CUDA workloads with Solana Virtual Machine (SVM) execution environment
  • Design consensus-critical GPU computations that serve dual purpose as useful work and network security
  • Build verification systems that can cryptographically prove AI computation correctness (an illustrative sketch follows this list)
  • Optimize gas metering for GPU-intensive operations on our blockchain
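
The verification bullet is the heart of Proof of Logits. The actual construction is not spelled out in this posting, so purely to illustrate the shape of the problem, the sketch below reduces a logits tensor to a compact, order-independent fingerprint that a verifier could recompute; the names and the hashing scheme are ours for illustration. A real scheme must also pin down floating-point numerics across heterogeneous GPUs, which is much of the difficulty.

    // Illustrative only: NOT the actual Proof of Logits construction.
    // Each element's (position, bit pattern) is mixed into a 64-bit value;
    // summing the mixed values keeps the digest independent of thread order.
    #include <cuda_runtime.h>
    #include <stdint.h>

    __device__ uint64_t mix64(uint64_t x) {
        // splitmix64 finalizer: cheap, well-distributed 64-bit mixing.
        x += 0x9E3779B97F4A7C15ULL;
        x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
        x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
        return x ^ (x >> 31);
    }

    __global__ void logits_fingerprint(const float* __restrict__ logits, size_t n,
                                       unsigned long long* digest) {
        size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        uint64_t h = mix64(((uint64_t)i << 32) ^ (uint64_t)__float_as_uint(logits[i]));
        atomicAdd(digest, (unsigned long long)h);   // commutative, deterministic sum
    }

    // A verifier re-runs the same inference step and recomputes the digest
    // (d_digest is assumed zero-initialized, e.g. via cudaMemset):
    //   logits_fingerprint<<<(n + 255) / 256, 256>>>(d_logits, n, d_digest);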

Research & Innovation

  • Pioneer new approaches to distributed AI training verification using CUDA primitives
  • Explore sparsity optimizations for large model training across unreliable network connections
  • Research novel memory-management techniques for running massive models on consumer hardware
  • Contribute to open-source CUDA libraries for the broader AI and blockchain communities


Required Qualifications:

Technical Expertise

  • 5+ years of CUDA development experience with proven track record of shipping production systems
  • Deep understanding of GPU architectures (Ampere, Ada Lovelace, Hopper) and their optimization characteristics
  • Expert-level C++/CUDA programming with experience in memory optimization and kernel development
  • Strong background in machine learning inference optimization, particularly for transformer architectures
  • Experience with distributed computing and handling workloads across multiple GPUs/nodes

AI/ML Experience

  • Hands-on experience optimizing large language models (10B+ parameters) for inference
  • Understanding of modern transformer architectures (attention mechanisms, feedforward networks, layer normalization)
  • Familiarity with quantization and reduced-precision techniques for model compression (INT8, FP16, mixed precision); a toy sketch follows this list
  • Experience with popular ML frameworks (PyTorch, TensorFlow) and their CUDA backends
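
For concreteness on the quantization bullet above: a toy symmetric per-tensor INT8 quantize/dequantize pair is sketched below. Real deployments use per-channel or per-group scales, calibration data, and fuse dequantization into the matmul kernels; this only shows the arithmetic.

    // Toy symmetric per-tensor INT8 quantization, for illustration only.
    #include <cuda_runtime.h>
    #include <stdint.h>
    #include <math.h>

    __global__ void quantize_int8(const float* __restrict__ x, int8_t* __restrict__ q,
                                  float scale, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float v = rintf(x[i] / scale);           // round to nearest integer
            v = fminf(fmaxf(v, -127.f), 127.f);      // clamp to the int8 range
            q[i] = (int8_t)v;
        }
    }

    __global__ void dequantize_int8(const int8_t* __restrict__ q, float* __restrict__ x,
                                    float scale, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] = (float)q[i] * scale;       // recover the approximate value
    }

    // scale is typically max(|x|) / 127 over the tensor (or per channel).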

Systems Programming

  • Low-level systems programming experience in resource-constrained environments
  • Understanding of cryptographic primitives and their efficient GPU implementation
  • Experience with peer-to-peer networking and distributed systems challenges
  • Familiarity with blockchain concepts and consensus mechanisms (preferred but not required)


Preferred Qualifications:

Specialized Experience

  • Previous work on cryptocurrency mining or blockchain infrastructure
  • Experience with Solana development or other high-performance blockchains
  • Background in zero-knowledge proofs or cryptographic verification systems
  • Contributions to open-source CUDA projects or AI optimization libraries

Advanced Technical Skills

  • Custom CUDA library development for specialized computational workloads
  • Experience with NVIDIA's enterprise stack (TensorRT, Triton, NCCL)
  • Knowledge of advanced GPU debugging and profiling tools (Nsight Systems, Nsight Compute; nvprof on older architectures)
  • Understanding of hardware-software co-design for AI acceleration

Research Background

  • Publications in GPU computing or high-performance AI systems
  • PhD in Computer Science, Electrical Engineering, or related field focusing on parallel computing
  • Experience in competitive programming or algorithmic optimization contests


What We Offer:

Compensation & Equity

  • Competitive base salary ($180K - $280K based on experience)
  • Significant equity package in a fast-growing crypto-AI startup
  • Token allocation in our upcoming network launch
  • Performance bonuses tied to network performance metrics

Technical Environment

  • Access to cutting-edge hardware including latest NVIDIA GPUs for development and testing
  • Work directly with co-founders Travis Good (PhD, former AI researcher) and Max Lang (serial founder, ex-Microsoft/Amazon)
  • Collaborate with top-tier engineers from crypto and AI backgrounds
  • Contribute to open-source projects that will shape the future of decentralized AI

Growth & Impact

  • Ground-floor opportunity to build the infrastructure for decentralized AI
  • Direct impact on network economics affecting thousands of miners worldwide
  • Opportunity to pioneer new fields at the intersection of AI and blockchain
  • Conference speaking opportunities and thought leadership in the space


Technical Challenges You'll Solve:

  1. Distributed Inference Optimization: How do you achieve consistent performance across thousands of different GPU configurations while maintaining verification guarantees?
  2. Memory-Constrained Large Models: How do you enable 600B+ parameter models to run efficiently on consumer hardware through innovative sharding and caching strategies? (One standard building block is sketched after this list.)
  3. Cryptographic Verification at Scale: How do you design CUDA kernels that can prove computational correctness without significant overhead?
  4. Economic Security Through Computation: How do you ensure that useful AI work provides the same security guarantees as traditional proof-of-work mining?
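
For challenge 2, one standard building block (illustrative only, not our actual design) is to stream per-layer weights from pinned host memory on a dedicated copy stream while the previous layer computes, ping-ponging between two device buffers:

    // Illustrative: double-buffered layer streaming so weight copies overlap
    // compute. Host buffers must be pinned (cudaMallocHost) for the copies
    // to be truly asynchronous.
    #include <cuda_runtime.h>

    __global__ void run_layer(const float* weights, float* acts, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) acts[i] += weights[i];   // stand-in for the real layer math
    }

    void stream_layers(float** h_weights, int num_layers, int n, float* d_acts) {
        float* d_buf[2];
        cudaMalloc(&d_buf[0], n * sizeof(float));
        cudaMalloc(&d_buf[1], n * sizeof(float));

        cudaStream_t copy_s, compute_s;
        cudaStreamCreate(&copy_s);
        cudaStreamCreate(&compute_s);
        cudaEvent_t copied[2], done[2];
        for (int b = 0; b < 2; ++b) { cudaEventCreate(&copied[b]); cudaEventCreate(&done[b]); }

        // Prefetch layer 0 into buffer 0.
        cudaMemcpyAsync(d_buf[0], h_weights[0], n * sizeof(float),
                        cudaMemcpyHostToDevice, copy_s);
        cudaEventRecord(copied[0], copy_s);

        for (int l = 0; l < num_layers; ++l) {
            int cur = l & 1;
            // Compute on this buffer once its copy has landed.
            cudaStreamWaitEvent(compute_s, copied[cur], 0);
            run_layer<<<(n + 255) / 256, 256, 0, compute_s>>>(d_buf[cur], d_acts, n);
            cudaEventRecord(done[cur], compute_s);

            // Meanwhile, copy the next layer into the other buffer, but only
            // after the compute that last used that buffer has finished.
            if (l + 1 < num_layers) {
                int nxt = (l + 1) & 1;
                cudaStreamWaitEvent(copy_s, done[nxt], 0);
                cudaMemcpyAsync(d_buf[nxt], h_weights[l + 1], n * sizeof(float),
                                cudaMemcpyHostToDevice, copy_s);
                cudaEventRecord(copied[nxt], copy_s);
            }
        }
        cudaStreamSynchronize(compute_s);
        cudaFree(d_buf[0]);  cudaFree(d_buf[1]);
        cudaStreamDestroy(copy_s);  cudaStreamDestroy(compute_s);
    }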

