Founding AI Engineer

About Us:

We are building the world's first AI-powered proof-of-work blockchain, combining the security of Bitcoin with the intelligence of large language models. Our SVM-compatible Layer 1 enables verified inference at scale on 600B+ parameter models through our Proof of Logits (PoL) consensus mechanism.

Backed by a16z Crypto, Delphi Digital, and Amber Group with $7.2M in seed funding, we're on a mission to decentralize AI and create a viable alternative to centralized AI giants like OpenAI. Our network delivers verified AI inference at 100x the speed of legacy systems with just 0.1% overhead.


The Role:

We're seeking an exceptional Founding AI Engineer with deep CUDA expertise to architect and optimize the GPU compute infrastructure that powers our decentralized AI network. You'll be at the forefront of building the computational backbone that enables miners worldwide to participate in AI inference, fine-tuning, and training while maintaining cryptographic verification and blockchain consensus.


Key Responsibilities:

Core Infrastructure Development

  • Design and implement CUDA kernels for large-scale transformer inference on 600B+ parameter models (a minimal example of this kind of kernel follows this list)
  • Optimize memory management for distributed GPU workloads across heterogeneous hardware (consumer GPUs to enterprise cards)
  • Build parallel execution frameworks that enable verified inference with minimal computational overhead
  • Develop custom CUDA libraries for cryptographic verification of AI computations (Proof of Logits)
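
To give a flavor of the kernel work in the first bullet above, here is a minimal, illustrative sketch of a row-wise softmax of the kind used inside attention layers. It is a teaching example with our own naming, not code from our stack; production inference kernels run in FP16/BF16 and fuse this step into the surrounding matrix multiplies.

    // Illustrative sketch only: naive row-wise softmax (one block per row,
    // shared-memory reductions for the max and the sum).
    #include <cuda_runtime.h>
    #include <math.h>

    __global__ void softmax_rows(const float* __restrict__ in,
                                 float* __restrict__ out, int cols) {
        extern __shared__ float shm[];
        const float* row_in  = in  + (size_t)blockIdx.x * cols;
        float*       row_out = out + (size_t)blockIdx.x * cols;

        // 1) Block-wide max for numerical stability.
        float m = -INFINITY;
        for (int c = threadIdx.x; c < cols; c += blockDim.x)
            m = fmaxf(m, row_in[c]);
        shm[threadIdx.x] = m;
        __syncthreads();
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (threadIdx.x < s)
                shm[threadIdx.x] = fmaxf(shm[threadIdx.x], shm[threadIdx.x + s]);
            __syncthreads();
        }
        float row_max = shm[0];
        __syncthreads();

        // 2) Block-wide sum of exp(x - max), stashing the exponentials.
        float sum = 0.f;
        for (int c = threadIdx.x; c < cols; c += blockDim.x) {
            float e = expf(row_in[c] - row_max);
            row_out[c] = e;
            sum += e;
        }
        shm[threadIdx.x] = sum;
        __syncthreads();
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (threadIdx.x < s) shm[threadIdx.x] += shm[threadIdx.x + s];
            __syncthreads();
        }
        float row_sum = shm[0];

        // 3) Normalize.
        for (int c = threadIdx.x; c < cols; c += blockDim.x)
            row_out[c] /= row_sum;
    }

    // Launch: one block per row, power-of-two block size, dynamic shared memory.
    //   softmax_rows<<<rows, 256, 256 * sizeof(float)>>>(d_in, d_out, cols);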

Performance Optimization

  • Achieve 100x performance improvements over existing AI inference systems through low-level GPU optimization
  • Minimize verification overhead to sub-0.1% while maintaining cryptographic security guarantees
  • Optimize for diverse hardware, from RTX 4090s to A100s, to ensure broad miner participation (see the runtime-tuning sketch after this list)
  • Implement efficient batching strategies for handling multiple inference requests across network nodes
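
On the diverse-hardware point, one small but representative habit is to query the device and let the runtime suggest launch parameters instead of hard-coding them. A minimal sketch, where the kernel is just a stand-in:

    // Illustrative sketch: pick a launch configuration at runtime so the same
    // binary behaves sensibly on an RTX 4090 and an A100.
    #include <cuda_runtime.h>
    #include <stdio.h>

    __global__ void scale_kernel(float* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= 2.0f;
    }

    int main() {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        printf("Device: %s, SMs: %d, shared mem/block: %zu bytes\n",
               prop.name, prop.multiProcessorCount, prop.sharedMemPerBlock);

        // Let the runtime suggest a block size that maximizes occupancy
        // for this kernel on this particular GPU.
        int min_grid = 0, block = 0;
        cudaOccupancyMaxPotentialBlockSize(&min_grid, &block, scale_kernel, 0, 0);

        int n = 1 << 20;
        float* d;
        cudaMalloc(&d, n * sizeof(float));
        int grid = (n + block - 1) / block;
        scale_kernel<<<grid, block>>>(d, n);
        cudaDeviceSynchronize();
        cudaFree(d);
        return 0;
    }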

Blockchain Integration

  • Integrate CUDA workloads with Solana Virtual Machine (SVM) execution environment
  • Design consensus-critical GPU computations that serve dual purpose as useful work and network security
  • Build verification systems that can cryptographically prove AI computation correctness (an illustrative sketch follows this list)
  • Optimize gas metering for GPU-intensive operations on our blockchain
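
The verification bullet is the heart of Proof of Logits. The actual construction is not spelled out in this posting, so purely to illustrate the shape of the problem, the sketch below reduces a logits tensor to a compact, order-independent fingerprint that a verifier could recompute; the names and the hashing scheme are ours for illustration. A real scheme must also pin down floating-point numerics across heterogeneous GPUs, which is much of the difficulty.

    // Illustrative only: NOT the actual Proof of Logits construction.
    // Each element's (position, bit pattern) is mixed into a 64-bit value;
    // summing the mixed values keeps the digest independent of thread order.
    #include <cuda_runtime.h>
    #include <stdint.h>

    __device__ uint64_t mix64(uint64_t x) {
        // splitmix64 finalizer: cheap, well-distributed 64-bit mixing.
        x += 0x9E3779B97F4A7C15ULL;
        x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
        x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
        return x ^ (x >> 31);
    }

    __global__ void logits_fingerprint(const float* __restrict__ logits, size_t n,
                                       unsigned long long* digest) {
        size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        uint64_t h = mix64(((uint64_t)i << 32) ^ (uint64_t)__float_as_uint(logits[i]));
        atomicAdd(digest, (unsigned long long)h);   // commutative, deterministic sum
    }

    // A verifier re-runs the same inference step and recomputes the digest
    // (d_digest is assumed zero-initialized, e.g. via cudaMemset):
    //   logits_fingerprint<<<(n + 255) / 256, 256>>>(d_logits, n, d_digest);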

Research & Innovation

  • Pioneer new approaches to distributed AI training verification using CUDA primitives
  • Explore sparsity optimizations for large model training across unreliable network connections
  • Research novel memory-management techniques for running massive models on consumer hardware
  • Contribute to open-source CUDA libraries for the broader AI and blockchain communities


Required Qualifications:

Technical Expertise

  • 5+ years of CUDA development experience with proven track record of shipping production systems
  • Deep understanding of GPU architectures (Ampere, Ada Lovelace, Hopper) and their optimization characteristics
  • Expert-level C++/CUDA programming with experience in memory optimization and kernel development
  • Strong background in machine learning inference optimization, particularly for transformer architectures
  • Experience with distributed computing and handling workloads across multiple GPUs/nodes

AI/ML Experience

  • Hands-on experience optimizing large language models (10B+ parameters) for inference
  • Understanding of modern transformer architectures (attention mechanisms, feedforward networks, layer normalization)
  • Familiarity with quantization and reduced-precision techniques for model compression (INT8, FP16, mixed precision); a toy sketch follows this list
  • Experience with popular ML frameworks (PyTorch, TensorFlow) and their CUDA backends
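
For concreteness on the quantization bullet above: a toy symmetric per-tensor INT8 quantize/dequantize pair is sketched below. Real deployments use per-channel or per-group scales, calibration data, and fuse dequantization into the matmul kernels; this only shows the arithmetic.

    // Toy symmetric per-tensor INT8 quantization, for illustration only.
    #include <cuda_runtime.h>
    #include <stdint.h>
    #include <math.h>

    __global__ void quantize_int8(const float* __restrict__ x, int8_t* __restrict__ q,
                                  float scale, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float v = rintf(x[i] / scale);           // round to nearest integer
            v = fminf(fmaxf(v, -127.f), 127.f);      // clamp to the int8 range
            q[i] = (int8_t)v;
        }
    }

    __global__ void dequantize_int8(const int8_t* __restrict__ q, float* __restrict__ x,
                                    float scale, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] = (float)q[i] * scale;       // recover the approximate value
    }

    // scale is typically max(|x|) / 127 over the tensor (or per channel).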

Systems Programming

  • Low-level systems programming experience in resource-constrained environments
  • Understanding of cryptographic primitives and their efficient GPU implementation
  • Experience with peer-to-peer networking and distributed systems challenges
  • Familiarity with blockchain concepts and consensus mechanisms (preferred but not required)


Preferred Qualifications:

Specialized Experience

  • Previous work on cryptocurrency mining or blockchain infrastructure
  • Experience with Solana development or other high-performance blockchains
  • Background in zero-knowledge proofs or cryptographic verification systems
  • Contributions to open-source CUDA projects or AI optimization libraries

Advanced Technical Skills

  • Custom CUDA library development for specialized computational workloads
  • Experience with NVIDIA's enterprise stack (TensorRT, Triton, NCCL)
  • Knowledge of advanced GPU debugging and profiling tools (Nsight Systems, Nsight Compute; nvprof on older architectures)
  • Understanding of hardware-software co-design for AI acceleration

Research Background

  • Publications in GPU computing or high-performance AI systems
  • PhD in Computer Science, Electrical Engineering, or related field focusing on parallel computing
  • Experience in competitive programming or algorithmic optimization contests


What We Offer:

Compensation & Equity

  • Competitive base salary ($180K - $280K based on experience)
  • Significant equity package in a fast-growing crypto-AI startup
  • Token allocation in our upcoming network launch
  • Performance bonuses tied to network performance metrics

Technical Environment

  • Access to cutting-edge hardware including latest NVIDIA GPUs for development and testing
  • Work directly with co-founders Travis Good (PhD, former AI researcher) and Max Lang (serial founder, ex-Microsoft/Amazon)
  • Collaborate with top-tier engineers from crypto and AI backgrounds
  • Contribute to open-source projects that will shape the future of decentralized AI

Growth & Impact

  • Ground-floor opportunity to build the infrastructure for decentralized AI
  • Direct impact on network economics affecting thousands of miners worldwide
  • Opportunity to pioneer new fields at the intersection of AI and blockchain
  • Conference speaking opportunities and thought leadership in the space


Technical Challenges You'll Solve:

  1. Distributed Inference Optimization: How do you achieve consistent performance across thousands of different GPU configurations while maintaining verification guarantees?
  2. Memory-Constrained Large Models: How do you enable 600B+ parameter models to run efficiently on consumer hardware through innovative sharding and caching strategies? (One standard building block is sketched after this list.)
  3. Cryptographic Verification at Scale: How do you design CUDA kernels that can prove computational correctness without significant overhead?
  4. Economic Security Through Computation: How do you ensure that useful AI work provides the same security guarantees as traditional proof-of-work mining?
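
For challenge 2, one standard building block (illustrative only, not our actual design) is to stream per-layer weights from pinned host memory on a dedicated copy stream while the previous layer computes, ping-ponging between two device buffers:

    // Illustrative: double-buffered layer streaming so weight copies overlap
    // compute. Host buffers must be pinned (cudaMallocHost) for the copies
    // to be truly asynchronous.
    #include <cuda_runtime.h>

    __global__ void run_layer(const float* weights, float* acts, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) acts[i] += weights[i];   // stand-in for the real layer math
    }

    void stream_layers(float** h_weights, int num_layers, int n, float* d_acts) {
        float* d_buf[2];
        cudaMalloc(&d_buf[0], n * sizeof(float));
        cudaMalloc(&d_buf[1], n * sizeof(float));

        cudaStream_t copy_s, compute_s;
        cudaStreamCreate(&copy_s);
        cudaStreamCreate(&compute_s);
        cudaEvent_t copied[2], done[2];
        for (int b = 0; b < 2; ++b) { cudaEventCreate(&copied[b]); cudaEventCreate(&done[b]); }

        // Prefetch layer 0 into buffer 0.
        cudaMemcpyAsync(d_buf[0], h_weights[0], n * sizeof(float),
                        cudaMemcpyHostToDevice, copy_s);
        cudaEventRecord(copied[0], copy_s);

        for (int l = 0; l < num_layers; ++l) {
            int cur = l & 1;
            // Compute on this buffer once its copy has landed.
            cudaStreamWaitEvent(compute_s, copied[cur], 0);
            run_layer<<<(n + 255) / 256, 256, 0, compute_s>>>(d_buf[cur], d_acts, n);
            cudaEventRecord(done[cur], compute_s);

            // Meanwhile, copy the next layer into the other buffer, but only
            // after the compute that last used that buffer has finished.
            if (l + 1 < num_layers) {
                int nxt = (l + 1) & 1;
                cudaStreamWaitEvent(copy_s, done[nxt], 0);
                cudaMemcpyAsync(d_buf[nxt], h_weights[l + 1], n * sizeof(float),
                                cudaMemcpyHostToDevice, copy_s);
                cudaEventRecord(copied[nxt], copy_s);
            }
        }
        cudaStreamSynchronize(compute_s);
        cudaFree(d_buf[0]);  cudaFree(d_buf[1]);
        cudaStreamDestroy(copy_s);  cudaStreamDestroy(compute_s);
    }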

