CUDA, BGV, BFV, CKKS, TFHE/FHEW, Scheme Switching

YUENIX-FHE

CUDA-Accelerated Fully Homomorphic Encryption

Compute on encrypted data at GPU speed. YUENIX-FHE supports BGV, BFV, CKKS, and TFHE/FHEW schemes — including CKKS-to-TFHE/FHEW scheme switching — with a clean three-layer architecture and CUDA-native optimizations built for NVIDIA GPUs.

Three-Layer Architecture

Scheme Layer

High-level FHE operations
BGV, BFV, CKKS, TFHE/FHEW, Scheme Switching

RNS Arithmetic Layer

Residue Number System computations
NTT Operations, RNS Decomposition

Core Arithmetic Layer

Foundational arithmetic operations
Modular Arithmetic, Multi-Precision Arithmetic

Powered by NVIDIA CUDA
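
To make the lower two layers concrete, here is a minimal plain-Python sketch of RNS decomposition built on ordinary modular arithmetic. It is an illustration of the general technique, not YUENIX-FHE code: the moduli and the helper names (to_rns, rns_mul, from_rns) are assumptions chosen for readability, and real parameter sets use far larger primes.

```python
from math import prod

# Hypothetical small, pairwise-coprime moduli (illustration only).
# Real FHE parameter sets use many NTT-friendly primes of about 30 to 60 bits.
MODULI = [97, 193, 257]
Q = prod(MODULI)  # the large composite modulus that the RNS basis represents


def to_rns(x):
    """Split an integer modulo Q into one small residue per modulus."""
    return [x % q for q in MODULI]


def rns_mul(a, b):
    """Multiply residue-wise; every channel is independent of the others."""
    return [(ai * bi) % q for ai, bi, q in zip(a, b, MODULI)]


def from_rns(residues):
    """Recombine the residues with the Chinese Remainder Theorem."""
    total = 0
    for r, q in zip(residues, MODULI):
        q_hat = Q // q                 # product of all the other moduli
        q_hat_inv = pow(q_hat, -1, q)  # modular inverse (Python 3.8+)
        total += r * q_hat * q_hat_inv
    return total % Q


x, y = 123456, 7890
product = from_rns(rns_mul(to_rns(x), to_rns(y)))
print(product == (x * y) % Q)  # True
```

Because each residue channel is computed independently, the channels map naturally onto parallel GPU threads, which is the usual motivation for an RNS layer.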

Supported Schemes

BGV

Exact integer arithmetic with noise management — ideal for counting, voting, and database queries on encrypted data.

BFV

Modular integer arithmetic without rescaling — suited for simple encrypted computations with predictable noise budgets.

CKKS

Approximate arithmetic on encrypted real numbers — designed for machine learning inference and statistical analysis.

TFHE/FHEW

Bit-level Boolean circuit evaluation with fast bootstrapping — enables arbitrary function evaluation on encrypted bits.
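
The differences between the scheme families are easiest to see on the plaintext side. The sketch below models only the arithmetic each family exposes and performs no encryption at all; the plaintext modulus T, the scaling factor SCALE, and all function names are hypothetical values picked for illustration, not YUENIX-FHE parameters or API.

```python
# Plaintext-side model only: nothing here is encrypted.

T = 65537  # hypothetical plaintext modulus for the BGV/BFV-style example


def exact_mul(a, b):
    """BGV/BFV semantics: exact integer arithmetic that wraps modulo T."""
    return (a * b) % T


SCALE = 2 ** 20  # hypothetical CKKS-style scaling factor


def encode(x):
    """Represent a real number as a scaled, rounded integer."""
    return round(x * SCALE)


def decode(v):
    """Map a scaled integer back to a real number."""
    return v / SCALE


def approx_mul(a, b):
    """Multiply two scaled values, then rescale from SCALE**2 back to SCALE."""
    return (a * b + SCALE // 2) // SCALE


def nand(a, b):
    """TFHE/FHEW semantics: a gate on single bits. NAND is universal."""
    return 1 - (a & b)


def xor(a, b):
    """XOR built from four NAND gates, as a Boolean circuit would be."""
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


print(exact_mul(300, 400))                              # 54463, exact mod T
print(decode(approx_mul(encode(3.14159), encode(2.71828))))  # close to 8.5397
print([xor(a, b) for a in (0, 1) for b in (0, 1)])      # [0, 1, 1, 0]
```

The rescale step in approx_mul is what BFV avoids and CKKS requires: without it, the scale would square with every multiplication.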

GPU Optimizations

Multi-GPU Scaling

Distribute FHE workloads across multiple GPUs with automatic load balancing, so throughput scales close to linearly as you add devices.

Dynamic Resource Allocation

Scalable resource management that adapts GPU memory and compute allocation in real time based on workload demands.

Kernel Fusion

Intra-kernel and inter-kernel fusion reduces launch overhead and overlaps computation with memory transfers for maximum throughput.

Enhanced Hybrid Key-Switching

Optimized key-switching procedure that reduces memory bandwidth and computation by leveraging GPU shared memory.

GPU Memory Pool

Custom allocator that pre-allocates and recycles GPU memory blocks, eliminating allocation latency during FHE operations.

RNS & NTT Optimization

Residue Number System arithmetic and asynchronous Number Theoretic Transforms tuned for GPU warp-level primitives.
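
For readers unfamiliar with the NTT, the following plain-Python sketch shows what the transform is used for: multiplying polynomials modulo x^N + 1, the core operation in lattice-based FHE. It is a CPU reference under toy assumptions (N = 8, modulus 17, root 3), not the library's GPU kernel, and the function names are illustrative.

```python
# Toy parameters (assumptions for illustration): N = 8, Q = 17, and
# PSI = 3, which is a primitive 2N-th root of unity modulo 17.
N, Q, PSI = 8, 17, 3


def ntt(a, omega, q):
    """Recursive radix-2 number theoretic transform of a power-of-two list."""
    n = len(a)
    if n == 1:
        return a[:]
    omega_sq = omega * omega % q
    even = ntt(a[0::2], omega_sq, q)
    odd = ntt(a[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q
        out[k] = (even[k] + t) % q           # butterfly, upper half
        out[k + n // 2] = (even[k] - t) % q  # butterfly, lower half
        w = w * omega % q
    return out


def negacyclic_mul(a, b, psi, q):
    """Multiply a and b modulo (x^n + 1, q) using the NTT."""
    n = len(a)
    omega = psi * psi % q
    psi_inv = pow(psi, -1, q)
    # Pre-twist by powers of psi so a cyclic convolution becomes negacyclic.
    a_t = [x * pow(psi, i, q) % q for i, x in enumerate(a)]
    b_t = [x * pow(psi, i, q) % q for i, x in enumerate(b)]
    fc = [x * y % q for x, y in zip(ntt(a_t, omega, q), ntt(b_t, omega, q))]
    # Inverse transform: forward transform with omega^-1, then scale by n^-1.
    c_t = ntt(fc, pow(omega, -1, q), q)
    n_inv = pow(n, -1, q)
    return [x * n_inv * pow(psi_inv, i, q) % q for i, x in enumerate(c_t)]


def schoolbook(a, b, q):
    """Quadratic-time reference: x^n wraps around to -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] = (c[i + j] + a[i] * b[j]) % q
            else:
                c[i + j - n] = (c[i + j - n] - a[i] * b[j]) % q
    return c


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [8, 7, 6, 5, 4, 3, 2, 1]
print(negacyclic_mul(a, b, PSI, Q) == schoolbook(a, b, Q))  # True
```

The butterflies inside each recursion level are independent of one another, which is why the transform parallelizes well across GPU threads.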

Applications & Use Cases

Cryptographic Primitives

Privacy-Preserving ML

Train and run inference on encrypted datasets — models never see raw data, enabling collaborative ML without data exposure.

Private Information Retrieval

Query databases without revealing what you searched for. The server processes the query on encrypted indices and returns encrypted results.

Private Set Intersection

Two parties discover shared records without revealing anything else — critical for fraud detection, contact tracing, and ad measurement.
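
The Private Information Retrieval flow described above can be sketched as a selection-vector inner product. In this toy model plain integers stand in for ciphertexts, so nothing is actually hidden; with FHE, the client would encrypt the query and the server would run the same multiply-and-add on ciphertexts. The function names and data are hypothetical.

```python
def build_query(index, size):
    """Client side: a one-hot selection vector (encrypted in a real protocol)."""
    return [1 if i == index else 0 for i in range(size)]


def answer(database, query):
    """Server side: inner product of the database with the selection vector.

    Under FHE the server evaluates this without learning which slot holds 1.
    """
    return sum(record * selector for record, selector in zip(database, query))


database = [17, 42, 99, 7]
query = build_query(2, len(database))
print(answer(database, query))  # 99
```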

Industry Scenarios

Healthcare

Analyze patient records, genomic data, and clinical trials across institutions without exposing sensitive health information.

Finance

Run risk models, compliance checks, and cross-institutional analytics on encrypted financial data — meeting regulatory requirements by design.

Government

Enable secure inter-agency data sharing, encrypted census processing, and privacy-preserving national statistics without centralized data access.

System Requirements

Operating System: Linux (glibc 2.35+)
GPU: NVIDIA GPU with compute capability 8.0+
CUDA Toolkit: 12.0 or later
CMake: 3.20 or later
Python: 3.12 or later (for Python bindings)
C++ Compiler: GCC 11+ or Clang 14+

Start Computing Encrypted

Interested in YUENIX-FHE for your project? Get in touch to learn more about integration and licensing.
