Arkova is a jurisdiction-aware verification layer that enables organizations to issue, anchor, and verify credentials against the anchoring network. It transforms documents such as diplomas, certificates, licenses, attestations, and compliance records into tamper-evident digital credentials — without ever taking custody of the underlying documents.
Arkova is not a blockchain company. It is a verification infrastructure company that uses a public ledger as an immutable timestamping layer.
The Issuer uploads or creates a credential. The document is fingerprinted (SHA-256) entirely on the user's device. Only the fingerprint — never the document — leaves the browser.
Arkova anchors the fingerprint to the anchoring network via an OP_RETURN output containing a 36-byte payload (ARKV prefix + SHA-256 hash).
Any Verifier can query Arkova's API or public verification page to confirm the credential's authenticity, timestamp, issuer, and status.
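The fingerprint and payload steps above can be sketched as follows. This is a minimal illustration, not Arkova's implementation: the names `fingerprintDocument`, `buildAnchorPayload`, and `toHex` are hypothetical, and the `node:crypto` import is only there so the sketch runs under Node. In a browser, the same `subtle.digest` call is available on the global `crypto` object.

```typescript
import { webcrypto } from "node:crypto"; // assumption: Node runtime; browsers expose `crypto.subtle` globally

const PREFIX = new TextEncoder().encode("ARKV"); // 4-byte prefix named in the text

// Hash the raw document bytes locally. Only this 32-byte digest would leave the device.
async function fingerprintDocument(doc: Uint8Array): Promise<Uint8Array> {
  const digest = await webcrypto.subtle.digest("SHA-256", doc);
  return new Uint8Array(digest);
}

// Concatenate prefix + digest into the 36-byte payload described above.
function buildAnchorPayload(fingerprint: Uint8Array): Uint8Array {
  if (fingerprint.length !== 32) {
    throw new Error(`expected a 32-byte SHA-256 digest, got ${fingerprint.length} bytes`);
  }
  const payload = new Uint8Array(PREFIX.length + fingerprint.length);
  payload.set(PREFIX, 0);
  payload.set(fingerprint, PREFIX.length);
  return payload;
}

// Hex-encode bytes for display or transport.
function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}
```

A verifier holding the original file can recompute the fingerprint the same way and compare it with the anchored value; any change to the document produces a different digest.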
Non-Custodial Architecture
| Dimension | What This Means |
| --- | --- |
| Document Non-Custody | Documents never leave the user's device. Arkova never receives, stores, transmits, or processes raw document content. Only a one-way SHA-256 fingerprint is stored. |
| Financial Non-Custody | Arkova does not store, accept, or manage user cryptocurrency. All on-chain fees are paid from an Arkova-managed corporate fee account. |
| Key Non-Custody | Treasury signing keys are secured in cloud HSMs (AWS KMS / GCP Cloud HSM). No human has access to raw private key material. |
Important
This design eliminates regulated data custody risk. Arkova does not become a custodian of PII, financial assets, or cryptographic material.
Schema-First Build Philosophy
Schema First — Define Postgres tables, columns, constraints, and Row Level Security policies before writing any application code.
Migration Immutability — Once a migration is applied, it is never modified. Changes are expressed as compensating migrations.
Type Generation — TypeScript types are auto-generated from the database schema, ensuring compile-time safety across the full stack.
Validation at the Boundary — All write paths are validated with Zod schemas before reaching the database.
Every table in the Arkova database has FORCE ROW LEVEL SECURITY enabled. This is a non-negotiable architectural constraint.
Note
Even if application code has a bug, the database refuses to return rows the authenticated user is not authorized to see. FORCE ROW LEVEL SECURITY means RLS policies apply even to the table owner; in Postgres, only superusers and roles with the BYPASSRLS attribute are exempt.
| Table | Policy |
| --- | --- |
| anchors | Users see own anchors + org anchors (via org membership) |
| profiles | Users see own profile only |
| organizations | Members see their own org |
| audit_events | Users see own events only |
| api_keys | ORG_ADMIN only (not readable by ORG_MEMBER) |
| webhook_endpoints | ORG_ADMIN full CRUD for own org |
| billing_events | User reads own; append-only (triggers block UPDATE/DELETE) |
| attestations | Public read; write restricted to authenticated users |
Tenant Isolation
Multi-tenancy is enforced at the database level, not the application level. Every row carries an org_id foreign key. RLS policies use auth.uid() to resolve the caller's identity and compare the caller's org membership with the row's org_id, so a query issued by one tenant cannot return another tenant's rows.
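To make the anchors rule ("own anchors + org anchors") concrete, here is an in-memory TypeScript model of the visibility predicate. It only illustrates the logic: in Arkova the rule is enforced by Postgres RLS rather than application code, and the `Anchor` and `Caller` shapes and the function names are assumptions made for this sketch.

```typescript
interface Anchor {
  id: string;
  userId: string; // creator of the anchor
  orgId: string;  // tenant the row belongs to
}

interface Caller {
  userId: string;              // what auth.uid() would resolve to
  orgIds: ReadonlySet<string>; // orgs the caller is a member of
}

// A row is visible if the caller created it or belongs to the row's org.
function canReadAnchor(caller: Caller, anchor: Anchor): boolean {
  const ownsIt = anchor.userId === caller.userId;
  const sameOrg = caller.orgIds.has(anchor.orgId);
  return ownsIt || sameOrg;
}

// Models what a SELECT would return once the policy has filtered the rows.
function visibleAnchors(caller: Caller, rows: readonly Anchor[]): Anchor[] {
  return rows.filter((row) => canReadAnchor(caller, row));
}
```

Because the filter runs for every query, a caller with no membership in an org never sees that org's rows, regardless of what the application layer asks for.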
The Client-Side Processing Boundary
Important
Documents never leave the user's device. This is Arkova's foundational privacy guarantee.
Arkova is not a data processor under GDPR for document content
There is no "raw mode" bypass
The generateFingerprint() function is architecturally prohibited from being imported in server-side code
Client-side PII stripping uses regex-based removal of SSNs, student IDs, dates of birth, email addresses, phone numbers, and names
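A regex-based stripping pass of the kind listed above might look like the following sketch. The patterns are illustrative assumptions (US-style SSNs, phone numbers, and slash-separated dates), not Arkova's actual rules, and `stripPii` is a hypothetical name. Names and student IDs are omitted because they have no universal format and would need institution-specific patterns.

```typescript
interface PiiRule {
  label: string;
  pattern: RegExp;
}

// Hypothetical, deliberately simple patterns; real-world formats vary widely.
const PII_RULES: readonly PiiRule[] = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "PHONE", pattern: /\(?\b\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b/g },
  { label: "DOB", pattern: /\b\d{1,2}\/\d{1,2}\/\d{4}\b/g },
];

// Replace every match with a bracketed placeholder such as [SSN].
function stripPii(text: string): string {
  return PII_RULES.reduce(
    (acc, rule) => acc.replace(rule.pattern, `[${rule.label}]`),
    text,
  );
}
```

Replacing matches with labeled placeholders, rather than deleting them, keeps the surrounding text structure intact for any later field extraction.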
Audit Trail
All significant actions are logged to an immutable, append-only audit_events table. Triggers reject every UPDATE and DELETE, even from service_role. Event categories: AUTH, ANCHOR, PROFILE, ORG, ADMIN, SYSTEM. PII fields are nullified at write time.
API Key Security
Keys are hashed with HMAC-SHA256 using API_KEY_HMAC_SECRET. Raw keys are never stored after initial creation. Keys support scoped permissions: verify, verify:batch, keys:manage, usage:read.
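The hashing and scope check described above can be sketched with Node's built-in `crypto` module. The function names are hypothetical, and passing the secret as a parameter is a simplification; a deployment would presumably read it from the `API_KEY_HMAC_SECRET` environment variable.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type Scope = "verify" | "verify:batch" | "keys:manage" | "usage:read";

// Keyed hash of the raw API key; only this value would be persisted.
function hashApiKey(rawKey: string, hmacSecret: string): string {
  return createHmac("sha256", hmacSecret).update(rawKey).digest("hex");
}

// Re-hash the presented key and compare in constant time against the stored hash.
function verifyApiKey(presentedKey: string, storedHash: string, hmacSecret: string): boolean {
  const candidate = Buffer.from(hashApiKey(presentedKey, hmacSecret), "hex");
  const stored = Buffer.from(storedHash, "hex");
  return candidate.length === stored.length && timingSafeEqual(candidate, stored);
}

// A request is allowed only if the key was granted the scope it needs.
function hasScope(granted: readonly Scope[], required: Scope): boolean {
  return granted.includes(required);
}
```

Using a keyed HMAC rather than a plain hash means a leaked table of stored hashes cannot be checked against candidate keys without also knowing the secret.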
On-Chain Content Policy
Only 36 bytes are ever written to the anchoring network: ARKV (4 bytes) + SHA-256 hash (32 bytes). The following are never written on-chain: filenames, file sizes, MIME types, user IDs, org IDs, email addresses, or any other PII.
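One way to enforce this policy in code is a strict check that accepts only the exact 36-byte layout. The sketch below is an assumption about how such a guard could look; `parseAnchorPayload` is a hypothetical name.

```typescript
const MAGIC = [0x41, 0x52, 0x4b, 0x56]; // ASCII "ARKV"
const PAYLOAD_LENGTH = 36;              // 4-byte prefix + 32-byte SHA-256 hash

// Returns the 32-byte fingerprint as hex, or throws if the payload violates the layout.
function parseAnchorPayload(payload: Uint8Array): string {
  if (payload.length !== PAYLOAD_LENGTH) {
    throw new Error(`payload must be exactly ${PAYLOAD_LENGTH} bytes, got ${payload.length}`);
  }
  if (!MAGIC.every((byte, i) => payload[i] === byte)) {
    throw new Error("payload does not start with the ARKV prefix");
  }
  return Array.from(payload.subarray(MAGIC.length), (b) => b.toString(16).padStart(2, "0")).join("");
}
```

Because the length is fixed, there is no room in the payload for extra fields such as a filename or an identifier; anything longer or shorter is rejected outright.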
Metadata Extraction
Extracts structured fields from PII-stripped OCR text using Gemini Flash. Returns confidence scores per field.
Batch Extraction
Processes up to 100 credentials in a single request.
Semantic Search
Natural language search across all credentials using pgvector embeddings (768-dim).
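As an illustration of what an embedding search computes, the sketch below ranks credentials by cosine distance between 768-dimensional vectors. In practice pgvector performs this ranking inside Postgres (its `<=>` operator is cosine distance); the choice of cosine distance here is an assumption, since the text does not say which metric Arkova uses, and the type and function names are hypothetical.

```typescript
const EMBEDDING_DIM = 768; // dimension stated in the text

interface IndexedCredential {
  id: string;
  embedding: readonly number[];
}

// Cosine distance = 1 - cosine similarity; smaller means more similar.
function cosineDistance(a: readonly number[], b: readonly number[]): number {
  if (a.length !== EMBEDDING_DIM || b.length !== EMBEDDING_DIM) {
    throw new Error(`embeddings must have ${EMBEDDING_DIM} dimensions`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < EMBEDDING_DIM; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine distance is undefined for a zero vector");
  }
  return 1 - dot / Math.sqrt(normA * normB);
}

// Return the `limit` credentials closest to the query embedding.
function nearestCredentials(
  query: readonly number[],
  rows: readonly IndexedCredential[],
  limit: number,
): IndexedCredential[] {
  return [...rows]
    .sort((x, y) => cosineDistance(query, x.embedding) - cosineDistance(query, y.embedding))
    .slice(0, limit);
}
```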
Fraud / Integrity Scoring
Computes an integrity score from 0 to 100. Scores below 60 are automatically flagged for human review.
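The flagging rule reduces to a single threshold comparison. The sketch below makes the boundary explicit: a score of exactly 60 is not flagged. `needsHumanReview` is a hypothetical name, and how the score itself is computed is not covered here.

```typescript
const REVIEW_THRESHOLD = 60; // scores strictly below this are flagged

function needsHumanReview(score: number): boolean {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`integrity score must be between 0 and 100, got ${score}`);
  }
  return score < REVIEW_THRESHOLD;
}
```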
Visual Fraud Detection
Image-based fraud analysis for credential documents.
Human Review Queue
Flagged credentials surface in an admin review queue with a disposition workflow.
Extraction Feedback
Closed-loop learning: human corrections improve future accuracy.
Knowledge Query
Retrieval-augmented generation against 29,000+ public records. Returns cited sources.
Cost-Efficiency Model
| Operation | Cost | Model |
| --- | --- | --- |
| Metadata Extraction | 1 AI credit | Gemini 2.0 Flash |
| Semantic Search | 1 AI credit | text-embedding-004 |
| Fraud Analysis | 5 AI credits | Gemini 2.0 Flash |
| Embedding Generation | 1 AI credit | text-embedding-004 |
| RAG Query | Variable | Gemini + pgvector |
Tip
Gemini Flash provides extraction accuracy on par with larger models (F1=82.1%) at ~$0.075 per 1M input tokens. The provider abstraction layer supports hot-swapping to OpenAI or Anthropic.
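The credit costs and the quoted token price can be turned into a quick estimate. The sketch below is a worked-arithmetic helper under stated assumptions: the operation keys and function names are hypothetical, RAG Query is left out because its cost is variable, and the token price is the approximate figure quoted above.

```typescript
type Operation =
  | "metadata_extraction"
  | "semantic_search"
  | "fraud_analysis"
  | "embedding_generation";

// Fixed credit costs from the table; RAG Query is excluded because it is variable.
const CREDITS_PER_OPERATION: Record<Operation, number> = {
  metadata_extraction: 1,
  semantic_search: 1,
  fraud_analysis: 5,
  embedding_generation: 1,
};

// Sum credits for a usage map of operation -> number of calls.
function totalCredits(usage: Partial<Record<Operation, number>>): number {
  return (Object.keys(usage) as Operation[]).reduce(
    (sum, op) => sum + CREDITS_PER_OPERATION[op] * (usage[op] ?? 0),
    0,
  );
}

const USD_PER_MILLION_INPUT_TOKENS = 0.075; // approximate price quoted in the text

function inputTokenCostUsd(tokens: number): number {
  return (tokens / 1e6) * USD_PER_MILLION_INPUT_TOKENS;
}
```

For example, 100 metadata extractions plus 10 fraud analyses come to 100 × 1 + 10 × 5 = 150 AI credits, and 2 million input tokens at the quoted price cost about $0.15.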