status: active
timestamp: 2026-06-22
tags: [decision, package, ai, providers, free-tier, data-repo]
@chirag127/oriz-ai-providers (18th package) + chirag127/oriz-ai-providers-data data repo
@chirag127/oriz-ai-providers aggregates free LLM API providers (Cerebras, Groq, Cohere, NVIDIA NIM, GitHub Models, Cloudflare Workers AI, HuggingFace, Mistral, SambaNova, OpenRouter, LLM7, OVHcloud, Pollinations, Kilo Code, Ollama Cloud, Z.AI, Aion Labs, SiliconFlow, ModelScope; 20+ providers). Provider data, model lists, rate limits, and base URLs are maintained in a separate data repo `chirag127/oriz-ai-providers-data` so the package can stay slim and the data can be updated independently of the code. Priority order: no-key-required providers first (anonymous OVHcloud / LLM7 / Pollinations), then free-with-key providers as a fallback chain. NIM + OpenRouter demoted from primary.
@chirag127/oriz-ai-providers (18th package)
Decision
Add @chirag127/oriz-ai-providers as the 18th package in the family. Its job: a thin wrapper around every free LLM API the family uses for blog rewrites, omni-publish drafts, janaushdhi substitute-finder, ncert summaries, etc.
Why a package + data-repo split:
- Code (in @chirag127/oriz-ai-providers-npm-pkg): wrapper logic, fallback chain, retry, env-var loading, OpenAI-SDK-compatible client. Stable.
- Data (in chirag127/oriz-ai-providers-data): JSON list of providers, models, rate limits, base URLs, env-var names, signup URLs. Changes monthly as the LLM API landscape shifts.
- The package fetches the data JSON at build time (or at runtime via fetch with a 1-day CF cache).
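The runtime path in the last bullet could look roughly like the sketch below. It is a minimal in-memory TTL cache, not the CF cache the note mentions, and `createCachedLoader`, `fetchJson`, and `now` are hypothetical names introduced only for illustration. The fetcher and the clock are injected so the behaviour can be checked without touching the network.

```typescript
// Hypothetical helper: not the package's actual API.
// Caches the result of `fetchJson` in memory for `ttlMs` milliseconds.
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

function createCachedLoader<T>(
  fetchJson: () => Promise<T>, // e.g. () => fetch(dataUrl).then((r) => r.json())
  ttlMs: number = ONE_DAY_MS,
  now: () => number = Date.now,
): () => Promise<T> {
  let cached: { value: T; fetchedAt: number } | undefined;

  return async () => {
    if (cached !== undefined && now() - cached.fetchedAt < ttlMs) {
      return cached.value; // still fresh: skip the fetch
    }
    const value = await fetchJson();
    cached = { value, fetchedAt: now() };
    return value;
  };
}
```

If the fetch fails, the error propagates and nothing is cached, so the next call tries again. Serving a stale copy on failure would be a reasonable variation.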
Providers list (as of 2026-06-22)
Tier 1: Anonymous (no key, no signup, no card)
- OVHcloud AI Endpoints — 2 RPM per IP per model, EU-hosted, 20+ models (Qwen3.5, gpt-oss, Llama 3.3, Mistral)
- LLM7.io — 30 RPM per IP, 30+ models (deepseek-r1, gpt-4o-mini, gemini-2.5-flash-lite, etc.)
- Pollinations — anonymous gpt-oss-20b
Tier 2: Free with no-card signup
- Cerebras — 30 RPM + 1M TPD, ultra-fast (~2,600 tok/s), gpt-oss-120b + Llama 3.1 8B
- Groq Cloud — 30 RPM + 1,000 RPD, llama-3.3-70b-versatile (faster than NIM)
- NVIDIA NIM — 40 RPM, 100+ models, requires phone verification
- Google AI Studio — Gemini 2.5 / 3.x Flash, 5-15 RPM + 20-1,500 RPD per model (free outside EU/UK/Switzerland)
- Cohere — 20 RPM + 1,000 req/month, Command A+ / R+ (non-commercial only)
- GitHub Models — 10-15 RPM + 50-150 RPD, GPT-5 / GPT-4.1 / o4-mini, free with Copilot tier
- Cloudflare Workers AI — 10K neurons/day, Llama 3.3 70B FP8 / GPT-OSS / Qwen3
- HuggingFace — 100K credits/mo, router to Fireworks/Together/Hyperbolic, thousands of models
- Mistral La Plateforme — 500K TPM + ~1B tokens/month (Experiment plan), Mistral Medium 3.5 / Codestral
- SambaNova — 20 RPM + 200K TPD, DeepSeek V3.1 + Llama 3.3 70B
- OpenRouter — 20 RPM + 200 RPD per :free model (Llama 3.3 70B, Qwen3-Coder, Nemotron-Ultra-550B)
- Z.AI (Zhipu) — GLM-4.7-Flash + GLM-4.6V-Flash (Chinese provider)
- SiliconFlow — Qwen3-8B + DeepSeek-R1-Distill (Chinese)
- Aion Labs — 15 RPM + 20K TPD, roleplay-specialized
- Ollama Cloud — qualitative usage, 400+ Ollama-hosted models (not OpenAI SDK)
- ModelScope — 2,000 RPD, Qwen3.5-35B-A3B + Qwen3.5-27B (requires Alibaba real-name)
- Kilo Code — auto-router free models
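Most of the limits above are expressed in requests per minute, so a client-side guard is one way to stay under them. The sketch below is an assumption about how that might be done, not something the package is known to contain: `RpmLimiter` is a hypothetical sliding-window counter keyed by an arbitrary string such as a provider and model pair.

```typescript
// Hypothetical helper: tracks request timestamps per key in a 60-second
// sliding window and refuses requests once `rpm` is reached.
class RpmLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  /** Returns true and records the request if `key` is still under `rpm`. */
  tryAcquire(key: string, rpm: number): boolean {
    const windowStart = this.now() - 60 * 1000;
    // Drop timestamps that have left the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > windowStart);
    if (recent.length >= rpm) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(this.now());
    this.hits.set(key, recent);
    return true;
  }
}
```

With a 2 RPM entry such as the OVHcloud one, the third `tryAcquire` inside the same minute returns `false`, which a caller could treat as a signal to move to the next provider. Daily limits (RPD, TPD) would need a separate counter.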
Priority order (default fallback chain)
For text completion at low rate:
- OVHcloud anonymous (zero friction, EU-hosted)
- LLM7 anonymous
- Cerebras (key required, ultra-fast)
- Groq Cloud (key, also fast)
- NVIDIA NIM (key + phone verified, more model variety)
- OpenRouter free (key, broad coverage)
- Google AI Studio Gemini (key, Gemini-flavored output)
- CF Workers AI (key, lives in our infra)
For high-volume (>30 RPM):
- Spread load across multiple providers in parallel
- LLM7 token + Cerebras + Groq concurrently
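One simple way to spread load is round-robin assignment with all requests in flight at once. The sketch below assumes that approach; `Caller` and `spreadAcross` are hypothetical names, and each `Caller` stands in for a single provider's completion call.

```typescript
// Hypothetical sketch: `Caller` wraps one provider's completion request.
type Caller = (prompt: string) => Promise<string>;

async function spreadAcross(
  prompts: string[],
  callers: Caller[],
): Promise<string[]> {
  if (callers.length === 0) {
    throw new Error("no providers configured");
  }
  // Round-robin: prompt i goes to provider (i mod N).
  // Promise.all starts every request immediately and preserves input order.
  return Promise.all(
    prompts.map((prompt, i) => callers[i % callers.length](prompt)),
  );
}
```

`Promise.all` rejects as soon as any one request fails, so a production version would likely pair this with per-request fallback or use `Promise.allSettled`.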
For reasoning tasks:
- DeepSeek-R1 via NIM or SambaNova or Hugging Face router
- o4-mini via GitHub Models
For vision/multimodal:
- Llama 4 Scout (CF Workers AI, GitHub Models, OpenRouter)
- Pixtral Large (Mistral)
- Qwen2.5-VL (OVHcloud, HF)
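The per-task orderings above could be encoded as a lookup table. The sketch below is one possible encoding: the `TaskKind` names, the provider ids, and `chainFor` are all hypothetical, and the model strings are the display names used in this note rather than real API model ids. Only the orderings come from the lists above.

```typescript
// Hypothetical encoding of the chains listed above.
type TaskKind = "text" | "reasoning" | "vision";

interface Route {
  provider: string; // hypothetical provider id
  model?: string; // display name, set only where the note names a model
}

const CHAINS: Record<TaskKind, Route[]> = {
  text: [
    { provider: "ovhcloud" },
    { provider: "llm7" },
    { provider: "cerebras" },
    { provider: "groq" },
    { provider: "nvidia-nim" },
    { provider: "openrouter" },
    { provider: "google-ai-studio" },
    { provider: "cloudflare-workers-ai" },
  ],
  reasoning: [
    { provider: "nvidia-nim", model: "DeepSeek-R1" },
    { provider: "sambanova", model: "DeepSeek-R1" },
    { provider: "huggingface", model: "DeepSeek-R1" },
    { provider: "github-models", model: "o4-mini" },
  ],
  vision: [
    { provider: "cloudflare-workers-ai", model: "Llama 4 Scout" },
    { provider: "github-models", model: "Llama 4 Scout" },
    { provider: "openrouter", model: "Llama 4 Scout" },
    { provider: "mistral", model: "Pixtral Large" },
    { provider: "ovhcloud", model: "Qwen2.5-VL" },
    { provider: "huggingface", model: "Qwen2.5-VL" },
  ],
};

/** Returns the chain for `task`, keeping only providers in `available`. */
function chainFor(task: TaskKind, available: Set<string>): Route[] {
  return CHAINS[task].filter((route) => available.has(route.provider));
}
```

Filtering preserves the original order, so the first element of the result is always the highest-priority provider that is actually usable.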
Data repo shape
chirag127/oriz-ai-providers-data:
  providers.json    # one entry per provider
  models.json       # one entry per model (with provider link)
  rate-limits.json  # provider × model × tier
  env-vars.json     # which env var maps to which provider
  signup-urls.json  # for the README + onboarding doc
  priority.json     # default fallback chain (the order above)
Updated via PR. Each change creates a new release tag. Package fetches latest tag (or main) at build/runtime.
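Because the files reference each other, a PR check that catches dangling references may be useful. The sketch below assumes field names (`id`, `baseUrl`, `envVar`) that are not confirmed by this note; the real schema lives in the data repo, and `findUnknownProviders` is a hypothetical helper.

```typescript
// Assumed shape of one providers.json entry; field names are illustrative.
interface ProviderEntry {
  id: string;
  baseUrl: string;
  envVar: string | null; // assumed: null for anonymous (no-key) providers
}

/** Returns the ids in `priority` that have no matching provider entry. */
function findUnknownProviders(
  providers: ProviderEntry[],
  priority: string[],
): string[] {
  const known = new Set(providers.map((p) => p.id));
  return priority.filter((id) => !known.has(id));
}
```

A CI step could fail the PR whenever the returned list is non-empty, so that `priority.json` never names a provider missing from `providers.json`.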
Wrapper API
import { ai } from "@chirag127/oriz-ai-providers";

const result = await ai.complete({
  prompt: "Rewrite this blog post for Twitter",
  task: "rewrite-short", // mapped to priority chain
  maxTokens: 280,
  // optional overrides:
  preferProvider: "cerebras",
  fallback: true,
});
// result.text, result.provider, result.model, result.tokensUsed
The wrapper:
- Loads provider data from data-repo (cached 24h)
- Picks the highest-priority provider with a configured env var
- Calls it via OpenAI SDK (most providers are OpenAI-compatible)
- On 429/5xx: falls back to next provider
- Returns first successful result
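The selection and fallback steps above can be sketched as a single loop. This is an illustration under stated assumptions, not the package's implementation: `Provider`, `Call`, and `completeWithFallback` are hypothetical names, `call` stands in for the OpenAI-SDK request, errors are assumed to carry a numeric `status` field, and a `null` env var is assumed to mark an anonymous provider.

```typescript
// Hypothetical types; `call` stands in for the real provider request.
interface Provider {
  id: string;
  envVar: string | null; // assumed: null means no key is needed
}

type Call = (provider: Provider, prompt: string) => Promise<string>;

// Treat 429 and 5xx as "try the next provider"; anything else is fatal.
function isRetryable(e: unknown): boolean {
  const status = (e as { status?: unknown } | null)?.status;
  return typeof status === "number" && (status === 429 || status >= 500);
}

async function completeWithFallback(
  chain: Provider[], // already in priority order
  env: Record<string, string | undefined>, // e.g. process.env
  call: Call,
  prompt: string,
): Promise<{ text: string; provider: string }> {
  // Keep anonymous providers and those whose env var is set and non-empty.
  const usable = chain.filter(
    (p) => p.envVar === null || Boolean(env[p.envVar]),
  );
  let lastError: unknown = new Error("no usable provider configured");
  for (const provider of usable) {
    try {
      const text = await call(provider, prompt);
      return { text, provider: provider.id }; // first success wins
    } catch (e) {
      if (!isRetryable(e)) throw e; // e.g. 400/401: fallback would not help
      lastError = e;
    }
  }
  throw lastError; // every usable provider returned 429/5xx
}
```

Rethrowing non-retryable errors immediately is a design choice: a bad request or an invalid key would most likely fail the same way on every provider, so continuing down the chain would only burn quota.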
Master pointer
Adding this package brings the family count to 18 packages (was 17 per the-23-packages.md). Rename + update count in:
- knowledge/architecture/the-23-packages.md → the-23-packages.md (rename via git mv to keep history)
- knowledge/services/family-inventory.md: bump count
- AGENTS.md “Where to look” if referenced
Supersedes in part
decisions/architecture/stack-picks-2026-06-22.md — its “AI inference” section named NIM primary + OpenRouter fallback only. That’s now superseded by this decision (priority chain in this file). Update that file to point here.
Cross-refs
- The 18-package family → [[architecture/the-23-packages]]
- Stack picks (superseded-in-part) → [[decisions/stack/stack-picks-2026-06-22]]
- Never hit quotas ? [[rules/never-hit-quotas]]
- No card on file ? [[rules/no-card-on-file]]
- Source list reference: awesome-free-llm-apis + free-llm-api-resources