type: decision
status: active
timestamp: 2026-06-22
tags: [decision, ncert, pdf-merge, client-side, storage, github-releases, dual-mode]

NCERT app: dual-mode downloads — GH Release pre-merged + client-side on-the-fly merge

Both download modes ship: (1) pre-merged PDFs and per-chapter PDFs as GitHub Release artefacts (free GH bandwidth + CDN); (2) a client-side on-the-fly merger using pdf-lib in the browser: the user clicks 'Build my book', the browser fetches all chapter PDFs from ncert.nic.in URLs, merges them with pdf-lib (pure JavaScript, no WASM), and downloads the result. Zero server storage for the on-the-fly path. (3) Individual chapter links are also exposed for users who want only a few chapters. Three options per book card.

NCERT dual-mode download

Decision

Each book card offers THREE download paths:

  1. Pre-merged PDF (recommended): links to a GitHub Release asset such as class-9-maths-en.pdf. Fastest, single download, no compute on the user's device.
  2. Build-my-book on-the-fly: JS button “Build PDF”. The browser fetches all chapter PDFs from CORS-permitted ncert.nic.in URLs and merges them client-side using pdf-lib (npm pdf-lib, MIT, ~200 KB gzip, pure JavaScript with no native or WASM dependency). The user gets a download blob. Zero server storage for this path.
  3. Individual chapters: collapsible list of per-chapter ncert.nic.in URLs. Users wanting only Ch 5 can grab it.
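
The three paths can be modelled as one record per book card. This is a minimal sketch, not the app's code: the `BookCard` shape, the helper names, and the `example-org/ncert-books` repository with tag `v0` are assumptions for illustration. Only the asset name pattern (class-9-maths-en.pdf) comes from this note.

```typescript
// Hypothetical shape of one book card; field names are illustrative.
interface BookCard {
  classLevel: string;    // e.g. "9"
  subject: string;       // e.g. "maths"
  lang: 'en' | 'hi';
  chapterUrls: string[]; // per-chapter ncert.nic.in URLs (path 3)
}

interface DownloadOptions {
  preMergedUrl: string;      // path 1: GitHub Release asset
  canBuildOnTheFly: boolean; // path 2: only when CORS allows it
  chapterUrls: string[];     // path 3: individual chapters
}

// Assumed placeholder: the real owner, repo and tag are not stated in this note.
const RELEASE_BASE =
  'https://github.com/example-org/ncert-books/releases/download/v0';

// Builds an asset name in the same pattern as class-9-maths-en.pdf.
function releaseAssetName(book: BookCard): string {
  return `class-${book.classLevel}-${book.subject}-${book.lang}.pdf`;
}

// Combines the three paths; the on-the-fly path is offered only if the
// CORS check passed and there is at least one chapter to merge.
function downloadOptions(book: BookCard, corsAllowed: boolean): DownloadOptions {
  return {
    preMergedUrl: `${RELEASE_BASE}/${releaseAssetName(book)}`,
    canBuildOnTheFly: corsAllowed && book.chapterUrls.length > 0,
    chapterUrls: book.chapterUrls,
  };
}
```

Keeping the pre-merged URL derivable from the card fields means the card needs no extra lookup to render path 1, while paths 2 and 3 share the same chapter URL list.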

Why both pre-merged + on-the-fly

Pre-merged from GH Releases:

Client-side on-the-fly:

If ncert.nic.in blocks CORS, we fall back to the pre-merged path only. We test on first visit and hide the on-the-fly button if the CORS check fails.
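
The first-visit check could look roughly like the sketch below. It assumes one chapter URL is used as the probe and that a rejected `fetch` is treated as "CORS denied"; the function name `corsAllowed` and the injectable `fetchImpl` parameter are hypothetical and exist mainly so the probe can be tested with a stub.

```typescript
// Minimal fetch signature so the probe can be exercised with a stub.
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

// Resolves true when a cross-origin GET is readable by page scripts.
// Resolves false when the browser blocks the request (fetch rejects with
// a TypeError on CORS or network failure) or the server returns an error status.
async function corsAllowed(
  probeUrl: string,
  fetchImpl: FetchLike = fetch,
): Promise<boolean> {
  try {
    const res = await fetchImpl(probeUrl, { mode: 'cors' });
    // The headers are enough for the check; cancel the body instead of
    // downloading the whole PDF.
    await res.body?.cancel();
    return res.ok;
  } catch {
    return false;
  }
}
```

A page would call `corsAllowed(firstChapterUrl)` once and render the “Build PDF” button only when it resolves to `true`. A blocked request and an offline network look the same to the probe, which is acceptable here because both cases should hide the button.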

Storage math (defends GH Release path)

GH Releases is the only no-card free tier that holds 30 GB.

pdf-lib client-side merge

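import { PDFDocument } from 'pdf-lib';

// Fetches each chapter PDF in order and appends its pages to one merged document.
// Returns an object URL; the caller should URL.revokeObjectURL() it after the download.
async function mergeOnTheFly(chapterUrls: string[]): Promise<string> {
  const merged = await PDFDocument.create();
  for (const url of chapterUrls) {
    const res = await fetch(url);
    // fetch() only rejects on network or CORS failure, so check the HTTP status too.
    if (!res.ok) throw new Error(`Failed to fetch ${url}: HTTP ${res.status}`);
    const chap = await PDFDocument.load(await res.arrayBuffer());
    const pages = await merged.copyPages(chap, chap.getPageIndices());
    pages.forEach(p => merged.addPage(p));
  }
  const blob = new Blob([await merged.save()], { type: 'application/pdf' });
  return URL.createObjectURL(blob);
}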

~80 LOC in oriz-ncert-app/src/components/BuildMyBook.tsx. The component is lazy-loaded only when the user clicks “Build PDF”, so pdf-lib’s ~200 KB gzip is never downloaded on pages that do not merge.
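
One way to get this load-on-click behaviour is to wrap the dynamic import in a memoising loader. The helper below is a sketch under that assumption; `lazyOnce` is a hypothetical name and not necessarily how BuildMyBook.tsx does it.

```typescript
// Memoises an async loader so the heavy module is requested at most once,
// and only when first needed (for example on the "Build PDF" click).
function lazyOnce<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = loader().catch(err => {
        cached = undefined; // allow a retry after a failed load
        throw err;
      });
    }
    return cached;
  };
}

// Intended wiring (assumption): bundlers split a dynamic import into its own chunk.
//   const loadPdfLib = lazyOnce(() => import('pdf-lib'));
//   const { PDFDocument } = await loadPdfLib(); // inside the click handler
```

Caching the promise rather than the resolved module means two quick clicks share one in-flight download instead of starting two.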

Scraping mechanism phased rollout

User mandate (2026-06-22): “Run playwright-cli local skill for first-time debugging… in the later stage in GitHub Actions we can use cheerio… after verifying that our playwright script works properly after testing.”

Scope v0

ALL classes (Pre-K + 1-5 + 6-10 + 11-12) in English + Hindi. ~600 books. Full launch.

Cross-refs

