Changelog
Currently v0.6.4. Engine updates surface here as they ship; older entries are kept so a buyer or evaluator can audit the full trail of changes themselves.
DCFN-Patents — Changelog
What's changed in the engine's user-visible output, in reverse chronological order. Updates to internal architecture (Drive routing, naming conventions, deployment infrastructure) live in DCFN-BUILD-CHARTER.md and aren't repeated here unless they change what a user actually sees.
Pre-1.0 versioning convention:
- `0.x.0` — feature additions, output-shape changes
- `0.x.Y` — quality fixes, prompt refinements, copy updates
- `1.0.0` — reserved for first paid Tier 1 customer signing a contract
v0.6.4 — 2026-05-01
Public site copy + footer cleanup (Z directives from CRISPR retest)
- Sampler scope first paragraph rewritten to lead with the provisional-patents-only disclaimer (was leading with "top 3 proposed patents" framing). The disclaimer now appears at the head of the sampler scope block instead of buried in the footer.
- Six-domain sampler claim removed. "Education, research, bio, legal, materials, and energy" was stale — predates the per-session corpus pull (v0.5.5+). Engine now accepts portfolios from any USPTO-classified domain; the CPC-derived corpus pull adapts per submission. Updated copy: "The engine accepts portfolios from any USPTO-classified domain. A corpus matched to your portfolio's CPC subclasses is pulled fresh per submission — no fixed domain whitelist."
- Footer disclaimer block removed (now lives at top of sampler scope where it leads, not where it gets skipped).
- `dcfn-patents.onrender.com` URL removed from footer (redundant — visitors are already on the URL when they read it).
v0.6.3 — 2026-05-01
Locked pricing decision finally shipped
Z locked the Tier 0 pricing change 2026-04-30 ($85/30day → $175/3day, no L2 gate) but the prior instance only documented the lock without shipping the engine change. Result: Z's CRISPR retest charged the old $85 + an additional $20 ($15 provisional + $1 × 5 memos) at the L2 gate. This ship makes the live pricing match the documented lock.
- Access price: `ACCESS_PRICE_CENTS = 8500` → `17500` ($85 → $175)
- Access window: `ACCESS_COOKIE_MAX_AGE` 30 days → 3 days
- L2 per-deliverable Stripe checkout: REMOVED. The fall-through to inline `unit_amount: 1500` (provisional) + `unit_amount: 100` (memo) line items is gone. Layer 2 deliverables are unconditionally included in the access window. If a request reaches `/layer2/{session_id}/select` without a valid access cookie, the engine returns 402 with a clear "re-pay $175" message — never falls through to a per-deliverable charge.
The _verified_stripe_sessions orphan bug (cookie issued before a Render redeploy may not survive into the post-deploy session set) is tracked separately as a follow-on; the new 402 path keeps unpaid users from accidentally being charged the old per-deliverable amounts even if cookies orphan in the future.
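A minimal sketch of the new gate logic (function name and return shape hypothetical — the real route is a FastAPI handler): either the access cookie is valid and Layer 2 is simply included, or the request gets a flat 402. There is no third branch that charges per deliverable.

```python
def layer2_gate(has_valid_access_cookie: bool) -> tuple[int, str]:
    """Sketch of the v0.6.3 Layer 2 access gate described above.

    Two outcomes only: proceed (deliverables included in the access window)
    or 402 with a re-pay message. The old fall-through to per-deliverable
    Stripe line items no longer exists.
    """
    if has_valid_access_cookie:
        # Deliverables are unconditionally included — no extra checkout.
        return 200, "layer2_select"
    # No valid cookie: never fall through to a per-deliverable charge.
    return 402, "Access window expired or missing — re-pay $175 for a new 3-day window."
```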
Render image rebuild
GitHub Actions workflow auto-publishes a fresh ghcr.io/syntaricodex/lef-dcfn-patents:v0.6.3 image on tag push. Tier 2 customers who pulled :v0.6.1 should pull :v0.6.3 to pick up the price update (relevant only if they enable the Stripe gate in their private deployment, which is uncommon for Tier 2 — usually disabled per ADR).
v0.6.1 — 2026-05-01
Tier 2 container infrastructure (foundational ship)
Tier 2 Private Deployment requires a container customers can pull. Until v0.6.1, the deployment runbook + license templates existed but the actual Docker image did not — meaning a Tier 2 customer signing the contract would have hit a "wait, where's the container?" gap. Closes that gap.
- `Dockerfile` — Python 3.11 slim base, requirements pre-installed, SBERT model `all-MiniLM-L6-v2` baked in at build time (eliminates ~30s first-request cold start). Defaults `DCFN_TIER=tier_2` so the engine reads Tier 2 caps from `tier_config.py` (v0.6.0).
- `.dockerignore` — excludes secrets, `Misc/Keys/`, sample data, OS noise; keeps documentation in the image so the customer's deployed instance can serve `/changelog` + bundle the docs.
- `.github/workflows/docker-publish.yml` — GitHub Actions workflow building + tagging + pushing to GitHub Container Registry (`ghcr.io/syntaricodex/lef-dcfn-patents`). Triggers on push to `main` (→ `:edge` tag, never used by Tier 2) and on git tags matching `v*.*.*` (→ `:VERSION` + `:latest`). Tier 2 customers pin to specific versions per License Schedule.
What's still per-customer (lives in DCFN Admin Layer scope when that instance comes online): registry credential provisioning, per-customer setup runbook generation from DEPLOYMENT_RUNBOOK.md template, customer onboarding email automation, "Used by" public-site queue.
For the first Tier 2 conversation: the foundational image is ready to publish; Z + active instance handle per-customer provisioning manually until the Admin Layer instance is online. Bridges the gap.
v0.6.0 — 2026-05-01
Tier-config framework — per-tier engine knobs (Cap Audit Doc 5)
New module tier_config.py centralizes per-tier engine cap configuration. Tier resolution at request time via resolve_tier(): env var (DCFN_TIER, used by Tier 2 baked Docker images) → session state → cookie → default. The TIER_CONFIGS dict holds three tier profiles: tier_0 (current production), tier_1 (boutique IP firms), tier_2 (private deployment).
Caps now read from the resolved config (with module-level constants as fallback for CLI / no-context invocations):
- `hypothesis_engine.SAMPLER_CANDIDATE_CAP` → `candidate_cap` (Tier 0: 3, Tier 1: 5, Tier 2: 10)
- `session_corpus_pull.MAX_BYTES_PER_SESSION` → `bq_session_byte_ceiling` (Tier 0/1: 1 TB, Tier 2: 5 TB)
- `session_corpus_pull.DEFAULT_MAX_PATENTS` → `default_max_patents` (Tier 0: 3000, Tier 1: 5000, Tier 2: 10000)
- `session_corpus_pull.MAX_CPC_SUBCLASSES` → `max_cpc_subclasses` (Tier 0/1: 10, Tier 2: 15)
- `domain_scan.find_uncovered_territory(top_n=)` → `top_n_clusters` (Tier 0: 10, Tier 1: 15, Tier 2: 25)
Two engine-depth feature flags also land at the tier level (gated to Tier 1+):
- `include_papers_without_abstracts` (Tier 0: False, Tier 1+: True) — opens corpus to structural-metadata-only papers; implementation lands in a follow-on
- `adaptive_corpus_per_cast` (Tier 0/1: False, Tier 2: True) — engine adapts corpus shape per query; implementation lands in a follow-on when the first Tier 2 customer surfaces
Default behavior unchanged. Tier 0 caps match prior production constants exactly. Local validation confirmed via Charter §15: resolve_tier() returns tier_0 with no DCFN_TIER env / no session state, and every cap read returns the original constant value.
The two depth-feature flags are wired but the corresponding engine code paths are not yet implemented. Flags exist so the tier-config framework is complete; engine implementation follows when first Tier 1 / Tier 2 customer signals demand.
v0.5.9 — 2026-04-30
Pipeline timing instrumentation (Charter §16)
- L1 pipeline now captures wall-clock time per step (`fetch`, `corpus_pull`, `claim_graph`, `domain_scan`, `hypothesis`, `prior_art`, `landscape_narrative`, `memo_pitch`, `render_arc`) and surfaces it in two places: a log line per step (`[step_id] elapsed=Xs`) and a final `[L1_PIPELINE_TIMING] total=Xs — step1=Ys | step2=Zs | ...` summary line. Also persisted to session state as `stage_timings` + `total_pipeline_seconds`.
- Triggered by Z asking whether to upgrade the Render tier after a CRISPR run took ~50 minutes — without timing data the answer is a guess. Now every run answers the question empirically: which steps eat the wall clock, and whether more CPU/RAM would help (compute-bound) or not (API-bound).
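The per-step capture can be sketched as a context manager (helper names hypothetical; only the log-line shapes are from this entry):

```python
import time
from contextlib import contextmanager

stage_timings: dict[str, float] = {}  # persisted to session state in the real engine

@contextmanager
def timed_step(step_id: str):
    """Wall-clock capture for one pipeline step; emits the per-step log line."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_timings[step_id] = time.perf_counter() - start
        print(f"[{step_id}] elapsed={stage_timings[step_id]:.1f}s")

def timing_summary(timings: dict[str, float]) -> str:
    """The final [L1_PIPELINE_TIMING] summary line described above."""
    steps = " | ".join(f"{k}={v:.1f}s" for k, v in timings.items())
    return f"[L1_PIPELINE_TIMING] total={sum(timings.values()):.1f}s — {steps}"
```

Usage would wrap each pipeline stage, e.g. `with timed_step("corpus_pull"): run_corpus_pull(session)`.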
v0.5.8 — 2026-04-30
Layer 2 CTA cleanup
- Removed the duplicate "Upgrade to Layer 2 →" button from the landscape report header. It anchor-linked to the bottom-of-page CTA rather than navigating, which was a confusing dead-end click for readers who interpreted the top-right placement as the actual progression button. Single bottom CTA stays.
- Renamed the bottom button copy from "Upgrade to Layer 2 →" to "Progress to Layer 2 →". "Upgrade" frames Layer 2 as a paid step-up over what the user already sees; in the access-window pricing model Layer 2 is included, so the framing was misleading. "Progress" matches the actual flow — moving forward through stages of one engagement, not buying a new tier.
v0.5.7 — 2026-04-30
Landscape narrative tells the truth about which corpus it scanned
After v0.5.5 + v0.5.6 unblocked the per-session corpus pull (Pattern B), the CRISPR re-test produced a real biotech corpus scan — 405 CRISPR-relevant patents, 37 clusters, nearest competitor at 87% match — but the landscape narrative still said "This scan ran against a 37-cluster corpus weighted toward AI, educational technology, and adaptive systems—a significant misalignment with your portfolio's biotechnology focus." That was a hardcoded fallback string in landscape_narrative_synth._infer_corpus_theme() that fired regardless of whether the per-session pull had actually run.
Fix: _infer_corpus_theme now reads session_corpus_meta.json (written by session_corpus_pull.py) and characterizes the corpus honestly — naming the actual CPC subclasses pulled and the patent count. Static-fallback (Tier 0 / sampler) language preserved as the no-meta-file branch. The narrative's "Honest caveat" section will now reflect the real scan, not a stale boilerplate that contradicts what the engine actually did.
v0.5.6 — 2026-04-30
Schema fix: per-session corpus matches domain_scan.py reader
After v0.5.5 unblocked the BQ auth and the per-session corpus pull actually fired, the next step (domain_scan.py) crashed with TypeError: list indices must be integers or slices, not str. session_corpus_pull.py was writing the corpus as a bare JSON list; domain_scan.load_data does domain["patents"] expecting the wrapped object format the static data/domain_patents.json ships with. Now writes {"patents": [...]} to match. Re-encode-on-cache-miss path in domain_scan.py already handles the absent .npy cache (slower first scan, correct results).
v0.5.5 — 2026-04-30
Critical fix: per-session corpus pull (Pattern B) now actually fires in production
The Charter §12 Pattern B per-session BigQuery corpus pull, shipped 2026-04-30 as commit 8d03b61, has been silently falling back to the static LEF-tuned diagnostic-AI corpus on every paid run since deploy. Confirmed today via Render log of the CRISPR re-test (session efb40c18) and the unfiled-CGM run (fd5a9325):
```
[corpus_pull] stderr:
ERROR: BQ client init failed: Your default credentials were not found.
[corpus_pull] Failed (rc=4); falling back to static corpus.
```
Root cause: domain_ingest.get_bq_client() — which session_corpus_pull.py imports — only handled gcloud Application Default Credentials. Render uses a different precedence: the service account JSON is set inline in the GOOGLE_SERVICE_ACCOUNT_JSON env var. fetch_user_patents._get_bq_client() already handled this; domain_ingest.get_bq_client() didn't. Fix unifies the auth path — both clients now check GOOGLE_SERVICE_ACCOUNT_JSON first, with ADC as the local-dev fallback.
User-visible effect: paid runs will now actually score the user's portfolio against a CPC-matched corpus pulled fresh per session (typical: 3000 patents in the user's actual subclasses, ~$0.05–$0.25 BQ cost per session) instead of falling back to the static 22-patent LEF-Engine portfolio — the misalignment Perplexity flagged 2026-04-29 finally closes.
v0.5.4 — 2026-04-30
Bug fix: unfiled-idea submit validation
The intake form's submit handler was always validating the patent_numbers field, regardless of which intake tab was active. Users on "Analyze an unfiled idea" couldn't submit without entering a patent number — even though the unfiled-idea path doesn't use one. Backend was correct (already accepted both intake_mode=patents and intake_mode=unfiled); the JS submit guard was the bug. Now branches on the active mode: patent-numbers count rules (max 7, min 1) apply only on the patents tab; unfiled tab requires only a non-empty invention description.
v0.5.3 — 2026-04-30
Pricing-language leak closed
- Layer 2 page is now access-gated, no more stale `$15 each` / `$1 each` pricing visible to unpaid visitors. Perplexity's 2026-04-30 review surfaced lingering per-deliverable pricing ("$15 per provisional draft, $1 per Continuation Strategy Memo") on the Layer 2 selection page — leftover from the pre-2026-04-29 sampler-funnel pricing model. The pricing UI was already conditionally suppressed when `paid=True`, but the `/layer2/{session_id}` GET route had no access guard, so anyone reaching the URL without paying first saw the unpaid view with the old per-deliverable line items.
- Two-part fix: (1) added `_has_run_access(request)` guard on the GET route — non-paid visitors get a 402 pointing them back to the landing-page CTA; (2) collapsed the Jinja conditionals in `templates/layer2.html` so the page renders only the "Included in your access window" view, no `{% if not paid %}` branches at all. JS pricing constants (`PROV_CENTS`, `BM_CENTS`) and dollar-math removed too.
- Live pricing model unchanged: $85 unlocks a 30-day access window covering all Layer 1 + Layer 2 deliverables; no per-deliverable charges.
v0.5.2 — 2026-04-30
Patent attribution accuracy
The footer's "Built on" line was undercounting the LEF portfolio: it said "7 U.S. Patents Pending" (8 since the Tesseract Composition supplemental on 2026-04-20) and named only CTE, while the engine actually rides on multiple substrate patents. Fixed:
- Total count corrected to 8.
- "Built on" line expanded to name CTE (App. No. 64/002,205) AND the Consolidated Supplemental (App. No. 64/043,294, structural-discovery substrate). The Consolidated Supplemental is what specifically protects DCFN-Patents' silent-region detection, structural discovery, and cross-domain transfer — naming only CTE missed the most directly load-bearing protection.
Same correction applied to the Firebase brand site's per-build cards (DCFN-Patents card now lists CTE + QECO + Consolidated Supplemental + Tesseract Composition with their app numbers, plus the "part of the 8-patent LEF Ai portfolio" reference).
The audit caught one other gap: prior-version footers across DCFN-Research and DCFN-Bio had similar undercounting (Research said "6", Bio had no attribution at all). All three deployed-site footers now use the same unified attribution shape so a buyer comparing builds sees consistent and accurate IP framing.
v0.5.1 — 2026-04-30
Quality fix on the Unfiled Idea intake
The v0.5.0 ship left an unstructured-input failure mode unaddressed: real users describe inventions in many shapes (one-paragraph ideas, bullet lists, partial draft specs, stream-of-consciousness explanations), and the engine's CPC-derivation + claim-synthesis quality scales directly with how structured the user's writing is. Without scaffolding, garbage-in / garbage-out silently teaches prospects the engine is weak — when actually the engine is correctly reading what it was given.
Three layers added (per Charter §14, codified same day as a Day-1 standard for any DCFN build with free-text intake):
- Placeholder template in the upload textarea — visible until the user starts typing — laying out the four-bucket framing the engine reads best (technical problem / proposed mechanism / novel element / validation).
- Hint line below the field explicitly naming the structure-quality lever: "the engine reads what you write — descriptions with clear problem / mechanism / novelty / validation structure produce sharper output than freeform descriptions."
- Synth-side structure-derivation pre-pass — both the CPC-derivation and Claim-1-synthesis prompts now include an "INPUT MAY BE UNSTRUCTURED" preamble instructing Claude to mentally organize the description into the four buckets before deriving structured output. Makes the engine robust to messy real-world inputs without an extra API call.
The placeholder is a soft suggestion, not enforcement — users who want to paste a single paragraph still can. The scaffolding lifts the floor on input quality without forcing the user to know what the engine wants.
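A sketch of the pre-pass wiring (constant and helper names hypothetical; the preamble wording paraphrases this entry):

```python
# Hypothetical names — illustrates where the preamble sits in prompt assembly.
UNSTRUCTURED_PREAMBLE = (
    "INPUT MAY BE UNSTRUCTURED: before deriving anything, mentally organize the "
    "description into four buckets — technical problem, proposed mechanism, "
    "novel element, validation — then derive structured output from that framing."
)

def build_derivation_prompt(task_instructions: str, description: str) -> str:
    """Same preamble prepended to both the CPC-derivation and
    Claim-1-synthesis prompts; no extra API call needed."""
    return f"{UNSTRUCTURED_PREAMBLE}\n\n{task_instructions}\n\n{description}"
```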
v0.5.0 — 2026-04-30
Unfiled Idea intake mode
A second intake mode on the landscape-scan form: paste the description of an unfiled invention idea instead of granted-patent numbers. The engine treats the upload as a "virtual patent" and runs the same L1 pipeline — landscape positioning, nearest-neighbor scoring, silent-region detection, candidate generation — against your idea instead of your filings.
How it works:
- Toggle at the top of the intake form: Analyze granted patents (default) / Analyze an unfiled idea
- For the unfiled-idea path: paste your invention description (max 60K chars). The engine calls Claude once to propose 3-5 CPC subclasses and parse-or-synthesize Claim 1 from the upload. Those become the synthetic record that feeds the standard L1 pipeline.
- Adds ~10 seconds (one extra Claude API call) to L1 runtime relative to the granted-patents path.
What you can use this for:
- Validating an early-stage invention idea against the patent landscape before you commit to drafting a provisional
- Stress-testing claim scope: see how close your idea sits to existing prior art in the same CPC subclasses
- Identifying the silent-region opportunities the engine surfaces when it analyzes your idea in landscape context
What you should NOT use this for: trade-secret material. The upload passes through Anthropic, Render, and Google APIs during analysis. We auto-delete your text from our systems after your run completes (per Terms §5.1 carve-out, shipped in v0.4.1), but third-party processing is outside our control. A red disclaimer above the upload field calls this out unmissably; users with strict end-to-end confidentiality requirements should ask about Tier 2 Private Deployment in their own VPC.
Auto-delete is mechanical, not aspirational: when the L1 run completes, the engine's cleanup pass deletes unfiled_upload.txt from session storage AND scrubs the _uploaded_text field from the synthetic patents.json record. The session's downstream artifacts (article, briefing, deliverables) are derived analysis — they stay. The raw upload doesn't.
v0.4.1 — 2026-04-30
Terms of Service
- §5.1 Carve-out for Unfiled Idea uploads added to the Terms of Service in preparation for the Unfiled Idea intake mode (shipping next). Uploaded text submitted through the Unfiled Idea intake is exempt from the perpetual training-license grant in §5: auto-deleted from our systems after your run completes, never retained as part of the training corpus, never used for engine training or model improvement. Includes a candid caveat about third-party API processing (Anthropic, Google Cloud) and a pointer to Tier 2 Private Deployment for users who require strict end-to-end confidentiality on pre-filing IP.
v0.4.0 — 2026-04-30
A commercial-readiness pass driven by deep-research reviews of the engine's output by Gemini and Perplexity. Per the Perplexity verdict on this version: "This version is commercially ready for a pilot with an IP firm or in-house counsel." Per Gemini: "the Patents build is ready to be put in front of AmLaw 100 IP partners today."
Output
- Strategic-leap explainability — new "STRATEGIC CHOICES" section in differentiation memos and continuation memos. When the engine pivots claim form (e.g., from a wet-lab embodiment to a computer-implemented method to capture §101 Desjardins eligibility), or recommends narrowing scope when the landscape suggested broader, the memo now explains the rationale to the reviewing attorney with explicit one-line reasons per choice. Previously these strategic moves were silent — risking dismissal as model hallucination by attorneys reading the draft cold. Validation: Gemini ("legitimately dangerous tool in the hands of a good patent attorney"), Perplexity ("the difference between a tool that produces conclusions and a tool that produces reasoning").
- Priority-date trap warnings — explicit lead in continuation memos. When the engine recommends adding modern terminology (e.g., lipid nanoparticles, base editing, Cas12/13) to a continuation that claims priority back to a 2013-era specification, it now leads the recommendation with a `PRIORITY-DATE WARNING` callout flagging the spec-support risk. Adding scope not in the original disclosure forfeits the original priority date for those claims, exposing them to a decade of intervening prior art. Previously this risk was buried in qualifying clauses; now it's the lead. Validation: Gemini ("the exact advice a $1,000/hour IP partner would give"), Perplexity ("specificity is what makes the warning actionable rather than decorative").
- Cover-note salience — must-read warnings now prominent. Lexicon and art-unit routing warnings (which determine whether a filing routes to Art Unit 3200 Medical Devices and gets evaluated under FDA-clearance contexts the invention can't supply) now appear in a high-visibility callout block at the top of every provisional draft. Previously these critical-to-counsel items were rendered in 10pt italic at the end of the cover page, where they were skippable. Now they're bold, larger font, and visually distinct.
Pricing display
- Removed legacy per-deliverable pricing references from the landscape briefing output (`render_arc.py`, `export_report.py`). All Layer 2 deliverables — provisionals + continuation strategy memos — are included in your active 30-day access window; the engine no longer advertises stale `$15-per-provisional` / `$1-per-memo` language that misrepresents the current pricing model.
- Layer 2 page UI replaces "Checkout" with "Generate" for users with an active access window. The page hides per-deliverable pricing and surfaces an "Included in your run window" indicator instead.
Terms of Service + Privacy Policy
- Updated to reflect the $85 / 30-day access window pricing. Sections 3, 4, 8, 9, 12 of Terms updated; Sections 1, 9 of Privacy Policy updated. No changes to legal substance — license grant scope, indemnity, limitation of liability, governing law all preserved verbatim.
v0.3.0 — 2026-04-29
Pricing model
- Single-fee 30-day access window replaces per-deliverable charges. $85 unlocks a 30-day window during which you can submit any number of Layer 1 landscape briefings and any number of Layer 2 file-ready deliverables for any portfolio you have the right to analyze. The $85 covers running the engine on a higher-tier compute environment for the duration; no per-provisional or per-memo charges inside the window.
Engine
- Per-session corpus pull — when a user submits patents for an L1 run, the engine pulls a fresh BigQuery corpus filtered to the user's CPC subclasses, scoring user patents against patents actually relevant to their portfolio rather than a generic LEF-tuned static corpus. Adds 5-10 min to L1 runtime; cost-capped per session.
Three-tier deployment
- Tier 0 (Try-me) — current `$85 / 30-day access window` model on the public site
- Tier 1 (Per-seat / Per-matter) — credentialed access to the same engine, $50K-$120K/year per seat for IP boutiques + in-house pharma IP teams. Available on contract.
- Tier 2 (Private Deployment) — single-tenant Docker deployment in client's own VPC for institutional buyers (top-tier law firms, pharma IP departments, IP intelligence aggregators) facing the 2026 regulatory wall (EU DORA, AI Act, attorney confidentiality). $250K+/year. Customer engineering team deploys per a runbook that ships with the engine.
v0.2.0 — 2026-04-22
- Two-layer landscape + deliverables architecture — Layer 1 produces the structural landscape briefing (`.docx`), Layer 2 produces file-ready provisionals + continuation strategy memos. Layer 2 deliverables are previewable before commitment per the grounding principle ("show the product, not the pitch").
- Cluster-discovery agent + auto-promote queue for the local-runner publish track. Twice-weekly cycles surface new patent clusters from BigQuery; each cluster runs 1-2 cycles then goes dormant.
v0.1.0 — 2026-04-12
Initial deployment. Single-page intake → BigQuery patent fetch → concept graph → cluster scan → silent-region detection → provisional draft + continuation memo generation. Domain-Wide Delegation impersonation for Drive uploads. Verify-on-demand Stripe checkout (no webhooks). FastAPI + Jinja2 inline.