Frontend Layer — Next.js 15
→
IDE-Styled Dashboard
Task selection UI
React SPA
Task Router
JSON POST dispatch
Client logic
Inference Report Pane
Micro-state per tool
Visualizer
JSON POST · HTTP/1.1
Structured JSON · Response
API Layer — FastAPI
FastAPI Gateway
REST endpoints
Entry point
→
Pydantic Validation
Schema enforcement
Type-safe
→
Recursive Chunker
1024-token window management
Middleware
Optimised chunks · Pipeline dispatch
Neural Logic Layer — HuggingFace Transformers
Pipeline Controller
pipelines.py · schemas.py
Dispatcher
→ routes to →
Summarizer
facebook/bart-large-cnn
BART
Sentiment
cardiffnlp/twitter-roberta
RoBERTa
Zero-Shot
facebook/bart-large-mnli
BART
NER
dslim/bert-base-NER
BERT
Q&A
deepset/roberta-base-squad2
RoBERTa
Auto-device detection · CUDA / CPU
Compute Hardware
NVIDIA CUDA Core
GeForce RTX 2050 · 4GB GDDR6 · cuda:0
High-perf inference
System CPU
AMD Ryzen 5 7535HS · Dev fallback
Fallback
01
// Decoupled Architecture
UI and ML layers are independently scalable. GPU-intensive backend can scale without touching the frontend.
02
// Recursive Chunking
Specialised middleware handles token-window constraints — BART's 1024-token limit is managed transparently.
03
// Auto-Device Detection
Backend selects cuda:0 for production inference, falls back gracefully to CPU in development environments.
04
// Micro-State Management
Next.js manages tool-specific state independently — Summarizer and NER reports never collide.