Frontend (items 1-10):
- WebSocket streaming integration with useWebSocket hook
- Admin Dashboard UI (status, voices, agents, governance tabs)
- Voice playback UI (TTS/STT integration)
- Settings/Preferences page (conversation style, sliders)
- Responsive/mobile layout (breakpoints at 480px, 768px)
- Dark/light theme with CSS variables and localStorage
- Error handling & loading states (retry, empty state, disabled input)
- Authentication UI (login page, Bearer token, logout)
- Head visualization improvements (active/speaking states, animations)
- Consequence/Ethics dashboard (lessons, consequences, insights tabs)
Backend stubs (items 11-21):
- Tool connectors: DocsConnector (text/md/PDF), DBConnector (SQLite/Postgres), CodeRunnerConnector (Python/JS/Bash/Ruby sandboxed)
- STT adapter: WhisperSTTAdapter, AzureSTTAdapter
- Multi-modal interface adapters: Visual, Haptic, Gesture, Biometric
- SSE streaming endpoint (/v1/sessions/{id}/stream/sse)
- Multi-tenant support (X-Tenant-ID header, tenant CRUD)
- Plugin marketplace/registry (register, install, list)
- Backup/restore endpoints
- Versioned API negotiation (Accept-Version header, deprecation)
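The Accept-Version negotiation item could be sketched roughly as follows. This is an illustrative sketch only: the function name, supported-version list, and the `Deprecation` response header are assumptions, not the actual implementation.

```python
from __future__ import annotations

# Assumed version set; the real service may support different versions.
SUPPORTED_VERSIONS = ("1", "2")
DEPRECATED_VERSIONS = {"1"}


def negotiate_version(accept_version: str | None) -> tuple[str, dict[str, str]]:
    """Pick an API version from an Accept-Version header value.

    Returns the resolved version plus any response headers to attach,
    e.g. a deprecation signal for versions scheduled for removal.
    Defaults to the newest supported version when no header is sent.
    """
    version = (accept_version or SUPPORTED_VERSIONS[-1]).strip()
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"Unsupported API version: {version!r}")
    headers: dict[str, str] = {}
    if version in DEPRECATED_VERSIONS:
        headers["Deprecation"] = "true"  # RFC 8594-style deprecation hint
    return version, headers
```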
Infrastructure (items 22-26):
- docker-compose.yml (API + Postgres + Redis + frontend)
- .env.example with all configurable vars
- gunicorn.conf.py production ASGI config
- Prometheus metrics collector and /metrics endpoint
- Structured JSON logging configuration
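For the structured JSON logging item, a minimal stdlib-only formatter might look like the sketch below; the field names and logger name are illustrative assumptions, not the project's actual configuration.

```python
import json
import logging


class JSONFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


# Wire the formatter onto a stream handler (logger name is hypothetical).
handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
example_logger = logging.getLogger("fusionagi.example")
example_logger.addHandler(handler)
example_logger.setLevel(logging.INFO)
```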
Documentation (items 27-29):
- Architecture docs with module layout and subsystem descriptions
- Quickstart guide with setup, API tour, and test instructions
Tests (items 30-32):
- Integration tests: 25 end-to-end API tests
- Frontend tests: 10 Vitest tests for hooks (useTheme, useAuth)
- Load/performance tests: latency and throughput benchmarks
- Connector tests: 16 tests for Docs, DB, CodeRunner
- Multi-modal adapter tests: 9 tests
- Metrics collector tests: 5 tests
- STT adapter tests: 2 tests
511 Python tests passing, 10 frontend tests passing, 0 ruff errors.
Co-Authored-By: Nakamoto, S <defi@defi-oracle.io>
76 lines
2.6 KiB
Python
"""SSE streaming endpoint for token-by-token LLM responses."""
|
|
|
|
from __future__ import annotations
|
|
|
|
import asyncio
|
|
import json
|
|
import uuid
|
|
from typing import Any
|
|
|
|
from fastapi import APIRouter
|
|
from fastapi.responses import StreamingResponse
|
|
|
|
from fusionagi._logger import logger
|
|
from fusionagi.api.dependencies import get_orchestrator
|
|
|
|
router = APIRouter()
|
|
|
|
|
|
async def _sse_generator(session_id: str, prompt: str) -> Any:
|
|
"""Generate SSE events for a streaming prompt response."""
|
|
event_id = str(uuid.uuid4())[:8]
|
|
|
|
yield f"event: start\ndata: {json.dumps({'session_id': session_id, 'event_id': event_id})}\n\n"
|
|
|
|
orch = get_orchestrator()
|
|
if orch is None:
|
|
yield f"event: error\ndata: {json.dumps({'error': 'Orchestrator not available'})}\n\n"
|
|
return
|
|
|
|
try:
|
|
yield f"event: heads_running\ndata: {json.dumps({'heads': ['logic', 'creativity', 'research', 'safety']})}\n\n"
|
|
|
|
from fusionagi.schemas.task import Task
|
|
task = Task(task_id=f"stream_{event_id}", prompt=prompt)
|
|
result = orch.run(task)
|
|
|
|
if result and hasattr(result, "final_answer"):
|
|
answer = result.final_answer or ""
|
|
# Stream token-by-token (simulate chunked response)
|
|
words = answer.split()
|
|
for i, word in enumerate(words):
|
|
chunk = word + (" " if i < len(words) - 1 else "")
|
|
yield f"event: token\ndata: {json.dumps({'token': chunk, 'index': i})}\n\n"
|
|
await asyncio.sleep(0.02)
|
|
|
|
yield f"event: complete\ndata: {json.dumps({'session_id': session_id, 'full_text': answer})}\n\n"
|
|
else:
|
|
yield f"event: complete\ndata: {json.dumps({'session_id': session_id, 'full_text': ''})}\n\n"
|
|
|
|
except Exception as e:
|
|
logger.error("SSE streaming error", extra={"error": str(e), "session_id": session_id})
|
|
yield f"event: error\ndata: {json.dumps({'error': str(e)})}\n\n"
|
|
|
|
|
|
@router.post("/sessions/{session_id}/stream/sse")
|
|
async def stream_sse(session_id: str, body: dict[str, Any]) -> StreamingResponse:
|
|
"""Stream a prompt response as Server-Sent Events.
|
|
|
|
Events emitted:
|
|
- ``start``: Stream began
|
|
- ``heads_running``: Which heads are processing
|
|
- ``token``: Individual response token
|
|
- ``complete``: Final response with full text
|
|
- ``error``: Error occurred
|
|
"""
|
|
prompt = body.get("prompt", "")
|
|
return StreamingResponse(
|
|
_sse_generator(session_id, prompt),
|
|
media_type="text/event-stream",
|
|
headers={
|
|
"Cache-Control": "no-cache",
|
|
"Connection": "keep-alive",
|
|
"X-Accel-Buffering": "no",
|
|
},
|
|
)
|
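A client consuming this endpoint splits the `text/event-stream` body on blank lines and reads each block's `event:` and `data:` fields. A minimal parser sketch (not part of the codebase; buffering of a live HTTP stream is omitted):

```python
import json
from typing import Any


def parse_sse(raw: str) -> list[tuple[str, Any]]:
    """Parse a text/event-stream body into (event, data) pairs.

    Per the SSE format, events default to "message" when no
    ``event:`` field is present; ``data:`` payloads here are JSON,
    matching what the endpoint above emits.
    """
    events: list[tuple[str, Any]] = []
    for block in raw.strip().split("\n\n"):
        event, data = "message", None
        for line in block.splitlines():
            if line.startswith("event: "):
                event = line[len("event: "):]
            elif line.startswith("data: "):
                data = json.loads(line[len("data: "):])
        events.append((event, data))
    return events
```

Against the endpoint above, a stream would parse into a `start` event, a `heads_running` event, a series of `token` events, and a final `complete` event carrying the full text.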