Debug Log: Upload 500 Error — Phase 1 Frontend
Date: 2026-04-23
Issue: Document upload (DOCX/PDF) via frontend returns "Request failed with status code 500"
Status: ✅ Resolved
Symptoms
- Uploading `NEC4 ACC.docx` → HTTP 500: `DOCX library is not installed`
- Uploading `NEC4 ACC.pdf` → HTTP 500: `'function' object has no attribute 'name'`
- Query endpoint also failing with the same `.name` error
Root Cause Analysis
Environment: Backend was running on global Anaconda Python 3.13 with packages that did NOT match requirements.txt.
| Package | requirements.txt | Actually Installed | Impact |
|---|---|---|---|
| python-docx | 1.1.0 | Missing | DOCX parsing fails |
| chromadb | 0.4.22 | 1.5.8 | API mismatch — embedding function signature changed |
| numpy | (transitive) | 2.4.4 | ChromaDB 0.4.22 uses np.float_ (removed in NumPy 2.0) |
| pytest | 8.0.0 | 8.0.0 | Conflicts with pytest-asyncio==0.23.4 (requires pytest<8) |
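
A mismatch like the one in the table above can be detected mechanically. The following is a minimal sketch, not part of the project: the helper names `parse_pins` and `find_drift` are hypothetical, and it assumes requirements.txt uses plain `name==version` pins. It relies only on the standard-library `importlib.metadata`.

```python
import re
from importlib import metadata

# Matches "name==version" and "name[extra]==version"; other specifiers are skipped.
_PIN = re.compile(r"([A-Za-z0-9_.\-]+)(?:\[[^\]]*\])?\s*==\s*([^\s;]+).*")


def parse_pins(requirements_text):
    """Return {package: pinned_version} for every exact pin in the text."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        match = _PIN.fullmatch(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def _installed_version(name):
    """Version of an installed distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_drift(pins, lookup=None):
    """Return {package: (pinned, installed_or_None)} for each mismatch."""
    lookup = lookup or _installed_version
    drift = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            drift[name] = (pinned, found)
    return drift
```

Running `find_drift(parse_pins(open("backend/requirements.txt").read()))` inside the interpreter that serves the backend would list every pinned package that is missing or at a different version. Transitive dependencies such as numpy are not pinned, so they would need an explicit entry to be covered.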
Fixes Applied
1. Created Project Venv (Python 3.11)
The pinned packages in requirements.txt require Python ≤3.11 (tiktoken, pydantic-core have no prebuilt wheels for 3.13).

    cd backend
    python3.11 -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt

Also fixed: pytest==8.0.0 → pytest==7.4.4 in requirements.txt (dependency conflict).
2. Fixed NumPy Compatibility
ChromaDB 0.4.22 references np.float_ which was removed in NumPy 2.0.

    pip install 'numpy<2'  # Downgraded to 1.26.4

3. Cleared Incompatible ChromaDB Database
The old `backend/chroma_db/` directory was created by ChromaDB 1.5.8 and is incompatible with the 0.4.22 schema.

    rm -rf backend/chroma_db

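
Deleting the directory was fine here because the documents could be re-uploaded. Where the stored data matters, a more cautious variant is to move the directory aside instead of removing it. The sketch below is an illustration only; `archive_store` is a hypothetical helper, not something in the project.

```python
import shutil
import time
from pathlib import Path


def archive_store(store_dir, suffix=None):
    """Rename store_dir to '<name>.bak-<suffix>'; return the new path, or None if absent."""
    store = Path(store_dir)
    if not store.exists():
        return None
    # Default suffix is a timestamp so repeated archives do not collide.
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    target = store.with_name(f"{store.name}.bak-{suffix}")
    shutil.move(str(store), str(target))
    return target
```

Calling `archive_store("backend/chroma_db")` before starting the backend leaves an empty slot for a fresh database while keeping the old files for inspection.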
4. Fixed Embedding Function Wrapper
ChromaDB 0.4.22 validates embedding function signatures against the `EmbeddingFunction` protocol (`__call__(self, input)`). The original code passed a plain function which:
- Failed signature validation (`'function' object has no attribute 'name'`)
- Used `asyncio.run()`, which cannot be called inside a running event loop
File: backend/app/core/database.py
Before:

    def get_embedding_function_settings(settings):
        def _wrap(texts: list[str]) -> list[list[float]]:
            return asyncio.run(client.embed(texts))
        return _wrap

After:

    class _EmbeddingFunctionWrapper:
        def __init__(self, settings):
            self.settings = settings

        def __call__(self, input):
            from concurrent.futures import ThreadPoolExecutor

            def _run_in_thread(texts):
                # Fresh client and event loop per call, owned by the worker thread.
                client = EmbeddingClient(self.settings)
                loop = asyncio.new_event_loop()
                asyncio.set_event_loop(loop)
                try:
                    return loop.run_until_complete(client.embed(texts))
                finally:
                    loop.close()

            # Run in a separate thread so the call also works when the caller
            # is already inside a running event loop.
            with ThreadPoolExecutor(max_workers=1) as executor:
                return executor.submit(_run_in_thread, input).result()

Key changes:
- Class-based wrapper with `__call__(self, input)` signature matching protocol
- Thread pool isolation for async event loop (avoids `asyncio.run()` inside running loop)
- Per-call `EmbeddingClient` instance + fresh event loop in thread
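
The thread-isolation idea can be reproduced without ChromaDB. The sketch below is a stand-alone illustration under stated assumptions: `fake_embed` is a hypothetical stand-in for `EmbeddingClient.embed`, and the worker thread uses `asyncio.run()` rather than the manual `new_event_loop()` setup shown above, which is a simplification and not the project's code.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def fake_embed(texts):
    """Hypothetical async embedder: returns one 1-element vector per text."""
    await asyncio.sleep(0)
    return [[float(len(t))] for t in texts]


def embed_blocking(texts):
    """Run fake_embed to completion from sync code, even under a running loop."""
    def _run_in_thread():
        # The worker thread has no running loop, so asyncio.run() is allowed here.
        return asyncio.run(fake_embed(texts))

    with ThreadPoolExecutor(max_workers=1) as executor:
        return executor.submit(_run_in_thread).result()


async def handler():
    """Simulates an async request handler that calls synchronous library code."""
    direct_error = None
    coro = fake_embed(["a"])
    try:
        asyncio.run(coro)  # fails: this thread already has a running loop
    except RuntimeError as exc:
        direct_error = str(exc)
        coro.close()  # avoid a "never awaited" warning for the unused coroutine
    return direct_error, embed_blocking(["ab", "abc"])


if __name__ == "__main__":
    print(asyncio.run(handler()))
```

One trade-off to keep in mind: `.result()` blocks the calling thread until the embedding finishes, so when the caller is the event loop thread, other requests on that loop wait as well.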
Files Changed
| File | Change |
|---|---|
| `backend/requirements.txt` | pytest==7.4.4 (was 8.0.0) |
| `backend/app/core/database.py` | Added `_EmbeddingFunctionWrapper` class |
| `backend/.venv/` | New Python 3.11 venv (gitignored) |
| `backend/chroma_db/` | Cleared and recreated |
Verification
| Test | Result |
|---|---|
| DOCX upload (NEC4 ACC.docx, 315KB) | ✅ HTTP 200, 1 chunk |
| PDF upload (NEC4 ACC.pdf) | ✅ HTTP 200, 101 chunks |
| Query endpoint ("What is NEC4 ACC?") | ✅ HTTP 200, keywords + bullet answer + sources |
Prevention
- Always use the venv: `source backend/.venv/bin/activate` before running the backend
- Never run the backend in the global env: package versions drift silently
- Clear `chroma_db/` when upgrading ChromaDB: schema is not forward-compatible
- Pin Python version: add `python_requires=">=3.9,<3.12"` to project config
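
The first, second, and fourth rules above can also be enforced at startup. The following is a rough sketch rather than project code: the function names are hypothetical, and the version bounds simply mirror the `>=3.9,<3.12` range suggested above. It uses the standard check that `sys.prefix` differs from `sys.base_prefix` inside a venv.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """True when the interpreter runs inside a venv (prefix differs from base_prefix)."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


def python_supported(version_info=None, minimum=(3, 9), below=(3, 12)):
    """True when the major.minor version satisfies minimum <= version < below."""
    info = sys.version_info if version_info is None else version_info
    return minimum <= tuple(info[:2]) < below


def environment_problems(prefix=None, base_prefix=None, version_info=None):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not in_virtualenv(prefix, base_prefix):
        problems.append("not running inside a virtual environment")
    if not python_supported(version_info):
        problems.append("Python version outside >=3.9,<3.12")
    return problems
```

A backend entry point could call `environment_problems()` and refuse to start (or log a warning) when the list is non-empty, which would have flagged the global Anaconda 3.13 interpreter immediately.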
Related
- Frontend issue was a red herring — the frontend correctly displayed the 500 error. The actual bugs were all backend-side.
- Query endpoint also affected because it shares the same `RAGService` → `get_chroma_client()` → embedding function path.