chore: add pnpm lockfiles, Phase 4 plan, and dev plan status update

2026-05-14 20:26:17 +08:00 · 2026-05-14 20:26:17 +08:00 · 64a7a8a46b
parent 2501a2c3c0
commit 64a7a8a46b
7 changed files with 5038 additions and 4 deletions
--- a/.plans/phase4_system_audio_plan.md
+++ b/.plans/phase4_system_audio_plan.md
@@ -0,0 +1,453 @@
+# Phase 4: System Audio Capture → ASR → RAG — Implementation Plan
+
+**Created:** 2026-05-09
+**Updated:** 2026-05-09
+**Status:** 📋 Draft (Not Started)
+**Depends on:** Phase 1 (Complete), Phase 2 (Complete), Phase 3 (Complete)
+
+---
+
+## 1. Overview
+
+Phase 4 adds **system audio capture** as a third audio source on the LTTPage, alongside file upload and YouTube. Instead of playing a video inside the app, the user captures the audio output of any application on their computer (a browser tab, Spotify, Zoom, system sounds) and pipes it through the existing ASR → RAG pipeline.
+
+**Use cases:**
+- Watching a YouTube video in a regular browser tab (no proxy needed — just share that tab's audio)
+- Listening to a podcast, lecture, or meeting and getting real-time transcript + RAG
+- Transcribing any audio playing on the computer without needing to download files
+
+### How It Works
+
+```
+User clicks "System Audio" → clicks "Start Capture"
+  → Browser shows permission dialog (screen/tab picker)
+  → User selects tab/window/screen (with audio)
+  → getDisplayMedia() returns MediaStream (with audio track)
+  → AudioContext.createMediaStreamSource(stream)
+  → ScriptProcessorNode (Float32 PCM, mono 16kHz)
+  → WebSocket → FastAPI → DashScope realtime ASR
+  → transcript → QueryInput → RAG Pipeline
+```
+
+### Audio Routing (vs Existing Sources)
+
+| Source | Audio Input | SourceNode Type | Start/Stop Trigger |
+|--------|-------------|-----------------|-------------------|
+| Upload | `<video>` element | `createMediaElementSource` | play/pause events |
+| YouTube | `<audio>` element | `createMediaElementSource` | play/pause events on `<video>` |
+| **System Audio** | MediaStream from `getDisplayMedia()` | `createMediaStreamSource` | Manual Start/Stop button + track ended event |
+
+### Why a New Hook (Instead of Reusing the Existing Ones)
+
+The existing `useVideoASR` and `useYouTubeASR` hooks depend on HTML media elements (`<video>`, `<audio>`) for both the audio source and play/pause lifecycle. System audio capture uses a **MediaStream** object (no DOM element), and its lifecycle is controlled by user permission (grant/revoke) and manual start/stop, not DOM events. A new hook is architecturally cleaner than overloading the existing ones with branching logic.
+
+---
+
+## 2. User Flow
+
+1. User selects **"System Audio"** tab (third option alongside Upload / YouTube)
+2. UI shows a **"Start Capture"** button with browser compatibility info
+3. User clicks **"Start Capture"**
+4. Browser opens **permission dialog** (screen/tab picker)
+   - User selects a browser tab (e.g., "YouTube — Live Stream") or "Entire Screen"
+   - User checks "Share audio" if available
+5. On approval: capture starts — status indicator shows "Capturing" with a live audio level meter
+6. Real-time ASR transcription flows into **QueryInput** (same as Upload/YouTube)
+7. User can **edit the transcript** while capture continues
+8. User clicks **"Stop Capture"** to end — transcript stays in QueryInput
+9. User submits query → RAG pipeline processes it
+10. **"Full Transcript" button hidden** (streaming ASR only, same as YouTube)
+
+### Permission Denied Flow
+
+1. If user clicks "Cancel" in permission dialog → error state: "Permission denied — system audio capture requires your explicit permission"
+2. If user revokes permission (Chrome "Stop sharing") → capture stops gracefully, status: "Capture stopped"
+3. If no audio track in the stream → error: "No audio track found in the shared content"
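The failure cases above could be mapped to user-facing messages along these lines. This is a minimal sketch, not code from the repository: `CaptureFailure`, `describeCaptureFailure`, and `checkAudioTrackCount` are hypothetical names, and the messages paraphrase the ones in this plan. The one external fact it relies on is that `getDisplayMedia()` rejects with an exception named `NotAllowedError` when the user cancels the picker.

```typescript
// Hypothetical helper types and names; messages paraphrase the plan's wording.
type CaptureFailure =
  | { kind: 'permission-denied'; message: string }
  | { kind: 'no-audio-track'; message: string }
  | { kind: 'unknown'; message: string }

// `errorName` is the `name` property of the exception thrown by getDisplayMedia().
function describeCaptureFailure(errorName: string): CaptureFailure {
  if (errorName === 'NotAllowedError') {
    return {
      kind: 'permission-denied',
      message: 'Permission denied: system audio capture requires your explicit permission',
    }
  }
  return { kind: 'unknown', message: `Capture failed (${errorName})` }
}

// A granted stream can still lack audio (for example, "Share audio" left unchecked).
function checkAudioTrackCount(audioTrackCount: number): CaptureFailure | null {
  return audioTrackCount > 0
    ? null
    : { kind: 'no-audio-track', message: 'No audio track found in the shared content' }
}
```

Keeping the mapping in a pure function would let the hook tests cover every message without mocking the browser picker.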
+
+---
+
+## 3. Architecture
+
+### 3.1 Component Tree (LTTPage — System Audio Mode)
+
+```
+LTTPage
+├── SourceSelector (tabs: Upload | YouTube | System Audio)
+├── [source === 'system-audio']
+│   ├── SystemAudioCapture
+│   │   ├── Start/Stop button
+│   │   ├── Status indicator (idle | requesting | capturing | error)
+│   │   ├── Audio level meter (optional, nice-to-have)
+│   │   └── Browser compatibility note (non-Chrome users)
+│   └── (no video player — audio-only capture)
+├── QueryInput (receives transcript from useSystemAudioASR)
+├── ExtractedQuestionsDisplay
+└── RAG Response Panel
+```
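For the optional audio level meter in the tree above, one common approach is to compute the root-mean-square level of each PCM frame. The sketch below is an assumption about how the meter might be driven; `rmsLevel` and `smoothLevel` are hypothetical names, and the smoothing factor is arbitrary.

```typescript
// Root-mean-square level of one audio frame, clamped to [0, 1].
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) return 0
  let sumOfSquares = 0
  for (let i = 0; i < samples.length; i++) {
    sumOfSquares += samples[i] * samples[i]
  }
  return Math.min(1, Math.sqrt(sumOfSquares / samples.length))
}

// Optional exponential smoothing so the meter does not flicker between frames.
// `alpha` close to 1 follows the signal quickly; close to 0 reacts slowly.
function smoothLevel(previous: number, current: number, alpha = 0.3): number {
  return previous + alpha * (current - previous)
}
```

The same frame that is converted to PCM for the WebSocket could be passed to `rmsLevel`, so the meter needs no extra audio node.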
+
+### 3.2 Data Flow
+
+```
+SystemAudioCapture (UI)
+  │
+  ├── "Start Capture" click → calls startCapture() from hook
+  │
+  ▼
+useSystemAudioASR hook
+  │
+  ├── getDisplayMedia({ video: true, audio: true, systemAudio: 'include' })
+  │     └── User picks tab/window → returns MediaStream
+  │
+  ├── AudioContext.createMediaStreamSource(stream)
+  │     └── MediaStreamAudioSourceNode
+  │
+  ├── ScriptProcessorNode (4096 buffer, mono 16kHz)
+  │     └── onaudioprocess: convert Float32 → Int16 PCM
+  │
+  ├── WebSocket → ws://host/ws/asr/{uuid}?language=yue
+  │     └── Sends binary PCM frames
+  │
+  └── Returns: { status, transcript, partialTranscript, error, startCapture, stopCapture }
+        │
+        ▼
+LTTPage unifies: const asr = source === 'system-audio' ? systemAudioASR : ...
+  │
+  ▼
+QueryInput receives asr.partialTranscript
+```
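The Float32 → Int16 step in the data flow above can be sketched as a small pure function. This is an illustrative version under the usual convention for 16-bit PCM, not the code in `useVideoASR.ts`; `floatTo16BitPCM` is a hypothetical name.

```typescript
// Convert Web Audio samples (floats, nominally in [-1, 1]) to signed 16-bit PCM.
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const output = new Int16Array(input.length)
  for (let i = 0; i < input.length; i++) {
    // Clamp first: samples can overshoot the nominal range slightly.
    const s = Math.max(-1, Math.min(1, input[i]))
    // Asymmetric scaling so -1 maps to -32768 and +1 maps to 32767.
    // Storing into an Int16Array truncates any fractional part toward zero.
    output[i] = s < 0 ? s * 0x8000 : s * 0x7fff
  }
  return output
}
```

Inside `onaudioprocess`, the resulting array's `buffer` is what would be sent as a binary WebSocket frame.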
+
+### 3.3 Backend Changes
+
+**Minimal.** The existing WebSocket ASR endpoint (`ws_asr.py`) already accepts audio from any source. The only addition is handling a **UUID-based `video_id`** for system audio sessions (no real video file).
+
+| Change | File | Description |
+|--------|------|-------------|
+| Allow UUID video_id | `backend/app/routers/ws_asr.py` | Accept non-file-based video IDs (already accepts any string) |
+| Transcript persistence | `backend/app/services/history_service.py` | Store system audio transcripts with UUID session ID (optional — nice-to-have) |
+| Config | `backend/app/core/config.py` | Add `SYSTEM_AUDIO_ENABLED` toggle (default: true) |
+
+**No changes needed to:**
+- DashScope ASR client (receives PCM, doesn't care about source)
+- WebSocket protocol (same binary PCM format)
+- RAG pipeline (consumes transcript text)
+
+### 3.4 Frontend Files
+
+| File | Status | Description |
+|------|--------|-------------|
+| `frontend/src/hooks/useSystemAudioASR.ts` | **New** | Hook: getDisplayMedia → AudioContext → WebSocket |
+| `frontend/src/components/SystemAudioCapture.tsx` | **New** | UI: Start/Stop button, status, compatibility note |
+| `frontend/src/pages/LTTPage.tsx` | **Modified** | Add "System Audio" tab, wire hook, unify ASR |
+| `frontend/src/types/index.ts` | **Modified** | Add SystemAudioStatus type |
+| `frontend/src/components/SourceSelector.tsx` | **Refactor** | Extract source tabs into reusable component (optional — can inline in LTTPage) |
+
+---
+
+## 4. Sub-Phases
+
+| Sub-Phase | Description | Effort | Depends On | Status |
+|-----------|-------------|--------|------------|--------|
+| 4.1 | Config & Infrastructure | 0.5 day | — | 📋 Draft |
+| 4.2 | System Audio Capture Hook (`useSystemAudioASR`) | 1 day | 4.1 | 📋 Draft |
+| 4.3 | SystemAudioCapture UI Component | 0.5 day | 4.2 | 📋 Draft |
+| 4.4 | LTTPage Integration | 0.5 day | 4.2, 4.3 | 📋 Draft |
+| 4.5 | Backend Adjustments | 0.5 day | 4.1 | 📋 Draft |
+| 4.6 | Integration & Acceptance Tests | 1 day | 4.4, 4.5 | 📋 Draft |
+| 4.7 | Polish & Documentation | 0.5 day | 4.6 | 📋 Draft |
+| **Total** | | **4.5 days** | | |
+
+### Phase 4.1 — Config & Infrastructure (0.5 day)
+
+**Objective:** Add system audio feature toggle, define types, establish UUID generation.
+
+**Tasks:**
+1. Add `SYSTEM_AUDIO_ENABLED` to `backend/app/core/config.py` (default: `True`)
+2. Add `SystemAudioStatus` type to `frontend/src/types/index.ts`:
+   ```typescript
+   type SystemAudioStatus = 'idle' | 'requesting' | 'capturing' | 'stopping' | 'error'
+   ```
+3. Add `SystemAudioASRState` interface to types
+4. Add `video_id` UUID generation helper (frontend-side: `crypto.randomUUID()`)
+5. Verify WebSocket ASR endpoint accepts arbitrary `video_id` strings (it does — confirm with a quick test)
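One way to keep the `SystemAudioStatus` values consistent is an explicit transition table. The sketch below is an assumption about the intended lifecycle: the event names (`START`, `GRANTED`, and so on) are hypothetical and do not appear in this plan.

```typescript
type SystemAudioStatus = 'idle' | 'requesting' | 'capturing' | 'stopping' | 'error'

// Hypothetical event names, chosen for illustration only.
type CaptureEvent = 'START' | 'GRANTED' | 'FAILED' | 'STOP' | 'TRACK_ENDED' | 'STOPPED'

const transitions: Record<SystemAudioStatus, Partial<Record<CaptureEvent, SystemAudioStatus>>> = {
  idle: { START: 'requesting' },
  requesting: { GRANTED: 'capturing', FAILED: 'error' },
  // TRACK_ENDED covers the user clicking "Stop sharing" in the browser UI.
  capturing: { STOP: 'stopping', TRACK_ENDED: 'stopping', FAILED: 'error' },
  stopping: { STOPPED: 'idle' },
  // START from the error state corresponds to a "Try Again" button.
  error: { START: 'requesting' },
}

// Events that are not valid in the current state are ignored.
function nextStatus(current: SystemAudioStatus, event: CaptureEvent): SystemAudioStatus {
  return transitions[current][event] ?? current
}
```

A table like this makes the "multiple rapid start/stop cycles" case easy to test, since repeated `START` or `STOP` events leave the status unchanged.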
+
+**Test Files:** `backend/app/test/test_phase4_config.py`
+
+### Phase 4.2 — System Audio Capture Hook (1 day)
+
+**Objective:** Create `useSystemAudioASR.ts` hook that captures system audio and streams it to the ASR WebSocket.
+
+**Key Design:**
+```typescript
+interface UseSystemAudioASRProps {
+  wsUrl: string   // e.g., ws://localhost:8000/ws/asr/{uuid}?language=yue
+}
+
+interface UseSystemAudioASRReturn {
+  status: 'idle' | 'requesting' | 'capturing' | 'stopping' | 'error'
+  transcript: string
+  partialTranscript: string
+  error: string | null
+  startCapture: () => Promise<void>
+  stopCapture: () => void
+}
+```
+
+**Implementation Details:**
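+- `startCapture()`: calls `navigator.mediaDevices.getDisplayMedia({ video: true, audio: true, systemAudio: 'include' })`. Video must be requested because the API rejects `video: false` with a `TypeError`; only the audio track is used.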
+  - On success: creates AudioContext, `createMediaStreamSource(stream)`, connects ScriptProcessor → WebSocket
+  - On user cancel: sets status to `'error'`, sets error "Permission denied"
+  - On no audio track: sets status to `'error'`, sets error "No audio track found"
+- `stopCapture()`: stops all tracks in the MediaStream, closes AudioContext, closes WebSocket
+- Auto-stop: listens for `track.onended` (user clicks "Stop sharing" in Chrome) → calls stopCapture
+- Audio processing: identical to useVideoASR — `ScriptProcessorNode(4096)`, convert Float32 → Int16 PCM, send via WebSocket
+- WebSocket lifecycle: connect on capture start, close on capture stop
+- Cleanup: useEffect return closes AudioContext, WebSocket, and stops tracks
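Because `stopCapture()`, the `track.onended` handler, and the `useEffect` cleanup can all fire for the same session, the teardown is easier to reason about if it is idempotent. The following is a minimal sketch under that assumption; `CaptureResources` and `releaseCapture` are hypothetical names, and the interfaces are narrowed stand-ins for `MediaStreamTrack`, `AudioContext`, and `WebSocket`.

```typescript
// Narrow stand-ins for the browser objects, so the helper is testable without a DOM.
interface StoppableTrack { stop(): void }
interface Closable { close(): unknown }

interface CaptureResources {
  tracks: StoppableTrack[]
  audioContext: Closable | null
  socket: Closable | null
}

function releaseCapture(resources: CaptureResources): void {
  // Stop tracks first so the browser's sharing indicator goes away promptly.
  for (let i = 0; i < resources.tracks.length; i++) {
    resources.tracks[i].stop()
  }
  resources.tracks = []
  // Clear each handle after closing it, which makes a second call a no-op.
  if (resources.audioContext) {
    resources.audioContext.close()
    resources.audioContext = null
  }
  if (resources.socket) {
    resources.socket.close()
    resources.socket = null
  }
}
```

In the hook, the resources object would typically live in a ref so that every code path releases the same handles.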
+
+**Pattern to Follow:**
+- AudioContext setup: follow `useVideoASR.ts` lines 45-143 (AudioContext, ScriptProcessor, sample rate conversion)
+- WebSocket handling: follow `useYouTubeASR.ts` lines 35-100
+- State management: combine patterns from both hooks, adapting for MediaStream source
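The sample rate conversion mentioned above could look roughly like this. It is a sketch under stated assumptions, not the code in `useVideoASR.ts`: it uses plain linear interpolation with no anti-aliasing filter, and `downsample` is a hypothetical name.

```typescript
// Resample from the AudioContext rate (often 44100 or 48000 Hz) to the ASR rate.
// Linear interpolation only; a production version may want a low-pass filter first.
function downsample(input: Float32Array, inputRate: number, outputRate = 16000): Float32Array {
  // This sketch does not upsample; equal or lower input rates pass through unchanged.
  if (outputRate >= inputRate) return input
  const ratio = inputRate / outputRate
  const outputLength = Math.floor(input.length / ratio)
  const output = new Float32Array(outputLength)
  for (let i = 0; i < outputLength; i++) {
    const pos = i * ratio
    const left = Math.floor(pos)
    const right = Math.min(left + 1, input.length - 1)
    const frac = pos - left
    // Weighted average of the two nearest input samples.
    output[i] = input[left] * (1 - frac) + input[right] * frac
  }
  return output
}
```

With a 4096-sample buffer at 48 kHz, each call would yield 1365 samples at 16 kHz.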
+
+**Test Files:** `frontend/src/test/test_phase4_useSystemAudioASR.test.ts`
+
+### Phase 4.3 — SystemAudioCapture UI Component (0.5 day)
+
+**Objective:** Create the `SystemAudioCapture.tsx` component with Start/Stop button, status display, and browser compatibility info.
+
+**Component Props:**
+```typescript
+interface SystemAudioCaptureProps {
+  status: SystemAudioStatus
+  error: string | null
+  onStart: () => void
+  onStop: () => void
+}
+```
+
+**UI States:**
+1. **Idle**: "Start Capture" button (blue, prominent) + compatibility note
+2. **Requesting**: "Waiting for permission..." (loading spinner)
+3. **Capturing**: "Stop Capture" button (red) + pulsing green dot + "Capturing system audio..."
+4. **Error**: Red banner with error message + "Try Again" button
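The UI states above can be derived from `status` and `error` with a pure mapping, which keeps the component itself thin. This is a minimal sketch: `CaptureView` and `viewFor` are hypothetical names, and the `'stopping'` row is an assumption, since the plan lists no UI for that status.

```typescript
type SystemAudioStatus = 'idle' | 'requesting' | 'capturing' | 'stopping' | 'error'

interface CaptureView {
  buttonLabel: string | null // null means no button is shown
  message: string
  showSpinner: boolean
}

function viewFor(status: SystemAudioStatus, error: string | null): CaptureView {
  switch (status) {
    case 'idle':
      return { buttonLabel: 'Start Capture', message: '', showSpinner: false }
    case 'requesting':
      return { buttonLabel: null, message: 'Waiting for permission...', showSpinner: true }
    case 'capturing':
      return { buttonLabel: 'Stop Capture', message: 'Capturing system audio...', showSpinner: false }
    case 'stopping':
      // Assumption: the plan does not define a UI for this transient state.
      return { buttonLabel: null, message: 'Stopping capture...', showSpinner: true }
    case 'error':
      return { buttonLabel: 'Try Again', message: error ?? 'Unknown error', showSpinner: false }
  }
}
```

The component tests for "all UI states render correctly" could then assert on this mapping directly.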
+
+**Browser Compatibility Note:**
+```
+⚠️ System audio capture works best in Chrome/Edge on Windows/macOS.
+Firefox and Safari do not support this feature.
+On Linux, only tab audio is available (not full system audio).
+```
+
+**Test Files:** `frontend/src/test/test_phase4_SystemAudioCapture.test.tsx`
+
+### Phase 4.4 — LTTPage Integration (0.5 day)
+
+**Objective:** Wire the System Audio source into LTTPage, adding it as the third tab alongside Upload and YouTube.
+
+**Changes to `LTTPage.tsx`:**
+1. Extend `SourceType` from `'upload' | 'youtube'` to `'upload' | 'youtube' | 'system-audio'`
+2. Add third tab button (icon: `AudioLines` from lucide-react) in the source selector
+3. Initialize `useSystemAudioASR` hook with a UUID-based WebSocket URL
+4. Update `asr` variable:
+   ```typescript
+   const asr = source === 'youtube' ? youtubeASR 
+     : source === 'system-audio' ? systemAudioASR 
+     : uploadASR
+   ```
+5. Conditional rendering:
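+   ```tsx
+   {source === 'upload' && <VideoUploader />}
+   {source === 'youtube' && <YouTubeMode />}
+   {/* Props follow SystemAudioCaptureProps from Phase 4.3 */}
+   {source === 'system-audio' && (
+     <SystemAudioCapture status={systemAudioASR.status} error={systemAudioASR.error} onStart={systemAudioASR.startCapture} onStop={systemAudioASR.stopCapture} />
+   )}
+   ```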
+6. WebSocket URL: `ws://host/ws/asr/{crypto.randomUUID()}?language=yue`
+7. Full Transcript button: hidden for system-audio (same as YouTube)
+8. QueryInput: remains editable during capture (same behavior as other sources)
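Calling `crypto.randomUUID()` inline while building the URL would produce a new session ID on every render, so the ID probably needs to be generated once per capture session. The sketch below illustrates one way to do that; `buildAsrWsUrl` and `createSessionId` are hypothetical names, and the ID generator is injected so the helper does not depend on a browser global.

```typescript
// Build the ASR WebSocket URL in the shape used throughout this plan.
function buildAsrWsUrl(host: string, sessionId: string, language = 'yue', secure = false): string {
  const scheme = secure ? 'wss' : 'ws'
  return `${scheme}://${host}/ws/asr/${encodeURIComponent(sessionId)}?language=${encodeURIComponent(language)}`
}

// Generate the session ID lazily and reuse it until reset() is called.
// In the browser, `generate` would be `() => crypto.randomUUID()`.
function createSessionId(generate: () => string) {
  let current: string | null = null
  return {
    get: (): string => (current ??= generate()),
    reset: (): void => { current = null },
  }
}
```

In a React component, the object returned by `createSessionId` would typically be held in a ref, with `reset()` called when a capture session ends.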
+
+**Test Files:** `frontend/src/test/test_phase4_LTTPage_integration.test.tsx`
+
+### Phase 4.5 — Backend Adjustments (0.5 day)
+
+**Objective:** Ensure backend handles system audio sessions correctly.
+
+**Tasks:**
+1. Verify `ws_asr.py` WebSocket endpoint works with arbitrary `video_id` (UUID format) — likely no changes needed
+2. Add `SYSTEM_AUDIO_ENABLED` config validation in the router (return 503 if disabled)
+3. Handle system audio sessions in transcript history (optional — store with `source: 'system-audio'` metadata)
+4. Verify the ASR client handles system audio PCM identically to video audio
+
+**No new endpoints needed.** The existing WebSocket and ASR infrastructure is source-agnostic.
+
+**Test Files:** `backend/app/test/test_phase4_config.py`
+
+### Phase 4.6 — Integration & Acceptance Tests (1 day)
+
+**Objective:** Comprehensive tests for the system audio capture flow.
+
+**Backend Integration Tests** (`backend/app/test/test_integration_phase4.py`):
+1. WebSocket accepts UUID video_id
+2. ASR processes audio from system audio session
+3. Config toggle disables feature
+
+**Frontend Tests:**
+1. **Hook tests** (`test_phase4_useSystemAudioASR.test.ts`): ~10 tests
+   - Mock `getDisplayMedia` → successful capture
+   - Mock `getDisplayMedia` → user cancels (permission denied)
+   - Mock `getDisplayMedia` → no audio track
+   - AudioContext setup and teardown
+   - WebSocket connection lifecycle
+   - PCM conversion and sending
+   - `track.onended` triggers auto-stop
+   - `stopCapture` cleanup
+   - Multiple rapid start/stop cycles
+
+2. **Component tests** (`test_phase4_SystemAudioCapture.test.tsx`): ~5 tests
+   - All UI states render correctly (idle, requesting, capturing, error)
+   - Start button calls onStart
+   - Stop button calls onStop
+   - Error state shows message and retry button
+   - Compatibility note visible for non-Chrome (optional)
+
+3. **Integration tests** (`test_phase4_LTTPage_integration.test.tsx`): ~5 tests
+   - System Audio tab renders and switches correctly
+   - ASR variable selects systemAudioASR when source is system-audio
+   - Full Transcript button hidden for system audio
+   - QueryInput receives transcript from system audio
+   - Source switching preserves transcript
+
+**Acceptance Tests** (`backend/app/test/acceptance/test_acceptance_phase4.py`):
+- Real `getDisplayMedia` with actual browser (manual — requires human interaction)
+- Real DashScope ASR with system audio stream
+- End-to-end: capture → ASR → transcript → RAG answer
+
+### Phase 4.7 — Polish & Documentation (0.5 day)
+
+**Tasks:**
+1. Update `README.md` — add System Audio Capture section with usage instructions, browser compatibility table, and limitations
+2. Update `development_plan.md` — add Phase 4 row to timeline, mark status
+3. Add browser detection helper for compatibility warning
+4. Verify production build (`npm run build`)
+5. Run full CI regression (`pytest` + `vitest`)
+6. Final commit
+
+---
+
+## 5. Design Decisions
+
+| Decision | Rationale |
+|----------|-----------|
+| New hook (`useSystemAudioASR`) rather than modifying existing | MediaStream source requires `createMediaStreamSource` (not `createMediaElementSource`), and lifecycle is permission-based (not play/pause events). Separate hook avoids branching complexity. |
+| UUID-based `video_id` | No actual video file for system audio. `crypto.randomUUID()` generates unique session IDs. Backend WebSocket already accepts arbitrary strings. |
+| Manual Start/Stop (not auto) | `getDisplayMedia()` requires explicit user action (browser policy). Cannot auto-start. |
+| No video display in System Audio mode | User watches content in another tab/window. Only capture status and audio controls shown. |
+| Request video, use only audio | `getDisplayMedia()` rejects `video: false` with a `TypeError`, so the hook requests a normal share and consumes only the audio track. No video is rendered or sent to the backend. |
+| Hide Full Transcript button for system audio | Same as YouTube — streaming ASR only. Full transcript would require recording and batch processing (future Phase 5). |
+| Browser compatibility note in UI | `getDisplayMedia` with audio is Chrome/Edge-only. Non-supporting browsers get clear messaging. |
+
+### getDisplayMedia Options
+
+```javascript
+const stream = await navigator.mediaDevices.getDisplayMedia({
+  video: true,                         // Required: `video: false` is rejected with a TypeError
+  audio: {
+    echoCancellation: false,           // Don't filter audio
+    noiseSuppression: false,           // Don't filter audio
+    autoGainControl: false,            // Don't adjust volume
+  },
+  systemAudio: 'include',              // Top-level option: offer system audio (tab + full system where supported)
+})
+```
+
+**Note on `video`:** `getDisplayMedia()` has no audio-only mode. A video request is mandatory, and the browser permission dialog always shows screen/tab selection (there's no "audio-only picker"). The user must select a tab or screen to share, and the hook uses only the audio track of the resulting stream. This is a browser limitation, not ours.
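Since `getDisplayMedia()` rejects a request with `video: false`, a capture helper has to request video and then check that an audio track actually came back. The sketch below shows that check with the browser call injected as a parameter; `requestDisplayAudio`, `StreamLike`, and `TrackLike` are hypothetical names standing in for the real `MediaStream` types.

```typescript
// Narrow stand-ins for MediaStreamTrack and MediaStream.
interface TrackLike { stop(): void }
interface StreamLike {
  getAudioTracks(): TrackLike[]
  getVideoTracks(): TrackLike[]
}
type GetDisplayMedia = (options: Record<string, unknown>) => Promise<StreamLike>

async function requestDisplayAudio(getDisplayMedia: GetDisplayMedia): Promise<StreamLike> {
  const stream = await getDisplayMedia({
    video: true, // mandatory; only the audio track is consumed afterwards
    audio: { echoCancellation: false, noiseSuppression: false, autoGainControl: false },
    systemAudio: 'include', // top-level hint, honored where the browser supports it
  })
  if (stream.getAudioTracks().length === 0) {
    // Nothing usable was shared: release the capture before reporting the error.
    const videoTracks = stream.getVideoTracks()
    for (let i = 0; i < videoTracks.length; i++) videoTracks[i].stop()
    throw new Error('No audio track found in the shared content')
  }
  return stream
}
```

In the browser, the argument would be something like `(options) => navigator.mediaDevices.getDisplayMedia(options)`; injecting it keeps the no-audio path testable without a real picker.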
+
+---
+
+## 6. Browser Compatibility
+
+| Platform / Browser | Tab Audio | System Audio | Works? |
+|--------------------|-----------|-------------|--------|
+| Chrome/Edge (Windows) | ✅ | ✅ | **Best — full support** |
+| Chrome/Edge (macOS 14.2+) | ✅ | ✅ | **Good** |
+| Chrome/Edge (Linux) | ✅ | ❌ | Works, tab audio only |
+| Firefox | ❌ | ❌ | Audio ignored |
+| Safari | ❌ | ❌ | Audio not supported |
+| Mobile browsers | ❌ | ❌ | Not supported |
+
+**Detection helper:**
+```typescript
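+function isSystemAudioSupported(): boolean {
+  // getDisplayMedia itself is missing on mobile browsers, so check for it first
+  const hasGetDisplayMedia = typeof navigator.mediaDevices?.getDisplayMedia === 'function'
+  const isChromium = 'chrome' in window || navigator.userAgent.includes('Chrome')
+  // Firefox and Safari don't support audio in getDisplayMedia
+  return hasGetDisplayMedia && isChromium && !navigator.userAgent.includes('Firefox')
+}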
+```
+
+---
+
+## 7. Test Strategy
+
+### Test Files
+
+| File | Type | Count | Description |
+|------|------|-------|-------------|
+| `test_phase4_config.py` | Backend integration | 3 | Config toggle, WebSocket accepts UUID |
+| `test_phase4_useSystemAudioASR.test.ts` | Frontend unit | ~10 | Hook behavior: capture, permission, audio, WS |
+| `test_phase4_SystemAudioCapture.test.tsx` | Frontend component | ~5 | UI states: idle, requesting, capturing, error |
+| `test_phase4_LTTPage_integration.test.tsx` | Frontend integration | ~5 | Tab switching, ASR unification, Full Transcript |
+| `test_integration_phase4.py` | Backend integration | 4 | Config toggle, WebSocket, ASR client |
+| `test_acceptance_phase4.py` | Acceptance | 3 | Real browser + real DashScope ASR |
+
+### Mocking Strategy
+
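+- **`getDisplayMedia`**: Mock with `vi.fn()` (the frontend suite runs under `vitest`) returning a synthetic MediaStream with an audio track
+- **AudioContext**: Use a manual mock for AudioContext and ScriptProcessorNode, installed with `vi.stubGlobal`, since DOM test environments such as jsdom do not implement the Web Audio API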
+- **WebSocket**: Mock via `vitest` WebSocket mock (same pattern as Phase 2/3 tests)
+- **DashScope ASR**: Mock in CI; real in acceptance tests
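The synthetic objects in the mocking strategy above can be small hand-written fakes that work with any test runner. This is a sketch under assumptions: `FakeWebSocket` and `FakeTrack` are hypothetical names, and the message payload shape is left as an opaque string because the server protocol is not described in this plan.

```typescript
// Stand-in for the browser WebSocket: records frames instead of sending them.
class FakeWebSocket {
  sent: ArrayBuffer[] = []
  closed = false
  onmessage: ((event: { data: string }) => void) | null = null

  send(data: ArrayBuffer): void {
    if (this.closed) throw new Error('send() after close()')
    this.sent.push(data)
  }

  close(): void {
    this.closed = true
  }

  // Test helper: simulate a message arriving from the ASR backend.
  emit(data: string): void {
    if (this.onmessage) this.onmessage({ data })
  }
}

// Stand-in for a MediaStream audio track.
class FakeTrack {
  onended: (() => void) | null = null
  stopped = false

  // Like the real API, stop() does not fire the ended event.
  stop(): void {
    this.stopped = true
  }

  // Test helper: simulate the user clicking "Stop sharing".
  end(): void {
    if (this.onended) this.onended()
  }
}
```

A hook test could pass these fakes in, call `end()` on the track, and assert that the socket was closed and no further frames were recorded.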
+
+---
+
+## 8. File Manifest
+
+### New Files
+```
+frontend/src/hooks/useSystemAudioASR.ts
+frontend/src/components/SystemAudioCapture.tsx
+frontend/src/test/test_phase4_useSystemAudioASR.test.ts
+frontend/src/test/test_phase4_SystemAudioCapture.test.tsx
+frontend/src/test/test_phase4_LTTPage_integration.test.tsx
+backend/app/test/test_phase4_config.py
+backend/app/test/test_integration_phase4.py
+backend/app/test/acceptance/test_acceptance_phase4.py
+.plans/phase4_system_audio_plan.md              ← this file
+```
+
+### Modified Files
+```
+frontend/src/pages/LTTPage.tsx                    ← add "System Audio" tab, wire hook
+frontend/src/types/index.ts                       ← add SystemAudioStatus, SystemAudioASRState
+backend/app/core/config.py                        ← add SYSTEM_AUDIO_ENABLED
+development_plan.md                               ← add Phase 4 row
+README.md                                         ← add System Audio Capture section
+```
+
+---
+
+## 9. Acceptance Criteria
+
+- [ ] User can select "System Audio" tab in LTTPage
+- [ ] Clicking "Start Capture" opens browser permission dialog
+- [ ] On permission grant, audio streams through WebSocket to DashScope ASR
+- [ ] Real-time transcript flows into QueryInput
+- [ ] User can edit transcript while capture continues
+- [ ] "Stop Capture" properly closes MediaStream, AudioContext, WebSocket
+- [ ] Permission denied shows clear error message
+- [ ] Browser compatibility note shown for non-Chrome browsers
+- [ ] All CI tests pass (no regressions)
+- [ ] Acceptance tests pass with real DashScope ASR
+- [ ] `npm run build` produces clean production build
+
+---
+
+**File Information**
+- Filename: `phase4_system_audio_plan.md`
+- Created: 2026-05-09
+- Status: Draft — awaiting review before Phase 4.1 implementation begins
--- a/development_plan.md
+++ b/development_plan.md
@@ -141,12 +141,14 @@ User Question
 |-----------------------------|--------------|------------------|--------|
 | Setup + Phase 1 Backend     | 3-4 days     | FastAPI + Chroma + Metadata + LLM client | ✅ Complete |
 | Phase 1 Frontend            | 2-3 days     | UI layout + text query flow | ✅ Complete |
-| Phase 2 Backend             | 4-5 days     | Video upload + WebSocket ASR + question extraction | ⬜ Next |
-| Phase 2 Frontend            | 3-4 days     | Video player + live transcript + auto/manual flow | ⬜ Pending |
+| Phase 2 Backend             | 4-5 days     | Video upload + WebSocket ASR + question extraction | ✅ Complete |
+| Phase 2 Frontend            | 3-4 days     | Video player + live transcript + auto/manual flow | ✅ Complete |
 | Testing & Polish            | 1-2 days     | End-to-end testing + deployment scripts | ⬜ Pending |

 **Total Estimated Effort**: 13-17 developer days (2-3 weeks)

+> **Note:** Phase 3 (YouTube Live Stream Proxy → ASR) was implemented (5.5 days, 7 sub-phases) and later reverted in favor of Phase 4's more versatile System Audio Capture approach using `getDisplayMedia()`.
+
 ---

 ## Deployment Strategy
@@ -164,5 +166,5 @@ User Question

 **File Information**  
 - Filename: `development_plan.md`  
- Last Updated: April 2026  
- Status: Phase 1 Backend ✅, Phase 1 Frontend ✅ — Phase 2 next
+- Last Updated: May 2026  
+- Status: Phase 1 ✅, Phase 2 ✅ — Phase 4 (System Audio Capture) up next, Phase 3 removed
--- a/frontend/package-lock.json
+++ b/frontend/package-lock.json
@@ -13,6 +13,7 @@
        "axios": "^1.6.0",
        "lucide-react": "^0.190.0",
        "pdfjs-dist": "^5.6.205",
+        "pnpm": "^10.33.4",
        "react": "^18.2.0",
        "react-dom": "^18.2.0",
        "react-markdown": "^10.1.0",
@@ -4944,6 +4945,22 @@
      "integrity": "sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==",
      "license": "ISC"
    },
+    "node_modules/pnpm": {
+      "version": "10.33.4",
+      "resolved": "https://registry.npmjs.org/pnpm/-/pnpm-10.33.4.tgz",
+      "integrity": "sha512-HGezs1my1AgRm6HtKJ80uPw8aHNBK+xv0mT73IJInlEPy+y5zp0i2ufzt2Jp2EQQRgFL3KU7mXnNelYa1jG4AA==",
+      "license": "MIT",
+      "bin": {
+        "pnpm": "bin/pnpm.cjs",
+        "pnpx": "bin/pnpx.cjs"
+      },
+      "engines": {
+        "node": ">=18.12"
+      },
+      "funding": {
+        "url": "https://opencollective.com/pnpm"
+      }
+    },
    "node_modules/postcss": {
      "version": "8.5.10",
      "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.10.tgz",
--- a/frontend/package.json
+++ b/frontend/package.json
@@ -15,6 +15,7 @@
    "axios": "^1.6.0",
    "lucide-react": "^0.190.0",
    "pdfjs-dist": "^5.6.205",
+    "pnpm": "^10.33.4",
    "react": "^18.2.0",
    "react-dom": "^18.2.0",
    "react-markdown": "^10.1.0",
--- a/frontend/pnpm-lock.yaml
+++ b/frontend/pnpm-lock.yaml
--- a/package-lock.json
+++ b/package-lock.json
@@ -0,0 +1,28 @@
+{
+  "name": "legco_reranker",
+  "lockfileVersion": 3,
+  "requires": true,
+  "packages": {
+    "": {
+      "dependencies": {
+        "pnpm": "^10.33.4"
+      }
+    },
+    "node_modules/pnpm": {
+      "version": "10.33.4",
+      "resolved": "https://registry.npmjs.org/pnpm/-/pnpm-10.33.4.tgz",
+      "integrity": "sha512-HGezs1my1AgRm6HtKJ80uPw8aHNBK+xv0mT73IJInlEPy+y5zp0i2ufzt2Jp2EQQRgFL3KU7mXnNelYa1jG4AA==",
+      "license": "MIT",
+      "bin": {
+        "pnpm": "bin/pnpm.cjs",
+        "pnpx": "bin/pnpx.cjs"
+      },
+      "engines": {
+        "node": ">=18.12"
+      },
+      "funding": {
+        "url": "https://opencollective.com/pnpm"
+      }
+    }
+  }
+}
--- a/package.json
+++ b/package.json
@@ -0,0 +1,5 @@
+{
+  "dependencies": {
+    "pnpm": "^10.33.4"
+  }
+}