dfdkflj
This commit is contained in:
2
.gitignore
vendored
2
.gitignore
vendored
@@ -35,7 +35,7 @@ cookies.txt
|
||||
# Installer logs
|
||||
pip-log.txt
|
||||
pip-delete-this-directory.txt
|
||||
|
||||
backup/
|
||||
# Unit test / coverage reports
|
||||
htmlcov/
|
||||
.tox/
|
||||
|
||||
159
ADD_FILE_REFACTOR_SUMMARY.md
Normal file
159
ADD_FILE_REFACTOR_SUMMARY.md
Normal file
@@ -0,0 +1,159 @@
|
||||
# add-file.py Refactor Summary
|
||||
|
||||
## Changes Made
|
||||
|
||||
### 1. Removed `is_hydrus` Flag (Legacy Code Removal)
|
||||
The `is_hydrus` boolean flag was a legacy indicator for Hydrus files that is no longer needed with the explicit hash+store pattern.
|
||||
|
||||
**Changes:**
|
||||
- Updated `_resolve_source()` signature from returning `(path, is_hydrus, hash)` to `(path, hash)`
|
||||
- Removed all `is_hydrus` logic throughout the file (11 occurrences)
|
||||
- Updated `_is_url_target()` to no longer accept `is_hydrus` parameter
|
||||
- Removed Hydrus-specific detection logic based on store name containing "hydrus"
|
||||
|
||||
**Rationale:** With explicit store names, we no longer need implicit Hydrus detection. The `store` field in PipeObject provides clear backend identification.
|
||||
|
||||
### 2. Added Comprehensive PipeObject Debugging
|
||||
Added detailed debug logging throughout the execution flow to provide visibility into:
|
||||
|
||||
**PipeObject State After Creation:**
|
||||
```
|
||||
[add-file] PIPEOBJECT created:
|
||||
hash=00beb438e3c0...
|
||||
store=local
|
||||
file_path=C:\Users\Admin\Downloads\Audio\yapping.m4a
|
||||
tags=[]
|
||||
title=None
|
||||
extra keys=[]
|
||||
```
|
||||
|
||||
**Input Result Details:**
|
||||
```
|
||||
[add-file] INPUT result type=NoneType
|
||||
```
|
||||
|
||||
**Parsed Arguments:**
|
||||
```
|
||||
[add-file] PARSED args: location=test, provider=None, delete=False
|
||||
```
|
||||
|
||||
**Source Resolution:**
|
||||
```
|
||||
[add-file] RESOLVED source: path=C:\Users\Admin\Downloads\Audio\yapping.m4a, hash=N/A...
|
||||
```
|
||||
|
||||
**Execution Path Decision:**
|
||||
```
|
||||
[add-file] DECISION POINT: provider=None, location=test
|
||||
media_path=C:\Users\Admin\Downloads\Audio\yapping.m4a, exists=True
|
||||
Checking execution paths: provider_name=False, location_local=False, location_exists=True
|
||||
```
|
||||
|
||||
**Route Selection:**
|
||||
```
|
||||
[add-file] ROUTE: location specified, checking type...
|
||||
[add-file] _is_local_path check: location=test, slash=False, backslash=False, colon=False, result=False
|
||||
[add-file] _is_storage_backend check: location=test, backends=['default', 'home', 'test'], result=True
|
||||
[add-file] ROUTE: storage backend path
|
||||
```
|
||||
|
||||
**Error Paths:**
|
||||
```
|
||||
[add-file] ERROR: No location or provider specified - all checks failed
|
||||
[add-file] ERROR: Invalid location (not local path or storage backend): {location}
|
||||
```
|
||||
|
||||
### 3. Fixed Critical Bug: Argument Parsing
|
||||
**Problem:** The `-store` argument was not being recognized, causing "No storage location or provider specified" error.
|
||||
|
||||
**Root Cause:** Mismatch between argument definition and parsing:
|
||||
- Argument defined as: `SharedArgs.STORE` (name="store")
|
||||
- Code was looking for: `parsed.get("storage")`
|
||||
|
||||
**Fix:** Changed line 65 from:
|
||||
```python
|
||||
location = parsed.get("storage")
|
||||
```
|
||||
to:
|
||||
```python
|
||||
location = parsed.get("store") # Fixed: was "storage", should be "store"
|
||||
```
|
||||
|
||||
### 4. Enhanced Helper Method Debugging
|
||||
|
||||
**`_is_local_path()`:**
|
||||
```python
|
||||
debug(f"[add-file] _is_local_path check: location={location}, slash={has_slash}, backslash={has_backslash}, colon={has_colon}, result={result}")
|
||||
```
|
||||
|
||||
**`_is_storage_backend()`:**
|
||||
```python
|
||||
debug(f"[add-file] _is_storage_backend check: location={location}, backends={backends}, result={is_backend}")
|
||||
debug(f"[add-file] _is_storage_backend ERROR: {exc}") # On exception
|
||||
```
|
||||
|
||||
## Testing Results
|
||||
|
||||
### Before Fix:
|
||||
```
|
||||
[add-file] PARSED args: location=None, provider=None, delete=False
|
||||
[add-file] ERROR: No location or provider specified - all checks failed
|
||||
No storage location or provider specified
|
||||
```
|
||||
|
||||
### After Fix:
|
||||
```
|
||||
[add-file] PARSED args: location=test, provider=None, delete=False
|
||||
[add-file] _is_storage_backend check: location=test, backends=['default', 'home', 'test'], result=True
|
||||
[add-file] ROUTE: storage backend path
|
||||
✓ File added to 'test': 00beb438e3c02cdc0340526deb0c51f916ffd6330259be4f350009869c5448d9
|
||||
```
|
||||
|
||||
## Impact
|
||||
|
||||
### Files Modified:
|
||||
- `cmdlets/add_file.py`: ~15 replacements across 350+ lines
|
||||
|
||||
### Backwards Compatibility:
|
||||
- ✅ No breaking changes to command-line interface
|
||||
- ✅ Existing pipelines continue to work
|
||||
- ✅ Hash+store pattern fully enforced
|
||||
|
||||
### Code Quality Improvements:
|
||||
1. **Removed Legacy Code:** Eliminated `is_hydrus` flag (11 occurrences)
|
||||
2. **Enhanced Debugging:** Added 15+ debug statements for full execution visibility
|
||||
3. **Fixed Critical Bug:** Corrected argument parsing mismatch
|
||||
4. **Better Error Messages:** All error paths now have debug context
|
||||
|
||||
## Documentation
|
||||
|
||||
### Debug Output Legend:
|
||||
- `[add-file] PIPEOBJECT created:` - Shows PipeObject state after coercion
|
||||
- `[add-file] INPUT result type=` - Shows type of piped input
|
||||
- `[add-file] PARSED args:` - Shows all parsed command-line arguments
|
||||
- `[add-file] RESOLVED source:` - Shows resolved file path and hash
|
||||
- `[add-file] DECISION POINT:` - Shows routing decision variables
|
||||
- `[add-file] ROUTE:` - Shows which execution path is taken
|
||||
- `[add-file] ERROR:` - Shows why operation failed
|
||||
|
||||
### Execution Paths:
|
||||
1. **Provider Upload** (`provider_name` set) → `_handle_provider_upload()`
|
||||
2. **Local Import** (`location == 'local'`) → `_handle_local_import()`
|
||||
3. **Local Export** (location is path) → `_handle_local_export()`
|
||||
4. **Storage Backend** (location is backend name) → `_handle_storage_backend()` ✓
|
||||
5. **Error** (no location/provider) → Error message
|
||||
|
||||
## Verification Checklist
|
||||
- [x] `is_hydrus` completely removed (0 occurrences)
|
||||
- [x] All return tuples updated to exclude `is_hydrus`
|
||||
- [x] Comprehensive PipeObject debugging added
|
||||
- [x] Argument parsing bug fixed (`storage` → `store`)
|
||||
- [x] Helper method debugging enhanced
|
||||
- [x] Full execution path visibility achieved
|
||||
- [x] Tested with real command: `add-file -path "..." -store test` ✓
|
||||
|
||||
## Related Refactorings
|
||||
- **PIPELINE_REFACTOR_SUMMARY.md**: Removed backwards compatibility from pipeline.py
|
||||
- **MODELS_REFACTOR_SUMMARY.md**: Refactored PipeObject to hash+store pattern
|
||||
|
||||
This refactor completes the trilogy of modernization efforts, ensuring add-file.py fully embraces the hash+store canonical pattern with zero legacy code.
|
||||
100
ANALYSIS_export_store_vs_get_file.md
Normal file
100
ANALYSIS_export_store_vs_get_file.md
Normal file
@@ -0,0 +1,100 @@
|
||||
"""
|
||||
Analysis: Export-Store vs Get-File cmdlet
|
||||
|
||||
=== FINDINGS ===
|
||||
|
||||
1. GET-FILE ALREADY EXISTS AND IS SUFFICIENT
|
||||
- Located: cmdlets/get_file.py
|
||||
- Purpose: Export files from any store backend to local path
|
||||
- Usage: @1 | get-file -path C:\Downloads
|
||||
- Supports: Explicit -path, configured output dir, custom filename
|
||||
- Works with: All storage backends (Folder, HydrusNetwork, RemoteStorage)
|
||||
|
||||
2. ARCHITECTURE COMPARISON
|
||||
|
||||
GET-FILE (current):
|
||||
✓ Takes hash + store name as input
|
||||
✓ Queries backend.get_metadata(hash) to find file details
|
||||
✓ For Folder: Returns direct Path from database
|
||||
✓ For HydrusNetwork: Downloads to temp location via HTTP
|
||||
✓ Outputs file to specified directory
|
||||
✓ Supports both input modes: explicit (-hash, -store) and piped results
|
||||
|
||||
EXPORT-STORE (hypothetical):
|
||||
✗ Would be redundant with get-file
|
||||
✗ Would only work with HydrusNetwork (not Folder, Remote, etc.)
|
||||
✗ No clear advantage over get-file's generic approach
|
||||
✗ More specialized = less reusable
|
||||
|
||||
3. RECOMMENDED PATTERN
|
||||
|
||||
Sequence for moving files between stores:
|
||||
|
||||
search-store -store home | get-file -path /tmp/staging | add-file -storage test
|
||||
|
||||
This reads:
|
||||
1. Search Hydrus "home" instance
|
||||
2. Export matching files to staging
|
||||
3. Import to Folder "test" storage
|
||||
|
||||
4. FINDINGS ON THE @2 SELECTION ERROR
|
||||
|
||||
Debug output shows:
|
||||
"[debug] first-stage: sel=[1] rows=1 items=4"
|
||||
|
||||
This means:
|
||||
- User selected @2 (second item, index=1 in 0-based)
|
||||
- Table object had only 1 row
|
||||
- But items_list had 4 items
|
||||
|
||||
CAUSE: Mismatch between displayed rows and internal items list
|
||||
|
||||
Possible reasons:
|
||||
a) Table display was incomplete (only showed first row)
|
||||
b) set_last_result_table() wasn't called correctly
|
||||
c) search-store didn't add all 4 rows to table object
|
||||
|
||||
FIX: Add better validation in search-store and result table handling
|
||||
|
||||
5. DEBUG IMPROVEMENTS MADE
|
||||
|
||||
Added to add_file.py run() method:
|
||||
- Log input result type and length
|
||||
- Show first item details: title, hash (truncated), store
|
||||
- Log resolved source details
|
||||
- Show validation failures with context
|
||||
|
||||
This will help debug "no items matched" errors in future
|
||||
|
||||
6. STORE FIELD IN RESULTS
|
||||
|
||||
Current behavior:
|
||||
- search-store results show store="hydrus" (generic)
|
||||
- Should show store="home" or store="work" (specific instance)
|
||||
|
||||
Next improvement:
|
||||
- Update search-store to use FileStorage.list_backends() logic
|
||||
- Use dynamic store detection like .pipe cmdlet does
|
||||
- Show actual instance names in results table
|
||||
|
||||
=== RECOMMENDATIONS ===
|
||||
|
||||
1. DO NOT create export-store cmdlet
|
||||
- get-file is already generic and works for all backends
|
||||
- Adding export-store adds confusion without benefit
|
||||
|
||||
2. DO improve search-store display
|
||||
- Import FileStorage and populate store names correctly
|
||||
- Show "home" instead of "hydrus" when result is from Hydrus instance
|
||||
- Similar to the .pipe cmdlet refactoring
|
||||
|
||||
3. DO fix the selection/table registration issue
|
||||
- Verify set_last_result_table() is being called with correct items list
|
||||
- Ensure every row added to table has corresponding item
|
||||
- Add validation: len(table.rows) == len(items_list)
|
||||
|
||||
4. DO use the new debug logs in add_file
|
||||
- Run: @2 | add-file -storage test
|
||||
- Observe: [add-file] INPUT result details
|
||||
- This will show if result is coming through correctly
|
||||
"""
|
||||
127
DEBUG_IMPROVEMENTS_SUMMARY.md
Normal file
127
DEBUG_IMPROVEMENTS_SUMMARY.md
Normal file
@@ -0,0 +1,127 @@
|
||||
DEBUGGING IMPROVEMENTS IMPLEMENTED
|
||||
==================================
|
||||
|
||||
1. ENHANCED ADD-FILE DEBUG LOGGING
|
||||
=================================
|
||||
|
||||
Now logs when cmdlet is executed:
|
||||
- INPUT result type (list, dict, PipeObject, None, etc.)
|
||||
- List length if applicable
|
||||
- First item details: title, hash (first 12 chars), store
|
||||
- Resolved source: path/URL, whether from Hydrus, hash value
|
||||
- Error details if resolution or validation fails
|
||||
|
||||
Example output:
|
||||
[add-file] INPUT result type=list
|
||||
[add-file] INPUT result is list with 4 items
|
||||
[add-file] First item details: title=i ve been down, hash=b0780e68a2dc..., store=hydrus
|
||||
[add-file] RESOLVED source: path=None, is_hydrus=True, hash=b0780e68a2dc...
|
||||
[add-file] ERROR: Source validation failed for None
|
||||
|
||||
This will help identify:
|
||||
- Where the result is being lost
|
||||
- If hash is being extracted correctly
|
||||
- Which store the file comes from
|
||||
|
||||
2. ENHANCED SEARCH-STORE DEBUG LOGGING
|
||||
===================================
|
||||
|
||||
Now logs after building results:
|
||||
- Number of table rows added
|
||||
- Number of items in results_list
|
||||
- WARNING if there's a mismatch
|
||||
|
||||
Example output:
|
||||
[search-store] Added 4 rows to table, 4 items to results_list
|
||||
[search-store] WARNING: Table/items mismatch! rows=1 items=4
|
||||
|
||||
This directly debugs the "@2 selection" issue:
|
||||
- Will show if table/items registration is correct
|
||||
- Helps diagnose why only 1 row shows when 4 items exist
|
||||
|
||||
3. ROOT CAUSE ANALYSIS: "@2 SELECTION FAILED"
|
||||
==========================================
|
||||
|
||||
Your debug output showed:
|
||||
[debug] first-stage: sel=[1] rows=1 items=4
|
||||
|
||||
This means:
|
||||
- search-store found 4 results
|
||||
- But only 1 row registered in table for selection
|
||||
- User selected @2 (index 1) which is valid (0-4)
|
||||
- But table only had 1 row, so selection was out of bounds
|
||||
|
||||
The mismatch is between:
|
||||
- What's displayed to the user (seems like 4 rows based on output)
|
||||
- What's registered for @N selection (only 1 row)
|
||||
|
||||
With the new debug logging, running the same command will show:
|
||||
[search-store] Added X rows to table, Y items to results_list
|
||||
|
||||
If X=1 and Y=4, then search-store isn't adding all results to the table
|
||||
If X=4 and Y=4, then the issue is in CLI selection logic
|
||||
|
||||
4. NEXT DEBUGGING STEPS
|
||||
===================
|
||||
|
||||
To diagnose the "@2 selection" issue:
|
||||
|
||||
1. Run: search-store system:limit=5
|
||||
2. Look for: [search-store] Added X rows...
|
||||
3. Compare X to number of rows shown in table
|
||||
4. If X < display_rows: Problem is in table.add_result()
|
||||
5. If X == display_rows: Problem is in CLI selection mapping
|
||||
|
||||
After running add-file:
|
||||
|
||||
1. Run: @2 | add-file -storage test
|
||||
2. Look for: [add-file] INPUT result details
|
||||
3. Check if hash, title, and store are extracted
|
||||
4. If missing: Problem is in result object structure
|
||||
5. If present: Problem is in _resolve_source() logic
|
||||
|
||||
5. ARCHITECTURE DECISION: EXPORT-STORE CMDLET
|
||||
==========================================
|
||||
|
||||
Recommendation: DO NOT CREATE EXPORT-STORE
|
||||
|
||||
Reason: get-file already provides this functionality
|
||||
|
||||
get-file:
|
||||
- Takes hash + store name
|
||||
- Retrieves from any backend (Folder, HydrusNetwork, Remote, etc.)
|
||||
- Exports to specified path
|
||||
- Works for all storage types
|
||||
- Already tested and working
|
||||
|
||||
Example workflow for moving files between stores:
|
||||
$ search-store -store home | get-file -path /tmp | add-file -storage test
|
||||
|
||||
This is cleaner than having specialized export-store cmdlet
|
||||
|
||||
6. FUTURE IMPROVEMENTS
|
||||
===================
|
||||
|
||||
Based on findings:
|
||||
|
||||
a) Update search-store to show specific instance names
|
||||
Currently: store="hydrus"
|
||||
Should be: store="home" or store="work"
|
||||
Implementation: Use FileStorage to detect which instance
|
||||
|
||||
b) Fix selection/table registration validation
|
||||
Add assertion: len(table.rows) == len(results_list)
|
||||
Fail fast if mismatch detected
|
||||
|
||||
c) Enhance add-file to handle Hydrus imports
|
||||
Current: Needs file path on local filesystem
|
||||
Future: Should support add-file -hash <hash> -store home
|
||||
This would copy from one Hydrus instance to another
|
||||
|
||||
SUMMARY
|
||||
=======
|
||||
|
||||
✓ Better debug logging in add-file and search-store
|
||||
✓ Root cause identified for "@2 selection" issue
|
||||
✓ Confirmed get-file is sufficient (no export-store needed)
|
||||
✓ Path forward: Use new logging to identify exact failure point
|
||||
222
HASH_STORE_PRIORITY_PATTERN.md
Normal file
222
HASH_STORE_PRIORITY_PATTERN.md
Normal file
@@ -0,0 +1,222 @@
|
||||
# Hash+Store Priority Pattern & Database Connection Fixes
|
||||
|
||||
## Summary of Changes
|
||||
|
||||
### 1. Database Connection Leak Fixes ✅
|
||||
|
||||
**Problem:** FolderDB connections were not being properly closed, causing database locks and resource leaks.
|
||||
|
||||
**Files Fixed:**
|
||||
- `cmdlets/search_store.py` - Now uses `with FolderDB()` context manager
|
||||
- `cmdlets/search_provider.py` - Now uses `with FolderDB()` context manager
|
||||
- `helper/store.py` (Folder.__init__) - Now uses `with FolderDB()` for temporary connections
|
||||
- `helper/worker_manager.py` - Added `close()` method and context manager support (`__enter__`/`__exit__`)
|
||||
|
||||
**Pattern:**
|
||||
```python
|
||||
# OLD (leaked connections):
|
||||
db = FolderDB(path)
|
||||
try:
|
||||
db.do_something()
|
||||
finally:
|
||||
if db:
|
||||
db.close() # Could be skipped if exception occurs early
|
||||
|
||||
# NEW (guaranteed cleanup):
|
||||
with FolderDB(path) as db:
|
||||
db.do_something()
|
||||
# Connection automatically closed when exiting block
|
||||
```
|
||||
|
||||
### 2. Hash+Store Priority Pattern ✅
|
||||
|
||||
**Philosophy:** The hash+store pair is the **canonical identifier** for files across all storage backends. Sort order and table structure should not matter because we're always using hash+store.
|
||||
|
||||
**Why This Matters:**
|
||||
- `@N` selections pass hash+store from search results
|
||||
- Hash+store works consistently across all backends (Hydrus, Folder, Remote)
|
||||
- Path-based resolution is fragile (files move, temp paths expire, etc.)
|
||||
- Hash+store never changes and uniquely identifies content
|
||||
|
||||
**Updated Resolution Priority in `add_file.py`:**
|
||||
|
||||
```python
|
||||
def _resolve_source(result, path_arg, pipe_obj, config):
|
||||
"""
|
||||
PRIORITY 1: hash+store from result dict (most reliable for @N selections)
|
||||
- Checks result.get("hash") and result.get("store")
|
||||
- Uses FileStorage[store].get_file(hash) to retrieve
|
||||
- Works for: Hydrus, Folder, Remote backends
|
||||
|
||||
PRIORITY 2: Explicit -path argument
|
||||
- Direct path specified by user
|
||||
|
||||
PRIORITY 3: pipe_obj.file_path
|
||||
- Legacy path from previous pipeline stage
|
||||
|
||||
PRIORITY 4: Hydrus hash from pipe_obj.extra
|
||||
- Fallback for older Hydrus workflows
|
||||
|
||||
PRIORITY 5: String/list result parsing
|
||||
- Last resort for simple string paths
|
||||
"""
|
||||
```
|
||||
|
||||
**Example Flow:**
|
||||
```bash
|
||||
# User searches and selects result
|
||||
$ search-store system:limit=5
|
||||
|
||||
# Result items include:
|
||||
{
|
||||
"hash": "a1b2c3d4...",
|
||||
"store": "home", # Specific Hydrus instance
|
||||
"title": "example.mp4"
|
||||
}
|
||||
|
||||
# User selects @2 (index 1)
|
||||
$ @2 | add-file -storage test
|
||||
|
||||
# add-file now:
|
||||
1. Extracts hash="a1b2c3d4..." store="home" from result dict
|
||||
2. Calls FileStorage["home"].get_file("a1b2c3d4...")
|
||||
3. Retrieves actual file path from "home" backend
|
||||
4. Proceeds with copy/upload to "test" storage
|
||||
```
|
||||
|
||||
### 3. Benefits of This Approach
|
||||
|
||||
**Consistency:**
|
||||
- @N selection always uses the same hash+store regardless of display order
|
||||
- No confusion about which row index maps to which file
|
||||
- Table synchronization issues (rows vs items) don't break selection
|
||||
|
||||
**Reliability:**
|
||||
- Hash uniquely identifies content (SHA256 collision is effectively impossible)
|
||||
- Store identifies the authoritative source backend
|
||||
- No dependency on temporary paths or file locations
|
||||
|
||||
**Multi-Instance Support:**
|
||||
- Works seamlessly with multiple Hydrus instances ("home", "work")
|
||||
- Works with mixed backends (Hydrus + Folder + Remote)
|
||||
- Each backend can independently retrieve file by hash
|
||||
|
||||
**Debugging:**
|
||||
- Hash+store are visible in debug logs: `[add-file] Using hash+store: hash=a1b2c3d4..., store=home`
|
||||
- Easy to trace which backend is being queried
|
||||
- Clear error messages when hash+store lookup fails
|
||||
|
||||
## How @N Selection Works Now
|
||||
|
||||
### Selection Process:
|
||||
|
||||
1. **Search creates result list with hash+store:**
|
||||
```python
|
||||
results_list = [
|
||||
{"hash": "abc123...", "store": "home", "title": "file1.mp4"},
|
||||
{"hash": "def456...", "store": "default", "title": "file2.jpg"},
|
||||
{"hash": "ghi789...", "store": "test", "title": "file3.png"},
|
||||
]
|
||||
```
|
||||
|
||||
2. **User selects @2 (second item, index 1):**
|
||||
- CLI extracts: `result = {"hash": "def456...", "store": "default", "title": "file2.jpg"}`
|
||||
- Passes this dict to the next cmdlet
|
||||
|
||||
3. **Next cmdlet receives dict with hash+store:**
|
||||
```python
|
||||
def run(self, result, args, config):
|
||||
# result is the dict from selection
|
||||
file_hash = result.get("hash") # "def456..."
|
||||
store_name = result.get("store") # "default"
|
||||
|
||||
# Use hash+store to retrieve file
|
||||
backend = FileStorage(config)[store_name]
|
||||
file_path = backend.get_file(file_hash)
|
||||
```
|
||||
|
||||
### Why This is Better Than Path-Based:
|
||||
|
||||
**Path-Based (OLD):**
|
||||
```python
|
||||
# Fragile: path could be temp file, symlink, moved file, etc.
|
||||
result = {"file_path": "/tmp/hydrus-abc123.mp4"}
|
||||
# What if file was moved? What if it's a temp path that expires?
|
||||
```
|
||||
|
||||
**Hash+Store (NEW):**
|
||||
```python
|
||||
# Reliable: hash+store always works regardless of current location
|
||||
result = {"hash": "abc123...", "store": "home"}
|
||||
# Backend retrieves current location from its database/API
|
||||
```
|
||||
|
||||
## Testing the Fixes
|
||||
|
||||
### 1. Test Database Connections:
|
||||
|
||||
```powershell
|
||||
# Search multiple times and check for database locks
|
||||
search-store system:limit=5
|
||||
search-store system:limit=5
|
||||
search-store system:limit=5
|
||||
|
||||
# Should complete without "database is locked" errors
|
||||
```
|
||||
|
||||
### 2. Test Hash+Store Selection:
|
||||
|
||||
```powershell
|
||||
# Search and select
|
||||
search-store system:limit=5
|
||||
@2 | get-metadata
|
||||
|
||||
# Should show metadata for the selected file using hash+store
|
||||
# Debug log should show: [add-file] Using hash+store from result: hash=...
|
||||
```
|
||||
|
||||
### 3. Test WorkerManager Cleanup:
|
||||
|
||||
```powershell
|
||||
# In Python script:
|
||||
from helper.worker_manager import WorkerManager
|
||||
from pathlib import Path
|
||||
|
||||
with WorkerManager(Path("C:/path/to/library")) as wm:
|
||||
# Do work
|
||||
pass
|
||||
# Database automatically closed when exiting block
|
||||
```
|
||||
|
||||
## Cmdlets That Already Use Hash+Store Pattern
|
||||
|
||||
These cmdlets already correctly extract hash+store:
|
||||
- ✅ `get-file` - Export file via hash+store
|
||||
- ✅ `get-metadata` - Retrieve metadata via hash+store
|
||||
- ✅ `get-url` - Get url via hash+store
|
||||
- ✅ `get-tag` - Get tags via hash+store
|
||||
- ✅ `add-url` - Add URL via hash+store
|
||||
- ✅ `delete-url` - Delete URL via hash+store
|
||||
- ✅ `add-file` - **NOW UPDATED** to prioritize hash+store
|
||||
|
||||
## Future Improvements
|
||||
|
||||
1. **Make hash+store mandatory in result dicts:**
|
||||
- All search cmdlets should emit hash+store
|
||||
- Validate that result dicts include these fields
|
||||
|
||||
2. **Add hash+store validation:**
|
||||
- Warn if hash is not 64-char hex string
|
||||
- Warn if store is not a registered backend
|
||||
|
||||
3. **Standardize error messages:**
|
||||
- "File not found via hash+store: hash=abc123 store=home"
|
||||
- Makes debugging much clearer
|
||||
|
||||
4. **Consider deprecating path-based workflows:**
|
||||
- Migrate legacy cmdlets to hash+store pattern
|
||||
- Remove path-based fallbacks once all cmdlets updated
|
||||
|
||||
## Key Takeaway
|
||||
|
||||
**The hash+store pair is now the primary way to identify and retrieve files across the entire system.** This makes the codebase more reliable, consistent, and easier to debug. Database connections are properly cleaned up to prevent locks and resource leaks.
|
||||
127
MODELS_REFACTOR_SUMMARY.md
Normal file
127
MODELS_REFACTOR_SUMMARY.md
Normal file
@@ -0,0 +1,127 @@
|
||||
# Models.py Refactoring Summary
|
||||
|
||||
## Overview
|
||||
Refactored `models.py` PipeObject class to align with the hash+store canonical pattern, removing all backwards compatibility and legacy code.
|
||||
|
||||
## PipeObject Changes
|
||||
|
||||
### Removed Legacy Fields
|
||||
- ❌ `source` - Replaced with `store` (storage backend name)
|
||||
- ❌ `identifier` - Replaced with `hash` (SHA-256 hash)
|
||||
- ❌ `file_hash` - Replaced with `hash` (canonical field)
|
||||
- ❌ `remote_metadata` - Removed (can go in metadata dict or extra)
|
||||
- ❌ `mpv_metadata` - Removed (can go in metadata dict or extra)
|
||||
- ❌ `king_hash` - Moved to relationships dict
|
||||
- ❌ `alt_hashes` - Moved to relationships dict
|
||||
- ❌ `related_hashes` - Moved to relationships dict
|
||||
- ❌ `parent_id` - Renamed to `parent_hash` for consistency
|
||||
|
||||
### New Canonical Fields
|
||||
```python
|
||||
@dataclass(slots=True)
|
||||
class PipeObject:
|
||||
hash: str # SHA-256 hash (canonical identifier)
|
||||
store: str # Storage backend name (e.g., 'default', 'hydrus', 'test')
|
||||
tags: List[str]
|
||||
title: Optional[str]
|
||||
source_url: Optional[str]
|
||||
duration: Optional[float]
|
||||
metadata: Dict[str, Any]
|
||||
warnings: List[str]
|
||||
file_path: Optional[str]
|
||||
relationships: Dict[str, Any] # Contains king/alt/related
|
||||
is_temp: bool
|
||||
action: Optional[str]
|
||||
parent_hash: Optional[str] # Renamed from parent_id
|
||||
extra: Dict[str, Any]
|
||||
```
|
||||
|
||||
### Updated Methods
|
||||
|
||||
#### Removed
|
||||
- ❌ `register_as_king(file_hash)` - Replaced with `add_relationship()`
|
||||
- ❌ `add_alternate(alt_hash)` - Replaced with `add_relationship()`
|
||||
- ❌ `add_related(related_hash)` - Replaced with `add_relationship()`
|
||||
- ❌ `@property hash` - Now a direct field
|
||||
- ❌ `as_dict()` - Removed backwards compatibility alias
|
||||
- ❌ `to_serializable()` - Removed backwards compatibility alias
|
||||
|
||||
#### Added/Updated
|
||||
- ✅ `add_relationship(rel_type, rel_hash)` - Generic relationship management
|
||||
- ✅ `get_relationships()` - Returns copy of relationships dict
|
||||
- ✅ `to_dict()` - Updated to serialize new fields
|
||||
|
||||
## Updated Files
|
||||
|
||||
### cmdlets/_shared.py
|
||||
- Updated `coerce_to_pipe_object()` to use hash+store pattern
|
||||
- Now computes hash from file_path if not provided
|
||||
- Extracts relationships dict instead of individual king/alt/related fields
|
||||
- Removes all references to source/identifier/file_hash
|
||||
|
||||
### cmdlets/add_file.py
|
||||
- Updated `_update_pipe_object_destination()` signature to use hash/store
|
||||
- Updated `_resolve_source()` to use pipe_obj.hash
|
||||
- Updated `_prepare_metadata()` to use pipe_obj.hash
|
||||
- Updated `_resolve_file_hash()` to check pipe_obj.hash
|
||||
- Updated all call sites to pass hash/store instead of source/identifier/file_hash
|
||||
|
||||
### cmdlets/add_tag.py & cmdlets/add_tags.py
|
||||
- Updated to access `res.hash` instead of `res.file_hash`
|
||||
- Updated dict access to use `get('hash')` instead of `get('file_hash')`
|
||||
|
||||
### cmdlets/trim_file.py
|
||||
- Updated to access `item.hash` instead of `item.file_hash`
|
||||
- Updated dict access to use `get('hash')` only
|
||||
|
||||
### metadata.py
|
||||
- Updated IMDb, MusicBrainz, and OpenLibrary tag extraction to return dicts directly
|
||||
- Removed PipeObject instantiation with old signature (source/identifier)
|
||||
- Updated remote metadata function to return dict instead of using PipeObject
|
||||
|
||||
## Benefits
|
||||
|
||||
1. **Canonical Pattern**: All file operations now use hash+store as the single source of truth
|
||||
2. **Simplified Model**: Removed 9 legacy fields, consolidated into 2 canonical fields + relationships dict
|
||||
3. **Consistency**: All cmdlets now use the same hash+store pattern for identification
|
||||
4. **Maintainability**: One code path, no backwards compatibility burden
|
||||
5. **Type Safety**: Direct fields instead of computed properties
|
||||
6. **Flexibility**: Relationships dict allows for extensible relationship types
|
||||
|
||||
## Migration Notes
|
||||
|
||||
### Old Code
|
||||
```python
|
||||
pipe_obj = PipeObject(
|
||||
source="hydrus",
|
||||
identifier=file_hash,
|
||||
file_hash=file_hash,
|
||||
king_hash=king,
|
||||
alt_hashes=[alt1, alt2]
|
||||
)
|
||||
```
|
||||
|
||||
### New Code
|
||||
```python
|
||||
pipe_obj = PipeObject(
|
||||
hash=file_hash,
|
||||
store="hydrus",
|
||||
relationships={
|
||||
"king": king,
|
||||
"alt": [alt1, alt2]
|
||||
}
|
||||
)
|
||||
```
|
||||
|
||||
### Accessing Fields
|
||||
| Old | New |
|
||||
|-----|-----|
|
||||
| `obj.file_hash` | `obj.hash` |
|
||||
| `obj.source` | `obj.store` |
|
||||
| `obj.identifier` | `obj.hash` |
|
||||
| `obj.king_hash` | `obj.relationships.get("king")` |
|
||||
| `obj.alt_hashes` | `obj.relationships.get("alt", [])` |
|
||||
| `obj.parent_id` | `obj.parent_hash` |
|
||||
|
||||
## Zero Backwards Compatibility
|
||||
As requested, **all backwards compatibility has been removed**. Old code using the previous PipeObject signature will need to be updated to use hash+store.
|
||||
79
NEXT_DEBUG_SESSION.md
Normal file
79
NEXT_DEBUG_SESSION.md
Normal file
@@ -0,0 +1,79 @@
|
||||
NEXT DEBUGGING SESSION
|
||||
======================
|
||||
|
||||
Run these commands in sequence and watch the [add-file] and [search-store] debug logs:
|
||||
|
||||
Step 1: Search and observe table/items mismatch
|
||||
------
|
||||
$ search-store system:limit=5
|
||||
|
||||
Expected output:
|
||||
- Should see your 4 items in the table
|
||||
- Watch for: [search-store] Added X rows to table, Y items to results_list
|
||||
- If X=1 and Y=4: Problem is in table.add_result() or _ensure_storage_columns()
|
||||
- If X=4 and Y=4: Problem is in CLI selection mapping (elsewhere)
|
||||
|
||||
Step 2: Test selection with debugging
|
||||
------
|
||||
$ @2 | add-file -storage test
|
||||
|
||||
Expected output:
|
||||
- [add-file] INPUT result details should show the item you selected
|
||||
- [add-file] RESOLVED source should have hash and store
|
||||
- If either is missing/wrong: result object structure is wrong
|
||||
- If both are correct: problem is in source resolution logic
|
||||
|
||||
Step 3: If selection works
|
||||
------
|
||||
If you successfully select @2 and add-file processes it:
|
||||
- Congratulations! The issue was a one-time glitch
|
||||
- If it fails again, compare debug logs to this run
|
||||
|
||||
Step 4: If selection still fails
|
||||
------
|
||||
Collect these logs:
|
||||
1. Output of: search-store system:limit=5
|
||||
2. Output of: @2 | add-file -storage test
|
||||
3. Run diagnostic command to verify table state:
|
||||
$ search-store system:limit=5 | .pipe
|
||||
(This will show what .pipe sees in the results)
|
||||
|
||||
Step 5: Understanding @N selection format
|
||||
------
|
||||
When you see: [debug] first-stage: sel=[1] rows=1 items=4
|
||||
- sel=[1] means you selected @2 (0-based index: @2 = index 1)
|
||||
- rows=1 means the table object has only 1 row registered
|
||||
- items=4 means there are 4 items in the results_list
|
||||
|
||||
The fix depends on which is wrong:
|
||||
- If rows should be 4: table.add_result() isn't adding rows
|
||||
- If items should be 1: results are being duplicated somehow
|
||||
|
||||
QUICK REFERENCE: DEBUGGING COMMANDS
|
||||
===================================
|
||||
|
||||
Show debug logs:
|
||||
$ debug on
|
||||
$ search-store system:limit=5
|
||||
$ @2 | add-file -storage test
|
||||
|
||||
Check what @2 selection resolves to:
|
||||
$ @2 | get-metadata
|
||||
|
||||
Alternative (bypass @N selection issue):
|
||||
$ search-store system:limit=5 | get-metadata -store home | .pipe
|
||||
|
||||
This avoids the @N selection and directly pipes results through cmdlets.
|
||||
|
||||
EXPECTED BEHAVIOR
|
||||
================
|
||||
|
||||
Correct sequence when selection works:
|
||||
1. search-store finds 4 results
|
||||
2. [search-store] Added 4 rows to table, 4 items to results_list
|
||||
3. @2 selects item at index 1 (second item: "i ve been down")
|
||||
4. [add-file] INPUT result is dict: title=i ve been down, hash=b0780e68a2dc..., store=hydrus
|
||||
5. [add-file] RESOLVED source: path=/tmp/medios-hydrus/..., is_hydrus=True, hash=b0780e68a2dc...
|
||||
6. File is successfully added to "test" storage
|
||||
|
||||
If you see different output, the logs will show exactly where it diverges.
|
||||
127
PIPELINE_REFACTOR_SUMMARY.md
Normal file
127
PIPELINE_REFACTOR_SUMMARY.md
Normal file
@@ -0,0 +1,127 @@
|
||||
# Pipeline Refactoring Summary
|
||||
|
||||
## Overview
|
||||
Refactored `pipeline.py` to remove all backwards compatibility and legacy code, consolidating on a single modern context-based approach using `PipelineStageContext`.
|
||||
|
||||
## Changes Made
|
||||
|
||||
### 1. Removed Legacy Global Variables
|
||||
- ❌ `_PIPE_EMITS` - Replaced with `PipelineStageContext.emits`
|
||||
- ❌ `_PIPE_ACTIVE` - Replaced with checking `_CURRENT_CONTEXT is not None`
|
||||
- ❌ `_PIPE_IS_LAST` - Replaced with `PipelineStageContext.is_last_stage`
|
||||
- ❌ `_LAST_PIPELINE_CAPTURE` - Removed (unused ephemeral handoff)
|
||||
|
||||
### 2. Removed Legacy Functions
|
||||
- ❌ `set_active(bool)` - No longer needed, context tracks this
|
||||
- ❌ `set_last_stage(bool)` - No longer needed, context tracks this
|
||||
- ❌ `set_last_capture(obj)` - Removed
|
||||
- ❌ `get_last_capture()` - Removed
|
||||
|
||||
### 3. Updated Core Functions
|
||||
|
||||
#### `emit(obj)`
|
||||
**Before:** Dual-path with fallback to legacy `_PIPE_EMITS`
|
||||
```python
|
||||
if _CURRENT_CONTEXT is not None:
|
||||
_CURRENT_CONTEXT.emit(obj)
|
||||
return
|
||||
_PIPE_EMITS.append(obj) # Legacy fallback
|
||||
```
|
||||
|
||||
**After:** Single context-based path
|
||||
```python
|
||||
if _CURRENT_CONTEXT is not None:
|
||||
_CURRENT_CONTEXT.emit(obj)
|
||||
```
|
||||
|
||||
#### `emit_list(objects)`
|
||||
**Before:** Dual-path with legacy fallback
|
||||
**After:** Single context-based path, removed duplicate definition
|
||||
|
||||
#### `print_if_visible()`
|
||||
**Before:** Checked `_PIPE_ACTIVE` and `_PIPE_IS_LAST`
|
||||
```python
|
||||
should_print = (not _PIPE_ACTIVE) or _PIPE_IS_LAST
|
||||
```
|
||||
|
||||
**After:** Uses context state
|
||||
```python
|
||||
should_print = (_CURRENT_CONTEXT is None) or (_CURRENT_CONTEXT.is_last_stage)
|
||||
```
|
||||
|
||||
#### `get_emitted_items()`
|
||||
**Before:** Returned `_PIPE_EMITS`
|
||||
**After:** Returns `_CURRENT_CONTEXT.emits` if context exists
|
||||
|
||||
#### `clear_emits()`
|
||||
**Before:** Cleared global `_PIPE_EMITS`
|
||||
**After:** Clears `_CURRENT_CONTEXT.emits` if context exists
|
||||
|
||||
#### `reset()`
|
||||
**Before:** Reset 10+ legacy variables
|
||||
**After:** Only resets active state variables, sets `_CURRENT_CONTEXT = None`
|
||||
|
||||
### 4. Updated Call Sites
|
||||
|
||||
#### TUI/pipeline_runner.py
|
||||
**Before:**
|
||||
```python
|
||||
ctx.set_stage_context(pipeline_ctx)
|
||||
ctx.set_active(True)
|
||||
ctx.set_last_stage(index == total - 1)
|
||||
# ...
|
||||
ctx.set_stage_context(None)
|
||||
ctx.set_active(False)
|
||||
```
|
||||
|
||||
**After:**
|
||||
```python
|
||||
ctx.set_stage_context(pipeline_ctx)
|
||||
# ...
|
||||
ctx.set_stage_context(None)
|
||||
```
|
||||
|
||||
#### CLI.py (2 locations)
|
||||
**Before:**
|
||||
```python
|
||||
ctx.set_stage_context(pipeline_ctx)
|
||||
ctx.set_active(True)
|
||||
```
|
||||
|
||||
**After:**
|
||||
```python
|
||||
ctx.set_stage_context(pipeline_ctx)
|
||||
```
|
||||
|
||||
## Result
|
||||
|
||||
### Code Reduction
|
||||
- Removed ~15 lines of legacy global variable declarations
|
||||
- Removed ~30 lines of legacy function definitions
|
||||
- Removed ~10 lines of dual-path logic in core functions
|
||||
- Removed ~8 lines of redundant function calls at call sites
|
||||
|
||||
### Benefits
|
||||
1. **Single Source of Truth**: All pipeline state is now in `PipelineStageContext`
|
||||
2. **Cleaner API**: No redundant `set_active()` / `set_last_stage()` calls needed
|
||||
3. **Type Safety**: Context object provides better type hints and IDE support
|
||||
4. **Maintainability**: One code path to maintain, no backwards compatibility burden
|
||||
5. **Clarity**: Intent is clear - context manages all stage-related state
|
||||
|
||||
## Preserved Functionality
|
||||
All user-facing functionality remains unchanged:
|
||||
- ✅ @N selection syntax
|
||||
- ✅ Result table history (@.. and @,,)
|
||||
- ✅ Display overlays
|
||||
- ✅ Pipeline value storage/retrieval
|
||||
- ✅ Worker attribution
|
||||
- ✅ UI refresh callbacks
|
||||
- ✅ Pending pipeline tail preservation
|
||||
|
||||
## Type Checking Notes
|
||||
Some type checker warnings remain about accessing attributes on Optional types (e.g., `_LAST_RESULT_TABLE.source_command`). These are safe because:
|
||||
1. Code uses `_is_selectable_table()` runtime checks before access
|
||||
2. Functions check `is not None` before attribute access
|
||||
3. These warnings are false positives from static analysis
|
||||
|
||||
These do not represent actual runtime bugs.
|
||||
26
README.md
26
README.md
@@ -38,8 +38,32 @@ Adding your first file
|
||||
.pipe "https://www.youtube.com/watch?v=_23dFb50Z2Y" # Add URL to current playlist
|
||||
```
|
||||
|
||||
Example pipelines:
|
||||
|
||||
1. **Simple download with metadata (tags and URL registration)**:
|
||||
```
|
||||
download-media "https://www.youtube.com/watch?v=dQw4w9WgXcQ" | add-file -storage local | add-url
|
||||
```
|
||||
|
||||
2. **Download playlist item with tags**:
|
||||
```
|
||||
download-media "https://www.youtube.com/playlist?list=PLxxxxx" -item 2 | add-file -storage local | add-url
|
||||
```
|
||||
|
||||
3. **Download with merge (e.g., Bandcamp albums)**:
|
||||
```
|
||||
download-data "https://altrusiangrace.bandcamp.com/album/ancient-egyptian-legends-full-audiobook" | merge-file | add-file -storage local | add-url
|
||||
```
|
||||
|
||||
4. **Download direct file (PDF, document)**:
|
||||
```
|
||||
download-file "https://example.com/file.pdf" | add-file -storage local | add-url
|
||||
```
|
||||
|
||||
Search examples:
|
||||
|
||||
1. search-file -provider youtube "something in the way"
|
||||
|
||||
2. @1
|
||||
|
||||
1. download-data "https://altrusiangrace.bandcamp.com/album/ancient-egyptian-legends-full-audiobook" | merge-file | add-file -storage local
|
||||
3. download-media [URL] | add-file -storage local | add-url
|
||||
@@ -1,4 +1,4 @@
|
||||
"""Modal for displaying files/URLs to access in web mode."""
|
||||
"""Modal for displaying files/url to access in web mode."""
|
||||
|
||||
from textual.screen import ModalScreen
|
||||
from textual.containers import Container, Vertical, Horizontal
|
||||
@@ -93,7 +93,7 @@ class AccessModal(ModalScreen):
|
||||
yield Label("[bold cyan]File:[/bold cyan]", classes="access-label")
|
||||
|
||||
# Display as clickable link using HTML link element for web mode
|
||||
# Rich link markup `[link=URL]` has parsing issues with URLs containing special chars
|
||||
# Rich link markup `[link=URL]` has parsing issues with url containing special chars
|
||||
# Instead, use the HTML link markup that Textual-serve renders as <a> tag
|
||||
# Format: [link=URL "tooltip"]text[/link] - the quotes help with parsing
|
||||
link_text = f'[link="{self.item_content}"]Open in Browser[/link]'
|
||||
|
||||
@@ -233,8 +233,8 @@ class DownloadModal(ModalScreen):
|
||||
self.screenshot_checkbox.value = False
|
||||
self.playlist_merge_checkbox.value = False
|
||||
|
||||
# Initialize PDF playlist URLs (set by _handle_pdf_playlist)
|
||||
self.pdf_urls = []
|
||||
# Initialize PDF playlist url (set by _handle_pdf_playlist)
|
||||
self.pdf_url = []
|
||||
self.is_pdf_playlist = False
|
||||
|
||||
# Hide playlist by default (show format select)
|
||||
@@ -288,10 +288,10 @@ class DownloadModal(ModalScreen):
|
||||
|
||||
# Launch the background worker with PDF playlist info
|
||||
self._submit_worker(url, tags, source, download_enabled, playlist_selection, merge_enabled,
|
||||
is_pdf_playlist=self.is_pdf_playlist, pdf_urls=self.pdf_urls if self.is_pdf_playlist else [])
|
||||
is_pdf_playlist=self.is_pdf_playlist, pdf_url=self.pdf_url if self.is_pdf_playlist else [])
|
||||
|
||||
@work(thread=True)
|
||||
def _submit_worker(self, url: str, tags: list, source: str, download_enabled: bool, playlist_selection: str = "", merge_enabled: bool = False, is_pdf_playlist: bool = False, pdf_urls: Optional[list] = None) -> None:
|
||||
def _submit_worker(self, url: str, tags: list, source: str, download_enabled: bool, playlist_selection: str = "", merge_enabled: bool = False, is_pdf_playlist: bool = False, pdf_url: Optional[list] = None) -> None:
|
||||
"""Background worker to execute the cmdlet pipeline.
|
||||
|
||||
Args:
|
||||
@@ -302,10 +302,10 @@ class DownloadModal(ModalScreen):
|
||||
playlist_selection: Playlist track selection (e.g., "1-3", "all", "merge")
|
||||
merge_enabled: Whether to merge playlist files after download
|
||||
is_pdf_playlist: Whether this is a PDF pseudo-playlist
|
||||
pdf_urls: List of PDF URLs if is_pdf_playlist is True
|
||||
pdf_url: List of PDF url if is_pdf_playlist is True
|
||||
"""
|
||||
if pdf_urls is None:
|
||||
pdf_urls = []
|
||||
if pdf_url is None:
|
||||
pdf_url = []
|
||||
|
||||
# Initialize worker to None so outer exception handler can check it
|
||||
worker = None
|
||||
@@ -340,9 +340,9 @@ class DownloadModal(ModalScreen):
|
||||
worker.log_step("Download initiated")
|
||||
|
||||
# Handle PDF playlist specially
|
||||
if is_pdf_playlist and pdf_urls:
|
||||
logger.info(f"Processing PDF playlist with {len(pdf_urls)} PDFs")
|
||||
self._handle_pdf_playlist_download(pdf_urls, tags, playlist_selection, merge_enabled)
|
||||
if is_pdf_playlist and pdf_url:
|
||||
logger.info(f"Processing PDF playlist with {len(pdf_url)} PDFs")
|
||||
self._handle_pdf_playlist_download(pdf_url, tags, playlist_selection, merge_enabled)
|
||||
self.app.call_from_thread(self._hide_progress)
|
||||
self.app.call_from_thread(self.dismiss)
|
||||
return
|
||||
@@ -690,7 +690,7 @@ class DownloadModal(ModalScreen):
|
||||
'media_kind': 'audio',
|
||||
'hash_hex': None,
|
||||
'hash': None,
|
||||
'known_urls': [],
|
||||
'url': [],
|
||||
'title': filepath_obj.stem
|
||||
})()
|
||||
files_to_merge.append(file_result)
|
||||
@@ -934,8 +934,8 @@ class DownloadModal(ModalScreen):
|
||||
"""Scrape metadata from URL(s) in URL textarea - wipes tags and source.
|
||||
|
||||
This is triggered by Ctrl+T when URL textarea is focused.
|
||||
Supports single URL or multiple URLs (newline/comma-separated).
|
||||
For multiple PDF URLs, creates pseudo-playlist for merge workflow.
|
||||
Supports single URL or multiple url (newline/comma-separated).
|
||||
For multiple PDF url, creates pseudo-playlist for merge workflow.
|
||||
"""
|
||||
try:
|
||||
text = self.paragraph_textarea.text.strip()
|
||||
@@ -943,29 +943,29 @@ class DownloadModal(ModalScreen):
|
||||
logger.warning("No URL to scrape metadata from")
|
||||
return
|
||||
|
||||
# Parse multiple URLs (newline or comma-separated)
|
||||
urls = []
|
||||
# Parse multiple url (newline or comma-separated)
|
||||
url = []
|
||||
for line in text.split('\n'):
|
||||
line = line.strip()
|
||||
if line:
|
||||
# Handle comma-separated URLs within a line
|
||||
# Handle comma-separated url within a line
|
||||
for url in line.split(','):
|
||||
url = url.strip()
|
||||
if url:
|
||||
urls.append(url)
|
||||
url.append(url)
|
||||
|
||||
# Check if multiple URLs provided
|
||||
if len(urls) > 1:
|
||||
logger.info(f"Detected {len(urls)} URLs - checking for PDF pseudo-playlist")
|
||||
# Check if all URLs appear to be PDFs
|
||||
all_pdfs = all(url.endswith('.pdf') or 'pdf' in url.lower() for url in urls)
|
||||
# Check if multiple url provided
|
||||
if len(url) > 1:
|
||||
logger.info(f"Detected {len(url)} url - checking for PDF pseudo-playlist")
|
||||
# Check if all url appear to be PDFs
|
||||
all_pdfs = all(url.endswith('.pdf') or 'pdf' in url.lower() for url in url)
|
||||
if all_pdfs:
|
||||
logger.info(f"All URLs are PDFs - creating pseudo-playlist")
|
||||
self._handle_pdf_playlist(urls)
|
||||
logger.info(f"All url are PDFs - creating pseudo-playlist")
|
||||
self._handle_pdf_playlist(url)
|
||||
return
|
||||
|
||||
# Single URL - proceed with normal metadata scraping
|
||||
url = urls[0] if urls else text.strip()
|
||||
url = url[0] if url else text.strip()
|
||||
logger.info(f"Scraping fresh metadata from: {url}")
|
||||
|
||||
# Check if tags are already provided in textarea
|
||||
@@ -1044,21 +1044,21 @@ class DownloadModal(ModalScreen):
|
||||
)
|
||||
|
||||
|
||||
def _handle_pdf_playlist(self, pdf_urls: list) -> None:
|
||||
"""Handle multiple PDF URLs as a pseudo-playlist.
|
||||
def _handle_pdf_playlist(self, pdf_url: list) -> None:
|
||||
"""Handle multiple PDF url as a pseudo-playlist.
|
||||
|
||||
Creates a playlist-like structure with PDF metadata for merge workflow.
|
||||
Extracts title from URL or uses default naming.
|
||||
|
||||
Args:
|
||||
pdf_urls: List of PDF URLs to process
|
||||
pdf_url: List of PDF url to process
|
||||
"""
|
||||
try:
|
||||
logger.info(f"Creating PDF pseudo-playlist with {len(pdf_urls)} items")
|
||||
logger.info(f"Creating PDF pseudo-playlist with {len(pdf_url)} items")
|
||||
|
||||
# Create playlist items from PDF URLs
|
||||
# Create playlist items from PDF url
|
||||
playlist_items = []
|
||||
for idx, url in enumerate(pdf_urls, 1):
|
||||
for idx, url in enumerate(pdf_url, 1):
|
||||
# Extract filename from URL for display
|
||||
try:
|
||||
# Get filename from URL path
|
||||
@@ -1083,15 +1083,15 @@ class DownloadModal(ModalScreen):
|
||||
|
||||
# Build minimal metadata structure for UI population
|
||||
metadata = {
|
||||
'title': f'{len(pdf_urls)} PDF Documents',
|
||||
'title': f'{len(pdf_url)} PDF Documents',
|
||||
'tags': [],
|
||||
'formats': [('pdf', 'pdf')], # Default format is PDF
|
||||
'playlist_items': playlist_items,
|
||||
'is_pdf_playlist': True # Mark as PDF pseudo-playlist
|
||||
}
|
||||
|
||||
# Store URLs for later use during merge
|
||||
self.pdf_urls = pdf_urls
|
||||
# Store url for later use during merge
|
||||
self.pdf_url = pdf_url
|
||||
self.is_pdf_playlist = True
|
||||
|
||||
# Populate the modal with metadata
|
||||
@@ -1099,7 +1099,7 @@ class DownloadModal(ModalScreen):
|
||||
self._populate_from_metadata(metadata, wipe_tags_and_source=True)
|
||||
|
||||
self.app.notify(
|
||||
f"Loaded {len(pdf_urls)} PDFs as playlist",
|
||||
f"Loaded {len(pdf_url)} PDFs as playlist",
|
||||
title="PDF Playlist",
|
||||
severity="information",
|
||||
timeout=3
|
||||
@@ -1115,11 +1115,11 @@ class DownloadModal(ModalScreen):
|
||||
)
|
||||
|
||||
|
||||
def _handle_pdf_playlist_download(self, pdf_urls: list, tags: list, selection: str, merge_enabled: bool) -> None:
|
||||
def _handle_pdf_playlist_download(self, pdf_url: list, tags: list, selection: str, merge_enabled: bool) -> None:
|
||||
"""Download and merge PDF playlist.
|
||||
|
||||
Args:
|
||||
pdf_urls: List of PDF URLs to download
|
||||
pdf_url: List of PDF url to download
|
||||
tags: Tags to apply to the merged PDF
|
||||
selection: Selection string like "1-3" or "1,3,5"
|
||||
merge_enabled: Whether to merge the PDFs
|
||||
@@ -1141,7 +1141,7 @@ class DownloadModal(ModalScreen):
|
||||
# Create temporary list of playlist items for selection parsing
|
||||
# We need this because _parse_playlist_selection uses self.playlist_items
|
||||
temp_items = []
|
||||
for url in pdf_urls:
|
||||
for url in pdf_url:
|
||||
temp_items.append({'title': url})
|
||||
self.playlist_items = temp_items
|
||||
|
||||
@@ -1149,20 +1149,20 @@ class DownloadModal(ModalScreen):
|
||||
selected_indices = self._parse_playlist_selection(selection)
|
||||
if not selected_indices:
|
||||
# No valid selection, use all
|
||||
selected_indices = list(range(len(pdf_urls)))
|
||||
selected_indices = list(range(len(pdf_url)))
|
||||
|
||||
selected_urls = [pdf_urls[i] for i in selected_indices]
|
||||
selected_url = [pdf_url[i] for i in selected_indices]
|
||||
|
||||
logger.info(f"Downloading {len(selected_urls)} selected PDFs for merge")
|
||||
logger.info(f"Downloading {len(selected_url)} selected PDFs for merge")
|
||||
|
||||
# Download PDFs to temporary directory
|
||||
temp_dir = Path.home() / ".downlow_temp_pdfs"
|
||||
temp_dir.mkdir(exist_ok=True)
|
||||
|
||||
downloaded_files = []
|
||||
for idx, url in enumerate(selected_urls, 1):
|
||||
for idx, url in enumerate(selected_url, 1):
|
||||
try:
|
||||
logger.info(f"Downloading PDF {idx}/{len(selected_urls)}: {url}")
|
||||
logger.info(f"Downloading PDF {idx}/{len(selected_url)}: {url}")
|
||||
|
||||
response = requests.get(url, timeout=30)
|
||||
response.raise_for_status()
|
||||
@@ -1619,7 +1619,7 @@ class DownloadModal(ModalScreen):
|
||||
)
|
||||
return
|
||||
else:
|
||||
success_msg = "✅ download-data completed successfully"
|
||||
success_msg = "download-data completed successfully"
|
||||
logger.info(success_msg)
|
||||
if worker:
|
||||
worker.append_stdout(f"{success_msg}\n")
|
||||
@@ -1670,7 +1670,7 @@ class DownloadModal(ModalScreen):
|
||||
worker.append_stdout(f"{warning_msg}\n")
|
||||
else:
|
||||
if worker:
|
||||
worker.append_stdout("✅ Tags applied successfully\n")
|
||||
worker.append_stdout("Tags applied successfully\n")
|
||||
except Exception as e:
|
||||
error_msg = f"❌ Tagging error: {e}"
|
||||
logger.error(error_msg, exc_info=True)
|
||||
@@ -1684,7 +1684,7 @@ class DownloadModal(ModalScreen):
|
||||
worker.append_stdout(f"{warning_msg}\n")
|
||||
else:
|
||||
if worker:
|
||||
worker.append_stdout("✅ Download complete (no tags to apply)\n")
|
||||
worker.append_stdout("Download complete (no tags to apply)\n")
|
||||
|
||||
def _show_format_select(self) -> None:
|
||||
"""Show format select (always visible for single files)."""
|
||||
@@ -1770,9 +1770,9 @@ class DownloadModal(ModalScreen):
|
||||
# Namespaces to exclude (metadata-only, not user-facing)
|
||||
excluded_namespaces = {
|
||||
'hash', # Hash values (internal)
|
||||
'known_url', # URLs (internal)
|
||||
'url', # url (internal)
|
||||
'relationship', # Internal relationships
|
||||
'url', # URLs (internal)
|
||||
'url', # url (internal)
|
||||
}
|
||||
|
||||
# Add all other tags
|
||||
|
||||
@@ -350,9 +350,9 @@ class ExportModal(ModalScreen):
|
||||
if tag:
|
||||
export_tags.add(tag)
|
||||
|
||||
# For Hydrus export, filter out metadata-only tags (hash:, known_url:, relationship:)
|
||||
# For Hydrus export, filter out metadata-only tags (hash:, url:, relationship:)
|
||||
if export_to == "libraries" and library == "hydrus":
|
||||
metadata_prefixes = {'hash:', 'known_url:', 'relationship:'}
|
||||
metadata_prefixes = {'hash:', 'url:', 'relationship:'}
|
||||
export_tags = {tag for tag in export_tags if not any(tag.lower().startswith(prefix) for prefix in metadata_prefixes)}
|
||||
logger.info(f"Filtered tags for Hydrus - removed metadata tags, {len(export_tags)} tags remaining")
|
||||
|
||||
@@ -404,9 +404,9 @@ class ExportModal(ModalScreen):
|
||||
metadata = self.result_data.get('metadata', {})
|
||||
|
||||
# Extract file source info from result_data (passed by hub-ui)
|
||||
file_hash = self.result_data.get('file_hash')
|
||||
file_url = self.result_data.get('file_url')
|
||||
file_path = self.result_data.get('file_path') # For local files
|
||||
file_hash = self.result_data.get('hash') or self.result_data.get('file_hash')
|
||||
file_url = self.result_data.get('url') or self.result_data.get('file_url')
|
||||
file_path = self.result_data.get('path') or self.result_data.get('file_path') # For local files
|
||||
source = self.result_data.get('source', 'unknown')
|
||||
|
||||
# Prepare export data
|
||||
@@ -419,8 +419,11 @@ class ExportModal(ModalScreen):
|
||||
'format': file_format,
|
||||
'metadata': metadata,
|
||||
'original_data': self.result_data,
|
||||
'hash': file_hash,
|
||||
'file_hash': file_hash,
|
||||
'url': file_url,
|
||||
'file_url': file_url,
|
||||
'path': file_path,
|
||||
'file_path': file_path, # Pass file path for local files
|
||||
'source': source,
|
||||
}
|
||||
|
||||
@@ -16,7 +16,7 @@ import asyncio
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent))
|
||||
from config import load_config
|
||||
from result_table import ResultTable
|
||||
from helper.search_provider import get_provider
|
||||
from helper.provider import get_provider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -183,7 +183,7 @@ class SearchModal(ModalScreen):
|
||||
else:
|
||||
# Fallback if no columns defined
|
||||
row.add_column("Title", res.title)
|
||||
row.add_column("Target", res.target)
|
||||
row.add_column("Target", getattr(res, 'path', None) or getattr(res, 'url', None) or getattr(res, 'target', None) or '')
|
||||
|
||||
self.current_result_table = table
|
||||
|
||||
|
||||
@@ -197,8 +197,6 @@ class PipelineExecutor:
|
||||
|
||||
pipeline_ctx = ctx.PipelineStageContext(stage_index=index, total_stages=total)
|
||||
ctx.set_stage_context(pipeline_ctx)
|
||||
ctx.set_active(True)
|
||||
ctx.set_last_stage(index == total - 1)
|
||||
|
||||
try:
|
||||
return_code = cmd_fn(piped_input, list(stage_args), self._config)
|
||||
@@ -210,7 +208,6 @@ class PipelineExecutor:
|
||||
return stage
|
||||
finally:
|
||||
ctx.set_stage_context(None)
|
||||
ctx.set_active(False)
|
||||
|
||||
emitted = list(getattr(pipeline_ctx, "emits", []) or [])
|
||||
stage.emitted = emitted
|
||||
|
||||
@@ -24,70 +24,12 @@ def register(names: Iterable[str]):
|
||||
return _wrap
|
||||
|
||||
|
||||
class AutoRegister:
|
||||
"""Decorator that automatically registers a cmdlet function using CMDLET.aliases.
|
||||
|
||||
Usage:
|
||||
CMDLET = Cmdlet(
|
||||
name="delete-file",
|
||||
aliases=["del", "del-file"],
|
||||
...
|
||||
)
|
||||
|
||||
@AutoRegister(CMDLET)
|
||||
def _run(result, args, config) -> int:
|
||||
...
|
||||
|
||||
Registers the cmdlet under:
|
||||
- Its main name from CMDLET.name
|
||||
- All aliases from CMDLET.aliases
|
||||
|
||||
This allows the help display to show: "cmd: delete-file | alias: del, del-file"
|
||||
"""
|
||||
def __init__(self, cmdlet):
|
||||
self.cmdlet = cmdlet
|
||||
|
||||
def __call__(self, fn: Cmdlet) -> Cmdlet:
|
||||
"""Register fn for the main name and all aliases in cmdlet."""
|
||||
normalized_name = None
|
||||
|
||||
# Register for main name first
|
||||
if hasattr(self.cmdlet, 'name') and self.cmdlet.name:
|
||||
normalized_name = self.cmdlet.name.replace('_', '-').lower()
|
||||
REGISTRY[normalized_name] = fn
|
||||
|
||||
# Register for all aliases
|
||||
if hasattr(self.cmdlet, 'aliases') and self.cmdlet.aliases:
|
||||
for alias in self.cmdlet.aliases:
|
||||
normalized_alias = alias.replace('_', '-').lower()
|
||||
# Always register (aliases are separate from main name)
|
||||
REGISTRY[normalized_alias] = fn
|
||||
|
||||
return fn
|
||||
|
||||
|
||||
def get(cmd_name: str) -> Cmdlet | None:
|
||||
return REGISTRY.get(cmd_name.replace('_', '-').lower())
|
||||
|
||||
|
||||
def format_cmd_help(cmdlet) -> str:
|
||||
"""Format a cmdlet for help display showing cmd:name and aliases.
|
||||
|
||||
Example output: "delete-file | aliases: del, del-file"
|
||||
"""
|
||||
if not hasattr(cmdlet, 'name'):
|
||||
return str(cmdlet)
|
||||
|
||||
cmd_str = f"cmd: {cmdlet.name}"
|
||||
|
||||
if hasattr(cmdlet, 'aliases') and cmdlet.aliases:
|
||||
aliases_str = ", ".join(cmdlet.aliases)
|
||||
cmd_str += f" | aliases: {aliases_str}"
|
||||
|
||||
return cmd_str
|
||||
|
||||
|
||||
# Dynamically import all cmdlet modules in this directory (ignore files starting with _ and __init__.py)
|
||||
# Cmdlets self-register when instantiated via their __init__ method
|
||||
import os
|
||||
cmdlet_dir = os.path.dirname(__file__)
|
||||
for filename in os.listdir(cmdlet_dir):
|
||||
@@ -106,27 +48,7 @@ for filename in os.listdir(cmdlet_dir):
|
||||
continue
|
||||
|
||||
try:
|
||||
module = _import_module(f".{mod_name}", __name__)
|
||||
|
||||
# Auto-register based on CMDLET object with exec function
|
||||
# This allows cmdlets to be fully self-contained in the CMDLET object
|
||||
if hasattr(module, 'CMDLET'):
|
||||
cmdlet_obj = module.CMDLET
|
||||
|
||||
# Get the execution function from the CMDLET object
|
||||
run_fn = getattr(cmdlet_obj, 'exec', None) if hasattr(cmdlet_obj, 'exec') else None
|
||||
|
||||
if callable(run_fn):
|
||||
# Register main name
|
||||
if hasattr(cmdlet_obj, 'name') and cmdlet_obj.name:
|
||||
normalized_name = cmdlet_obj.name.replace('_', '-').lower()
|
||||
REGISTRY[normalized_name] = run_fn
|
||||
|
||||
# Register all aliases
|
||||
if hasattr(cmdlet_obj, 'aliases') and cmdlet_obj.aliases:
|
||||
for alias in cmdlet_obj.aliases:
|
||||
normalized_alias = alias.replace('_', '-').lower()
|
||||
REGISTRY[normalized_alias] = run_fn
|
||||
_import_module(f".{mod_name}", __name__)
|
||||
except Exception as e:
|
||||
import sys
|
||||
print(f"Error importing cmdlet '{mod_name}': {e}", file=sys.stderr)
|
||||
@@ -141,8 +63,6 @@ except Exception:
|
||||
pass
|
||||
|
||||
# Import root-level modules that also register cmdlets
|
||||
# Note: search_libgen, search_soulseek, and search_debrid are now consolidated into search_provider.py
|
||||
# Use search-file -provider libgen, -provider soulseek, or -provider debrid instead
|
||||
for _root_mod in ("select_cmdlet",):
|
||||
try:
|
||||
_import_module(_root_mod)
|
||||
|
||||
@@ -11,7 +11,7 @@ import sys
|
||||
import inspect
|
||||
from collections.abc import Iterable as IterableABC
|
||||
|
||||
from helper.logger import log
|
||||
from helper.logger import log, debug
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, Iterable, List, Optional, Sequence, Set
|
||||
from dataclasses import dataclass, field
|
||||
@@ -37,22 +37,9 @@ class CmdletArg:
|
||||
"""Optional handler function/callable for processing this argument's value"""
|
||||
variadic: bool = False
|
||||
"""Whether this argument accepts multiple values (consumes remaining positional args)"""
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
"""Convert to dict for backward compatibility."""
|
||||
d = {
|
||||
"name": self.name,
|
||||
"type": self.type,
|
||||
"required": self.required,
|
||||
"description": self.description,
|
||||
"variadic": self.variadic,
|
||||
}
|
||||
if self.choices:
|
||||
d["choices"] = self.choices
|
||||
if self.alias:
|
||||
d["alias"] = self.alias
|
||||
return d
|
||||
|
||||
usage: str = ""
|
||||
"""dsf"""
|
||||
|
||||
def resolve(self, value: Any) -> Any:
|
||||
"""Resolve/process the argument value using the handler if available.
|
||||
|
||||
@@ -135,11 +122,68 @@ class SharedArgs:
|
||||
|
||||
# File/Hash arguments
|
||||
HASH = CmdletArg(
|
||||
"hash",
|
||||
name="hash",
|
||||
type="string",
|
||||
description="Override the Hydrus file hash (SHA256) to target instead of the selected result."
|
||||
description="File hash (SHA256, 64-char hex string)",
|
||||
)
|
||||
|
||||
STORE = CmdletArg(
|
||||
name="store",
|
||||
type="enum",
|
||||
choices=[], # Dynamically populated via get_store_choices()
|
||||
description="Selects store",
|
||||
)
|
||||
|
||||
PATH = CmdletArg(
|
||||
name="path",
|
||||
type="string",
|
||||
choices=[], # Dynamically populated via get_store_choices()
|
||||
description="Selects store",
|
||||
)
|
||||
|
||||
URL = CmdletArg(
|
||||
name="url",
|
||||
type="string",
|
||||
description="http parser",
|
||||
)
|
||||
|
||||
@staticmethod
|
||||
def get_store_choices(config: Optional[Dict[str, Any]] = None) -> List[str]:
|
||||
"""Get list of available storage backend names from FileStorage.
|
||||
|
||||
This method dynamically discovers all configured storage backends
|
||||
instead of using a static list. Should be called when building
|
||||
autocomplete choices or validating store names.
|
||||
|
||||
Args:
|
||||
config: Optional config dict. If not provided, will try to load from config module.
|
||||
|
||||
Returns:
|
||||
List of backend names (e.g., ['default', 'test', 'home', 'work'])
|
||||
|
||||
Example:
|
||||
# In a cmdlet that needs dynamic choices
|
||||
from helper.store import FileStorage
|
||||
storage = FileStorage(config)
|
||||
SharedArgs.STORE.choices = SharedArgs.get_store_choices(config)
|
||||
"""
|
||||
try:
|
||||
from helper.store import FileStorage
|
||||
|
||||
# If no config provided, try to load it
|
||||
if config is None:
|
||||
try:
|
||||
from config import load_config
|
||||
config = load_config()
|
||||
except Exception:
|
||||
return []
|
||||
|
||||
file_storage = FileStorage(config)
|
||||
return file_storage.list_backends()
|
||||
except Exception:
|
||||
# Fallback to empty list if FileStorage isn't available
|
||||
return []
|
||||
|
||||
LOCATION = CmdletArg(
|
||||
"location",
|
||||
type="enum",
|
||||
@@ -205,16 +249,7 @@ class SharedArgs:
|
||||
type="string",
|
||||
description="Output file path."
|
||||
)
|
||||
|
||||
STORAGE = CmdletArg(
|
||||
"storage",
|
||||
type="enum",
|
||||
choices=["hydrus", "local", "ftp", "matrix"],
|
||||
required=False,
|
||||
description="Storage location or destination for saving/uploading files.",
|
||||
alias="s",
|
||||
handler=lambda val: SharedArgs.resolve_storage(val) if val else None
|
||||
)
|
||||
|
||||
|
||||
# Generic arguments
|
||||
QUERY = CmdletArg(
|
||||
@@ -325,78 +360,61 @@ class Cmdlet:
|
||||
log(cmd.name) # "add-file"
|
||||
log(cmd.summary) # "Upload a media file"
|
||||
log(cmd.args[0].name) # "location"
|
||||
|
||||
# Convert to dict for JSON serialization
|
||||
log(json.dumps(cmd.to_dict()))
|
||||
"""
|
||||
name: str
|
||||
"""Cmdlet name, e.g., 'add-file'"""
|
||||
""""""
|
||||
summary: str
|
||||
"""One-line summary of the cmdlet"""
|
||||
usage: str
|
||||
"""Usage string, e.g., 'add-file <location> [-delete]'"""
|
||||
aliases: List[str] = field(default_factory=list)
|
||||
alias: List[str] = field(default_factory=list)
|
||||
"""List of aliases for this cmdlet, e.g., ['add', 'add-f']"""
|
||||
args: List[CmdletArg] = field(default_factory=list)
|
||||
arg: List[CmdletArg] = field(default_factory=list)
|
||||
"""List of arguments accepted by this cmdlet"""
|
||||
details: List[str] = field(default_factory=list)
|
||||
detail: List[str] = field(default_factory=list)
|
||||
"""Detailed explanation lines (for help text)"""
|
||||
exec: Optional[Any] = field(default=None)
|
||||
"""The execution function: func(result, args, config) -> int"""
|
||||
|
||||
def __post_init__(self) -> None:
|
||||
"""Auto-discover _run function if exec not explicitly provided.
|
||||
|
||||
If exec is None, looks for a _run function in the module where
|
||||
this Cmdlet was instantiated and uses it automatically.
|
||||
"""
|
||||
if self.exec is None:
|
||||
# Walk up the call stack to find _run in the calling module
|
||||
frame = inspect.currentframe()
|
||||
try:
|
||||
# Walk up frames until we find one with _run in globals
|
||||
while frame:
|
||||
if '_run' in frame.f_globals:
|
||||
self.exec = frame.f_globals['_run']
|
||||
break
|
||||
frame = frame.f_back
|
||||
finally:
|
||||
del frame # Avoid reference cycles
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
"""Convert to dict for backward compatibility with existing code.
|
||||
|
||||
Returns a dict matching the old CMDLET format so existing code
|
||||
that expects a dict will still work.
|
||||
"""
|
||||
# Format command for display: "cmd: name alias: alias1, alias2"
|
||||
cmd_display = f"cmd: {self.name}"
|
||||
if self.aliases:
|
||||
aliases_str = ", ".join(self.aliases)
|
||||
cmd_display += f" alias: {aliases_str}"
|
||||
|
||||
return {
|
||||
"name": self.name,
|
||||
"summary": self.summary,
|
||||
"usage": self.usage,
|
||||
"cmd": cmd_display, # Display-friendly command name with aliases on one line
|
||||
"aliases": self.aliases,
|
||||
"args": [arg.to_dict() for arg in self.args],
|
||||
"details": self.details,
|
||||
}
|
||||
|
||||
def __getitem__(self, key: str) -> Any:
|
||||
"""Dict-like access for backward compatibility.
|
||||
|
||||
Allows code like: cmdlet["name"] or cmdlet["args"]
|
||||
"""
|
||||
d = self.to_dict()
|
||||
return d.get(key)
|
||||
|
||||
def get(self, key: str, default: Any = None) -> Any:
|
||||
"""Dict-like get() method for backward compatibility."""
|
||||
d = self.to_dict()
|
||||
return d.get(key, default)
|
||||
|
||||
|
||||
def _collect_names(self) -> List[str]:
|
||||
"""Collect primary name plus aliases, de-duplicated and normalized."""
|
||||
names: List[str] = []
|
||||
if self.name:
|
||||
names.append(self.name)
|
||||
for alias in (self.alias or []):
|
||||
if alias:
|
||||
names.append(alias)
|
||||
for alias in (getattr(self, "aliases", None) or []):
|
||||
if alias:
|
||||
names.append(alias)
|
||||
|
||||
seen: Set[str] = set()
|
||||
deduped: List[str] = []
|
||||
for name in names:
|
||||
key = name.replace("_", "-").lower()
|
||||
if key in seen:
|
||||
continue
|
||||
seen.add(key)
|
||||
deduped.append(name)
|
||||
return deduped
|
||||
|
||||
def register(self) -> "Cmdlet":
|
||||
"""Register this cmdlet's exec under its name and aliases."""
|
||||
if not callable(self.exec):
|
||||
return self
|
||||
try:
|
||||
from . import register as _register # Local import to avoid circular import cost
|
||||
except Exception:
|
||||
return self
|
||||
|
||||
names = self._collect_names()
|
||||
if not names:
|
||||
return self
|
||||
|
||||
_register(names)(self.exec)
|
||||
return self
|
||||
|
||||
def get_flags(self, arg_name: str) -> set[str]:
|
||||
"""Generate -name and --name flag variants for an argument.
|
||||
@@ -432,7 +450,7 @@ class Cmdlet:
|
||||
elif low in flags.get('tag', set()):
|
||||
# handle tag
|
||||
"""
|
||||
return {arg.name: self.get_flags(arg.name) for arg in self.args}
|
||||
return {arg.name: self.get_flags(arg.name) for arg in self.arg}
|
||||
|
||||
|
||||
# Tag groups cache (loaded from JSON config file)
|
||||
@@ -479,19 +497,19 @@ def parse_cmdlet_args(args: Sequence[str], cmdlet_spec: Dict[str, Any] | Cmdlet)
|
||||
"""
|
||||
result: Dict[str, Any] = {}
|
||||
|
||||
# Handle both dict and Cmdlet objects
|
||||
if isinstance(cmdlet_spec, Cmdlet):
|
||||
cmdlet_spec = cmdlet_spec.to_dict()
|
||||
# Only accept Cmdlet objects
|
||||
if not isinstance(cmdlet_spec, Cmdlet):
|
||||
raise TypeError(f"Expected Cmdlet, got {type(cmdlet_spec).__name__}")
|
||||
|
||||
# Build arg specs tracking which are positional vs flagged
|
||||
arg_specs: List[Dict[str, Any]] = cmdlet_spec.get("args", [])
|
||||
positional_args: List[Dict[str, Any]] = [] # args without prefix in definition
|
||||
flagged_args: List[Dict[str, Any]] = [] # args with prefix in definition
|
||||
# Build arg specs from cmdlet
|
||||
arg_specs: List[CmdletArg] = cmdlet_spec.arg
|
||||
positional_args: List[CmdletArg] = [] # args without prefix in definition
|
||||
flagged_args: List[CmdletArg] = [] # args with prefix in definition
|
||||
|
||||
arg_spec_map: Dict[str, str] = {} # prefix variant -> canonical name (without prefix)
|
||||
|
||||
for spec in arg_specs:
|
||||
name = spec.get("name")
|
||||
name = spec.name
|
||||
if not name:
|
||||
continue
|
||||
|
||||
@@ -520,10 +538,10 @@ def parse_cmdlet_args(args: Sequence[str], cmdlet_spec: Dict[str, Any] | Cmdlet)
|
||||
# Check if this token is a known flagged argument
|
||||
if token_lower in arg_spec_map:
|
||||
canonical_name = arg_spec_map[token_lower]
|
||||
spec = next((s for s in arg_specs if str(s.get("name", "")).lstrip("-").lower() == canonical_name.lower()), None)
|
||||
spec = next((s for s in arg_specs if str(s.name).lstrip("-").lower() == canonical_name.lower()), None)
|
||||
|
||||
# Check if it's a flag type (which doesn't consume next value, just marks presence)
|
||||
is_flag = spec and spec.get("type") == "flag"
|
||||
is_flag = spec and spec.type == "flag"
|
||||
|
||||
if is_flag:
|
||||
# For flags, just mark presence without consuming next token
|
||||
@@ -535,7 +553,7 @@ def parse_cmdlet_args(args: Sequence[str], cmdlet_spec: Dict[str, Any] | Cmdlet)
|
||||
value = args[i + 1]
|
||||
|
||||
# Check if variadic
|
||||
is_variadic = spec and spec.get("variadic", False)
|
||||
is_variadic = spec and spec.variadic
|
||||
if is_variadic:
|
||||
if canonical_name not in result:
|
||||
result[canonical_name] = []
|
||||
@@ -550,8 +568,8 @@ def parse_cmdlet_args(args: Sequence[str], cmdlet_spec: Dict[str, Any] | Cmdlet)
|
||||
# Otherwise treat as positional if we have positional args remaining
|
||||
elif positional_index < len(positional_args):
|
||||
positional_spec = positional_args[positional_index]
|
||||
canonical_name = str(positional_spec.get("name", "")).lstrip("-")
|
||||
is_variadic = positional_spec.get("variadic", False)
|
||||
canonical_name = str(positional_spec.name).lstrip("-")
|
||||
is_variadic = positional_spec.variadic
|
||||
|
||||
if is_variadic:
|
||||
# For variadic args, append to a list
|
||||
@@ -591,6 +609,183 @@ def normalize_hash(hash_hex: Optional[str]) -> Optional[str]:
|
||||
return text.lower() if text else None
|
||||
|
||||
|
||||
def get_hash_for_operation(override_hash: Optional[str], result: Any, field_name: str = "hash_hex") -> Optional[str]:
|
||||
"""Get normalized hash from override or result object, consolidating common pattern.
|
||||
|
||||
Eliminates repeated pattern: normalize_hash(override) if override else normalize_hash(get_field(result, ...))
|
||||
|
||||
Args:
|
||||
override_hash: Hash passed as command argument (takes precedence)
|
||||
result: Object containing hash field (fallback)
|
||||
field_name: Name of hash field in result object (default: "hash_hex")
|
||||
|
||||
Returns:
|
||||
Normalized hash string, or None if neither override nor result provides valid hash
|
||||
"""
|
||||
if override_hash:
|
||||
return normalize_hash(override_hash)
|
||||
# Try multiple field names for robustness
|
||||
hash_value = get_field(result, field_name) or getattr(result, field_name, None) or getattr(result, "hash", None) or result.get("file_hash") if isinstance(result, dict) else None
|
||||
return normalize_hash(hash_value)
|
||||
|
||||
|
||||
def fetch_hydrus_metadata(config: Any, hash_hex: str, **kwargs) -> tuple[Optional[Dict[str, Any]], Optional[int]]:
|
||||
"""Fetch metadata from Hydrus for a given hash, consolidating common fetch pattern.
|
||||
|
||||
Eliminates repeated boilerplate: client initialization, error handling, metadata extraction.
|
||||
|
||||
Args:
|
||||
config: Configuration object (passed to hydrus_wrapper.get_client)
|
||||
hash_hex: File hash to fetch metadata for
|
||||
**kwargs: Additional arguments to pass to client.fetch_file_metadata()
|
||||
Common: include_service_keys_to_tags, include_notes, include_file_url, include_duration, etc.
|
||||
|
||||
Returns:
|
||||
Tuple of (metadata_dict, error_code)
|
||||
- metadata_dict: Dict from Hydrus (first item in metadata list) or None if unavailable
|
||||
- error_code: 0 on success, 1 on any error (suitable for returning from cmdlet execute())
|
||||
"""
|
||||
from helper import hydrus
|
||||
hydrus_wrapper = hydrus
|
||||
|
||||
try:
|
||||
client = hydrus_wrapper.get_client(config)
|
||||
except Exception as exc:
|
||||
log(f"Hydrus client unavailable: {exc}")
|
||||
return None, 1
|
||||
|
||||
if client is None:
|
||||
log("Hydrus client unavailable")
|
||||
return None, 1
|
||||
|
||||
try:
|
||||
payload = client.fetch_file_metadata(hashes=[hash_hex], **kwargs)
|
||||
except Exception as exc:
|
||||
log(f"Hydrus metadata fetch failed: {exc}")
|
||||
return None, 1
|
||||
|
||||
items = payload.get("metadata") if isinstance(payload, dict) else None
|
||||
meta = items[0] if (isinstance(items, list) and items and isinstance(items[0], dict)) else None
|
||||
|
||||
return meta, 0
|
||||
|
||||
|
||||
def get_origin(obj: Any, default: Optional[str] = None) -> Optional[str]:
|
||||
"""Extract origin field with fallback to store/source field, consolidating common pattern.
|
||||
|
||||
Supports both dict and object access patterns.
|
||||
|
||||
Args:
|
||||
obj: Object (dict or dataclass) with 'store', 'origin', or 'source' field
|
||||
default: Default value if none of the fields are found
|
||||
|
||||
Returns:
|
||||
Store/origin/source string, or default if none exist
|
||||
"""
|
||||
if isinstance(obj, dict):
|
||||
return obj.get("store") or obj.get("origin") or obj.get("source") or default
|
||||
else:
|
||||
return getattr(obj, "store", None) or getattr(obj, "origin", None) or getattr(obj, "source", None) or default
|
||||
|
||||
|
||||
def get_field(obj: Any, field: str, default: Optional[Any] = None) -> Any:
|
||||
"""Extract a field from either a dict or object with fallback default.
|
||||
|
||||
Handles both dict.get(field) and getattr(obj, field) access patterns.
|
||||
Also handles lists by accessing the first element.
|
||||
For PipeObjects, checks the extra field as well.
|
||||
Used throughout cmdlets to uniformly access fields from mixed types.
|
||||
|
||||
Args:
|
||||
obj: Dict, object, or list to extract from
|
||||
field: Field name to retrieve
|
||||
default: Value to return if field not found (default: None)
|
||||
|
||||
Returns:
|
||||
Field value if found, otherwise the default value
|
||||
|
||||
Examples:
|
||||
get_field(result, "hash") # From dict or object
|
||||
get_field(result, "origin", "unknown") # With default
|
||||
"""
|
||||
# Handle lists by accessing the first element
|
||||
if isinstance(obj, list) and obj:
|
||||
obj = obj[0]
|
||||
|
||||
if isinstance(obj, dict):
|
||||
# Direct lookup first
|
||||
val = obj.get(field, default)
|
||||
if val is not None:
|
||||
return val
|
||||
# Fallback aliases for common fields
|
||||
if field == "path":
|
||||
for alt in ("file_path", "target", "filepath", "file"):
|
||||
v = obj.get(alt)
|
||||
if v:
|
||||
return v
|
||||
if field == "hash":
|
||||
for alt in ("file_hash", "hash_hex"):
|
||||
v = obj.get(alt)
|
||||
if v:
|
||||
return v
|
||||
if field == "store":
|
||||
for alt in ("storage", "storage_source", "origin"):
|
||||
v = obj.get(alt)
|
||||
if v:
|
||||
return v
|
||||
return default
|
||||
else:
|
||||
# Try direct attribute access first
|
||||
value = getattr(obj, field, None)
|
||||
if value is not None:
|
||||
return value
|
||||
|
||||
# Attribute fallback aliases for common fields
|
||||
if field == "path":
|
||||
for alt in ("file_path", "target", "filepath", "file", "url"):
|
||||
v = getattr(obj, alt, None)
|
||||
if v:
|
||||
return v
|
||||
if field == "hash":
|
||||
for alt in ("file_hash", "hash_hex"):
|
||||
v = getattr(obj, alt, None)
|
||||
if v:
|
||||
return v
|
||||
if field == "store":
|
||||
for alt in ("storage", "storage_source", "origin"):
|
||||
v = getattr(obj, alt, None)
|
||||
if v:
|
||||
return v
|
||||
|
||||
# For PipeObjects, also check the extra field
|
||||
if hasattr(obj, 'extra') and isinstance(obj.extra, dict):
|
||||
return obj.extra.get(field, default)
|
||||
|
||||
return default
|
||||
|
||||
|
||||
def should_show_help(args: Sequence[str]) -> bool:
|
||||
"""Check if help flag was passed in arguments.
|
||||
|
||||
Consolidates repeated pattern of checking for help flags across cmdlets.
|
||||
|
||||
Args:
|
||||
args: Command arguments to check
|
||||
|
||||
Returns:
|
||||
True if any help flag is present (-?, /?, --help, -h, help, --cmdlet)
|
||||
|
||||
Examples:
|
||||
if should_show_help(args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
"""
|
||||
try:
|
||||
return any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args)
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
|
||||
def looks_like_hash(candidate: Optional[str]) -> bool:
|
||||
"""Check if a string looks like a SHA256 hash (64 hex chars).
|
||||
|
||||
@@ -609,8 +804,8 @@ def looks_like_hash(candidate: Optional[str]) -> bool:
|
||||
def pipeline_item_local_path(item: Any) -> Optional[str]:
|
||||
"""Extract local file path from a pipeline item.
|
||||
|
||||
Supports both dataclass objects with .target attribute and dicts.
|
||||
Returns None for HTTP/HTTPS URLs.
|
||||
Supports both dataclass objects with .path attribute and dicts.
|
||||
Returns None for HTTP/HTTPS url.
|
||||
|
||||
Args:
|
||||
item: Pipeline item (PipelineItem dataclass, dict, or other)
|
||||
@@ -618,15 +813,15 @@ def pipeline_item_local_path(item: Any) -> Optional[str]:
|
||||
Returns:
|
||||
Local file path string, or None if item is not a local file
|
||||
"""
|
||||
target: Optional[str] = None
|
||||
if hasattr(item, "target"):
|
||||
target = getattr(item, "target", None)
|
||||
path_value: Optional[str] = None
|
||||
if hasattr(item, "path"):
|
||||
path_value = getattr(item, "path", None)
|
||||
elif isinstance(item, dict):
|
||||
raw = item.get("target") or item.get("path") or item.get("url")
|
||||
target = str(raw) if raw is not None else None
|
||||
if not isinstance(target, str):
|
||||
raw = item.get("path") or item.get("url")
|
||||
path_value = str(raw) if raw is not None else None
|
||||
if not isinstance(path_value, str):
|
||||
return None
|
||||
text = target.strip()
|
||||
text = path_value.strip()
|
||||
if not text:
|
||||
return None
|
||||
if text.lower().startswith(("http://", "https://")):
|
||||
@@ -686,22 +881,60 @@ def collect_relationship_labels(payload: Any, label_stack: List[str] | None = No
|
||||
|
||||
def parse_tag_arguments(arguments: Sequence[str]) -> List[str]:
|
||||
"""Parse tag arguments from command line tokens.
|
||||
|
||||
Handles both space-separated and comma-separated tags.
|
||||
Example: parse_tag_arguments(["tag1,tag2", "tag3"]) -> ["tag1", "tag2", "tag3"]
|
||||
|
||||
|
||||
- Supports comma-separated tags.
|
||||
- Supports pipe namespace shorthand: "artist:A|B|C" -> artist:A, artist:B, artist:C.
|
||||
|
||||
Args:
|
||||
arguments: Sequence of argument strings
|
||||
|
||||
|
||||
Returns:
|
||||
List of normalized tag strings (empty strings filtered out)
|
||||
"""
|
||||
|
||||
def _expand_pipe_namespace(text: str) -> List[str]:
|
||||
parts = text.split('|')
|
||||
expanded: List[str] = []
|
||||
last_ns: Optional[str] = None
|
||||
for part in parts:
|
||||
segment = part.strip()
|
||||
if not segment:
|
||||
continue
|
||||
if ':' in segment:
|
||||
ns, val = segment.split(':', 1)
|
||||
ns = ns.strip()
|
||||
val = val.strip()
|
||||
last_ns = ns or last_ns
|
||||
if last_ns and val:
|
||||
expanded.append(f"{last_ns}:{val}")
|
||||
elif ns or val:
|
||||
expanded.append(f"{ns}:{val}".strip(':'))
|
||||
else:
|
||||
if last_ns:
|
||||
expanded.append(f"{last_ns}:{segment}")
|
||||
else:
|
||||
expanded.append(segment)
|
||||
return expanded
|
||||
|
||||
tags: List[str] = []
|
||||
for argument in arguments:
|
||||
for token in argument.split(','):
|
||||
text = token.strip()
|
||||
if text:
|
||||
tags.append(text)
|
||||
if not text:
|
||||
continue
|
||||
# Expand namespace shorthand with pipes
|
||||
pipe_expanded = _expand_pipe_namespace(text)
|
||||
for entry in pipe_expanded:
|
||||
candidate = entry.strip()
|
||||
if not candidate:
|
||||
continue
|
||||
if ':' in candidate:
|
||||
ns, val = candidate.split(':', 1)
|
||||
ns = ns.strip()
|
||||
val = val.strip()
|
||||
candidate = f"{ns}:{val}" if ns or val else ""
|
||||
if candidate:
|
||||
tags.append(candidate)
|
||||
return tags
|
||||
|
||||
|
||||
@@ -944,7 +1177,7 @@ def create_pipe_object_result(
|
||||
result = {
|
||||
'source': source,
|
||||
'id': identifier,
|
||||
'file_path': file_path,
|
||||
'path': file_path,
|
||||
'action': f'cmdlet:{cmdlet_name}', # Format: cmdlet:cmdlet_name
|
||||
}
|
||||
|
||||
@@ -952,6 +1185,7 @@ def create_pipe_object_result(
|
||||
result['title'] = title
|
||||
if file_hash:
|
||||
result['file_hash'] = file_hash
|
||||
result['hash'] = file_hash
|
||||
if is_temp:
|
||||
result['is_temp'] = True
|
||||
if parent_hash:
|
||||
@@ -959,6 +1193,13 @@ def create_pipe_object_result(
|
||||
if tags:
|
||||
result['tags'] = tags
|
||||
|
||||
# Canonical store field: use source for compatibility
|
||||
try:
|
||||
if source:
|
||||
result['store'] = source
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Add any extra fields
|
||||
result.update(extra)
|
||||
|
||||
@@ -996,13 +1237,13 @@ def get_pipe_object_path(pipe_object: Any) -> Optional[str]:
|
||||
"""Extract file path from PipeObject, dict, or pipeline-friendly object."""
|
||||
if pipe_object is None:
|
||||
return None
|
||||
for attr in ('file_path', 'path', 'target'):
|
||||
for attr in ('path', 'target'):
|
||||
if hasattr(pipe_object, attr):
|
||||
value = getattr(pipe_object, attr)
|
||||
if value:
|
||||
return value
|
||||
if isinstance(pipe_object, dict):
|
||||
for key in ('file_path', 'path', 'target'):
|
||||
for key in ('path', 'target'):
|
||||
value = pipe_object.get(key)
|
||||
if value:
|
||||
return value
|
||||
@@ -1209,40 +1450,40 @@ def extract_title_from_result(result: Any) -> Optional[str]:
|
||||
return None
|
||||
|
||||
|
||||
def extract_known_urls_from_result(result: Any) -> list[str]:
|
||||
urls: list[str] = []
|
||||
def extract_url_from_result(result: Any) -> list[str]:
|
||||
url: list[str] = []
|
||||
|
||||
def _extend(candidate: Any) -> None:
|
||||
if not candidate:
|
||||
return
|
||||
if isinstance(candidate, list):
|
||||
urls.extend(candidate)
|
||||
url.extend(candidate)
|
||||
elif isinstance(candidate, str):
|
||||
urls.append(candidate)
|
||||
url.append(candidate)
|
||||
|
||||
if isinstance(result, models.PipeObject):
|
||||
_extend(result.extra.get('known_urls'))
|
||||
_extend(result.extra.get('url'))
|
||||
_extend(result.extra.get('url')) # Also check singular url
|
||||
if isinstance(result.metadata, dict):
|
||||
_extend(result.metadata.get('known_urls'))
|
||||
_extend(result.metadata.get('urls'))
|
||||
_extend(result.metadata.get('url'))
|
||||
elif hasattr(result, 'known_urls') or hasattr(result, 'urls'):
|
||||
# Handle objects with known_urls/urls attribute
|
||||
_extend(getattr(result, 'known_urls', None))
|
||||
_extend(getattr(result, 'urls', None))
|
||||
_extend(result.metadata.get('url'))
|
||||
_extend(result.metadata.get('url'))
|
||||
elif hasattr(result, 'url') or hasattr(result, 'url'):
|
||||
# Handle objects with url/url attribute
|
||||
_extend(getattr(result, 'url', None))
|
||||
_extend(getattr(result, 'url', None))
|
||||
|
||||
if isinstance(result, dict):
|
||||
_extend(result.get('known_urls'))
|
||||
_extend(result.get('urls'))
|
||||
_extend(result.get('url'))
|
||||
_extend(result.get('url'))
|
||||
_extend(result.get('url'))
|
||||
extra = result.get('extra')
|
||||
if isinstance(extra, dict):
|
||||
_extend(extra.get('known_urls'))
|
||||
_extend(extra.get('urls'))
|
||||
_extend(extra.get('url'))
|
||||
_extend(extra.get('url'))
|
||||
_extend(extra.get('url'))
|
||||
|
||||
return merge_sequences(urls, case_sensitive=True)
|
||||
return merge_sequences(url, case_sensitive=True)
|
||||
|
||||
|
||||
def extract_relationships(result: Any) -> Optional[Dict[str, Any]]:
|
||||
@@ -1272,3 +1513,248 @@ def extract_duration(result: Any) -> Optional[float]:
|
||||
return float(duration)
|
||||
except (TypeError, ValueError):
|
||||
return None
|
||||
|
||||
|
||||
def coerce_to_pipe_object(value: Any, default_path: Optional[str] = None) -> models.PipeObject:
|
||||
"""Normalize any incoming result to a PipeObject for single-source-of-truth state.
|
||||
|
||||
Uses hash+store canonical pattern.
|
||||
"""
|
||||
# Debug: Print ResultItem details if coming from search_file.py
|
||||
try:
|
||||
from helper.logger import is_debug_enabled, debug
|
||||
if is_debug_enabled() and hasattr(value, '__class__') and value.__class__.__name__ == 'ResultItem':
|
||||
debug("[ResultItem -> PipeObject conversion]")
|
||||
debug(f" origin={getattr(value, 'origin', None)}")
|
||||
debug(f" title={getattr(value, 'title', None)}")
|
||||
debug(f" target={getattr(value, 'target', None)}")
|
||||
debug(f" hash_hex={getattr(value, 'hash_hex', None)}")
|
||||
debug(f" media_kind={getattr(value, 'media_kind', None)}")
|
||||
debug(f" tags={getattr(value, 'tags', None)}")
|
||||
debug(f" tag_summary={getattr(value, 'tag_summary', None)}")
|
||||
debug(f" size_bytes={getattr(value, 'size_bytes', None)}")
|
||||
debug(f" duration_seconds={getattr(value, 'duration_seconds', None)}")
|
||||
debug(f" relationships={getattr(value, 'relationships', None)}")
|
||||
debug(f" url={getattr(value, 'url', None)}")
|
||||
debug(f" full_metadata keys={list(getattr(value, 'full_metadata', {}).keys()) if hasattr(value, 'full_metadata') and value.full_metadata else []}")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if isinstance(value, models.PipeObject):
|
||||
return value
|
||||
|
||||
known_keys = {
|
||||
"hash", "store", "tags", "title", "url", "source_url", "duration", "metadata",
|
||||
"warnings", "path", "relationships", "is_temp", "action", "parent_hash",
|
||||
}
|
||||
|
||||
# Convert ResultItem to dict to preserve all attributes
|
||||
if hasattr(value, 'to_dict'):
|
||||
value = value.to_dict()
|
||||
|
||||
if isinstance(value, dict):
|
||||
# Extract hash and store (canonical identifiers)
|
||||
hash_val = value.get("hash") or value.get("file_hash")
|
||||
# Recognize multiple possible store naming conventions (store, origin, storage, storage_source)
|
||||
store_val = value.get("store") or value.get("origin") or value.get("storage") or value.get("storage_source") or "PATH"
|
||||
# If the store value is embedded under extra, also detect it
|
||||
if not store_val or store_val in ("local", "PATH"):
|
||||
extra_store = None
|
||||
try:
|
||||
extra_store = value.get("extra", {}).get("store") or value.get("extra", {}).get("storage") or value.get("extra", {}).get("storage_source")
|
||||
except Exception:
|
||||
extra_store = None
|
||||
if extra_store:
|
||||
store_val = extra_store
|
||||
|
||||
# If no hash, try to compute from path or use placeholder
|
||||
if not hash_val:
|
||||
path_val = value.get("path")
|
||||
if path_val:
|
||||
try:
|
||||
from helper.utils import sha256_file
|
||||
from pathlib import Path
|
||||
hash_val = sha256_file(Path(path_val))
|
||||
except Exception:
|
||||
hash_val = "unknown"
|
||||
else:
|
||||
hash_val = "unknown"
|
||||
|
||||
# Extract title from filename if not provided
|
||||
title_val = value.get("title")
|
||||
if not title_val:
|
||||
path_val = value.get("path")
|
||||
if path_val:
|
||||
try:
|
||||
from pathlib import Path
|
||||
title_val = Path(path_val).stem
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
extra = {k: v for k, v in value.items() if k not in known_keys}
|
||||
|
||||
# Extract URL: prefer direct url field, then url list
|
||||
url_val = value.get("url")
|
||||
if not url_val:
|
||||
url = value.get("url") or value.get("url") or []
|
||||
if url and isinstance(url, list) and len(url) > 0:
|
||||
url_val = url[0]
|
||||
# Preserve url in extra if multiple url exist
|
||||
if url and len(url) > 1:
|
||||
extra["url"] = url
|
||||
|
||||
# Extract relationships
|
||||
rels = value.get("relationships") or {}
|
||||
|
||||
# Consolidate tags: prefer tags_set over tags, tag_summary
|
||||
tags_val = []
|
||||
if "tags_set" in value and value["tags_set"]:
|
||||
tags_val = list(value["tags_set"])
|
||||
elif "tags" in value and isinstance(value["tags"], (list, set)):
|
||||
tags_val = list(value["tags"])
|
||||
elif "tag" in value:
|
||||
# Single tag string or list
|
||||
if isinstance(value["tag"], list):
|
||||
tags_val = value["tag"] # Already a list
|
||||
else:
|
||||
tags_val = [value["tag"]] # Wrap single string in list
|
||||
|
||||
# Consolidate path: prefer explicit path key, but NOT target if it's a URL
|
||||
path_val = value.get("path")
|
||||
# Only use target as path if it's not a URL (url should stay in url field)
|
||||
if not path_val and "target" in value:
|
||||
target = value["target"]
|
||||
if target and not (isinstance(target, str) and (target.startswith("http://") or target.startswith("https://"))):
|
||||
path_val = target
|
||||
|
||||
# If the path value is actually a URL, move it to url_val and clear path_val
|
||||
try:
|
||||
if isinstance(path_val, str) and (path_val.startswith("http://") or path_val.startswith("https://")):
|
||||
# Prefer existing url_val if present, otherwise move path_val into url_val
|
||||
if not url_val:
|
||||
url_val = path_val
|
||||
path_val = None
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Extract media_kind if available
|
||||
if "media_kind" in value:
|
||||
extra["media_kind"] = value["media_kind"]
|
||||
|
||||
pipe_obj = models.PipeObject(
|
||||
hash=hash_val,
|
||||
store=store_val,
|
||||
tags=tags_val,
|
||||
title=title_val,
|
||||
url=url_val,
|
||||
source_url=value.get("source_url"),
|
||||
duration=value.get("duration") or value.get("duration_seconds"),
|
||||
metadata=value.get("metadata") or value.get("full_metadata") or {},
|
||||
warnings=list(value.get("warnings") or []),
|
||||
path=path_val,
|
||||
relationships=rels,
|
||||
is_temp=bool(value.get("is_temp", False)),
|
||||
action=value.get("action"),
|
||||
parent_hash=value.get("parent_hash") or value.get("parent_id"),
|
||||
extra=extra,
|
||||
)
|
||||
|
||||
# Debug: Print formatted table
|
||||
pipe_obj.debug_table()
|
||||
|
||||
return pipe_obj
|
||||
|
||||
# Fallback: build from path argument or bare value
|
||||
hash_val = "unknown"
|
||||
path_val = default_path or getattr(value, "path", None)
|
||||
title_val = None
|
||||
|
||||
if path_val and path_val != "unknown":
|
||||
try:
|
||||
from helper.utils import sha256_file
|
||||
from pathlib import Path
|
||||
path_obj = Path(path_val)
|
||||
hash_val = sha256_file(path_obj)
|
||||
# Extract title from filename (without extension)
|
||||
title_val = path_obj.stem
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# When coming from path argument, store should be "PATH" (file path, not a backend)
|
||||
store_val = "PATH"
|
||||
|
||||
pipe_obj = models.PipeObject(
|
||||
hash=hash_val,
|
||||
store=store_val,
|
||||
path=str(path_val) if path_val and path_val != "unknown" else None,
|
||||
title=title_val,
|
||||
tags=[],
|
||||
extra={},
|
||||
)
|
||||
|
||||
# Debug: Print formatted table
|
||||
pipe_obj.debug_table()
|
||||
|
||||
return pipe_obj
|
||||
|
||||
|
||||
def register_url_with_local_library(pipe_obj: models.PipeObject, config: Dict[str, Any]) -> bool:
|
||||
"""Register url with a file in the local library database.
|
||||
|
||||
This is called automatically by download cmdlets to ensure url are persisted
|
||||
without requiring a separate add-url step in the pipeline.
|
||||
|
||||
Args:
|
||||
pipe_obj: PipeObject with path and url
|
||||
config: Config dict containing local library path
|
||||
|
||||
Returns:
|
||||
True if url were registered, False otherwise
|
||||
"""
|
||||
|
||||
try:
|
||||
from config import get_local_storage_path
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
file_path = get_field(pipe_obj, "path")
|
||||
url_field = get_field(pipe_obj, "url", [])
|
||||
urls: List[str] = []
|
||||
if isinstance(url_field, str):
|
||||
urls = [u.strip() for u in url_field.split(",") if u.strip()]
|
||||
elif isinstance(url_field, (list, tuple)):
|
||||
urls = [u for u in url_field if isinstance(u, str) and u.strip()]
|
||||
|
||||
if not file_path or not urls:
|
||||
return False
|
||||
|
||||
path_obj = Path(file_path)
|
||||
if not path_obj.exists():
|
||||
return False
|
||||
|
||||
storage_path = get_local_storage_path(config)
|
||||
if not storage_path:
|
||||
return False
|
||||
|
||||
with FolderDB(storage_path) as db:
|
||||
file_hash = db.get_file_hash(path_obj)
|
||||
if not file_hash:
|
||||
return False
|
||||
metadata = db.get_metadata(file_hash) or {}
|
||||
existing_url = metadata.get("url") or []
|
||||
|
||||
# Add any new url
|
||||
changed = False
|
||||
for u in urls:
|
||||
if u not in existing_url:
|
||||
existing_url.append(u)
|
||||
changed = True
|
||||
|
||||
if changed:
|
||||
metadata["url"] = existing_url
|
||||
db.save_metadata(path_obj, metadata)
|
||||
return True
|
||||
|
||||
return True # url already existed
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
|
||||
1818
cmdlets/add_file.py
1818
cmdlets/add_file.py
File diff suppressed because it is too large
Load Diff
@@ -7,19 +7,19 @@ from . import register
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash, should_show_help
|
||||
from helper.logger import log
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="add-note",
|
||||
summary="Add or set a note on a Hydrus file.",
|
||||
usage="add-note [-hash <sha256>] <name> <text>",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg("hash", type="string", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
CmdletArg("name", type="string", required=True, description="The note name/key to set (e.g. 'comment', 'source', etc.)."),
|
||||
CmdletArg("text", type="string", required=True, description="The note text/content to store.", variadic=True),
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"- Notes are stored in the 'my notes' service by default.",
|
||||
],
|
||||
)
|
||||
@@ -28,12 +28,9 @@ CMDLET = Cmdlet(
|
||||
@register(["add-note", "set-note", "add_note"]) # aliases
|
||||
def add(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
|
||||
from ._shared import parse_cmdlet_args
|
||||
parsed = parse_cmdlet_args(args, CMDLET)
|
||||
|
||||
@@ -14,20 +14,20 @@ from . import register
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, parse_cmdlet_args, normalize_result_input
|
||||
from helper.local_library import read_sidecar, find_sidecar
|
||||
from ._shared import Cmdlet, CmdletArg, parse_cmdlet_args, normalize_result_input, should_show_help, get_field
|
||||
from helper.folder_store import read_sidecar, find_sidecar
|
||||
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="add-relationship",
|
||||
summary="Associate file relationships (king/alt/related) in Hydrus based on relationship tags in sidecar.",
|
||||
usage="@1-3 | add-relationship -king @4 OR add-relationship -path <file> OR @1,@2,@3 | add-relationship",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg("path", type="string", description="Specify the local file path (if not piping a result)."),
|
||||
CmdletArg("-king", type="string", description="Explicitly set the king hash/file for relationships (e.g., -king @4 or -king hash)"),
|
||||
CmdletArg("-type", type="string", description="Relationship type for piped items (default: 'alt', options: 'king', 'alt', 'related')"),
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"- Mode 1: Pipe multiple items, first becomes king, rest become alts (default)",
|
||||
"- Mode 2: Use -king to explicitly set which item/hash is the king: @1-3 | add-relationship -king @4",
|
||||
"- Mode 3: Read relationships from sidecar (format: 'relationship: hash(king)<HASH>,hash(alt)<HASH>...')",
|
||||
@@ -108,13 +108,11 @@ def _resolve_king_reference(king_arg: str) -> Optional[str]:
|
||||
item = items[index]
|
||||
|
||||
# Try to extract hash from the item (could be dict or object)
|
||||
item_hash = None
|
||||
if isinstance(item, dict):
|
||||
# Dictionary: try common hash field names
|
||||
item_hash = item.get('hash_hex') or item.get('hash') or item.get('file_hash')
|
||||
else:
|
||||
# Object: use getattr
|
||||
item_hash = getattr(item, 'hash_hex', None) or getattr(item, 'hash', None)
|
||||
item_hash = (
|
||||
get_field(item, 'hash_hex')
|
||||
or get_field(item, 'hash')
|
||||
or get_field(item, 'file_hash')
|
||||
)
|
||||
|
||||
if item_hash:
|
||||
normalized = _normalise_hash_hex(item_hash)
|
||||
@@ -122,13 +120,11 @@ def _resolve_king_reference(king_arg: str) -> Optional[str]:
|
||||
return normalized
|
||||
|
||||
# If no hash, try to get file path (for local storage)
|
||||
file_path = None
|
||||
if isinstance(item, dict):
|
||||
# Dictionary: try common path field names
|
||||
file_path = item.get('file_path') or item.get('path') or item.get('target')
|
||||
else:
|
||||
# Object: use getattr
|
||||
file_path = getattr(item, 'file_path', None) or getattr(item, 'path', None) or getattr(item, 'target', None)
|
||||
file_path = (
|
||||
get_field(item, 'file_path')
|
||||
or get_field(item, 'path')
|
||||
or get_field(item, 'target')
|
||||
)
|
||||
|
||||
if file_path:
|
||||
return str(file_path)
|
||||
@@ -199,12 +195,9 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
Returns 0 on success, non-zero on failure.
|
||||
"""
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in _args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(_args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
|
||||
# Parse arguments using CMDLET spec
|
||||
parsed = parse_cmdlet_args(_args, CMDLET)
|
||||
@@ -235,7 +228,7 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
items_to_process = [{"file_path": arg_path}]
|
||||
|
||||
# Import local storage utilities
|
||||
from helper.local_library import LocalLibrarySearchOptimizer
|
||||
from helper.folder_store import LocalLibrarySearchOptimizer
|
||||
from config import get_local_storage_path
|
||||
|
||||
local_storage_path = get_local_storage_path(config) if config else None
|
||||
|
||||
567
cmdlets/add_tag.py
Normal file
567
cmdlets/add_tag.py
Normal file
@@ -0,0 +1,567 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, List, Sequence, Optional
|
||||
from pathlib import Path
|
||||
import sys
|
||||
|
||||
from helper.logger import log
|
||||
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from ._shared import normalize_result_input, filter_results_by_temp
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from helper.folder_store import write_sidecar, FolderDB
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, normalize_hash, parse_tag_arguments, expand_tag_groups, parse_cmdlet_args, collapse_namespace_tags, should_show_help, get_field
|
||||
from config import get_local_storage_path
|
||||
|
||||
|
||||
|
||||
class Add_Tag(Cmdlet):
|
||||
"""Class-based add-tag cmdlet with Cmdlet metadata inheritance."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
super().__init__(
|
||||
name="add-tag",
|
||||
summary="Add a tag to a Hydrus file or write it to a local .tags sidecar.",
|
||||
usage="add-tag [-hash <sha256>] [-store <backend>] [-duplicate <format>] [-list <list>[,<list>...]] [--all] <tag>[,<tag>...]",
|
||||
arg=[
|
||||
SharedArgs.HASH,
|
||||
SharedArgs.STORE,
|
||||
CmdletArg("-duplicate", type="string", description="Copy existing tag values to new namespaces. Formats: title:album,artist (explicit) or title,album,artist (inferred)"),
|
||||
CmdletArg("-list", type="string", description="Load predefined tag lists from adjective.json. Comma-separated list names (e.g., -list philosophy,occult)."),
|
||||
CmdletArg("--all", type="flag", description="Include temporary files in tagging (by default, only tags non-temporary files)."),
|
||||
CmdletArg("tags", type="string", required=False, description="One or more tags to add. Comma- or space-separated. Can also use {list_name} syntax. If omitted, uses tags from pipeline payload.", variadic=True),
|
||||
],
|
||||
detail=[
|
||||
"- By default, only tags non-temporary files (from pipelines). Use --all to tag everything.",
|
||||
"- Without -hash and when the selection is a local file, tags are written to <file>.tags.",
|
||||
"- With a Hydrus hash, tags are sent to the 'my tags' service.",
|
||||
"- Multiple tags can be comma-separated or space-separated.",
|
||||
"- Use -list to include predefined tag lists from adjective.json: -list philosophy,occult",
|
||||
"- Tags can also reference lists with curly braces: add-tag {philosophy} \"other:tag\"",
|
||||
"- Use -duplicate to copy EXISTING tag values to new namespaces:",
|
||||
" Explicit format: -duplicate title:album,artist (copies title: to album: and artist:)",
|
||||
" Inferred format: -duplicate title,album,artist (first is source, rest are targets)",
|
||||
"- The source namespace must already exist in the file being tagged.",
|
||||
"- Target namespaces that already have a value are skipped (not overwritten).",
|
||||
"- You can also pass the target hash as a tag token: hash:<sha256>. This overrides -hash and is removed from the tag list.",
|
||||
],
|
||||
exec=self.run,
|
||||
)
|
||||
self.register()
|
||||
|
||||
@staticmethod
|
||||
def _extract_title_tag(tags: List[str]) -> Optional[str]:
|
||||
"""Return the value of the first title: tag if present."""
|
||||
for tag in tags:
|
||||
if isinstance(tag, str) and tag.lower().startswith("title:"):
|
||||
value = tag.split(":", 1)[1].strip()
|
||||
if value:
|
||||
return value
|
||||
return None
|
||||
|
||||
@staticmethod
|
||||
def _apply_title_to_result(res: Any, title_value: Optional[str]) -> None:
|
||||
"""Update result object/dict title fields and columns in-place."""
|
||||
if not title_value:
|
||||
return
|
||||
if isinstance(res, models.PipeObject):
|
||||
res.title = title_value
|
||||
if hasattr(res, "columns") and isinstance(res.columns, list) and res.columns:
|
||||
label, *_ = res.columns[0]
|
||||
if str(label).lower() == "title":
|
||||
res.columns[0] = (res.columns[0][0], title_value)
|
||||
elif isinstance(res, dict):
|
||||
res["title"] = title_value
|
||||
cols = res.get("columns")
|
||||
if isinstance(cols, list):
|
||||
updated = []
|
||||
changed = False
|
||||
for col in cols:
|
||||
if isinstance(col, tuple) and len(col) == 2:
|
||||
label, val = col
|
||||
if str(label).lower() == "title":
|
||||
updated.append((label, title_value))
|
||||
changed = True
|
||||
else:
|
||||
updated.append(col)
|
||||
else:
|
||||
updated.append(col)
|
||||
if changed:
|
||||
res["columns"] = updated
|
||||
|
||||
@staticmethod
|
||||
def _matches_target(item: Any, hydrus_hash: Optional[str], file_hash: Optional[str], file_path: Optional[str]) -> bool:
|
||||
"""Determine whether a result item refers to the given hash/path target."""
|
||||
hydrus_hash_l = hydrus_hash.lower() if hydrus_hash else None
|
||||
file_hash_l = file_hash.lower() if file_hash else None
|
||||
file_path_l = file_path.lower() if file_path else None
|
||||
|
||||
def norm(val: Any) -> Optional[str]:
|
||||
return str(val).lower() if val is not None else None
|
||||
|
||||
hash_fields = ["hydrus_hash", "hash", "hash_hex", "file_hash"]
|
||||
path_fields = ["path", "file_path", "target"]
|
||||
|
||||
if isinstance(item, dict):
|
||||
hashes = [norm(item.get(field)) for field in hash_fields]
|
||||
paths = [norm(item.get(field)) for field in path_fields]
|
||||
else:
|
||||
hashes = [norm(get_field(item, field)) for field in hash_fields]
|
||||
paths = [norm(get_field(item, field)) for field in path_fields]
|
||||
|
||||
if hydrus_hash_l and hydrus_hash_l in hashes:
|
||||
return True
|
||||
if file_hash_l and file_hash_l in hashes:
|
||||
return True
|
||||
if file_path_l and file_path_l in paths:
|
||||
return True
|
||||
return False
|
||||
|
||||
@staticmethod
|
||||
def _update_item_title_fields(item: Any, new_title: str) -> None:
|
||||
"""Mutate an item to reflect a new title in plain fields and columns."""
|
||||
if isinstance(item, models.PipeObject):
|
||||
item.title = new_title
|
||||
if hasattr(item, "columns") and isinstance(item.columns, list) and item.columns:
|
||||
label, *_ = item.columns[0]
|
||||
if str(label).lower() == "title":
|
||||
item.columns[0] = (label, new_title)
|
||||
elif isinstance(item, dict):
|
||||
item["title"] = new_title
|
||||
cols = item.get("columns")
|
||||
if isinstance(cols, list):
|
||||
updated_cols = []
|
||||
changed = False
|
||||
for col in cols:
|
||||
if isinstance(col, tuple) and len(col) == 2:
|
||||
label, val = col
|
||||
if str(label).lower() == "title":
|
||||
updated_cols.append((label, new_title))
|
||||
changed = True
|
||||
else:
|
||||
updated_cols.append(col)
|
||||
else:
|
||||
updated_cols.append(col)
|
||||
if changed:
|
||||
item["columns"] = updated_cols
|
||||
|
||||
def _refresh_result_table_title(self, new_title: str, hydrus_hash: Optional[str], file_hash: Optional[str], file_path: Optional[str]) -> None:
|
||||
"""Refresh the cached result table with an updated title and redisplay it."""
|
||||
try:
|
||||
last_table = ctx.get_last_result_table()
|
||||
items = ctx.get_last_result_items()
|
||||
if not last_table or not items:
|
||||
return
|
||||
|
||||
updated_items = []
|
||||
match_found = False
|
||||
for item in items:
|
||||
try:
|
||||
if self._matches_target(item, hydrus_hash, file_hash, file_path):
|
||||
self._update_item_title_fields(item, new_title)
|
||||
match_found = True
|
||||
except Exception:
|
||||
pass
|
||||
updated_items.append(item)
|
||||
if not match_found:
|
||||
return
|
||||
|
||||
from result_table import ResultTable # Local import to avoid circular dependency
|
||||
|
||||
new_table = last_table.copy_with_title(getattr(last_table, "title", ""))
|
||||
|
||||
for item in updated_items:
|
||||
new_table.add_result(item)
|
||||
|
||||
ctx.set_last_result_table_overlay(new_table, updated_items)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
def _refresh_tags_view(self, res: Any, hydrus_hash: Optional[str], file_hash: Optional[str], file_path: Optional[str], config: Dict[str, Any]) -> None:
|
||||
"""Refresh tag display via get-tag. Prefer current subject; fall back to direct hash refresh."""
|
||||
try:
|
||||
from cmdlets import get_tag as get_tag_cmd # type: ignore
|
||||
except Exception:
|
||||
return
|
||||
|
||||
target_hash = hydrus_hash or file_hash
|
||||
refresh_args: List[str] = []
|
||||
if target_hash:
|
||||
refresh_args = ["-hash", target_hash, "-store", target_hash]
|
||||
|
||||
try:
|
||||
subject = ctx.get_last_result_subject()
|
||||
if subject and self._matches_target(subject, hydrus_hash, file_hash, file_path):
|
||||
get_tag_cmd._run(subject, refresh_args, config)
|
||||
return
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if target_hash:
|
||||
try:
|
||||
get_tag_cmd._run(res, refresh_args, config)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Add a tag to a file with smart filtering for pipeline results."""
|
||||
if should_show_help(args):
|
||||
log(f"Cmdlet: {self.name}\nSummary: {self.summary}\nUsage: {self.usage}")
|
||||
return 0
|
||||
|
||||
parsed = parse_cmdlet_args(args, self)
|
||||
|
||||
# Check for --all flag
|
||||
include_temp = parsed.get("all", False)
|
||||
|
||||
# Get explicit -hash and -store overrides from CLI
|
||||
hash_override = normalize_hash(parsed.get("hash"))
|
||||
store_override = parsed.get("store") or parsed.get("storage")
|
||||
|
||||
# Normalize input to list
|
||||
results = normalize_result_input(result)
|
||||
|
||||
# If no piped results but we have -hash flag, create a minimal synthetic result
|
||||
if not results and hash_override:
|
||||
results = [{"hash": hash_override, "is_temp": False}]
|
||||
if store_override:
|
||||
results[0]["store"] = store_override
|
||||
|
||||
# Filter by temp status (unless --all is set)
|
||||
if not include_temp:
|
||||
results = filter_results_by_temp(results, include_temp=False)
|
||||
|
||||
if not results:
|
||||
log("No valid files to tag (all results were temporary; use --all to include temporary files)", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Get tags from arguments (or fallback to pipeline payload)
|
||||
raw_tags = parsed.get("tags", [])
|
||||
if isinstance(raw_tags, str):
|
||||
raw_tags = [raw_tags]
|
||||
|
||||
# Fallback: if no tags provided explicitly, try to pull from first result payload
|
||||
if not raw_tags and results:
|
||||
first = results[0]
|
||||
payload_tags = None
|
||||
# Try multiple tag lookup strategies in order
|
||||
tag_lookups = [
|
||||
lambda x: x.extra.get("tags") if isinstance(x, models.PipeObject) and isinstance(x.extra, dict) else None,
|
||||
lambda x: x.get("tags") if isinstance(x, dict) else None,
|
||||
lambda x: x.get("extra", {}).get("tags") if isinstance(x, dict) and isinstance(x.get("extra"), dict) else None,
|
||||
lambda x: getattr(x, "tags", None),
|
||||
]
|
||||
for lookup in tag_lookups:
|
||||
try:
|
||||
payload_tags = lookup(first)
|
||||
if payload_tags:
|
||||
break
|
||||
except (AttributeError, TypeError, KeyError):
|
||||
continue
|
||||
if payload_tags:
|
||||
if isinstance(payload_tags, str):
|
||||
raw_tags = [payload_tags]
|
||||
elif isinstance(payload_tags, list):
|
||||
raw_tags = payload_tags
|
||||
|
||||
# Handle -list argument (convert to {list} syntax)
|
||||
list_arg = parsed.get("list")
|
||||
if list_arg:
|
||||
for l in list_arg.split(','):
|
||||
l = l.strip()
|
||||
if l:
|
||||
raw_tags.append(f"{{{l}}}")
|
||||
|
||||
# Parse and expand tags
|
||||
tags_to_add = parse_tag_arguments(raw_tags)
|
||||
tags_to_add = expand_tag_groups(tags_to_add)
|
||||
|
||||
# Allow hash override via namespaced token (e.g., "hash:abcdef...")
|
||||
extracted_hash = None
|
||||
filtered_tags: List[str] = []
|
||||
for tag in tags_to_add:
|
||||
if isinstance(tag, str) and tag.lower().startswith("hash:"):
|
||||
_, _, hash_val = tag.partition(":")
|
||||
if hash_val:
|
||||
extracted_hash = normalize_hash(hash_val.strip())
|
||||
continue
|
||||
filtered_tags.append(tag)
|
||||
tags_to_add = filtered_tags
|
||||
|
||||
if not tags_to_add:
|
||||
log("No tags provided to add", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
def _find_library_root(path_obj: Path) -> Optional[Path]:
|
||||
candidates = []
|
||||
cfg_root = get_local_storage_path(config) if config else None
|
||||
if cfg_root:
|
||||
try:
|
||||
candidates.append(Path(cfg_root).expanduser())
|
||||
except Exception:
|
||||
pass
|
||||
try:
|
||||
for candidate in candidates:
|
||||
if (candidate / "medios-macina.db").exists():
|
||||
return candidate
|
||||
for parent in [path_obj] + list(path_obj.parents):
|
||||
if (parent / "medios-macina.db").exists():
|
||||
return parent
|
||||
except Exception:
|
||||
pass
|
||||
return None
|
||||
|
||||
# Get other flags
|
||||
duplicate_arg = parsed.get("duplicate")
|
||||
|
||||
if not tags_to_add and not duplicate_arg:
|
||||
# Write sidecar files with the tags that are already in the result dicts
|
||||
sidecar_count = 0
|
||||
for res in results:
|
||||
# Handle both dict and PipeObject formats
|
||||
file_path = None
|
||||
tags = []
|
||||
file_hash = ""
|
||||
# Use canonical field access with get_field for both dict and objects
|
||||
file_path = get_field(res, "path")
|
||||
# Try tags from top-level 'tags' or from 'extra.tags'
|
||||
tags = get_field(res, "tags") or (get_field(res, "extra") or {}).get("tags", [])
|
||||
file_hash = get_field(res, "hash") or get_field(res, "file_hash") or get_field(res, "hash_hex") or ""
|
||||
if not file_path:
|
||||
log(f"[add_tag] Warning: Result has no path, skipping", file=sys.stderr)
|
||||
ctx.emit(res)
|
||||
continue
|
||||
if tags:
|
||||
# Write sidecar file for this file with its tags
|
||||
try:
|
||||
sidecar_path = write_sidecar(Path(file_path), tags, [], file_hash)
|
||||
log(f"[add_tag] Wrote {len(tags)} tag(s) to sidecar: {sidecar_path}", file=sys.stderr)
|
||||
sidecar_count += 1
|
||||
except Exception as e:
|
||||
log(f"[add_tag] Warning: Failed to write sidecar for {file_path}: {e}", file=sys.stderr)
|
||||
ctx.emit(res)
|
||||
if sidecar_count > 0:
|
||||
log(f"[add_tag] Wrote {sidecar_count} sidecar file(s) with embedded tags", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tag] No tags to write - passed {len(results)} result(s) through unchanged", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
# Main loop: process results with tags to add
|
||||
total_new_tags = 0
|
||||
total_modified = 0
|
||||
for res in results:
|
||||
# Extract file info from result
|
||||
file_path = None
|
||||
existing_tags = []
|
||||
file_hash = ""
|
||||
storage_source = None
|
||||
|
||||
# Use canonical getters for fields from both dicts and PipeObject
|
||||
file_path = get_field(res, "path")
|
||||
existing_tags = get_field(res, "tags") or []
|
||||
if not existing_tags:
|
||||
existing_tags = (get_field(res, "extra", {}) or {}).get("tags") or []
|
||||
file_hash = get_field(res, "hash") or get_field(res, "file_hash") or get_field(res, "hash_hex") or ""
|
||||
storage_source = get_field(res, "store") or get_field(res, "storage") or get_field(res, "storage_source") or get_field(res, "origin")
|
||||
hydrus_hash = get_field(res, "hydrus_hash") or file_hash
|
||||
|
||||
# Infer storage source from result if not found
|
||||
if not storage_source:
|
||||
if file_path:
|
||||
storage_source = 'local'
|
||||
elif file_hash and file_hash != "unknown":
|
||||
storage_source = 'hydrus'
|
||||
|
||||
original_tags_lower = {str(t).lower() for t in existing_tags if isinstance(t, str)}
|
||||
original_title = self._extract_title_tag(list(existing_tags))
|
||||
|
||||
# Apply CLI overrides if provided
|
||||
if hash_override and not file_hash:
|
||||
file_hash = hash_override
|
||||
if store_override and not storage_source:
|
||||
storage_source = store_override
|
||||
|
||||
# Check if we have sufficient identifier (file_path OR file_hash)
|
||||
if not file_path and not file_hash:
|
||||
log(f"[add_tag] Warning: Result has neither path nor hash available, skipping", file=sys.stderr)
|
||||
ctx.emit(res)
|
||||
continue
|
||||
# Handle -duplicate logic (copy existing tags to new namespaces)
|
||||
if duplicate_arg:
|
||||
# Parse duplicate format: source:target1,target2 or source,target1,target2
|
||||
parts = duplicate_arg.split(':')
|
||||
source_ns = ""
|
||||
targets = []
|
||||
if len(parts) > 1:
|
||||
# Explicit format: source:target1,target2
|
||||
source_ns = parts[0]
|
||||
targets = parts[1].split(',')
|
||||
else:
|
||||
# Inferred format: source,target1,target2
|
||||
parts = duplicate_arg.split(',')
|
||||
if len(parts) > 1:
|
||||
source_ns = parts[0]
|
||||
targets = parts[1:]
|
||||
if source_ns and targets:
|
||||
# Find tags in source namespace
|
||||
source_tags = [t for t in existing_tags if t.startswith(source_ns + ':')]
|
||||
for t in source_tags:
|
||||
value = t.split(':', 1)[1]
|
||||
for target_ns in targets:
|
||||
new_tag = f"{target_ns}:{value}"
|
||||
if new_tag not in existing_tags and new_tag not in tags_to_add:
|
||||
tags_to_add.append(new_tag)
|
||||
|
||||
# Initialize tag mutation tracking local variables
|
||||
removed_tags = []
|
||||
new_tags_added = []
|
||||
final_tags = list(existing_tags) if existing_tags else []
|
||||
|
||||
# Determine where to add tags: Hydrus or Folder storage
|
||||
if storage_source and storage_source.lower() == 'hydrus':
|
||||
# Add tags to Hydrus using the API
|
||||
target_hash = file_hash
|
||||
if target_hash:
|
||||
try:
|
||||
hydrus_client = hydrus_wrapper.get_client(config)
|
||||
service_name = hydrus_wrapper.get_tag_service_name(config)
|
||||
|
||||
# For namespaced tags, remove old tags in same namespace
|
||||
removed_tags = []
|
||||
for new_tag in tags_to_add:
|
||||
if ':' in new_tag:
|
||||
namespace = new_tag.split(':', 1)[0]
|
||||
to_remove = [t for t in existing_tags if t.startswith(namespace + ':') and t.lower() != new_tag.lower()]
|
||||
removed_tags.extend(to_remove)
|
||||
|
||||
# Add new tags
|
||||
if tags_to_add:
|
||||
log(f"[add_tag] Adding {len(tags_to_add)} tag(s) to Hydrus file: {target_hash}", file=sys.stderr)
|
||||
hydrus_client.add_tags(target_hash, tags_to_add, service_name)
|
||||
|
||||
# Delete replaced namespace tags
|
||||
if removed_tags:
|
||||
unique_removed = sorted(set(removed_tags))
|
||||
hydrus_client.delete_tags(target_hash, unique_removed, service_name)
|
||||
|
||||
if tags_to_add or removed_tags:
|
||||
total_new_tags += len(tags_to_add)
|
||||
total_modified += 1
|
||||
log(f"[add_tag] ✓ Added {len(tags_to_add)} tag(s) to Hydrus", file=sys.stderr)
|
||||
# Refresh final tag list from the backend for accurate display
|
||||
try:
|
||||
from helper.store import FileStorage
|
||||
storage = FileStorage(config)
|
||||
if storage and storage_source in storage.list_backends():
|
||||
backend = storage[storage_source]
|
||||
refreshed_tags, _ = backend.get_tag(target_hash)
|
||||
if refreshed_tags is not None:
|
||||
final_tags = refreshed_tags
|
||||
new_tags_added = [t for t in refreshed_tags if t.lower() not in original_tags_lower]
|
||||
# Update result tags for downstream cmdlets/UI
|
||||
if isinstance(res, models.PipeObject):
|
||||
res.tags = refreshed_tags
|
||||
if isinstance(res.extra, dict):
|
||||
res.extra['tags'] = refreshed_tags
|
||||
elif isinstance(res, dict):
|
||||
res['tags'] = refreshed_tags
|
||||
except Exception:
|
||||
# Ignore failures - this is best-effort for refreshing tag state
|
||||
pass
|
||||
except Exception as e:
|
||||
log(f"[add_tag] Warning: Failed to add tags to Hydrus: {e}", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tag] Warning: No hash available for Hydrus file, skipping", file=sys.stderr)
|
||||
elif storage_source:
|
||||
# For any Folder-based storage (local, test, default, etc.), delegate to backend
|
||||
# If storage_source is not a registered backend, fallback to writing a sidecar
|
||||
from helper.store import FileStorage
|
||||
storage = FileStorage(config)
|
||||
try:
|
||||
if storage and storage_source in storage.list_backends():
|
||||
backend = storage[storage_source]
|
||||
if file_hash and backend.add_tag(file_hash, tags_to_add):
|
||||
# Refresh tags from backend to get merged result
|
||||
refreshed_tags, _ = backend.get_tag(file_hash)
|
||||
if refreshed_tags:
|
||||
# Update result tags
|
||||
if isinstance(res, models.PipeObject):
|
||||
res.tags = refreshed_tags
|
||||
# Also keep as extra for compatibility
|
||||
if isinstance(res.extra, dict):
|
||||
res.extra['tags'] = refreshed_tags
|
||||
elif isinstance(res, dict):
|
||||
res['tags'] = refreshed_tags
|
||||
|
||||
# Update title if changed
|
||||
title_value = self._extract_title_tag(refreshed_tags)
|
||||
self._apply_title_to_result(res, title_value)
|
||||
|
||||
# Compute stats
|
||||
new_tags_added = [t for t in refreshed_tags if t.lower() not in original_tags_lower]
|
||||
total_new_tags += len(new_tags_added)
|
||||
if new_tags_added:
|
||||
total_modified += 1
|
||||
|
||||
log(f"[add_tag] Added {len(new_tags_added)} new tag(s); {len(refreshed_tags)} total tag(s) stored in {storage_source}", file=sys.stderr)
|
||||
final_tags = refreshed_tags
|
||||
else:
|
||||
log(f"[add_tag] Warning: Failed to add tags to {storage_source}", file=sys.stderr)
|
||||
else:
|
||||
# Not a registered backend - fallback to sidecar if we have a path
|
||||
if file_path:
|
||||
try:
|
||||
sidecar_path = write_sidecar(Path(file_path), tags_to_add, [], file_hash)
|
||||
log(f"[add_tag] Wrote {len(tags_to_add)} tag(s) to sidecar: {sidecar_path}", file=sys.stderr)
|
||||
total_new_tags += len(tags_to_add)
|
||||
total_modified += 1
|
||||
# Update res tags
|
||||
if isinstance(res, models.PipeObject):
|
||||
res.tags = (res.tags or []) + tags_to_add
|
||||
if isinstance(res.extra, dict):
|
||||
res.extra['tags'] = res.tags
|
||||
elif isinstance(res, dict):
|
||||
res['tags'] = list(set((res.get('tags') or []) + tags_to_add))
|
||||
except Exception as exc:
|
||||
log(f"[add_tag] Warning: Failed to write sidecar for {file_path}: {exc}", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tag] Warning: Storage backend '{storage_source}' not found in config", file=sys.stderr)
|
||||
except KeyError:
|
||||
# storage[storage_source] raised KeyError - treat as absent backend
|
||||
if file_path:
|
||||
try:
|
||||
sidecar_path = write_sidecar(Path(file_path), tags_to_add, [], file_hash)
|
||||
log(f"[add_tag] Wrote {len(tags_to_add)} tag(s) to sidecar: {sidecar_path}", file=sys.stderr)
|
||||
total_new_tags += len(tags_to_add)
|
||||
total_modified += 1
|
||||
# Update res tags for downstream
|
||||
if isinstance(res, models.PipeObject):
|
||||
res.tags = (res.tags or []) + tags_to_add
|
||||
if isinstance(res.extra, dict):
|
||||
res.extra['tags'] = res.tags
|
||||
elif isinstance(res, dict):
|
||||
res['tags'] = list(set((res.get('tags') or []) + tags_to_add))
|
||||
except Exception as exc:
|
||||
log(f"[add_tag] Warning: Failed to write sidecar for {file_path}: {exc}", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tag] Warning: Storage backend '{storage_source}' not found in config", file=sys.stderr)
|
||||
else:
|
||||
# For other storage types or unknown sources, avoid writing sidecars to reduce clutter
|
||||
# (local/hydrus are handled above).
|
||||
ctx.emit(res)
|
||||
continue
|
||||
# If title changed, refresh the cached result table so the display reflects the new name
|
||||
final_title = self._extract_title_tag(final_tags)
|
||||
if final_title and (not original_title or final_title.lower() != original_title.lower()):
|
||||
self._refresh_result_table_title(final_title, hydrus_hash or file_hash, file_hash, file_path)
|
||||
# If tags changed, refresh tag view via get-tag (prefer current subject; fall back to hash refresh)
|
||||
if new_tags_added or removed_tags:
|
||||
self._refresh_tags_view(res, hydrus_hash, file_hash, file_path, config)
|
||||
# Emit the modified result
|
||||
ctx.emit(res)
|
||||
log(f"[add_tag] Added {total_new_tags} new tag(s) across {len(results)} item(s); modified {total_modified} item(s)", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
|
||||
CMDLET = Add_Tag()
|
||||
@@ -1,20 +1,18 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, List, Sequence, Optional
|
||||
import json
|
||||
from pathlib import Path
|
||||
import sys
|
||||
|
||||
from helper.logger import log
|
||||
|
||||
from . import register
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from ._shared import normalize_result_input, filter_results_by_temp
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from helper.local_library import read_sidecar, write_sidecar, find_sidecar, has_sidecar, LocalLibraryDB
|
||||
from helper.folder_store import read_sidecar, write_sidecar, find_sidecar, has_sidecar, FolderDB
|
||||
from metadata import rename
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash, parse_tag_arguments, expand_tag_groups, parse_cmdlet_args, collapse_namespace_tags
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, normalize_hash, parse_tag_arguments, expand_tag_groups, parse_cmdlet_args, collapse_namespace_tags, should_show_help, get_field
|
||||
from config import get_local_storage_path
|
||||
|
||||
|
||||
@@ -68,29 +66,16 @@ def _matches_target(item: Any, hydrus_hash: Optional[str], file_hash: Optional[s
|
||||
def norm(val: Any) -> Optional[str]:
|
||||
return str(val).lower() if val is not None else None
|
||||
|
||||
# Define field names to check for hashes and paths
|
||||
hash_fields = ["hydrus_hash", "hash", "hash_hex", "file_hash"]
|
||||
path_fields = ["path", "file_path", "target"]
|
||||
|
||||
if isinstance(item, dict):
|
||||
hashes = [
|
||||
norm(item.get("hydrus_hash")),
|
||||
norm(item.get("hash")),
|
||||
norm(item.get("hash_hex")),
|
||||
norm(item.get("file_hash")),
|
||||
]
|
||||
paths = [
|
||||
norm(item.get("path")),
|
||||
norm(item.get("file_path")),
|
||||
norm(item.get("target")),
|
||||
]
|
||||
hashes = [norm(item.get(field)) for field in hash_fields]
|
||||
paths = [norm(item.get(field)) for field in path_fields]
|
||||
else:
|
||||
hashes = [
|
||||
norm(getattr(item, "hydrus_hash", None)),
|
||||
norm(getattr(item, "hash_hex", None)),
|
||||
norm(getattr(item, "file_hash", None)),
|
||||
]
|
||||
paths = [
|
||||
norm(getattr(item, "path", None)),
|
||||
norm(getattr(item, "file_path", None)),
|
||||
norm(getattr(item, "target", None)),
|
||||
]
|
||||
hashes = [norm(get_field(item, field)) for field in hash_fields]
|
||||
paths = [norm(get_field(item, field)) for field in path_fields]
|
||||
|
||||
if hydrus_hash_l and hydrus_hash_l in hashes:
|
||||
return True
|
||||
@@ -147,20 +132,18 @@ def _refresh_result_table_title(new_title: str, hydrus_hash: Optional[str], file
|
||||
except Exception:
|
||||
pass
|
||||
updated_items.append(item)
|
||||
|
||||
if not match_found:
|
||||
return
|
||||
|
||||
from result_table import ResultTable # Local import to avoid circular dependency
|
||||
|
||||
new_table = ResultTable(getattr(last_table, "title", ""), title_width=getattr(last_table, "title_width", 80), max_columns=getattr(last_table, "max_columns", None))
|
||||
if getattr(last_table, "source_command", None):
|
||||
new_table.set_source_command(last_table.source_command, getattr(last_table, "source_args", []))
|
||||
new_table = last_table.copy_with_title(getattr(last_table, "title", ""))
|
||||
|
||||
for item in updated_items:
|
||||
new_table.add_result(item)
|
||||
|
||||
ctx.set_last_result_table_preserve_history(new_table, updated_items)
|
||||
# Keep the underlying history intact; update only the overlay so @.. can
|
||||
# clear the overlay then continue back to prior tables (e.g., the search list).
|
||||
ctx.set_last_result_table_overlay(new_table, updated_items)
|
||||
except Exception:
|
||||
pass
|
||||
@@ -194,347 +177,409 @@ def _refresh_tags_view(res: Any, hydrus_hash: Optional[str], file_hash: Optional
|
||||
|
||||
|
||||
|
||||
class Add_Tag(Cmdlet):
|
||||
"""Class-based add-tags cmdlet with Cmdlet metadata inheritance."""
|
||||
|
||||
@register(["add-tag", "add-tags"])
|
||||
def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Add tags to a file with smart filtering for pipeline results."""
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
def __init__(self) -> None:
|
||||
super().__init__(
|
||||
name="add-tags",
|
||||
summary="Add tags to a Hydrus file or write them to a local .tags sidecar.",
|
||||
usage="add-tags [-hash <sha256>] [-duplicate <format>] [-list <list>[,<list>...]] [--all] <tag>[,<tag>...]",
|
||||
arg=[
|
||||
SharedArgs.HASH,
|
||||
CmdletArg("-duplicate", type="string", description="Copy existing tag values to new namespaces. Formats: title:album,artist (explicit) or title,album,artist (inferred)"),
|
||||
CmdletArg("-list", type="string", description="Load predefined tag lists from adjective.json. Comma-separated list names (e.g., -list philosophy,occult)."),
|
||||
CmdletArg("--all", type="flag", description="Include temporary files in tagging (by default, only tags non-temporary files)."),
|
||||
CmdletArg("tags", type="string", required=False, description="One or more tags to add. Comma- or space-separated. Can also use {list_name} syntax. If omitted, uses tags from pipeline payload.", variadic=True),
|
||||
],
|
||||
detail=[
|
||||
"- By default, only tags non-temporary files (from pipelines). Use --all to tag everything.",
|
||||
"- Without -hash and when the selection is a local file, tags are written to <file>.tags.",
|
||||
"- With a Hydrus hash, tags are sent to the 'my tags' service.",
|
||||
"- Multiple tags can be comma-separated or space-separated.",
|
||||
"- Use -list to include predefined tag lists from adjective.json: -list philosophy,occult",
|
||||
"- Tags can also reference lists with curly braces: add-tag {philosophy} \"other:tag\"",
|
||||
"- Use -duplicate to copy EXISTING tag values to new namespaces:",
|
||||
" Explicit format: -duplicate title:album,artist (copies title: to album: and artist:)",
|
||||
" Inferred format: -duplicate title,album,artist (first is source, rest are targets)",
|
||||
"- The source namespace must already exist in the file being tagged.",
|
||||
"- Target namespaces that already have a value are skipped (not overwritten).",
|
||||
"- You can also pass the target hash as a tag token: hash:<sha256>. This overrides -hash and is removed from the tag list.",
|
||||
],
|
||||
exec=self.run,
|
||||
)
|
||||
self.register()
|
||||
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Add tags to a file with smart filtering for pipeline results."""
|
||||
if should_show_help(args):
|
||||
log(f"Cmdlet: {self.name}\nSummary: {self.summary}\nUsage: {self.usage}")
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Parse arguments
|
||||
parsed = parse_cmdlet_args(args, CMDLET)
|
||||
|
||||
# Check for --all flag
|
||||
include_temp = parsed.get("all", False)
|
||||
|
||||
# Normalize input to list
|
||||
results = normalize_result_input(result)
|
||||
|
||||
# Filter by temp status (unless --all is set)
|
||||
if not include_temp:
|
||||
results = filter_results_by_temp(results, include_temp=False)
|
||||
|
||||
if not results:
|
||||
log("No valid files to tag (all results were temporary; use --all to include temporary files)", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Get tags from arguments (or fallback to pipeline payload)
|
||||
raw_tags = parsed.get("tags", [])
|
||||
if isinstance(raw_tags, str):
|
||||
raw_tags = [raw_tags]
|
||||
|
||||
# Fallback: if no tags provided explicitly, try to pull from first result payload
|
||||
if not raw_tags and results:
|
||||
first = results[0]
|
||||
payload_tags = None
|
||||
if isinstance(first, models.PipeObject):
|
||||
payload_tags = first.extra.get("tags") if isinstance(first.extra, dict) else None
|
||||
elif isinstance(first, dict):
|
||||
payload_tags = first.get("tags")
|
||||
if not payload_tags:
|
||||
payload_tags = first.get("extra", {}).get("tags") if isinstance(first.get("extra"), dict) else None
|
||||
# If metadata payload stored tags under nested list, accept directly
|
||||
if payload_tags is None:
|
||||
payload_tags = getattr(first, "tags", None)
|
||||
if payload_tags:
|
||||
if isinstance(payload_tags, str):
|
||||
raw_tags = [payload_tags]
|
||||
elif isinstance(payload_tags, list):
|
||||
raw_tags = payload_tags
|
||||
|
||||
# Handle -list argument (convert to {list} syntax)
|
||||
list_arg = parsed.get("list")
|
||||
if list_arg:
|
||||
for l in list_arg.split(','):
|
||||
l = l.strip()
|
||||
if l:
|
||||
raw_tags.append(f"{{{l}}}")
|
||||
|
||||
# Parse and expand tags
|
||||
tags_to_add = parse_tag_arguments(raw_tags)
|
||||
tags_to_add = expand_tag_groups(tags_to_add)
|
||||
|
||||
if not tags_to_add:
|
||||
log("No tags provided to add", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Get other flags
|
||||
hash_override = normalize_hash(parsed.get("hash"))
|
||||
duplicate_arg = parsed.get("duplicate")
|
||||
|
||||
# If no tags provided (and no list), write sidecar files with embedded tags
|
||||
# Note: Since 'tags' is required=True in CMDLET, this block might be unreachable via CLI
|
||||
# unless called programmatically or if required check is bypassed.
|
||||
if not tags_to_add and not duplicate_arg:
|
||||
# Write sidecar files with the tags that are already in the result dicts
|
||||
|
||||
# Parse arguments
|
||||
parsed = parse_cmdlet_args(args, self)
|
||||
|
||||
# Check for --all flag
|
||||
include_temp = parsed.get("all", False)
|
||||
|
||||
# Normalize input to list
|
||||
results = normalize_result_input(result)
|
||||
|
||||
# Filter by temp status (unless --all is set)
|
||||
if not include_temp:
|
||||
results = filter_results_by_temp(results, include_temp=False)
|
||||
|
||||
if not results:
|
||||
log("No valid files to tag (all results were temporary; use --all to include temporary files)", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Get tags from arguments (or fallback to pipeline payload)
|
||||
raw_tags = parsed.get("tags", [])
|
||||
if isinstance(raw_tags, str):
|
||||
raw_tags = [raw_tags]
|
||||
|
||||
# Fallback: if no tags provided explicitly, try to pull from first result payload
|
||||
if not raw_tags and results:
|
||||
first = results[0]
|
||||
payload_tags = None
|
||||
|
||||
# Try multiple tag lookup strategies in order
|
||||
tag_lookups = [
|
||||
lambda x: x.extra.get("tags") if isinstance(x, models.PipeObject) and isinstance(x.extra, dict) else None,
|
||||
lambda x: x.get("tags") if isinstance(x, dict) else None,
|
||||
lambda x: x.get("extra", {}).get("tags") if isinstance(x, dict) and isinstance(x.get("extra"), dict) else None,
|
||||
lambda x: getattr(x, "tags", None),
|
||||
]
|
||||
|
||||
for lookup in tag_lookups:
|
||||
try:
|
||||
payload_tags = lookup(first)
|
||||
if payload_tags:
|
||||
break
|
||||
except (AttributeError, TypeError, KeyError):
|
||||
continue
|
||||
|
||||
if payload_tags:
|
||||
if isinstance(payload_tags, str):
|
||||
raw_tags = [payload_tags]
|
||||
elif isinstance(payload_tags, list):
|
||||
raw_tags = payload_tags
|
||||
|
||||
# Handle -list argument (convert to {list} syntax)
|
||||
list_arg = parsed.get("list")
|
||||
if list_arg:
|
||||
for l in list_arg.split(','):
|
||||
l = l.strip()
|
||||
if l:
|
||||
raw_tags.append(f"{{{l}}}")
|
||||
|
||||
# Parse and expand tags
|
||||
tags_to_add = parse_tag_arguments(raw_tags)
|
||||
tags_to_add = expand_tag_groups(tags_to_add)
|
||||
|
||||
# Allow hash override via namespaced token (e.g., "hash:abcdef...")
|
||||
extracted_hash = None
|
||||
filtered_tags: List[str] = []
|
||||
for tag in tags_to_add:
|
||||
if isinstance(tag, str) and tag.lower().startswith("hash:"):
|
||||
_, _, hash_val = tag.partition(":")
|
||||
if hash_val:
|
||||
extracted_hash = normalize_hash(hash_val.strip())
|
||||
continue
|
||||
filtered_tags.append(tag)
|
||||
tags_to_add = filtered_tags
|
||||
|
||||
if not tags_to_add:
|
||||
log("No tags provided to add", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Get other flags (hash override can come from -hash or hash: token)
|
||||
hash_override = normalize_hash(parsed.get("hash")) or extracted_hash
|
||||
duplicate_arg = parsed.get("duplicate")
|
||||
|
||||
# If no tags provided (and no list), write sidecar files with embedded tags
|
||||
# Note: Since 'tags' is required=False in the cmdlet arg, this block can be reached via CLI
|
||||
# when no tag arguments are provided.
|
||||
if not tags_to_add and not duplicate_arg:
|
||||
# Write sidecar files with the tags that are already in the result dicts
|
||||
sidecar_count = 0
|
||||
for res in results:
|
||||
# Handle both dict and PipeObject formats
|
||||
file_path = None
|
||||
tags = []
|
||||
file_hash = ""
|
||||
|
||||
if isinstance(res, models.PipeObject):
|
||||
file_path = res.file_path
|
||||
tags = res.extra.get('tags', [])
|
||||
file_hash = res.hash or ""
|
||||
elif isinstance(res, dict):
|
||||
file_path = res.get('file_path')
|
||||
# Try multiple tag locations in order
|
||||
tag_sources = [lambda: res.get('tags', []), lambda: res.get('extra', {}).get('tags', [])]
|
||||
for source in tag_sources:
|
||||
tags = source()
|
||||
if tags:
|
||||
break
|
||||
file_hash = res.get('hash', "")
|
||||
|
||||
if not file_path:
|
||||
log(f"[add_tags] Warning: Result has no file_path, skipping", file=sys.stderr)
|
||||
ctx.emit(res)
|
||||
continue
|
||||
|
||||
if tags:
|
||||
# Write sidecar file for this file with its tags
|
||||
try:
|
||||
sidecar_path = write_sidecar(Path(file_path), tags, [], file_hash)
|
||||
log(f"[add_tags] Wrote {len(tags)} tag(s) to sidecar: {sidecar_path}", file=sys.stderr)
|
||||
sidecar_count += 1
|
||||
except Exception as e:
|
||||
log(f"[add_tags] Warning: Failed to write sidecar for {file_path}: {e}", file=sys.stderr)
|
||||
|
||||
ctx.emit(res)
|
||||
|
||||
if sidecar_count > 0:
|
||||
log(f"[add_tags] Wrote {sidecar_count} sidecar file(s) with embedded tags", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tags] No tags to write - passed {len(results)} result(s) through unchanged", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
# Tags ARE provided - append them to each result and write sidecar files or add to Hydrus
|
||||
sidecar_count = 0
|
||||
total_new_tags = 0
|
||||
total_modified = 0
|
||||
for res in results:
|
||||
# Handle both dict and PipeObject formats
|
||||
file_path = None
|
||||
tags = []
|
||||
existing_tags = []
|
||||
file_hash = ""
|
||||
|
||||
storage_source = None
|
||||
hydrus_hash = None
|
||||
|
||||
# Define field name aliases to check
|
||||
path_field_names = ['file_path', 'path']
|
||||
source_field_names = ['storage_source', 'source', 'origin']
|
||||
hash_field_names = ['hydrus_hash', 'hash', 'hash_hex']
|
||||
|
||||
if isinstance(res, models.PipeObject):
|
||||
file_path = res.file_path
|
||||
tags = res.extra.get('tags', [])
|
||||
existing_tags = res.extra.get('tags', [])
|
||||
file_hash = res.file_hash or ""
|
||||
for field in source_field_names:
|
||||
storage_source = res.extra.get(field)
|
||||
if storage_source:
|
||||
break
|
||||
hydrus_hash = res.extra.get('hydrus_hash')
|
||||
elif isinstance(res, dict):
|
||||
file_path = res.get('file_path')
|
||||
tags = res.get('tags', []) # Check both tags and extra['tags']
|
||||
if not tags and 'extra' in res:
|
||||
tags = res['extra'].get('tags', [])
|
||||
# Try path field names in order
|
||||
for field in path_field_names:
|
||||
file_path = res.get(field)
|
||||
if file_path:
|
||||
break
|
||||
|
||||
# Try tag locations in order
|
||||
tag_sources = [lambda: res.get('tags', []), lambda: res.get('extra', {}).get('tags', [])]
|
||||
for source in tag_sources:
|
||||
existing_tags = source()
|
||||
if existing_tags:
|
||||
break
|
||||
|
||||
file_hash = res.get('file_hash', "")
|
||||
|
||||
if not file_path:
|
||||
log(f"[add_tags] Warning: Result has no file_path, skipping", file=sys.stderr)
|
||||
|
||||
# Try source field names in order (top-level then extra)
|
||||
for field in source_field_names:
|
||||
storage_source = res.get(field)
|
||||
if storage_source:
|
||||
break
|
||||
if not storage_source and 'extra' in res:
|
||||
for field in source_field_names:
|
||||
storage_source = res.get('extra', {}).get(field)
|
||||
if storage_source:
|
||||
break
|
||||
|
||||
# Try hash field names in order (top-level then extra)
|
||||
for field in hash_field_names:
|
||||
hydrus_hash = res.get(field)
|
||||
if hydrus_hash:
|
||||
break
|
||||
if not hydrus_hash and 'extra' in res:
|
||||
for field in hash_field_names:
|
||||
hydrus_hash = res.get('extra', {}).get(field)
|
||||
if hydrus_hash:
|
||||
break
|
||||
|
||||
if not hydrus_hash and file_hash:
|
||||
hydrus_hash = file_hash
|
||||
if not storage_source and hydrus_hash and not file_path:
|
||||
storage_source = 'hydrus'
|
||||
# If we have a file path but no storage source, assume local to avoid sidecar spam
|
||||
if not storage_source and file_path:
|
||||
storage_source = 'local'
|
||||
else:
|
||||
ctx.emit(res)
|
||||
continue
|
||||
|
||||
if tags:
|
||||
# Write sidecar file for this file with its tags
|
||||
try:
|
||||
sidecar_path = write_sidecar(Path(file_path), tags, [], file_hash)
|
||||
log(f"[add_tags] Wrote {len(tags)} tag(s) to sidecar: {sidecar_path}", file=sys.stderr)
|
||||
sidecar_count += 1
|
||||
except Exception as e:
|
||||
log(f"[add_tags] Warning: Failed to write sidecar for {file_path}: {e}", file=sys.stderr)
|
||||
|
||||
ctx.emit(res)
|
||||
|
||||
if sidecar_count > 0:
|
||||
log(f"[add_tags] Wrote {sidecar_count} sidecar file(s) with embedded tags", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tags] No tags to write - passed {len(results)} result(s) through unchanged", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
# Tags ARE provided - append them to each result and write sidecar files or add to Hydrus
|
||||
sidecar_count = 0
|
||||
total_new_tags = 0
|
||||
total_modified = 0
|
||||
for res in results:
|
||||
# Handle both dict and PipeObject formats
|
||||
file_path = None
|
||||
existing_tags = []
|
||||
file_hash = ""
|
||||
storage_source = None
|
||||
hydrus_hash = None
|
||||
|
||||
if isinstance(res, models.PipeObject):
|
||||
file_path = res.file_path
|
||||
existing_tags = res.extra.get('tags', [])
|
||||
file_hash = res.file_hash or ""
|
||||
storage_source = res.extra.get('storage_source') or res.extra.get('source')
|
||||
hydrus_hash = res.extra.get('hydrus_hash')
|
||||
elif isinstance(res, dict):
|
||||
file_path = res.get('file_path') or res.get('path')
|
||||
existing_tags = res.get('tags', [])
|
||||
if not existing_tags and 'extra' in res:
|
||||
existing_tags = res['extra'].get('tags', [])
|
||||
file_hash = res.get('file_hash', "")
|
||||
storage_source = res.get('storage_source') or res.get('source') or res.get('origin')
|
||||
if not storage_source and 'extra' in res:
|
||||
storage_source = res['extra'].get('storage_source') or res['extra'].get('source')
|
||||
# For Hydrus results from search-file, look for hash, hash_hex, or target (all contain the hash)
|
||||
hydrus_hash = res.get('hydrus_hash') or res.get('hash') or res.get('hash_hex')
|
||||
if not hydrus_hash and 'extra' in res:
|
||||
hydrus_hash = res['extra'].get('hydrus_hash') or res['extra'].get('hash') or res['extra'].get('hash_hex')
|
||||
if not hydrus_hash and file_hash:
|
||||
hydrus_hash = file_hash
|
||||
if not storage_source and hydrus_hash and not file_path:
|
||||
storage_source = 'hydrus'
|
||||
# If we have a file path but no storage source, assume local to avoid sidecar spam
|
||||
if not storage_source and file_path:
|
||||
storage_source = 'local'
|
||||
else:
|
||||
ctx.emit(res)
|
||||
continue
|
||||
|
||||
original_tags_lower = {str(t).lower() for t in existing_tags if isinstance(t, str)}
|
||||
original_tags_snapshot = list(existing_tags)
|
||||
original_title = _extract_title_tag(original_tags_snapshot)
|
||||
removed_tags: List[str] = []
|
||||
|
||||
# Apply hash override if provided
|
||||
if hash_override:
|
||||
hydrus_hash = hash_override
|
||||
# If we have a hash override, we treat it as a Hydrus target
|
||||
storage_source = "hydrus"
|
||||
|
||||
if not file_path and not hydrus_hash:
|
||||
log(f"[add_tags] Warning: Result has neither file_path nor hash available, skipping", file=sys.stderr)
|
||||
ctx.emit(res)
|
||||
continue
|
||||
|
||||
# Handle -duplicate logic (copy existing tags to new namespaces)
|
||||
if duplicate_arg:
|
||||
# Parse duplicate format: source:target1,target2 or source,target1,target2
|
||||
parts = duplicate_arg.split(':')
|
||||
source_ns = ""
|
||||
targets = []
|
||||
|
||||
if len(parts) > 1:
|
||||
# Explicit format: source:target1,target2
|
||||
source_ns = parts[0]
|
||||
targets = parts[1].split(',')
|
||||
else:
|
||||
# Inferred format: source,target1,target2
|
||||
parts = duplicate_arg.split(',')
|
||||
original_tags_lower = {str(t).lower() for t in existing_tags if isinstance(t, str)}
|
||||
original_tags_snapshot = list(existing_tags)
|
||||
original_title = _extract_title_tag(original_tags_snapshot)
|
||||
removed_tags: List[str] = []
|
||||
|
||||
# Apply hash override if provided
|
||||
if hash_override:
|
||||
hydrus_hash = hash_override
|
||||
# If we have a hash override, we treat it as a Hydrus target
|
||||
storage_source = "hydrus"
|
||||
|
||||
if not file_path and not hydrus_hash:
|
||||
log(f"[add_tags] Warning: Result has neither file_path nor hash available, skipping", file=sys.stderr)
|
||||
ctx.emit(res)
|
||||
continue
|
||||
|
||||
# Handle -duplicate logic (copy existing tags to new namespaces)
|
||||
if duplicate_arg:
|
||||
# Parse duplicate format: source:target1,target2 or source,target1,target2
|
||||
parts = duplicate_arg.split(':')
|
||||
source_ns = ""
|
||||
targets = []
|
||||
|
||||
if len(parts) > 1:
|
||||
# Explicit format: source:target1,target2
|
||||
source_ns = parts[0]
|
||||
targets = parts[1:]
|
||||
|
||||
if source_ns and targets:
|
||||
# Find tags in source namespace
|
||||
source_tags = [t for t in existing_tags if t.startswith(source_ns + ':')]
|
||||
for t in source_tags:
|
||||
value = t.split(':', 1)[1]
|
||||
for target_ns in targets:
|
||||
new_tag = f"{target_ns}:{value}"
|
||||
if new_tag not in existing_tags and new_tag not in tags_to_add:
|
||||
tags_to_add.append(new_tag)
|
||||
|
||||
# Merge new tags with existing tags, handling namespace overwrites
|
||||
# When adding a tag like "namespace:value", remove any existing "namespace:*" tags
|
||||
for new_tag in tags_to_add:
|
||||
# Check if this is a namespaced tag (format: "namespace:value")
|
||||
if ':' in new_tag:
|
||||
namespace = new_tag.split(':', 1)[0]
|
||||
# Track removals for Hydrus: delete old tags in same namespace (except identical)
|
||||
to_remove = [t for t in existing_tags if t.startswith(namespace + ':') and t.lower() != new_tag.lower()]
|
||||
removed_tags.extend(to_remove)
|
||||
# Remove any existing tags with the same namespace
|
||||
existing_tags = [t for t in existing_tags if not (t.startswith(namespace + ':'))]
|
||||
|
||||
# Add the new tag if not already present
|
||||
if new_tag not in existing_tags:
|
||||
existing_tags.append(new_tag)
|
||||
|
||||
# Ensure only one tag per namespace (e.g., single title:) with latest preferred
|
||||
existing_tags = collapse_namespace_tags(existing_tags, "title", prefer="last")
|
||||
|
||||
# Compute new tags relative to original
|
||||
new_tags_added = [t for t in existing_tags if isinstance(t, str) and t.lower() not in original_tags_lower]
|
||||
total_new_tags += len(new_tags_added)
|
||||
|
||||
# Update the result's tags
|
||||
if isinstance(res, models.PipeObject):
|
||||
res.extra['tags'] = existing_tags
|
||||
elif isinstance(res, dict):
|
||||
res['tags'] = existing_tags
|
||||
|
||||
# If a title: tag was added, update the in-memory title and columns so downstream display reflects it immediately
|
||||
title_value = _extract_title_tag(existing_tags)
|
||||
_apply_title_to_result(res, title_value)
|
||||
|
||||
final_tags = existing_tags
|
||||
|
||||
# Determine where to add tags: Hydrus, local DB, or sidecar
|
||||
if storage_source and storage_source.lower() == 'hydrus':
|
||||
# Add tags to Hydrus using the API
|
||||
target_hash = hydrus_hash or file_hash
|
||||
if target_hash:
|
||||
try:
|
||||
tags_to_send = [t for t in existing_tags if isinstance(t, str) and t.lower() not in original_tags_lower]
|
||||
hydrus_client = hydrus_wrapper.get_client(config)
|
||||
service_name = hydrus_wrapper.get_tag_service_name(config)
|
||||
if tags_to_send:
|
||||
log(f"[add_tags] Adding {len(tags_to_send)} new tag(s) to Hydrus file: {target_hash}", file=sys.stderr)
|
||||
hydrus_client.add_tags(target_hash, tags_to_send, service_name)
|
||||
else:
|
||||
log(f"[add_tags] No new tags to add for Hydrus file: {target_hash}", file=sys.stderr)
|
||||
# Delete old namespace tags we replaced (e.g., previous title:)
|
||||
if removed_tags:
|
||||
unique_removed = sorted(set(removed_tags))
|
||||
hydrus_client.delete_tags(target_hash, unique_removed, service_name)
|
||||
if tags_to_send:
|
||||
log(f"[add_tags] ✓ Tags added to Hydrus", file=sys.stderr)
|
||||
elif removed_tags:
|
||||
log(f"[add_tags] ✓ Removed {len(unique_removed)} tag(s) from Hydrus", file=sys.stderr)
|
||||
sidecar_count += 1
|
||||
if tags_to_send or removed_tags:
|
||||
total_modified += 1
|
||||
except Exception as e:
|
||||
log(f"[add_tags] Warning: Failed to add tags to Hydrus: {e}", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tags] Warning: No hash available for Hydrus file, skipping", file=sys.stderr)
|
||||
elif storage_source and storage_source.lower() == 'local':
|
||||
# For local storage, save directly to DB (no sidecar needed)
|
||||
if file_path:
|
||||
library_root = get_local_storage_path(config)
|
||||
if library_root:
|
||||
try:
|
||||
path_obj = Path(file_path)
|
||||
with LocalLibraryDB(library_root) as db:
|
||||
db.save_tags(path_obj, existing_tags)
|
||||
# Reload tags to reflect DB state (preserves auto-title logic)
|
||||
refreshed_tags = db.get_tags(path_obj) or existing_tags
|
||||
# Recompute title from refreshed tags for accurate display
|
||||
refreshed_title = _extract_title_tag(refreshed_tags)
|
||||
if refreshed_title:
|
||||
_apply_title_to_result(res, refreshed_title)
|
||||
res_tags = refreshed_tags or existing_tags
|
||||
if isinstance(res, models.PipeObject):
|
||||
res.extra['tags'] = res_tags
|
||||
elif isinstance(res, dict):
|
||||
res['tags'] = res_tags
|
||||
log(f"[add_tags] Added {len(new_tags_added)} new tag(s); {len(res_tags)} total tag(s) stored locally", file=sys.stderr)
|
||||
sidecar_count += 1
|
||||
if new_tags_added or removed_tags:
|
||||
total_modified += 1
|
||||
final_tags = res_tags
|
||||
except Exception as e:
|
||||
log(f"[add_tags] Warning: Failed to save tags to local DB: {e}", file=sys.stderr)
|
||||
targets = parts[1].split(',')
|
||||
else:
|
||||
log(f"[add_tags] Warning: No library root configured for local storage, skipping", file=sys.stderr)
|
||||
# Inferred format: source,target1,target2
|
||||
parts = duplicate_arg.split(',')
|
||||
if len(parts) > 1:
|
||||
source_ns = parts[0]
|
||||
targets = parts[1:]
|
||||
|
||||
if source_ns and targets:
|
||||
# Find tags in source namespace
|
||||
source_tags = [t for t in existing_tags if t.startswith(source_ns + ':')]
|
||||
for t in source_tags:
|
||||
value = t.split(':', 1)[1]
|
||||
for target_ns in targets:
|
||||
new_tag = f"{target_ns}:{value}"
|
||||
if new_tag not in existing_tags and new_tag not in tags_to_add:
|
||||
tags_to_add.append(new_tag)
|
||||
|
||||
# Merge new tags with existing tags, handling namespace overwrites
|
||||
# When adding a tag like "namespace:value", remove any existing "namespace:*" tags
|
||||
for new_tag in tags_to_add:
|
||||
# Check if this is a namespaced tag (format: "namespace:value")
|
||||
if ':' in new_tag:
|
||||
namespace = new_tag.split(':', 1)[0]
|
||||
# Track removals for Hydrus: delete old tags in same namespace (except identical)
|
||||
to_remove = [t for t in existing_tags if t.startswith(namespace + ':') and t.lower() != new_tag.lower()]
|
||||
removed_tags.extend(to_remove)
|
||||
# Remove any existing tags with the same namespace
|
||||
existing_tags = [t for t in existing_tags if not (t.startswith(namespace + ':'))]
|
||||
|
||||
# Add the new tag if not already present
|
||||
if new_tag not in existing_tags:
|
||||
existing_tags.append(new_tag)
|
||||
|
||||
# Ensure only one tag per namespace (e.g., single title:) with latest preferred
|
||||
existing_tags = collapse_namespace_tags(existing_tags, "title", prefer="last")
|
||||
|
||||
# Compute new tags relative to original
|
||||
new_tags_added = [t for t in existing_tags if isinstance(t, str) and t.lower() not in original_tags_lower]
|
||||
total_new_tags += len(new_tags_added)
|
||||
|
||||
# Update the result's tags
|
||||
if isinstance(res, models.PipeObject):
|
||||
res.extra['tags'] = existing_tags
|
||||
elif isinstance(res, dict):
|
||||
res['tags'] = existing_tags
|
||||
|
||||
# If a title: tag was added, update the in-memory title and columns so downstream display reflects it immediately
|
||||
title_value = _extract_title_tag(existing_tags)
|
||||
_apply_title_to_result(res, title_value)
|
||||
|
||||
final_tags = existing_tags
|
||||
|
||||
# Determine where to add tags: Hydrus, local DB, or sidecar
|
||||
if storage_source and storage_source.lower() == 'hydrus':
|
||||
# Add tags to Hydrus using the API
|
||||
target_hash = hydrus_hash or file_hash
|
||||
if target_hash:
|
||||
try:
|
||||
tags_to_send = [t for t in existing_tags if isinstance(t, str) and t.lower() not in original_tags_lower]
|
||||
hydrus_client = hydrus_wrapper.get_client(config)
|
||||
service_name = hydrus_wrapper.get_tag_service_name(config)
|
||||
if tags_to_send:
|
||||
log(f"[add_tags] Adding {len(tags_to_send)} new tag(s) to Hydrus file: {target_hash}", file=sys.stderr)
|
||||
hydrus_client.add_tags(target_hash, tags_to_send, service_name)
|
||||
else:
|
||||
log(f"[add_tags] No new tags to add for Hydrus file: {target_hash}", file=sys.stderr)
|
||||
# Delete old namespace tags we replaced (e.g., previous title:)
|
||||
if removed_tags:
|
||||
unique_removed = sorted(set(removed_tags))
|
||||
hydrus_client.delete_tags(target_hash, unique_removed, service_name)
|
||||
if tags_to_send:
|
||||
log(f"[add_tags] ✓ Tags added to Hydrus", file=sys.stderr)
|
||||
elif removed_tags:
|
||||
log(f"[add_tags] ✓ Removed {len(unique_removed)} tag(s) from Hydrus", file=sys.stderr)
|
||||
sidecar_count += 1
|
||||
if tags_to_send or removed_tags:
|
||||
total_modified += 1
|
||||
except Exception as e:
|
||||
log(f"[add_tags] Warning: Failed to add tags to Hydrus: {e}", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tags] Warning: No hash available for Hydrus file, skipping", file=sys.stderr)
|
||||
elif storage_source and storage_source.lower() == 'local':
|
||||
# For local storage, save directly to DB (no sidecar needed)
|
||||
if file_path:
|
||||
library_root = get_local_storage_path(config)
|
||||
if library_root:
|
||||
try:
|
||||
path_obj = Path(file_path)
|
||||
with FolderDB(library_root) as db:
|
||||
db.save_tags(path_obj, existing_tags)
|
||||
# Reload tags to reflect DB state (preserves auto-title logic)
|
||||
file_hash = db.get_file_hash(path_obj)
|
||||
refreshed_tags = db.get_tags(file_hash) if file_hash else existing_tags
|
||||
# Recompute title from refreshed tags for accurate display
|
||||
refreshed_title = _extract_title_tag(refreshed_tags)
|
||||
if refreshed_title:
|
||||
_apply_title_to_result(res, refreshed_title)
|
||||
res_tags = refreshed_tags or existing_tags
|
||||
if isinstance(res, models.PipeObject):
|
||||
res.extra['tags'] = res_tags
|
||||
elif isinstance(res, dict):
|
||||
res['tags'] = res_tags
|
||||
log(f"[add_tags] Added {len(new_tags_added)} new tag(s); {len(res_tags)} total tag(s) stored locally", file=sys.stderr)
|
||||
sidecar_count += 1
|
||||
if new_tags_added or removed_tags:
|
||||
total_modified += 1
|
||||
final_tags = res_tags
|
||||
except Exception as e:
|
||||
log(f"[add_tags] Warning: Failed to save tags to local DB: {e}", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tags] Warning: No library root configured for local storage, skipping", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tags] Warning: No file path for local storage, skipping", file=sys.stderr)
|
||||
else:
|
||||
log(f"[add_tags] Warning: No file path for local storage, skipping", file=sys.stderr)
|
||||
else:
|
||||
# For other storage types or unknown sources, avoid writing sidecars to reduce clutter
|
||||
# (local/hydrus are handled above).
|
||||
# For other storage types or unknown sources, avoid writing sidecars to reduce clutter
|
||||
# (local/hydrus are handled above).
|
||||
ctx.emit(res)
|
||||
continue
|
||||
|
||||
# If title changed, refresh the cached result table so the display reflects the new name
|
||||
final_title = _extract_title_tag(final_tags)
|
||||
if final_title and (not original_title or final_title.lower() != original_title.lower()):
|
||||
_refresh_result_table_title(final_title, hydrus_hash or file_hash, file_hash, file_path)
|
||||
|
||||
# If tags changed, refresh tag view via get-tag (prefer current subject; fall back to hash refresh)
|
||||
if new_tags_added or removed_tags:
|
||||
_refresh_tags_view(res, hydrus_hash, file_hash, file_path, config)
|
||||
|
||||
# Emit the modified result
|
||||
ctx.emit(res)
|
||||
continue
|
||||
|
||||
# If title changed, refresh the cached result table so the display reflects the new name
|
||||
final_title = _extract_title_tag(final_tags)
|
||||
if final_title and (not original_title or final_title.lower() != original_title.lower()):
|
||||
_refresh_result_table_title(final_title, hydrus_hash or file_hash, file_hash, file_path)
|
||||
log(f"[add_tags] Added {total_new_tags} new tag(s) across {len(results)} item(s); modified {total_modified} item(s)", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
# If tags changed, refresh tag view via get-tag (prefer current subject; fall back to hash refresh)
|
||||
if new_tags_added or removed_tags:
|
||||
_refresh_tags_view(res, hydrus_hash, file_hash, file_path, config)
|
||||
|
||||
# Emit the modified result
|
||||
ctx.emit(res)
|
||||
|
||||
log(f"[add_tags] Added {total_new_tags} new tag(s) across {len(results)} item(s); modified {total_modified} item(s)", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="add-tags",
|
||||
summary="Add tags to a Hydrus file or write them to a local .tags sidecar.",
|
||||
usage="add-tags [-hash <sha256>] [-duplicate <format>] [-list <list>[,<list>...]] [--all] <tag>[,<tag>...]",
|
||||
args=[
|
||||
CmdletArg("-hash", type="string", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
CmdletArg("-duplicate", type="string", description="Copy existing tag values to new namespaces. Formats: title:album,artist (explicit) or title,album,artist (inferred)"),
|
||||
CmdletArg("-list", type="string", description="Load predefined tag lists from adjective.json. Comma-separated list names (e.g., -list philosophy,occult)."),
|
||||
CmdletArg("--all", type="flag", description="Include temporary files in tagging (by default, only tags non-temporary files)."),
|
||||
CmdletArg("tags", type="string", required=False, description="One or more tags to add. Comma- or space-separated. Can also use {list_name} syntax. If omitted, uses tags from pipeline payload.", variadic=True),
|
||||
],
|
||||
details=[
|
||||
"- By default, only tags non-temporary files (from pipelines). Use --all to tag everything.",
|
||||
"- Without -hash and when the selection is a local file, tags are written to <file>.tags.",
|
||||
"- With a Hydrus hash, tags are sent to the 'my tags' service.",
|
||||
"- Multiple tags can be comma-separated or space-separated.",
|
||||
"- Use -list to include predefined tag lists from adjective.json: -list philosophy,occult",
|
||||
"- Tags can also reference lists with curly braces: add-tag {philosophy} \"other:tag\"",
|
||||
"- Use -duplicate to copy EXISTING tag values to new namespaces:",
|
||||
" Explicit format: -duplicate title:album,artist (copies title: to album: and artist:)",
|
||||
" Inferred format: -duplicate title,album,artist (first is source, rest are targets)",
|
||||
"- The source namespace must already exist in the file being tagged.",
|
||||
"- Target namespaces that already have a value are skipped (not overwritten).",
|
||||
],
|
||||
)
|
||||
CMDLET = Add_Tag()
|
||||
@@ -1,170 +1,85 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, Sequence
|
||||
import json
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
from . import register
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, parse_cmdlet_args, get_field, normalize_hash
|
||||
from helper.logger import log
|
||||
from config import get_local_storage_path
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.logger import debug
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="add-url",
|
||||
summary="Associate a URL with a file (Hydrus or Local).",
|
||||
usage="add-url [-hash <sha256>] <url>",
|
||||
args=[
|
||||
CmdletArg("-hash", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
CmdletArg("url", required=True, description="The URL to associate with the file."),
|
||||
],
|
||||
details=[
|
||||
"- Adds the URL to the file's known URL list.",
|
||||
],
|
||||
)
|
||||
from helper.store import FileStorage
|
||||
|
||||
|
||||
@register(["add-url", "ass-url", "associate-url", "add_url"]) # aliases
|
||||
def add(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
class Add_Url(Cmdlet):
|
||||
"""Add URL associations to files via hash+store."""
|
||||
|
||||
NAME = "add-url"
|
||||
SUMMARY = "Associate a URL with a file"
|
||||
USAGE = "@1 | add-url <url>"
|
||||
ARGS = [
|
||||
SharedArgs.HASH,
|
||||
SharedArgs.STORE,
|
||||
CmdletArg("url", required=True, description="URL to associate"),
|
||||
]
|
||||
DETAIL = [
|
||||
"- Associates URL with file identified by hash+store",
|
||||
"- Multiple url can be comma-separated",
|
||||
]
|
||||
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Add URL to file via hash+store backend."""
|
||||
parsed = parse_cmdlet_args(args, self)
|
||||
|
||||
# Extract hash and store from result or args
|
||||
file_hash = parsed.get("hash") or get_field(result, "hash")
|
||||
store_name = parsed.get("store") or get_field(result, "store")
|
||||
url_arg = parsed.get("url")
|
||||
|
||||
if not file_hash:
|
||||
log("Error: No file hash provided")
|
||||
return 1
|
||||
|
||||
if not store_name:
|
||||
log("Error: No store name provided")
|
||||
return 1
|
||||
|
||||
if not url_arg:
|
||||
log("Error: No URL provided")
|
||||
return 1
|
||||
|
||||
# Normalize hash
|
||||
file_hash = normalize_hash(file_hash)
|
||||
if not file_hash:
|
||||
log("Error: Invalid hash format")
|
||||
return 1
|
||||
|
||||
# Parse url (comma-separated)
|
||||
url = [u.strip() for u in str(url_arg).split(',') if u.strip()]
|
||||
if not url:
|
||||
log("Error: No valid url provided")
|
||||
return 1
|
||||
|
||||
# Get backend and add url
|
||||
try:
|
||||
storage = FileStorage(config)
|
||||
backend = storage[store_name]
|
||||
|
||||
for url in url:
|
||||
backend.add_url(file_hash, url)
|
||||
ctx.emit(f"Added URL: {url}")
|
||||
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
from ._shared import parse_cmdlet_args
|
||||
parsed = parse_cmdlet_args(args, CMDLET)
|
||||
override_hash = parsed.get("hash")
|
||||
url_arg = parsed.get("url")
|
||||
|
||||
if not url_arg:
|
||||
log("Requires a URL argument")
|
||||
return 1
|
||||
|
||||
url_arg = str(url_arg).strip()
|
||||
if not url_arg:
|
||||
log("Requires a non-empty URL")
|
||||
return 1
|
||||
|
||||
# Split by comma to handle multiple URLs
|
||||
urls_to_add = [u.strip() for u in url_arg.split(',') if u.strip()]
|
||||
|
||||
# Handle @N selection which creates a list - extract the first item
|
||||
if isinstance(result, list) and len(result) > 0:
|
||||
result = result[0]
|
||||
|
||||
# Helper to get field from both dict and object
|
||||
def get_field(obj: Any, field: str, default: Any = None) -> Any:
|
||||
if isinstance(obj, dict):
|
||||
return obj.get(field, default)
|
||||
else:
|
||||
return getattr(obj, field, default)
|
||||
|
||||
success = False
|
||||
|
||||
# 1. Try Local Library
|
||||
file_path = get_field(result, "file_path") or get_field(result, "path")
|
||||
if file_path and not override_hash:
|
||||
try:
|
||||
path_obj = Path(file_path)
|
||||
if path_obj.exists():
|
||||
storage_path = get_local_storage_path(config)
|
||||
if storage_path:
|
||||
with LocalLibraryDB(storage_path) as db:
|
||||
metadata = db.get_metadata(path_obj) or {}
|
||||
known_urls = metadata.get("known_urls") or []
|
||||
|
||||
local_changed = False
|
||||
for url in urls_to_add:
|
||||
if url not in known_urls:
|
||||
known_urls.append(url)
|
||||
local_changed = True
|
||||
ctx.emit(f"Associated URL with local file {path_obj.name}: {url}")
|
||||
else:
|
||||
ctx.emit(f"URL already exists for local file {path_obj.name}: {url}")
|
||||
|
||||
if local_changed:
|
||||
metadata["known_urls"] = known_urls
|
||||
# Ensure we have a hash if possible, but don't fail if not
|
||||
if not metadata.get("hash"):
|
||||
try:
|
||||
from helper.utils import sha256_file
|
||||
metadata["hash"] = sha256_file(path_obj)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
db.save_metadata(path_obj, metadata)
|
||||
|
||||
success = True
|
||||
except Exception as e:
|
||||
log(f"Error updating local library: {e}", file=sys.stderr)
|
||||
|
||||
# 2. Try Hydrus
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(get_field(result, "hash_hex", None))
|
||||
|
||||
if hash_hex:
|
||||
try:
|
||||
client = hydrus_wrapper.get_client(config)
|
||||
if client:
|
||||
for url in urls_to_add:
|
||||
client.associate_url(hash_hex, url)
|
||||
preview = hash_hex[:12] + ('…' if len(hash_hex) > 12 else '')
|
||||
ctx.emit(f"Associated URL with Hydrus file {preview}: {url}")
|
||||
success = True
|
||||
|
||||
except KeyError:
|
||||
log(f"Error: Storage backend '{store_name}' not configured")
|
||||
return 1
|
||||
except Exception as exc:
|
||||
# Only log error if we didn't succeed locally either
|
||||
if not success:
|
||||
log(f"Hydrus add-url failed: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
log(f"Error adding URL: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
if success:
|
||||
# If we just mutated the currently displayed item, refresh URLs via get-url
|
||||
try:
|
||||
from cmdlets import get_url as get_url_cmd # type: ignore
|
||||
except Exception:
|
||||
get_url_cmd = None
|
||||
if get_url_cmd:
|
||||
try:
|
||||
subject = ctx.get_last_result_subject()
|
||||
if subject is not None:
|
||||
def norm(val: Any) -> str:
|
||||
return str(val).lower()
|
||||
target_hash = norm(hash_hex) if hash_hex else None
|
||||
target_path = norm(file_path) if 'file_path' in locals() else None
|
||||
subj_hashes = []
|
||||
subj_paths = []
|
||||
if isinstance(subject, dict):
|
||||
subj_hashes = [norm(v) for v in [subject.get("hydrus_hash"), subject.get("hash"), subject.get("hash_hex"), subject.get("file_hash")] if v]
|
||||
subj_paths = [norm(v) for v in [subject.get("file_path"), subject.get("path"), subject.get("target")] if v]
|
||||
else:
|
||||
subj_hashes = [norm(getattr(subject, f, None)) for f in ("hydrus_hash", "hash", "hash_hex", "file_hash") if getattr(subject, f, None)]
|
||||
subj_paths = [norm(getattr(subject, f, None)) for f in ("file_path", "path", "target") if getattr(subject, f, None)]
|
||||
is_match = False
|
||||
if target_hash and target_hash in subj_hashes:
|
||||
is_match = True
|
||||
if target_path and target_path in subj_paths:
|
||||
is_match = True
|
||||
if is_match:
|
||||
refresh_args: list[str] = []
|
||||
if hash_hex:
|
||||
refresh_args.extend(["-hash", hash_hex])
|
||||
get_url_cmd._run(subject, refresh_args, config)
|
||||
except Exception:
|
||||
debug("URL refresh skipped (error)")
|
||||
return 0
|
||||
|
||||
if not hash_hex and not file_path:
|
||||
log("Selected result does not include a file path or Hydrus hash", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
return 1
|
||||
|
||||
# Register cmdlet
|
||||
register(["add-url", "add_url"])(Add_Url)
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -8,19 +8,19 @@ from helper.logger import log
|
||||
|
||||
from . import register
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, normalize_hash, should_show_help
|
||||
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="check-file-status",
|
||||
summary="Check if a file is active, deleted, or corrupted in Hydrus.",
|
||||
usage="check-file-status [-hash <sha256>]",
|
||||
args=[
|
||||
CmdletArg("-hash", description="File hash (SHA256) to check. If not provided, uses selected result."),
|
||||
arg=[
|
||||
SharedArgs.HASH,
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"- Shows whether file is active in Hydrus or marked as deleted",
|
||||
"- Detects corrupted data (e.g., comma-separated URLs)",
|
||||
"- Detects corrupted data (e.g., comma-separated url)",
|
||||
"- Displays file metadata and service locations",
|
||||
"- Note: Hydrus keeps deleted files for recovery. Use cleanup-corrupted for full removal.",
|
||||
],
|
||||
@@ -30,12 +30,9 @@ CMDLET = Cmdlet(
|
||||
@register(["check-file-status", "check-status", "file-status", "status"])
|
||||
def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
|
||||
# Parse arguments
|
||||
override_hash: str | None = None
|
||||
@@ -109,11 +106,11 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
log(f" - {sname} ({stype}) - deleted at {time_deleted}", file=sys.stderr)
|
||||
|
||||
# URL check
|
||||
urls = file_info.get("known_urls", [])
|
||||
log(f"\n🔗 URLs ({len(urls)}):", file=sys.stderr)
|
||||
url = file_info.get("url", [])
|
||||
log(f"\n🔗 url ({len(url)}):", file=sys.stderr)
|
||||
|
||||
corrupted_count = 0
|
||||
for i, url in enumerate(urls, 1):
|
||||
for i, url in enumerate(url, 1):
|
||||
if "," in url:
|
||||
corrupted_count += 1
|
||||
log(f" [{i}] ⚠️ CORRUPTED (comma-separated): {url[:50]}...", file=sys.stderr)
|
||||
|
||||
@@ -9,11 +9,12 @@ from __future__ import annotations
|
||||
from typing import Any, Dict, Sequence
|
||||
from pathlib import Path
|
||||
import sys
|
||||
import json
|
||||
|
||||
from helper.logger import log
|
||||
|
||||
from . import register
|
||||
from ._shared import Cmdlet, CmdletArg, get_pipe_object_path, normalize_result_input, filter_results_by_temp
|
||||
from ._shared import Cmdlet, CmdletArg, get_pipe_object_path, normalize_result_input, filter_results_by_temp, should_show_help
|
||||
import models
|
||||
import pipeline as pipeline_context
|
||||
|
||||
@@ -36,13 +37,9 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""
|
||||
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
import json
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
|
||||
# Normalize input to list
|
||||
results = normalize_result_input(result)
|
||||
@@ -97,8 +94,8 @@ CMDLET = Cmdlet(
|
||||
name="cleanup",
|
||||
summary="Remove temporary artifacts from pipeline (marked with is_temp=True).",
|
||||
usage="cleanup",
|
||||
args=[],
|
||||
details=[
|
||||
arg=[],
|
||||
detail=[
|
||||
"- Accepts pipeline results that may contain temporary files (screenshots, intermediate artifacts)",
|
||||
"- Deletes files marked with is_temp=True from disk",
|
||||
"- Also cleans up associated sidecar files (.tags, .metadata)",
|
||||
|
||||
@@ -1,398 +1,249 @@
|
||||
"""Delete-file cmdlet: Delete files from local storage and/or Hydrus."""
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, Sequence
|
||||
import json
|
||||
import sys
|
||||
|
||||
from helper.logger import debug, log
|
||||
import sqlite3
|
||||
from pathlib import Path
|
||||
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from helper.logger import debug, log
|
||||
from helper.store import Folder
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash, looks_like_hash, get_origin, get_field, should_show_help
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash, looks_like_hash
|
||||
from config import get_local_storage_path
|
||||
from helper.local_library import LocalLibraryDB
|
||||
import pipeline as ctx
|
||||
|
||||
|
||||
def _refresh_last_search(config: Dict[str, Any]) -> None:
|
||||
"""Re-run the last search-file to refresh the table after deletes."""
|
||||
try:
|
||||
source_cmd = ctx.get_last_result_table_source_command() if hasattr(ctx, "get_last_result_table_source_command") else None
|
||||
if source_cmd not in {"search-file", "search_file", "search"}:
|
||||
return
|
||||
class Delete_File(Cmdlet):
|
||||
"""Class-based delete-file cmdlet with self-registration."""
|
||||
|
||||
args = ctx.get_last_result_table_source_args() if hasattr(ctx, "get_last_result_table_source_args") else []
|
||||
try:
|
||||
from cmdlets import search_file as search_file_cmd # type: ignore
|
||||
except Exception:
|
||||
return
|
||||
def __init__(self) -> None:
|
||||
super().__init__(
|
||||
name="delete-file",
|
||||
summary="Delete a file locally and/or from Hydrus, including database entries.",
|
||||
usage="delete-file [-hash <sha256>] [-conserve <local|hydrus>] [-lib-root <path>] [reason]",
|
||||
alias=["del-file"],
|
||||
arg=[
|
||||
CmdletArg("hash", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
CmdletArg("conserve", description="Choose which copy to keep: 'local' or 'hydrus'."),
|
||||
CmdletArg("lib-root", description="Path to local library root for database cleanup."),
|
||||
CmdletArg("reason", description="Optional reason for deletion (free text)."),
|
||||
],
|
||||
detail=[
|
||||
"Default removes both the local file and Hydrus file.",
|
||||
"Use -conserve local to keep the local file, or -conserve hydrus to keep it in Hydrus.",
|
||||
"Database entries are automatically cleaned up for local files.",
|
||||
"Any remaining arguments are treated as the Hydrus reason text.",
|
||||
],
|
||||
exec=self.run,
|
||||
)
|
||||
self.register()
|
||||
|
||||
# Re-run the prior search to refresh items/table without disturbing history
|
||||
search_file_cmd._run(None, args, config)
|
||||
def _process_single_item(self, item: Any, override_hash: str | None, conserve: str | None,
|
||||
lib_root: str | None, reason: str, config: Dict[str, Any]) -> bool:
|
||||
"""Process deletion for a single item."""
|
||||
# Handle item as either dict or object
|
||||
if isinstance(item, dict):
|
||||
hash_hex_raw = item.get("hash_hex") or item.get("hash")
|
||||
target = item.get("target") or item.get("file_path") or item.get("path")
|
||||
else:
|
||||
hash_hex_raw = get_field(item, "hash_hex") or get_field(item, "hash")
|
||||
target = get_field(item, "target") or get_field(item, "file_path") or get_field(item, "path")
|
||||
|
||||
origin = get_origin(item)
|
||||
|
||||
# Also check the store field explicitly from PipeObject
|
||||
store = None
|
||||
if isinstance(item, dict):
|
||||
store = item.get("store")
|
||||
else:
|
||||
store = get_field(item, "store")
|
||||
|
||||
# For Hydrus files, the target IS the hash
|
||||
if origin and origin.lower() == "hydrus" and not hash_hex_raw:
|
||||
hash_hex_raw = target
|
||||
|
||||
# Set an overlay so action-command pipeline output displays the refreshed table
|
||||
try:
|
||||
new_table = ctx.get_last_result_table()
|
||||
new_items = ctx.get_last_result_items()
|
||||
subject = ctx.get_last_result_subject() if hasattr(ctx, "get_last_result_subject") else None
|
||||
if hasattr(ctx, "set_last_result_table_overlay") and new_table and new_items is not None:
|
||||
ctx.set_last_result_table_overlay(new_table, new_items, subject)
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as exc:
|
||||
debug(f"[delete_file] search refresh failed: {exc}", file=sys.stderr)
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(hash_hex_raw)
|
||||
|
||||
|
||||
|
||||
|
||||
def _cleanup_relationships(db_path: Path, file_hash: str) -> int:
|
||||
"""Remove references to file_hash from other files' relationships."""
|
||||
try:
|
||||
conn = sqlite3.connect(db_path)
|
||||
cursor = conn.cursor()
|
||||
local_deleted = False
|
||||
local_target = isinstance(target, str) and target.strip() and not str(target).lower().startswith(("http://", "https://"))
|
||||
|
||||
# Find all metadata entries that contain this hash in relationships
|
||||
cursor.execute("SELECT file_id, relationships FROM metadata WHERE relationships LIKE ?", (f'%{file_hash}%',))
|
||||
rows = cursor.fetchall()
|
||||
|
||||
rel_update_count = 0
|
||||
for row_fid, rel_json in rows:
|
||||
try:
|
||||
rels = json.loads(rel_json)
|
||||
changed = False
|
||||
if isinstance(rels, dict):
|
||||
for r_type, hashes in rels.items():
|
||||
if isinstance(hashes, list) and file_hash in hashes:
|
||||
hashes.remove(file_hash)
|
||||
changed = True
|
||||
|
||||
if changed:
|
||||
cursor.execute("UPDATE metadata SET relationships = ? WHERE file_id = ?", (json.dumps(rels), row_fid))
|
||||
rel_update_count += 1
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
if rel_update_count > 0:
|
||||
debug(f"Removed relationship references from {rel_update_count} other files", file=sys.stderr)
|
||||
return rel_update_count
|
||||
except Exception as e:
|
||||
debug(f"Error cleaning up relationships: {e}", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
|
||||
def _delete_database_entry(db_path: Path, file_path: str) -> bool:
|
||||
"""Delete file and related entries from local library database.
|
||||
|
||||
Args:
|
||||
db_path: Path to the library.db file
|
||||
file_path: Exact file path string as stored in database
|
||||
|
||||
Returns:
|
||||
True if successful, False otherwise
|
||||
"""
|
||||
try:
|
||||
if not db_path.exists():
|
||||
debug(f"Database not found at {db_path}", file=sys.stderr)
|
||||
return False
|
||||
|
||||
conn = sqlite3.connect(db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
debug(f"Searching database for file_path: {file_path}", file=sys.stderr)
|
||||
|
||||
# Find the file_id using the exact file_path
|
||||
cursor.execute('SELECT id FROM files WHERE file_path = ?', (file_path,))
|
||||
result = cursor.fetchone()
|
||||
|
||||
if not result:
|
||||
debug(f"File path not found in database: {file_path}", file=sys.stderr)
|
||||
conn.close()
|
||||
return False
|
||||
|
||||
file_id = result[0]
|
||||
|
||||
# Get file hash before deletion to clean up relationships
|
||||
cursor.execute('SELECT file_hash FROM files WHERE id = ?', (file_id,))
|
||||
hash_result = cursor.fetchone()
|
||||
file_hash = hash_result[0] if hash_result else None
|
||||
|
||||
debug(f"Found file_id={file_id}, deleting all related records", file=sys.stderr)
|
||||
|
||||
# Delete related records
|
||||
cursor.execute('DELETE FROM metadata WHERE file_id = ?', (file_id,))
|
||||
meta_count = cursor.rowcount
|
||||
|
||||
cursor.execute('DELETE FROM tags WHERE file_id = ?', (file_id,))
|
||||
tags_count = cursor.rowcount
|
||||
|
||||
cursor.execute('DELETE FROM notes WHERE file_id = ?', (file_id,))
|
||||
notes_count = cursor.rowcount
|
||||
|
||||
cursor.execute('DELETE FROM files WHERE id = ?', (file_id,))
|
||||
files_count = cursor.rowcount
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
|
||||
# Clean up relationships in other files
|
||||
if file_hash:
|
||||
_cleanup_relationships(db_path, file_hash)
|
||||
|
||||
debug(f"Deleted: metadata={meta_count}, tags={tags_count}, notes={notes_count}, files={files_count}", file=sys.stderr)
|
||||
return True
|
||||
|
||||
except Exception as exc:
|
||||
log(f"Database cleanup failed: {exc}", file=sys.stderr)
|
||||
import traceback
|
||||
traceback.print_exc(file=sys.stderr)
|
||||
return False
|
||||
|
||||
|
||||
def _process_single_item(item: Any, override_hash: str | None, conserve: str | None,
|
||||
lib_root: str | None, reason: str, config: Dict[str, Any]) -> bool:
|
||||
"""Process deletion for a single item."""
|
||||
# Handle item as either dict or object
|
||||
if isinstance(item, dict):
|
||||
hash_hex_raw = item.get("hash_hex") or item.get("hash")
|
||||
target = item.get("target")
|
||||
origin = item.get("origin")
|
||||
else:
|
||||
hash_hex_raw = getattr(item, "hash_hex", None) or getattr(item, "hash", None)
|
||||
target = getattr(item, "target", None)
|
||||
origin = getattr(item, "origin", None)
|
||||
|
||||
# For Hydrus files, the target IS the hash
|
||||
if origin and origin.lower() == "hydrus" and not hash_hex_raw:
|
||||
hash_hex_raw = target
|
||||
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(hash_hex_raw)
|
||||
|
||||
local_deleted = False
|
||||
local_target = isinstance(target, str) and target.strip() and not str(target).lower().startswith(("http://", "https://"))
|
||||
|
||||
# Try to resolve local path if target looks like a hash and we have a library root
|
||||
if local_target and looks_like_hash(str(target)) and lib_root:
|
||||
try:
|
||||
db_path = Path(lib_root) / ".downlow_library.db"
|
||||
if db_path.exists():
|
||||
# We can't use LocalLibraryDB context manager easily here without importing it,
|
||||
# but we can use a quick sqlite connection or just use the class if imported.
|
||||
# We imported LocalLibraryDB, so let's use it.
|
||||
with LocalLibraryDB(Path(lib_root)) as db:
|
||||
resolved = db.search_by_hash(str(target))
|
||||
if resolved:
|
||||
target = str(resolved)
|
||||
# Also ensure we have the hash set for Hydrus deletion if needed
|
||||
if not hash_hex:
|
||||
hash_hex = normalize_hash(str(target))
|
||||
except Exception as e:
|
||||
debug(f"Failed to resolve hash to local path: {e}", file=sys.stderr)
|
||||
|
||||
if conserve != "local" and local_target:
|
||||
path = Path(str(target))
|
||||
file_path_str = str(target) # Keep the original string for DB matching
|
||||
try:
|
||||
if path.exists() and path.is_file():
|
||||
path.unlink()
|
||||
local_deleted = True
|
||||
if ctx._PIPE_ACTIVE:
|
||||
ctx.emit(f"Removed local file: {path}")
|
||||
log(f"Deleted: {path.name}", file=sys.stderr)
|
||||
except Exception as exc:
|
||||
log(f"Local delete failed: {exc}", file=sys.stderr)
|
||||
|
||||
# Remove common sidecars regardless of file removal success
|
||||
for sidecar in (path.with_suffix(".tags"), path.with_suffix(".tags.txt"),
|
||||
path.with_suffix(".metadata"), path.with_suffix(".notes")):
|
||||
try:
|
||||
if sidecar.exists() and sidecar.is_file():
|
||||
sidecar.unlink()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Clean up database entry if library root provided - do this regardless of file deletion success
|
||||
if lib_root:
|
||||
lib_root_path = Path(lib_root)
|
||||
db_path = lib_root_path / ".downlow_library.db"
|
||||
if conserve != "local" and local_target:
|
||||
path = Path(str(target))
|
||||
|
||||
# If file_path_str is a hash (because file was already deleted or target was hash),
|
||||
# we need to find the path by hash in the DB first
|
||||
if looks_like_hash(file_path_str):
|
||||
# If lib_root is provided and this is from a folder store, use the Folder class
|
||||
if lib_root:
|
||||
try:
|
||||
with LocalLibraryDB(lib_root_path) as db:
|
||||
resolved = db.search_by_hash(file_path_str)
|
||||
if resolved:
|
||||
file_path_str = str(resolved)
|
||||
folder = Folder(Path(lib_root), name=origin or "local")
|
||||
if folder.delete_file(str(path)):
|
||||
local_deleted = True
|
||||
ctx.emit(f"Removed file: {path.name}")
|
||||
log(f"Deleted: {path.name}", file=sys.stderr)
|
||||
except Exception as exc:
|
||||
debug(f"Folder.delete_file failed: {exc}", file=sys.stderr)
|
||||
# Fallback to manual deletion
|
||||
try:
|
||||
if path.exists() and path.is_file():
|
||||
path.unlink()
|
||||
local_deleted = True
|
||||
ctx.emit(f"Removed local file: {path}")
|
||||
log(f"Deleted: {path.name}", file=sys.stderr)
|
||||
except Exception as exc:
|
||||
log(f"Local delete failed: {exc}", file=sys.stderr)
|
||||
else:
|
||||
# No lib_root, just delete the file
|
||||
try:
|
||||
if path.exists() and path.is_file():
|
||||
path.unlink()
|
||||
local_deleted = True
|
||||
ctx.emit(f"Removed local file: {path}")
|
||||
log(f"Deleted: {path.name}", file=sys.stderr)
|
||||
except Exception as exc:
|
||||
log(f"Local delete failed: {exc}", file=sys.stderr)
|
||||
|
||||
# Remove common sidecars regardless of file removal success
|
||||
for sidecar in (path.with_suffix(".tags"), path.with_suffix(".tags.txt"),
|
||||
path.with_suffix(".metadata"), path.with_suffix(".notes")):
|
||||
try:
|
||||
if sidecar.exists() and sidecar.is_file():
|
||||
sidecar.unlink()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
db_success = _delete_database_entry(db_path, file_path_str)
|
||||
|
||||
if not db_success:
|
||||
# If deletion failed (e.g. not found), but we have a hash, try to clean up relationships anyway
|
||||
effective_hash = None
|
||||
if looks_like_hash(file_path_str):
|
||||
effective_hash = file_path_str
|
||||
elif hash_hex:
|
||||
effective_hash = hash_hex
|
||||
|
||||
if effective_hash:
|
||||
debug(f"Entry not found, but attempting to clean up relationships for hash: {effective_hash}", file=sys.stderr)
|
||||
if _cleanup_relationships(db_path, effective_hash) > 0:
|
||||
db_success = True
|
||||
|
||||
if db_success:
|
||||
if ctx._PIPE_ACTIVE:
|
||||
ctx.emit(f"Removed database entry: {path.name}")
|
||||
debug(f"Database entry cleaned up", file=sys.stderr)
|
||||
local_deleted = True
|
||||
else:
|
||||
debug(f"Database entry not found or cleanup failed for {file_path_str}", file=sys.stderr)
|
||||
else:
|
||||
debug(f"No lib_root provided, skipping database cleanup", file=sys.stderr)
|
||||
|
||||
hydrus_deleted = False
|
||||
# Only attempt Hydrus deletion if origin is explicitly Hydrus or if we failed to delete locally
|
||||
# and we suspect it might be in Hydrus.
|
||||
# If origin is local, we should default to NOT deleting from Hydrus unless requested?
|
||||
# Or maybe we should check if it exists in Hydrus first?
|
||||
# The user complaint is "its still trying to delete hydrus, this is a local file".
|
||||
|
||||
should_try_hydrus = True
|
||||
if origin and origin.lower() == "local":
|
||||
should_try_hydrus = False
|
||||
|
||||
# If conserve is set to hydrus, definitely don't delete
|
||||
if conserve == "hydrus":
|
||||
hydrus_deleted = False
|
||||
# Only attempt Hydrus deletion if store is explicitly Hydrus-related
|
||||
# Check both origin and store fields to determine if this is a Hydrus file
|
||||
|
||||
should_try_hydrus = False
|
||||
|
||||
if should_try_hydrus and hash_hex:
|
||||
try:
|
||||
client = hydrus_wrapper.get_client(config)
|
||||
except Exception as exc:
|
||||
if not local_deleted:
|
||||
log(f"Hydrus client unavailable: {exc}", file=sys.stderr)
|
||||
return False
|
||||
else:
|
||||
if client is None:
|
||||
# Check if store indicates this is a Hydrus backend
|
||||
if store and ("hydrus" in store.lower() or store.lower() == "home" or store.lower() == "work"):
|
||||
should_try_hydrus = True
|
||||
# Fallback to origin check if store not available
|
||||
elif origin and origin.lower() == "hydrus":
|
||||
should_try_hydrus = True
|
||||
|
||||
# If conserve is set to hydrus, definitely don't delete
|
||||
if conserve == "hydrus":
|
||||
should_try_hydrus = False
|
||||
|
||||
if should_try_hydrus and hash_hex:
|
||||
try:
|
||||
client = hydrus_wrapper.get_client(config)
|
||||
except Exception as exc:
|
||||
if not local_deleted:
|
||||
# If we deleted locally, we don't care if Hydrus is unavailable
|
||||
pass
|
||||
else:
|
||||
log("Hydrus client unavailable", file=sys.stderr)
|
||||
log(f"Hydrus client unavailable: {exc}", file=sys.stderr)
|
||||
return False
|
||||
else:
|
||||
payload: Dict[str, Any] = {"hashes": [hash_hex]}
|
||||
if reason:
|
||||
payload["reason"] = reason
|
||||
try:
|
||||
client._post("/add_files/delete_files", data=payload) # type: ignore[attr-defined]
|
||||
hydrus_deleted = True
|
||||
preview = hash_hex[:12] + ('…' if len(hash_hex) > 12 else '')
|
||||
debug(f"Deleted from Hydrus: {preview}…", file=sys.stderr)
|
||||
except Exception as exc:
|
||||
# If it's not in Hydrus (e.g. 404 or similar), that's fine
|
||||
# log(f"Hydrus delete failed: {exc}", file=sys.stderr)
|
||||
if client is None:
|
||||
if not local_deleted:
|
||||
log("Hydrus client unavailable", file=sys.stderr)
|
||||
return False
|
||||
else:
|
||||
payload: Dict[str, Any] = {"hashes": [hash_hex]}
|
||||
if reason:
|
||||
payload["reason"] = reason
|
||||
try:
|
||||
client._post("/add_files/delete_files", data=payload) # type: ignore[attr-defined]
|
||||
hydrus_deleted = True
|
||||
preview = hash_hex[:12] + ('…' if len(hash_hex) > 12 else '')
|
||||
debug(f"Deleted from Hydrus: {preview}…", file=sys.stderr)
|
||||
except Exception as exc:
|
||||
# If it's not in Hydrus (e.g. 404 or similar), that's fine
|
||||
if not local_deleted:
|
||||
return False
|
||||
|
||||
if hydrus_deleted and hash_hex:
|
||||
preview = hash_hex[:12] + ('…' if len(hash_hex) > 12 else '')
|
||||
if ctx._PIPE_ACTIVE:
|
||||
if hydrus_deleted and hash_hex:
|
||||
preview = hash_hex[:12] + ('…' if len(hash_hex) > 12 else '')
|
||||
if reason:
|
||||
ctx.emit(f"Deleted {preview} (reason: {reason}).")
|
||||
else:
|
||||
ctx.emit(f"Deleted {preview}.")
|
||||
|
||||
if hydrus_deleted or local_deleted:
|
||||
return True
|
||||
if hydrus_deleted or local_deleted:
|
||||
return True
|
||||
|
||||
log("Selected result has neither Hydrus hash nor local file target")
|
||||
return False
|
||||
log("Selected result has neither Hydrus hash nor local file target")
|
||||
return False
|
||||
|
||||
|
||||
def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Execute delete-file command."""
|
||||
if should_show_help(args):
|
||||
log(f"Cmdlet: {self.name}\nSummary: {self.summary}\nUsage: {self.usage}")
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
override_hash: str | None = None
|
||||
conserve: str | None = None
|
||||
lib_root: str | None = None
|
||||
reason_tokens: list[str] = []
|
||||
i = 0
|
||||
while i < len(args):
|
||||
token = args[i]
|
||||
low = str(token).lower()
|
||||
if low in {"-hash", "--hash", "hash"} and i + 1 < len(args):
|
||||
override_hash = str(args[i + 1]).strip()
|
||||
i += 2
|
||||
continue
|
||||
if low in {"-conserve", "--conserve"} and i + 1 < len(args):
|
||||
value = str(args[i + 1]).strip().lower()
|
||||
if value in {"local", "hydrus"}:
|
||||
conserve = value
|
||||
# Parse arguments
|
||||
override_hash: str | None = None
|
||||
conserve: str | None = None
|
||||
lib_root: str | None = None
|
||||
reason_tokens: list[str] = []
|
||||
i = 0
|
||||
|
||||
while i < len(args):
|
||||
token = args[i]
|
||||
low = str(token).lower()
|
||||
if low in {"-hash", "--hash", "hash"} and i + 1 < len(args):
|
||||
override_hash = str(args[i + 1]).strip()
|
||||
i += 2
|
||||
continue
|
||||
if low in {"-lib-root", "--lib-root", "lib-root"} and i + 1 < len(args):
|
||||
lib_root = str(args[i + 1]).strip()
|
||||
i += 2
|
||||
continue
|
||||
reason_tokens.append(token)
|
||||
i += 1
|
||||
if low in {"-conserve", "--conserve"} and i + 1 < len(args):
|
||||
value = str(args[i + 1]).strip().lower()
|
||||
if value in {"local", "hydrus"}:
|
||||
conserve = value
|
||||
i += 2
|
||||
continue
|
||||
if low in {"-lib-root", "--lib-root", "lib-root"} and i + 1 < len(args):
|
||||
lib_root = str(args[i + 1]).strip()
|
||||
i += 2
|
||||
continue
|
||||
reason_tokens.append(token)
|
||||
i += 1
|
||||
|
||||
if not lib_root:
|
||||
# Try to get from config
|
||||
p = get_local_storage_path(config)
|
||||
if p:
|
||||
lib_root = str(p)
|
||||
# If no lib_root provided, try to get the first folder store from config
|
||||
if not lib_root:
|
||||
try:
|
||||
storage_config = config.get("storage", {})
|
||||
folder_config = storage_config.get("folder", {})
|
||||
if folder_config:
|
||||
# Get first folder store path
|
||||
for store_name, store_config in folder_config.items():
|
||||
if isinstance(store_config, dict):
|
||||
path = store_config.get("path")
|
||||
if path:
|
||||
lib_root = path
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
reason = " ".join(token for token in reason_tokens if str(token).strip()).strip()
|
||||
reason = " ".join(token for token in reason_tokens if str(token).strip()).strip()
|
||||
|
||||
items = []
|
||||
if isinstance(result, list):
|
||||
items = result
|
||||
elif result:
|
||||
items = [result]
|
||||
|
||||
if not items:
|
||||
log("No items to delete", file=sys.stderr)
|
||||
return 1
|
||||
items = []
|
||||
if isinstance(result, list):
|
||||
items = result
|
||||
elif result:
|
||||
items = [result]
|
||||
|
||||
if not items:
|
||||
log("No items to delete", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
success_count = 0
|
||||
for item in items:
|
||||
if _process_single_item(item, override_hash, conserve, lib_root, reason, config):
|
||||
success_count += 1
|
||||
success_count = 0
|
||||
for item in items:
|
||||
if self._process_single_item(item, override_hash, conserve, lib_root, reason, config):
|
||||
success_count += 1
|
||||
|
||||
if success_count > 0:
|
||||
_refresh_last_search(config)
|
||||
if success_count > 0:
|
||||
# Clear cached tables/items so deleted entries are not redisplayed
|
||||
try:
|
||||
ctx.set_last_result_table_overlay(None, None, None)
|
||||
ctx.set_last_result_table(None, [])
|
||||
ctx.set_last_result_items_only([])
|
||||
ctx.set_current_stage_table(None)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return 0 if success_count > 0 else 1
|
||||
return 0 if success_count > 0 else 1
|
||||
|
||||
|
||||
# Instantiate and register the cmdlet
|
||||
Delete_File()
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="delete-file",
|
||||
summary="Delete a file locally and/or from Hydrus, including database entries.",
|
||||
usage="delete-file [-hash <sha256>] [-conserve <local|hydrus>] [-lib-root <path>] [reason]",
|
||||
aliases=["del-file"],
|
||||
args=[
|
||||
CmdletArg("hash", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
CmdletArg("conserve", description="Choose which copy to keep: 'local' or 'hydrus'."),
|
||||
CmdletArg("lib-root", description="Path to local library root for database cleanup."),
|
||||
CmdletArg("reason", description="Optional reason for deletion (free text)."),
|
||||
],
|
||||
details=[
|
||||
"Default removes both the local file and Hydrus file.",
|
||||
"Use -conserve local to keep the local file, or -conserve hydrus to keep it in Hydrus.",
|
||||
"Database entries are automatically cleaned up for local files.",
|
||||
"Any remaining arguments are treated as the Hydrus reason text.",
|
||||
],
|
||||
)
|
||||
|
||||
|
||||
@@ -5,18 +5,18 @@ import json
|
||||
|
||||
import pipeline as ctx
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash, get_hash_for_operation, fetch_hydrus_metadata, should_show_help, get_field
|
||||
from helper.logger import log
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="delete-note",
|
||||
summary="Delete a named note from a Hydrus file.",
|
||||
usage="i | del-note [-hash <sha256>] <name>",
|
||||
aliases=["del-note"],
|
||||
args=[
|
||||
alias=["del-note"],
|
||||
arg=[
|
||||
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"- Removes the note with the given name from the Hydrus file.",
|
||||
],
|
||||
)
|
||||
@@ -24,12 +24,9 @@ CMDLET = Cmdlet(
|
||||
|
||||
def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
if not args:
|
||||
log("Requires the note name/key to delete")
|
||||
return 1
|
||||
@@ -57,7 +54,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
if isinstance(result, list) and len(result) > 0:
|
||||
result = result[0]
|
||||
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(getattr(result, "hash_hex", None))
|
||||
hash_hex = get_hash_for_operation(override_hash, result)
|
||||
if not hash_hex:
|
||||
log("Selected result does not include a Hydrus hash")
|
||||
return 1
|
||||
@@ -93,7 +90,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
if isinstance(subject, dict):
|
||||
subj_hashes = [norm(v) for v in [subject.get("hydrus_hash"), subject.get("hash"), subject.get("hash_hex"), subject.get("file_hash")] if v]
|
||||
else:
|
||||
subj_hashes = [norm(getattr(subject, f, None)) for f in ("hydrus_hash", "hash", "hash_hex", "file_hash") if getattr(subject, f, None)]
|
||||
subj_hashes = [norm(get_field(subject, f)) for f in ("hydrus_hash", "hash", "hash_hex", "file_hash") if get_field(subject, f)]
|
||||
if target_hash and target_hash in subj_hashes:
|
||||
get_note_cmd.get_notes(subject, ["-hash", hash_hex], config)
|
||||
return 0
|
||||
|
||||
@@ -10,8 +10,8 @@ import sys
|
||||
from helper.logger import log
|
||||
|
||||
import pipeline as ctx
|
||||
from ._shared import Cmdlet, CmdletArg, parse_cmdlet_args, normalize_result_input
|
||||
from helper.local_library import LocalLibrarySearchOptimizer
|
||||
from ._shared import Cmdlet, CmdletArg, parse_cmdlet_args, normalize_result_input, get_field
|
||||
from helper.folder_store import LocalLibrarySearchOptimizer
|
||||
from config import get_local_storage_path
|
||||
|
||||
|
||||
@@ -35,12 +35,14 @@ def _refresh_relationship_view_if_current(target_hash: Optional[str], target_pat
|
||||
|
||||
subj_hashes: list[str] = []
|
||||
subj_paths: list[str] = []
|
||||
if isinstance(subject, dict):
|
||||
subj_hashes = [norm(v) for v in [subject.get("hydrus_hash"), subject.get("hash"), subject.get("hash_hex"), subject.get("file_hash")] if v]
|
||||
subj_paths = [norm(v) for v in [subject.get("file_path"), subject.get("path"), subject.get("target")] if v]
|
||||
else:
|
||||
subj_hashes = [norm(getattr(subject, f, None)) for f in ("hydrus_hash", "hash", "hash_hex", "file_hash") if getattr(subject, f, None)]
|
||||
subj_paths = [norm(getattr(subject, f, None)) for f in ("file_path", "path", "target") if getattr(subject, f, None)]
|
||||
for field in ("hydrus_hash", "hash", "hash_hex", "file_hash"):
|
||||
val = get_field(subject, field)
|
||||
if val:
|
||||
subj_hashes.append(norm(val))
|
||||
for field in ("file_path", "path", "target"):
|
||||
val = get_field(subject, field)
|
||||
if val:
|
||||
subj_paths.append(norm(val))
|
||||
|
||||
is_match = False
|
||||
if target_hashes and any(h in subj_hashes for h in target_hashes):
|
||||
@@ -93,21 +95,12 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
for single_result in results:
|
||||
try:
|
||||
# Get file path from result
|
||||
file_path_from_result = None
|
||||
|
||||
if isinstance(single_result, dict):
|
||||
file_path_from_result = (
|
||||
single_result.get("file_path") or
|
||||
single_result.get("path") or
|
||||
single_result.get("target")
|
||||
)
|
||||
else:
|
||||
file_path_from_result = (
|
||||
getattr(single_result, "file_path", None) or
|
||||
getattr(single_result, "path", None) or
|
||||
getattr(single_result, "target", None) or
|
||||
str(single_result)
|
||||
)
|
||||
file_path_from_result = (
|
||||
get_field(single_result, "file_path")
|
||||
or get_field(single_result, "path")
|
||||
or get_field(single_result, "target")
|
||||
or (str(single_result) if not isinstance(single_result, dict) else None)
|
||||
)
|
||||
|
||||
if not file_path_from_result:
|
||||
log("Could not extract file path from result", file=sys.stderr)
|
||||
@@ -199,12 +192,12 @@ CMDLET = Cmdlet(
|
||||
name="delete-relationship",
|
||||
summary="Remove relationships from files.",
|
||||
usage="@1 | delete-relationship --all OR delete-relationship -path <file> --all OR @1-3 | delete-relationship -type alt",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg("path", type="string", description="Specify the local file path (if not piping a result)."),
|
||||
CmdletArg("all", type="flag", description="Delete all relationships for the file(s)."),
|
||||
CmdletArg("type", type="string", description="Delete specific relationship type ('alt', 'king', 'related'). Default: delete all types."),
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"- Delete all relationships: pipe files | delete-relationship --all",
|
||||
"- Delete specific type: pipe files | delete-relationship -type alt",
|
||||
"- Delete all from file: delete-relationship -path <file> --all",
|
||||
|
||||
@@ -9,7 +9,7 @@ from . import register
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash, parse_tag_arguments
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, normalize_hash, parse_tag_arguments, fetch_hydrus_metadata, should_show_help, get_field
|
||||
from helper.logger import debug, log
|
||||
|
||||
|
||||
@@ -37,8 +37,8 @@ def _refresh_tag_view_if_current(hash_hex: str | None, file_path: str | None, co
|
||||
subj_hashes = [norm(v) for v in [subject.get("hydrus_hash"), subject.get("hash"), subject.get("hash_hex"), subject.get("file_hash")] if v]
|
||||
subj_paths = [norm(v) for v in [subject.get("file_path"), subject.get("path"), subject.get("target")] if v]
|
||||
else:
|
||||
subj_hashes = [norm(getattr(subject, f, None)) for f in ("hydrus_hash", "hash", "hash_hex", "file_hash") if getattr(subject, f, None)]
|
||||
subj_paths = [norm(getattr(subject, f, None)) for f in ("file_path", "path", "target") if getattr(subject, f, None)]
|
||||
subj_hashes = [norm(get_field(subject, f)) for f in ("hydrus_hash", "hash", "hash_hex", "file_hash") if get_field(subject, f)]
|
||||
subj_paths = [norm(get_field(subject, f)) for f in ("file_path", "path", "target") if get_field(subject, f)]
|
||||
|
||||
is_match = False
|
||||
if target_hash and target_hash in subj_hashes:
|
||||
@@ -60,12 +60,12 @@ CMDLET = Cmdlet(
|
||||
name="delete-tags",
|
||||
summary="Remove tags from a Hydrus file.",
|
||||
usage="del-tags [-hash <sha256>] <tag>[,<tag>...]",
|
||||
aliases=["del-tag", "del-tags", "delete-tag"],
|
||||
args=[
|
||||
CmdletArg("-hash", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
alias=["del-tag", "del-tags", "delete-tag"],
|
||||
arg=[
|
||||
SharedArgs.HASH,
|
||||
CmdletArg("<tag>[,<tag>...]", required=True, description="One or more tags to remove. Comma- or space-separated."),
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"- Requires a Hydrus file (hash present) or explicit -hash override.",
|
||||
"- Multiple tags can be comma-separated or space-separated.",
|
||||
],
|
||||
@@ -74,12 +74,9 @@ CMDLET = Cmdlet(
|
||||
@register(["del-tag", "del-tags", "delete-tag", "delete-tags"]) # Still needed for backward compatibility
|
||||
def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
|
||||
# Check if we have a piped TagItem with no args (i.e., from @1 | delete-tag)
|
||||
has_piped_tag = (result and hasattr(result, '__class__') and
|
||||
@@ -139,15 +136,15 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
if idx - 1 < len(ctx._LAST_RESULT_ITEMS):
|
||||
item = ctx._LAST_RESULT_ITEMS[idx - 1]
|
||||
if hasattr(item, '__class__') and item.__class__.__name__ == 'TagItem':
|
||||
tag_name = getattr(item, 'tag_name', None)
|
||||
tag_name = get_field(item, 'tag_name')
|
||||
if tag_name:
|
||||
log(f"[delete_tag] Extracted tag from @{idx}: {tag_name}")
|
||||
tags_from_at_syntax.append(tag_name)
|
||||
# Also get hash from first item for consistency
|
||||
if not hash_from_at_syntax:
|
||||
hash_from_at_syntax = getattr(item, 'hash_hex', None)
|
||||
hash_from_at_syntax = get_field(item, 'hash_hex')
|
||||
if not file_path_from_at_syntax:
|
||||
file_path_from_at_syntax = getattr(item, 'file_path', None)
|
||||
file_path_from_at_syntax = get_field(item, 'file_path')
|
||||
|
||||
if not tags_from_at_syntax:
|
||||
log(f"No tags found at indices: {indices}")
|
||||
@@ -219,13 +216,13 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
for item in items_to_process:
|
||||
tags_to_delete = []
|
||||
item_hash = normalize_hash(override_hash) if override_hash else normalize_hash(getattr(item, "hash_hex", None))
|
||||
item_path = getattr(item, "path", None) or getattr(item, "file_path", None) or getattr(item, "target", None)
|
||||
# If result is a dict (e.g. from search-file), try getting path from keys
|
||||
if not item_path and isinstance(item, dict):
|
||||
item_path = item.get("path") or item.get("file_path") or item.get("target")
|
||||
|
||||
item_source = getattr(item, "source", None)
|
||||
item_hash = normalize_hash(override_hash) if override_hash else normalize_hash(get_field(item, "hash_hex"))
|
||||
item_path = (
|
||||
get_field(item, "path")
|
||||
or get_field(item, "file_path")
|
||||
or get_field(item, "target")
|
||||
)
|
||||
item_source = get_field(item, "source")
|
||||
|
||||
if hasattr(item, '__class__') and item.__class__.__name__ == 'TagItem':
|
||||
# It's a TagItem
|
||||
@@ -238,7 +235,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Let's assume if args are present, we use args. If not, we use the tag name.
|
||||
tags_to_delete = tags_arg
|
||||
else:
|
||||
tag_name = getattr(item, 'tag_name', None)
|
||||
tag_name = get_field(item, 'tag_name')
|
||||
if tag_name:
|
||||
tags_to_delete = [tag_name]
|
||||
else:
|
||||
@@ -270,34 +267,31 @@ def _process_deletion(tags: list[str], hash_hex: str | None, file_path: str | No
|
||||
# Prefer local DB when we have a path and not explicitly hydrus
|
||||
if file_path and (source == "local" or (source != "hydrus" and not hash_hex)):
|
||||
try:
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
from config import get_local_storage_path
|
||||
path_obj = Path(file_path)
|
||||
local_root = get_local_storage_path(config) or path_obj.parent
|
||||
with LocalLibraryDB(local_root) as db:
|
||||
existing = db.get_tags(path_obj) or []
|
||||
with FolderDB(local_root) as db:
|
||||
file_hash = db.get_file_hash(path_obj)
|
||||
existing = db.get_tags(file_hash) if file_hash else []
|
||||
except Exception:
|
||||
existing = []
|
||||
elif hash_hex:
|
||||
try:
|
||||
client = hydrus_wrapper.get_client(config)
|
||||
payload = client.fetch_file_metadata(
|
||||
hashes=[hash_hex],
|
||||
include_service_keys_to_tags=True,
|
||||
include_file_urls=False,
|
||||
)
|
||||
items = payload.get("metadata") if isinstance(payload, dict) else None
|
||||
meta = items[0] if isinstance(items, list) and items else None
|
||||
if isinstance(meta, dict):
|
||||
tags_payload = meta.get("tags")
|
||||
if isinstance(tags_payload, dict):
|
||||
seen: set[str] = set()
|
||||
for svc_data in tags_payload.values():
|
||||
if not isinstance(svc_data, dict):
|
||||
continue
|
||||
display = svc_data.get("display_tags")
|
||||
if isinstance(display, list):
|
||||
for t in display:
|
||||
meta, _ = fetch_hydrus_metadata(
|
||||
config, hash_hex,
|
||||
include_service_keys_to_tags=True,
|
||||
include_file_url=False,
|
||||
)
|
||||
if isinstance(meta, dict):
|
||||
tags_payload = meta.get("tags")
|
||||
if isinstance(tags_payload, dict):
|
||||
seen: set[str] = set()
|
||||
for svc_data in tags_payload.values():
|
||||
if not isinstance(svc_data, dict):
|
||||
continue
|
||||
display = svc_data.get("display_tags")
|
||||
if isinstance(display, list):
|
||||
for t in display:
|
||||
if isinstance(t, (str, bytes)):
|
||||
val = str(t).strip()
|
||||
if val and val not in seen:
|
||||
@@ -313,8 +307,6 @@ def _process_deletion(tags: list[str], hash_hex: str | None, file_path: str | No
|
||||
if val and val not in seen:
|
||||
seen.add(val)
|
||||
existing.append(val)
|
||||
except Exception:
|
||||
existing = []
|
||||
return existing
|
||||
|
||||
# Safety: only block if this deletion would remove the final title tag
|
||||
@@ -335,7 +327,7 @@ def _process_deletion(tags: list[str], hash_hex: str | None, file_path: str | No
|
||||
# Handle local file tag deletion
|
||||
if file_path and (source == "local" or (not hash_hex and source != "hydrus")):
|
||||
try:
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
from pathlib import Path
|
||||
|
||||
path_obj = Path(file_path)
|
||||
@@ -351,7 +343,7 @@ def _process_deletion(tags: list[str], hash_hex: str | None, file_path: str | No
|
||||
# Fallback: assume file is in a library root or use its parent
|
||||
local_root = path_obj.parent
|
||||
|
||||
with LocalLibraryDB(local_root) as db:
|
||||
with FolderDB(local_root) as db:
|
||||
db.remove_tags(path_obj, tags)
|
||||
debug(f"Removed {len(tags)} tag(s) from {path_obj.name} (local)")
|
||||
_refresh_tag_view_if_current(hash_hex, file_path, config)
|
||||
|
||||
@@ -1,194 +1,82 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, Sequence
|
||||
import json
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
from . import register
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash
|
||||
from helper.logger import debug, log
|
||||
from config import get_local_storage_path
|
||||
from helper.local_library import LocalLibraryDB
|
||||
import pipeline as ctx
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="delete-url",
|
||||
summary="Remove a URL association from a file (Hydrus or Local).",
|
||||
usage="delete-url [-hash <sha256>] <url>",
|
||||
args=[
|
||||
CmdletArg("-hash", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
CmdletArg("url", required=True, description="The URL to remove from the file."),
|
||||
],
|
||||
details=[
|
||||
"- Removes the URL from the file's known URL list.",
|
||||
],
|
||||
)
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, parse_cmdlet_args, get_field, normalize_hash
|
||||
from helper.logger import log
|
||||
from helper.store import FileStorage
|
||||
|
||||
|
||||
def _parse_hash_and_rest(args: Sequence[str]) -> tuple[str | None, list[str]]:
|
||||
override_hash: str | None = None
|
||||
rest: list[str] = []
|
||||
i = 0
|
||||
while i < len(args):
|
||||
a = args[i]
|
||||
low = str(a).lower()
|
||||
if low in {"-hash", "--hash", "hash"} and i + 1 < len(args):
|
||||
override_hash = str(args[i + 1]).strip()
|
||||
i += 2
|
||||
continue
|
||||
rest.append(a)
|
||||
i += 1
|
||||
return override_hash, rest
|
||||
|
||||
|
||||
@register(["del-url", "delete-url", "delete_url"]) # aliases
|
||||
def delete(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
class Delete_Url(Cmdlet):
|
||||
"""Delete URL associations from files via hash+store."""
|
||||
|
||||
override_hash, rest = _parse_hash_and_rest(args)
|
||||
NAME = "delete-url"
|
||||
SUMMARY = "Remove a URL association from a file"
|
||||
USAGE = "@1 | delete-url <url>"
|
||||
ARGS = [
|
||||
SharedArgs.HASH,
|
||||
SharedArgs.STORE,
|
||||
CmdletArg("url", required=True, description="URL to remove"),
|
||||
]
|
||||
DETAIL = [
|
||||
"- Removes URL association from file identified by hash+store",
|
||||
"- Multiple url can be comma-separated",
|
||||
]
|
||||
|
||||
url_arg = None
|
||||
if rest:
|
||||
url_arg = str(rest[0] or '').strip()
|
||||
|
||||
# Normalize result to a list
|
||||
items = result if isinstance(result, list) else [result]
|
||||
if not items:
|
||||
log("No input provided.")
|
||||
return 1
|
||||
|
||||
success_count = 0
|
||||
|
||||
for item in items:
|
||||
target_url = url_arg
|
||||
target_file = item
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Delete URL from file via hash+store backend."""
|
||||
parsed = parse_cmdlet_args(args, self)
|
||||
|
||||
# Check for rich URL object from get-url
|
||||
if isinstance(item, dict) and "url" in item and "source_file" in item:
|
||||
if not target_url:
|
||||
target_url = item["url"]
|
||||
target_file = item["source_file"]
|
||||
# Extract hash and store from result or args
|
||||
file_hash = parsed.get("hash") or get_field(result, "hash")
|
||||
store_name = parsed.get("store") or get_field(result, "store")
|
||||
url_arg = parsed.get("url")
|
||||
|
||||
if not target_url:
|
||||
continue
|
||||
|
||||
if _delete_single(target_file, target_url, override_hash, config):
|
||||
success_count += 1
|
||||
|
||||
if success_count == 0:
|
||||
if not file_hash:
|
||||
log("Error: No file hash provided")
|
||||
return 1
|
||||
|
||||
if not store_name:
|
||||
log("Error: No store name provided")
|
||||
return 1
|
||||
|
||||
if not url_arg:
|
||||
log("Requires a URL argument or valid selection.")
|
||||
else:
|
||||
log("Failed to delete URL(s).")
|
||||
return 1
|
||||
log("Error: No URL provided")
|
||||
return 1
|
||||
|
||||
return 0
|
||||
|
||||
|
||||
def _delete_single(result: Any, url: str, override_hash: str | None, config: Dict[str, Any]) -> bool:
|
||||
# Helper to get field from both dict and object
|
||||
def get_field(obj: Any, field: str, default: Any = None) -> Any:
|
||||
if isinstance(obj, dict):
|
||||
return obj.get(field, default)
|
||||
else:
|
||||
return getattr(obj, field, default)
|
||||
|
||||
success = False
|
||||
|
||||
# 1. Try Local Library
|
||||
file_path = get_field(result, "file_path") or get_field(result, "path")
|
||||
if file_path and not override_hash:
|
||||
# Normalize hash
|
||||
file_hash = normalize_hash(file_hash)
|
||||
if not file_hash:
|
||||
log("Error: Invalid hash format")
|
||||
return 1
|
||||
|
||||
# Parse url (comma-separated)
|
||||
url = [u.strip() for u in str(url_arg).split(',') if u.strip()]
|
||||
if not url:
|
||||
log("Error: No valid url provided")
|
||||
return 1
|
||||
|
||||
# Get backend and delete url
|
||||
try:
|
||||
path_obj = Path(file_path)
|
||||
if path_obj.exists():
|
||||
storage_path = get_local_storage_path(config)
|
||||
if storage_path:
|
||||
with LocalLibraryDB(storage_path) as db:
|
||||
metadata = db.get_metadata(path_obj) or {}
|
||||
known_urls = metadata.get("known_urls") or []
|
||||
|
||||
# Handle comma-separated URLs if passed as arg
|
||||
# But first check if the exact url string exists (e.g. if it contains commas itself)
|
||||
urls_to_process = []
|
||||
if url in known_urls:
|
||||
urls_to_process = [url]
|
||||
else:
|
||||
urls_to_process = [u.strip() for u in url.split(',') if u.strip()]
|
||||
|
||||
local_changed = False
|
||||
for u in urls_to_process:
|
||||
if u in known_urls:
|
||||
known_urls.remove(u)
|
||||
local_changed = True
|
||||
ctx.emit(f"Deleted URL from local file {path_obj.name}: {u}")
|
||||
|
||||
if local_changed:
|
||||
metadata["known_urls"] = known_urls
|
||||
db.save_metadata(path_obj, metadata)
|
||||
success = True
|
||||
except Exception as e:
|
||||
log(f"Error updating local library: {e}", file=sys.stderr)
|
||||
|
||||
# 2. Try Hydrus
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(get_field(result, "hash_hex", None))
|
||||
|
||||
if hash_hex:
|
||||
try:
|
||||
client = hydrus_wrapper.get_client(config)
|
||||
if client:
|
||||
urls_to_delete = [u.strip() for u in url.split(',') if u.strip()]
|
||||
for u in urls_to_delete:
|
||||
client.delete_url(hash_hex, u)
|
||||
preview = hash_hex[:12] + ('…' if len(hash_hex) > 12 else '')
|
||||
ctx.emit(f"Deleted URL from Hydrus file {preview}: {u}")
|
||||
success = True
|
||||
storage = FileStorage(config)
|
||||
backend = storage[store_name]
|
||||
|
||||
for url in url:
|
||||
backend.delete_url(file_hash, url)
|
||||
ctx.emit(f"Deleted URL: {url}")
|
||||
|
||||
return 0
|
||||
|
||||
except KeyError:
|
||||
log(f"Error: Storage backend '{store_name}' not configured")
|
||||
return 1
|
||||
except Exception as exc:
|
||||
log(f"Hydrus del-url failed: {exc}", file=sys.stderr)
|
||||
log(f"Error deleting URL: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
if success:
|
||||
try:
|
||||
from cmdlets import get_url as get_url_cmd # type: ignore
|
||||
except Exception:
|
||||
get_url_cmd = None
|
||||
if get_url_cmd:
|
||||
try:
|
||||
subject = ctx.get_last_result_subject()
|
||||
if subject is not None:
|
||||
def norm(val: Any) -> str:
|
||||
return str(val).lower()
|
||||
|
||||
target_hash = norm(hash_hex) if hash_hex else None
|
||||
target_path = norm(file_path) if file_path else None
|
||||
|
||||
subj_hashes = []
|
||||
subj_paths = []
|
||||
if isinstance(subject, dict):
|
||||
subj_hashes = [norm(v) for v in [subject.get("hydrus_hash"), subject.get("hash"), subject.get("hash_hex"), subject.get("file_hash")] if v]
|
||||
subj_paths = [norm(v) for v in [subject.get("file_path"), subject.get("path"), subject.get("target")] if v]
|
||||
else:
|
||||
subj_hashes = [norm(getattr(subject, f, None)) for f in ("hydrus_hash", "hash", "hash_hex", "file_hash") if getattr(subject, f, None)]
|
||||
subj_paths = [norm(getattr(subject, f, None)) for f in ("file_path", "path", "target") if getattr(subject, f, None)]
|
||||
|
||||
is_match = False
|
||||
if target_hash and target_hash in subj_hashes:
|
||||
is_match = True
|
||||
if target_path and target_path in subj_paths:
|
||||
is_match = True
|
||||
|
||||
if is_match:
|
||||
refresh_args: list[str] = []
|
||||
if hash_hex:
|
||||
refresh_args.extend(["-hash", hash_hex])
|
||||
get_url_cmd._run(subject, refresh_args, config)
|
||||
except Exception:
|
||||
debug("URL refresh skipped (error)")
|
||||
|
||||
return success
|
||||
# Register cmdlet
|
||||
register(["delete-url", "del-url", "delete_url"])(Delete_Url)
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
199
cmdlets/download_file.py
Normal file
199
cmdlets/download_file.py
Normal file
@@ -0,0 +1,199 @@
|
||||
"""Download files directly via HTTP (non-yt-dlp url).
|
||||
|
||||
Focused cmdlet for direct file downloads from:
|
||||
- PDFs, images, documents
|
||||
- url not supported by yt-dlp
|
||||
- LibGen sources
|
||||
- Direct file links
|
||||
|
||||
No streaming site logic - pure HTTP download with retries.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, List, Optional, Sequence
|
||||
|
||||
from helper.download import DownloadError, _download_direct_file
|
||||
from helper.logger import log, debug
|
||||
from models import DownloadOptions
|
||||
import pipeline as pipeline_context
|
||||
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, parse_cmdlet_args, register_url_with_local_library, coerce_to_pipe_object
|
||||
|
||||
|
||||
class Download_File(Cmdlet):
|
||||
"""Class-based download-file cmdlet - direct HTTP downloads."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
"""Initialize download-file cmdlet."""
|
||||
super().__init__(
|
||||
name="download-file",
|
||||
summary="Download files directly via HTTP (PDFs, images, documents)",
|
||||
usage="download-file <url> [options] or search-file | download-file [options]",
|
||||
alias=["dl-file", "download-http"],
|
||||
arg=[
|
||||
CmdletArg(name="url", type="string", required=False, description="URL to download (direct file links)", variadic=True),
|
||||
CmdletArg(name="-url", type="string", description="URL to download (alias for positional argument)", variadic=True),
|
||||
CmdletArg(name="output", type="string", alias="o", description="Output filename (auto-detected if not specified)"),
|
||||
SharedArgs.URL
|
||||
],
|
||||
detail=["Download files directly via HTTP without yt-dlp processing.", "For streaming sites, use download-media."],
|
||||
exec=self.run,
|
||||
)
|
||||
self.register()
|
||||
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Main execution method."""
|
||||
stage_ctx = pipeline_context.get_stage_context()
|
||||
in_pipeline = stage_ctx is not None and getattr(stage_ctx, "total_stages", 1) > 1
|
||||
if in_pipeline and isinstance(config, dict):
|
||||
config["_quiet_background_output"] = True
|
||||
return self._run_impl(result, args, config)
|
||||
|
||||
def _run_impl(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Main download implementation for direct HTTP files."""
|
||||
try:
|
||||
debug("Starting download-file")
|
||||
|
||||
# Parse arguments
|
||||
parsed = parse_cmdlet_args(args, self)
|
||||
|
||||
# Extract options
|
||||
raw_url = parsed.get("url", [])
|
||||
if isinstance(raw_url, str):
|
||||
raw_url = [raw_url]
|
||||
|
||||
if not raw_url:
|
||||
log("No url to download", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Get output directory
|
||||
final_output_dir = self._resolve_output_dir(parsed, config)
|
||||
if not final_output_dir:
|
||||
return 1
|
||||
|
||||
debug(f"Output directory: {final_output_dir}")
|
||||
|
||||
# Download each URL
|
||||
downloaded_count = 0
|
||||
quiet_mode = bool(config.get("_quiet_background_output")) if isinstance(config, dict) else False
|
||||
custom_output = parsed.get("output")
|
||||
|
||||
for url in raw_url:
|
||||
try:
|
||||
debug(f"Processing: {url}")
|
||||
|
||||
# Direct HTTP download
|
||||
result_obj = _download_direct_file(url, final_output_dir, quiet=quiet_mode)
|
||||
debug(f"Download completed, building pipe object...")
|
||||
pipe_obj_dict = self._build_pipe_object(result_obj, url, final_output_dir)
|
||||
debug(f"Emitting result to pipeline...")
|
||||
pipeline_context.emit(pipe_obj_dict)
|
||||
|
||||
# Automatically register url with local library
|
||||
if pipe_obj_dict.get("url"):
|
||||
pipe_obj = coerce_to_pipe_object(pipe_obj_dict)
|
||||
register_url_with_local_library(pipe_obj, config)
|
||||
|
||||
downloaded_count += 1
|
||||
debug("✓ Downloaded and emitted")
|
||||
|
||||
except DownloadError as e:
|
||||
log(f"Download failed for {url}: {e}", file=sys.stderr)
|
||||
except Exception as e:
|
||||
log(f"Error processing {url}: {e}", file=sys.stderr)
|
||||
|
||||
if downloaded_count > 0:
|
||||
debug(f"✓ Successfully processed {downloaded_count} file(s)")
|
||||
return 0
|
||||
|
||||
log("No downloads completed", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
except Exception as e:
|
||||
log(f"Error in download-file: {e}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
def _resolve_output_dir(self, parsed: Dict[str, Any], config: Dict[str, Any]) -> Optional[Path]:
|
||||
"""Resolve the output directory from storage location or config."""
|
||||
storage_location = parsed.get("storage")
|
||||
|
||||
# Priority 1: --storage flag
|
||||
if storage_location:
|
||||
try:
|
||||
return SharedArgs.resolve_storage(storage_location)
|
||||
except Exception as e:
|
||||
log(f"Invalid storage location: {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
# Priority 2: Config outfile
|
||||
if config and config.get("outfile"):
|
||||
try:
|
||||
return Path(config["outfile"]).expanduser()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Priority 3: Default (home/Downloads)
|
||||
final_output_dir = Path.home() / "Downloads"
|
||||
debug(f"Using default directory: {final_output_dir}")
|
||||
|
||||
# Ensure directory exists
|
||||
try:
|
||||
final_output_dir.mkdir(parents=True, exist_ok=True)
|
||||
except Exception as e:
|
||||
log(f"Cannot create output directory {final_output_dir}: {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
return final_output_dir
|
||||
|
||||
def _build_pipe_object(self, download_result: Any, url: str, output_dir: Path) -> Dict[str, Any]:
|
||||
"""Create a PipeObject-compatible dict from a download result."""
|
||||
# Try to get file path from result
|
||||
file_path = None
|
||||
if hasattr(download_result, 'path'):
|
||||
file_path = download_result.path
|
||||
elif isinstance(download_result, dict) and 'path' in download_result:
|
||||
file_path = download_result['path']
|
||||
|
||||
if not file_path:
|
||||
# Fallback: assume result is the path itself
|
||||
file_path = str(download_result)
|
||||
|
||||
media_path = Path(file_path)
|
||||
hash_value = self._compute_file_hash(media_path)
|
||||
title = media_path.stem
|
||||
|
||||
# Build tags with title for searchability
|
||||
tags = [f"title:{title}"]
|
||||
|
||||
# Prefer canonical fields while keeping legacy keys for compatibility
|
||||
return {
|
||||
"path": str(media_path),
|
||||
"hash": hash_value,
|
||||
"file_hash": hash_value,
|
||||
"title": title,
|
||||
"file_title": title,
|
||||
"action": "cmdlet:download-file",
|
||||
"download_mode": "file",
|
||||
"url": url or (download_result.get('url') if isinstance(download_result, dict) else None),
|
||||
"url": [url] if url else [],
|
||||
"store": "local",
|
||||
"storage_source": "downloads",
|
||||
"media_kind": "file",
|
||||
"tags": tags,
|
||||
}
|
||||
|
||||
def _compute_file_hash(self, filepath: Path) -> str:
|
||||
"""Compute SHA256 hash of a file."""
|
||||
import hashlib
|
||||
sha256_hash = hashlib.sha256()
|
||||
with open(filepath, "rb") as f:
|
||||
for byte_block in iter(lambda: f.read(4096), b""):
|
||||
sha256_hash.update(byte_block)
|
||||
return sha256_hash.hexdigest()
|
||||
|
||||
|
||||
# Module-level singleton registration
|
||||
CMDLET = Download_File()
|
||||
1445
cmdlets/download_media.py
Normal file
1445
cmdlets/download_media.py
Normal file
File diff suppressed because it is too large
Load Diff
127
cmdlets/download_torrent.py
Normal file
127
cmdlets/download_torrent.py
Normal file
@@ -0,0 +1,127 @@
|
||||
"""Download torrent/magnet links via AllDebrid in a dedicated cmdlet.
|
||||
|
||||
Features:
|
||||
- Accepts magnet links and .torrent files/url
|
||||
- Uses AllDebrid API for background downloads
|
||||
- Progress tracking and worker management
|
||||
- Self-registering class-based cmdlet
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
import uuid
|
||||
import threading
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, Optional, Sequence
|
||||
|
||||
from helper.logger import log
|
||||
from ._shared import Cmdlet, CmdletArg, parse_cmdlet_args
|
||||
|
||||
class Download_Torrent(Cmdlet):
|
||||
"""Class-based download-torrent cmdlet with self-registration."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
super().__init__(
|
||||
name="download-torrent",
|
||||
summary="Download torrent/magnet links via AllDebrid",
|
||||
usage="download-torrent <magnet|.torrent> [options]",
|
||||
alias=["torrent", "magnet"],
|
||||
arg=[
|
||||
CmdletArg(name="magnet", type="string", required=False, description="Magnet link or .torrent file/URL", variadic=True),
|
||||
CmdletArg(name="output", type="string", description="Output directory for downloaded files"),
|
||||
CmdletArg(name="wait", type="float", description="Wait time (seconds) for magnet processing timeout"),
|
||||
CmdletArg(name="background", type="flag", alias="bg", description="Start download in background"),
|
||||
],
|
||||
detail=["Download torrents/magnets via AllDebrid API."],
|
||||
exec=self.run,
|
||||
)
|
||||
self.register()
|
||||
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
parsed = parse_cmdlet_args(args, self)
|
||||
magnet_args = parsed.get("magnet", [])
|
||||
output_dir = Path(parsed.get("output") or Path.home() / "Downloads")
|
||||
wait_timeout = int(float(parsed.get("wait", 600)))
|
||||
background_mode = parsed.get("background", False)
|
||||
api_key = config.get("alldebrid_api_key")
|
||||
if not api_key:
|
||||
log("AllDebrid API key not configured", file=sys.stderr)
|
||||
return 1
|
||||
for magnet_url in magnet_args:
|
||||
if background_mode:
|
||||
self._start_background_worker(magnet_url, output_dir, config, api_key, wait_timeout)
|
||||
log(f"⧗ Torrent download queued in background: {magnet_url}")
|
||||
else:
|
||||
self._download_torrent_worker(str(uuid.uuid4()), magnet_url, output_dir, config, api_key, wait_timeout)
|
||||
return 0
|
||||
|
||||
@staticmethod
|
||||
def _download_torrent_worker(
|
||||
worker_id: str,
|
||||
magnet_url: str,
|
||||
output_dir: Path,
|
||||
config: Dict[str, Any],
|
||||
api_key: str,
|
||||
wait_timeout: int = 600,
|
||||
worker_manager: Optional[Any] = None,
|
||||
) -> None:
|
||||
try:
|
||||
from helper.alldebrid import AllDebridClient
|
||||
client = AllDebridClient(api_key)
|
||||
log(f"[Worker {worker_id}] Submitting magnet to AllDebrid...")
|
||||
magnet_info = client.magnet_add(magnet_url)
|
||||
magnet_id = int(magnet_info.get('id', 0))
|
||||
if magnet_id <= 0:
|
||||
log(f"[Worker {worker_id}] Magnet add failed", file=sys.stderr)
|
||||
return
|
||||
log(f"[Worker {worker_id}] ✓ Magnet added (ID: {magnet_id})")
|
||||
# Poll for ready status (simplified)
|
||||
import time
|
||||
elapsed = 0
|
||||
while elapsed < wait_timeout:
|
||||
status = client.magnet_status(magnet_id)
|
||||
if status.get('ready'):
|
||||
break
|
||||
time.sleep(5)
|
||||
elapsed += 5
|
||||
if elapsed >= wait_timeout:
|
||||
log(f"[Worker {worker_id}] Timeout waiting for magnet", file=sys.stderr)
|
||||
return
|
||||
files_result = client.magnet_links([magnet_id])
|
||||
magnet_files = files_result.get(str(magnet_id), {})
|
||||
files_array = magnet_files.get('files', [])
|
||||
if not files_array:
|
||||
log(f"[Worker {worker_id}] No files found", file=sys.stderr)
|
||||
return
|
||||
for file_info in files_array:
|
||||
file_url = file_info.get('link')
|
||||
file_name = file_info.get('name')
|
||||
if file_url:
|
||||
Download_Torrent._download_file(file_url, output_dir / file_name)
|
||||
log(f"[Worker {worker_id}] ✓ Downloaded {file_name}")
|
||||
except Exception as e:
|
||||
log(f"[Worker {worker_id}] Torrent download failed: {e}", file=sys.stderr)
|
||||
|
||||
@staticmethod
|
||||
def _download_file(url: str, dest: Path) -> None:
|
||||
try:
|
||||
import requests
|
||||
resp = requests.get(url, stream=True)
|
||||
with open(dest, 'wb') as f:
|
||||
for chunk in resp.iter_content(chunk_size=8192):
|
||||
if chunk:
|
||||
f.write(chunk)
|
||||
except Exception as e:
|
||||
log(f"File download failed: {e}", file=sys.stderr)
|
||||
|
||||
def _start_background_worker(self, magnet_url, output_dir, config, api_key, wait_timeout):
|
||||
worker_id = f"torrent_{uuid.uuid4().hex[:6]}"
|
||||
thread = threading.Thread(
|
||||
target=self._download_torrent_worker,
|
||||
args=(worker_id, magnet_url, output_dir, config, api_key, wait_timeout),
|
||||
daemon=False,
|
||||
name=f"TorrentWorker_{worker_id}",
|
||||
)
|
||||
thread.start()
|
||||
|
||||
CMDLET = Download_Torrent()
|
||||
1864
cmdlets/get_file.py
1864
cmdlets/get_file.py
File diff suppressed because it is too large
Load Diff
1708
cmdlets/get_file.py.backup
Normal file
1708
cmdlets/get_file.py.backup
Normal file
File diff suppressed because it is too large
Load Diff
@@ -6,337 +6,224 @@ import sys
|
||||
|
||||
from helper.logger import log
|
||||
from pathlib import Path
|
||||
import mimetypes
|
||||
import os
|
||||
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash
|
||||
from config import get_local_storage_path
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, parse_cmdlet_args, get_field
|
||||
import pipeline as ctx
|
||||
from result_table import ResultTable
|
||||
|
||||
|
||||
def _extract_imported_ts(meta: Dict[str, Any]) -> Optional[int]:
|
||||
"""Extract an imported timestamp from Hydrus metadata if available."""
|
||||
if not isinstance(meta, dict):
|
||||
class Get_Metadata(Cmdlet):
|
||||
"""Class-based get-metadata cmdlet with self-registration."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
"""Initialize get-metadata cmdlet."""
|
||||
super().__init__(
|
||||
name="get-metadata",
|
||||
summary="Print metadata for files by hash and storage backend.",
|
||||
usage="get-metadata [-hash <sha256>] [-store <backend>]",
|
||||
alias=["meta"],
|
||||
arg=[
|
||||
SharedArgs.HASH,
|
||||
SharedArgs.STORE,
|
||||
],
|
||||
detail=[
|
||||
"- Retrieves metadata from storage backend using file hash as identifier.",
|
||||
"- Shows hash, MIME type, size, duration/pages, known url, and import timestamp.",
|
||||
"- Hash and store are taken from piped result or can be overridden with -hash/-store flags.",
|
||||
"- All metadata is retrieved from the storage backend's database (single source of truth).",
|
||||
],
|
||||
exec=self.run,
|
||||
)
|
||||
self.register()
|
||||
|
||||
@staticmethod
|
||||
def _extract_imported_ts(meta: Dict[str, Any]) -> Optional[int]:
|
||||
"""Extract an imported timestamp from metadata if available."""
|
||||
if not isinstance(meta, dict):
|
||||
return None
|
||||
|
||||
# Prefer explicit time_imported if present
|
||||
explicit = meta.get("time_imported")
|
||||
if isinstance(explicit, (int, float)):
|
||||
return int(explicit)
|
||||
|
||||
# Try parsing string timestamps
|
||||
if isinstance(explicit, str):
|
||||
try:
|
||||
import datetime as _dt
|
||||
return int(_dt.datetime.fromisoformat(explicit).timestamp())
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return None
|
||||
|
||||
# Prefer explicit time_imported if present
|
||||
explicit = meta.get("time_imported")
|
||||
if isinstance(explicit, (int, float)):
|
||||
return int(explicit)
|
||||
|
||||
file_services = meta.get("file_services")
|
||||
if isinstance(file_services, dict):
|
||||
current = file_services.get("current")
|
||||
if isinstance(current, dict):
|
||||
numeric = [int(v) for v in current.values() if isinstance(v, (int, float))]
|
||||
if numeric:
|
||||
return min(numeric)
|
||||
return None
|
||||
|
||||
|
||||
def _format_imported(ts: Optional[int]) -> str:
|
||||
if not ts:
|
||||
return ""
|
||||
try:
|
||||
import datetime as _dt
|
||||
return _dt.datetime.utcfromtimestamp(ts).strftime("%Y-%m-%d %H:%M:%S")
|
||||
except Exception:
|
||||
return ""
|
||||
|
||||
|
||||
def _build_table_row(title: str, origin: str, path: str, mime: str, size_bytes: Optional[int], dur_seconds: Optional[int], imported_ts: Optional[int], urls: list[str], hash_value: Optional[str], pages: Optional[int] = None) -> Dict[str, Any]:
|
||||
size_mb = None
|
||||
if isinstance(size_bytes, int):
|
||||
@staticmethod
|
||||
def _format_imported(ts: Optional[int]) -> str:
|
||||
"""Format timestamp as readable string."""
|
||||
if not ts:
|
||||
return ""
|
||||
try:
|
||||
size_mb = int(size_bytes / (1024 * 1024))
|
||||
import datetime as _dt
|
||||
return _dt.datetime.utcfromtimestamp(ts).strftime("%Y-%m-%d %H:%M:%S")
|
||||
except Exception:
|
||||
size_mb = None
|
||||
return ""
|
||||
|
||||
dur_int = int(dur_seconds) if isinstance(dur_seconds, (int, float)) else None
|
||||
pages_int = int(pages) if isinstance(pages, (int, float)) else None
|
||||
imported_label = _format_imported(imported_ts)
|
||||
@staticmethod
|
||||
def _build_table_row(title: str, origin: str, path: str, mime: str, size_bytes: Optional[int],
|
||||
dur_seconds: Optional[int], imported_ts: Optional[int], url: list[str],
|
||||
hash_value: Optional[str], pages: Optional[int] = None) -> Dict[str, Any]:
|
||||
"""Build a table row dict with metadata fields."""
|
||||
size_mb = None
|
||||
if isinstance(size_bytes, int):
|
||||
try:
|
||||
size_mb = int(size_bytes / (1024 * 1024))
|
||||
except Exception:
|
||||
size_mb = None
|
||||
|
||||
duration_label = "Duration(s)"
|
||||
duration_value = str(dur_int) if dur_int is not None else ""
|
||||
if mime and mime.lower().startswith("application/pdf"):
|
||||
duration_label = "Pages"
|
||||
duration_value = str(pages_int) if pages_int is not None else ""
|
||||
dur_int = int(dur_seconds) if isinstance(dur_seconds, (int, float)) else None
|
||||
pages_int = int(pages) if isinstance(pages, (int, float)) else None
|
||||
imported_label = Get_Metadata._format_imported(imported_ts)
|
||||
|
||||
columns = [
|
||||
("Title", title or ""),
|
||||
("Hash", hash_value or ""),
|
||||
("MIME", mime or ""),
|
||||
("Size(MB)", str(size_mb) if size_mb is not None else ""),
|
||||
(duration_label, duration_value),
|
||||
("Imported", imported_label),
|
||||
("Store", origin or ""),
|
||||
]
|
||||
duration_label = "Duration(s)"
|
||||
duration_value = str(dur_int) if dur_int is not None else ""
|
||||
if mime and mime.lower().startswith("application/pdf"):
|
||||
duration_label = "Pages"
|
||||
duration_value = str(pages_int) if pages_int is not None else ""
|
||||
|
||||
return {
|
||||
"title": title or path,
|
||||
"path": path,
|
||||
"origin": origin,
|
||||
"mime": mime,
|
||||
"size_bytes": size_bytes,
|
||||
"duration_seconds": dur_int,
|
||||
"pages": pages_int,
|
||||
"imported_ts": imported_ts,
|
||||
"imported": imported_label,
|
||||
"hash": hash_value,
|
||||
"known_urls": urls,
|
||||
"columns": columns,
|
||||
}
|
||||
columns = [
|
||||
("Title", title or ""),
|
||||
("Hash", hash_value or ""),
|
||||
("MIME", mime or ""),
|
||||
("Size(MB)", str(size_mb) if size_mb is not None else ""),
|
||||
(duration_label, duration_value),
|
||||
("Imported", imported_label),
|
||||
("Store", origin or ""),
|
||||
]
|
||||
|
||||
return {
|
||||
"title": title or path,
|
||||
"path": path,
|
||||
"origin": origin,
|
||||
"mime": mime,
|
||||
"size_bytes": size_bytes,
|
||||
"duration_seconds": dur_int,
|
||||
"pages": pages_int,
|
||||
"imported_ts": imported_ts,
|
||||
"imported": imported_label,
|
||||
"hash": hash_value,
|
||||
"url": url,
|
||||
"columns": columns,
|
||||
}
|
||||
|
||||
def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in _args):
|
||||
log(json.dumps(CMDLET.to_dict(), ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Helper to get field from both dict and object
|
||||
def get_field(obj: Any, field: str, default: Any = None) -> Any:
|
||||
if isinstance(obj, dict):
|
||||
return obj.get(field, default)
|
||||
@staticmethod
|
||||
def _add_table_body_row(table: ResultTable, row: Dict[str, Any]) -> None:
|
||||
"""Add a single row to the ResultTable using the prepared columns."""
|
||||
columns = row.get("columns") if isinstance(row, dict) else None
|
||||
lookup: Dict[str, Any] = {}
|
||||
if isinstance(columns, list):
|
||||
for col in columns:
|
||||
if isinstance(col, tuple) and len(col) == 2:
|
||||
label, value = col
|
||||
lookup[str(label)] = value
|
||||
|
||||
row_obj = table.add_row()
|
||||
row_obj.add_column("Hash", lookup.get("Hash", ""))
|
||||
row_obj.add_column("MIME", lookup.get("MIME", ""))
|
||||
row_obj.add_column("Size(MB)", lookup.get("Size(MB)", ""))
|
||||
if "Duration(s)" in lookup:
|
||||
row_obj.add_column("Duration(s)", lookup.get("Duration(s)", ""))
|
||||
elif "Pages" in lookup:
|
||||
row_obj.add_column("Pages", lookup.get("Pages", ""))
|
||||
else:
|
||||
return getattr(obj, field, default)
|
||||
|
||||
# Parse -hash override
|
||||
override_hash: str | None = None
|
||||
args_list = list(_args)
|
||||
i = 0
|
||||
while i < len(args_list):
|
||||
a = args_list[i]
|
||||
low = str(a).lower()
|
||||
if low in {"-hash", "--hash", "hash"} and i + 1 < len(args_list):
|
||||
override_hash = str(args_list[i + 1]).strip()
|
||||
break
|
||||
i += 1
|
||||
|
||||
# Try to determine if this is a local file or Hydrus file
|
||||
local_path = get_field(result, "target", None) or get_field(result, "path", None)
|
||||
is_local = False
|
||||
if local_path and isinstance(local_path, str) and not local_path.startswith(("http://", "https://")):
|
||||
is_local = True
|
||||
|
||||
# LOCAL FILE PATH
|
||||
if is_local and local_path:
|
||||
row_obj.add_column("Duration(s)", "")
|
||||
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Main execution entry point."""
|
||||
# Parse arguments
|
||||
parsed = parse_cmdlet_args(args, self)
|
||||
|
||||
# Get hash and store from parsed args or result
|
||||
file_hash = parsed.get("hash") or get_field(result, "hash") or get_field(result, "file_hash") or get_field(result, "hash_hex")
|
||||
storage_source = parsed.get("store") or get_field(result, "store") or get_field(result, "storage") or get_field(result, "origin")
|
||||
|
||||
if not file_hash:
|
||||
log("No hash available - use -hash to specify", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
if not storage_source:
|
||||
log("No storage backend specified - use -store to specify", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Use storage backend to get metadata
|
||||
try:
|
||||
file_path = Path(str(local_path))
|
||||
if file_path.exists() and file_path.is_file():
|
||||
# Get the hash from result or compute it
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(get_field(result, "hash_hex", None))
|
||||
|
||||
# If no hash, compute SHA256 of the file
|
||||
if not hash_hex:
|
||||
try:
|
||||
import hashlib
|
||||
with open(file_path, 'rb') as f:
|
||||
hash_hex = hashlib.sha256(f.read()).hexdigest()
|
||||
except Exception:
|
||||
hash_hex = None
|
||||
|
||||
# Get MIME type
|
||||
mime_type, _ = mimetypes.guess_type(str(file_path))
|
||||
if not mime_type:
|
||||
mime_type = "unknown"
|
||||
|
||||
# Pull metadata from local DB if available (for imported timestamp, duration, etc.)
|
||||
db_metadata = None
|
||||
library_root = get_local_storage_path(config)
|
||||
if library_root:
|
||||
try:
|
||||
with LocalLibraryDB(library_root) as db:
|
||||
db_metadata = db.get_metadata(file_path) or None
|
||||
except Exception:
|
||||
db_metadata = None
|
||||
|
||||
# Get file size (prefer DB size if present)
|
||||
file_size = None
|
||||
if isinstance(db_metadata, dict) and isinstance(db_metadata.get("size"), int):
|
||||
file_size = db_metadata.get("size")
|
||||
else:
|
||||
try:
|
||||
file_size = file_path.stat().st_size
|
||||
except Exception:
|
||||
file_size = None
|
||||
|
||||
# Duration/pages
|
||||
duration_seconds = None
|
||||
pages = None
|
||||
if isinstance(db_metadata, dict):
|
||||
if isinstance(db_metadata.get("duration"), (int, float)):
|
||||
duration_seconds = float(db_metadata.get("duration"))
|
||||
if isinstance(db_metadata.get("pages"), (int, float)):
|
||||
pages = int(db_metadata.get("pages"))
|
||||
|
||||
if duration_seconds is None and mime_type and mime_type.startswith("video"):
|
||||
try:
|
||||
import subprocess
|
||||
result_proc = subprocess.run(
|
||||
["ffprobe", "-v", "error", "-select_streams", "v:0", "-show_entries", "format=duration", "-of", "default=noprint_wrappers=1:nokey=1", str(file_path)],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=5
|
||||
)
|
||||
if result_proc.returncode == 0 and result_proc.stdout.strip():
|
||||
duration_seconds = float(result_proc.stdout.strip())
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Known URLs from sidecar or result
|
||||
urls = []
|
||||
sidecar_path = Path(str(file_path) + '.tags')
|
||||
if sidecar_path.exists():
|
||||
try:
|
||||
with open(sidecar_path, 'r', encoding='utf-8') as f:
|
||||
for line in f:
|
||||
line = line.strip()
|
||||
if line.startswith('known_url:'):
|
||||
url_value = line.replace('known_url:', '', 1).strip()
|
||||
if url_value:
|
||||
urls.append(url_value)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if not urls:
|
||||
urls_from_result = get_field(result, "known_urls", None) or get_field(result, "urls", None)
|
||||
if isinstance(urls_from_result, list):
|
||||
urls.extend([str(u).strip() for u in urls_from_result if u])
|
||||
|
||||
imported_ts = None
|
||||
if isinstance(db_metadata, dict):
|
||||
ts = db_metadata.get("time_imported") or db_metadata.get("time_added")
|
||||
if isinstance(ts, (int, float)):
|
||||
imported_ts = int(ts)
|
||||
elif isinstance(ts, str):
|
||||
try:
|
||||
import datetime as _dt
|
||||
imported_ts = int(_dt.datetime.fromisoformat(ts).timestamp())
|
||||
except Exception:
|
||||
imported_ts = None
|
||||
|
||||
row = _build_table_row(
|
||||
title=file_path.name,
|
||||
origin="local",
|
||||
path=str(file_path),
|
||||
mime=mime_type or "",
|
||||
size_bytes=int(file_size) if isinstance(file_size, int) else None,
|
||||
dur_seconds=duration_seconds,
|
||||
imported_ts=imported_ts,
|
||||
urls=urls,
|
||||
hash_value=hash_hex,
|
||||
pages=pages,
|
||||
)
|
||||
|
||||
table_title = file_path.name
|
||||
table = ResultTable(table_title)
|
||||
table.set_source_command("get-metadata", list(_args))
|
||||
table.add_result(row)
|
||||
ctx.set_last_result_table_overlay(table, [row], row)
|
||||
ctx.emit(row)
|
||||
return 0
|
||||
except Exception:
|
||||
# Fall through to Hydrus if local file handling fails
|
||||
pass
|
||||
|
||||
# HYDRUS PATH
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(get_field(result, "hash_hex", None))
|
||||
if not hash_hex:
|
||||
log("Selected result does not include a Hydrus hash or local path", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
try:
|
||||
client = hydrus_wrapper.get_client(config)
|
||||
except Exception as exc:
|
||||
log(f"Hydrus client unavailable: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
if client is None:
|
||||
log("Hydrus client unavailable", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
try:
|
||||
payload = client.fetch_file_metadata(
|
||||
hashes=[hash_hex],
|
||||
include_service_keys_to_tags=False,
|
||||
include_file_urls=True,
|
||||
include_duration=True,
|
||||
include_size=True,
|
||||
include_mime=True,
|
||||
)
|
||||
except Exception as exc:
|
||||
log(f"Hydrus metadata fetch failed: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
items = payload.get("metadata") if isinstance(payload, dict) else None
|
||||
if not isinstance(items, list) or not items:
|
||||
log("No metadata found.")
|
||||
return 0
|
||||
|
||||
meta = items[0] if isinstance(items[0], dict) else None
|
||||
if not isinstance(meta, dict):
|
||||
log("No metadata found.")
|
||||
return 0
|
||||
|
||||
mime = meta.get("mime")
|
||||
size = meta.get("size") or meta.get("file_size")
|
||||
duration_value = meta.get("duration")
|
||||
inner = meta.get("metadata") if isinstance(meta.get("metadata"), dict) else None
|
||||
if duration_value is None and isinstance(inner, dict):
|
||||
duration_value = inner.get("duration")
|
||||
|
||||
imported_ts = _extract_imported_ts(meta)
|
||||
|
||||
try:
|
||||
from .search_file import _hydrus_duration_seconds as _dur_secs
|
||||
except Exception:
|
||||
_dur_secs = lambda x: x
|
||||
|
||||
dur_seconds = _dur_secs(duration_value)
|
||||
urls = meta.get("known_urls") or meta.get("urls")
|
||||
urls = [str(u).strip() for u in urls] if isinstance(urls, list) else []
|
||||
|
||||
row = _build_table_row(
|
||||
title=hash_hex,
|
||||
origin="hydrus",
|
||||
path=f"hydrus://file/{hash_hex}",
|
||||
mime=mime or "",
|
||||
size_bytes=int(size) if isinstance(size, int) else None,
|
||||
dur_seconds=int(dur_seconds) if isinstance(dur_seconds, (int, float)) else None,
|
||||
imported_ts=imported_ts,
|
||||
urls=urls,
|
||||
hash_value=hash_hex,
|
||||
pages=None,
|
||||
)
|
||||
|
||||
table = ResultTable(hash_hex or "Metadata")
|
||||
table.set_source_command("get-metadata", list(_args))
|
||||
table.add_result(row)
|
||||
ctx.set_last_result_table_overlay(table, [row], row)
|
||||
ctx.emit(row)
|
||||
|
||||
return 0
|
||||
from helper.store import FileStorage
|
||||
storage = FileStorage(config)
|
||||
backend = storage[storage_source]
|
||||
|
||||
# Get metadata from backend
|
||||
metadata = backend.get_metadata(file_hash)
|
||||
|
||||
if not metadata:
|
||||
log(f"No metadata found for hash {file_hash[:8]}... in {storage_source}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Extract title from tags if available
|
||||
title = get_field(result, "title") or file_hash[:16]
|
||||
if not get_field(result, "title"):
|
||||
# Try to get title from tags
|
||||
try:
|
||||
tags, _ = backend.get_tag(file_hash)
|
||||
for tag in tags:
|
||||
if tag.lower().startswith("title:"):
|
||||
title = tag.split(":", 1)[1]
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Extract metadata fields
|
||||
mime_type = metadata.get("mime") or metadata.get("ext", "")
|
||||
file_size = metadata.get("size")
|
||||
duration_seconds = metadata.get("duration")
|
||||
pages = metadata.get("pages")
|
||||
url = metadata.get("url") or []
|
||||
imported_ts = self._extract_imported_ts(metadata)
|
||||
|
||||
# Normalize url
|
||||
if isinstance(url, str):
|
||||
try:
|
||||
url = json.loads(url)
|
||||
except (json.JSONDecodeError, TypeError):
|
||||
url = []
|
||||
if not isinstance(url, list):
|
||||
url = []
|
||||
|
||||
# Build display row
|
||||
row = self._build_table_row(
|
||||
title=title,
|
||||
origin=storage_source,
|
||||
path=metadata.get("file_path", ""),
|
||||
mime=mime_type,
|
||||
size_bytes=file_size,
|
||||
dur_seconds=duration_seconds,
|
||||
imported_ts=imported_ts,
|
||||
url=url,
|
||||
hash_value=file_hash,
|
||||
pages=pages,
|
||||
)
|
||||
|
||||
table_title = title
|
||||
table = ResultTable(table_title).init_command("get-metadata", list(args))
|
||||
self._add_table_body_row(table, row)
|
||||
ctx.set_last_result_table_overlay(table, [row], row)
|
||||
ctx.emit(row)
|
||||
return 0
|
||||
|
||||
except KeyError:
|
||||
log(f"Storage backend '{storage_source}' not found", file=sys.stderr)
|
||||
return 1
|
||||
except Exception as exc:
|
||||
log(f"Failed to get metadata: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="get-metadata",
|
||||
summary="Print metadata for local or Hydrus files (hash, mime, duration, size, URLs).",
|
||||
usage="get-metadata [-hash <sha256>]",
|
||||
aliases=["meta"],
|
||||
args=[
|
||||
CmdletArg("hash", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
],
|
||||
details=[
|
||||
"- For local files: Shows path, hash (computed if needed), MIME type, size, duration, and known URLs from sidecar.",
|
||||
"- For Hydrus files: Shows path (hydrus://), hash, MIME, duration, size, and known URLs.",
|
||||
"- Automatically detects local vs Hydrus files.",
|
||||
"- Local file hashes are computed via SHA256 if not already available.",
|
||||
],
|
||||
)
|
||||
CMDLET = Get_Metadata()
|
||||
|
||||
@@ -7,17 +7,17 @@ from . import register
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, normalize_hash, get_hash_for_operation, fetch_hydrus_metadata, get_field, should_show_help
|
||||
from helper.logger import log
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="get-note",
|
||||
summary="List notes on a Hydrus file.",
|
||||
usage="get-note [-hash <sha256>]",
|
||||
args=[
|
||||
CmdletArg("-hash", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
arg=[
|
||||
SharedArgs.HASH,
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"- Prints notes by service and note name.",
|
||||
],
|
||||
)
|
||||
@@ -25,45 +25,24 @@ CMDLET = Cmdlet(
|
||||
|
||||
@register(["get-note", "get-notes", "get_note"]) # aliases
|
||||
def get_notes(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Helper to get field from both dict and object
|
||||
def get_field(obj: Any, field: str, default: Any = None) -> Any:
|
||||
if isinstance(obj, dict):
|
||||
return obj.get(field, default)
|
||||
else:
|
||||
return getattr(obj, field, default)
|
||||
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
|
||||
from ._shared import parse_cmdlet_args
|
||||
from ._shared import parse_cmdlet_args, get_hash_for_operation, fetch_hydrus_metadata
|
||||
parsed = parse_cmdlet_args(args, CMDLET)
|
||||
override_hash = parsed.get("hash")
|
||||
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(get_field(result, "hash_hex", None))
|
||||
hash_hex = get_hash_for_operation(override_hash, result)
|
||||
if not hash_hex:
|
||||
log("Selected result does not include a Hydrus hash")
|
||||
return 1
|
||||
try:
|
||||
client = hydrus_wrapper.get_client(config)
|
||||
except Exception as exc:
|
||||
log(f"Hydrus client unavailable: {exc}")
|
||||
return 1
|
||||
|
||||
if client is None:
|
||||
log("Hydrus client unavailable")
|
||||
return 1
|
||||
try:
|
||||
payload = client.fetch_file_metadata(hashes=[hash_hex], include_service_keys_to_tags=False, include_notes=True)
|
||||
except Exception as exc:
|
||||
log(f"Hydrus metadata fetch failed: {exc}")
|
||||
return 1
|
||||
items = payload.get("metadata") if isinstance(payload, dict) else None
|
||||
meta = items[0] if (isinstance(items, list) and items and isinstance(items[0], dict)) else None
|
||||
meta, error_code = fetch_hydrus_metadata(config, hash_hex, include_service_keys_to_tags=False, include_notes=True)
|
||||
if error_code != 0:
|
||||
return error_code
|
||||
|
||||
notes = {}
|
||||
if isinstance(meta, dict):
|
||||
# Hydrus returns service_keys_to_tags; for notes we expect 'service_names_to_notes' in modern API
|
||||
|
||||
@@ -7,12 +7,11 @@ from pathlib import Path
|
||||
|
||||
from helper.logger import log
|
||||
|
||||
from . import register
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash, fmt_bytes
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, normalize_hash, fmt_bytes, get_hash_for_operation, fetch_hydrus_metadata, should_show_help
|
||||
from helper.folder_store import FolderDB
|
||||
from config import get_local_storage_path
|
||||
from result_table import ResultTable
|
||||
|
||||
@@ -20,23 +19,22 @@ CMDLET = Cmdlet(
|
||||
name="get-relationship",
|
||||
summary="Print relationships for the selected file (Hydrus or Local).",
|
||||
usage="get-relationship [-hash <sha256>]",
|
||||
args=[
|
||||
CmdletArg("-hash", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
alias=[
|
||||
"get-rel",
|
||||
],
|
||||
details=[
|
||||
arg=[
|
||||
SharedArgs.HASH,
|
||||
],
|
||||
detail=[
|
||||
"- Lists relationship data as returned by Hydrus or Local DB.",
|
||||
],
|
||||
)
|
||||
|
||||
@register(["get-rel", "get-relationship", "get-relationships", "get-file-relationships"]) # aliases
|
||||
def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in _args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(_args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
|
||||
# Parse -hash override
|
||||
override_hash: str | None = None
|
||||
@@ -91,8 +89,9 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
storage_path = get_local_storage_path(config)
|
||||
print(f"[DEBUG] Storage path: {storage_path}", file=sys.stderr)
|
||||
if storage_path:
|
||||
with LocalLibraryDB(storage_path) as db:
|
||||
metadata = db.get_metadata(path_obj)
|
||||
with FolderDB(storage_path) as db:
|
||||
file_hash = db.get_file_hash(path_obj)
|
||||
metadata = db.get_metadata(file_hash) if file_hash else None
|
||||
print(f"[DEBUG] Metadata found: {metadata is not None}", file=sys.stderr)
|
||||
if metadata and metadata.get("relationships"):
|
||||
local_db_checked = True
|
||||
@@ -106,14 +105,14 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# h is now a file hash (not a path)
|
||||
print(f"[DEBUG] Processing relationship hash: h={h}", file=sys.stderr)
|
||||
# Resolve hash to file path
|
||||
resolved_path = db.search_by_hash(h)
|
||||
resolved_path = db.search_hash(h)
|
||||
title = h[:16] + "..."
|
||||
path = None
|
||||
if resolved_path and resolved_path.exists():
|
||||
path = str(resolved_path)
|
||||
# Try to get title from tags
|
||||
try:
|
||||
tags = db.get_tags(resolved_path)
|
||||
tags = db.get_tags(h)
|
||||
found_title = False
|
||||
for t in tags:
|
||||
if t.lower().startswith('title:'):
|
||||
@@ -154,11 +153,13 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
if not existing_parent:
|
||||
parent_title = parent_path_obj.stem
|
||||
try:
|
||||
parent_tags = db.get_tags(parent_path_obj)
|
||||
for t in parent_tags:
|
||||
if t.lower().startswith('title:'):
|
||||
parent_title = t[6:].strip()
|
||||
break
|
||||
parent_hash = db.get_file_hash(parent_path_obj)
|
||||
if parent_hash:
|
||||
parent_tags = db.get_tags(parent_hash)
|
||||
for t in parent_tags:
|
||||
if t.lower().startswith('title:'):
|
||||
parent_title = t[6:].strip()
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
@@ -176,7 +177,8 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
existing_parent['type'] = "king"
|
||||
|
||||
# 1. Check forward relationships from parent (siblings)
|
||||
parent_metadata = db.get_metadata(parent_path_obj)
|
||||
parent_hash = db.get_file_hash(parent_path_obj)
|
||||
parent_metadata = db.get_metadata(parent_hash) if parent_hash else None
|
||||
print(f"[DEBUG] 📖 Parent metadata: {parent_metadata is not None}", file=sys.stderr)
|
||||
if parent_metadata:
|
||||
print(f"[DEBUG] Parent metadata keys: {parent_metadata.keys()}", file=sys.stderr)
|
||||
@@ -189,7 +191,7 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
if child_hashes:
|
||||
for child_h in child_hashes:
|
||||
# child_h is now a HASH, not a path - resolve it
|
||||
child_path_obj = db.search_by_hash(child_h)
|
||||
child_path_obj = db.search_hash(child_h)
|
||||
print(f"[DEBUG] Resolved hash {child_h[:16]}... to: {child_path_obj}", file=sys.stderr)
|
||||
|
||||
if not child_path_obj:
|
||||
@@ -205,11 +207,13 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Now child_path_obj is a Path, so we can get tags
|
||||
child_title = child_path_obj.stem
|
||||
try:
|
||||
child_tags = db.get_tags(child_path_obj)
|
||||
for t in child_tags:
|
||||
if t.lower().startswith('title:'):
|
||||
child_title = t[6:].strip()
|
||||
break
|
||||
child_hash = db.get_file_hash(child_path_obj)
|
||||
if child_hash:
|
||||
child_tags = db.get_tags(child_hash)
|
||||
for t in child_tags:
|
||||
if t.lower().startswith('title:'):
|
||||
child_title = t[6:].strip()
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
@@ -241,11 +245,13 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
child_path_obj = Path(child_path)
|
||||
child_title = child_path_obj.stem
|
||||
try:
|
||||
child_tags = db.get_tags(child_path_obj)
|
||||
for t in child_tags:
|
||||
if t.lower().startswith('title:'):
|
||||
child_title = t[6:].strip()
|
||||
break
|
||||
child_hash = db.get_file_hash(child_path_obj)
|
||||
if child_hash:
|
||||
child_tags = db.get_tags(child_hash)
|
||||
for t in child_tags:
|
||||
if t.lower().startswith('title:'):
|
||||
child_title = t[6:].strip()
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
@@ -304,11 +310,7 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# But if the file is also in Hydrus, we might want those too.
|
||||
# Let's try Hydrus if we have a hash.
|
||||
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(getattr(result, "hash_hex", None))
|
||||
if not hash_hex:
|
||||
# Try to get hash from dict
|
||||
if isinstance(result, dict):
|
||||
hash_hex = normalize_hash(result.get("hash") or result.get("file_hash"))
|
||||
hash_hex = get_hash_for_operation(override_hash, result)
|
||||
|
||||
if hash_hex and not local_db_checked:
|
||||
try:
|
||||
@@ -362,7 +364,7 @@ def _run(result: Any, _args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
return 0
|
||||
|
||||
# Display results
|
||||
table = ResultTable(f"Relationships: {source_title}")
|
||||
table = ResultTable(f"Relationships: {source_title}").init_command("get-relationship", [])
|
||||
|
||||
# Sort by type then title
|
||||
# Custom sort order: King first, then Derivative, then others
|
||||
|
||||
@@ -20,8 +20,8 @@ from typing import Any, Dict, List, Optional, Sequence, Tuple
|
||||
|
||||
import pipeline as ctx
|
||||
from helper import hydrus
|
||||
from helper.local_library import read_sidecar, write_sidecar, find_sidecar, LocalLibraryDB
|
||||
from ._shared import normalize_hash, looks_like_hash, Cmdlet, CmdletArg, SharedArgs, parse_cmdlet_args
|
||||
from helper.folder_store import read_sidecar, write_sidecar, find_sidecar, FolderDB
|
||||
from ._shared import normalize_hash, looks_like_hash, Cmdlet, CmdletArg, SharedArgs, parse_cmdlet_args, get_field
|
||||
from config import get_local_storage_path
|
||||
|
||||
|
||||
@@ -71,33 +71,6 @@ class TagItem:
|
||||
}
|
||||
|
||||
|
||||
def _extract_my_tags_from_hydrus_meta(meta: Dict[str, Any], service_key: Optional[str], service_name: str) -> List[str]:
|
||||
"""Extract current tags from Hydrus metadata dict.
|
||||
|
||||
Prefers display_tags (includes siblings/parents, excludes deleted).
|
||||
Falls back to storage_tags status '0' (current).
|
||||
"""
|
||||
tags_payload = meta.get("tags")
|
||||
if not isinstance(tags_payload, dict):
|
||||
return []
|
||||
svc_data = None
|
||||
if service_key:
|
||||
svc_data = tags_payload.get(service_key)
|
||||
if not isinstance(svc_data, dict):
|
||||
return []
|
||||
# Prefer display_tags (Hydrus computes siblings/parents)
|
||||
display = svc_data.get("display_tags")
|
||||
if isinstance(display, list) and display:
|
||||
return [str(t) for t in display if isinstance(t, (str, bytes)) and str(t).strip()]
|
||||
# Fallback to storage_tags status '0' (current)
|
||||
storage = svc_data.get("storage_tags")
|
||||
if isinstance(storage, dict):
|
||||
current_list = storage.get("0") or storage.get(0)
|
||||
if isinstance(current_list, list):
|
||||
return [str(t) for t in current_list if isinstance(t, (str, bytes)) and str(t).strip()]
|
||||
return []
|
||||
|
||||
|
||||
def _emit_tags_as_table(
|
||||
tags_list: List[str],
|
||||
hash_hex: Optional[str],
|
||||
@@ -316,12 +289,12 @@ def _read_sidecar_fallback(p: Path) -> tuple[Optional[str], List[str], List[str]
|
||||
|
||||
Format:
|
||||
- Lines with "hash:" prefix: file hash
|
||||
- Lines with "known_url:" or "url:" prefix: URLs
|
||||
- Lines with "url:" or "url:" prefix: url
|
||||
- Lines with "relationship:" prefix: ignored (internal relationships)
|
||||
- Lines with "key:", "namespace:value" format: treated as namespace tags
|
||||
- Plain lines without colons: freeform tags
|
||||
|
||||
Excluded namespaces (treated as metadata, not tags): hash, known_url, url, relationship
|
||||
Excluded namespaces (treated as metadata, not tags): hash, url, url, relationship
|
||||
"""
|
||||
try:
|
||||
raw = p.read_text(encoding="utf-8", errors="ignore")
|
||||
@@ -332,7 +305,7 @@ def _read_sidecar_fallback(p: Path) -> tuple[Optional[str], List[str], List[str]
|
||||
h: Optional[str] = None
|
||||
|
||||
# Namespaces to exclude from tags
|
||||
excluded_namespaces = {"hash", "known_url", "url", "relationship"}
|
||||
excluded_namespaces = {"hash", "url", "url", "relationship"}
|
||||
|
||||
for line in raw.splitlines():
|
||||
s = line.strip()
|
||||
@@ -344,7 +317,7 @@ def _read_sidecar_fallback(p: Path) -> tuple[Optional[str], List[str], List[str]
|
||||
if low.startswith("hash:"):
|
||||
h = s.split(":", 1)[1].strip() if ":" in s else h
|
||||
# Check if this is a URL line
|
||||
elif low.startswith("known_url:") or low.startswith("url:"):
|
||||
elif low.startswith("url:") or low.startswith("url:"):
|
||||
val = s.split(":", 1)[1].strip() if ":" in s else ""
|
||||
if val:
|
||||
u.append(val)
|
||||
@@ -361,12 +334,12 @@ def _read_sidecar_fallback(p: Path) -> tuple[Optional[str], List[str], List[str]
|
||||
return h, t, u
|
||||
|
||||
|
||||
def _write_sidecar(p: Path, media: Path, tag_list: List[str], known_urls: List[str], hash_in_sidecar: Optional[str]) -> Path:
|
||||
def _write_sidecar(p: Path, media: Path, tag_list: List[str], url: List[str], hash_in_sidecar: Optional[str]) -> Path:
|
||||
"""Write tags to sidecar file and handle title-based renaming.
|
||||
|
||||
Returns the new media path if renamed, otherwise returns the original media path.
|
||||
"""
|
||||
success = write_sidecar(media, tag_list, known_urls, hash_in_sidecar)
|
||||
success = write_sidecar(media, tag_list, url, hash_in_sidecar)
|
||||
if success:
|
||||
_apply_result_updates_from_tags(None, tag_list)
|
||||
# Check if we should rename the file based on title tag
|
||||
@@ -381,8 +354,8 @@ def _write_sidecar(p: Path, media: Path, tag_list: List[str], known_urls: List[s
|
||||
if hash_in_sidecar:
|
||||
lines.append(f"hash:{hash_in_sidecar}")
|
||||
lines.extend(ordered)
|
||||
for u in known_urls:
|
||||
lines.append(f"known_url:{u}")
|
||||
for u in url:
|
||||
lines.append(f"url:{u}")
|
||||
try:
|
||||
p.write_text("\n".join(lines) + "\n", encoding="utf-8")
|
||||
# Check if we should rename the file based on title tag
|
||||
@@ -414,16 +387,16 @@ def _emit_tag_payload(source: str, tags_list: List[str], *, hash_value: Optional
|
||||
label = None
|
||||
if store_label:
|
||||
label = store_label
|
||||
elif ctx._PIPE_ACTIVE:
|
||||
elif ctx.get_stage_context() is not None:
|
||||
label = "tags"
|
||||
if label:
|
||||
ctx.store_value(label, payload)
|
||||
if ctx._PIPE_ACTIVE and label.lower() != "tags":
|
||||
if ctx.get_stage_context() is not None and label.lower() != "tags":
|
||||
ctx.store_value("tags", payload)
|
||||
|
||||
# Emit individual TagItem objects so they can be selected by bare index
|
||||
# When in pipeline, emit individual TagItem objects
|
||||
if ctx._PIPE_ACTIVE:
|
||||
if ctx.get_stage_context() is not None:
|
||||
for idx, tag_name in enumerate(tags_list, start=1):
|
||||
tag_item = TagItem(
|
||||
tag_name=tag_name,
|
||||
@@ -1113,7 +1086,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
# Try local sidecar if no tags present on result
|
||||
if not identifier_tags:
|
||||
file_path = get_field(result, "target", None) or get_field(result, "path", None) or get_field(result, "file_path", None) or get_field(result, "filename", None)
|
||||
file_path = get_field(result, "target", None) or get_field(result, "path", None) or get_field(result, "filename", None)
|
||||
if isinstance(file_path, str) and file_path and not file_path.lower().startswith(("http://", "https://")):
|
||||
try:
|
||||
media_path = Path(str(file_path))
|
||||
@@ -1226,103 +1199,35 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
emit_mode = emit_requested or bool(store_key)
|
||||
store_label = (store_key.strip() if store_key and store_key.strip() else None)
|
||||
|
||||
# Check Hydrus availability
|
||||
hydrus_available, _ = hydrus.is_available(config)
|
||||
# Get hash and store from result
|
||||
file_hash = hash_hex
|
||||
storage_source = get_field(result, "store") or get_field(result, "storage") or get_field(result, "origin")
|
||||
|
||||
# Try to find path in result object
|
||||
local_path = get_field(result, "target", None) or get_field(result, "path", None) or get_field(result, "file_path", None)
|
||||
if not file_hash:
|
||||
log("No hash available in result", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Determine if local file
|
||||
is_local_file = False
|
||||
media: Optional[Path] = None
|
||||
if local_path and isinstance(local_path, str) and not local_path.startswith(("http://", "https://")):
|
||||
is_local_file = True
|
||||
try:
|
||||
media = Path(str(local_path))
|
||||
except Exception:
|
||||
media = None
|
||||
if not storage_source:
|
||||
log("No storage backend specified in result", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Try Hydrus first (always prioritize if available and has hash)
|
||||
use_hydrus = False
|
||||
hydrus_meta = None # Cache the metadata from first fetch
|
||||
client = None
|
||||
if hash_hex and hydrus_available:
|
||||
try:
|
||||
client = hydrus.get_client(config)
|
||||
payload = client.fetch_file_metadata(hashes=[str(hash_hex)], include_service_keys_to_tags=True, include_file_urls=False)
|
||||
items = payload.get("metadata") if isinstance(payload, dict) else None
|
||||
if isinstance(items, list) and items:
|
||||
meta = items[0] if isinstance(items[0], dict) else None
|
||||
# Only accept file if it has a valid file_id (not None)
|
||||
if isinstance(meta, dict) and meta.get("file_id") is not None:
|
||||
use_hydrus = True
|
||||
hydrus_meta = meta # Cache for tag extraction
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Get tags - try Hydrus first, fallback to sidecar
|
||||
current = []
|
||||
service_name = ""
|
||||
service_key = None
|
||||
source = "unknown"
|
||||
|
||||
if use_hydrus and hash_hex and hydrus_meta:
|
||||
try:
|
||||
# Use cached metadata from above, don't fetch again
|
||||
service_name = hydrus.get_tag_service_name(config)
|
||||
if client is None:
|
||||
client = hydrus.get_client(config)
|
||||
service_key = hydrus.get_tag_service_key(client, service_name)
|
||||
current = _extract_my_tags_from_hydrus_meta(hydrus_meta, service_key, service_name)
|
||||
source = "hydrus"
|
||||
except Exception as exc:
|
||||
log(f"Warning: Failed to extract tags from Hydrus: {exc}", file=sys.stderr)
|
||||
|
||||
# Fallback to local sidecar or local DB if no tags
|
||||
if not current and is_local_file and media and media.exists():
|
||||
try:
|
||||
# First try local library DB
|
||||
library_root = get_local_storage_path(config)
|
||||
if library_root:
|
||||
try:
|
||||
with LocalLibraryDB(library_root) as db:
|
||||
db_tags = db.get_tags(media)
|
||||
if db_tags:
|
||||
current = db_tags
|
||||
source = "local_db"
|
||||
except Exception as exc:
|
||||
log(f"[get_tag] DB lookup failed, trying sidecar: {exc}", file=sys.stderr)
|
||||
|
||||
# Fall back to sidecar if DB didn't have tags
|
||||
if not current:
|
||||
sidecar_path = find_sidecar(media)
|
||||
if sidecar_path and sidecar_path.exists():
|
||||
try:
|
||||
_, current, _ = read_sidecar(sidecar_path)
|
||||
except Exception:
|
||||
_, current, _ = _read_sidecar_fallback(sidecar_path)
|
||||
if current:
|
||||
source = "sidecar"
|
||||
except Exception as exc:
|
||||
log(f"Warning: Failed to load tags from local storage: {exc}", file=sys.stderr)
|
||||
|
||||
# Fallback to tags in the result object if Hydrus/local lookup returned nothing
|
||||
if not current:
|
||||
# Check if result has 'tags' attribute (PipeObject)
|
||||
if hasattr(result, 'tags') and getattr(result, 'tags', None):
|
||||
current = getattr(result, 'tags')
|
||||
source = "pipeline_result"
|
||||
# Check if result is a dict with 'tags' key
|
||||
elif isinstance(result, dict) and 'tags' in result:
|
||||
tags_val = result['tags']
|
||||
if isinstance(tags_val, list):
|
||||
current = tags_val
|
||||
source = "pipeline_result"
|
||||
source = "pipeline_result"
|
||||
|
||||
# Error if no tags found
|
||||
if not current:
|
||||
log("No tags found", file=sys.stderr)
|
||||
# Get tags using storage backend
|
||||
try:
|
||||
from helper.store import FileStorage
|
||||
storage = FileStorage(config)
|
||||
backend = storage[storage_source]
|
||||
current, source = backend.get_tag(file_hash, config=config)
|
||||
|
||||
if not current:
|
||||
log("No tags found", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
service_name = ""
|
||||
except KeyError:
|
||||
log(f"Storage backend '{storage_source}' not found", file=sys.stderr)
|
||||
return 1
|
||||
except Exception as exc:
|
||||
log(f"Failed to get tags: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Always output to ResultTable (pipeline mode only)
|
||||
@@ -1383,33 +1288,106 @@ except Exception:
|
||||
_SCRAPE_CHOICES = ["itunes", "openlibrary", "googlebooks", "google", "musicbrainz"]
|
||||
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="get-tag",
|
||||
summary="Get tags from Hydrus or local sidecar metadata",
|
||||
usage="get-tag [-hash <sha256>] [--store <key>] [--emit] [-scrape <url|provider>]",
|
||||
aliases=["tags"],
|
||||
args=[
|
||||
SharedArgs.HASH,
|
||||
CmdletArg(
|
||||
name="-store",
|
||||
type="string",
|
||||
description="Store result to this key for pipeline",
|
||||
alias="store"
|
||||
),
|
||||
CmdletArg(
|
||||
name="-emit",
|
||||
type="flag",
|
||||
description="Emit result without interactive prompt (quiet mode)",
|
||||
alias="emit-only"
|
||||
),
|
||||
CmdletArg(
|
||||
name="-scrape",
|
||||
type="string",
|
||||
description="Scrape metadata from URL or provider name (returns tags as JSON or table)",
|
||||
required=False,
|
||||
choices=_SCRAPE_CHOICES,
|
||||
)
|
||||
]
|
||||
)
|
||||
class Get_Tag(Cmdlet):
|
||||
"""Class-based get-tag cmdlet with self-registration."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
"""Initialize get-tag cmdlet."""
|
||||
super().__init__(
|
||||
name="get-tag",
|
||||
summary="Get tags from Hydrus or local sidecar metadata",
|
||||
usage="get-tag [-hash <sha256>] [--store <key>] [--emit] [-scrape <url|provider>]",
|
||||
alias=["tags"],
|
||||
arg=[
|
||||
SharedArgs.HASH,
|
||||
CmdletArg(
|
||||
name="-store",
|
||||
type="string",
|
||||
description="Store result to this key for pipeline",
|
||||
alias="store"
|
||||
),
|
||||
CmdletArg(
|
||||
name="-emit",
|
||||
type="flag",
|
||||
description="Emit result without interactive prompt (quiet mode)",
|
||||
alias="emit-only"
|
||||
),
|
||||
CmdletArg(
|
||||
name="-scrape",
|
||||
type="string",
|
||||
description="Scrape metadata from URL or provider name (returns tags as JSON or table)",
|
||||
required=False,
|
||||
choices=_SCRAPE_CHOICES,
|
||||
)
|
||||
],
|
||||
detail=[
|
||||
"- Retrieves tags for a file from:",
|
||||
" Hydrus: Using file hash if available",
|
||||
" Local: From sidecar files or local library database",
|
||||
"- Options:",
|
||||
" -hash: Override hash to look up in Hydrus",
|
||||
" -store: Store result to key for downstream pipeline",
|
||||
" -emit: Quiet mode (no interactive selection)",
|
||||
" -scrape: Scrape metadata from URL or metadata provider",
|
||||
],
|
||||
exec=self.run,
|
||||
)
|
||||
self.register()
|
||||
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Execute get-tag cmdlet."""
|
||||
# Parse arguments
|
||||
parsed = parse_cmdlet_args(args, self)
|
||||
|
||||
# Get hash and store from parsed args or result
|
||||
hash_override = parsed.get("hash")
|
||||
file_hash = hash_override or get_field(result, "hash") or get_field(result, "file_hash") or get_field(result, "hash_hex")
|
||||
storage_source = parsed.get("store") or get_field(result, "store") or get_field(result, "storage") or get_field(result, "origin")
|
||||
|
||||
if not file_hash:
|
||||
log("No hash available in result", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
if not storage_source:
|
||||
log("No storage backend specified in result", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Get tags using storage backend
|
||||
try:
|
||||
from helper.store import FileStorage
|
||||
storage_obj = FileStorage(config)
|
||||
backend = storage_obj[storage_source]
|
||||
current, source = backend.get_tag(file_hash, config=config)
|
||||
|
||||
if not current:
|
||||
log("No tags found", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Build table and emit
|
||||
item_title = get_field(result, "title") or file_hash[:16]
|
||||
_emit_tags_as_table(
|
||||
tags_list=current,
|
||||
hash_hex=file_hash,
|
||||
source=source,
|
||||
service_name="",
|
||||
config=config,
|
||||
item_title=item_title,
|
||||
file_path=None,
|
||||
subject=result,
|
||||
)
|
||||
return 0
|
||||
|
||||
except KeyError:
|
||||
log(f"Storage backend '{storage_source}' not found", file=sys.stderr)
|
||||
return 1
|
||||
except Exception as exc:
|
||||
log(f"Failed to get tags: {exc}", file=sys.stderr)
|
||||
import traceback
|
||||
traceback.print_exc(file=sys.stderr)
|
||||
return 1
|
||||
|
||||
|
||||
# Create and register the cmdlet
|
||||
CMDLET = Get_Tag()
|
||||
|
||||
|
||||
|
||||
1415
cmdlets/get_tag.py.orig
Normal file
1415
cmdlets/get_tag.py.orig
Normal file
File diff suppressed because it is too large
Load Diff
@@ -1,139 +1,80 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, Sequence
|
||||
import json
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
from . import register
|
||||
import models
|
||||
import pipeline as ctx
|
||||
from helper import hydrus as hydrus_wrapper
|
||||
from ._shared import Cmdlet, CmdletArg, normalize_hash
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, parse_cmdlet_args, get_field, normalize_hash
|
||||
from helper.logger import log
|
||||
from config import get_local_storage_path
|
||||
from helper.local_library import LocalLibraryDB
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="get-url",
|
||||
summary="List URLs associated with a file (Hydrus or Local).",
|
||||
usage="get-url [-hash <sha256>]",
|
||||
args=[
|
||||
CmdletArg("-hash", description="Override the Hydrus file hash (SHA256) to target instead of the selected result."),
|
||||
],
|
||||
details=[
|
||||
"- Prints the known URLs for the selected file.",
|
||||
],
|
||||
)
|
||||
from helper.store import FileStorage
|
||||
|
||||
|
||||
def _parse_hash_and_rest(args: Sequence[str]) -> tuple[str | None, list[str]]:
|
||||
override_hash: str | None = None
|
||||
rest: list[str] = []
|
||||
i = 0
|
||||
while i < len(args):
|
||||
a = args[i]
|
||||
low = str(a).lower()
|
||||
if low in {"-hash", "--hash", "hash"} and i + 1 < len(args):
|
||||
override_hash = str(args[i + 1]).strip()
|
||||
i += 2
|
||||
continue
|
||||
rest.append(a)
|
||||
i += 1
|
||||
return override_hash, rest
|
||||
|
||||
|
||||
@register(["get-url", "get-urls", "get_url"]) # aliases
|
||||
def get_urls(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Helper to get field from both dict and object
|
||||
def get_field(obj: Any, field: str, default: Any = None) -> Any:
|
||||
if isinstance(obj, dict):
|
||||
return obj.get(field, default)
|
||||
else:
|
||||
return getattr(obj, field, default)
|
||||
class Get_Url(Cmdlet):
|
||||
"""Get url associated with files via hash+store."""
|
||||
|
||||
# Help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
NAME = "get-url"
|
||||
SUMMARY = "List url associated with a file"
|
||||
USAGE = "@1 | get-url"
|
||||
ARGS = [
|
||||
SharedArgs.HASH,
|
||||
SharedArgs.STORE,
|
||||
]
|
||||
DETAIL = [
|
||||
"- Lists all url associated with file identified by hash+store",
|
||||
]
|
||||
|
||||
override_hash, _ = _parse_hash_and_rest(args)
|
||||
|
||||
# Handle @N selection which creates a list - extract the first item
|
||||
if isinstance(result, list) and len(result) > 0:
|
||||
result = result[0]
|
||||
|
||||
found_urls = []
|
||||
|
||||
# 1. Try Local Library
|
||||
file_path = get_field(result, "file_path") or get_field(result, "path")
|
||||
if file_path and not override_hash:
|
||||
try:
|
||||
path_obj = Path(file_path)
|
||||
if path_obj.exists():
|
||||
storage_path = get_local_storage_path(config)
|
||||
if storage_path:
|
||||
with LocalLibraryDB(storage_path) as db:
|
||||
metadata = db.get_metadata(path_obj)
|
||||
if metadata and metadata.get("known_urls"):
|
||||
found_urls.extend(metadata["known_urls"])
|
||||
except Exception as e:
|
||||
log(f"Error checking local library: {e}", file=sys.stderr)
|
||||
|
||||
# 2. Try Hydrus
|
||||
hash_hex = normalize_hash(override_hash) if override_hash else normalize_hash(get_field(result, "hash_hex", None))
|
||||
|
||||
# If we haven't found URLs yet, or if we want to merge them (maybe?), let's check Hydrus if we have a hash
|
||||
# But usually if it's local, we might not want to check Hydrus unless requested.
|
||||
# However, the user said "they can just work together".
|
||||
|
||||
if hash_hex:
|
||||
try:
|
||||
client = hydrus_wrapper.get_client(config)
|
||||
if client:
|
||||
payload = client.fetch_file_metadata(hashes=[hash_hex], include_file_urls=True)
|
||||
items = payload.get("metadata") if isinstance(payload, dict) else None
|
||||
meta = items[0] if (isinstance(items, list) and items and isinstance(items[0], dict)) else None
|
||||
hydrus_urls = (meta.get("known_urls") if isinstance(meta, dict) else None) or []
|
||||
for u in hydrus_urls:
|
||||
if u not in found_urls:
|
||||
found_urls.append(u)
|
||||
except Exception as exc:
|
||||
# Only log error if we didn't find local URLs either, or if it's a specific error
|
||||
if not found_urls:
|
||||
log(f"Hydrus lookup failed: {exc}", file=sys.stderr)
|
||||
|
||||
if found_urls:
|
||||
for u in found_urls:
|
||||
text = str(u).strip()
|
||||
if text:
|
||||
# Emit a rich object that looks like a string but carries context
|
||||
# We use a dict with 'title' which ResultTable uses for display
|
||||
# and 'url' which is the actual data
|
||||
# We also include the source file info so downstream cmdlets can use it
|
||||
|
||||
# Create a result object that mimics the structure expected by delete-url
|
||||
# delete-url expects a file object usually, but here we are emitting URLs.
|
||||
# If we emit a dict with 'url' and 'source_file', delete-url can use it.
|
||||
|
||||
rich_result = {
|
||||
"title": text, # Display as just the URL
|
||||
"url": text,
|
||||
"source_file": result, # Pass the original file context
|
||||
"file_path": get_field(result, "file_path") or get_field(result, "path"),
|
||||
"hash_hex": hash_hex
|
||||
}
|
||||
ctx.emit(rich_result)
|
||||
return 0
|
||||
|
||||
if not hash_hex and not file_path:
|
||||
log("Selected result does not include a file path or Hydrus hash", file=sys.stderr)
|
||||
return 1
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Get url for file via hash+store backend."""
|
||||
parsed = parse_cmdlet_args(args, self)
|
||||
|
||||
ctx.emit("No URLs found.")
|
||||
return 0
|
||||
# Extract hash and store from result or args
|
||||
file_hash = parsed.get("hash") or get_field(result, "hash")
|
||||
store_name = parsed.get("store") or get_field(result, "store")
|
||||
|
||||
if not file_hash:
|
||||
log("Error: No file hash provided")
|
||||
return 1
|
||||
|
||||
if not store_name:
|
||||
log("Error: No store name provided")
|
||||
return 1
|
||||
|
||||
# Normalize hash
|
||||
file_hash = normalize_hash(file_hash)
|
||||
if not file_hash:
|
||||
log("Error: Invalid hash format")
|
||||
return 1
|
||||
|
||||
# Get backend and retrieve url
|
||||
try:
|
||||
storage = FileStorage(config)
|
||||
backend = storage[store_name]
|
||||
|
||||
url = backend.get_url(file_hash)
|
||||
|
||||
if url:
|
||||
for url in url:
|
||||
# Emit rich object for pipeline compatibility
|
||||
ctx.emit({
|
||||
"url": url,
|
||||
"hash": file_hash,
|
||||
"store": store_name,
|
||||
})
|
||||
return 0
|
||||
else:
|
||||
ctx.emit("No url found")
|
||||
return 0
|
||||
|
||||
except KeyError:
|
||||
log(f"Error: Storage backend '{store_name}' not configured")
|
||||
return 1
|
||||
except Exception as exc:
|
||||
log(f"Error retrieving url: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
|
||||
# Register cmdlet
|
||||
register(["get-url", "get_url"])(Get_Url)
|
||||
|
||||
|
||||
|
||||
@@ -6,7 +6,7 @@ CMDLET = Cmdlet(
|
||||
name=".config",
|
||||
summary="Manage configuration settings",
|
||||
usage=".config [key] [value]",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg(
|
||||
name="key",
|
||||
description="Configuration key to update (dot-separated)",
|
||||
|
||||
@@ -42,16 +42,14 @@ from ._shared import (
|
||||
normalize_result_input,
|
||||
get_pipe_object_path,
|
||||
get_pipe_object_hash,
|
||||
should_show_help,
|
||||
get_field,
|
||||
)
|
||||
import models
|
||||
import pipeline as ctx
|
||||
|
||||
|
||||
def _get_item_value(item: Any, key: str, default: Any = None) -> Any:
|
||||
"""Helper to read either dict keys or attributes."""
|
||||
if isinstance(item, dict):
|
||||
return item.get(key, default)
|
||||
return getattr(item, key, default)
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -60,12 +58,9 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Merge multiple files into one."""
|
||||
|
||||
# Parse help
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
|
||||
# Parse arguments
|
||||
parsed = parse_cmdlet_args(args, CMDLET)
|
||||
@@ -102,7 +97,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
source_files: List[Path] = []
|
||||
source_tags_files: List[Path] = []
|
||||
source_hashes: List[str] = []
|
||||
source_urls: List[str] = []
|
||||
source_url: List[str] = []
|
||||
source_tags: List[str] = [] # NEW: collect tags from source files
|
||||
source_relationships: List[str] = [] # NEW: collect relationships from source files
|
||||
|
||||
@@ -146,7 +141,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
if tags_file.exists():
|
||||
source_tags_files.append(tags_file)
|
||||
|
||||
# Try to read hash, tags, urls, and relationships from .tags sidecar file
|
||||
# Try to read hash, tags, url, and relationships from .tags sidecar file
|
||||
try:
|
||||
tags_content = tags_file.read_text(encoding='utf-8')
|
||||
for line in tags_content.split('\n'):
|
||||
@@ -157,18 +152,18 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
hash_value = line[5:].strip()
|
||||
if hash_value:
|
||||
source_hashes.append(hash_value)
|
||||
elif line.startswith('known_url:') or line.startswith('url:'):
|
||||
# Extract URLs from tags file
|
||||
elif line.startswith('url:') or line.startswith('url:'):
|
||||
# Extract url from tags file
|
||||
url_value = line.split(':', 1)[1].strip() if ':' in line else ''
|
||||
if url_value and url_value not in source_urls:
|
||||
source_urls.append(url_value)
|
||||
if url_value and url_value not in source_url:
|
||||
source_url.append(url_value)
|
||||
elif line.startswith('relationship:'):
|
||||
# Extract relationships from tags file
|
||||
rel_value = line.split(':', 1)[1].strip() if ':' in line else ''
|
||||
if rel_value and rel_value not in source_relationships:
|
||||
source_relationships.append(rel_value)
|
||||
else:
|
||||
# Collect actual tags (not metadata like hash: or known_url:)
|
||||
# Collect actual tags (not metadata like hash: or url:)
|
||||
source_tags.append(line)
|
||||
except Exception:
|
||||
pass
|
||||
@@ -178,14 +173,14 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
if hash_value and hash_value not in source_hashes:
|
||||
source_hashes.append(str(hash_value))
|
||||
|
||||
# Extract known URLs if available
|
||||
known_urls = _get_item_value(item, 'known_urls', [])
|
||||
if isinstance(known_urls, str):
|
||||
source_urls.append(known_urls)
|
||||
elif isinstance(known_urls, list):
|
||||
source_urls.extend(known_urls)
|
||||
# Extract known url if available
|
||||
url = get_field(item, 'url', [])
|
||||
if isinstance(url, str):
|
||||
source_url.append(url)
|
||||
elif isinstance(url, list):
|
||||
source_url.extend(url)
|
||||
else:
|
||||
title = _get_item_value(item, 'title', 'unknown') or _get_item_value(item, 'id', 'unknown')
|
||||
title = get_field(item, 'title', 'unknown') or get_field(item, 'id', 'unknown')
|
||||
log(f"Warning: Could not locate file for item: {title}", file=sys.stderr)
|
||||
|
||||
if len(source_files) < 2:
|
||||
@@ -279,8 +274,8 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
if HAS_METADATA_API and write_tags_to_file:
|
||||
# Use unified API for file writing
|
||||
source_hashes_list = source_hashes if source_hashes else None
|
||||
source_urls_list = source_urls if source_urls else None
|
||||
write_tags_to_file(tags_path, merged_tags, source_hashes_list, source_urls_list)
|
||||
source_url_list = source_url if source_url else None
|
||||
write_tags_to_file(tags_path, merged_tags, source_hashes_list, source_url_list)
|
||||
else:
|
||||
# Fallback: manual file writing
|
||||
tags_lines = []
|
||||
@@ -292,10 +287,10 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Add regular tags
|
||||
tags_lines.extend(merged_tags)
|
||||
|
||||
# Add known URLs
|
||||
if source_urls:
|
||||
for url in source_urls:
|
||||
tags_lines.append(f"known_url:{url}")
|
||||
# Add known url
|
||||
if source_url:
|
||||
for url in source_url:
|
||||
tags_lines.append(f"url:{url}")
|
||||
|
||||
# Add relationships (if available)
|
||||
if source_relationships:
|
||||
@@ -309,7 +304,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
# Also create .metadata file using centralized function
|
||||
try:
|
||||
write_metadata(output_path, source_hashes[0] if source_hashes else None, source_urls, source_relationships)
|
||||
write_metadata(output_path, source_hashes[0] if source_hashes else None, source_url, source_relationships)
|
||||
log(f"Created metadata: {output_path.name}.metadata", file=sys.stderr)
|
||||
except Exception as e:
|
||||
log(f"Warning: Could not create metadata file: {e}", file=sys.stderr)
|
||||
@@ -325,12 +320,12 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
except ImportError:
|
||||
# Fallback: create a simple object with the required attributes
|
||||
class SimpleItem:
|
||||
def __init__(self, target, title, media_kind, tags=None, known_urls=None):
|
||||
def __init__(self, target, title, media_kind, tags=None, url=None):
|
||||
self.target = target
|
||||
self.title = title
|
||||
self.media_kind = media_kind
|
||||
self.tags = tags or []
|
||||
self.known_urls = known_urls or []
|
||||
self.url = url or []
|
||||
self.origin = "local" # Ensure origin is set for add-file
|
||||
PipelineItem = SimpleItem
|
||||
|
||||
@@ -339,7 +334,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
title=output_path.stem,
|
||||
media_kind=file_kind,
|
||||
tags=merged_tags, # Include merged tags
|
||||
known_urls=source_urls # Include known URLs
|
||||
url=source_url # Include known url
|
||||
)
|
||||
# Clear previous results to ensure only the merged file is passed down
|
||||
ctx.clear_last_result()
|
||||
@@ -904,12 +899,12 @@ CMDLET = Cmdlet(
|
||||
name="merge-file",
|
||||
summary="Merge multiple files into a single output file. Supports audio, video, PDF, and text merging with optional cleanup.",
|
||||
usage="merge-file [-delete] [-output <path>] [-format <auto|mp3|aac|opus|mp4|mkv|pdf|txt>]",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg("-delete", type="flag", description="Delete source files after successful merge."),
|
||||
CmdletArg("-output", description="Override output file path."),
|
||||
CmdletArg("-format", description="Output format (auto/mp3/aac/opus/mp4/mkv/pdf/txt). Default: auto-detect from first file."),
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"- Pipe multiple files: search-file query | [1,2,3] | merge-file",
|
||||
"- Audio files merge with minimal quality loss using specified codec.",
|
||||
"- Video files merge into MP4 or MKV containers.",
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
"""Screen-shot cmdlet for capturing screenshots of URLs in a pipeline.
|
||||
"""Screen-shot cmdlet for capturing screenshots of url in a pipeline.
|
||||
|
||||
This cmdlet processes files through the pipeline and creates screenshots using
|
||||
Playwright, marking them as temporary artifacts for cleanup.
|
||||
@@ -23,7 +23,7 @@ from helper.http_client import HTTPClient
|
||||
from helper.utils import ensure_directory, unique_path, unique_preserve_order
|
||||
|
||||
from . import register
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, create_pipe_object_result, normalize_result_input
|
||||
from ._shared import Cmdlet, CmdletArg, SharedArgs, create_pipe_object_result, normalize_result_input, should_show_help, get_field
|
||||
import models
|
||||
import pipeline as pipeline_context
|
||||
|
||||
@@ -113,8 +113,8 @@ class ScreenshotError(RuntimeError):
|
||||
class ScreenshotOptions:
|
||||
"""Options controlling screenshot capture and post-processing."""
|
||||
|
||||
url: str
|
||||
output_dir: Path
|
||||
url: Sequence[str] = ()
|
||||
output_path: Optional[Path] = None
|
||||
full_page: bool = True
|
||||
headless: bool = True
|
||||
@@ -124,7 +124,7 @@ class ScreenshotOptions:
|
||||
tags: Sequence[str] = ()
|
||||
archive: bool = False
|
||||
archive_timeout: float = ARCHIVE_TIMEOUT
|
||||
known_urls: Sequence[str] = ()
|
||||
url: Sequence[str] = ()
|
||||
output_format: Optional[str] = None
|
||||
prefer_platform_target: bool = False
|
||||
target_selectors: Optional[Sequence[str]] = None
|
||||
@@ -136,10 +136,9 @@ class ScreenshotResult:
|
||||
"""Details about the captured screenshot."""
|
||||
|
||||
path: Path
|
||||
url: str
|
||||
tags_applied: List[str]
|
||||
archive_urls: List[str]
|
||||
known_urls: List[str]
|
||||
archive_url: List[str]
|
||||
url: List[str]
|
||||
warnings: List[str] = field(default_factory=list)
|
||||
|
||||
|
||||
@@ -471,24 +470,24 @@ def _capture_screenshot(options: ScreenshotOptions) -> ScreenshotResult:
|
||||
warnings: List[str] = []
|
||||
_capture(options, destination, warnings)
|
||||
|
||||
known_urls = unique_preserve_order([options.url, *options.known_urls])
|
||||
archive_urls: List[str] = []
|
||||
# Build URL list from provided options.url (sequence) and deduplicate
|
||||
url = unique_preserve_order(list(options.url))
|
||||
archive_url: List[str] = []
|
||||
if options.archive:
|
||||
debug(f"[_capture_screenshot] Archiving enabled for {options.url}")
|
||||
archives, archive_warnings = _archive_url(options.url, options.archive_timeout)
|
||||
archive_urls.extend(archives)
|
||||
archive_url.extend(archives)
|
||||
warnings.extend(archive_warnings)
|
||||
if archives:
|
||||
known_urls = unique_preserve_order([*known_urls, *archives])
|
||||
url = unique_preserve_order([*url, *archives])
|
||||
|
||||
applied_tags = unique_preserve_order(list(tag for tag in options.tags if tag.strip()))
|
||||
|
||||
return ScreenshotResult(
|
||||
path=destination,
|
||||
url=options.url,
|
||||
tags_applied=applied_tags,
|
||||
archive_urls=archive_urls,
|
||||
known_urls=known_urls,
|
||||
archive_url=archive_url,
|
||||
url=url,
|
||||
warnings=warnings,
|
||||
)
|
||||
|
||||
@@ -498,10 +497,10 @@ def _capture_screenshot(options: ScreenshotOptions) -> ScreenshotResult:
|
||||
# ============================================================================
|
||||
|
||||
def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Take screenshots of URLs in the pipeline.
|
||||
"""Take screenshots of url in the pipeline.
|
||||
|
||||
Accepts:
|
||||
- Single result object (dict or PipeObject) with 'file_path' field
|
||||
- Single result object (dict or PipeObject) with 'path' field
|
||||
- List of result objects to screenshot each
|
||||
- Direct URL as string
|
||||
|
||||
@@ -518,12 +517,9 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
debug(f"[_run] screen-shot invoked with args: {args}")
|
||||
|
||||
# Help check
|
||||
try:
|
||||
if any(str(a).lower() in {"-?", "/?", "--help", "-h", "help", "--cmdlet"} for a in args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
except Exception:
|
||||
pass
|
||||
if should_show_help(args):
|
||||
log(json.dumps(CMDLET, ensure_ascii=False, indent=2))
|
||||
return 0
|
||||
|
||||
# ========================================================================
|
||||
# ARGUMENT PARSING
|
||||
@@ -539,36 +535,36 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
# Positional URL argument (if provided)
|
||||
url_arg = parsed.get("url")
|
||||
positional_urls = [str(url_arg)] if url_arg else []
|
||||
positional_url = [str(url_arg)] if url_arg else []
|
||||
|
||||
# ========================================================================
|
||||
# INPUT PROCESSING - Extract URLs from pipeline or command arguments
|
||||
# INPUT PROCESSING - Extract url from pipeline or command arguments
|
||||
# ========================================================================
|
||||
|
||||
piped_results = normalize_result_input(result)
|
||||
urls_to_process = []
|
||||
url_to_process = []
|
||||
|
||||
# Extract URLs from piped results
|
||||
# Extract url from piped results
|
||||
if piped_results:
|
||||
for item in piped_results:
|
||||
url = None
|
||||
if isinstance(item, dict):
|
||||
url = item.get('file_path') or item.get('path') or item.get('url') or item.get('target')
|
||||
else:
|
||||
url = getattr(item, 'file_path', None) or getattr(item, 'path', None) or getattr(item, 'url', None) or getattr(item, 'target', None)
|
||||
|
||||
url = (
|
||||
get_field(item, 'path')
|
||||
or get_field(item, 'url')
|
||||
or get_field(item, 'target')
|
||||
)
|
||||
|
||||
if url:
|
||||
urls_to_process.append(str(url))
|
||||
url_to_process.append(str(url))
|
||||
|
||||
# Use positional arguments if no pipeline input
|
||||
if not urls_to_process and positional_urls:
|
||||
urls_to_process = positional_urls
|
||||
if not url_to_process and positional_url:
|
||||
url_to_process = positional_url
|
||||
|
||||
if not urls_to_process:
|
||||
log(f"No URLs to process for screen-shot cmdlet", file=sys.stderr)
|
||||
if not url_to_process:
|
||||
log(f"No url to process for screen-shot cmdlet", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
debug(f"[_run] URLs to process: {urls_to_process}")
|
||||
debug(f"[_run] url to process: {url_to_process}")
|
||||
|
||||
# ========================================================================
|
||||
# OUTPUT DIRECTORY RESOLUTION - Priority chain
|
||||
@@ -619,10 +615,10 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
all_emitted = []
|
||||
exit_code = 0
|
||||
# ========================================================================
|
||||
# PROCESS URLs AND CAPTURE SCREENSHOTS
|
||||
# PROCESS url AND CAPTURE SCREENSHOTS
|
||||
# ========================================================================
|
||||
|
||||
for url in urls_to_process:
|
||||
for url in url_to_process:
|
||||
# Validate URL format
|
||||
if not url.lower().startswith(("http://", "https://", "file://")):
|
||||
log(f"[screen_shot] Skipping non-URL input: {url}", file=sys.stderr)
|
||||
@@ -631,7 +627,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
try:
|
||||
# Create screenshot with provided options
|
||||
options = ScreenshotOptions(
|
||||
url=url,
|
||||
url=[url],
|
||||
output_dir=screenshot_dir,
|
||||
output_format=format_name,
|
||||
archive=archive_enabled,
|
||||
@@ -645,8 +641,8 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
# Log results and warnings
|
||||
log(f"Screenshot captured to {screenshot_result.path}", flush=True)
|
||||
if screenshot_result.archive_urls:
|
||||
log(f"Archives: {', '.join(screenshot_result.archive_urls)}", flush=True)
|
||||
if screenshot_result.archive_url:
|
||||
log(f"Archives: {', '.join(screenshot_result.archive_url)}", flush=True)
|
||||
for warning in screenshot_result.warnings:
|
||||
log(f"Warning: {warning}", flush=True)
|
||||
|
||||
@@ -670,8 +666,8 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
parent_hash=hashlib.sha256(url.encode()).hexdigest(),
|
||||
extra={
|
||||
'source_url': url,
|
||||
'archive_urls': screenshot_result.archive_urls,
|
||||
'known_urls': screenshot_result.known_urls,
|
||||
'archive_url': screenshot_result.archive_url,
|
||||
'url': screenshot_result.url,
|
||||
'target': str(screenshot_result.path), # Explicit target for add-file
|
||||
}
|
||||
)
|
||||
@@ -701,16 +697,16 @@ CMDLET = Cmdlet(
|
||||
name="screen-shot",
|
||||
summary="Capture a screenshot of a URL or file and mark as temporary artifact",
|
||||
usage="screen-shot <url> [options] or download-data <url> | screen-shot [options]",
|
||||
aliases=["screenshot", "ss"],
|
||||
args=[
|
||||
alias=["screenshot", "ss"],
|
||||
arg=[
|
||||
CmdletArg(name="url", type="string", required=False, description="URL to screenshot (or from pipeline)"),
|
||||
CmdletArg(name="format", type="string", description="Output format: png, jpeg, or pdf"),
|
||||
CmdletArg(name="selector", type="string", description="CSS selector for element capture"),
|
||||
SharedArgs.ARCHIVE, # Use shared archive argument
|
||||
SharedArgs.STORAGE, # Use shared storage argument
|
||||
SharedArgs.STORE, # Use shared storage argument
|
||||
],
|
||||
details=[
|
||||
"Take screenshots of URLs with optional archiving and element targeting.",
|
||||
detail=[
|
||||
"Take screenshots of url with optional archiving and element targeting.",
|
||||
"Screenshots are marked as temporary artifacts for cleanup by the cleanup cmdlet.",
|
||||
"",
|
||||
"Arguments:",
|
||||
|
||||
@@ -1,531 +0,0 @@
|
||||
"""Search-file cmdlet: Search for files by query, tags, size, type, duration, etc."""
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, Sequence, List, Optional, Tuple, Callable
|
||||
from fnmatch import fnmatchcase
|
||||
from pathlib import Path
|
||||
from dataclasses import dataclass, field
|
||||
from collections import OrderedDict
|
||||
import re
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
|
||||
from helper.logger import log, debug
|
||||
import shutil
|
||||
import subprocess
|
||||
|
||||
from helper.file_storage import FileStorage
|
||||
from helper.search_provider import get_provider, list_providers, SearchResult
|
||||
from metadata import import_pending_sidecars
|
||||
|
||||
from . import register
|
||||
from ._shared import Cmdlet, CmdletArg
|
||||
import models
|
||||
import pipeline as ctx
|
||||
|
||||
# Optional dependencies
|
||||
try:
|
||||
import mutagen # type: ignore
|
||||
except ImportError: # pragma: no cover
|
||||
mutagen = None # type: ignore
|
||||
|
||||
try:
|
||||
from config import get_hydrus_url, resolve_output_dir
|
||||
except Exception: # pragma: no cover
|
||||
get_hydrus_url = None # type: ignore
|
||||
resolve_output_dir = None # type: ignore
|
||||
|
||||
try:
|
||||
from helper.hydrus import HydrusClient, HydrusRequestError
|
||||
except ImportError: # pragma: no cover
|
||||
HydrusClient = None # type: ignore
|
||||
HydrusRequestError = RuntimeError # type: ignore
|
||||
|
||||
try:
|
||||
from helper.utils import sha256_file
|
||||
except ImportError: # pragma: no cover
|
||||
sha256_file = None # type: ignore
|
||||
|
||||
try:
|
||||
from helper.utils_constant import mime_maps
|
||||
except ImportError: # pragma: no cover
|
||||
mime_maps = {} # type: ignore
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# Data Classes (from helper/search.py)
|
||||
# ============================================================================
|
||||
|
||||
@dataclass(slots=True)
|
||||
class SearchRecord:
|
||||
path: str
|
||||
size_bytes: int | None = None
|
||||
duration_seconds: str | None = None
|
||||
tags: str | None = None
|
||||
hash_hex: str | None = None
|
||||
|
||||
def as_dict(self) -> dict[str, str]:
|
||||
payload: dict[str, str] = {"path": self.path}
|
||||
if self.size_bytes is not None:
|
||||
payload["size"] = str(self.size_bytes)
|
||||
if self.duration_seconds:
|
||||
payload["duration"] = self.duration_seconds
|
||||
if self.tags:
|
||||
payload["tags"] = self.tags
|
||||
if self.hash_hex:
|
||||
payload["hash"] = self.hash_hex
|
||||
return payload
|
||||
|
||||
|
||||
@dataclass
|
||||
class ResultItem:
|
||||
origin: str
|
||||
title: str
|
||||
detail: str
|
||||
annotations: List[str]
|
||||
target: str
|
||||
media_kind: str = "other"
|
||||
hash_hex: Optional[str] = None
|
||||
columns: List[tuple[str, str]] = field(default_factory=list)
|
||||
tag_summary: Optional[str] = None
|
||||
duration_seconds: Optional[float] = None
|
||||
size_bytes: Optional[int] = None
|
||||
full_metadata: Optional[Dict[str, Any]] = None
|
||||
tags: Optional[set[str]] = field(default_factory=set)
|
||||
relationships: Optional[List[str]] = field(default_factory=list)
|
||||
known_urls: Optional[List[str]] = field(default_factory=list)
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
payload: Dict[str, Any] = {
|
||||
"title": self.title,
|
||||
}
|
||||
|
||||
# Always include these core fields for downstream cmdlets (get-file, download-data, etc)
|
||||
payload["origin"] = self.origin
|
||||
payload["target"] = self.target
|
||||
payload["media_kind"] = self.media_kind
|
||||
|
||||
# Always include full_metadata if present (needed by download-data, etc)
|
||||
# This is NOT for display, but for downstream processing
|
||||
if self.full_metadata:
|
||||
payload["full_metadata"] = self.full_metadata
|
||||
|
||||
# Include columns if defined (result renderer will use these for display)
|
||||
if self.columns:
|
||||
payload["columns"] = list(self.columns)
|
||||
else:
|
||||
# If no columns, include the detail for backwards compatibility
|
||||
payload["detail"] = self.detail
|
||||
payload["annotations"] = list(self.annotations)
|
||||
|
||||
# Include optional fields
|
||||
if self.hash_hex:
|
||||
payload["hash"] = self.hash_hex
|
||||
if self.tag_summary:
|
||||
payload["tags"] = self.tag_summary
|
||||
if self.tags:
|
||||
payload["tags_set"] = list(self.tags)
|
||||
if self.relationships:
|
||||
payload["relationships"] = self.relationships
|
||||
if self.known_urls:
|
||||
payload["known_urls"] = self.known_urls
|
||||
return payload
|
||||
|
||||
|
||||
STORAGE_ORIGINS = {"local", "hydrus", "debrid"}
|
||||
|
||||
|
||||
def _normalize_extension(ext_value: Any) -> str:
|
||||
"""Sanitize extension strings to alphanumerics and cap at 5 chars."""
|
||||
ext = str(ext_value or "").strip().lstrip(".")
|
||||
|
||||
# Stop at common separators to avoid dragging status text into the extension
|
||||
for sep in (" ", "|", "(", "[", "{", ",", ";"):
|
||||
if sep in ext:
|
||||
ext = ext.split(sep, 1)[0]
|
||||
break
|
||||
|
||||
# If there are multiple dots, take the last token as the extension
|
||||
if "." in ext:
|
||||
ext = ext.split(".")[-1]
|
||||
|
||||
# Keep only alphanumeric characters and enforce max length
|
||||
ext = "".join(ch for ch in ext if ch.isalnum())
|
||||
return ext[:5]
|
||||
|
||||
|
||||
def _ensure_storage_columns(payload: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""Attach Title/Store columns for storage-origin results to keep CLI display compact."""
|
||||
origin_value = str(payload.get("origin") or payload.get("source") or "").lower()
|
||||
if origin_value not in STORAGE_ORIGINS:
|
||||
return payload
|
||||
|
||||
title = payload.get("title") or payload.get("name") or payload.get("target") or payload.get("path") or "Result"
|
||||
store_label = payload.get("origin") or payload.get("source") or origin_value
|
||||
|
||||
# Handle extension
|
||||
extension = _normalize_extension(payload.get("ext", ""))
|
||||
if not extension and title:
|
||||
path_obj = Path(str(title))
|
||||
if path_obj.suffix:
|
||||
extension = _normalize_extension(path_obj.suffix.lstrip('.'))
|
||||
title = path_obj.stem
|
||||
|
||||
# Handle size as integer MB (header will include units)
|
||||
size_val = payload.get("size") or payload.get("size_bytes")
|
||||
size_str = ""
|
||||
if size_val is not None:
|
||||
try:
|
||||
size_bytes = int(size_val)
|
||||
size_mb = int(size_bytes / (1024 * 1024))
|
||||
size_str = str(size_mb)
|
||||
except (ValueError, TypeError):
|
||||
size_str = str(size_val)
|
||||
|
||||
normalized = dict(payload)
|
||||
normalized["columns"] = [
|
||||
("Title", str(title)),
|
||||
("Ext", str(extension)),
|
||||
("Store", str(store_label)),
|
||||
("Size(Mb)", str(size_str)),
|
||||
]
|
||||
return normalized
|
||||
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name="search-file",
|
||||
summary="Unified search cmdlet for storage (Hydrus, Local) and providers (Debrid, LibGen, OpenLibrary, Soulseek).",
|
||||
usage="search-file [query] [-tag TAG] [-size >100MB|<50MB] [-type audio|video|image] [-duration >10:00] [-storage BACKEND] [-provider PROVIDER]",
|
||||
args=[
|
||||
CmdletArg("query", description="Search query string"),
|
||||
CmdletArg("tag", description="Filter by tag (can be used multiple times)"),
|
||||
CmdletArg("size", description="Filter by size: >100MB, <50MB, =10MB"),
|
||||
CmdletArg("type", description="Filter by type: audio, video, image, document"),
|
||||
CmdletArg("duration", description="Filter by duration: >10:00, <1:30:00"),
|
||||
CmdletArg("limit", type="integer", description="Limit results (default: 45)"),
|
||||
CmdletArg("storage", description="Search storage backend: hydrus, local (default: all searchable storages)"),
|
||||
CmdletArg("provider", description="Search provider: libgen, openlibrary, soulseek, debrid, local (overrides -storage)"),
|
||||
],
|
||||
details=[
|
||||
"Search across storage (Hydrus, Local) and providers (Debrid, LibGen, OpenLibrary, Soulseek)",
|
||||
"Use -provider to search a specific source, or -storage to search file backends",
|
||||
"Filter results by: tag, size, type, duration",
|
||||
"Results can be piped to other commands",
|
||||
"Examples:",
|
||||
"search-file foo # Search all file backends",
|
||||
"search-file -provider libgen 'python programming' # Search LibGen books",
|
||||
"search-file -provider debrid 'movie' # Search AllDebrid magnets",
|
||||
"search-file 'music' -provider soulseek # Search Soulseek P2P",
|
||||
"search-file -provider openlibrary 'tolkien' # Search OpenLibrary",
|
||||
"search-file song -storage hydrus -type audio # Search only Hydrus audio",
|
||||
"search-file movie -tag action -provider debrid # Debrid with filters",
|
||||
],
|
||||
)
|
||||
|
||||
|
||||
@register(["search-file", "search"])
|
||||
def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Search across multiple providers: Hydrus, Local, Debrid, LibGen, etc."""
|
||||
args_list = [str(arg) for arg in (args or [])]
|
||||
|
||||
# Parse arguments
|
||||
query = ""
|
||||
tag_filters: List[str] = []
|
||||
size_filter: Optional[Tuple[str, int]] = None
|
||||
duration_filter: Optional[Tuple[str, float]] = None
|
||||
type_filter: Optional[str] = None
|
||||
storage_backend: Optional[str] = None
|
||||
provider_name: Optional[str] = None
|
||||
limit = 45
|
||||
searched_backends: List[str] = []
|
||||
|
||||
# Simple argument parsing
|
||||
i = 0
|
||||
while i < len(args_list):
|
||||
arg = args_list[i]
|
||||
low = arg.lower()
|
||||
|
||||
if low in {"-provider", "--provider"} and i + 1 < len(args_list):
|
||||
provider_name = args_list[i + 1].lower()
|
||||
i += 2
|
||||
elif low in {"-storage", "--storage"} and i + 1 < len(args_list):
|
||||
storage_backend = args_list[i + 1].lower()
|
||||
i += 2
|
||||
elif low in {"-tag", "--tag"} and i + 1 < len(args_list):
|
||||
tag_filters.append(args_list[i + 1])
|
||||
i += 2
|
||||
elif low in {"-limit", "--limit"} and i + 1 < len(args_list):
|
||||
try:
|
||||
limit = int(args_list[i + 1])
|
||||
except ValueError:
|
||||
limit = 100
|
||||
i += 2
|
||||
elif low in {"-type", "--type"} and i + 1 < len(args_list):
|
||||
type_filter = args_list[i + 1].lower()
|
||||
i += 2
|
||||
elif not arg.startswith("-"):
|
||||
if query:
|
||||
query += " " + arg
|
||||
else:
|
||||
query = arg
|
||||
i += 1
|
||||
else:
|
||||
i += 1
|
||||
|
||||
# Extract store: filter tokens (works with commas or whitespace) and clean query for backends
|
||||
store_filter: Optional[str] = None
|
||||
if query:
|
||||
match = re.search(r"\bstore:([^\s,]+)", query, flags=re.IGNORECASE)
|
||||
if match:
|
||||
store_filter = match.group(1).strip().lower() or None
|
||||
# Remove any store: tokens so downstream backends see only the actual query
|
||||
query = re.sub(r"\s*[,]?\s*store:[^\s,]+", " ", query, flags=re.IGNORECASE)
|
||||
query = re.sub(r"\s{2,}", " ", query)
|
||||
query = query.strip().strip(',')
|
||||
|
||||
# Debrid is provider-only now
|
||||
if storage_backend and storage_backend.lower() == "debrid":
|
||||
log("Use -provider debrid instead of -storage debrid (debrid is provider-only)", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# If store: was provided without explicit -storage/-provider, prefer that backend
|
||||
if store_filter and not provider_name and not storage_backend:
|
||||
if store_filter in {"hydrus", "local", "debrid"}:
|
||||
storage_backend = store_filter
|
||||
|
||||
# Handle piped input (e.g. from @N selection) if query is empty
|
||||
if not query and result:
|
||||
# If result is a list, take the first item
|
||||
actual_result = result[0] if isinstance(result, list) and result else result
|
||||
|
||||
# Helper to get field
|
||||
def get_field(obj: Any, field: str) -> Any:
|
||||
return getattr(obj, field, None) or (obj.get(field) if isinstance(obj, dict) else None)
|
||||
|
||||
origin = get_field(actual_result, 'origin')
|
||||
target = get_field(actual_result, 'target')
|
||||
|
||||
# Special handling for Bandcamp artist/album drill-down
|
||||
if origin == 'bandcamp' and target:
|
||||
query = target
|
||||
if not provider_name:
|
||||
provider_name = 'bandcamp'
|
||||
|
||||
# Generic URL handling
|
||||
elif target and str(target).startswith(('http://', 'https://')):
|
||||
query = target
|
||||
# Try to infer provider from URL if not set
|
||||
if not provider_name:
|
||||
if 'bandcamp.com' in target:
|
||||
provider_name = 'bandcamp'
|
||||
elif 'youtube.com' in target or 'youtu.be' in target:
|
||||
provider_name = 'youtube'
|
||||
|
||||
if not query:
|
||||
log("Provide a search query", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Initialize worker for this search command
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from config import get_local_storage_path
|
||||
import uuid
|
||||
worker_id = str(uuid.uuid4())
|
||||
library_root = get_local_storage_path(config or {})
|
||||
if not library_root:
|
||||
log("No library root configured", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
db = None
|
||||
try:
|
||||
db = LocalLibraryDB(library_root)
|
||||
db.insert_worker(
|
||||
worker_id,
|
||||
"search",
|
||||
title=f"Search: {query}",
|
||||
description=f"Query: {query}",
|
||||
pipe=ctx.get_current_command_text()
|
||||
)
|
||||
|
||||
results_list = []
|
||||
import result_table
|
||||
import importlib
|
||||
importlib.reload(result_table)
|
||||
from result_table import ResultTable
|
||||
|
||||
# Create ResultTable for display
|
||||
table_title = f"Search: {query}"
|
||||
if provider_name:
|
||||
table_title += f" [{provider_name}]"
|
||||
elif storage_backend:
|
||||
table_title += f" [{storage_backend}]"
|
||||
|
||||
table = ResultTable(table_title)
|
||||
table.set_source_command("search-file", args_list)
|
||||
|
||||
# Try to search using provider (libgen, soulseek, debrid, openlibrary)
|
||||
if provider_name:
|
||||
debug(f"[search_file] Attempting provider search with: {provider_name}")
|
||||
provider = get_provider(provider_name, config)
|
||||
if not provider:
|
||||
log(f"Provider '{provider_name}' not available", file=sys.stderr)
|
||||
db.update_worker_status(worker_id, 'error')
|
||||
return 1
|
||||
|
||||
debug(f"[search_file] Provider loaded, calling search with query: {query}")
|
||||
search_result = provider.search(query, limit=limit)
|
||||
debug(f"[search_file] Provider search returned {len(search_result)} results")
|
||||
|
||||
for item in search_result:
|
||||
# Add to table
|
||||
table.add_result(item)
|
||||
|
||||
# Emit to pipeline
|
||||
item_dict = item.to_dict()
|
||||
results_list.append(item_dict)
|
||||
ctx.emit(item_dict)
|
||||
|
||||
# Set the result table in context for TUI/CLI display
|
||||
ctx.set_last_result_table(table, results_list)
|
||||
|
||||
debug(f"[search_file] Emitted {len(results_list)} results")
|
||||
|
||||
# Write results to worker stdout
|
||||
db.append_worker_stdout(worker_id, json.dumps(results_list, indent=2))
|
||||
db.update_worker_status(worker_id, 'completed')
|
||||
return 0
|
||||
|
||||
# Otherwise search using storage backends (Hydrus, Local)
|
||||
from helper.file_storage import FileStorage
|
||||
storage = FileStorage(config=config or {})
|
||||
|
||||
backend_to_search = storage_backend or None
|
||||
if backend_to_search:
|
||||
# Check if requested backend is available
|
||||
if backend_to_search == "hydrus":
|
||||
from helper.hydrus import is_hydrus_available
|
||||
if not is_hydrus_available(config or {}):
|
||||
log(f"Backend 'hydrus' is not available (Hydrus service not running)", file=sys.stderr)
|
||||
db.update_worker_status(worker_id, 'error')
|
||||
return 1
|
||||
searched_backends.append(backend_to_search)
|
||||
if not storage.supports_search(backend_to_search):
|
||||
log(f"Backend '{backend_to_search}' does not support searching", file=sys.stderr)
|
||||
db.update_worker_status(worker_id, 'error')
|
||||
return 1
|
||||
results = storage[backend_to_search].search(query, limit=limit)
|
||||
else:
|
||||
# Search all searchable backends, but skip hydrus if unavailable
|
||||
from helper.hydrus import is_hydrus_available
|
||||
hydrus_available = is_hydrus_available(config or {})
|
||||
|
||||
all_results = []
|
||||
for backend_name in storage.list_searchable_backends():
|
||||
# Skip hydrus if not available
|
||||
if backend_name == "hydrus" and not hydrus_available:
|
||||
continue
|
||||
searched_backends.append(backend_name)
|
||||
try:
|
||||
backend_results = storage[backend_name].search(query, limit=limit - len(all_results))
|
||||
if backend_results:
|
||||
all_results.extend(backend_results)
|
||||
if len(all_results) >= limit:
|
||||
break
|
||||
except Exception as exc:
|
||||
log(f"Backend {backend_name} search failed: {exc}", file=sys.stderr)
|
||||
results = all_results[:limit]
|
||||
|
||||
# Also query Debrid provider by default (provider-only, but keep legacy coverage when no explicit provider given)
|
||||
if not provider_name and not storage_backend:
|
||||
try:
|
||||
debrid_provider = get_provider("debrid", config)
|
||||
if debrid_provider and debrid_provider.validate():
|
||||
remaining = max(0, limit - len(results)) if isinstance(results, list) else limit
|
||||
if remaining > 0:
|
||||
debrid_results = debrid_provider.search(query, limit=remaining)
|
||||
if debrid_results:
|
||||
if "debrid" not in searched_backends:
|
||||
searched_backends.append("debrid")
|
||||
if results is None:
|
||||
results = []
|
||||
results.extend(debrid_results)
|
||||
except Exception as exc:
|
||||
log(f"Debrid provider search failed: {exc}", file=sys.stderr)
|
||||
|
||||
def _format_storage_label(name: str) -> str:
|
||||
clean = str(name or "").strip()
|
||||
if not clean:
|
||||
return "Unknown"
|
||||
return clean.replace("_", " ").title()
|
||||
|
||||
storage_counts: OrderedDict[str, int] = OrderedDict((name, 0) for name in searched_backends)
|
||||
for item in results or []:
|
||||
origin = getattr(item, 'origin', None)
|
||||
if origin is None and isinstance(item, dict):
|
||||
origin = item.get('origin') or item.get('source')
|
||||
if not origin:
|
||||
continue
|
||||
key = str(origin).lower()
|
||||
if key not in storage_counts:
|
||||
storage_counts[key] = 0
|
||||
storage_counts[key] += 1
|
||||
|
||||
if storage_counts or query:
|
||||
display_counts = OrderedDict((_format_storage_label(name), count) for name, count in storage_counts.items())
|
||||
summary_line = table.set_storage_summary(display_counts, query, inline=True)
|
||||
if summary_line:
|
||||
table.title = summary_line
|
||||
|
||||
# Emit results and collect for workers table
|
||||
if results:
|
||||
for item in results:
|
||||
def _as_dict(obj: Any) -> Dict[str, Any]:
|
||||
if isinstance(obj, dict):
|
||||
return dict(obj)
|
||||
if hasattr(obj, "to_dict") and callable(getattr(obj, "to_dict")):
|
||||
return obj.to_dict() # type: ignore[arg-type]
|
||||
return {"title": str(obj)}
|
||||
|
||||
item_dict = _as_dict(item)
|
||||
if store_filter:
|
||||
origin_val = str(item_dict.get("origin") or item_dict.get("source") or "").lower()
|
||||
if store_filter != origin_val:
|
||||
continue
|
||||
normalized = _ensure_storage_columns(item_dict)
|
||||
# Add to table using normalized columns to avoid extra fields (e.g., Tags/Name)
|
||||
table.add_result(normalized)
|
||||
|
||||
results_list.append(normalized)
|
||||
ctx.emit(normalized)
|
||||
|
||||
# Set the result table in context for TUI/CLI display
|
||||
ctx.set_last_result_table(table, results_list)
|
||||
|
||||
# Write results to worker stdout
|
||||
db.append_worker_stdout(worker_id, json.dumps(results_list, indent=2))
|
||||
else:
|
||||
log("No results found", file=sys.stderr)
|
||||
db.append_worker_stdout(worker_id, json.dumps([], indent=2))
|
||||
|
||||
db.update_worker_status(worker_id, 'completed')
|
||||
return 0
|
||||
|
||||
except Exception as exc:
|
||||
log(f"Search failed: {exc}", file=sys.stderr)
|
||||
import traceback
|
||||
traceback.print_exc(file=sys.stderr)
|
||||
if db:
|
||||
try:
|
||||
db.update_worker_status(worker_id, 'error')
|
||||
except Exception:
|
||||
pass
|
||||
return 1
|
||||
|
||||
finally:
|
||||
# Always close the database connection
|
||||
if db:
|
||||
try:
|
||||
db.close()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
117
cmdlets/search_provider.py
Normal file
117
cmdlets/search_provider.py
Normal file
@@ -0,0 +1,117 @@
|
||||
"""search-provider cmdlet: Search external providers (bandcamp, libgen, soulseek, youtube)."""
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, List, Sequence
|
||||
import sys
|
||||
|
||||
from helper.logger import log, debug
|
||||
from helper.provider import get_search_provider, list_search_providers
|
||||
|
||||
from ._shared import Cmdlet, CmdletArg, should_show_help
|
||||
import pipeline as ctx
|
||||
|
||||
|
||||
class Search_Provider(Cmdlet):
|
||||
"""Search external content providers."""
|
||||
|
||||
def __init__(self):
|
||||
super().__init__(
|
||||
name="search-provider",
|
||||
summary="Search external providers (bandcamp, libgen, soulseek, youtube)",
|
||||
usage="search-provider <provider> <query> [-limit N]",
|
||||
arg=[
|
||||
CmdletArg("provider", type="string", required=True, description="Provider name: bandcamp, libgen, soulseek, youtube"),
|
||||
CmdletArg("query", type="string", required=True, description="Search query (supports provider-specific syntax)"),
|
||||
CmdletArg("limit", type="int", description="Maximum results to return (default: 50)"),
|
||||
],
|
||||
detail=[
|
||||
"Search external content providers:",
|
||||
"- bandcamp: Search for music albums/tracks",
|
||||
" Example: search-provider bandcamp \"artist:altrusian grace\"",
|
||||
"- libgen: Search Library Genesis for books",
|
||||
" Example: search-provider libgen \"python programming\"",
|
||||
"- soulseek: Search P2P network for music",
|
||||
" Example: search-provider soulseek \"pink floyd\"",
|
||||
"- youtube: Search YouTube for videos",
|
||||
" Example: search-provider youtube \"tutorial\"",
|
||||
"",
|
||||
"Query syntax:",
|
||||
"- bandcamp: Use 'artist:Name' to search by artist",
|
||||
"- libgen: Supports isbn:, author:, title: prefixes",
|
||||
"- soulseek: Plain text search",
|
||||
"- youtube: Plain text search",
|
||||
"",
|
||||
"Results can be piped to other cmdlets:",
|
||||
" search-provider bandcamp \"artist:grace\" | @1 | download-data",
|
||||
],
|
||||
exec=self.run
|
||||
)
|
||||
self.register()
|
||||
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Execute search-provider cmdlet."""
|
||||
if should_show_help(args):
|
||||
ctx.emit(self.__dict__)
|
||||
return 0
|
||||
|
||||
# Parse arguments
|
||||
if len(args) < 2:
|
||||
log("Error: search-provider requires <provider> and <query> arguments", file=sys.stderr)
|
||||
log(f"Usage: {self.usage}", file=sys.stderr)
|
||||
log("Available providers:", file=sys.stderr)
|
||||
providers = list_search_providers(config)
|
||||
for name, available in sorted(providers.items()):
|
||||
status = "✓" if available else "✗"
|
||||
log(f" {status} {name}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
provider_name = args[0]
|
||||
query = args[1]
|
||||
|
||||
# Parse optional limit
|
||||
limit = 50
|
||||
if len(args) >= 4 and args[2] in ("-limit", "--limit"):
|
||||
try:
|
||||
limit = int(args[3])
|
||||
except ValueError:
|
||||
log(f"Warning: Invalid limit value '{args[3]}', using default 50", file=sys.stderr)
|
||||
|
||||
debug(f"[search-provider] provider={provider_name}, query={query}, limit={limit}")
|
||||
|
||||
# Get provider
|
||||
provider = get_search_provider(provider_name, config)
|
||||
if not provider:
|
||||
log(f"Error: Provider '{provider_name}' is not available", file=sys.stderr)
|
||||
log("Available providers:", file=sys.stderr)
|
||||
providers = list_search_providers(config)
|
||||
for name, available in sorted(providers.items()):
|
||||
if available:
|
||||
log(f" - {name}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Execute search
|
||||
try:
|
||||
debug(f"[search-provider] Calling {provider_name}.search()")
|
||||
results = provider.search(query, limit=limit)
|
||||
debug(f"[search-provider] Got {len(results)} results")
|
||||
|
||||
if not results:
|
||||
log(f"No results found for query: {query}", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
# Emit results for pipeline
|
||||
for search_result in results:
|
||||
ctx.emit(search_result.to_dict())
|
||||
|
||||
log(f"Found {len(results)} result(s) from {provider_name}", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
except Exception as e:
|
||||
log(f"Error searching {provider_name}: {e}", file=sys.stderr)
|
||||
import traceback
|
||||
debug(traceback.format_exc())
|
||||
return 1
|
||||
|
||||
|
||||
# Register cmdlet instance
|
||||
Search_Provider_Instance = Search_Provider()
|
||||
341
cmdlets/search_store.py
Normal file
341
cmdlets/search_store.py
Normal file
@@ -0,0 +1,341 @@
|
||||
"""Search-store cmdlet: Search for files in storage backends (Folder, Hydrus)."""
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, Sequence, List, Optional, Tuple
|
||||
from pathlib import Path
|
||||
from dataclasses import dataclass, field
|
||||
from collections import OrderedDict
|
||||
import re
|
||||
import json
|
||||
import sys
|
||||
|
||||
from helper.logger import log, debug
|
||||
|
||||
from ._shared import Cmdlet, CmdletArg, get_origin, get_field, should_show_help
|
||||
import pipeline as ctx
|
||||
|
||||
# Optional dependencies
|
||||
try:
|
||||
import mutagen # type: ignore
|
||||
except ImportError: # pragma: no cover
|
||||
mutagen = None # type: ignore
|
||||
|
||||
try:
|
||||
from config import get_hydrus_url, resolve_output_dir
|
||||
except Exception: # pragma: no cover
|
||||
get_hydrus_url = None # type: ignore
|
||||
resolve_output_dir = None # type: ignore
|
||||
|
||||
try:
|
||||
from helper.hydrus import HydrusClient, HydrusRequestError
|
||||
except ImportError: # pragma: no cover
|
||||
HydrusClient = None # type: ignore
|
||||
HydrusRequestError = RuntimeError # type: ignore
|
||||
|
||||
try:
|
||||
from helper.utils import sha256_file
|
||||
except ImportError: # pragma: no cover
|
||||
sha256_file = None # type: ignore
|
||||
|
||||
try:
|
||||
from helper.utils_constant import mime_maps
|
||||
except ImportError: # pragma: no cover
|
||||
mime_maps = {} # type: ignore
|
||||
|
||||
@dataclass(slots=True)
|
||||
class SearchRecord:
|
||||
path: str
|
||||
size_bytes: int | None = None
|
||||
duration_seconds: str | None = None
|
||||
tags: str | None = None
|
||||
hash_hex: str | None = None
|
||||
|
||||
def as_dict(self) -> dict[str, str]:
|
||||
payload: dict[str, str] = {"path": self.path}
|
||||
if self.size_bytes is not None:
|
||||
payload["size"] = str(self.size_bytes)
|
||||
if self.duration_seconds:
|
||||
payload["duration"] = self.duration_seconds
|
||||
if self.tags:
|
||||
payload["tags"] = self.tags
|
||||
if self.hash_hex:
|
||||
payload["hash"] = self.hash_hex
|
||||
return payload
|
||||
|
||||
|
||||
STORAGE_ORIGINS = {"local", "hydrus", "folder"}
|
||||
|
||||
|
||||
class Search_Store(Cmdlet):
|
||||
"""Class-based search-store cmdlet for searching storage backends."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
super().__init__(
|
||||
name="search-store",
|
||||
summary="Search storage backends (Folder, Hydrus) for files.",
|
||||
usage="search-store [query] [-tag TAG] [-size >100MB|<50MB] [-type audio|video|image] [-duration >10:00] [-store BACKEND]",
|
||||
arg=[
|
||||
CmdletArg("query", description="Search query string"),
|
||||
CmdletArg("tag", description="Filter by tag (can be used multiple times)"),
|
||||
CmdletArg("size", description="Filter by size: >100MB, <50MB, =10MB"),
|
||||
CmdletArg("type", description="Filter by type: audio, video, image, document"),
|
||||
CmdletArg("duration", description="Filter by duration: >10:00, <1:30:00"),
|
||||
CmdletArg("limit", type="integer", description="Limit results (default: 100)"),
|
||||
CmdletArg("store", description="Search specific storage backend (e.g., 'home', 'test', or 'default')"),
|
||||
],
|
||||
detail=[
|
||||
"Search across storage backends: Folder stores and Hydrus instances",
|
||||
"Use -store to search a specific backend by name",
|
||||
"Filter results by: tag, size, type, duration",
|
||||
"Results include hash for downstream commands (get-file, add-tag, etc.)",
|
||||
"Examples:",
|
||||
"search-store foo # Search all storage backends",
|
||||
"search-store -store home '*' # Search 'home' Hydrus instance",
|
||||
"search-store -store test 'video' # Search 'test' folder store",
|
||||
"search-store song -type audio # Search for audio files",
|
||||
"search-store movie -tag action # Search with tag filter",
|
||||
],
|
||||
exec=self.run,
|
||||
)
|
||||
self.register()
|
||||
|
||||
# --- Helper methods -------------------------------------------------
|
||||
@staticmethod
|
||||
def _normalize_extension(ext_value: Any) -> str:
|
||||
"""Sanitize extension strings to alphanumerics and cap at 5 chars."""
|
||||
ext = str(ext_value or "").strip().lstrip(".")
|
||||
for sep in (" ", "|", "(", "[", "{", ",", ";"):
|
||||
if sep in ext:
|
||||
ext = ext.split(sep, 1)[0]
|
||||
break
|
||||
if "." in ext:
|
||||
ext = ext.split(".")[-1]
|
||||
ext = "".join(ch for ch in ext if ch.isalnum())
|
||||
return ext[:5]
|
||||
|
||||
def _ensure_storage_columns(self, payload: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""Ensure storage results have the necessary fields for result_table display."""
|
||||
store_value = str(get_origin(payload, "") or "").lower()
|
||||
if store_value not in STORAGE_ORIGINS:
|
||||
return payload
|
||||
|
||||
# Ensure we have title field
|
||||
if "title" not in payload:
|
||||
payload["title"] = payload.get("name") or payload.get("target") or payload.get("path") or "Result"
|
||||
|
||||
# Ensure we have ext field
|
||||
if "ext" not in payload:
|
||||
title = str(payload.get("title", ""))
|
||||
path_obj = Path(title)
|
||||
if path_obj.suffix:
|
||||
payload["ext"] = self._normalize_extension(path_obj.suffix.lstrip('.'))
|
||||
else:
|
||||
payload["ext"] = payload.get("ext", "")
|
||||
|
||||
# Ensure size_bytes is present for display (already set by search_file())
|
||||
# result_table will handle formatting it
|
||||
|
||||
# Don't create manual columns - let result_table handle display
|
||||
# This allows the table to respect max_columns and apply consistent formatting
|
||||
return payload
|
||||
|
||||
# --- Execution ------------------------------------------------------
|
||||
def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Search storage backends for files."""
|
||||
if should_show_help(args):
|
||||
log(f"Cmdlet: {self.name}\nSummary: {self.summary}\nUsage: {self.usage}")
|
||||
return 0
|
||||
|
||||
args_list = [str(arg) for arg in (args or [])]
|
||||
|
||||
# Parse arguments
|
||||
query = ""
|
||||
tag_filters: List[str] = []
|
||||
size_filter: Optional[Tuple[str, int]] = None
|
||||
duration_filter: Optional[Tuple[str, float]] = None
|
||||
type_filter: Optional[str] = None
|
||||
storage_backend: Optional[str] = None
|
||||
limit = 100
|
||||
searched_backends: List[str] = []
|
||||
|
||||
i = 0
|
||||
while i < len(args_list):
|
||||
arg = args_list[i]
|
||||
low = arg.lower()
|
||||
if low in {"-store", "--store", "-storage", "--storage"} and i + 1 < len(args_list):
|
||||
storage_backend = args_list[i + 1]
|
||||
i += 2
|
||||
elif low in {"-tag", "--tag"} and i + 1 < len(args_list):
|
||||
tag_filters.append(args_list[i + 1])
|
||||
i += 2
|
||||
elif low in {"-limit", "--limit"} and i + 1 < len(args_list):
|
||||
try:
|
||||
limit = int(args_list[i + 1])
|
||||
except ValueError:
|
||||
limit = 100
|
||||
i += 2
|
||||
elif low in {"-type", "--type"} and i + 1 < len(args_list):
|
||||
type_filter = args_list[i + 1].lower()
|
||||
i += 2
|
||||
elif not arg.startswith("-"):
|
||||
query = f"{query} {arg}".strip() if query else arg
|
||||
i += 1
|
||||
else:
|
||||
i += 1
|
||||
|
||||
store_filter: Optional[str] = None
|
||||
if query:
|
||||
match = re.search(r"\bstore:([^\s,]+)", query, flags=re.IGNORECASE)
|
||||
if match:
|
||||
store_filter = match.group(1).strip() or None
|
||||
query = re.sub(r"\s*[,]?\s*store:[^\s,]+", " ", query, flags=re.IGNORECASE)
|
||||
query = re.sub(r"\s{2,}", " ", query)
|
||||
query = query.strip().strip(',')
|
||||
|
||||
if store_filter and not storage_backend:
|
||||
storage_backend = store_filter
|
||||
|
||||
if not query:
|
||||
log("Provide a search query", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
from helper.folder_store import FolderDB
|
||||
from config import get_local_storage_path
|
||||
import uuid
|
||||
worker_id = str(uuid.uuid4())
|
||||
library_root = get_local_storage_path(config or {})
|
||||
if not library_root:
|
||||
log("No library root configured", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
# Use context manager to ensure database is always closed
|
||||
with FolderDB(library_root) as db:
|
||||
try:
|
||||
db.insert_worker(
|
||||
worker_id,
|
||||
"search-store",
|
||||
title=f"Search: {query}",
|
||||
description=f"Query: {query}",
|
||||
pipe=ctx.get_current_command_text()
|
||||
)
|
||||
|
||||
results_list = []
|
||||
import result_table
|
||||
import importlib
|
||||
importlib.reload(result_table)
|
||||
from result_table import ResultTable
|
||||
|
||||
table_title = f"Search: {query}"
|
||||
if storage_backend:
|
||||
table_title += f" [{storage_backend}]"
|
||||
|
||||
table = ResultTable(table_title)
|
||||
|
||||
from helper.store import FileStorage
|
||||
storage = FileStorage(config=config or {})
|
||||
|
||||
backend_to_search = storage_backend or None
|
||||
if backend_to_search:
|
||||
searched_backends.append(backend_to_search)
|
||||
target_backend = storage[backend_to_search]
|
||||
if not callable(getattr(target_backend, 'search_file', None)):
|
||||
log(f"Backend '{backend_to_search}' does not support searching", file=sys.stderr)
|
||||
db.update_worker_status(worker_id, 'error')
|
||||
return 1
|
||||
results = target_backend.search_file(query, limit=limit)
|
||||
else:
|
||||
from helper.hydrus import is_hydrus_available
|
||||
hydrus_available = is_hydrus_available(config or {})
|
||||
|
||||
all_results = []
|
||||
for backend_name in storage.list_searchable_backends():
|
||||
if backend_name.startswith("hydrus") and not hydrus_available:
|
||||
continue
|
||||
searched_backends.append(backend_name)
|
||||
try:
|
||||
backend_results = storage[backend_name].search_file(query, limit=limit - len(all_results))
|
||||
if backend_results:
|
||||
all_results.extend(backend_results)
|
||||
if len(all_results) >= limit:
|
||||
break
|
||||
except Exception as exc:
|
||||
log(f"Backend {backend_name} search failed: {exc}", file=sys.stderr)
|
||||
results = all_results[:limit]
|
||||
|
||||
def _format_storage_label(name: str) -> str:
|
||||
clean = str(name or "").strip()
|
||||
if not clean:
|
||||
return "Unknown"
|
||||
return clean.replace("_", " ").title()
|
||||
|
||||
storage_counts: OrderedDict[str, int] = OrderedDict((name, 0) for name in searched_backends)
|
||||
for item in results or []:
|
||||
origin = get_origin(item)
|
||||
if not origin:
|
||||
continue
|
||||
key = str(origin).lower()
|
||||
if key not in storage_counts:
|
||||
storage_counts[key] = 0
|
||||
storage_counts[key] += 1
|
||||
|
||||
if storage_counts or query:
|
||||
display_counts = OrderedDict((_format_storage_label(name), count) for name, count in storage_counts.items())
|
||||
summary_line = table.set_storage_summary(display_counts, query, inline=True)
|
||||
if summary_line:
|
||||
table.title = summary_line
|
||||
|
||||
if results:
|
||||
for item in results:
|
||||
def _as_dict(obj: Any) -> Dict[str, Any]:
|
||||
if isinstance(obj, dict):
|
||||
return dict(obj)
|
||||
if hasattr(obj, "to_dict") and callable(getattr(obj, "to_dict")):
|
||||
return obj.to_dict() # type: ignore[arg-type]
|
||||
return {"title": str(obj)}
|
||||
|
||||
item_dict = _as_dict(item)
|
||||
if store_filter:
|
||||
origin_val = str(get_origin(item_dict) or "").lower()
|
||||
if store_filter != origin_val:
|
||||
continue
|
||||
normalized = self._ensure_storage_columns(item_dict)
|
||||
|
||||
# Make hash/store available for downstream cmdlets without rerunning search
|
||||
hash_val = normalized.get("hash")
|
||||
store_val = normalized.get("store") or get_origin(item_dict)
|
||||
if hash_val and not normalized.get("hash"):
|
||||
normalized["hash"] = hash_val
|
||||
if store_val and not normalized.get("store"):
|
||||
normalized["store"] = store_val
|
||||
|
||||
table.add_result(normalized)
|
||||
|
||||
results_list.append(normalized)
|
||||
ctx.emit(normalized)
|
||||
|
||||
# Debug: Verify table rows match items list
|
||||
debug(f"[search-store] Added {len(table.rows)} rows to table, {len(results_list)} items to results_list")
|
||||
if len(table.rows) != len(results_list):
|
||||
debug(f"[search-store] WARNING: Table/items mismatch! rows={len(table.rows)} items={len(results_list)}", file=sys.stderr)
|
||||
|
||||
ctx.set_last_result_table(table, results_list)
|
||||
db.append_worker_stdout(worker_id, json.dumps(results_list, indent=2))
|
||||
else:
|
||||
log("No results found", file=sys.stderr)
|
||||
db.append_worker_stdout(worker_id, json.dumps([], indent=2))
|
||||
|
||||
db.update_worker_status(worker_id, 'completed')
|
||||
return 0
|
||||
|
||||
except Exception as exc:
|
||||
log(f"Search failed: {exc}", file=sys.stderr)
|
||||
import traceback
|
||||
traceback.print_exc(file=sys.stderr)
|
||||
try:
|
||||
db.update_worker_status(worker_id, 'error')
|
||||
except Exception:
|
||||
pass
|
||||
return 1
|
||||
|
||||
|
||||
CMDLET = Search_Store()
|
||||
@@ -26,12 +26,12 @@ CMDLET = Cmdlet(
|
||||
name="trim-file",
|
||||
summary="Trim a media file using ffmpeg.",
|
||||
usage="trim-file [-path <path>] -range <start-end> [-delete]",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg("-path", description="Path to the file (optional if piped)."),
|
||||
CmdletArg("-range", required=True, description="Time range to trim (e.g. '3:45-3:55' or '00:03:45-00:03:55')."),
|
||||
CmdletArg("-delete", type="flag", description="Delete the original file after trimming."),
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"Creates a new file with 'clip_' prefix in the filename/title.",
|
||||
"Inherits tags from the source file.",
|
||||
"Adds a relationship to the source file (if hash is available).",
|
||||
@@ -133,7 +133,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
# If path arg provided, add it to inputs
|
||||
if path_arg:
|
||||
inputs.append({"file_path": path_arg})
|
||||
inputs.append({"path": path_arg})
|
||||
|
||||
if not inputs:
|
||||
log("No input files provided.", file=sys.stderr)
|
||||
@@ -145,9 +145,9 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Resolve file path
|
||||
file_path = None
|
||||
if isinstance(item, dict):
|
||||
file_path = item.get("file_path") or item.get("path") or item.get("target")
|
||||
elif hasattr(item, "file_path"):
|
||||
file_path = item.file_path
|
||||
file_path = item.get("path") or item.get("target")
|
||||
elif hasattr(item, "path"):
|
||||
file_path = item.path
|
||||
elif isinstance(item, str):
|
||||
file_path = item
|
||||
|
||||
@@ -175,9 +175,9 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# 1. Get source hash for relationship
|
||||
source_hash = None
|
||||
if isinstance(item, dict):
|
||||
source_hash = item.get("hash") or item.get("file_hash")
|
||||
elif hasattr(item, "file_hash"):
|
||||
source_hash = item.file_hash
|
||||
source_hash = item.get("hash")
|
||||
elif hasattr(item, "hash"):
|
||||
source_hash = item.hash
|
||||
|
||||
if not source_hash:
|
||||
try:
|
||||
@@ -219,18 +219,18 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
# Update original file in local DB if possible
|
||||
try:
|
||||
from config import get_local_storage_path
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
storage_path = get_local_storage_path(config)
|
||||
if storage_path:
|
||||
with LocalLibraryDB(storage_path) as db:
|
||||
with FolderDB(storage_path) as db:
|
||||
# Get original file metadata
|
||||
# We need to find the original file by hash or path
|
||||
# Try path first
|
||||
orig_meta = db.get_metadata(path_obj)
|
||||
if not orig_meta and source_hash:
|
||||
# Try by hash
|
||||
orig_path_resolved = db.search_by_hash(source_hash)
|
||||
orig_path_resolved = db.search_hash(source_hash)
|
||||
if orig_path_resolved:
|
||||
orig_meta = db.get_metadata(orig_path_resolved)
|
||||
|
||||
@@ -256,7 +256,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
orig_meta["hash"] = source_hash
|
||||
|
||||
# We need the path to save
|
||||
save_path = Path(orig_meta.get("file_path") or path_obj)
|
||||
save_path = Path(orig_meta.get("path") or path_obj)
|
||||
db.save_metadata(save_path, orig_meta)
|
||||
log(f"Updated relationship for original file: {save_path.name}", file=sys.stderr)
|
||||
except Exception as e:
|
||||
@@ -264,7 +264,6 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
# 5. Construct result
|
||||
result_dict = {
|
||||
"file_path": str(output_path),
|
||||
"path": str(output_path),
|
||||
"title": new_title,
|
||||
"tags": new_tags,
|
||||
|
||||
@@ -135,10 +135,10 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name=".adjective",
|
||||
aliases=["adj"],
|
||||
alias=["adj"],
|
||||
summary="Manage adjective categories and tags",
|
||||
usage=".adjective [category] [-add tag] [-delete tag]",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg(name="category", type="string", description="Category name", required=False),
|
||||
CmdletArg(name="tag", type="string", description="Tag name", required=False),
|
||||
CmdletArg(name="add", type="flag", description="Add tag"),
|
||||
|
||||
183
cmdnats/help.py
Normal file
183
cmdnats/help.py
Normal file
@@ -0,0 +1,183 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, Sequence, List, Optional
|
||||
import shlex
|
||||
import sys
|
||||
|
||||
from cmdlets._shared import Cmdlet, CmdletArg, parse_cmdlet_args
|
||||
from helper.logger import log
|
||||
from result_table import ResultTable
|
||||
import pipeline as ctx
|
||||
|
||||
|
||||
def _normalize_choice_list(arg_names: Optional[List[str]]) -> List[str]:
|
||||
return sorted(set(arg_names or []))
|
||||
|
||||
|
||||
def _examples_for_cmd(name: str) -> List[str]:
|
||||
"""Return example invocations for a given command (best-effort)."""
|
||||
lookup = {
|
||||
".adjective": [
|
||||
'.adjective -add "example"',
|
||||
'.adjective -delete "example"',
|
||||
],
|
||||
}
|
||||
|
||||
key = name.replace("_", "-").lower()
|
||||
return lookup.get(key, [])
|
||||
|
||||
|
||||
def _find_cmd_metadata(name: str, metadata: Dict[str, Dict[str, Any]]) -> Optional[Dict[str, Any]]:
|
||||
target = name.replace("_", "-").lower()
|
||||
for cmd_name, meta in metadata.items():
|
||||
if target == cmd_name:
|
||||
return meta
|
||||
aliases = meta.get("aliases", []) or []
|
||||
if target in aliases:
|
||||
return meta
|
||||
return None
|
||||
|
||||
|
||||
def _render_list(metadata: Dict[str, Dict[str, Any]], filter_text: Optional[str], args: Sequence[str]) -> None:
|
||||
table = ResultTable("Help")
|
||||
table.set_source_command(".help", list(args))
|
||||
|
||||
items: List[Dict[str, Any]] = []
|
||||
needle = (filter_text or "").lower().strip()
|
||||
|
||||
for name in sorted(metadata.keys()):
|
||||
meta = metadata[name]
|
||||
summary = meta.get("summary", "") or ""
|
||||
if needle and needle not in name.lower() and needle not in summary.lower():
|
||||
continue
|
||||
|
||||
row = table.add_row()
|
||||
row.add_column("Cmd", name)
|
||||
aliases = ", ".join(meta.get("aliases", []) or [])
|
||||
row.add_column("Aliases", aliases)
|
||||
arg_names = [a.get("name") for a in meta.get("args", []) if a.get("name")]
|
||||
row.add_column("Args", ", ".join(f"-{a}" for a in arg_names))
|
||||
table.set_row_selection_args(len(table.rows) - 1, ["-cmd", name])
|
||||
items.append(meta)
|
||||
|
||||
ctx.set_last_result_table(table, items)
|
||||
ctx.set_current_stage_table(table)
|
||||
print(table)
|
||||
|
||||
|
||||
def _render_detail(meta: Dict[str, Any], args: Sequence[str]) -> None:
|
||||
title = f"Help: {meta.get('name', '') or 'cmd'}"
|
||||
table = ResultTable(title)
|
||||
table.set_source_command(".help", list(args))
|
||||
|
||||
header_lines: List[str] = []
|
||||
summary = meta.get("summary", "")
|
||||
usage = meta.get("usage", "")
|
||||
aliases = meta.get("aliases", []) or []
|
||||
examples = _examples_for_cmd(meta.get("name", ""))
|
||||
first_example_tokens: List[str] = []
|
||||
first_example_cmd: Optional[str] = None
|
||||
if examples:
|
||||
try:
|
||||
split_tokens = shlex.split(examples[0])
|
||||
if split_tokens:
|
||||
first_example_cmd = split_tokens[0]
|
||||
first_example_tokens = split_tokens[1:]
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if summary:
|
||||
header_lines.append(summary)
|
||||
if usage:
|
||||
header_lines.append(f"Usage: {usage}")
|
||||
if aliases:
|
||||
header_lines.append("Aliases: " + ", ".join(aliases))
|
||||
if examples:
|
||||
header_lines.append("Examples: " + " | ".join(examples))
|
||||
if header_lines:
|
||||
table.set_header_lines(header_lines)
|
||||
|
||||
args_meta = meta.get("args", []) or []
|
||||
example_text = " | ".join(examples)
|
||||
# If we have an example, use it as the source command so @N runs that example
|
||||
if first_example_cmd:
|
||||
table.set_source_command(first_example_cmd, [])
|
||||
if not args_meta:
|
||||
row = table.add_row()
|
||||
row.add_column("Arg", "(none)")
|
||||
row.add_column("Type", "")
|
||||
row.add_column("Req", "")
|
||||
row.add_column("Description", "")
|
||||
row.add_column("Example", example_text)
|
||||
if first_example_tokens:
|
||||
table.set_row_selection_args(len(table.rows) - 1, first_example_tokens)
|
||||
else:
|
||||
for arg in args_meta:
|
||||
row = table.add_row()
|
||||
name = arg.get("name") or ""
|
||||
row.add_column("Arg", f"-{name}" if name else "")
|
||||
row.add_column("Type", arg.get("type", ""))
|
||||
row.add_column("Req", "yes" if arg.get("required") else "")
|
||||
desc = arg.get("description", "") or ""
|
||||
choices = arg.get("choices", []) or []
|
||||
if choices:
|
||||
choice_text = f"choices: {', '.join(choices)}"
|
||||
desc = f"{desc} ({choice_text})" if desc else choice_text
|
||||
row.add_column("Description", desc)
|
||||
row.add_column("Example", example_text)
|
||||
if first_example_tokens:
|
||||
table.set_row_selection_args(len(table.rows) - 1, first_example_tokens)
|
||||
|
||||
ctx.set_last_result_table_overlay(table, [meta])
|
||||
ctx.set_current_stage_table(table)
|
||||
print(table)
|
||||
|
||||
|
||||
def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
try:
|
||||
from helper import cmdlet_catalog as _catalog
|
||||
|
||||
CMDLET.arg[0].choices = _normalize_choice_list(_catalog.list_cmdlet_names())
|
||||
metadata = _catalog.list_cmdlet_metadata()
|
||||
except Exception:
|
||||
CMDLET.arg[0].choices = []
|
||||
metadata = {}
|
||||
|
||||
parsed = parse_cmdlet_args(args, CMDLET)
|
||||
|
||||
filter_text = parsed.get("filter")
|
||||
cmd_arg = parsed.get("cmd")
|
||||
|
||||
if cmd_arg:
|
||||
target_meta = _find_cmd_metadata(str(cmd_arg), metadata)
|
||||
if not target_meta:
|
||||
log(f"Unknown command: {cmd_arg}", file=sys.stderr)
|
||||
return 1
|
||||
_render_detail(target_meta, args)
|
||||
return 0
|
||||
|
||||
_render_list(metadata, filter_text, args)
|
||||
return 0
|
||||
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name=".help",
|
||||
alias=["help", "?"],
|
||||
summary="Show cmdlets or detailed help",
|
||||
usage=".help [cmd] [-filter text]",
|
||||
arg=[
|
||||
CmdletArg(
|
||||
name="cmd",
|
||||
type="string",
|
||||
description="Cmdlet name to show detailed help",
|
||||
required=False,
|
||||
choices=[],
|
||||
),
|
||||
CmdletArg(
|
||||
name="-filter",
|
||||
type="string",
|
||||
description="Filter cmdlets by substring",
|
||||
required=False,
|
||||
),
|
||||
],
|
||||
)
|
||||
@@ -3,95 +3,22 @@ import sys
|
||||
from cmdlets._shared import Cmdlet, CmdletArg, parse_cmdlet_args
|
||||
from helper.logger import log, debug
|
||||
from result_table import ResultTable
|
||||
from helper.file_storage import MatrixStorageBackend
|
||||
# REFACTOR: Commenting out Matrix import until provider refactor is complete
|
||||
# from helper.store import MatrixStorageBackend
|
||||
from config import save_config, load_config
|
||||
import pipeline as ctx
|
||||
|
||||
def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
parsed = parse_cmdlet_args(args, CMDLET)
|
||||
|
||||
# Initialize backend
|
||||
backend = MatrixStorageBackend()
|
||||
|
||||
# Get current default room
|
||||
matrix_conf = config.get('storage', {}).get('matrix', {})
|
||||
current_room_id = matrix_conf.get('room_id')
|
||||
|
||||
# Fetch rooms
|
||||
debug("Fetching joined rooms from Matrix...")
|
||||
rooms = backend.list_rooms(config)
|
||||
|
||||
if not rooms:
|
||||
debug("No joined rooms found or Matrix not configured.")
|
||||
return 1
|
||||
|
||||
# Handle selection if provided
|
||||
selection = parsed.get("selection")
|
||||
if selection:
|
||||
new_room_id = None
|
||||
selected_room_name = None
|
||||
|
||||
# Try as index (1-based)
|
||||
try:
|
||||
idx = int(selection) - 1
|
||||
if 0 <= idx < len(rooms):
|
||||
selected_room = rooms[idx]
|
||||
new_room_id = selected_room['id']
|
||||
selected_room_name = selected_room['name']
|
||||
except ValueError:
|
||||
# Try as Room ID
|
||||
for room in rooms:
|
||||
if room['id'] == selection:
|
||||
new_room_id = selection
|
||||
selected_room_name = room['name']
|
||||
break
|
||||
|
||||
if new_room_id:
|
||||
# Update config
|
||||
# Load fresh config from disk to avoid saving runtime objects (like WorkerManager)
|
||||
disk_config = load_config()
|
||||
|
||||
if 'storage' not in disk_config: disk_config['storage'] = {}
|
||||
if 'matrix' not in disk_config['storage']: disk_config['storage']['matrix'] = {}
|
||||
|
||||
disk_config['storage']['matrix']['room_id'] = new_room_id
|
||||
save_config(disk_config)
|
||||
|
||||
debug(f"Default Matrix room set to: {selected_room_name} ({new_room_id})")
|
||||
current_room_id = new_room_id
|
||||
else:
|
||||
debug(f"Invalid selection: {selection}")
|
||||
return 1
|
||||
|
||||
# Display table
|
||||
table = ResultTable("Matrix Rooms")
|
||||
for i, room in enumerate(rooms):
|
||||
is_default = (room['id'] == current_room_id)
|
||||
|
||||
row = table.add_row()
|
||||
row.add_column("Default", "*" if is_default else "")
|
||||
row.add_column("Name", room['name'])
|
||||
row.add_column("ID", room['id'])
|
||||
|
||||
# Set selection args so user can type @N to select
|
||||
# This will run .matrix N
|
||||
table.set_row_selection_args(i, [str(i + 1)])
|
||||
|
||||
table.set_source_command(".matrix")
|
||||
|
||||
# Register results
|
||||
ctx.set_last_result_table_overlay(table, rooms)
|
||||
ctx.set_current_stage_table(table)
|
||||
|
||||
print(table)
|
||||
return 0
|
||||
# REFACTOR: Matrix cmdlet temporarily disabled during storage provider refactor
|
||||
log("⚠️ Matrix cmdlet is temporarily disabled during refactor", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name=".matrix",
|
||||
aliases=["matrix", "rooms"],
|
||||
alias=["matrix", "rooms"],
|
||||
summary="List and select default Matrix room",
|
||||
usage=".matrix [selection]",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg(
|
||||
name="selection",
|
||||
type="string",
|
||||
|
||||
448
cmdnats/pipe.py
448
cmdnats/pipe.py
@@ -14,7 +14,7 @@ from helper.mpv_ipc import get_ipc_pipe_path, MPVIPCClient
|
||||
import pipeline as ctx
|
||||
from helper.download import is_url_supported_by_ytdlp
|
||||
|
||||
from helper.local_library import LocalLibrarySearchOptimizer
|
||||
from helper.folder_store import LocalLibrarySearchOptimizer
|
||||
from config import get_local_storage_path, get_hydrus_access_key, get_hydrus_url
|
||||
from hydrus_health_check import get_cookies_file_path
|
||||
|
||||
@@ -35,6 +35,20 @@ def _send_ipc_command(command: Dict[str, Any], silent: bool = False) -> Optional
|
||||
debug(f"IPC Error: {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
|
||||
def _is_mpv_running() -> bool:
|
||||
"""Check if MPV is currently running and accessible via IPC."""
|
||||
try:
|
||||
ipc_pipe = get_ipc_pipe_path()
|
||||
client = MPVIPCClient(socket_path=ipc_pipe)
|
||||
if client.connect():
|
||||
client.disconnect()
|
||||
return True
|
||||
return False
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
|
||||
def _get_playlist(silent: bool = False) -> Optional[List[Dict[str, Any]]]:
|
||||
"""Get the current playlist from MPV. Returns None if MPV is not running."""
|
||||
cmd = {"command": ["get_property", "playlist"], "request_id": 100}
|
||||
@@ -87,8 +101,75 @@ def _extract_target_from_memory_uri(text: str) -> Optional[str]:
|
||||
return None
|
||||
|
||||
|
||||
def _normalize_playlist_target(text: Optional[str]) -> Optional[str]:
|
||||
"""Normalize playlist entry targets for dedupe comparisons."""
|
||||
def _find_hydrus_instance_for_hash(hash_str: str, file_storage: Any) -> Optional[str]:
|
||||
"""Find which Hydrus instance serves a specific file hash.
|
||||
|
||||
Args:
|
||||
hash_str: SHA256 hash (64 hex chars)
|
||||
file_storage: FileStorage instance with Hydrus backends
|
||||
|
||||
Returns:
|
||||
Instance name (e.g., 'home') or None if not found
|
||||
"""
|
||||
# Query each Hydrus backend to see if it has this file
|
||||
for backend_name in file_storage.list_backends():
|
||||
backend = file_storage[backend_name]
|
||||
# Check if this is a Hydrus backend by checking class name
|
||||
backend_class = type(backend).__name__
|
||||
if backend_class != "HydrusNetwork":
|
||||
continue
|
||||
|
||||
try:
|
||||
# Query metadata to see if this instance has the file
|
||||
metadata = backend.get_metadata(hash_str)
|
||||
if metadata:
|
||||
return backend_name
|
||||
except Exception:
|
||||
# This instance doesn't have the file or had an error
|
||||
continue
|
||||
|
||||
return None
|
||||
|
||||
|
||||
def _find_hydrus_instance_by_url(url: str, file_storage: Any) -> Optional[str]:
|
||||
"""Find which Hydrus instance matches a given URL.
|
||||
|
||||
Args:
|
||||
url: Full URL (e.g., http://localhost:45869/get_files/file?hash=...)
|
||||
file_storage: FileStorage instance with Hydrus backends
|
||||
|
||||
Returns:
|
||||
Instance name (e.g., 'home') or None if not found
|
||||
"""
|
||||
from urllib.parse import urlparse
|
||||
|
||||
parsed_target = urlparse(url)
|
||||
target_netloc = parsed_target.netloc.lower()
|
||||
|
||||
# Check each Hydrus backend's URL
|
||||
for backend_name in file_storage.list_backends():
|
||||
backend = file_storage[backend_name]
|
||||
backend_class = type(backend).__name__
|
||||
if backend_class != "HydrusNetwork":
|
||||
continue
|
||||
|
||||
# Get the backend's base URL from its client
|
||||
try:
|
||||
backend_url = backend._client.base_url
|
||||
parsed_backend = urlparse(backend_url)
|
||||
backend_netloc = parsed_backend.netloc.lower()
|
||||
|
||||
# Match by netloc (host:port)
|
||||
if target_netloc == backend_netloc:
|
||||
return backend_name
|
||||
except Exception:
|
||||
continue
|
||||
|
||||
return None
|
||||
|
||||
|
||||
def _normalize_playlist_path(text: Optional[str]) -> Optional[str]:
|
||||
"""Normalize playlist entry paths for dedupe comparisons."""
|
||||
if not text:
|
||||
return None
|
||||
real = _extract_target_from_memory_uri(text) or text
|
||||
@@ -118,8 +199,16 @@ def _normalize_playlist_target(text: Optional[str]) -> Optional[str]:
|
||||
return real.lower()
|
||||
|
||||
|
||||
def _infer_store_from_playlist_item(item: Dict[str, Any]) -> str:
|
||||
"""Infer a friendly store label from an MPV playlist entry."""
|
||||
def _infer_store_from_playlist_item(item: Dict[str, Any], file_storage: Optional[Any] = None) -> str:
|
||||
"""Infer a friendly store label from an MPV playlist entry.
|
||||
|
||||
Args:
|
||||
item: MPV playlist item dict
|
||||
file_storage: Optional FileStorage instance for querying specific backend instances
|
||||
|
||||
Returns:
|
||||
Store label (e.g., 'home', 'work', 'local', 'youtube', etc.)
|
||||
"""
|
||||
name = item.get("filename") if isinstance(item, dict) else None
|
||||
target = str(name or "")
|
||||
|
||||
@@ -130,19 +219,33 @@ def _infer_store_from_playlist_item(item: Dict[str, Any]) -> str:
|
||||
|
||||
# Hydrus hashes: bare 64-hex entries
|
||||
if re.fullmatch(r"[0-9a-f]{64}", target.lower()):
|
||||
# If we have file_storage, query each Hydrus instance to find which one has this hash
|
||||
if file_storage:
|
||||
hash_str = target.lower()
|
||||
hydrus_instance = _find_hydrus_instance_for_hash(hash_str, file_storage)
|
||||
if hydrus_instance:
|
||||
return hydrus_instance
|
||||
return "hydrus"
|
||||
|
||||
lower = target.lower()
|
||||
if lower.startswith("magnet:"):
|
||||
return "magnet"
|
||||
if lower.startswith("hydrus://"):
|
||||
# Extract hash from hydrus:// URL if possible
|
||||
if file_storage:
|
||||
hash_match = re.search(r"[0-9a-f]{64}", target.lower())
|
||||
if hash_match:
|
||||
hash_str = hash_match.group(0)
|
||||
hydrus_instance = _find_hydrus_instance_for_hash(hash_str, file_storage)
|
||||
if hydrus_instance:
|
||||
return hydrus_instance
|
||||
return "hydrus"
|
||||
|
||||
# Windows / UNC paths
|
||||
if re.match(r"^[a-z]:[\\/]", target, flags=re.IGNORECASE) or target.startswith("\\\\"):
|
||||
return "local"
|
||||
|
||||
# file:// URLs
|
||||
# file:// url
|
||||
if lower.startswith("file://"):
|
||||
return "local"
|
||||
|
||||
@@ -162,9 +265,33 @@ def _infer_store_from_playlist_item(item: Dict[str, Any]) -> str:
|
||||
return "soundcloud"
|
||||
if "bandcamp" in host_stripped:
|
||||
return "bandcamp"
|
||||
if "get_files" in path or host_stripped in {"127.0.0.1", "localhost"}:
|
||||
if "get_files" in path or "file?hash=" in path or host_stripped in {"127.0.0.1", "localhost"}:
|
||||
# Hydrus API URL - try to extract hash and find instance
|
||||
if file_storage:
|
||||
# Try to extract hash from URL parameters
|
||||
hash_match = re.search(r"hash=([0-9a-f]{64})", target.lower())
|
||||
if hash_match:
|
||||
hash_str = hash_match.group(1)
|
||||
hydrus_instance = _find_hydrus_instance_for_hash(hash_str, file_storage)
|
||||
if hydrus_instance:
|
||||
return hydrus_instance
|
||||
# If no hash in URL, try matching the base URL to configured instances
|
||||
hydrus_instance = _find_hydrus_instance_by_url(target, file_storage)
|
||||
if hydrus_instance:
|
||||
return hydrus_instance
|
||||
return "hydrus"
|
||||
if re.match(r"^\d+\.\d+\.\d+\.\d+$", host_stripped) and "get_files" in path:
|
||||
# IP-based Hydrus URL
|
||||
if file_storage:
|
||||
hash_match = re.search(r"hash=([0-9a-f]{64})", target.lower())
|
||||
if hash_match:
|
||||
hash_str = hash_match.group(1)
|
||||
hydrus_instance = _find_hydrus_instance_for_hash(hash_str, file_storage)
|
||||
if hydrus_instance:
|
||||
return hydrus_instance
|
||||
hydrus_instance = _find_hydrus_instance_by_url(target, file_storage)
|
||||
if hydrus_instance:
|
||||
return hydrus_instance
|
||||
return "hydrus"
|
||||
|
||||
parts = host_stripped.split('.')
|
||||
@@ -231,15 +358,15 @@ def _build_ytdl_options(config: Optional[Dict[str, Any]], hydrus_header: Optiona
|
||||
return ",".join(opts) if opts else None
|
||||
|
||||
|
||||
def _is_hydrus_target(target: str, hydrus_url: Optional[str]) -> bool:
|
||||
if not target:
|
||||
def _is_hydrus_path(path: str, hydrus_url: Optional[str]) -> bool:
|
||||
if not path:
|
||||
return False
|
||||
lower = target.lower()
|
||||
lower = path.lower()
|
||||
if "hydrus://" in lower:
|
||||
return True
|
||||
parsed = urlparse(target)
|
||||
parsed = urlparse(path)
|
||||
host = (parsed.netloc or "").lower()
|
||||
path = parsed.path or ""
|
||||
path_part = parsed.path or ""
|
||||
if hydrus_url:
|
||||
try:
|
||||
hydrus_host = urlparse(hydrus_url).netloc.lower()
|
||||
@@ -247,9 +374,9 @@ def _is_hydrus_target(target: str, hydrus_url: Optional[str]) -> bool:
|
||||
return True
|
||||
except Exception:
|
||||
pass
|
||||
if "get_files" in path or "file?hash=" in path:
|
||||
if "get_files" in path_part or "file?hash=" in path_part:
|
||||
return True
|
||||
if re.match(r"^\d+\.\d+\.\d+\.\d+$", host) and "get_files" in path:
|
||||
if re.match(r"^\d+\.\d+\.\d+\.\d+$", host) and "get_files" in path_part:
|
||||
return True
|
||||
return False
|
||||
|
||||
@@ -313,6 +440,113 @@ def _monitor_mpv_logs(duration: float = 3.0) -> None:
|
||||
client.disconnect()
|
||||
except Exception:
|
||||
pass
|
||||
def _get_playable_path(item: Any, file_storage: Optional[Any], config: Optional[Dict[str, Any]]) -> Optional[tuple[str, Optional[str]]]:
|
||||
"""Extract a playable path/URL from an item, handling different store types.
|
||||
|
||||
Args:
|
||||
item: Item to extract path from (dict, PipeObject, or string)
|
||||
file_storage: FileStorage instance for querying backends
|
||||
config: Config dict for Hydrus URL
|
||||
|
||||
Returns:
|
||||
Tuple of (path, title) or None if no valid path found
|
||||
"""
|
||||
path = None
|
||||
title = None
|
||||
store = None
|
||||
file_hash = None
|
||||
|
||||
# Extract fields from item - prefer a disk path ('path'), but accept 'url' as fallback for providers
|
||||
if isinstance(item, dict):
|
||||
# Support both canonical 'path' and legacy 'file_path' keys, and provider 'url' keys
|
||||
path = item.get("path") or item.get("file_path")
|
||||
# Fallbacks for provider-style entries where URL is stored in 'url' or 'source_url' or 'target'
|
||||
if not path:
|
||||
path = item.get("url") or item.get("source_url") or item.get("target")
|
||||
if not path:
|
||||
known = item.get("url") or item.get("url") or []
|
||||
if known and isinstance(known, list):
|
||||
path = known[0]
|
||||
title = item.get("title") or item.get("file_title")
|
||||
store = item.get("store") or item.get("storage") or item.get("storage_source") or item.get("origin")
|
||||
file_hash = item.get("hash") or item.get("file_hash") or item.get("hash_hex")
|
||||
elif hasattr(item, "path") or hasattr(item, "url") or hasattr(item, "source_url") or hasattr(item, "store") or hasattr(item, "hash"):
|
||||
# Handle PipeObject / dataclass objects - prefer path, but fall back to url/source_url attributes
|
||||
path = getattr(item, "path", None) or getattr(item, "file_path", None)
|
||||
if not path:
|
||||
path = getattr(item, "url", None) or getattr(item, "source_url", None) or getattr(item, "target", None)
|
||||
if not path:
|
||||
known = getattr(item, "url", None) or (getattr(item, "extra", None) or {}).get("url")
|
||||
if known and isinstance(known, list):
|
||||
path = known[0]
|
||||
title = getattr(item, "title", None) or getattr(item, "file_title", None)
|
||||
store = getattr(item, "store", None) or getattr(item, "origin", None)
|
||||
file_hash = getattr(item, "hash", None)
|
||||
elif isinstance(item, str):
|
||||
path = item
|
||||
|
||||
# Debug: show incoming values
|
||||
try:
|
||||
debug(f"_get_playable_path: store={store}, path={path}, hash={file_hash}")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if not path:
|
||||
return None
|
||||
|
||||
# If we have a store and hash, use store's .pipe() method if available
|
||||
# Skip this for URL-based providers (YouTube, SoundCloud, etc.) which have hash="unknown"
|
||||
# Also skip if path is already a URL (http/https)
|
||||
if store and file_hash and file_hash != "unknown" and file_storage:
|
||||
# Check if this is actually a URL - if so, just return it
|
||||
if path.startswith(("http://", "https://")):
|
||||
return (path, title)
|
||||
|
||||
try:
|
||||
backend = file_storage[store]
|
||||
# Check if backend has a .pipe() method
|
||||
if hasattr(backend, 'pipe') and callable(backend.pipe):
|
||||
pipe_path = backend.pipe(file_hash, config)
|
||||
if pipe_path:
|
||||
path = pipe_path
|
||||
debug(f"Got pipe path from {store} backend: {path}")
|
||||
except KeyError:
|
||||
# Store not found in file_storage - it could be a search provider (youtube, bandcamp, etc.)
|
||||
from helper.provider import get_search_provider
|
||||
try:
|
||||
provider = get_search_provider(store, config or {})
|
||||
if provider and hasattr(provider, 'pipe') and callable(provider.pipe):
|
||||
try:
|
||||
debug(f"Calling provider.pipe for '{store}' with path: {path}")
|
||||
provider_path = provider.pipe(path, config or {})
|
||||
debug(f"provider.pipe returned: {provider_path}")
|
||||
if provider_path:
|
||||
path = provider_path
|
||||
debug(f"Got pipe path from provider '{store}': {path}")
|
||||
except Exception as e:
|
||||
debug(f"Error in provider.pipe for '{store}': {e}", file=sys.stderr)
|
||||
except Exception as e:
|
||||
debug(f"Error calling provider.pipe for '{store}': {e}", file=sys.stderr)
|
||||
except Exception as e:
|
||||
debug(f"Error calling .pipe() on store '{store}': {e}", file=sys.stderr)
|
||||
|
||||
# As a fallback, if a provider exists for this store (e.g., youtube) and
|
||||
# this store is not part of FileStorage backends, call provider.pipe()
|
||||
if store and (not file_storage or store not in (file_storage.list_backends() if file_storage else [])):
|
||||
try:
|
||||
from helper.provider import get_search_provider
|
||||
provider = get_search_provider(store, config or {})
|
||||
if provider and hasattr(provider, 'pipe') and callable(provider.pipe):
|
||||
provider_path = provider.pipe(path, config or {})
|
||||
if provider_path:
|
||||
path = provider_path
|
||||
debug(f"Got pipe path from provider '{store}' (fallback): {path}")
|
||||
except Exception as e:
|
||||
debug(f"Error calling provider.pipe (fallback) for '{store}': {e}", file=sys.stderr)
|
||||
|
||||
return (path, title)
|
||||
|
||||
|
||||
def _queue_items(items: List[Any], clear_first: bool = False, config: Optional[Dict[str, Any]] = None) -> bool:
|
||||
"""Queue items to MPV, starting it if necessary.
|
||||
|
||||
@@ -323,6 +557,12 @@ def _queue_items(items: List[Any], clear_first: bool = False, config: Optional[D
|
||||
Returns:
|
||||
True if MPV was started, False if items were queued via IPC.
|
||||
"""
|
||||
# Debug: print incoming items
|
||||
try:
|
||||
debug(f"_queue_items: count={len(items)} types={[type(i).__name__ for i in items]}")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Just verify cookies are configured, don't try to set via IPC
|
||||
_ensure_ytdl_cookies()
|
||||
|
||||
@@ -333,6 +573,14 @@ def _queue_items(items: List[Any], clear_first: bool = False, config: Optional[D
|
||||
hydrus_url = get_hydrus_url(config) if config is not None else None
|
||||
except Exception:
|
||||
hydrus_url = None
|
||||
|
||||
# Initialize FileStorage for path resolution
|
||||
file_storage = None
|
||||
try:
|
||||
from helper.store import FileStorage
|
||||
file_storage = FileStorage(config or {})
|
||||
except Exception as e:
|
||||
debug(f"Warning: Could not initialize FileStorage: {e}", file=sys.stderr)
|
||||
|
||||
# Dedupe existing playlist before adding more (unless we're replacing it)
|
||||
existing_targets: set[str] = set()
|
||||
@@ -342,7 +590,7 @@ def _queue_items(items: List[Any], clear_first: bool = False, config: Optional[D
|
||||
for idx, pl_item in enumerate(playlist):
|
||||
fname = pl_item.get("filename") if isinstance(pl_item, dict) else str(pl_item)
|
||||
alt = pl_item.get("playlist-path") if isinstance(pl_item, dict) else None
|
||||
norm = _normalize_playlist_target(fname) or _normalize_playlist_target(alt)
|
||||
norm = _normalize_playlist_path(fname) or _normalize_playlist_path(alt)
|
||||
if not norm:
|
||||
continue
|
||||
if norm in existing_targets:
|
||||
@@ -360,25 +608,25 @@ def _queue_items(items: List[Any], clear_first: bool = False, config: Optional[D
|
||||
new_targets: set[str] = set()
|
||||
|
||||
for i, item in enumerate(items):
|
||||
# Extract URL/Path
|
||||
target = None
|
||||
title = None
|
||||
# Debug: show the item being processed
|
||||
try:
|
||||
debug(f"_queue_items: processing idx={i} type={type(item)} repr={repr(item)[:200]}")
|
||||
except Exception:
|
||||
pass
|
||||
# Extract URL/Path using store-aware logic
|
||||
result = _get_playable_path(item, file_storage, config)
|
||||
if not result:
|
||||
debug(f"_queue_items: item idx={i} produced no playable path")
|
||||
continue
|
||||
|
||||
if isinstance(item, dict):
|
||||
target = item.get("target") or item.get("url") or item.get("path") or item.get("filename")
|
||||
title = item.get("title") or item.get("name")
|
||||
elif hasattr(item, "target"):
|
||||
target = item.target
|
||||
title = getattr(item, "title", None)
|
||||
elif isinstance(item, str):
|
||||
target = item
|
||||
target, title = result
|
||||
|
||||
if target:
|
||||
# If we just have a hydrus hash, build a direct file URL for MPV
|
||||
if re.fullmatch(r"[0-9a-f]{64}", str(target).strip().lower()) and hydrus_url:
|
||||
target = f"{hydrus_url.rstrip('/')}/get_files/file?hash={str(target).strip()}"
|
||||
|
||||
norm_key = _normalize_playlist_target(target) or str(target).strip().lower()
|
||||
norm_key = _normalize_playlist_path(target) or str(target).strip().lower()
|
||||
if norm_key in existing_targets or norm_key in new_targets:
|
||||
debug(f"Skipping duplicate playlist entry: {title or target}")
|
||||
continue
|
||||
@@ -386,11 +634,16 @@ def _queue_items(items: List[Any], clear_first: bool = False, config: Optional[D
|
||||
|
||||
# Check if it's a yt-dlp supported URL
|
||||
is_ytdlp = False
|
||||
if target.startswith("http") and is_url_supported_by_ytdlp(target):
|
||||
is_ytdlp = True
|
||||
# Treat any http(s) target as yt-dlp candidate. If the Python yt-dlp
|
||||
# module is available we also check more deeply, but default to True
|
||||
# so MPV can use its ytdl hooks for remote streaming sites.
|
||||
try:
|
||||
is_ytdlp = target.startswith("http") or is_url_supported_by_ytdlp(target)
|
||||
except Exception:
|
||||
is_ytdlp = target.startswith("http")
|
||||
|
||||
# Use memory:// M3U hack to pass title to MPV
|
||||
# Skip for yt-dlp URLs to ensure proper handling
|
||||
# Skip for yt-dlp url to ensure proper handling
|
||||
if title and not is_ytdlp:
|
||||
# Sanitize title for M3U (remove newlines)
|
||||
safe_title = title.replace('\n', ' ').replace('\r', '')
|
||||
@@ -403,8 +656,8 @@ def _queue_items(items: List[Any], clear_first: bool = False, config: Optional[D
|
||||
if clear_first and i == 0:
|
||||
mode = "replace"
|
||||
|
||||
# If this is a Hydrus target, set header property and yt-dlp headers before loading
|
||||
if hydrus_header and _is_hydrus_target(target_to_send, hydrus_url):
|
||||
# If this is a Hydrus path, set header property and yt-dlp headers before loading
|
||||
if hydrus_header and _is_hydrus_path(target_to_send, hydrus_url):
|
||||
header_cmd = {"command": ["set_property", "http-header-fields", hydrus_header], "request_id": 199}
|
||||
_send_ipc_command(header_cmd, silent=True)
|
||||
if ytdl_opts:
|
||||
@@ -412,11 +665,18 @@ def _queue_items(items: List[Any], clear_first: bool = False, config: Optional[D
|
||||
_send_ipc_command(ytdl_cmd, silent=True)
|
||||
|
||||
cmd = {"command": ["loadfile", target_to_send, mode], "request_id": 200}
|
||||
resp = _send_ipc_command(cmd)
|
||||
try:
|
||||
debug(f"Sending MPV loadfile: {target_to_send} mode={mode}")
|
||||
resp = _send_ipc_command(cmd)
|
||||
debug(f"MPV loadfile response: {resp}")
|
||||
except Exception as e:
|
||||
debug(f"Exception sending loadfile to MPV: {e}", file=sys.stderr)
|
||||
resp = None
|
||||
|
||||
if resp is None:
|
||||
# MPV not running (or died)
|
||||
# Start MPV with remaining items
|
||||
debug(f"MPV not running/died while queuing, starting MPV with remaining items: {items[i:]}")
|
||||
_start_mpv(items[i:], config=config)
|
||||
return True
|
||||
elif resp.get("error") == "success":
|
||||
@@ -435,6 +695,14 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
parsed = parse_cmdlet_args(args, CMDLET)
|
||||
|
||||
# Initialize FileStorage for detecting Hydrus instance names
|
||||
file_storage = None
|
||||
try:
|
||||
from helper.store import FileStorage
|
||||
file_storage = FileStorage(config)
|
||||
except Exception as e:
|
||||
debug(f"Warning: Could not initialize FileStorage: {e}", file=sys.stderr)
|
||||
|
||||
# Initialize mpv_started flag
|
||||
mpv_started = False
|
||||
|
||||
@@ -485,7 +753,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
|
||||
# Emit the current item to pipeline
|
||||
result_obj = {
|
||||
'file_path': filename,
|
||||
'path': filename,
|
||||
'title': title,
|
||||
'cmdlet_name': '.pipe',
|
||||
'source': 'pipe',
|
||||
@@ -683,10 +951,20 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
items_to_add = result
|
||||
elif isinstance(result, dict):
|
||||
items_to_add = [result]
|
||||
|
||||
if _queue_items(items_to_add, config=config):
|
||||
else:
|
||||
# Handle PipeObject or any other object type
|
||||
items_to_add = [result]
|
||||
|
||||
# Debug: inspect incoming result and attributes
|
||||
try:
|
||||
debug(f"pipe._run: received result type={type(result)} repr={repr(result)[:200]}")
|
||||
debug(f"pipe._run: attrs path={getattr(result, 'path', None)} url={getattr(result, 'url', None)} store={getattr(result, 'store', None)} hash={getattr(result, 'hash', None)}")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if items_to_add and _queue_items(items_to_add, config=config):
|
||||
mpv_started = True
|
||||
|
||||
|
||||
if items_to_add:
|
||||
# If we added items, we might want to play the first one if nothing is playing?
|
||||
# For now, just list the playlist
|
||||
@@ -760,7 +1038,7 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
return 1
|
||||
else:
|
||||
# Play item
|
||||
if hydrus_header and _is_hydrus_target(filename, hydrus_url):
|
||||
if hydrus_header and _is_hydrus_path(filename, hydrus_url):
|
||||
header_cmd = {"command": ["set_property", "http-header-fields", hydrus_header], "request_id": 198}
|
||||
_send_ipc_command(header_cmd, silent=True)
|
||||
cmd = {"command": ["playlist-play-index", idx], "request_id": 102}
|
||||
@@ -799,28 +1077,84 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
except NameError:
|
||||
table_title = "MPV Playlist"
|
||||
|
||||
table = ResultTable(table_title)
|
||||
table = ResultTable(table_title, preserve_order=True)
|
||||
|
||||
# Convert MPV items to PipeObjects with proper hash and store
|
||||
pipe_objects = []
|
||||
for i, item in enumerate(items):
|
||||
is_current = item.get("current", False)
|
||||
title = _extract_title_from_item(item)
|
||||
store = _infer_store_from_playlist_item(item)
|
||||
|
||||
# Truncate if too long
|
||||
if len(title) > 80:
|
||||
title = title[:77] + "..."
|
||||
filename = item.get("filename", "")
|
||||
|
||||
# Extract the real path/URL from memory:// wrapper if present
|
||||
real_path = _extract_target_from_memory_uri(filename) or filename
|
||||
|
||||
# Try to extract hash from the path/URL
|
||||
file_hash = None
|
||||
store_name = None
|
||||
|
||||
# Check if it's a Hydrus URL
|
||||
if "get_files/file" in real_path or "hash=" in real_path:
|
||||
# Extract hash from Hydrus URL
|
||||
hash_match = re.search(r"hash=([0-9a-f]{64})", real_path.lower())
|
||||
if hash_match:
|
||||
file_hash = hash_match.group(1)
|
||||
# Try to find which Hydrus instance has this file
|
||||
if file_storage:
|
||||
store_name = _find_hydrus_instance_for_hash(file_hash, file_storage)
|
||||
if not store_name:
|
||||
store_name = "hydrus"
|
||||
# Check if it's a hash-based local file
|
||||
elif real_path:
|
||||
# Try to extract hash from filename (e.g., C:\path\1e8c46...a1b2.mp4)
|
||||
path_obj = Path(real_path)
|
||||
stem = path_obj.stem # filename without extension
|
||||
if len(stem) == 64 and all(c in '0123456789abcdef' for c in stem.lower()):
|
||||
file_hash = stem.lower()
|
||||
# Find which folder store has this file
|
||||
if file_storage:
|
||||
for backend_name in file_storage.list_backends():
|
||||
backend = file_storage[backend_name]
|
||||
if type(backend).__name__ == "Folder":
|
||||
# Check if this backend has the file
|
||||
try:
|
||||
result_path = backend.get_file(file_hash)
|
||||
if result_path and result_path.exists():
|
||||
store_name = backend_name
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Fallback to inferred store if we couldn't find it
|
||||
if not store_name:
|
||||
store_name = _infer_store_from_playlist_item(item, file_storage=file_storage)
|
||||
|
||||
# Build PipeObject with proper metadata
|
||||
from models import PipeObject
|
||||
pipe_obj = PipeObject(
|
||||
hash=file_hash or "unknown",
|
||||
store=store_name or "unknown",
|
||||
title=title,
|
||||
path=real_path
|
||||
)
|
||||
pipe_objects.append(pipe_obj)
|
||||
|
||||
# Truncate title for display
|
||||
display_title = title
|
||||
if len(display_title) > 80:
|
||||
display_title = display_title[:77] + "..."
|
||||
|
||||
row = table.add_row()
|
||||
row.add_column("Current", "*" if is_current else "")
|
||||
row.add_column("Store", store)
|
||||
row.add_column("Title", title)
|
||||
row.add_column("Store", store_name or "unknown")
|
||||
row.add_column("Title", display_title)
|
||||
|
||||
table.set_row_selection_args(i, [str(i + 1)])
|
||||
|
||||
table.set_source_command(".pipe")
|
||||
|
||||
# Register results with pipeline context so @N selection works
|
||||
ctx.set_last_result_table_overlay(table, items)
|
||||
# Register PipeObjects (not raw MPV items) with pipeline context
|
||||
ctx.set_last_result_table_overlay(table, pipe_objects)
|
||||
ctx.set_current_stage_table(table)
|
||||
|
||||
print(table)
|
||||
@@ -889,16 +1223,30 @@ def _start_mpv(items: List[Any], config: Optional[Dict[str, Any]] = None) -> Non
|
||||
if items:
|
||||
_queue_items(items, config=config)
|
||||
|
||||
# Auto-play the first item
|
||||
import time
|
||||
time.sleep(0.3) # Give MPV a moment to process the queued items
|
||||
|
||||
# Play the first item (index 0) and unpause
|
||||
play_cmd = {"command": ["playlist-play-index", 0], "request_id": 102}
|
||||
play_resp = _send_ipc_command(play_cmd, silent=True)
|
||||
|
||||
if play_resp and play_resp.get("error") == "success":
|
||||
# Ensure playback starts (unpause)
|
||||
unpause_cmd = {"command": ["set_property", "pause", False], "request_id": 103}
|
||||
_send_ipc_command(unpause_cmd, silent=True)
|
||||
debug("Auto-playing first item")
|
||||
|
||||
except Exception as e:
|
||||
debug(f"Error starting MPV: {e}", file=sys.stderr)
|
||||
|
||||
|
||||
CMDLET = Cmdlet(
|
||||
name=".pipe",
|
||||
aliases=["pipe", "playlist", "queue", "ls-pipe"],
|
||||
alias=["pipe", "playlist", "queue", "ls-pipe"],
|
||||
summary="Manage and play items in the MPV playlist via IPC",
|
||||
usage=".pipe [index|url] [-current] [-clear] [-list] [-url URL]",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg(
|
||||
name="index",
|
||||
type="string", # Changed to string to allow URL detection
|
||||
|
||||
@@ -21,14 +21,14 @@ CMDLET = Cmdlet(
|
||||
name=".worker",
|
||||
summary="Display workers table in result table format.",
|
||||
usage=".worker [status] [-limit N] [@N]",
|
||||
args=[
|
||||
arg=[
|
||||
CmdletArg("status", description="Filter by status: running, completed, error (default: all)"),
|
||||
CmdletArg("limit", type="integer", description="Limit results (default: 100)"),
|
||||
CmdletArg("@N", description="Select worker by index (1-based) and display full logs"),
|
||||
CmdletArg("-id", description="Show full logs for a specific worker"),
|
||||
CmdletArg("-clear", type="flag", description="Remove completed workers from the database"),
|
||||
],
|
||||
details=[
|
||||
detail=[
|
||||
"- Shows all background worker tasks and their output",
|
||||
"- Can filter by status: running, completed, error",
|
||||
"- Search result stdout is captured from each worker",
|
||||
@@ -74,9 +74,9 @@ def _run(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
return 1
|
||||
|
||||
try:
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
with LocalLibraryDB(library_root) as db:
|
||||
with FolderDB(library_root) as db:
|
||||
if options.clear:
|
||||
count = db.clear_finished_workers()
|
||||
log(f"Cleared {count} finished workers.")
|
||||
|
||||
60
config.py
60
config.py
@@ -25,18 +25,28 @@ def _make_cache_key(config_dir: Optional[Path], filename: str, actual_path: Opti
|
||||
def get_hydrus_instance(config: Dict[str, Any], instance_name: str = "home") -> Optional[Dict[str, Any]]:
|
||||
"""Get a specific Hydrus instance config by name.
|
||||
|
||||
Supports both formats:
|
||||
- New: config["storage"]["hydrus"][instance_name] = {"key": "...", "url": "..."}
|
||||
- Old: config["HydrusNetwork"][instance_name] = {"key": "...", "url": "..."}
|
||||
Supports multiple formats:
|
||||
- Current: config["store"]["hydrusnetwork"][instance_name]
|
||||
- Legacy: config["storage"]["hydrus"][instance_name]
|
||||
- Old: config["HydrusNetwork"][instance_name]
|
||||
|
||||
Args:
|
||||
config: Configuration dict
|
||||
instance_name: Name of the Hydrus instance (default: "home")
|
||||
|
||||
Returns:
|
||||
Dict with "key" and "url" keys, or None if not found
|
||||
Dict with access key and URL, or None if not found
|
||||
"""
|
||||
# Try new format first
|
||||
# Try current format first: config["store"]["hydrusnetwork"]["home"]
|
||||
store = config.get("store", {})
|
||||
if isinstance(store, dict):
|
||||
hydrusnetwork = store.get("hydrusnetwork", {})
|
||||
if isinstance(hydrusnetwork, dict):
|
||||
instance = hydrusnetwork.get(instance_name)
|
||||
if isinstance(instance, dict):
|
||||
return instance
|
||||
|
||||
# Try legacy format: config["storage"]["hydrus"]
|
||||
storage = config.get("storage", {})
|
||||
if isinstance(storage, dict):
|
||||
hydrus_config = storage.get("hydrus", {})
|
||||
@@ -45,7 +55,7 @@ def get_hydrus_instance(config: Dict[str, Any], instance_name: str = "home") ->
|
||||
if isinstance(instance, dict):
|
||||
return instance
|
||||
|
||||
# Fall back to old format
|
||||
# Fall back to old format: config["HydrusNetwork"]
|
||||
hydrus_network = config.get("HydrusNetwork")
|
||||
if not isinstance(hydrus_network, dict):
|
||||
return None
|
||||
@@ -60,9 +70,10 @@ def get_hydrus_instance(config: Dict[str, Any], instance_name: str = "home") ->
|
||||
def get_hydrus_access_key(config: Dict[str, Any], instance_name: str = "home") -> Optional[str]:
|
||||
"""Get Hydrus access key for an instance.
|
||||
|
||||
Supports both old flat format and new nested format:
|
||||
Supports multiple formats:
|
||||
- Current: config["store"]["hydrusnetwork"][name]["Hydrus-Client-API-Access-Key"]
|
||||
- Legacy: config["storage"]["hydrus"][name]["key"]
|
||||
- Old: config["HydrusNetwork_Access_Key"]
|
||||
- New: config["HydrusNetwork"][instance_name]["key"]
|
||||
|
||||
Args:
|
||||
config: Configuration dict
|
||||
@@ -72,7 +83,18 @@ def get_hydrus_access_key(config: Dict[str, Any], instance_name: str = "home") -
|
||||
Access key string, or None if not found
|
||||
"""
|
||||
instance = get_hydrus_instance(config, instance_name)
|
||||
key = instance.get("key") if instance else config.get("HydrusNetwork_Access_Key")
|
||||
if instance:
|
||||
# Try current format key name
|
||||
key = instance.get("Hydrus-Client-API-Access-Key")
|
||||
if key:
|
||||
return str(key).strip()
|
||||
# Try legacy key name
|
||||
key = instance.get("key")
|
||||
if key:
|
||||
return str(key).strip()
|
||||
|
||||
# Fall back to old flat format
|
||||
key = config.get("HydrusNetwork_Access_Key")
|
||||
return str(key).strip() if key else None
|
||||
|
||||
|
||||
@@ -140,8 +162,9 @@ def resolve_output_dir(config: Dict[str, Any]) -> Path:
|
||||
def get_local_storage_path(config: Dict[str, Any]) -> Optional[Path]:
|
||||
"""Get local storage path from config.
|
||||
|
||||
Supports both formats:
|
||||
- New: config["storage"]["local"]["path"]
|
||||
Supports multiple formats:
|
||||
- New: config["store"]["folder"]["default"]["path"]
|
||||
- Old: config["storage"]["local"]["path"]
|
||||
- Old: config["Local"]["path"]
|
||||
|
||||
Args:
|
||||
@@ -150,7 +173,18 @@ def get_local_storage_path(config: Dict[str, Any]) -> Optional[Path]:
|
||||
Returns:
|
||||
Path object if found, None otherwise
|
||||
"""
|
||||
# Try new format first
|
||||
# Try new format first: store.folder.default.path
|
||||
store = config.get("store", {})
|
||||
if isinstance(store, dict):
|
||||
folder_config = store.get("folder", {})
|
||||
if isinstance(folder_config, dict):
|
||||
default_config = folder_config.get("default", {})
|
||||
if isinstance(default_config, dict):
|
||||
path_str = default_config.get("path")
|
||||
if path_str:
|
||||
return Path(str(path_str)).expanduser()
|
||||
|
||||
# Fall back to storage.local.path format
|
||||
storage = config.get("storage", {})
|
||||
if isinstance(storage, dict):
|
||||
local_config = storage.get("local", {})
|
||||
@@ -159,7 +193,7 @@ def get_local_storage_path(config: Dict[str, Any]) -> Optional[Path]:
|
||||
if path_str:
|
||||
return Path(str(path_str)).expanduser()
|
||||
|
||||
# Fall back to old format
|
||||
# Fall back to old Local format
|
||||
local_config = config.get("Local", {})
|
||||
if isinstance(local_config, dict):
|
||||
path_str = local_config.get("path")
|
||||
|
||||
@@ -50,7 +50,6 @@ UrlPolicy = _utils.UrlPolicy
|
||||
DownloadOptions = _download.DownloadOptions
|
||||
DownloadError = _download.DownloadError
|
||||
DownloadMediaResult = _download.DownloadMediaResult
|
||||
download_media = _download.download_media
|
||||
is_url_supported_by_ytdlp = _download.is_url_supported_by_ytdlp
|
||||
probe_url = _download.probe_url
|
||||
# Hydrus utilities
|
||||
|
||||
@@ -35,7 +35,7 @@ class AllDebridClient:
|
||||
"""Client for AllDebrid API."""
|
||||
|
||||
# Try both v4 and v3 APIs
|
||||
BASE_URLS = [
|
||||
BASE_url = [
|
||||
"https://api.alldebrid.com/v4",
|
||||
"https://api.alldebrid.com/v3",
|
||||
]
|
||||
@@ -49,7 +49,7 @@ class AllDebridClient:
|
||||
self.api_key = api_key.strip()
|
||||
if not self.api_key:
|
||||
raise AllDebridError("AllDebrid API key is empty")
|
||||
self.base_url = self.BASE_URLS[0] # Start with v4
|
||||
self.base_url = self.BASE_url[0] # Start with v4
|
||||
|
||||
def _request(self, endpoint: str, params: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
|
||||
"""Make a request to AllDebrid API.
|
||||
@@ -738,7 +738,7 @@ def parse_magnet_or_hash(uri: str) -> Optional[str]:
|
||||
def unlock_link_cmdlet(result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:
|
||||
"""Unlock a restricted link using AllDebrid.
|
||||
|
||||
Converts free hosters and restricted links to direct download URLs.
|
||||
Converts free hosters and restricted links to direct download url.
|
||||
|
||||
Usage:
|
||||
unlock-link <link>
|
||||
|
||||
@@ -378,7 +378,7 @@ def download(
|
||||
session: Authenticated requests.Session
|
||||
n_threads: Number of download threads
|
||||
directory: Directory to save images to
|
||||
links: List of image URLs
|
||||
links: List of image url
|
||||
scale: Image resolution (0=highest, 10=lowest)
|
||||
book_id: Archive.org book ID (for re-borrowing)
|
||||
|
||||
|
||||
195
helper/background_notifier.py
Normal file
195
helper/background_notifier.py
Normal file
@@ -0,0 +1,195 @@
|
||||
"""Lightweight console notifier for background WorkerManager tasks.
|
||||
|
||||
Registers a refresh callback on WorkerManager and prints concise updates when
|
||||
workers start, progress, or finish. Intended for CLI background workflows.
|
||||
|
||||
Filters to show only workers related to the current pipeline session to avoid
|
||||
cluttering the terminal with workers from previous sessions.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Callable, Dict, Optional, Set
|
||||
|
||||
from helper.logger import log, debug
|
||||
|
||||
|
||||
class BackgroundNotifier:
|
||||
"""Simple notifier that prints worker status changes for a session."""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
manager: Any,
|
||||
output: Callable[[str], None] = log,
|
||||
session_worker_ids: Optional[Set[str]] = None,
|
||||
only_terminal_updates: bool = False,
|
||||
overlay_mode: bool = False,
|
||||
) -> None:
|
||||
self.manager = manager
|
||||
self.output = output
|
||||
self.session_worker_ids = session_worker_ids if session_worker_ids is not None else set()
|
||||
self.only_terminal_updates = only_terminal_updates
|
||||
self.overlay_mode = overlay_mode
|
||||
self._filter_enabled = session_worker_ids is not None
|
||||
self._last_state: Dict[str, str] = {}
|
||||
|
||||
try:
|
||||
self.manager.add_refresh_callback(self._on_refresh)
|
||||
self.manager.start_auto_refresh()
|
||||
except Exception as exc: # pragma: no cover - best effort
|
||||
debug(f"[notifier] Could not attach refresh callback: {exc}")
|
||||
|
||||
def _render_line(self, worker: Dict[str, Any]) -> Optional[str]:
|
||||
# Use worker_id (the actual worker ID we set) for filtering and display
|
||||
worker_id = str(worker.get("worker_id") or "").strip()
|
||||
if not worker_id:
|
||||
# Fallback to database id if worker_id is not set
|
||||
worker_id = str(worker.get("id") or "").strip()
|
||||
if not worker_id:
|
||||
return None
|
||||
|
||||
status = str(worker.get("status") or "running")
|
||||
progress_val = worker.get("progress") or worker.get("progress_percent")
|
||||
progress = ""
|
||||
if isinstance(progress_val, (int, float)):
|
||||
progress = f" {progress_val:.1f}%"
|
||||
elif progress_val:
|
||||
progress = f" {progress_val}"
|
||||
|
||||
step = str(worker.get("current_step") or worker.get("description") or "").strip()
|
||||
parts = [f"[worker:{worker_id}] {status}{progress}"]
|
||||
if step:
|
||||
parts.append(step)
|
||||
return " - ".join(parts)
|
||||
|
||||
def _on_refresh(self, workers: list[Dict[str, Any]]) -> None:
|
||||
overlay_active_workers = 0
|
||||
|
||||
for worker in workers:
|
||||
# Use worker_id (the actual worker ID we set) for filtering
|
||||
worker_id = str(worker.get("worker_id") or "").strip()
|
||||
if not worker_id:
|
||||
# Fallback to database id if worker_id is not set
|
||||
worker_id = str(worker.get("id") or "").strip()
|
||||
if not worker_id:
|
||||
continue
|
||||
|
||||
# If filtering is enabled, skip workers not in this session
|
||||
if self._filter_enabled and worker_id not in self.session_worker_ids:
|
||||
continue
|
||||
|
||||
status = str(worker.get("status") or "running")
|
||||
|
||||
# Overlay mode: only emit on completion; suppress start/progress spam
|
||||
if self.overlay_mode:
|
||||
if status in ("completed", "finished", "error"):
|
||||
progress_val = worker.get("progress") or worker.get("progress_percent") or ""
|
||||
step = str(worker.get("current_step") or worker.get("description") or "").strip()
|
||||
signature = f"{status}|{progress_val}|{step}"
|
||||
|
||||
if self._last_state.get(worker_id) == signature:
|
||||
continue
|
||||
|
||||
self._last_state[worker_id] = signature
|
||||
line = self._render_line(worker)
|
||||
if line:
|
||||
try:
|
||||
self.output(line)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
self._last_state.pop(worker_id, None)
|
||||
self.session_worker_ids.discard(worker_id)
|
||||
continue
|
||||
|
||||
# For terminal-only mode, emit once when the worker finishes and skip intermediate updates
|
||||
if self.only_terminal_updates:
|
||||
if status in ("completed", "finished", "error"):
|
||||
if self._last_state.get(worker_id) == status:
|
||||
continue
|
||||
self._last_state[worker_id] = status
|
||||
line = self._render_line(worker)
|
||||
if line:
|
||||
try:
|
||||
self.output(line)
|
||||
except Exception:
|
||||
pass
|
||||
# Stop tracking this worker after terminal notification
|
||||
self.session_worker_ids.discard(worker_id)
|
||||
continue
|
||||
|
||||
# Skip finished workers after showing them once (standard verbose mode)
|
||||
if status in ("completed", "finished", "error"):
|
||||
if worker_id in self._last_state:
|
||||
# Already shown, remove from tracking
|
||||
self._last_state.pop(worker_id, None)
|
||||
self.session_worker_ids.discard(worker_id)
|
||||
continue
|
||||
|
||||
progress_val = worker.get("progress") or worker.get("progress_percent") or ""
|
||||
step = str(worker.get("current_step") or worker.get("description") or "").strip()
|
||||
signature = f"{status}|{progress_val}|{step}"
|
||||
|
||||
if self._last_state.get(worker_id) == signature:
|
||||
continue
|
||||
|
||||
self._last_state[worker_id] = signature
|
||||
line = self._render_line(worker)
|
||||
if line:
|
||||
try:
|
||||
self.output(line)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if self.overlay_mode:
|
||||
try:
|
||||
# If nothing active for this session, clear the overlay text
|
||||
if overlay_active_workers == 0:
|
||||
self.output("")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
|
||||
def ensure_background_notifier(
|
||||
manager: Any,
|
||||
output: Callable[[str], None] = log,
|
||||
session_worker_ids: Optional[Set[str]] = None,
|
||||
only_terminal_updates: bool = False,
|
||||
overlay_mode: bool = False,
|
||||
) -> Optional[BackgroundNotifier]:
|
||||
"""Attach a BackgroundNotifier to a WorkerManager if not already present.
|
||||
|
||||
Args:
|
||||
manager: WorkerManager instance
|
||||
output: Function to call for printing updates
|
||||
session_worker_ids: Set of worker IDs belonging to this pipeline session.
|
||||
If None, show all workers. If a set (even empty), only show workers in that set.
|
||||
"""
|
||||
if manager is None:
|
||||
return None
|
||||
|
||||
existing = getattr(manager, "_background_notifier", None)
|
||||
if isinstance(existing, BackgroundNotifier):
|
||||
# Update session IDs if provided
|
||||
if session_worker_ids is not None:
|
||||
existing._filter_enabled = True
|
||||
existing.session_worker_ids.update(session_worker_ids)
|
||||
# Respect the most restrictive setting for terminal-only updates
|
||||
if only_terminal_updates:
|
||||
existing.only_terminal_updates = True
|
||||
# Enable overlay mode if requested later
|
||||
if overlay_mode:
|
||||
existing.overlay_mode = True
|
||||
return existing
|
||||
|
||||
notifier = BackgroundNotifier(
|
||||
manager,
|
||||
output,
|
||||
session_worker_ids=session_worker_ids,
|
||||
only_terminal_updates=only_terminal_updates,
|
||||
overlay_mode=overlay_mode,
|
||||
)
|
||||
try:
|
||||
manager._background_notifier = notifier # type: ignore[attr-defined]
|
||||
except Exception:
|
||||
pass
|
||||
return notifier
|
||||
223
helper/cmdlet_catalog.py
Normal file
223
helper/cmdlet_catalog.py
Normal file
@@ -0,0 +1,223 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from importlib import import_module
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
try:
|
||||
from cmdlets import REGISTRY
|
||||
except Exception:
|
||||
REGISTRY = {} # type: ignore
|
||||
|
||||
try:
|
||||
from cmdnats import register_native_commands as _register_native_commands
|
||||
except Exception:
|
||||
_register_native_commands = None
|
||||
|
||||
|
||||
def ensure_registry_loaded() -> None:
|
||||
"""Ensure native commands are registered into REGISTRY (idempotent)."""
|
||||
if _register_native_commands and REGISTRY is not None:
|
||||
try:
|
||||
_register_native_commands(REGISTRY)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
|
||||
def _normalize_mod_name(mod_name: str) -> str:
|
||||
"""Normalize a command/module name for import resolution."""
|
||||
normalized = (mod_name or "").strip()
|
||||
if normalized.startswith('.'):
|
||||
normalized = normalized.lstrip('.')
|
||||
normalized = normalized.replace('-', '_')
|
||||
return normalized
|
||||
|
||||
|
||||
def import_cmd_module(mod_name: str):
|
||||
"""Import a cmdlet/native module from cmdnats or cmdlets packages."""
|
||||
normalized = _normalize_mod_name(mod_name)
|
||||
if not normalized:
|
||||
return None
|
||||
for package in ("cmdnats", "cmdlets", None):
|
||||
try:
|
||||
qualified = f"{package}.{normalized}" if package else normalized
|
||||
return import_module(qualified)
|
||||
except ModuleNotFoundError:
|
||||
continue
|
||||
except Exception:
|
||||
continue
|
||||
return None
|
||||
|
||||
|
||||
def _normalize_arg(arg: Any) -> Dict[str, Any]:
|
||||
"""Convert a CmdletArg/dict into a plain metadata dict."""
|
||||
if isinstance(arg, dict):
|
||||
name = arg.get("name", "")
|
||||
return {
|
||||
"name": str(name).lstrip("-"),
|
||||
"type": arg.get("type", "string"),
|
||||
"required": bool(arg.get("required", False)),
|
||||
"description": arg.get("description", ""),
|
||||
"choices": arg.get("choices", []) or [],
|
||||
"alias": arg.get("alias", ""),
|
||||
"variadic": arg.get("variadic", False),
|
||||
}
|
||||
|
||||
name = getattr(arg, "name", "") or ""
|
||||
return {
|
||||
"name": str(name).lstrip("-"),
|
||||
"type": getattr(arg, "type", "string"),
|
||||
"required": bool(getattr(arg, "required", False)),
|
||||
"description": getattr(arg, "description", ""),
|
||||
"choices": getattr(arg, "choices", []) or [],
|
||||
"alias": getattr(arg, "alias", ""),
|
||||
"variadic": getattr(arg, "variadic", False),
|
||||
}
|
||||
|
||||
|
||||
def get_cmdlet_metadata(cmd_name: str) -> Optional[Dict[str, Any]]:
|
||||
"""Return normalized metadata for a cmdlet, if available (aliases supported)."""
|
||||
ensure_registry_loaded()
|
||||
normalized = cmd_name.replace("-", "_")
|
||||
mod = import_cmd_module(normalized)
|
||||
data = getattr(mod, "CMDLET", None) if mod else None
|
||||
|
||||
# Fallback: resolve via registered function's module (covers aliases)
|
||||
if data is None:
|
||||
try:
|
||||
reg_fn = (REGISTRY or {}).get(cmd_name.replace('_', '-').lower())
|
||||
if reg_fn:
|
||||
owner_mod = getattr(reg_fn, "__module__", "")
|
||||
if owner_mod:
|
||||
owner = import_module(owner_mod)
|
||||
data = getattr(owner, "CMDLET", None)
|
||||
except Exception:
|
||||
data = None
|
||||
|
||||
if not data:
|
||||
return None
|
||||
|
||||
if hasattr(data, "to_dict"):
|
||||
base = data.to_dict()
|
||||
elif isinstance(data, dict):
|
||||
base = data
|
||||
else:
|
||||
base = {}
|
||||
|
||||
name = getattr(data, "name", base.get("name", cmd_name)) or cmd_name
|
||||
aliases = getattr(data, "aliases", base.get("aliases", [])) or []
|
||||
usage = getattr(data, "usage", base.get("usage", ""))
|
||||
summary = getattr(data, "summary", base.get("summary", ""))
|
||||
details = getattr(data, "details", base.get("details", [])) or []
|
||||
args_list = getattr(data, "args", base.get("args", [])) or []
|
||||
args = [_normalize_arg(arg) for arg in args_list]
|
||||
|
||||
return {
|
||||
"name": str(name).replace("_", "-").lower(),
|
||||
"aliases": [str(a).replace("_", "-").lower() for a in aliases if a],
|
||||
"usage": usage,
|
||||
"summary": summary,
|
||||
"details": details,
|
||||
"args": args,
|
||||
"raw": data,
|
||||
}
|
||||
|
||||
|
||||
def list_cmdlet_metadata() -> Dict[str, Dict[str, Any]]:
|
||||
"""Collect metadata for all registered cmdlets keyed by canonical name."""
|
||||
ensure_registry_loaded()
|
||||
entries: Dict[str, Dict[str, Any]] = {}
|
||||
for reg_name in (REGISTRY or {}).keys():
|
||||
meta = get_cmdlet_metadata(reg_name)
|
||||
canonical = str(reg_name).replace("_", "-").lower()
|
||||
|
||||
if meta:
|
||||
canonical = meta.get("name", canonical)
|
||||
aliases = meta.get("aliases", [])
|
||||
base = entries.get(
|
||||
canonical,
|
||||
{
|
||||
"name": canonical,
|
||||
"aliases": [],
|
||||
"usage": "",
|
||||
"summary": "",
|
||||
"details": [],
|
||||
"args": [],
|
||||
"raw": meta.get("raw"),
|
||||
},
|
||||
)
|
||||
merged_aliases = set(base.get("aliases", [])) | set(aliases)
|
||||
if canonical != reg_name:
|
||||
merged_aliases.add(reg_name)
|
||||
base["aliases"] = sorted(a for a in merged_aliases if a and a != canonical)
|
||||
if not base.get("usage") and meta.get("usage"):
|
||||
base["usage"] = meta["usage"]
|
||||
if not base.get("summary") and meta.get("summary"):
|
||||
base["summary"] = meta["summary"]
|
||||
if not base.get("details") and meta.get("details"):
|
||||
base["details"] = meta["details"]
|
||||
if not base.get("args") and meta.get("args"):
|
||||
base["args"] = meta["args"]
|
||||
if not base.get("raw"):
|
||||
base["raw"] = meta.get("raw")
|
||||
entries[canonical] = base
|
||||
else:
|
||||
entries.setdefault(
|
||||
canonical,
|
||||
{"name": canonical, "aliases": [], "usage": "", "summary": "", "details": [], "args": [], "raw": None},
|
||||
)
|
||||
return entries
|
||||
|
||||
|
||||
def list_cmdlet_names(include_aliases: bool = True) -> List[str]:
|
||||
"""Return sorted cmdlet names (optionally including aliases)."""
|
||||
ensure_registry_loaded()
|
||||
entries = list_cmdlet_metadata()
|
||||
names = set()
|
||||
for meta in entries.values():
|
||||
names.add(meta.get("name", ""))
|
||||
if include_aliases:
|
||||
for alias in meta.get("aliases", []):
|
||||
names.add(alias)
|
||||
return sorted(n for n in names if n)
|
||||
|
||||
|
||||
def get_cmdlet_arg_flags(cmd_name: str) -> List[str]:
|
||||
"""Return flag variants for cmdlet arguments (e.g., -name/--name)."""
|
||||
meta = get_cmdlet_metadata(cmd_name)
|
||||
if not meta:
|
||||
return []
|
||||
|
||||
raw = meta.get("raw")
|
||||
if raw and hasattr(raw, "build_flag_registry"):
|
||||
try:
|
||||
registry = raw.build_flag_registry()
|
||||
flags: List[str] = []
|
||||
for flag_set in registry.values():
|
||||
flags.extend(flag_set)
|
||||
return sorted(set(flags))
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
flags: List[str] = []
|
||||
for arg in meta.get("args", []):
|
||||
name = arg.get("name")
|
||||
if not name:
|
||||
continue
|
||||
flags.append(f"-{name}")
|
||||
flags.append(f"--{name}")
|
||||
alias = arg.get("alias")
|
||||
if alias:
|
||||
flags.append(f"-{alias}")
|
||||
return flags
|
||||
|
||||
|
||||
def get_cmdlet_arg_choices(cmd_name: str, arg_name: str) -> List[str]:
|
||||
"""Return declared choices for a cmdlet argument."""
|
||||
meta = get_cmdlet_metadata(cmd_name)
|
||||
if not meta:
|
||||
return []
|
||||
target = arg_name.lstrip("-")
|
||||
for arg in meta.get("args", []):
|
||||
if arg.get("name") == target:
|
||||
return list(arg.get("choices", []) or [])
|
||||
return []
|
||||
@@ -28,7 +28,6 @@ from helper.logger import log, debug
|
||||
from .utils import ensure_directory, sha256_file
|
||||
from .http_client import HTTPClient
|
||||
from models import DownloadError, DownloadOptions, DownloadMediaResult, DebugLogger, ProgressBar
|
||||
from hydrus_health_check import get_cookies_file_path
|
||||
|
||||
try:
|
||||
import yt_dlp # type: ignore
|
||||
@@ -145,7 +144,7 @@ def list_formats(url: str, no_playlist: bool = False, playlist_items: Optional[s
|
||||
return None
|
||||
|
||||
|
||||
def _download_with_sections_via_cli(url: str, ytdl_options: Dict[str, Any], sections: List[str]) -> tuple[Optional[str], Dict[str, Any]]:
|
||||
def _download_with_sections_via_cli(url: str, ytdl_options: Dict[str, Any], sections: List[str], quiet: bool = False) -> tuple[Optional[str], Dict[str, Any]]:
|
||||
"""Download each section separately so merge-file can combine them.
|
||||
|
||||
yt-dlp with multiple --download-sections args merges them into one file.
|
||||
@@ -204,11 +203,14 @@ def _download_with_sections_via_cli(url: str, ytdl_options: Dict[str, Any], sect
|
||||
info_dict = json.loads(meta_result.stdout.strip())
|
||||
first_section_info = info_dict
|
||||
title_from_first = info_dict.get('title')
|
||||
debug(f"Extracted title from metadata: {title_from_first}")
|
||||
if not quiet:
|
||||
debug(f"Extracted title from metadata: {title_from_first}")
|
||||
except json.JSONDecodeError:
|
||||
debug("Could not parse JSON metadata")
|
||||
if not quiet:
|
||||
debug("Could not parse JSON metadata")
|
||||
except Exception as e:
|
||||
debug(f"Error extracting metadata: {e}")
|
||||
if not quiet:
|
||||
debug(f"Error extracting metadata: {e}")
|
||||
|
||||
# Build yt-dlp command for downloading this section
|
||||
cmd = ["yt-dlp"]
|
||||
@@ -240,8 +242,9 @@ def _download_with_sections_via_cli(url: str, ytdl_options: Dict[str, Any], sect
|
||||
# Add the URL
|
||||
cmd.append(url)
|
||||
|
||||
debug(f"Running yt-dlp for section {section_idx}/{len(sections_list)}: {section}")
|
||||
debug(f"Command: {' '.join(cmd)}")
|
||||
if not quiet:
|
||||
debug(f"Running yt-dlp for section {section_idx}/{len(sections_list)}: {section}")
|
||||
debug(f"Command: {' '.join(cmd)}")
|
||||
|
||||
# Run the subprocess - don't capture output so progress is shown
|
||||
try:
|
||||
@@ -273,13 +276,15 @@ def _build_ytdlp_options(opts: DownloadOptions) -> Dict[str, Any]:
|
||||
"fragment_retries": 10,
|
||||
"http_chunk_size": 10_485_760,
|
||||
"restrictfilenames": True,
|
||||
"progress_hooks": [_progress_callback],
|
||||
"progress_hooks": [] if opts.quiet else [_progress_callback],
|
||||
}
|
||||
|
||||
if opts.cookies_path and opts.cookies_path.is_file():
|
||||
base_options["cookiefile"] = str(opts.cookies_path)
|
||||
else:
|
||||
# Check global cookies file
|
||||
# Check global cookies file lazily to avoid import cycles
|
||||
from hydrus_health_check import get_cookies_file_path # local import
|
||||
|
||||
global_cookies = get_cookies_file_path()
|
||||
if global_cookies:
|
||||
base_options["cookiefile"] = global_cookies
|
||||
@@ -287,7 +292,7 @@ def _build_ytdlp_options(opts: DownloadOptions) -> Dict[str, Any]:
|
||||
# Fallback to browser cookies
|
||||
base_options["cookiesfrombrowser"] = ("chrome",)
|
||||
|
||||
# Add no-playlist option if specified (for single video from playlist URLs)
|
||||
# Add no-playlist option if specified (for single video from playlist url)
|
||||
if opts.no_playlist:
|
||||
base_options["noplaylist"] = True
|
||||
|
||||
@@ -336,7 +341,8 @@ def _build_ytdlp_options(opts: DownloadOptions) -> Dict[str, Any]:
|
||||
if opts.playlist_items:
|
||||
base_options["playlist_items"] = opts.playlist_items
|
||||
|
||||
debug(f"yt-dlp: mode={opts.mode}, format={base_options.get('format')}")
|
||||
if not opts.quiet:
|
||||
debug(f"yt-dlp: mode={opts.mode}, format={base_options.get('format')}")
|
||||
return base_options
|
||||
|
||||
|
||||
@@ -411,8 +417,8 @@ def _extract_sha256(info: Dict[str, Any]) -> Optional[str]:
|
||||
def _get_libgen_download_url(libgen_url: str) -> Optional[str]:
|
||||
"""Extract the actual download link from LibGen redirect URL.
|
||||
|
||||
LibGen URLs like https://libgen.gl/file.php?id=123456 redirect to
|
||||
actual mirror URLs. This follows the redirect chain to get the real file.
|
||||
LibGen url like https://libgen.gl/file.php?id=123456 redirect to
|
||||
actual mirror url. This follows the redirect chain to get the real file.
|
||||
|
||||
Args:
|
||||
libgen_url: LibGen file.php URL
|
||||
@@ -491,6 +497,7 @@ def _download_direct_file(
|
||||
url: str,
|
||||
output_dir: Path,
|
||||
debug_logger: Optional[DebugLogger] = None,
|
||||
quiet: bool = False,
|
||||
) -> DownloadMediaResult:
|
||||
"""Download a direct file (PDF, image, document, etc.) without yt-dlp."""
|
||||
ensure_directory(output_dir)
|
||||
@@ -535,9 +542,11 @@ def _download_direct_file(
|
||||
extracted_name = match.group(1) or match.group(2)
|
||||
if extracted_name:
|
||||
filename = unquote(extracted_name)
|
||||
debug(f"Filename from Content-Disposition: {filename}")
|
||||
if not quiet:
|
||||
debug(f"Filename from Content-Disposition: {filename}")
|
||||
except Exception as e:
|
||||
log(f"Could not get filename from headers: {e}", file=sys.stderr)
|
||||
if not quiet:
|
||||
log(f"Could not get filename from headers: {e}", file=sys.stderr)
|
||||
|
||||
# Fallback if we still don't have a good filename
|
||||
if not filename or "." not in filename:
|
||||
@@ -546,7 +555,8 @@ def _download_direct_file(
|
||||
file_path = output_dir / filename
|
||||
progress_bar = ProgressBar()
|
||||
|
||||
debug(f"Direct download: {filename}")
|
||||
if not quiet:
|
||||
debug(f"Direct download: {filename}")
|
||||
|
||||
try:
|
||||
start_time = time.time()
|
||||
@@ -577,7 +587,8 @@ def _download_direct_file(
|
||||
speed_str=speed_str,
|
||||
eta_str=eta_str,
|
||||
)
|
||||
debug(progress_line)
|
||||
if not quiet:
|
||||
debug(progress_line)
|
||||
last_progress_time[0] = now
|
||||
|
||||
with HTTPClient(timeout=30.0) as client:
|
||||
@@ -585,7 +596,8 @@ def _download_direct_file(
|
||||
|
||||
elapsed = time.time() - start_time
|
||||
avg_speed_str = progress_bar.format_bytes(downloaded_bytes[0] / elapsed if elapsed > 0 else 0) + "/s"
|
||||
debug(f"✓ Downloaded in {elapsed:.1f}s at {avg_speed_str}")
|
||||
if not quiet:
|
||||
debug(f"✓ Downloaded in {elapsed:.1f}s at {avg_speed_str}")
|
||||
|
||||
# For direct file downloads, create minimal info dict without filename as title
|
||||
# This prevents creating duplicate title: tags when filename gets auto-generated
|
||||
@@ -658,375 +670,98 @@ def _download_direct_file(
|
||||
raise DownloadError(f"Error downloading file: {exc}") from exc
|
||||
|
||||
|
||||
def probe_url(url: str, no_playlist: bool = False) -> Optional[Dict[str, Any]]:
|
||||
def probe_url(url: str, no_playlist: bool = False, timeout_seconds: int = 15) -> Optional[Dict[str, Any]]:
|
||||
"""Probe URL to extract metadata WITHOUT downloading.
|
||||
|
||||
Args:
|
||||
url: URL to probe
|
||||
no_playlist: If True, ignore playlists and probe only the single video
|
||||
timeout_seconds: Max seconds to wait for probe (default 15s)
|
||||
|
||||
Returns:
|
||||
Dict with keys: extractor, title, entries (if playlist), duration, etc.
|
||||
Returns None if not supported by yt-dlp.
|
||||
Returns None if not supported by yt-dlp or on timeout.
|
||||
"""
|
||||
if not is_url_supported_by_ytdlp(url):
|
||||
return None
|
||||
|
||||
_ensure_yt_dlp_ready()
|
||||
# Wrap probe in timeout to prevent hanging on large playlists
|
||||
import threading
|
||||
from typing import cast
|
||||
|
||||
assert yt_dlp is not None
|
||||
try:
|
||||
# Extract info without downloading
|
||||
# Use extract_flat='in_playlist' to get full metadata for playlist items
|
||||
ydl_opts = {
|
||||
"quiet": True, # Suppress all output
|
||||
"no_warnings": True,
|
||||
"socket_timeout": 10,
|
||||
"retries": 3,
|
||||
"skip_download": True, # Don't actually download
|
||||
"extract_flat": "in_playlist", # Get playlist with metadata for each entry
|
||||
"noprogress": True, # No progress bars
|
||||
}
|
||||
|
||||
# Add cookies if available
|
||||
global_cookies = get_cookies_file_path()
|
||||
if global_cookies:
|
||||
ydl_opts["cookiefile"] = global_cookies
|
||||
|
||||
# Add no_playlist option if specified
|
||||
if no_playlist:
|
||||
ydl_opts["noplaylist"] = True
|
||||
|
||||
with yt_dlp.YoutubeDL(ydl_opts) as ydl: # type: ignore[arg-type]
|
||||
info = ydl.extract_info(url, download=False)
|
||||
|
||||
if not isinstance(info, dict):
|
||||
return None
|
||||
|
||||
# Extract relevant fields
|
||||
return {
|
||||
"extractor": info.get("extractor", ""),
|
||||
"title": info.get("title", ""),
|
||||
"entries": info.get("entries", []), # Will be populated if playlist
|
||||
"duration": info.get("duration"),
|
||||
"uploader": info.get("uploader"),
|
||||
"description": info.get("description"),
|
||||
"url": url,
|
||||
}
|
||||
except Exception as exc:
|
||||
log(f"Probe failed for {url}: {exc}")
|
||||
return None
|
||||
|
||||
|
||||
def download_media(
|
||||
opts: DownloadOptions,
|
||||
*,
|
||||
debug_logger: Optional[DebugLogger] = None,
|
||||
) -> DownloadMediaResult:
|
||||
"""Download media from URL using yt-dlp or direct HTTP download.
|
||||
result_container: List[Optional[Any]] = [None, None] # [result, error]
|
||||
|
||||
Args:
|
||||
opts: DownloadOptions with url, mode, output_dir, etc.
|
||||
debug_logger: Optional debug logger for troubleshooting
|
||||
|
||||
Returns:
|
||||
DownloadMediaResult with path, info, tags, hash
|
||||
|
||||
Raises:
|
||||
DownloadError: If download fails
|
||||
"""
|
||||
# Handle LibGen URLs specially
|
||||
# file.php redirects to mirrors, get.php is direct from modern API
|
||||
if 'libgen' in opts.url.lower():
|
||||
if '/get.php' in opts.url.lower():
|
||||
# Modern API get.php links are direct downloads from mirrors (not file redirects)
|
||||
log(f"Detected LibGen get.php URL, downloading directly...")
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record("libgen-direct", {"url": opts.url})
|
||||
return _download_direct_file(opts.url, opts.output_dir, debug_logger)
|
||||
elif '/file.php' in opts.url.lower():
|
||||
# Old-style file.php redirects to mirrors, we need to resolve
|
||||
log(f"Detected LibGen file.php URL, resolving to actual mirror...")
|
||||
actual_url = _get_libgen_download_url(opts.url)
|
||||
if actual_url and actual_url != opts.url:
|
||||
log(f"Resolved LibGen URL to mirror: {actual_url}")
|
||||
opts.url = actual_url
|
||||
# After resolution, this will typically be an onion link or direct file
|
||||
# Skip yt-dlp for this (it won't support onion/mirrors), go direct
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record("libgen-resolved", {"original": opts.url, "resolved": actual_url})
|
||||
return _download_direct_file(opts.url, opts.output_dir, debug_logger)
|
||||
else:
|
||||
log(f"Could not resolve LibGen URL, trying direct download anyway", file=sys.stderr)
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record("libgen-resolve-failed", {"url": opts.url})
|
||||
return _download_direct_file(opts.url, opts.output_dir, debug_logger)
|
||||
|
||||
# Handle GoFile shares with a dedicated resolver before yt-dlp/direct fallbacks
|
||||
try:
|
||||
netloc = urlparse(opts.url).netloc.lower()
|
||||
except Exception:
|
||||
netloc = ""
|
||||
if "gofile.io" in netloc:
|
||||
msg = "GoFile links are currently unsupported"
|
||||
debug(msg)
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record("gofile-unsupported", {"url": opts.url})
|
||||
raise DownloadError(msg)
|
||||
|
||||
# Determine if yt-dlp should be used
|
||||
ytdlp_supported = is_url_supported_by_ytdlp(opts.url)
|
||||
if ytdlp_supported:
|
||||
probe_result = probe_url(opts.url, no_playlist=opts.no_playlist)
|
||||
if probe_result is None:
|
||||
log(f"URL supported by yt-dlp but no media detected, falling back to direct download: {opts.url}")
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record("ytdlp-skip-no-media", {"url": opts.url})
|
||||
return _download_direct_file(opts.url, opts.output_dir, debug_logger)
|
||||
else:
|
||||
log(f"URL not supported by yt-dlp, trying direct download: {opts.url}")
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record("direct-file-attempt", {"url": opts.url})
|
||||
return _download_direct_file(opts.url, opts.output_dir, debug_logger)
|
||||
|
||||
_ensure_yt_dlp_ready()
|
||||
|
||||
ytdl_options = _build_ytdlp_options(opts)
|
||||
debug(f"Starting yt-dlp download: {opts.url}")
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record("ytdlp-start", {"url": opts.url})
|
||||
|
||||
assert yt_dlp is not None
|
||||
try:
|
||||
# Debug: show what options we're using
|
||||
if ytdl_options.get("download_sections"):
|
||||
debug(f"[yt-dlp] download_sections: {ytdl_options['download_sections']}")
|
||||
debug(f"[yt-dlp] force_keyframes_at_cuts: {ytdl_options.get('force_keyframes_at_cuts', False)}")
|
||||
|
||||
# Use subprocess when download_sections are present (Python API doesn't support them properly)
|
||||
session_id = None
|
||||
first_section_info = {}
|
||||
if ytdl_options.get("download_sections"):
|
||||
session_id, first_section_info = _download_with_sections_via_cli(opts.url, ytdl_options, ytdl_options.get("download_sections", []))
|
||||
info = None
|
||||
else:
|
||||
with yt_dlp.YoutubeDL(ytdl_options) as ydl: # type: ignore[arg-type]
|
||||
info = ydl.extract_info(opts.url, download=True)
|
||||
except Exception as exc:
|
||||
log(f"yt-dlp failed: {exc}", file=sys.stderr)
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record(
|
||||
"exception",
|
||||
{
|
||||
"phase": "yt-dlp",
|
||||
"error": str(exc),
|
||||
"traceback": traceback.format_exc(),
|
||||
},
|
||||
)
|
||||
raise DownloadError("yt-dlp download failed") from exc
|
||||
|
||||
# If we used subprocess, we need to find the file manually
|
||||
if info is None:
|
||||
# Find files created/modified during this download (after we started)
|
||||
# Look for files matching the expected output template pattern
|
||||
def _do_probe() -> None:
|
||||
try:
|
||||
import glob
|
||||
import time
|
||||
import re
|
||||
_ensure_yt_dlp_ready()
|
||||
|
||||
# Get the expected filename pattern from outtmpl
|
||||
# For sections: "C:\path\{session_id}.section_1_of_3.ext", etc.
|
||||
# For non-sections: "C:\path\title.ext"
|
||||
|
||||
# Wait a moment to ensure files are fully written
|
||||
time.sleep(0.5)
|
||||
|
||||
# List all files in output_dir, sorted by modification time
|
||||
files = sorted(opts.output_dir.iterdir(), key=lambda p: p.stat().st_mtime, reverse=True)
|
||||
if not files:
|
||||
raise FileNotFoundError(f"No files found in {opts.output_dir}")
|
||||
|
||||
# If we downloaded sections, look for files with the session_id pattern
|
||||
if opts.clip_sections and session_id:
|
||||
# Pattern: "{session_id}_1.ext", "{session_id}_2.ext", etc.
|
||||
section_pattern = re.compile(rf'^{re.escape(session_id)}_(\d+)\.')
|
||||
matching_files = [f for f in files if section_pattern.search(f.name)]
|
||||
|
||||
if matching_files:
|
||||
# Sort by section number to ensure correct order
|
||||
def extract_section_num(path: Path) -> int:
|
||||
match = section_pattern.search(path.name)
|
||||
return int(match.group(1)) if match else 999
|
||||
|
||||
matching_files.sort(key=extract_section_num)
|
||||
debug(f"Found {len(matching_files)} section file(s) matching pattern")
|
||||
|
||||
# Now rename section files to use hash-based names
|
||||
# This ensures unique filenames for each section content
|
||||
renamed_files = []
|
||||
|
||||
for idx, section_file in enumerate(matching_files, 1):
|
||||
try:
|
||||
# Calculate hash for the file
|
||||
file_hash = sha256_file(section_file)
|
||||
ext = section_file.suffix
|
||||
new_name = f"{file_hash}{ext}"
|
||||
new_path = opts.output_dir / new_name
|
||||
|
||||
if new_path.exists() and new_path != section_file:
|
||||
# If file with same hash exists, use it and delete the temp one
|
||||
debug(f"File with hash {file_hash} already exists, using existing file.")
|
||||
try:
|
||||
section_file.unlink()
|
||||
except OSError:
|
||||
pass
|
||||
renamed_files.append(new_path)
|
||||
else:
|
||||
section_file.rename(new_path)
|
||||
debug(f"Renamed section file: {section_file.name} → {new_name}")
|
||||
renamed_files.append(new_path)
|
||||
except Exception as e:
|
||||
debug(f"Failed to process section file {section_file.name}: {e}")
|
||||
renamed_files.append(section_file)
|
||||
|
||||
media_path = renamed_files[0]
|
||||
media_paths = renamed_files
|
||||
debug(f"✓ Downloaded {len(media_paths)} section file(s) (session: {session_id})")
|
||||
else:
|
||||
# Fallback to most recent file if pattern not found
|
||||
media_path = files[0]
|
||||
media_paths = None
|
||||
debug(f"✓ Downloaded section file (pattern not found): {media_path.name}")
|
||||
else:
|
||||
# No sections, just take the most recent file
|
||||
media_path = files[0]
|
||||
media_paths = None
|
||||
|
||||
debug(f"✓ Downloaded: {media_path.name}")
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record("ytdlp-file-found", {"path": str(media_path)})
|
||||
except Exception as exc:
|
||||
log(f"Error finding downloaded file: {exc}", file=sys.stderr)
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record(
|
||||
"exception",
|
||||
{"phase": "find-file", "error": str(exc)},
|
||||
)
|
||||
raise DownloadError(str(exc)) from exc
|
||||
|
||||
# Create result with minimal data extracted from filename
|
||||
file_hash = sha256_file(media_path)
|
||||
|
||||
# For section downloads, create tags with the title and build proper info dict
|
||||
tags = []
|
||||
title = ''
|
||||
if first_section_info:
|
||||
title = first_section_info.get('title', '')
|
||||
if title:
|
||||
tags.append(f'title:{title}')
|
||||
debug(f"Added title tag for section download: {title}")
|
||||
|
||||
# Build info dict - always use extracted title if available, not hash
|
||||
if first_section_info:
|
||||
info_dict = first_section_info
|
||||
else:
|
||||
info_dict = {
|
||||
"id": media_path.stem,
|
||||
"title": title or media_path.stem,
|
||||
"ext": media_path.suffix.lstrip(".")
|
||||
assert yt_dlp is not None
|
||||
# Extract info without downloading
|
||||
# Use extract_flat='in_playlist' to get full metadata for playlist items
|
||||
ydl_opts = {
|
||||
"quiet": True, # Suppress all output
|
||||
"no_warnings": True,
|
||||
"socket_timeout": 10,
|
||||
"retries": 2, # Reduce retries for faster timeout
|
||||
"skip_download": True, # Don't actually download
|
||||
"extract_flat": "in_playlist", # Get playlist with metadata for each entry
|
||||
"noprogress": True, # No progress bars
|
||||
}
|
||||
|
||||
return DownloadMediaResult(
|
||||
path=media_path,
|
||||
info=info_dict,
|
||||
tags=tags,
|
||||
source_url=opts.url,
|
||||
hash_value=file_hash,
|
||||
paths=media_paths, # Include all section files if present
|
||||
)
|
||||
|
||||
# Add cookies if available (lazy import to avoid circular dependency)
|
||||
from hydrus_health_check import get_cookies_file_path # local import
|
||||
|
||||
if not isinstance(info, dict):
|
||||
log(f"Unexpected yt-dlp response: {type(info)}", file=sys.stderr)
|
||||
raise DownloadError("Unexpected yt-dlp response type")
|
||||
|
||||
info_dict: Dict[str, Any] = info
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record(
|
||||
"ytdlp-info",
|
||||
{
|
||||
"keys": sorted(info_dict.keys()),
|
||||
"is_playlist": bool(info_dict.get("entries")),
|
||||
},
|
||||
)
|
||||
|
||||
try:
|
||||
entry, media_path = _resolve_entry_and_path(info_dict, opts.output_dir)
|
||||
except FileNotFoundError as exc:
|
||||
log(f"Error: {exc}", file=sys.stderr)
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record(
|
||||
"exception",
|
||||
{"phase": "resolve-path", "error": str(exc)},
|
||||
)
|
||||
raise DownloadError(str(exc)) from exc
|
||||
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record(
|
||||
"resolved-media",
|
||||
{"path": str(media_path), "entry_keys": sorted(entry.keys())},
|
||||
)
|
||||
|
||||
# Extract hash from metadata or compute
|
||||
hash_value = _extract_sha256(entry) or _extract_sha256(info_dict)
|
||||
if not hash_value:
|
||||
try:
|
||||
hash_value = sha256_file(media_path)
|
||||
except OSError as exc:
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record(
|
||||
"hash-error",
|
||||
{"path": str(media_path), "error": str(exc)},
|
||||
)
|
||||
|
||||
# Extract tags using metadata.py
|
||||
tags = []
|
||||
if extract_ytdlp_tags:
|
||||
try:
|
||||
tags = extract_ytdlp_tags(entry)
|
||||
except Exception as e:
|
||||
log(f"Error extracting tags: {e}", file=sys.stderr)
|
||||
|
||||
source_url = (
|
||||
entry.get("webpage_url")
|
||||
or entry.get("original_url")
|
||||
or entry.get("url")
|
||||
)
|
||||
|
||||
debug(f"✓ Downloaded: {media_path.name} ({len(tags)} tags)")
|
||||
if debug_logger is not None:
|
||||
debug_logger.write_record(
|
||||
"downloaded",
|
||||
{
|
||||
"path": str(media_path),
|
||||
"tag_count": len(tags),
|
||||
"source_url": source_url,
|
||||
"sha256": hash_value,
|
||||
},
|
||||
)
|
||||
|
||||
return DownloadMediaResult(
|
||||
path=media_path,
|
||||
info=entry,
|
||||
tags=tags,
|
||||
source_url=source_url,
|
||||
hash_value=hash_value,
|
||||
)
|
||||
global_cookies = get_cookies_file_path()
|
||||
if global_cookies:
|
||||
ydl_opts["cookiefile"] = global_cookies
|
||||
|
||||
# Add no_playlist option if specified
|
||||
if no_playlist:
|
||||
ydl_opts["noplaylist"] = True
|
||||
|
||||
with yt_dlp.YoutubeDL(ydl_opts) as ydl: # type: ignore[arg-type]
|
||||
info = ydl.extract_info(url, download=False)
|
||||
|
||||
if not isinstance(info, dict):
|
||||
result_container[0] = None
|
||||
return
|
||||
|
||||
# Extract relevant fields
|
||||
result_container[0] = {
|
||||
"extractor": info.get("extractor", ""),
|
||||
"title": info.get("title", ""),
|
||||
"entries": info.get("entries", []), # Will be populated if playlist
|
||||
"duration": info.get("duration"),
|
||||
"uploader": info.get("uploader"),
|
||||
"description": info.get("description"),
|
||||
"url": url,
|
||||
}
|
||||
except Exception as exc:
|
||||
log(f"Probe error for {url}: {exc}")
|
||||
result_container[1] = exc
|
||||
|
||||
thread = threading.Thread(target=_do_probe, daemon=False)
|
||||
thread.start()
|
||||
thread.join(timeout=timeout_seconds)
|
||||
|
||||
if thread.is_alive():
|
||||
# Probe timed out - return None to fall back to direct download
|
||||
debug(f"Probe timeout for {url} (>={timeout_seconds}s), proceeding with download")
|
||||
return None
|
||||
|
||||
if result_container[1] is not None:
|
||||
# Probe error - return None to proceed anyway
|
||||
return None
|
||||
|
||||
return cast(Optional[Dict[str, Any]], result_container[0])
|
||||
|
||||
|
||||
__all__ = [
|
||||
"download_media",
|
||||
"is_url_supported_by_ytdlp",
|
||||
"list_formats",
|
||||
"probe_url",
|
||||
"DownloadError",
|
||||
"DownloadOptions",
|
||||
"DownloadMediaResult",
|
||||
]
|
||||
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
File diff suppressed because it is too large
Load Diff
@@ -73,7 +73,7 @@ class HydrusRequestSpec:
|
||||
class HydrusClient:
|
||||
"""Thin wrapper around the Hydrus Client API."""
|
||||
|
||||
base_url: str
|
||||
url: str
|
||||
access_key: str = ""
|
||||
timeout: float = 60.0
|
||||
|
||||
@@ -84,10 +84,10 @@ class HydrusClient:
|
||||
_session_key: str = field(init=False, default="", repr=False) # Cached session key
|
||||
|
||||
def __post_init__(self) -> None:
|
||||
if not self.base_url:
|
||||
if not self.url:
|
||||
raise ValueError("Hydrus base URL is required")
|
||||
self.base_url = self.base_url.rstrip("/")
|
||||
parsed = urlsplit(self.base_url)
|
||||
self.url = self.url.rstrip("/")
|
||||
parsed = urlsplit(self.url)
|
||||
if parsed.scheme not in {"http", "https"}:
|
||||
raise ValueError("Hydrus base URL must use http or https")
|
||||
self.scheme = parsed.scheme
|
||||
@@ -374,24 +374,24 @@ class HydrusClient:
|
||||
hashes = self._ensure_hashes(file_hashes)
|
||||
if len(hashes) == 1:
|
||||
body = {"hash": hashes[0], "url_to_add": url}
|
||||
return self._post("/add_urls/associate_url", data=body)
|
||||
return self._post("/add_url/associate_url", data=body)
|
||||
|
||||
results: dict[str, Any] = {}
|
||||
for file_hash in hashes:
|
||||
body = {"hash": file_hash, "url_to_add": url}
|
||||
results[file_hash] = self._post("/add_urls/associate_url", data=body)
|
||||
results[file_hash] = self._post("/add_url/associate_url", data=body)
|
||||
return {"batched": results}
|
||||
|
||||
def delete_url(self, file_hashes: Union[str, Iterable[str]], url: str) -> dict[str, Any]:
|
||||
hashes = self._ensure_hashes(file_hashes)
|
||||
if len(hashes) == 1:
|
||||
body = {"hash": hashes[0], "url_to_delete": url}
|
||||
return self._post("/add_urls/associate_url", data=body)
|
||||
return self._post("/add_url/associate_url", data=body)
|
||||
|
||||
results: dict[str, Any] = {}
|
||||
for file_hash in hashes:
|
||||
body = {"hash": file_hash, "url_to_delete": url}
|
||||
results[file_hash] = self._post("/add_urls/associate_url", data=body)
|
||||
results[file_hash] = self._post("/add_url/associate_url", data=body)
|
||||
return {"batched": results}
|
||||
|
||||
def set_notes(self, file_hashes: Union[str, Iterable[str]], notes: dict[str, str], service_name: str) -> dict[str, Any]:
|
||||
@@ -517,7 +517,7 @@ class HydrusClient:
|
||||
file_ids: Sequence[int] | None = None,
|
||||
hashes: Sequence[str] | None = None,
|
||||
include_service_keys_to_tags: bool = True,
|
||||
include_file_urls: bool = False,
|
||||
include_file_url: bool = False,
|
||||
include_duration: bool = True,
|
||||
include_size: bool = True,
|
||||
include_mime: bool = False,
|
||||
@@ -535,7 +535,7 @@ class HydrusClient:
|
||||
include_service_keys_to_tags,
|
||||
lambda v: "true" if v else None,
|
||||
),
|
||||
("include_file_urls", include_file_urls, lambda v: "true" if v else None),
|
||||
("include_file_url", include_file_url, lambda v: "true" if v else None),
|
||||
("include_duration", include_duration, lambda v: "true" if v else None),
|
||||
("include_size", include_size, lambda v: "true" if v else None),
|
||||
("include_mime", include_mime, lambda v: "true" if v else None),
|
||||
@@ -559,13 +559,13 @@ class HydrusClient:
|
||||
def file_url(self, file_hash: str) -> str:
|
||||
hash_param = quote(file_hash)
|
||||
# Don't append access_key parameter for file downloads - use header instead
|
||||
url = f"{self.base_url}/get_files/file?hash={hash_param}"
|
||||
url = f"{self.url}/get_files/file?hash={hash_param}"
|
||||
return url
|
||||
|
||||
def thumbnail_url(self, file_hash: str) -> str:
|
||||
hash_param = quote(file_hash)
|
||||
# Don't append access_key parameter for file downloads - use header instead
|
||||
url = f"{self.base_url}/get_files/thumbnail?hash={hash_param}"
|
||||
url = f"{self.url}/get_files/thumbnail?hash={hash_param}"
|
||||
return url
|
||||
|
||||
|
||||
@@ -612,7 +612,7 @@ def hydrus_request(args, parser) -> int:
|
||||
|
||||
parsed = urlsplit(options.url)
|
||||
if parsed.scheme not in ('http', 'https'):
|
||||
parser.error('Only http and https URLs are supported')
|
||||
parser.error('Only http and https url are supported')
|
||||
if not parsed.hostname:
|
||||
parser.error('Invalid Hydrus URL')
|
||||
|
||||
@@ -1064,7 +1064,7 @@ def hydrus_export(args, _parser) -> int:
|
||||
file_hash = getattr(args, 'file_hash', None) or _extract_hash(args.file_url)
|
||||
if hydrus_url and file_hash:
|
||||
try:
|
||||
client = HydrusClient(base_url=hydrus_url, access_key=args.access_key, timeout=args.timeout)
|
||||
client = HydrusClient(url=hydrus_url, access_key=args.access_key, timeout=args.timeout)
|
||||
meta_response = client.fetch_file_metadata(hashes=[file_hash], include_mime=True)
|
||||
entries = meta_response.get('metadata') if isinstance(meta_response, dict) else None
|
||||
if isinstance(entries, list) and entries:
|
||||
@@ -1301,8 +1301,7 @@ def is_available(config: dict[str, Any], use_cache: bool = True) -> tuple[bool,
|
||||
|
||||
Performs a lightweight probe to verify:
|
||||
- Hydrus URL is configured
|
||||
- Hydrus client library is available
|
||||
- Can connect to Hydrus and retrieve services
|
||||
- Can connect to Hydrus URL/port
|
||||
|
||||
Results are cached per session unless use_cache=False.
|
||||
|
||||
@@ -1330,50 +1329,43 @@ def is_available(config: dict[str, Any], use_cache: bool = True) -> tuple[bool,
|
||||
return False, reason
|
||||
|
||||
access_key = get_hydrus_access_key(config, "home") or ""
|
||||
if not access_key:
|
||||
reason = "Hydrus access key not configured"
|
||||
_HYDRUS_AVAILABLE = False
|
||||
_HYDRUS_UNAVAILABLE_REASON = reason
|
||||
return False, reason
|
||||
|
||||
timeout_raw = config.get("HydrusNetwork_Request_Timeout")
|
||||
try:
|
||||
timeout = float(timeout_raw) if timeout_raw is not None else 10.0
|
||||
timeout = float(timeout_raw) if timeout_raw is not None else 5.0
|
||||
except (TypeError, ValueError):
|
||||
timeout = 10.0
|
||||
timeout = 5.0
|
||||
|
||||
try:
|
||||
# Use HTTPClient directly to avoid session key logic and reduce retries
|
||||
# This prevents log spam when Hydrus is offline (avoiding 3 retries x 2 requests)
|
||||
from helper.http_client import HTTPClient
|
||||
# Simple TCP connection test to URL/port
|
||||
import socket
|
||||
from urllib.parse import urlparse
|
||||
|
||||
probe_url = f"{url.rstrip('/')}/get_services"
|
||||
parsed = urlparse(url)
|
||||
hostname = parsed.hostname or 'localhost'
|
||||
port = parsed.port or (443 if parsed.scheme == 'https' else 80)
|
||||
|
||||
headers = {}
|
||||
if access_key:
|
||||
headers["Hydrus-Client-API-Access-Key"] = access_key
|
||||
|
||||
# Suppress HTTPClient logging during probe to avoid "Request failed" logs on startup
|
||||
http_logger = logging.getLogger("helper.http_client")
|
||||
original_level = http_logger.level
|
||||
http_logger.setLevel(logging.CRITICAL)
|
||||
|
||||
# Try to connect to the host/port
|
||||
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
|
||||
sock.settimeout(timeout)
|
||||
try:
|
||||
# Use retries=1 (single attempt, no retry) to fail fast
|
||||
with HTTPClient(timeout=timeout, retries=1, headers=headers, verify_ssl=False) as http:
|
||||
try:
|
||||
response = http.get(probe_url)
|
||||
if response.status_code == 200:
|
||||
_HYDRUS_AVAILABLE = True
|
||||
_HYDRUS_UNAVAILABLE_REASON = None
|
||||
return True, None
|
||||
else:
|
||||
# Even if we get a 4xx/5xx, the service is "reachable" but maybe auth failed
|
||||
# But for "availability" we usually mean "usable".
|
||||
# If auth fails (403), we can't use it, so return False.
|
||||
reason = f"HTTP {response.status_code}: {response.reason_phrase}"
|
||||
_HYDRUS_AVAILABLE = False
|
||||
_HYDRUS_UNAVAILABLE_REASON = reason
|
||||
return False, reason
|
||||
except Exception as e:
|
||||
# This catches connection errors from HTTPClient
|
||||
raise e
|
||||
result = sock.connect_ex((hostname, port))
|
||||
if result == 0:
|
||||
_HYDRUS_AVAILABLE = True
|
||||
_HYDRUS_UNAVAILABLE_REASON = None
|
||||
return True, None
|
||||
else:
|
||||
reason = f"Cannot connect to {hostname}:{port}"
|
||||
_HYDRUS_AVAILABLE = False
|
||||
_HYDRUS_UNAVAILABLE_REASON = reason
|
||||
return False, reason
|
||||
finally:
|
||||
http_logger.setLevel(original_level)
|
||||
sock.close()
|
||||
|
||||
except Exception as exc:
|
||||
reason = str(exc)
|
||||
|
||||
@@ -2,15 +2,29 @@
|
||||
|
||||
import sys
|
||||
import inspect
|
||||
import threading
|
||||
from pathlib import Path
|
||||
|
||||
_DEBUG_ENABLED = False
|
||||
_thread_local = threading.local()
|
||||
|
||||
def set_thread_stream(stream):
|
||||
"""Set a custom output stream for the current thread."""
|
||||
_thread_local.stream = stream
|
||||
|
||||
def get_thread_stream():
|
||||
"""Get the custom output stream for the current thread, if any."""
|
||||
return getattr(_thread_local, 'stream', None)
|
||||
|
||||
def set_debug(enabled: bool) -> None:
|
||||
"""Enable or disable debug logging."""
|
||||
global _DEBUG_ENABLED
|
||||
_DEBUG_ENABLED = enabled
|
||||
|
||||
def is_debug_enabled() -> bool:
|
||||
"""Check if debug logging is enabled."""
|
||||
return _DEBUG_ENABLED
|
||||
|
||||
def debug(*args, **kwargs) -> None:
|
||||
"""Print debug message if debug logging is enabled.
|
||||
|
||||
@@ -18,9 +32,22 @@ def debug(*args, **kwargs) -> None:
|
||||
"""
|
||||
if not _DEBUG_ENABLED:
|
||||
return
|
||||
|
||||
# Check if stderr has been redirected to /dev/null (quiet mode)
|
||||
# If so, skip output to avoid queuing in background worker's capture
|
||||
try:
|
||||
stderr_name = getattr(sys.stderr, 'name', '')
|
||||
if 'nul' in str(stderr_name).lower() or '/dev/null' in str(stderr_name):
|
||||
return
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Check for thread-local stream first
|
||||
stream = get_thread_stream()
|
||||
if stream:
|
||||
kwargs['file'] = stream
|
||||
# Set default to stderr for debug messages
|
||||
if 'file' not in kwargs:
|
||||
elif 'file' not in kwargs:
|
||||
kwargs['file'] = sys.stderr
|
||||
|
||||
# Prepend DEBUG label
|
||||
@@ -59,8 +86,12 @@ def log(*args, **kwargs) -> None:
|
||||
# Get function name
|
||||
func_name = caller_frame.f_code.co_name
|
||||
|
||||
# Check for thread-local stream first
|
||||
stream = get_thread_stream()
|
||||
if stream:
|
||||
kwargs['file'] = stream
|
||||
# Set default to stdout if not specified
|
||||
if 'file' not in kwargs:
|
||||
elif 'file' not in kwargs:
|
||||
kwargs['file'] = sys.stdout
|
||||
|
||||
if add_prefix:
|
||||
|
||||
@@ -96,7 +96,7 @@ class MPVfile:
|
||||
relationship_metadata: Dict[str, Any] = field(default_factory=dict)
|
||||
tags: List[str] = field(default_factory=list)
|
||||
original_tags: Dict[str, str] = field(default_factory=dict)
|
||||
known_urls: List[str] = field(default_factory=list)
|
||||
url: List[str] = field(default_factory=list)
|
||||
title: Optional[str] = None
|
||||
source_url: Optional[str] = None
|
||||
clip_time: Optional[str] = None
|
||||
@@ -128,7 +128,7 @@ class MPVfile:
|
||||
"relationship_metadata": self.relationship_metadata,
|
||||
"tags": self.tags,
|
||||
"original_tags": self.original_tags,
|
||||
"known_urls": self.known_urls,
|
||||
"url": self.url,
|
||||
"title": self.title,
|
||||
"source_url": self.source_url,
|
||||
"clip_time": self.clip_time,
|
||||
@@ -293,10 +293,10 @@ class MPVFileBuilder:
|
||||
if s.tags:
|
||||
s.original_tags = {tag: tag for tag in s.tags}
|
||||
|
||||
# known URLs + last_url
|
||||
s.known_urls = _normalise_string_list(p.get("known_urls"))
|
||||
if self.last_url and self.last_url not in s.known_urls:
|
||||
s.known_urls.append(self.last_url)
|
||||
# known url + last_url
|
||||
s.url = _normalise_string_list(p.get("url"))
|
||||
if self.last_url and self.last_url not in s.url:
|
||||
s.url.append(self.last_url)
|
||||
|
||||
# source URL (explicit or fallback to last_url)
|
||||
explicit_source = p.get("source_url")
|
||||
@@ -500,8 +500,8 @@ class MPVFileBuilder:
|
||||
self._apply_hydrus_result(result)
|
||||
self.state.type = "hydrus"
|
||||
matched_url = result.get("matched_url") or result.get("url")
|
||||
if matched_url and matched_url not in self.state.known_urls:
|
||||
self.state.known_urls.append(matched_url)
|
||||
if matched_url and matched_url not in self.state.url:
|
||||
self.state.url.append(matched_url)
|
||||
# Enrich relationships once we know the hash
|
||||
if self.include_relationships and self.state.hash and self.hydrus_settings.base_url:
|
||||
self._enrich_relationships_from_api(self.state.hash)
|
||||
@@ -527,7 +527,7 @@ class MPVFileBuilder:
|
||||
metadata_payload["type"] = "other"
|
||||
self.state.metadata = metadata_payload
|
||||
# Do NOT overwrite MPVfile.type with metadata.type
|
||||
self._merge_known_urls(metadata_payload.get("known_urls") or metadata_payload.get("known_urls_set"))
|
||||
self._merge_url(metadata_payload.get("url") or metadata_payload.get("url_set"))
|
||||
source_url = metadata_payload.get("original_url") or metadata_payload.get("source_url")
|
||||
if source_url and not self.state.source_url:
|
||||
self.state.source_url = self._normalise_url(source_url)
|
||||
@@ -722,7 +722,7 @@ class MPVFileBuilder:
|
||||
include_service_keys_to_tags=True,
|
||||
include_duration=True,
|
||||
include_size=True,
|
||||
include_file_urls=False,
|
||||
include_file_url=False,
|
||||
include_mime=False,
|
||||
)
|
||||
except HydrusRequestError as hre: # pragma: no cover
|
||||
@@ -801,11 +801,11 @@ class MPVFileBuilder:
|
||||
if tag not in self.state.original_tags:
|
||||
self.state.original_tags[tag] = tag
|
||||
|
||||
def _merge_known_urls(self, urls: Optional[Iterable[Any]]) -> None:
|
||||
if not urls:
|
||||
def _merge_url(self, url: Optional[Iterable[Any]]) -> None:
|
||||
if not url:
|
||||
return
|
||||
combined = list(self.state.known_urls or []) + _normalise_string_list(urls)
|
||||
self.state.known_urls = unique_preserve_order(combined)
|
||||
combined = list(self.state.url or []) + _normalise_string_list(url)
|
||||
self.state.url = unique_preserve_order(combined)
|
||||
|
||||
def _load_sidecar_tags(self, local_path: str) -> None:
|
||||
try:
|
||||
@@ -821,7 +821,7 @@ class MPVFileBuilder:
|
||||
if hash_value and not self.state.hash and _looks_like_hash(hash_value):
|
||||
self.state.hash = hash_value.lower()
|
||||
self._merge_tags(tags)
|
||||
self._merge_known_urls(known)
|
||||
self._merge_url(known)
|
||||
break
|
||||
|
||||
def _read_sidecar(self, sidecar_path: Path) -> tuple[Optional[str], List[str], List[str]]:
|
||||
@@ -831,7 +831,7 @@ class MPVFileBuilder:
|
||||
return None, [], []
|
||||
hash_value: Optional[str] = None
|
||||
tags: List[str] = []
|
||||
known_urls: List[str] = []
|
||||
url: List[str] = []
|
||||
for line in raw.splitlines():
|
||||
trimmed = line.strip()
|
||||
if not trimmed:
|
||||
@@ -841,13 +841,13 @@ class MPVFileBuilder:
|
||||
candidate = trimmed.split(":", 1)[1].strip() if ":" in trimmed else ""
|
||||
if candidate:
|
||||
hash_value = candidate
|
||||
elif lowered.startswith("known_url:") or lowered.startswith("url:"):
|
||||
elif lowered.startswith("url:") or lowered.startswith("url:"):
|
||||
candidate = trimmed.split(":", 1)[1].strip() if ":" in trimmed else ""
|
||||
if candidate:
|
||||
known_urls.append(candidate)
|
||||
url.append(candidate)
|
||||
else:
|
||||
tags.append(trimmed)
|
||||
return hash_value, tags, known_urls
|
||||
return hash_value, tags, url
|
||||
|
||||
def _compute_local_hash(self, local_path: str) -> None:
|
||||
try:
|
||||
@@ -864,8 +864,8 @@ class MPVFileBuilder:
|
||||
def _finalise(self) -> None:
|
||||
if self.state.tags:
|
||||
self.state.tags = unique_preserve_order(self.state.tags)
|
||||
if self.state.known_urls:
|
||||
self.state.known_urls = unique_preserve_order(self.state.known_urls)
|
||||
if self.state.url:
|
||||
self.state.url = unique_preserve_order(self.state.url)
|
||||
# Ensure metadata.type is always present for Lua, but do NOT overwrite MPVfile.type
|
||||
if not self.state.title:
|
||||
if self.state.metadata.get("title"):
|
||||
|
||||
@@ -85,7 +85,7 @@ def _normalize_target(text: Optional[str]) -> Optional[str]:
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Normalize paths/urls for comparison
|
||||
# Normalize paths/url for comparison
|
||||
return lower.replace('\\', '\\')
|
||||
|
||||
|
||||
|
||||
818
helper/provider.py
Normal file
818
helper/provider.py
Normal file
@@ -0,0 +1,818 @@
|
||||
"""Provider interfaces for search and file upload functionality.
|
||||
|
||||
This module defines two distinct provider types:
|
||||
1. SearchProvider: For searching content (books, music, videos, games)
|
||||
2. FileProvider: For uploading files to hosting services
|
||||
|
||||
No legacy code or backwards compatibility - clean, single source of truth.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import Any, Dict, List, Optional, Tuple
|
||||
from dataclasses import dataclass, field
|
||||
from pathlib import Path
|
||||
import sys
|
||||
import os
|
||||
import json
|
||||
import re
|
||||
import time
|
||||
import asyncio
|
||||
import subprocess
|
||||
import shutil
|
||||
import mimetypes
|
||||
import traceback
|
||||
import requests
|
||||
|
||||
from helper.logger import log, debug
|
||||
|
||||
# Optional dependencies
|
||||
try:
|
||||
from playwright.sync_api import sync_playwright
|
||||
PLAYWRIGHT_AVAILABLE = True
|
||||
except ImportError:
|
||||
PLAYWRIGHT_AVAILABLE = False
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# SEARCH PROVIDERS
|
||||
# ============================================================================
|
||||
|
||||
@dataclass
|
||||
class SearchResult:
|
||||
"""Unified search result format across all search providers."""
|
||||
|
||||
origin: str # Provider name: "libgen", "soulseek", "debrid", "bandcamp", etc.
|
||||
title: str # Display title/filename
|
||||
path: str # Download target (URL, path, magnet, identifier)
|
||||
|
||||
detail: str = "" # Additional description
|
||||
annotations: List[str] = field(default_factory=list) # Tags: ["120MB", "flac", "ready"]
|
||||
media_kind: str = "other" # Type: "book", "audio", "video", "game", "magnet"
|
||||
size_bytes: Optional[int] = None
|
||||
tags: set[str] = field(default_factory=set) # Searchable tags
|
||||
columns: List[Tuple[str, str]] = field(default_factory=list) # Display columns
|
||||
full_metadata: Dict[str, Any] = field(default_factory=dict) # Extra metadata
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
"""Convert to dictionary for pipeline processing."""
|
||||
return {
|
||||
"origin": self.origin,
|
||||
"title": self.title,
|
||||
"path": self.path,
|
||||
"detail": self.detail,
|
||||
"annotations": self.annotations,
|
||||
"media_kind": self.media_kind,
|
||||
"size_bytes": self.size_bytes,
|
||||
"tags": list(self.tags),
|
||||
"columns": list(self.columns),
|
||||
"full_metadata": self.full_metadata,
|
||||
}
|
||||
|
||||
|
||||
class SearchProvider(ABC):
|
||||
"""Base class for search providers."""
|
||||
|
||||
def __init__(self, config: Dict[str, Any] = None):
|
||||
self.config = config or {}
|
||||
self.name = self.__class__.__name__.lower()
|
||||
|
||||
@abstractmethod
|
||||
def search(
|
||||
self,
|
||||
query: str,
|
||||
limit: int = 50,
|
||||
filters: Optional[Dict[str, Any]] = None,
|
||||
**kwargs
|
||||
) -> List[SearchResult]:
|
||||
"""Search for items matching the query.
|
||||
|
||||
Args:
|
||||
query: Search query string
|
||||
limit: Maximum results to return
|
||||
filters: Optional filtering criteria
|
||||
**kwargs: Provider-specific arguments
|
||||
|
||||
Returns:
|
||||
List of SearchResult objects
|
||||
"""
|
||||
pass
|
||||
|
||||
def validate(self) -> bool:
|
||||
"""Check if provider is available and properly configured."""
|
||||
return True
|
||||
|
||||
|
||||
class Libgen(SearchProvider):
|
||||
"""Search provider for Library Genesis books."""
|
||||
|
||||
def search(
|
||||
self,
|
||||
query: str,
|
||||
limit: int = 50,
|
||||
filters: Optional[Dict[str, Any]] = None,
|
||||
**kwargs
|
||||
) -> List[SearchResult]:
|
||||
filters = filters or {}
|
||||
|
||||
try:
|
||||
from helper.unified_book_downloader import UnifiedBookDownloader
|
||||
from helper.query_parser import parse_query, get_field, get_free_text
|
||||
|
||||
parsed = parse_query(query)
|
||||
isbn = get_field(parsed, 'isbn')
|
||||
author = get_field(parsed, 'author')
|
||||
title = get_field(parsed, 'title')
|
||||
free_text = get_free_text(parsed)
|
||||
|
||||
search_query = isbn or title or author or free_text or query
|
||||
|
||||
downloader = UnifiedBookDownloader(config=self.config)
|
||||
books = downloader.search_libgen(search_query, limit=limit)
|
||||
|
||||
results = []
|
||||
for idx, book in enumerate(books, 1):
|
||||
title = book.get("title", "Unknown")
|
||||
author = book.get("author", "Unknown")
|
||||
year = book.get("year", "Unknown")
|
||||
pages = book.get("pages") or book.get("pages_str") or ""
|
||||
extension = book.get("extension", "") or book.get("ext", "")
|
||||
filesize = book.get("filesize_str", "Unknown")
|
||||
isbn = book.get("isbn", "")
|
||||
mirror_url = book.get("mirror_url", "")
|
||||
|
||||
columns = [
|
||||
("Title", title),
|
||||
("Author", author),
|
||||
("Pages", str(pages)),
|
||||
("Ext", str(extension)),
|
||||
]
|
||||
|
||||
detail = f"By: {author}"
|
||||
if year and year != "Unknown":
|
||||
detail += f" ({year})"
|
||||
|
||||
annotations = [f"{filesize}"]
|
||||
if isbn:
|
||||
annotations.append(f"ISBN: {isbn}")
|
||||
|
||||
results.append(SearchResult(
|
||||
origin="libgen",
|
||||
title=title,
|
||||
path=mirror_url or f"libgen:{book.get('id', '')}",
|
||||
detail=detail,
|
||||
annotations=annotations,
|
||||
media_kind="book",
|
||||
columns=columns,
|
||||
full_metadata={
|
||||
"number": idx,
|
||||
"author": author,
|
||||
"year": year,
|
||||
"isbn": isbn,
|
||||
"filesize": filesize,
|
||||
"pages": pages,
|
||||
"extension": extension,
|
||||
"book_id": book.get("book_id", ""),
|
||||
"md5": book.get("md5", ""),
|
||||
},
|
||||
))
|
||||
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
log(f"[libgen] Search error: {e}", file=sys.stderr)
|
||||
return []
|
||||
|
||||
def validate(self) -> bool:
|
||||
try:
|
||||
from helper.unified_book_downloader import UnifiedBookDownloader
|
||||
return True
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
|
||||
class Soulseek(SearchProvider):
|
||||
"""Search provider for Soulseek P2P network."""
|
||||
|
||||
MUSIC_EXTENSIONS = {
|
||||
'.flac', '.mp3', '.m4a', '.aac', '.ogg', '.opus',
|
||||
'.wav', '.alac', '.wma', '.ape', '.aiff', '.dsf',
|
||||
'.dff', '.wv', '.tta', '.tak', '.ac3', '.dts'
|
||||
}
|
||||
|
||||
USERNAME = "asjhkjljhkjfdsd334"
|
||||
PASSWORD = "khhhg"
|
||||
DOWNLOAD_DIR = "./downloads"
|
||||
MAX_WAIT_TRANSFER = 1200
|
||||
|
||||
async def perform_search(
|
||||
self,
|
||||
query: str,
|
||||
timeout: float = 9.0,
|
||||
limit: int = 50
|
||||
) -> List[Dict[str, Any]]:
|
||||
"""Perform async Soulseek search."""
|
||||
import os
|
||||
from aioslsk.client import SoulSeekClient
|
||||
from aioslsk.settings import Settings, CredentialsSettings
|
||||
|
||||
os.makedirs(self.DOWNLOAD_DIR, exist_ok=True)
|
||||
|
||||
settings = Settings(credentials=CredentialsSettings(username=self.USERNAME, password=self.PASSWORD))
|
||||
client = SoulSeekClient(settings)
|
||||
|
||||
try:
|
||||
await client.start()
|
||||
await client.login()
|
||||
except Exception as e:
|
||||
log(f"[soulseek] Login failed: {type(e).__name__}: {e}", file=sys.stderr)
|
||||
return []
|
||||
|
||||
try:
|
||||
search_request = await client.searches.search(query)
|
||||
await self._collect_results(client, search_request, timeout=timeout)
|
||||
return self._flatten_results(search_request)[:limit]
|
||||
except Exception as e:
|
||||
log(f"[soulseek] Search error: {type(e).__name__}: {e}", file=sys.stderr)
|
||||
return []
|
||||
finally:
|
||||
try:
|
||||
await client.stop()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
def _flatten_results(self, search_request) -> List[dict]:
|
||||
flat = []
|
||||
for result in search_request.results:
|
||||
username = getattr(result, "username", "?")
|
||||
|
||||
for file_data in getattr(result, "shared_items", []):
|
||||
flat.append({
|
||||
"file": file_data,
|
||||
"username": username,
|
||||
"filename": getattr(file_data, "filename", "?"),
|
||||
"size": getattr(file_data, "filesize", 0),
|
||||
})
|
||||
|
||||
for file_data in getattr(result, "locked_results", []):
|
||||
flat.append({
|
||||
"file": file_data,
|
||||
"username": username,
|
||||
"filename": getattr(file_data, "filename", "?"),
|
||||
"size": getattr(file_data, "filesize", 0),
|
||||
})
|
||||
|
||||
return flat
|
||||
|
||||
async def _collect_results(self, client, search_request, timeout: float = 75.0) -> None:
|
||||
end = time.time() + timeout
|
||||
last_count = 0
|
||||
while time.time() < end:
|
||||
current_count = len(search_request.results)
|
||||
if current_count > last_count:
|
||||
debug(f"[soulseek] Got {current_count} result(s)...")
|
||||
last_count = current_count
|
||||
await asyncio.sleep(0.5)
|
||||
|
||||
def search(
|
||||
self,
|
||||
query: str,
|
||||
limit: int = 50,
|
||||
filters: Optional[Dict[str, Any]] = None,
|
||||
**kwargs
|
||||
) -> List[SearchResult]:
|
||||
filters = filters or {}
|
||||
|
||||
try:
|
||||
flat_results = asyncio.run(self.perform_search(query, timeout=9.0, limit=limit))
|
||||
|
||||
if not flat_results:
|
||||
return []
|
||||
|
||||
# Filter to music files only
|
||||
music_results = []
|
||||
for item in flat_results:
|
||||
filename = item['filename']
|
||||
ext = '.' + filename.rsplit('.', 1)[-1].lower() if '.' in filename else ''
|
||||
if ext in self.MUSIC_EXTENSIONS:
|
||||
music_results.append(item)
|
||||
|
||||
if not music_results:
|
||||
return []
|
||||
|
||||
# Extract metadata
|
||||
enriched_results = []
|
||||
for item in music_results:
|
||||
filename = item['filename']
|
||||
ext = '.' + filename.rsplit('.', 1)[-1].lower() if '.' in filename else ''
|
||||
|
||||
# Get display filename
|
||||
display_name = filename.split('\\')[-1] if '\\' in filename else filename.split('/')[-1] if '/' in filename else filename
|
||||
|
||||
# Extract path hierarchy
|
||||
path_parts = filename.replace('\\', '/').split('/')
|
||||
artist = path_parts[-3] if len(path_parts) >= 3 else ''
|
||||
album = path_parts[-2] if len(path_parts) >= 3 else path_parts[-2] if len(path_parts) == 2 else ''
|
||||
|
||||
# Extract track number and title
|
||||
base_name = display_name.rsplit('.', 1)[0] if '.' in display_name else display_name
|
||||
track_num = ''
|
||||
title = base_name
|
||||
filename_artist = ''
|
||||
|
||||
match = re.match(r'^(\d{1,3})\s*[\.\-]?\s+(.+)$', base_name)
|
||||
if match:
|
||||
track_num = match.group(1)
|
||||
rest = match.group(2)
|
||||
if ' - ' in rest:
|
||||
filename_artist, title = rest.split(' - ', 1)
|
||||
else:
|
||||
title = rest
|
||||
|
||||
if filename_artist:
|
||||
artist = filename_artist
|
||||
|
||||
enriched_results.append({
|
||||
**item,
|
||||
'artist': artist,
|
||||
'album': album,
|
||||
'title': title,
|
||||
'track_num': track_num,
|
||||
'ext': ext
|
||||
})
|
||||
|
||||
# Apply filters
|
||||
if filters:
|
||||
artist_filter = filters.get('artist', '').lower() if filters.get('artist') else ''
|
||||
album_filter = filters.get('album', '').lower() if filters.get('album') else ''
|
||||
track_filter = filters.get('track', '').lower() if filters.get('track') else ''
|
||||
|
||||
if artist_filter or album_filter or track_filter:
|
||||
filtered = []
|
||||
for item in enriched_results:
|
||||
if artist_filter and artist_filter not in item['artist'].lower():
|
||||
continue
|
||||
if album_filter and album_filter not in item['album'].lower():
|
||||
continue
|
||||
if track_filter and track_filter not in item['title'].lower():
|
||||
continue
|
||||
filtered.append(item)
|
||||
enriched_results = filtered
|
||||
|
||||
# Sort: .flac first, then by size
|
||||
enriched_results.sort(key=lambda item: (item['ext'].lower() != '.flac', -item['size']))
|
||||
|
||||
# Convert to SearchResult
|
||||
results = []
|
||||
for idx, item in enumerate(enriched_results, 1):
|
||||
artist_display = item['artist'] if item['artist'] else "(no artist)"
|
||||
album_display = item['album'] if item['album'] else "(no album)"
|
||||
size_mb = int(item['size'] / 1024 / 1024)
|
||||
|
||||
columns = [
|
||||
("Track", item['track_num'] or "?"),
|
||||
("Title", item['title'][:40]),
|
||||
("Artist", artist_display[:32]),
|
||||
("Album", album_display[:32]),
|
||||
("Size", f"{size_mb} MB"),
|
||||
]
|
||||
|
||||
results.append(SearchResult(
|
||||
origin="soulseek",
|
||||
title=item['title'],
|
||||
path=item['filename'],
|
||||
detail=f"{artist_display} - {album_display}",
|
||||
annotations=[f"{size_mb} MB", item['ext'].lstrip('.').upper()],
|
||||
media_kind="audio",
|
||||
size_bytes=item['size'],
|
||||
columns=columns,
|
||||
full_metadata={
|
||||
"username": item['username'],
|
||||
"filename": item['filename'],
|
||||
"artist": item['artist'],
|
||||
"album": item['album'],
|
||||
"track_num": item['track_num'],
|
||||
"ext": item['ext'],
|
||||
},
|
||||
))
|
||||
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
log(f"[soulseek] Search error: {e}", file=sys.stderr)
|
||||
return []
|
||||
|
||||
def validate(self) -> bool:
|
||||
try:
|
||||
from aioslsk.client import SoulSeekClient
|
||||
return True
|
||||
except ImportError:
|
||||
return False
|
||||
|
||||
|
||||
class Bandcamp(SearchProvider):
|
||||
"""Search provider for Bandcamp."""
|
||||
|
||||
def search(
|
||||
self,
|
||||
query: str,
|
||||
limit: int = 50,
|
||||
filters: Optional[Dict[str, Any]] = None,
|
||||
**kwargs
|
||||
) -> List[SearchResult]:
|
||||
if not PLAYWRIGHT_AVAILABLE:
|
||||
log("[bandcamp] Playwright not available. Install with: pip install playwright", file=sys.stderr)
|
||||
return []
|
||||
|
||||
results = []
|
||||
try:
|
||||
with sync_playwright() as p:
|
||||
browser = p.chromium.launch(headless=True)
|
||||
page = browser.new_page()
|
||||
|
||||
# Parse query for artist: prefix
|
||||
if query.strip().lower().startswith("artist:"):
|
||||
artist_name = query[7:].strip().strip('"')
|
||||
search_url = f"https://bandcamp.com/search?q={artist_name}&item_type=b"
|
||||
else:
|
||||
search_url = f"https://bandcamp.com/search?q={query}&item_type=a"
|
||||
|
||||
results = self._scrape_url(page, search_url, limit)
|
||||
|
||||
browser.close()
|
||||
except Exception as e:
|
||||
log(f"[bandcamp] Search error: {e}", file=sys.stderr)
|
||||
return []
|
||||
|
||||
return results
|
||||
|
||||
def _scrape_url(self, page, url: str, limit: int) -> List[SearchResult]:
|
||||
debug(f"[bandcamp] Scraping: {url}")
|
||||
|
||||
page.goto(url)
|
||||
page.wait_for_load_state("domcontentloaded")
|
||||
|
||||
results = []
|
||||
|
||||
# Check for search results
|
||||
search_results = page.query_selector_all(".searchresult")
|
||||
if search_results:
|
||||
for item in search_results[:limit]:
|
||||
try:
|
||||
heading = item.query_selector(".heading")
|
||||
if not heading:
|
||||
continue
|
||||
|
||||
link = heading.query_selector("a")
|
||||
if not link:
|
||||
continue
|
||||
|
||||
title = link.inner_text().strip()
|
||||
target_url = link.get_attribute("href")
|
||||
|
||||
subhead = item.query_selector(".subhead")
|
||||
artist = subhead.inner_text().strip() if subhead else "Unknown"
|
||||
|
||||
itemtype = item.query_selector(".itemtype")
|
||||
media_type = itemtype.inner_text().strip() if itemtype else "album"
|
||||
|
||||
results.append(SearchResult(
|
||||
origin="bandcamp",
|
||||
title=title,
|
||||
path=target_url,
|
||||
detail=f"By: {artist}",
|
||||
annotations=[media_type],
|
||||
media_kind="audio",
|
||||
columns=[
|
||||
("Name", title),
|
||||
("Artist", artist),
|
||||
("Type", media_type),
|
||||
],
|
||||
full_metadata={
|
||||
"artist": artist,
|
||||
"type": media_type,
|
||||
},
|
||||
))
|
||||
except Exception as e:
|
||||
debug(f"[bandcamp] Error parsing result: {e}")
|
||||
continue
|
||||
|
||||
return results
|
||||
|
||||
def validate(self) -> bool:
|
||||
return PLAYWRIGHT_AVAILABLE
|
||||
|
||||
|
||||
class YouTube(SearchProvider):
|
||||
"""Search provider for YouTube using yt-dlp."""
|
||||
|
||||
def search(
|
||||
self,
|
||||
query: str,
|
||||
limit: int = 10,
|
||||
filters: Optional[Dict[str, Any]] = None,
|
||||
**kwargs
|
||||
) -> List[SearchResult]:
|
||||
ytdlp_path = shutil.which("yt-dlp")
|
||||
if not ytdlp_path:
|
||||
log("[youtube] yt-dlp not found in PATH", file=sys.stderr)
|
||||
return []
|
||||
|
||||
search_query = f"ytsearch{limit}:{query}"
|
||||
|
||||
cmd = [
|
||||
ytdlp_path,
|
||||
"--dump-json",
|
||||
"--flat-playlist",
|
||||
"--no-warnings",
|
||||
search_query
|
||||
]
|
||||
|
||||
try:
|
||||
process = subprocess.run(
|
||||
cmd,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
encoding="utf-8",
|
||||
errors="replace"
|
||||
)
|
||||
|
||||
if process.returncode != 0:
|
||||
log(f"[youtube] yt-dlp failed: {process.stderr}", file=sys.stderr)
|
||||
return []
|
||||
|
||||
results = []
|
||||
for line in process.stdout.splitlines():
|
||||
if not line.strip():
|
||||
continue
|
||||
try:
|
||||
video_data = json.loads(line)
|
||||
title = video_data.get("title", "Unknown")
|
||||
video_id = video_data.get("id", "")
|
||||
url = video_data.get("url") or f"https://youtube.com/watch?v={video_id}"
|
||||
uploader = video_data.get("uploader", "Unknown")
|
||||
duration = video_data.get("duration", 0)
|
||||
view_count = video_data.get("view_count", 0)
|
||||
|
||||
duration_str = f"{int(duration//60)}:{int(duration%60):02d}" if duration else ""
|
||||
views_str = f"{view_count:,}" if view_count else ""
|
||||
|
||||
results.append(SearchResult(
|
||||
origin="youtube",
|
||||
title=title,
|
||||
path=url,
|
||||
detail=f"By: {uploader}",
|
||||
annotations=[duration_str, f"{views_str} views"],
|
||||
media_kind="video",
|
||||
columns=[
|
||||
("Title", title),
|
||||
("Uploader", uploader),
|
||||
("Duration", duration_str),
|
||||
("Views", views_str),
|
||||
],
|
||||
full_metadata={
|
||||
"video_id": video_id,
|
||||
"uploader": uploader,
|
||||
"duration": duration,
|
||||
"view_count": view_count,
|
||||
},
|
||||
))
|
||||
except json.JSONDecodeError:
|
||||
continue
|
||||
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
log(f"[youtube] Error: {e}", file=sys.stderr)
|
||||
return []
|
||||
|
||||
def validate(self) -> bool:
|
||||
return shutil.which("yt-dlp") is not None
|
||||
|
||||
def pipe(self, path: str, config: Optional[Dict[str, Any]] = None) -> Optional[str]:
|
||||
"""Return the playable URL for MPV (just the path for YouTube)."""
|
||||
return path
|
||||
|
||||
|
||||
# Search provider registry
|
||||
_SEARCH_PROVIDERS = {
|
||||
"libgen": Libgen,
|
||||
"soulseek": Soulseek,
|
||||
"bandcamp": Bandcamp,
|
||||
"youtube": YouTube,
|
||||
}
|
||||
|
||||
|
||||
def get_search_provider(name: str, config: Optional[Dict[str, Any]] = None) -> Optional[SearchProvider]:
|
||||
"""Get a search provider by name."""
|
||||
provider_class = _SEARCH_PROVIDERS.get(name.lower())
|
||||
|
||||
if provider_class is None:
|
||||
log(f"[provider] Unknown search provider: {name}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
try:
|
||||
provider = provider_class(config)
|
||||
if not provider.validate():
|
||||
log(f"[provider] Provider '{name}' is not available", file=sys.stderr)
|
||||
return None
|
||||
return provider
|
||||
except Exception as e:
|
||||
log(f"[provider] Error initializing '{name}': {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
|
||||
def list_search_providers(config: Optional[Dict[str, Any]] = None) -> Dict[str, bool]:
|
||||
"""List all search providers and their availability."""
|
||||
availability = {}
|
||||
for name, provider_class in _SEARCH_PROVIDERS.items():
|
||||
try:
|
||||
provider = provider_class(config)
|
||||
availability[name] = provider.validate()
|
||||
except Exception:
|
||||
availability[name] = False
|
||||
return availability
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# FILE PROVIDERS
|
||||
# ============================================================================
|
||||
|
||||
class FileProvider(ABC):
|
||||
"""Base class for file upload providers."""
|
||||
|
||||
def __init__(self, config: Optional[Dict[str, Any]] = None):
|
||||
self.config = config or {}
|
||||
self.name = self.__class__.__name__.lower()
|
||||
|
||||
@abstractmethod
|
||||
def upload(self, file_path: str, **kwargs: Any) -> str:
|
||||
"""Upload a file and return the URL."""
|
||||
pass
|
||||
|
||||
def validate(self) -> bool:
|
||||
"""Check if provider is available/configured."""
|
||||
return True
|
||||
|
||||
|
||||
class ZeroXZero(FileProvider):
|
||||
"""File provider for 0x0.st."""
|
||||
|
||||
def upload(self, file_path: str, **kwargs: Any) -> str:
|
||||
from helper.http_client import HTTPClient
|
||||
|
||||
if not os.path.exists(file_path):
|
||||
raise FileNotFoundError(f"File not found: {file_path}")
|
||||
|
||||
try:
|
||||
headers = {"User-Agent": "Medeia-Macina/1.0"}
|
||||
with HTTPClient(headers=headers) as client:
|
||||
with open(file_path, 'rb') as f:
|
||||
response = client.post(
|
||||
"https://0x0.st",
|
||||
files={"file": f}
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.text.strip()
|
||||
else:
|
||||
raise Exception(f"Upload failed: {response.status_code} - {response.text}")
|
||||
|
||||
except Exception as e:
|
||||
log(f"[0x0] Upload error: {e}", file=sys.stderr)
|
||||
raise
|
||||
|
||||
def validate(self) -> bool:
|
||||
return True
|
||||
|
||||
|
||||
class Matrix(FileProvider):
|
||||
"""File provider for Matrix (Element) chat rooms."""
|
||||
|
||||
def validate(self) -> bool:
|
||||
if not self.config:
|
||||
return False
|
||||
matrix_conf = self.config.get('storage', {}).get('matrix', {})
|
||||
return bool(
|
||||
matrix_conf.get('homeserver') and
|
||||
matrix_conf.get('room_id') and
|
||||
(matrix_conf.get('access_token') or matrix_conf.get('password'))
|
||||
)
|
||||
|
||||
def upload(self, file_path: str, **kwargs: Any) -> str:
|
||||
from pathlib import Path
|
||||
|
||||
path = Path(file_path)
|
||||
if not path.exists():
|
||||
raise FileNotFoundError(f"File not found: {file_path}")
|
||||
|
||||
matrix_conf = self.config.get('storage', {}).get('matrix', {})
|
||||
homeserver = matrix_conf.get('homeserver')
|
||||
access_token = matrix_conf.get('access_token')
|
||||
room_id = matrix_conf.get('room_id')
|
||||
|
||||
if not homeserver.startswith('http'):
|
||||
homeserver = f"https://{homeserver}"
|
||||
|
||||
# Upload media
|
||||
upload_url = f"{homeserver}/_matrix/media/v3/upload"
|
||||
headers = {
|
||||
"Authorization": f"Bearer {access_token}",
|
||||
"Content-Type": "application/octet-stream"
|
||||
}
|
||||
|
||||
mime_type, _ = mimetypes.guess_type(path)
|
||||
if mime_type:
|
||||
headers["Content-Type"] = mime_type
|
||||
|
||||
filename = path.name
|
||||
|
||||
with open(path, 'rb') as f:
|
||||
resp = requests.post(upload_url, headers=headers, data=f, params={"filename": filename})
|
||||
|
||||
if resp.status_code != 200:
|
||||
raise Exception(f"Matrix upload failed: {resp.text}")
|
||||
|
||||
content_uri = resp.json().get('content_uri')
|
||||
if not content_uri:
|
||||
raise Exception("No content_uri returned")
|
||||
|
||||
# Send message
|
||||
send_url = f"{homeserver}/_matrix/client/v3/rooms/{room_id}/send/m.room.message"
|
||||
|
||||
# Determine message type
|
||||
msgtype = "m.file"
|
||||
ext = path.suffix.lower()
|
||||
|
||||
AUDIO_EXTS = {'.mp3', '.flac', '.wav', '.m4a', '.aac', '.ogg', '.opus', '.wma', '.mka', '.alac'}
|
||||
VIDEO_EXTS = {'.mp4', '.mkv', '.webm', '.mov', '.avi', '.flv', '.mpg', '.mpeg', '.ts', '.m4v', '.wmv'}
|
||||
IMAGE_EXTS = {'.jpg', '.jpeg', '.png', '.gif', '.webp', '.bmp', '.tiff'}
|
||||
|
||||
if ext in AUDIO_EXTS:
|
||||
msgtype = "m.audio"
|
||||
elif ext in VIDEO_EXTS:
|
||||
msgtype = "m.video"
|
||||
elif ext in IMAGE_EXTS:
|
||||
msgtype = "m.image"
|
||||
|
||||
info = {
|
||||
"mimetype": mime_type,
|
||||
"size": path.stat().st_size
|
||||
}
|
||||
|
||||
payload = {
|
||||
"msgtype": msgtype,
|
||||
"body": filename,
|
||||
"url": content_uri,
|
||||
"info": info
|
||||
}
|
||||
|
||||
resp = requests.post(send_url, headers=headers, json=payload)
|
||||
if resp.status_code != 200:
|
||||
raise Exception(f"Matrix send message failed: {resp.text}")
|
||||
|
||||
event_id = resp.json().get('event_id')
|
||||
return f"https://matrix.to/#/{room_id}/{event_id}"
|
||||
|
||||
|
||||
# File provider registry
|
||||
_FILE_PROVIDERS = {
|
||||
"0x0": ZeroXZero,
|
||||
"matrix": Matrix,
|
||||
}
|
||||
|
||||
|
||||
def get_file_provider(name: str, config: Optional[Dict[str, Any]] = None) -> Optional[FileProvider]:
|
||||
"""Get a file provider by name."""
|
||||
provider_class = _FILE_PROVIDERS.get(name.lower())
|
||||
|
||||
if provider_class is None:
|
||||
log(f"[provider] Unknown file provider: {name}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
try:
|
||||
provider = provider_class(config)
|
||||
if not provider.validate():
|
||||
log(f"[provider] File provider '{name}' is not available", file=sys.stderr)
|
||||
return None
|
||||
return provider
|
||||
except Exception as e:
|
||||
log(f"[provider] Error initializing file provider '{name}': {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
|
||||
def list_file_providers(config: Optional[Dict[str, Any]] = None) -> Dict[str, bool]:
|
||||
"""List all file providers and their availability."""
|
||||
availability = {}
|
||||
for name, provider_class in _FILE_PROVIDERS.items():
|
||||
try:
|
||||
provider = provider_class(config)
|
||||
availability[name] = provider.validate()
|
||||
except Exception:
|
||||
availability[name] = False
|
||||
return availability
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -159,8 +159,8 @@ def create_app():
|
||||
status["storage_path"] = str(STORAGE_PATH)
|
||||
status["storage_exists"] = STORAGE_PATH.exists()
|
||||
try:
|
||||
from helper.local_library import LocalLibraryDB
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
from helper.folder_store import FolderDB
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
status["database_accessible"] = True
|
||||
except Exception as e:
|
||||
status["database_accessible"] = False
|
||||
@@ -177,7 +177,7 @@ def create_app():
|
||||
@require_storage()
|
||||
def search_files():
|
||||
"""Search for files by name or tag."""
|
||||
from helper.local_library import LocalLibrarySearchOptimizer
|
||||
from helper.folder_store import LocalLibrarySearchOptimizer
|
||||
|
||||
query = request.args.get('q', '')
|
||||
limit = request.args.get('limit', 100, type=int)
|
||||
@@ -205,11 +205,11 @@ def create_app():
|
||||
@require_storage()
|
||||
def get_file_metadata(file_hash: str):
|
||||
"""Get metadata for a specific file by hash."""
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
try:
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_by_hash(file_hash)
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_hash(file_hash)
|
||||
|
||||
if not file_path or not file_path.exists():
|
||||
return jsonify({"error": "File not found"}), 404
|
||||
@@ -233,13 +233,13 @@ def create_app():
|
||||
@require_storage()
|
||||
def index_file():
|
||||
"""Index a new file in the storage."""
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
from helper.utils import sha256_file
|
||||
|
||||
data = request.get_json() or {}
|
||||
file_path_str = data.get('path')
|
||||
tags = data.get('tags', [])
|
||||
urls = data.get('urls', [])
|
||||
url = data.get('url', [])
|
||||
|
||||
if not file_path_str:
|
||||
return jsonify({"error": "File path required"}), 400
|
||||
@@ -250,14 +250,14 @@ def create_app():
|
||||
if not file_path.exists():
|
||||
return jsonify({"error": "File does not exist"}), 404
|
||||
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
db.get_or_create_file_entry(file_path)
|
||||
|
||||
if tags:
|
||||
db.add_tags(file_path, tags)
|
||||
|
||||
if urls:
|
||||
db.add_known_urls(file_path, urls)
|
||||
if url:
|
||||
db.add_url(file_path, url)
|
||||
|
||||
file_hash = sha256_file(file_path)
|
||||
|
||||
@@ -265,7 +265,7 @@ def create_app():
|
||||
"hash": file_hash,
|
||||
"path": str(file_path),
|
||||
"tags_added": len(tags),
|
||||
"urls_added": len(urls)
|
||||
"url_added": len(url)
|
||||
}), 201
|
||||
except Exception as e:
|
||||
logger.error(f"Index error: {e}", exc_info=True)
|
||||
@@ -280,11 +280,11 @@ def create_app():
|
||||
@require_storage()
|
||||
def get_tags(file_hash: str):
|
||||
"""Get tags for a file."""
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
try:
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_by_hash(file_hash)
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_hash(file_hash)
|
||||
if not file_path:
|
||||
return jsonify({"error": "File not found"}), 404
|
||||
|
||||
@@ -299,7 +299,7 @@ def create_app():
|
||||
@require_storage()
|
||||
def add_tags(file_hash: str):
|
||||
"""Add tags to a file."""
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
data = request.get_json() or {}
|
||||
tags = data.get('tags', [])
|
||||
@@ -309,8 +309,8 @@ def create_app():
|
||||
return jsonify({"error": "Tags required"}), 400
|
||||
|
||||
try:
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_by_hash(file_hash)
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_hash(file_hash)
|
||||
if not file_path:
|
||||
return jsonify({"error": "File not found"}), 404
|
||||
|
||||
@@ -328,13 +328,13 @@ def create_app():
|
||||
@require_storage()
|
||||
def remove_tags(file_hash: str):
|
||||
"""Remove tags from a file."""
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
tags_str = request.args.get('tags', '')
|
||||
|
||||
try:
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_by_hash(file_hash)
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_hash(file_hash)
|
||||
if not file_path:
|
||||
return jsonify({"error": "File not found"}), 404
|
||||
|
||||
@@ -358,11 +358,11 @@ def create_app():
|
||||
@require_storage()
|
||||
def get_relationships(file_hash: str):
|
||||
"""Get relationships for a file."""
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
try:
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_by_hash(file_hash)
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_hash(file_hash)
|
||||
if not file_path:
|
||||
return jsonify({"error": "File not found"}), 404
|
||||
|
||||
@@ -378,7 +378,7 @@ def create_app():
|
||||
@require_storage()
|
||||
def set_relationship():
|
||||
"""Set a relationship between two files."""
|
||||
from helper.local_library import LocalLibraryDB
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
data = request.get_json() or {}
|
||||
from_hash = data.get('from_hash')
|
||||
@@ -389,9 +389,9 @@ def create_app():
|
||||
return jsonify({"error": "from_hash and to_hash required"}), 400
|
||||
|
||||
try:
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
from_path = db.search_by_hash(from_hash)
|
||||
to_path = db.search_by_hash(to_hash)
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
from_path = db.search_hash(from_hash)
|
||||
to_path = db.search_hash(to_hash)
|
||||
|
||||
if not from_path or not to_path:
|
||||
return jsonify({"error": "File not found"}), 404
|
||||
@@ -406,49 +406,49 @@ def create_app():
|
||||
# URL OPERATIONS
|
||||
# ========================================================================
|
||||
|
||||
@app.route('/urls/<file_hash>', methods=['GET'])
|
||||
@app.route('/url/<file_hash>', methods=['GET'])
|
||||
@require_auth()
|
||||
@require_storage()
|
||||
def get_urls(file_hash: str):
|
||||
"""Get known URLs for a file."""
|
||||
from helper.local_library import LocalLibraryDB
|
||||
def get_url(file_hash: str):
|
||||
"""Get known url for a file."""
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
try:
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_by_hash(file_hash)
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_hash(file_hash)
|
||||
if not file_path:
|
||||
return jsonify({"error": "File not found"}), 404
|
||||
|
||||
metadata = db.get_metadata(file_path)
|
||||
urls = metadata.get('known_urls', []) if metadata else []
|
||||
return jsonify({"hash": file_hash, "urls": urls}), 200
|
||||
url = metadata.get('url', []) if metadata else []
|
||||
return jsonify({"hash": file_hash, "url": url}), 200
|
||||
except Exception as e:
|
||||
logger.error(f"Get URLs error: {e}", exc_info=True)
|
||||
logger.error(f"Get url error: {e}", exc_info=True)
|
||||
return jsonify({"error": f"Failed: {str(e)}"}), 500
|
||||
|
||||
@app.route('/urls/<file_hash>', methods=['POST'])
|
||||
@app.route('/url/<file_hash>', methods=['POST'])
|
||||
@require_auth()
|
||||
@require_storage()
|
||||
def add_urls(file_hash: str):
|
||||
"""Add URLs to a file."""
|
||||
from helper.local_library import LocalLibraryDB
|
||||
def add_url(file_hash: str):
|
||||
"""Add url to a file."""
|
||||
from helper.folder_store import FolderDB
|
||||
|
||||
data = request.get_json() or {}
|
||||
urls = data.get('urls', [])
|
||||
url = data.get('url', [])
|
||||
|
||||
if not urls:
|
||||
return jsonify({"error": "URLs required"}), 400
|
||||
if not url:
|
||||
return jsonify({"error": "url required"}), 400
|
||||
|
||||
try:
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_by_hash(file_hash)
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
file_path = db.search_hash(file_hash)
|
||||
if not file_path:
|
||||
return jsonify({"error": "File not found"}), 404
|
||||
|
||||
db.add_known_urls(file_path, urls)
|
||||
return jsonify({"hash": file_hash, "urls_added": len(urls)}), 200
|
||||
db.add_url(file_path, url)
|
||||
return jsonify({"hash": file_hash, "url_added": len(url)}), 200
|
||||
except Exception as e:
|
||||
logger.error(f"Add URLs error: {e}", exc_info=True)
|
||||
logger.error(f"Add url error: {e}", exc_info=True)
|
||||
return jsonify({"error": f"Failed: {str(e)}"}), 500
|
||||
|
||||
return app
|
||||
@@ -509,8 +509,8 @@ def main():
|
||||
print(f"\n{'='*70}\n")
|
||||
|
||||
try:
|
||||
from helper.local_library import LocalLibraryDB
|
||||
with LocalLibraryDB(STORAGE_PATH) as db:
|
||||
from helper.folder_store import FolderDB
|
||||
with FolderDB(STORAGE_PATH) as db:
|
||||
logger.info("Database initialized successfully")
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to initialize database: {e}")
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
2268
helper/store.py
Normal file
2268
helper/store.py
Normal file
File diff suppressed because it is too large
Load Diff
@@ -555,7 +555,7 @@ class UnifiedBookDownloader:
|
||||
This follows the exact process from archive_client.py:
|
||||
1. Login with credentials
|
||||
2. Call loan() to create 14-day borrow
|
||||
3. Get book info (extract page URLs)
|
||||
3. Get book info (extract page url)
|
||||
4. Download all pages as images
|
||||
5. Merge images into searchable PDF
|
||||
|
||||
@@ -576,10 +576,10 @@ class UnifiedBookDownloader:
|
||||
# If we get here, borrowing succeeded
|
||||
logger.info(f"[UnifiedBookDownloader] Successfully borrowed book: {book_id}")
|
||||
|
||||
# Now get the book info (page URLs and metadata)
|
||||
# Now get the book info (page url and metadata)
|
||||
logger.info(f"[UnifiedBookDownloader] Extracting book page information...")
|
||||
# Try both URL formats: with /borrow and without
|
||||
book_urls = [
|
||||
book_url = [
|
||||
f"https://archive.org/borrow/{book_id}", # Try borrow page first (for borrowed books)
|
||||
f"https://archive.org/details/{book_id}" # Fallback to details page
|
||||
]
|
||||
@@ -589,7 +589,7 @@ class UnifiedBookDownloader:
|
||||
metadata = None
|
||||
last_error = None
|
||||
|
||||
for book_url in book_urls:
|
||||
for book_url in book_url:
|
||||
try:
|
||||
logger.debug(f"[UnifiedBookDownloader] Trying to get book info from: {book_url}")
|
||||
response = session.get(book_url, timeout=10)
|
||||
@@ -611,7 +611,7 @@ class UnifiedBookDownloader:
|
||||
continue
|
||||
|
||||
if links is None:
|
||||
logger.error(f"[UnifiedBookDownloader] Failed to get book info from all URLs: {last_error}")
|
||||
logger.error(f"[UnifiedBookDownloader] Failed to get book info from all url: {last_error}")
|
||||
# Borrow extraction failed - return False
|
||||
return False, "Could not extract borrowed book pages"
|
||||
|
||||
|
||||
@@ -308,7 +308,7 @@ def format_metadata_value(key: str, value) -> str:
|
||||
# ============================================================================
|
||||
# Link Utilities - Consolidated from link_utils.py
|
||||
# ============================================================================
|
||||
"""Link utilities - Extract and process URLs from various sources."""
|
||||
"""Link utilities - Extract and process url from various sources."""
|
||||
|
||||
|
||||
def extract_link_from_args(args: Iterable[str]) -> Any | None:
|
||||
|
||||
@@ -77,3 +77,26 @@ mime_maps = {
|
||||
"csv": { "ext": ".csv", "mimes": ["text/csv"] }
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
def get_type_from_ext(ext: str) -> str:
|
||||
"""Determine the type (e.g., 'image', 'video', 'audio') from file extension.
|
||||
|
||||
Args:
|
||||
ext: File extension (with or without leading dot, e.g., 'jpg' or '.jpg')
|
||||
|
||||
Returns:
|
||||
Type string (e.g., 'image', 'video', 'audio') or 'other' if unknown
|
||||
"""
|
||||
if not ext:
|
||||
return 'other'
|
||||
|
||||
# Normalize: remove leading dot and convert to lowercase
|
||||
ext_clean = ext.lstrip('.').lower()
|
||||
|
||||
# Search through mime_maps to find matching type
|
||||
for type_name, extensions_dict in mime_maps.items():
|
||||
if ext_clean in extensions_dict:
|
||||
return type_name
|
||||
|
||||
return 'other'
|
||||
|
||||
@@ -11,7 +11,7 @@ from datetime import datetime
|
||||
from threading import Thread, Lock
|
||||
import time
|
||||
|
||||
from .local_library import LocalLibraryDB
|
||||
from .folder_store import FolderDB
|
||||
from helper.logger import log
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
@@ -140,7 +140,7 @@ class Worker:
|
||||
class WorkerLoggingHandler(logging.StreamHandler):
|
||||
"""Custom logging handler that captures logs for a worker."""
|
||||
|
||||
def __init__(self, worker_id: str, db: LocalLibraryDB,
|
||||
def __init__(self, worker_id: str, db: FolderDB,
|
||||
manager: Optional['WorkerManager'] = None,
|
||||
buffer_size: int = 50):
|
||||
"""Initialize the handler.
|
||||
@@ -235,7 +235,7 @@ class WorkerManager:
|
||||
auto_refresh_interval: Seconds between auto-refresh checks (0 = disabled)
|
||||
"""
|
||||
self.library_root = Path(library_root)
|
||||
self.db = LocalLibraryDB(library_root)
|
||||
self.db = FolderDB(library_root)
|
||||
self.auto_refresh_interval = auto_refresh_interval
|
||||
self.refresh_callbacks: List[Callable] = []
|
||||
self.refresh_thread: Optional[Thread] = None
|
||||
@@ -244,6 +244,22 @@ class WorkerManager:
|
||||
self.worker_handlers: Dict[str, WorkerLoggingHandler] = {} # Track active handlers
|
||||
self._worker_last_step: Dict[str, str] = {}
|
||||
|
||||
def close(self) -> None:
|
||||
"""Close the database connection."""
|
||||
if self.db:
|
||||
try:
|
||||
self.db.close()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
def __enter__(self):
|
||||
"""Context manager entry."""
|
||||
return self
|
||||
|
||||
def __exit__(self, exc_type, exc_val, exc_tb):
|
||||
"""Context manager exit - close database."""
|
||||
self.close()
|
||||
|
||||
def add_refresh_callback(self, callback: Callable[[List[Dict[str, Any]]], None]) -> None:
|
||||
"""Register a callback to be called on worker updates.
|
||||
|
||||
|
||||
@@ -12,26 +12,14 @@ from typing import Tuple, Optional, Dict, Any
|
||||
from pathlib import Path
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Global state for Hydrus availability
|
||||
_HYDRUS_AVAILABLE: Optional[bool] = None
|
||||
_HYDRUS_UNAVAILABLE_REASON: Optional[str] = None
|
||||
_HYDRUS_CHECK_COMPLETE = False
|
||||
|
||||
# Global state for Debrid availability
|
||||
_DEBRID_AVAILABLE: Optional[bool] = None
|
||||
_DEBRID_UNAVAILABLE_REASON: Optional[str] = None
|
||||
_DEBRID_CHECK_COMPLETE = False
|
||||
|
||||
# Global state for MPV availability
|
||||
_MPV_AVAILABLE: Optional[bool] = None
|
||||
_MPV_UNAVAILABLE_REASON: Optional[str] = None
|
||||
_MPV_CHECK_COMPLETE = False
|
||||
|
||||
# Global state for Matrix availability
|
||||
_MATRIX_AVAILABLE: Optional[bool] = None
|
||||
_MATRIX_UNAVAILABLE_REASON: Optional[str] = None
|
||||
_MATRIX_CHECK_COMPLETE = False
|
||||
# Global state for all service availability checks - consolidated from 12 separate globals
|
||||
_SERVICE_STATE = {
|
||||
"hydrus": {"available": None, "reason": None, "complete": False},
|
||||
"hydrusnetwork_stores": {}, # Track individual Hydrus instances
|
||||
"debrid": {"available": None, "reason": None, "complete": False},
|
||||
"mpv": {"available": None, "reason": None, "complete": False},
|
||||
"matrix": {"available": None, "reason": None, "complete": False},
|
||||
}
|
||||
|
||||
# Global state for Cookies availability
|
||||
_COOKIES_FILE_PATH: Optional[str] = None
|
||||
@@ -68,130 +56,73 @@ def check_hydrus_availability(config: Dict[str, Any]) -> Tuple[bool, Optional[st
|
||||
return False, error_msg
|
||||
|
||||
|
||||
def initialize_hydrus_health_check(config: Dict[str, Any]) -> None:
|
||||
"""Initialize Hydrus health check at startup.
|
||||
|
||||
This should be called once at application startup to determine if Hydrus
|
||||
features should be enabled or disabled.
|
||||
|
||||
Args:
|
||||
config: Application configuration dictionary
|
||||
"""
|
||||
global _HYDRUS_AVAILABLE, _HYDRUS_UNAVAILABLE_REASON, _HYDRUS_CHECK_COMPLETE
|
||||
|
||||
def initialize_hydrus_health_check(config: Dict[str, Any], emit_debug: bool = True) -> Tuple[bool, Optional[str]]:
|
||||
"""Initialize Hydrus health check at startup."""
|
||||
global _SERVICE_STATE
|
||||
logger.info("[Startup] Starting Hydrus health check...")
|
||||
is_available, reason = check_hydrus_availability(config)
|
||||
_SERVICE_STATE["hydrus"]["available"] = is_available
|
||||
_SERVICE_STATE["hydrus"]["reason"] = reason
|
||||
_SERVICE_STATE["hydrus"]["complete"] = True
|
||||
|
||||
# Track individual Hydrus instances
|
||||
try:
|
||||
is_available, reason = check_hydrus_availability(config)
|
||||
_HYDRUS_AVAILABLE = is_available
|
||||
_HYDRUS_UNAVAILABLE_REASON = reason
|
||||
_HYDRUS_CHECK_COMPLETE = True
|
||||
|
||||
if is_available:
|
||||
debug("Hydrus: ENABLED - All Hydrus features available", file=sys.stderr)
|
||||
else:
|
||||
debug(f"Hydrus: DISABLED - {reason or 'Connection failed'}", file=sys.stderr)
|
||||
|
||||
store_config = config.get("store", {})
|
||||
hydrusnetwork = store_config.get("hydrusnetwork", {})
|
||||
for instance_name, instance_config in hydrusnetwork.items():
|
||||
if isinstance(instance_config, dict):
|
||||
url = instance_config.get("url")
|
||||
access_key = instance_config.get("Hydrus-Client-API-Access-Key")
|
||||
if url and access_key:
|
||||
_SERVICE_STATE["hydrusnetwork_stores"][instance_name] = {
|
||||
"ok": is_available,
|
||||
"url": url,
|
||||
"detail": reason if not is_available else "Connected"
|
||||
}
|
||||
else:
|
||||
_SERVICE_STATE["hydrusnetwork_stores"][instance_name] = {
|
||||
"ok": False,
|
||||
"url": url or "Not configured",
|
||||
"detail": "Missing credentials"
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"[Startup] Failed to initialize Hydrus health check: {e}", exc_info=True)
|
||||
_HYDRUS_AVAILABLE = False
|
||||
_HYDRUS_UNAVAILABLE_REASON = str(e)
|
||||
_HYDRUS_CHECK_COMPLETE = True
|
||||
debug(f"Hydrus: DISABLED - Error during health check: {e}", file=sys.stderr)
|
||||
logger.debug(f"Could not enumerate Hydrus instances: {e}")
|
||||
|
||||
if emit_debug:
|
||||
status = 'ENABLED' if is_available else f'DISABLED - {reason or "Connection failed"}'
|
||||
debug(f"Hydrus: {status}", file=sys.stderr)
|
||||
return is_available, reason
|
||||
|
||||
|
||||
def check_debrid_availability(config: Dict[str, Any]) -> Tuple[bool, Optional[str]]:
|
||||
"""Check if Debrid API is available.
|
||||
|
||||
Args:
|
||||
config: Application configuration dictionary
|
||||
|
||||
Returns:
|
||||
Tuple of (is_available: bool, reason: Optional[str])
|
||||
- (True, None) if Debrid API is available
|
||||
- (False, reason) if Debrid API is unavailable with reason
|
||||
"""
|
||||
"""Check if Debrid API is available."""
|
||||
try:
|
||||
from helper.http_client import HTTPClient
|
||||
|
||||
logger.info("[Debrid Health Check] Pinging Debrid API at https://api.alldebrid.com/v4/ping...")
|
||||
|
||||
try:
|
||||
# Use the public ping endpoint to check API availability
|
||||
# This endpoint doesn't require authentication
|
||||
with HTTPClient(timeout=10.0, verify_ssl=True) as client:
|
||||
response = client.get('https://api.alldebrid.com/v4/ping')
|
||||
logger.debug(f"[Debrid Health Check] Response status: {response.status_code}")
|
||||
|
||||
# Read response text first (handles gzip decompression)
|
||||
try:
|
||||
response_text = response.text
|
||||
logger.debug(f"[Debrid Health Check] Response text: {response_text}")
|
||||
except Exception as e:
|
||||
logger.error(f"[Debrid Health Check] ❌ Failed to read response text: {e}")
|
||||
return False, f"Failed to read response: {e}"
|
||||
|
||||
# Parse JSON
|
||||
try:
|
||||
result = response.json()
|
||||
logger.debug(f"[Debrid Health Check] Response JSON: {result}")
|
||||
except Exception as e:
|
||||
logger.error(f"[Debrid Health Check] ❌ Failed to parse JSON: {e}")
|
||||
logger.error(f"[Debrid Health Check] Response was: {response_text}")
|
||||
return False, f"Failed to parse response: {e}"
|
||||
|
||||
# Validate response format
|
||||
if result.get('status') == 'success' and result.get('data', {}).get('ping') == 'pong':
|
||||
logger.info("[Debrid Health Check] ✅ Debrid API is AVAILABLE")
|
||||
return True, None
|
||||
else:
|
||||
logger.warning(f"[Debrid Health Check] ❌ Debrid API returned unexpected response: {result}")
|
||||
return False, "Invalid API response"
|
||||
except Exception as e:
|
||||
error_msg = str(e)
|
||||
logger.warning(f"[Debrid Health Check] ❌ Debrid API error: {error_msg}")
|
||||
import traceback
|
||||
logger.debug(f"[Debrid Health Check] Traceback: {traceback.format_exc()}")
|
||||
return False, error_msg
|
||||
|
||||
logger.info("[Debrid Health Check] Pinging Debrid API...")
|
||||
with HTTPClient(timeout=10.0, verify_ssl=True) as client:
|
||||
response = client.get('https://api.alldebrid.com/v4/ping')
|
||||
result = response.json()
|
||||
if result.get('status') == 'success' and result.get('data', {}).get('ping') == 'pong':
|
||||
logger.info("[Debrid Health Check] Debrid API is AVAILABLE")
|
||||
return True, None
|
||||
return False, "Invalid API response"
|
||||
except Exception as e:
|
||||
error_msg = str(e)
|
||||
logger.error(f"[Debrid Health Check] ❌ Error checking Debrid availability: {error_msg}")
|
||||
return False, error_msg
|
||||
logger.warning(f"[Debrid Health Check] Debrid API error: {e}")
|
||||
return False, str(e)
|
||||
|
||||
|
||||
def initialize_debrid_health_check(config: Dict[str, Any]) -> None:
|
||||
"""Initialize Debrid health check at startup.
|
||||
|
||||
This should be called once at application startup to determine if Debrid
|
||||
features should be enabled or disabled.
|
||||
|
||||
Args:
|
||||
config: Application configuration dictionary
|
||||
"""
|
||||
global _DEBRID_AVAILABLE, _DEBRID_UNAVAILABLE_REASON, _DEBRID_CHECK_COMPLETE
|
||||
|
||||
def initialize_debrid_health_check(config: Dict[str, Any], emit_debug: bool = True) -> Tuple[bool, Optional[str]]:
|
||||
"""Initialize Debrid health check at startup."""
|
||||
global _SERVICE_STATE
|
||||
logger.info("[Startup] Starting Debrid health check...")
|
||||
|
||||
try:
|
||||
is_available, reason = check_debrid_availability(config)
|
||||
_DEBRID_AVAILABLE = is_available
|
||||
_DEBRID_UNAVAILABLE_REASON = reason
|
||||
_DEBRID_CHECK_COMPLETE = True
|
||||
|
||||
if is_available:
|
||||
debug("✅ Debrid: ENABLED - All Debrid features available", file=sys.stderr)
|
||||
logger.info("[Startup] Debrid health check PASSED")
|
||||
else:
|
||||
debug(f"⚠️ Debrid: DISABLED - {reason or 'Connection failed'}", file=sys.stderr)
|
||||
logger.warning(f"[Startup] Debrid health check FAILED: {reason}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[Startup] Failed to initialize Debrid health check: {e}", exc_info=True)
|
||||
_DEBRID_AVAILABLE = False
|
||||
_DEBRID_UNAVAILABLE_REASON = str(e)
|
||||
_DEBRID_CHECK_COMPLETE = True
|
||||
debug(f"⚠️ Debrid: DISABLED - Error during health check: {e}", file=sys.stderr)
|
||||
is_available, reason = check_debrid_availability(config)
|
||||
_SERVICE_STATE["debrid"]["available"] = is_available
|
||||
_SERVICE_STATE["debrid"]["reason"] = reason
|
||||
_SERVICE_STATE["debrid"]["complete"] = True
|
||||
if emit_debug:
|
||||
status = 'ENABLED' if is_available else f'DISABLED - {reason or "Connection failed"}'
|
||||
debug(f"Debrid: {status}", file=sys.stderr)
|
||||
return is_available, reason
|
||||
|
||||
|
||||
def check_mpv_availability() -> Tuple[bool, Optional[str]]:
|
||||
@@ -200,10 +131,10 @@ def check_mpv_availability() -> Tuple[bool, Optional[str]]:
|
||||
Returns:
|
||||
Tuple of (is_available: bool, reason: Optional[str])
|
||||
"""
|
||||
global _MPV_AVAILABLE, _MPV_UNAVAILABLE_REASON, _MPV_CHECK_COMPLETE
|
||||
global _SERVICE_STATE
|
||||
|
||||
if _MPV_CHECK_COMPLETE and _MPV_AVAILABLE is not None:
|
||||
return _MPV_AVAILABLE, _MPV_UNAVAILABLE_REASON
|
||||
if _SERVICE_STATE["mpv"]["complete"] and _SERVICE_STATE["mpv"]["available"] is not None:
|
||||
return _SERVICE_STATE["mpv"]["available"], _SERVICE_STATE["mpv"]["reason"]
|
||||
|
||||
import shutil
|
||||
import subprocess
|
||||
@@ -212,11 +143,8 @@ def check_mpv_availability() -> Tuple[bool, Optional[str]]:
|
||||
|
||||
mpv_path = shutil.which("mpv")
|
||||
if not mpv_path:
|
||||
_MPV_AVAILABLE = False
|
||||
_MPV_UNAVAILABLE_REASON = "Executable 'mpv' not found in PATH"
|
||||
_MPV_CHECK_COMPLETE = True
|
||||
logger.warning(f"[MPV Health Check] ❌ MPV is UNAVAILABLE: {_MPV_UNAVAILABLE_REASON}")
|
||||
return False, _MPV_UNAVAILABLE_REASON
|
||||
logger.warning(f"[MPV Health Check] ❌ MPV is UNAVAILABLE: Executable 'mpv' not found in PATH")
|
||||
return False, "Executable 'mpv' not found in PATH"
|
||||
|
||||
# Try to get version to confirm it works
|
||||
try:
|
||||
@@ -228,55 +156,35 @@ def check_mpv_availability() -> Tuple[bool, Optional[str]]:
|
||||
)
|
||||
if result.returncode == 0:
|
||||
version_line = result.stdout.split('\n')[0]
|
||||
_MPV_AVAILABLE = True
|
||||
_MPV_UNAVAILABLE_REASON = None
|
||||
_MPV_CHECK_COMPLETE = True
|
||||
logger.info(f"[MPV Health Check] ✅ MPV is AVAILABLE ({version_line})")
|
||||
logger.info(f"[MPV Health Check] MPV is AVAILABLE ({version_line})")
|
||||
return True, None
|
||||
else:
|
||||
_MPV_AVAILABLE = False
|
||||
_MPV_UNAVAILABLE_REASON = f"MPV returned non-zero exit code: {result.returncode}"
|
||||
_MPV_CHECK_COMPLETE = True
|
||||
logger.warning(f"[MPV Health Check] ❌ MPV is UNAVAILABLE: {_MPV_UNAVAILABLE_REASON}")
|
||||
return False, _MPV_UNAVAILABLE_REASON
|
||||
reason = f"MPV returned non-zero exit code: {result.returncode}"
|
||||
logger.warning(f"[MPV Health Check] ❌ MPV is UNAVAILABLE: {reason}")
|
||||
return False, reason
|
||||
except Exception as e:
|
||||
_MPV_AVAILABLE = False
|
||||
_MPV_UNAVAILABLE_REASON = f"Error running MPV: {e}"
|
||||
_MPV_CHECK_COMPLETE = True
|
||||
logger.warning(f"[MPV Health Check] ❌ MPV is UNAVAILABLE: {_MPV_UNAVAILABLE_REASON}")
|
||||
return False, _MPV_UNAVAILABLE_REASON
|
||||
reason = f"Error running MPV: {e}"
|
||||
logger.warning(f"[MPV Health Check] ❌ MPV is UNAVAILABLE: {reason}")
|
||||
return False, reason
|
||||
|
||||
|
||||
def initialize_mpv_health_check() -> None:
|
||||
"""Initialize MPV health check at startup.
|
||||
|
||||
This should be called once at application startup to determine if MPV
|
||||
features should be enabled or disabled.
|
||||
"""
|
||||
global _MPV_AVAILABLE, _MPV_UNAVAILABLE_REASON, _MPV_CHECK_COMPLETE
|
||||
def initialize_mpv_health_check(emit_debug: bool = True) -> Tuple[bool, Optional[str]]:
|
||||
"""Initialize MPV health check at startup and return (is_available, reason)."""
|
||||
global _SERVICE_STATE
|
||||
|
||||
logger.info("[Startup] Starting MPV health check...")
|
||||
is_available, reason = check_mpv_availability()
|
||||
_SERVICE_STATE["mpv"]["available"] = is_available
|
||||
_SERVICE_STATE["mpv"]["reason"] = reason
|
||||
_SERVICE_STATE["mpv"]["complete"] = True
|
||||
|
||||
try:
|
||||
is_available, reason = check_mpv_availability()
|
||||
_MPV_AVAILABLE = is_available
|
||||
_MPV_UNAVAILABLE_REASON = reason
|
||||
_MPV_CHECK_COMPLETE = True
|
||||
|
||||
if emit_debug:
|
||||
if is_available:
|
||||
debug("✅ MPV: ENABLED - All MPV features available", file=sys.stderr)
|
||||
logger.info("[Startup] MPV health check PASSED")
|
||||
else:
|
||||
debug(f"⚠️ MPV: DISABLED - {reason or 'Connection failed'}", file=sys.stderr)
|
||||
debug("→ Hydrus features still available", file=sys.stderr)
|
||||
logger.warning(f"[Startup] MPV health check FAILED: {reason}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[Startup] Failed to initialize MPV health check: {e}", exc_info=True)
|
||||
_MPV_AVAILABLE = False
|
||||
_MPV_UNAVAILABLE_REASON = str(e)
|
||||
_MPV_CHECK_COMPLETE = True
|
||||
debug(f"⚠️ MPV: DISABLED - Error during health check: {e}", file=sys.stderr)
|
||||
debug("MPV: ENABLED - All MPV features available", file=sys.stderr)
|
||||
elif reason != "Not configured":
|
||||
debug(f"MPV: DISABLED - {reason or 'Connection failed'}", file=sys.stderr)
|
||||
|
||||
return is_available, reason
|
||||
|
||||
|
||||
def check_matrix_availability(config: Dict[str, Any]) -> Tuple[bool, Optional[str]]:
|
||||
@@ -324,264 +232,262 @@ def check_matrix_availability(config: Dict[str, Any]) -> Tuple[bool, Optional[st
|
||||
return False, str(e)
|
||||
|
||||
|
||||
def initialize_matrix_health_check(config: Dict[str, Any]) -> None:
|
||||
"""Initialize Matrix health check at startup."""
|
||||
global _MATRIX_AVAILABLE, _MATRIX_UNAVAILABLE_REASON, _MATRIX_CHECK_COMPLETE
|
||||
|
||||
def initialize_matrix_health_check(config: Dict[str, Any], emit_debug: bool = True) -> Tuple[bool, Optional[str]]:
|
||||
"""Initialize Matrix health check at startup and return (is_available, reason)."""
|
||||
global _SERVICE_STATE
|
||||
|
||||
logger.info("[Startup] Starting Matrix health check...")
|
||||
is_available, reason = check_matrix_availability(config)
|
||||
_SERVICE_STATE["matrix"]["available"] = is_available
|
||||
_SERVICE_STATE["matrix"]["reason"] = reason
|
||||
_SERVICE_STATE["matrix"]["complete"] = True
|
||||
|
||||
try:
|
||||
is_available, reason = check_matrix_availability(config)
|
||||
_MATRIX_AVAILABLE = is_available
|
||||
_MATRIX_UNAVAILABLE_REASON = reason
|
||||
_MATRIX_CHECK_COMPLETE = True
|
||||
|
||||
if emit_debug:
|
||||
if is_available:
|
||||
debug("Matrix: ENABLED - Homeserver reachable", file=sys.stderr)
|
||||
else:
|
||||
if reason != "Not configured":
|
||||
debug(f"Matrix: DISABLED - {reason}", file=sys.stderr)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[Startup] Failed to initialize Matrix health check: {e}", exc_info=True)
|
||||
_MATRIX_AVAILABLE = False
|
||||
_MATRIX_UNAVAILABLE_REASON = str(e)
|
||||
_MATRIX_CHECK_COMPLETE = True
|
||||
|
||||
|
||||
def is_hydrus_available() -> bool:
|
||||
"""Check if Hydrus is available (from cached health check).
|
||||
elif reason != "Not configured":
|
||||
debug(f"Matrix: DISABLED - {reason}", file=sys.stderr)
|
||||
|
||||
Returns:
|
||||
True if Hydrus API is available, False otherwise
|
||||
"""
|
||||
return _HYDRUS_AVAILABLE is True
|
||||
return is_available, reason
|
||||
|
||||
|
||||
# Unified getter functions for service availability - all use _SERVICE_STATE
|
||||
def is_hydrus_available() -> bool:
|
||||
"""Check if Hydrus is available (from cached health check)."""
|
||||
return _SERVICE_STATE["hydrus"]["available"] is True
|
||||
|
||||
|
||||
def get_hydrus_unavailable_reason() -> Optional[str]:
|
||||
"""Get the reason why Hydrus is unavailable.
|
||||
|
||||
Returns:
|
||||
String explaining why Hydrus is unavailable, or None if available
|
||||
"""
|
||||
return _HYDRUS_UNAVAILABLE_REASON if not is_hydrus_available() else None
|
||||
"""Get the reason why Hydrus is unavailable."""
|
||||
return _SERVICE_STATE["hydrus"]["reason"] if not is_hydrus_available() else None
|
||||
|
||||
|
||||
def is_hydrus_check_complete() -> bool:
|
||||
"""Check if the Hydrus health check has been completed.
|
||||
|
||||
Returns:
|
||||
True if health check has run, False if still pending
|
||||
"""
|
||||
return _HYDRUS_CHECK_COMPLETE
|
||||
"""Check if the Hydrus health check has been completed."""
|
||||
return _SERVICE_STATE["hydrus"]["complete"]
|
||||
|
||||
|
||||
def disable_hydrus_features() -> None:
|
||||
"""Manually disable all Hydrus features (for testing/fallback).
|
||||
|
||||
This can be called if Hydrus connectivity is lost after startup.
|
||||
"""
|
||||
global _HYDRUS_AVAILABLE, _HYDRUS_UNAVAILABLE_REASON
|
||||
_HYDRUS_AVAILABLE = False
|
||||
_HYDRUS_UNAVAILABLE_REASON = "Manually disabled or lost connection"
|
||||
"""Manually disable all Hydrus features (for testing/fallback)."""
|
||||
global _SERVICE_STATE
|
||||
_SERVICE_STATE["hydrus"]["available"] = False
|
||||
_SERVICE_STATE["hydrus"]["reason"] = "Manually disabled or lost connection"
|
||||
logger.warning("[Hydrus] Features manually disabled")
|
||||
|
||||
|
||||
def enable_hydrus_features() -> None:
|
||||
"""Manually enable Hydrus features (for testing/fallback).
|
||||
|
||||
This can be called if Hydrus connectivity is restored after startup.
|
||||
"""
|
||||
global _HYDRUS_AVAILABLE, _HYDRUS_UNAVAILABLE_REASON
|
||||
_HYDRUS_AVAILABLE = True
|
||||
_HYDRUS_UNAVAILABLE_REASON = None
|
||||
"""Manually enable Hydrus features (for testing/fallback)."""
|
||||
global _SERVICE_STATE
|
||||
_SERVICE_STATE["hydrus"]["available"] = True
|
||||
_SERVICE_STATE["hydrus"]["reason"] = None
|
||||
logger.info("[Hydrus] Features manually enabled")
|
||||
|
||||
|
||||
def is_debrid_available() -> bool:
|
||||
"""Check if Debrid is available (from cached health check).
|
||||
|
||||
Returns:
|
||||
True if Debrid API is available, False otherwise
|
||||
"""
|
||||
return _DEBRID_AVAILABLE is True
|
||||
"""Check if Debrid is available (from cached health check)."""
|
||||
return _SERVICE_STATE["debrid"]["available"] is True
|
||||
|
||||
|
||||
def get_debrid_unavailable_reason() -> Optional[str]:
|
||||
"""Get the reason why Debrid is unavailable.
|
||||
|
||||
Returns:
|
||||
String explaining why Debrid is unavailable, or None if available
|
||||
"""
|
||||
return _DEBRID_UNAVAILABLE_REASON if not is_debrid_available() else None
|
||||
"""Get the reason why Debrid is unavailable."""
|
||||
return _SERVICE_STATE["debrid"]["reason"] if not is_debrid_available() else None
|
||||
|
||||
|
||||
def is_debrid_check_complete() -> bool:
|
||||
"""Check if the Debrid health check has been completed.
|
||||
|
||||
Returns:
|
||||
True if health check has run, False if still pending
|
||||
"""
|
||||
return _DEBRID_CHECK_COMPLETE
|
||||
"""Check if the Debrid health check has been completed."""
|
||||
return _SERVICE_STATE["debrid"]["complete"]
|
||||
|
||||
|
||||
def disable_debrid_features() -> None:
|
||||
"""Manually disable all Debrid features (for testing/fallback).
|
||||
|
||||
This can be called if Debrid connectivity is lost after startup.
|
||||
"""
|
||||
global _DEBRID_AVAILABLE, _DEBRID_UNAVAILABLE_REASON
|
||||
_DEBRID_AVAILABLE = False
|
||||
_DEBRID_UNAVAILABLE_REASON = "Manually disabled or lost connection"
|
||||
"""Manually disable all Debrid features (for testing/fallback)."""
|
||||
global _SERVICE_STATE
|
||||
_SERVICE_STATE["debrid"]["available"] = False
|
||||
_SERVICE_STATE["debrid"]["reason"] = "Manually disabled or lost connection"
|
||||
logger.warning("[Debrid] Features manually disabled")
|
||||
|
||||
|
||||
def enable_debrid_features() -> None:
|
||||
"""Manually enable Debrid features (for testing/fallback).
|
||||
|
||||
This can be called if Debrid connectivity is restored after startup.
|
||||
"""
|
||||
global _DEBRID_AVAILABLE, _DEBRID_UNAVAILABLE_REASON
|
||||
_DEBRID_AVAILABLE = True
|
||||
_DEBRID_UNAVAILABLE_REASON = None
|
||||
"""Manually enable Debrid features (for testing/fallback)."""
|
||||
global _SERVICE_STATE
|
||||
_SERVICE_STATE["debrid"]["available"] = True
|
||||
_SERVICE_STATE["debrid"]["reason"] = None
|
||||
logger.info("[Debrid] Features manually enabled")
|
||||
|
||||
|
||||
def is_mpv_available() -> bool:
|
||||
"""Check if MPV is available (from cached health check).
|
||||
|
||||
Returns:
|
||||
True if MPV is available, False otherwise
|
||||
"""
|
||||
return _MPV_AVAILABLE is True
|
||||
|
||||
"""Check if MPV is available (from cached health check)."""
|
||||
return _SERVICE_STATE["mpv"]["available"] is True
|
||||
|
||||
def get_mpv_unavailable_reason() -> Optional[str]:
|
||||
"""Get the reason why MPV is unavailable.
|
||||
|
||||
Returns:
|
||||
String explaining why MPV is unavailable, or None if available
|
||||
"""
|
||||
return _MPV_UNAVAILABLE_REASON if not is_mpv_available() else None
|
||||
"""Get the reason why MPV is unavailable."""
|
||||
return _SERVICE_STATE["mpv"]["reason"] if not is_mpv_available() else None
|
||||
|
||||
|
||||
def is_mpv_check_complete() -> bool:
|
||||
"""Check if the MPV health check has been completed.
|
||||
|
||||
Returns:
|
||||
True if health check has run, False if still pending
|
||||
"""
|
||||
return _MPV_CHECK_COMPLETE
|
||||
"""Check if the MPV health check has been completed."""
|
||||
return _SERVICE_STATE["mpv"]["complete"]
|
||||
|
||||
|
||||
def disable_mpv_features() -> None:
|
||||
"""Manually disable all MPV features (for testing/fallback).
|
||||
|
||||
This can be called if MPV connectivity is lost after startup.
|
||||
"""
|
||||
global _MPV_AVAILABLE, _MPV_UNAVAILABLE_REASON
|
||||
_MPV_AVAILABLE = False
|
||||
_MPV_UNAVAILABLE_REASON = "Manually disabled or lost connection"
|
||||
"""Manually disable all MPV features (for testing/fallback)."""
|
||||
global _SERVICE_STATE
|
||||
_SERVICE_STATE["mpv"]["available"] = False
|
||||
_SERVICE_STATE["mpv"]["reason"] = "Manually disabled or lost connection"
|
||||
logger.warning("[MPV] Features manually disabled")
|
||||
|
||||
|
||||
def enable_mpv_features() -> None:
|
||||
"""Manually enable MPV features (for testing/fallback).
|
||||
|
||||
This can be called if MPV connectivity is restored after startup.
|
||||
"""
|
||||
global _MPV_AVAILABLE, _MPV_UNAVAILABLE_REASON
|
||||
_MPV_AVAILABLE = True
|
||||
_MPV_UNAVAILABLE_REASON = None
|
||||
"""Manually enable MPV features (for testing/fallback)."""
|
||||
global _SERVICE_STATE
|
||||
_SERVICE_STATE["mpv"]["available"] = True
|
||||
_SERVICE_STATE["mpv"]["reason"] = None
|
||||
logger.info("[MPV] Features manually enabled")
|
||||
|
||||
|
||||
def is_matrix_available() -> bool:
|
||||
"""Check if Matrix is available (from cached health check).
|
||||
|
||||
Returns:
|
||||
True if Matrix is available, False otherwise
|
||||
"""
|
||||
return _MATRIX_AVAILABLE is True
|
||||
"""Check if Matrix is available (from cached health check)."""
|
||||
return _SERVICE_STATE["matrix"]["available"] is True
|
||||
|
||||
|
||||
def get_matrix_unavailable_reason() -> Optional[str]:
|
||||
"""Get the reason why Matrix is unavailable.
|
||||
|
||||
Returns:
|
||||
String explaining why Matrix is unavailable, or None if available
|
||||
"""
|
||||
return _MATRIX_UNAVAILABLE_REASON if not is_matrix_available() else None
|
||||
"""Get the reason why Matrix is unavailable."""
|
||||
return _SERVICE_STATE["matrix"]["reason"] if not is_matrix_available() else None
|
||||
|
||||
|
||||
def is_matrix_check_complete() -> bool:
|
||||
"""Check if the Matrix health check has been completed.
|
||||
|
||||
Returns:
|
||||
True if health check has run, False if still pending
|
||||
"""
|
||||
return _MATRIX_CHECK_COMPLETE
|
||||
"""Check if the Matrix health check has been completed."""
|
||||
return _SERVICE_STATE["matrix"]["complete"]
|
||||
|
||||
|
||||
def disable_matrix_features() -> None:
|
||||
"""Manually disable all Matrix features (for testing/fallback).
|
||||
|
||||
This can be called if Matrix connectivity is lost after startup.
|
||||
"""
|
||||
global _MATRIX_AVAILABLE, _MATRIX_UNAVAILABLE_REASON
|
||||
_MATRIX_AVAILABLE = False
|
||||
_MATRIX_UNAVAILABLE_REASON = "Manually disabled or lost connection"
|
||||
"""Manually disable all Matrix features (for testing/fallback)."""
|
||||
global _SERVICE_STATE
|
||||
_SERVICE_STATE["matrix"]["available"] = False
|
||||
_SERVICE_STATE["matrix"]["reason"] = "Manually disabled or lost connection"
|
||||
logger.warning("[Matrix] Features manually disabled")
|
||||
|
||||
|
||||
def enable_matrix_features() -> None:
|
||||
"""Manually enable Matrix features (for testing/fallback).
|
||||
|
||||
This can be called if Matrix connectivity is restored after startup.
|
||||
"""
|
||||
global _MATRIX_AVAILABLE, _MATRIX_UNAVAILABLE_REASON
|
||||
_MATRIX_AVAILABLE = True
|
||||
_MATRIX_UNAVAILABLE_REASON = None
|
||||
"""Manually enable Matrix features (for testing/fallback)."""
|
||||
global _SERVICE_STATE
|
||||
_SERVICE_STATE["matrix"]["available"] = True
|
||||
_SERVICE_STATE["matrix"]["reason"] = None
|
||||
logger.info("[Matrix] Features manually enabled")
|
||||
|
||||
|
||||
def initialize_local_library_scan(config: Dict[str, Any]) -> None:
|
||||
"""Initialize and scan local library at startup.
|
||||
def initialize_local_library_scan(config: Dict[str, Any], emit_debug: bool = True) -> Tuple[bool, str]:
|
||||
"""Initialize and scan all folder stores at startup.
|
||||
|
||||
Returns a tuple of (success, detail_message).
|
||||
|
||||
This ensures that any new files in the local library folder are indexed
|
||||
Note: Individual store results are stored in _SERVICE_STATE["folder_stores"]
|
||||
for the CLI to display as separate table rows.
|
||||
|
||||
This ensures that any new files in configured folder stores are indexed
|
||||
and their sidecar files are imported and cleaned up.
|
||||
"""
|
||||
from config import get_local_storage_path
|
||||
from helper.local_library import LocalLibraryInitializer
|
||||
from helper.folder_store import LocalLibraryInitializer
|
||||
from helper.store import Folder
|
||||
|
||||
logger.info("[Startup] Starting Local Library scan...")
|
||||
logger.info("[Startup] Starting folder store scans...")
|
||||
|
||||
try:
|
||||
storage_path = get_local_storage_path(config)
|
||||
if not storage_path:
|
||||
debug("⚠️ Local Library: SKIPPED - No storage path configured", file=sys.stderr)
|
||||
return
|
||||
# Get all configured folder stores from config
|
||||
folder_sources = config.get("store", {}).get("folder", {})
|
||||
if not isinstance(folder_sources, dict) or not folder_sources:
|
||||
if emit_debug:
|
||||
debug("⚠️ Folder stores: SKIPPED - No folder stores configured", file=sys.stderr)
|
||||
return False, "No folder stores configured"
|
||||
|
||||
results = []
|
||||
total_new_files = 0
|
||||
total_sidecars = 0
|
||||
failed_stores = []
|
||||
store_results = {}
|
||||
|
||||
for store_name, store_config in folder_sources.items():
|
||||
if not isinstance(store_config, dict):
|
||||
continue
|
||||
|
||||
debug(f"Scanning local library at: {storage_path}", file=sys.stderr)
|
||||
initializer = LocalLibraryInitializer(storage_path)
|
||||
stats = initializer.scan_and_index()
|
||||
store_path = store_config.get("path")
|
||||
if not store_path:
|
||||
continue
|
||||
|
||||
try:
|
||||
from pathlib import Path
|
||||
storage_path = Path(str(store_path)).expanduser()
|
||||
|
||||
if emit_debug:
|
||||
debug(f"Scanning folder store '{store_name}' at: {storage_path}", file=sys.stderr)
|
||||
|
||||
# Migrate the folder store to hash-based naming (only runs once per location)
|
||||
Folder.migrate_location(str(storage_path))
|
||||
|
||||
initializer = LocalLibraryInitializer(storage_path)
|
||||
stats = initializer.scan_and_index()
|
||||
|
||||
# Accumulate stats
|
||||
new_files = stats.get('files_new', 0)
|
||||
sidecars = stats.get('sidecars_imported', 0)
|
||||
total_new_files += new_files
|
||||
total_sidecars += sidecars
|
||||
|
||||
# Record result for this store
|
||||
if new_files > 0 or sidecars > 0:
|
||||
result_detail = f"New: {new_files}, Sidecars: {sidecars}"
|
||||
if emit_debug:
|
||||
debug(f" {store_name}: {result_detail}", file=sys.stderr)
|
||||
else:
|
||||
result_detail = "Up to date"
|
||||
if emit_debug:
|
||||
debug(f" {store_name}: {result_detail}", file=sys.stderr)
|
||||
|
||||
results.append(f"{store_name}: {result_detail}")
|
||||
store_results[store_name] = {
|
||||
"path": str(storage_path),
|
||||
"detail": result_detail,
|
||||
"ok": True
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[Startup] Failed to scan folder store '{store_name}': {e}", exc_info=True)
|
||||
if emit_debug:
|
||||
debug(f" {store_name}: ERROR - {e}", file=sys.stderr)
|
||||
failed_stores.append(store_name)
|
||||
store_results[store_name] = {
|
||||
"path": str(store_config.get("path", "?")),
|
||||
"detail": f"ERROR - {e}",
|
||||
"ok": False
|
||||
}
|
||||
|
||||
# Log summary
|
||||
new_files = stats.get('files_new', 0)
|
||||
sidecars = stats.get('sidecars_imported', 0)
|
||||
# Store individual results for CLI to display
|
||||
_SERVICE_STATE["folder_stores"] = store_results
|
||||
|
||||
if new_files > 0 or sidecars > 0:
|
||||
debug(f"✅ Local Library: Scanned - New files: {new_files}, Sidecars imported: {sidecars}", file=sys.stderr)
|
||||
# Build detail message
|
||||
if failed_stores:
|
||||
detail = f"Scanned {len(results)} stores ({len(failed_stores)} failed); Total new: {total_new_files}, Sidecars: {total_sidecars}"
|
||||
if emit_debug:
|
||||
debug(f"Folder stores scan complete: {detail}", file=sys.stderr)
|
||||
return len(failed_stores) < len(results), detail
|
||||
else:
|
||||
debug("✅ Local Library: Up to date", file=sys.stderr)
|
||||
detail = f"Scanned {len(results)} stores; Total new: {total_new_files}, Sidecars: {total_sidecars}"
|
||||
if emit_debug:
|
||||
debug(f"Folder stores scan complete: {detail}", file=sys.stderr)
|
||||
return True, detail
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[Startup] Failed to scan local library: {e}", exc_info=True)
|
||||
debug(f"⚠️ Local Library: ERROR - Scan failed: {e}", file=sys.stderr)
|
||||
logger.error(f"[Startup] Failed to scan folder stores: {e}", exc_info=True)
|
||||
if emit_debug:
|
||||
debug(f"⚠️ Folder stores: ERROR - Scan failed: {e}", file=sys.stderr)
|
||||
return False, f"Scan failed: {e}"
|
||||
|
||||
|
||||
def initialize_cookies_check() -> None:
|
||||
"""Check for cookies.txt in the application root directory."""
|
||||
def initialize_cookies_check(emit_debug: bool = True) -> Tuple[bool, str]:
|
||||
"""Check for cookies.txt in the application root directory.
|
||||
|
||||
Returns a tuple of (found, detail_message).
|
||||
"""
|
||||
global _COOKIES_FILE_PATH
|
||||
|
||||
# Assume CLI.py is in the root
|
||||
@@ -590,10 +496,12 @@ def initialize_cookies_check() -> None:
|
||||
|
||||
if cookies_path.exists():
|
||||
_COOKIES_FILE_PATH = str(cookies_path)
|
||||
debug(f"✅ Cookies: ENABLED - Found cookies.txt", file=sys.stderr)
|
||||
if emit_debug:
|
||||
debug(f"Cookies: ENABLED - Found cookies.txt", file=sys.stderr)
|
||||
return True, str(cookies_path)
|
||||
else:
|
||||
_COOKIES_FILE_PATH = None
|
||||
# debug("ℹ️ Cookies: Using browser cookies (fallback)", file=sys.stderr)
|
||||
return False, "Not found"
|
||||
|
||||
|
||||
def get_cookies_file_path() -> Optional[str]:
|
||||
|
||||
943
metadata.py
943
metadata.py
File diff suppressed because it is too large
Load Diff
225
models.py
225
models.py
@@ -16,134 +16,183 @@ from typing import Any, Callable, Dict, List, Optional, Protocol, TextIO, Tuple
|
||||
class PipeObject:
|
||||
"""Unified pipeline object for tracking files, metadata, tags, and relationships through the pipeline.
|
||||
|
||||
This is the single source of truth for all result data in the pipeline. It can represent:
|
||||
- Tag extraction results (IMDb, MusicBrainz, OpenLibrary lookups)
|
||||
- Remote metadata fetches
|
||||
- File operations with metadata/tags and relationship tracking
|
||||
- Search results
|
||||
- Files with version relationships (king/alt/related)
|
||||
This is the single source of truth for all result data in the pipeline. Uses the hash+store
|
||||
canonical pattern for file identification.
|
||||
|
||||
Attributes:
|
||||
source: Source of the object (e.g., 'imdb', 'musicbrainz', 'libgen', 'debrid', 'file', etc.)
|
||||
identifier: Unique identifier from the source (e.g., IMDb ID, MBID, magnet hash, file hash)
|
||||
hash: SHA-256 hash of the file (canonical identifier)
|
||||
store: Storage backend name (e.g., 'default', 'hydrus', 'test', 'home')
|
||||
tags: List of extracted or assigned tags
|
||||
title: Human-readable title if applicable
|
||||
source_url: URL where the object came from
|
||||
duration: Duration in seconds if applicable
|
||||
metadata: Full metadata dictionary from source
|
||||
remote_metadata: Additional remote metadata
|
||||
warnings: Any warnings or issues encountered
|
||||
mpv_metadata: MPV-specific metadata if applicable
|
||||
file_path: Path to the file if this object represents a file
|
||||
file_hash: SHA-256 hash of the file for integrity and relationship tracking
|
||||
king_hash: Hash of the primary/master version of this file (for alternates)
|
||||
alt_hashes: List of hashes for alternate versions of this file
|
||||
related_hashes: List of hashes for related files (e.g., screenshots, editions)
|
||||
path: Path to the file if this object represents a file
|
||||
relationships: Relationship data (king/alt/related hashes)
|
||||
is_temp: If True, this is a temporary/intermediate artifact that may be cleaned up
|
||||
action: The cmdlet that created this object (format: 'cmdlet:cmdlet_name', e.g., 'cmdlet:get-file')
|
||||
parent_id: Hash of the parent file in the pipeline chain (for tracking provenance/lineage)
|
||||
action: The cmdlet that created this object (format: 'cmdlet:cmdlet_name')
|
||||
parent_hash: Hash of the parent file in the pipeline chain (for tracking provenance/lineage)
|
||||
extra: Additional fields not covered above
|
||||
"""
|
||||
source: str
|
||||
identifier: str
|
||||
hash: str
|
||||
store: str
|
||||
tags: List[str] = field(default_factory=list)
|
||||
title: Optional[str] = None
|
||||
url: Optional[str] = None
|
||||
source_url: Optional[str] = None
|
||||
duration: Optional[float] = None
|
||||
metadata: Dict[str, Any] = field(default_factory=dict)
|
||||
remote_metadata: Optional[Dict[str, Any]] = None
|
||||
warnings: List[str] = field(default_factory=list)
|
||||
mpv_metadata: Optional[Dict[str, Any]] = None
|
||||
file_path: Optional[str] = None
|
||||
file_hash: Optional[str] = None
|
||||
king_hash: Optional[str] = None
|
||||
alt_hashes: List[str] = field(default_factory=list)
|
||||
related_hashes: List[str] = field(default_factory=list)
|
||||
path: Optional[str] = None
|
||||
relationships: Dict[str, Any] = field(default_factory=dict)
|
||||
is_temp: bool = False
|
||||
action: Optional[str] = None
|
||||
parent_id: Optional[str] = None
|
||||
parent_hash: Optional[str] = None
|
||||
extra: Dict[str, Any] = field(default_factory=dict)
|
||||
|
||||
def register_as_king(self, file_hash: str) -> None:
|
||||
"""Register this object as the king (primary) version of a file."""
|
||||
self.king_hash = file_hash
|
||||
|
||||
def add_alternate(self, alt_hash: str) -> None:
|
||||
"""Add an alternate version hash for this file."""
|
||||
if alt_hash not in self.alt_hashes:
|
||||
self.alt_hashes.append(alt_hash)
|
||||
|
||||
def add_related(self, related_hash: str) -> None:
|
||||
"""Add a related file hash (e.g., screenshot, edition)."""
|
||||
if related_hash not in self.related_hashes:
|
||||
self.related_hashes.append(related_hash)
|
||||
def add_relationship(self, rel_type: str, rel_hash: str) -> None:
|
||||
"""Add a relationship hash.
|
||||
|
||||
Args:
|
||||
rel_type: Relationship type ('king', 'alt', 'related')
|
||||
rel_hash: Hash to add to the relationship
|
||||
"""
|
||||
if rel_type not in self.relationships:
|
||||
self.relationships[rel_type] = []
|
||||
|
||||
if isinstance(self.relationships[rel_type], list):
|
||||
if rel_hash not in self.relationships[rel_type]:
|
||||
self.relationships[rel_type].append(rel_hash)
|
||||
else:
|
||||
# Single value (e.g., king), convert to that value
|
||||
self.relationships[rel_type] = rel_hash
|
||||
|
||||
def get_relationships(self) -> Dict[str, Any]:
|
||||
"""Get all relationships for this object."""
|
||||
rels = {}
|
||||
if self.king_hash:
|
||||
rels["king"] = self.king_hash
|
||||
if self.alt_hashes:
|
||||
rels["alt"] = self.alt_hashes
|
||||
if self.related_hashes:
|
||||
rels["related"] = self.related_hashes
|
||||
return rels
|
||||
return self.relationships.copy() if self.relationships else {}
|
||||
|
||||
def debug_table(self) -> None:
|
||||
"""Print a formatted debug table showing PipeObject state.
|
||||
|
||||
Only prints when debug logging is enabled. Useful for tracking
|
||||
object state throughout the pipeline.
|
||||
"""
|
||||
try:
|
||||
from helper.logger import is_debug_enabled, debug
|
||||
|
||||
if not is_debug_enabled():
|
||||
return
|
||||
except Exception:
|
||||
return
|
||||
|
||||
# Prepare display values
|
||||
hash_display = self.hash or "N/A"
|
||||
store_display = self.store or "N/A"
|
||||
title_display = self.title or "N/A"
|
||||
tags_display = ", ".join(self.tags[:3]) if self.tags else "[]"
|
||||
if len(self.tags) > 3:
|
||||
tags_display += f" (+{len(self.tags) - 3} more)"
|
||||
file_path_display = self.path or "N/A"
|
||||
if file_path_display != "N/A" and len(file_path_display) > 50:
|
||||
file_path_display = "..." + file_path_display[-47:]
|
||||
|
||||
url_display = self.url or "N/A"
|
||||
if url_display != "N/A" and len(url_display) > 48:
|
||||
url_display = url_display[:45] + "..."
|
||||
|
||||
relationships_display = "N/A"
|
||||
if self.relationships:
|
||||
rel_parts = []
|
||||
for key, val in self.relationships.items():
|
||||
if isinstance(val, list):
|
||||
rel_parts.append(f"{key}({len(val)})")
|
||||
else:
|
||||
rel_parts.append(key)
|
||||
relationships_display = ", ".join(rel_parts)
|
||||
|
||||
warnings_display = f"{len(self.warnings)} warning(s)" if self.warnings else "none"
|
||||
|
||||
# Print table
|
||||
debug("┌─────────────────────────────────────────────────────────────┐")
|
||||
debug("│ PipeObject Debug Info │")
|
||||
debug("├─────────────────────────────────────────────────────────────┤")
|
||||
debug(f"│ Hash : {hash_display:<48}│")
|
||||
debug(f"│ Store : {store_display:<48}│")
|
||||
debug(f"│ Title : {title_display:<48}│")
|
||||
debug(f"│ Tags : {tags_display:<48}│")
|
||||
debug(f"│ URL : {url_display:<48}│")
|
||||
debug(f"│ File Path : {file_path_display:<48}│")
|
||||
debug(f"│ Relationships: {relationships_display:<47}│")
|
||||
debug(f"│ Warnings : {warnings_display:<48}│")
|
||||
|
||||
# Show extra keys as individual rows
|
||||
if self.extra:
|
||||
debug("├─────────────────────────────────────────────────────────────┤")
|
||||
debug("│ Extra Fields: │")
|
||||
for key, val in self.extra.items():
|
||||
# Format value for display
|
||||
if isinstance(val, (list, set)):
|
||||
val_display = f"{type(val).__name__}({len(val)})"
|
||||
elif isinstance(val, dict):
|
||||
val_display = f"dict({len(val)})"
|
||||
elif isinstance(val, (int, float)):
|
||||
val_display = str(val)
|
||||
else:
|
||||
val_str = str(val)
|
||||
val_display = val_str if len(val_str) <= 40 else val_str[:37] + "..."
|
||||
|
||||
# Truncate key if needed
|
||||
key_display = key if len(key) <= 15 else key[:12] + "..."
|
||||
debug(f"│ {key_display:<15}: {val_display:<42}│")
|
||||
|
||||
if self.action:
|
||||
debug("├─────────────────────────────────────────────────────────────┤")
|
||||
action_display = self.action[:48]
|
||||
debug(f"│ Action : {action_display:<48}│")
|
||||
if self.parent_hash:
|
||||
if not self.action:
|
||||
debug("├─────────────────────────────────────────────────────────────┤")
|
||||
parent_display = self.parent_hash[:12] + "..." if len(self.parent_hash) > 12 else self.parent_hash
|
||||
debug(f"│ Parent Hash : {parent_display:<48}│")
|
||||
debug("└─────────────────────────────────────────────────────────────┘")
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
"""Serialize to dictionary, excluding None and empty values."""
|
||||
data: Dict[str, Any] = {
|
||||
"source": self.source,
|
||||
"tags": self.tags,
|
||||
"hash": self.hash,
|
||||
"store": self.store,
|
||||
}
|
||||
if self.identifier:
|
||||
data["id"] = self.identifier
|
||||
|
||||
if self.tags:
|
||||
data["tags"] = self.tags
|
||||
if self.title:
|
||||
data["title"] = self.title
|
||||
if self.url:
|
||||
data["url"] = self.url
|
||||
if self.source_url:
|
||||
data["source_url"] = self.source_url
|
||||
if self.duration is not None:
|
||||
data["duration"] = self.duration
|
||||
if self.metadata:
|
||||
data["metadata"] = self.metadata
|
||||
if self.remote_metadata is not None:
|
||||
data["remote_metadata"] = self.remote_metadata
|
||||
if self.mpv_metadata is not None:
|
||||
data["mpv_metadata"] = self.mpv_metadata
|
||||
if self.warnings:
|
||||
data["warnings"] = self.warnings
|
||||
if self.file_path:
|
||||
data["file_path"] = self.file_path
|
||||
if self.file_hash:
|
||||
data["file_hash"] = self.file_hash
|
||||
# Include pipeline chain tracking fields
|
||||
if self.path:
|
||||
data["path"] = self.path
|
||||
if self.relationships:
|
||||
data["relationships"] = self.relationships
|
||||
if self.is_temp:
|
||||
data["is_temp"] = self.is_temp
|
||||
if self.action:
|
||||
data["action"] = self.action
|
||||
if self.parent_id:
|
||||
data["parent_id"] = self.parent_id
|
||||
# Include relationship data if present
|
||||
rels = self.get_relationships()
|
||||
if rels:
|
||||
data["relationships"] = rels
|
||||
if self.parent_hash:
|
||||
data["parent_hash"] = self.parent_hash
|
||||
|
||||
# Add extra fields
|
||||
data.update({k: v for k, v in self.extra.items() if v is not None})
|
||||
return data
|
||||
|
||||
@property
|
||||
def hash(self) -> str:
|
||||
"""Compute SHA-256 hash from source and identifier."""
|
||||
base = f"{self.source}:{self.identifier}"
|
||||
return hashlib.sha256(base.encode('utf-8')).hexdigest()
|
||||
|
||||
# Backwards compatibility aliases
|
||||
def as_dict(self) -> Dict[str, Any]:
|
||||
"""Alias for to_dict() for backwards compatibility."""
|
||||
return self.to_dict()
|
||||
|
||||
def to_serializable(self) -> Dict[str, Any]:
|
||||
"""Alias for to_dict() for backwards compatibility."""
|
||||
return self.to_dict()
|
||||
|
||||
|
||||
class FileRelationshipTracker:
|
||||
"""Track relationships between files for sidecar creation.
|
||||
@@ -235,6 +284,7 @@ class DownloadOptions:
|
||||
clip_sections: Optional[str] = None
|
||||
playlist_items: Optional[str] = None # yt-dlp --playlist-items format (e.g., "1-3,5,8")
|
||||
no_playlist: bool = False # If True, pass --no-playlist to yt-dlp
|
||||
quiet: bool = False # If True, suppress all console output (progress, debug logs)
|
||||
|
||||
|
||||
class SendFunc(Protocol):
|
||||
@@ -546,18 +596,25 @@ class ProgressBar:
|
||||
class PipelineStageContext:
|
||||
"""Context information for the current pipeline stage."""
|
||||
|
||||
def __init__(self, stage_index: int, total_stages: int):
|
||||
def __init__(self, stage_index: int, total_stages: int, worker_id: Optional[str] = None):
|
||||
self.stage_index = stage_index
|
||||
self.total_stages = total_stages
|
||||
self.is_last_stage = (stage_index == total_stages - 1)
|
||||
self.worker_id = worker_id
|
||||
self.emits: List[Any] = []
|
||||
|
||||
def emit(self, obj: Any) -> None:
|
||||
"""Emit an object to the next pipeline stage."""
|
||||
self.emits.append(obj)
|
||||
|
||||
def get_current_command_text(self) -> str:
|
||||
"""Get the current command text (for backward compatibility)."""
|
||||
# This is maintained for backward compatibility with old code
|
||||
# In a real implementation, this would come from the stage context
|
||||
return ""
|
||||
|
||||
def __repr__(self) -> str:
|
||||
return f"PipelineStageContext(stage={self.stage_index}/{self.total_stages}, is_last={self.is_last_stage})"
|
||||
return f"PipelineStageContext(stage={self.stage_index}/{self.total_stages}, is_last={self.is_last_stage}, worker_id={self.worker_id})"
|
||||
|
||||
|
||||
# ============================================================================
|
||||
|
||||
241
pipeline.py
241
pipeline.py
@@ -25,21 +25,18 @@ from models import PipelineStageContext
|
||||
from helper.logger import log
|
||||
|
||||
|
||||
def _is_selectable_table(table: Any) -> bool:
|
||||
"""Return True when a table can be used for @ selection."""
|
||||
return bool(table) and not getattr(table, "no_choice", False)
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# PIPELINE GLOBALS (maintained for backward compatibility)
|
||||
# PIPELINE STATE
|
||||
# ============================================================================
|
||||
|
||||
# Current pipeline context (thread-local in real world, global here for simplicity)
|
||||
# Current pipeline context
|
||||
_CURRENT_CONTEXT: Optional[PipelineStageContext] = None
|
||||
|
||||
# Active execution state
|
||||
_PIPE_EMITS: List[Any] = []
|
||||
_PIPE_ACTIVE: bool = False
|
||||
_PIPE_IS_LAST: bool = False
|
||||
|
||||
# Ephemeral handoff for direct pipelines (e.g., URL --screen-shot | ...)
|
||||
_LAST_PIPELINE_CAPTURE: Optional[Any] = None
|
||||
|
||||
# Remember last search query to support refreshing results after pipeline actions
|
||||
_LAST_SEARCH_QUERY: Optional[str] = None
|
||||
|
||||
@@ -52,25 +49,23 @@ _PIPELINE_LAST_ITEMS: List[Any] = []
|
||||
# Store the last result table for @ selection syntax (e.g., @2, @2-5, @{1,3,5})
|
||||
_LAST_RESULT_TABLE: Optional[Any] = None
|
||||
_LAST_RESULT_ITEMS: List[Any] = []
|
||||
# Subject for the current result table (e.g., the file whose tags/URLs are displayed)
|
||||
# Subject for the current result table (e.g., the file whose tags/url are displayed)
|
||||
_LAST_RESULT_SUBJECT: Optional[Any] = None
|
||||
|
||||
# History of result tables for @.. navigation (LIFO stack, max 20 tables)
|
||||
_RESULT_TABLE_HISTORY: List[tuple[Optional[Any], List[Any], Optional[Any]]] = []
|
||||
_MAX_RESULT_TABLE_HISTORY = 20
|
||||
|
||||
# Forward history for @,, navigation (LIFO stack for popped tables)
|
||||
_RESULT_TABLE_FORWARD: List[tuple[Optional[Any], List[Any], Optional[Any]]] = []
|
||||
|
||||
# Current stage table for @N expansion (separate from history)
|
||||
# Used to track the ResultTable with source_command + row_selection_args from current pipeline stage
|
||||
# This is set by cmdlets that display tabular results (e.g., download-data showing formats)
|
||||
# and used by CLI to expand @N into full commands like "download-data URL -item 2"
|
||||
_CURRENT_STAGE_TABLE: Optional[Any] = None
|
||||
|
||||
# Items displayed by non-selectable commands (get-tag, delete-tag, etc.)
|
||||
# These are available for @N selection but NOT saved to history
|
||||
_DISPLAY_ITEMS: List[Any] = []
|
||||
|
||||
# Table for display-only commands (overlay)
|
||||
# Used when a command wants to show a specific table formatting but not affect history
|
||||
_DISPLAY_TABLE: Optional[Any] = None
|
||||
# Subject for overlay/display-only tables (takes precedence over _LAST_RESULT_SUBJECT)
|
||||
_DISPLAY_SUBJECT: Optional[Any] = None
|
||||
@@ -98,7 +93,7 @@ _UI_LIBRARY_REFRESH_CALLBACK: Optional[Any] = None
|
||||
# ============================================================================
|
||||
|
||||
def set_stage_context(context: Optional[PipelineStageContext]) -> None:
|
||||
"""Internal: Set the current pipeline stage context."""
|
||||
"""Set the current pipeline stage context."""
|
||||
global _CURRENT_CONTEXT
|
||||
_CURRENT_CONTEXT = context
|
||||
|
||||
@@ -126,26 +121,21 @@ def emit(obj: Any) -> None:
|
||||
return 0
|
||||
```
|
||||
"""
|
||||
# Try new context-based approach first
|
||||
if _CURRENT_CONTEXT is not None:
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
logger.debug(f"[EMIT] Context-based: appending to _CURRENT_CONTEXT.emits. obj={obj}")
|
||||
_CURRENT_CONTEXT.emit(obj)
|
||||
return
|
||||
|
||||
|
||||
def emit_list(objects: List[Any]) -> None:
|
||||
"""Emit a list of objects to the next pipeline stage.
|
||||
|
||||
# Fallback to legacy global approach (for backward compatibility)
|
||||
try:
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
logger.debug(f"[EMIT] Legacy: appending to _PIPE_EMITS. obj type={type(obj).__name__}, _PIPE_EMITS len before={len(_PIPE_EMITS)}")
|
||||
_PIPE_EMITS.append(obj)
|
||||
logger.debug(f"[EMIT] Legacy: _PIPE_EMITS len after={len(_PIPE_EMITS)}")
|
||||
except Exception as e:
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
logger.error(f"[EMIT] Error appending to _PIPE_EMITS: {e}", exc_info=True)
|
||||
pass
|
||||
This allows cmdlets to emit multiple results that are tracked as a list,
|
||||
enabling downstream cmdlets to process all of them or filter by metadata.
|
||||
|
||||
Args:
|
||||
objects: List of objects to emit
|
||||
"""
|
||||
if _CURRENT_CONTEXT is not None:
|
||||
_CURRENT_CONTEXT.emit(objects)
|
||||
|
||||
|
||||
def print_if_visible(*args: Any, file=None, **kwargs: Any) -> None:
|
||||
@@ -171,7 +161,7 @@ def print_if_visible(*args: Any, file=None, **kwargs: Any) -> None:
|
||||
"""
|
||||
try:
|
||||
# Print if: not in a pipeline OR this is the last stage
|
||||
should_print = (not _PIPE_ACTIVE) or _PIPE_IS_LAST
|
||||
should_print = (_CURRENT_CONTEXT is None) or (_CURRENT_CONTEXT and _CURRENT_CONTEXT.is_last_stage)
|
||||
|
||||
# Always print to stderr regardless
|
||||
if file is not None:
|
||||
@@ -304,17 +294,17 @@ def clear_pending_pipeline_tail() -> None:
|
||||
_PENDING_PIPELINE_SOURCE = None
|
||||
|
||||
|
||||
|
||||
|
||||
def reset() -> None:
|
||||
"""Reset all pipeline state. Called between pipeline executions."""
|
||||
global _PIPE_EMITS, _PIPE_ACTIVE, _PIPE_IS_LAST, _PIPELINE_VALUES
|
||||
global _LAST_PIPELINE_CAPTURE, _PIPELINE_REFRESHED, _PIPELINE_LAST_ITEMS
|
||||
global _PIPELINE_COMMAND_TEXT, _LAST_RESULT_SUBJECT, _DISPLAY_SUBJECT
|
||||
global _PENDING_PIPELINE_TAIL, _PENDING_PIPELINE_SOURCE
|
||||
global _PIPELINE_VALUES, _LAST_SEARCH_QUERY, _PIPELINE_REFRESHED
|
||||
global _PIPELINE_LAST_ITEMS, _PIPELINE_COMMAND_TEXT, _LAST_RESULT_SUBJECT
|
||||
global _DISPLAY_SUBJECT, _PENDING_PIPELINE_TAIL, _PENDING_PIPELINE_SOURCE
|
||||
global _CURRENT_CONTEXT
|
||||
|
||||
_PIPE_EMITS = []
|
||||
_PIPE_ACTIVE = False
|
||||
_PIPE_IS_LAST = False
|
||||
_LAST_PIPELINE_CAPTURE = None
|
||||
_CURRENT_CONTEXT = None
|
||||
_LAST_SEARCH_QUERY = None
|
||||
_PIPELINE_REFRESHED = False
|
||||
_PIPELINE_LAST_ITEMS = []
|
||||
_PIPELINE_VALUES = {}
|
||||
@@ -327,13 +317,15 @@ def reset() -> None:
|
||||
|
||||
def get_emitted_items() -> List[Any]:
|
||||
"""Get a copy of all items emitted by the current pipeline stage."""
|
||||
return list(_PIPE_EMITS)
|
||||
if _CURRENT_CONTEXT is not None:
|
||||
return list(_CURRENT_CONTEXT.emits)
|
||||
return []
|
||||
|
||||
|
||||
def clear_emits() -> None:
|
||||
"""Clear the emitted items list (called between stages)."""
|
||||
global _PIPE_EMITS
|
||||
_PIPE_EMITS = []
|
||||
if _CURRENT_CONTEXT is not None:
|
||||
_CURRENT_CONTEXT.emits.clear()
|
||||
|
||||
|
||||
def set_last_selection(indices: Sequence[int]) -> None:
|
||||
@@ -375,20 +367,8 @@ def clear_current_command_text() -> None:
|
||||
_PIPELINE_COMMAND_TEXT = ""
|
||||
|
||||
|
||||
def set_active(active: bool) -> None:
|
||||
"""Internal: Set whether we're in a pipeline context."""
|
||||
global _PIPE_ACTIVE
|
||||
_PIPE_ACTIVE = active
|
||||
|
||||
|
||||
def set_last_stage(is_last: bool) -> None:
|
||||
"""Internal: Set whether this is the last stage of the pipeline."""
|
||||
global _PIPE_IS_LAST
|
||||
_PIPE_IS_LAST = is_last
|
||||
|
||||
|
||||
def set_search_query(query: Optional[str]) -> None:
|
||||
"""Internal: Set the last search query for refresh purposes."""
|
||||
"""Set the last search query for refresh purposes."""
|
||||
global _LAST_SEARCH_QUERY
|
||||
_LAST_SEARCH_QUERY = query
|
||||
|
||||
@@ -399,7 +379,7 @@ def get_search_query() -> Optional[str]:
|
||||
|
||||
|
||||
def set_pipeline_refreshed(refreshed: bool) -> None:
|
||||
"""Internal: Track whether the pipeline already refreshed results."""
|
||||
"""Track whether the pipeline already refreshed results."""
|
||||
global _PIPELINE_REFRESHED
|
||||
_PIPELINE_REFRESHED = refreshed
|
||||
|
||||
@@ -410,7 +390,7 @@ def was_pipeline_refreshed() -> bool:
|
||||
|
||||
|
||||
def set_last_items(items: list) -> None:
|
||||
"""Internal: Cache the last pipeline outputs."""
|
||||
"""Cache the last pipeline outputs."""
|
||||
global _PIPELINE_LAST_ITEMS
|
||||
_PIPELINE_LAST_ITEMS = list(items) if items else []
|
||||
|
||||
@@ -420,17 +400,6 @@ def get_last_items() -> List[Any]:
|
||||
return list(_PIPELINE_LAST_ITEMS)
|
||||
|
||||
|
||||
def set_last_capture(obj: Any) -> None:
|
||||
"""Internal: Store ephemeral handoff for direct pipelines."""
|
||||
global _LAST_PIPELINE_CAPTURE
|
||||
_LAST_PIPELINE_CAPTURE = obj
|
||||
|
||||
|
||||
def get_last_capture() -> Optional[Any]:
|
||||
"""Get ephemeral pipeline handoff (e.g., URL --screen-shot | ...)."""
|
||||
return _LAST_PIPELINE_CAPTURE
|
||||
|
||||
|
||||
def set_ui_library_refresh_callback(callback: Any) -> None:
|
||||
"""Set a callback to be called when library content is updated.
|
||||
|
||||
@@ -501,6 +470,22 @@ def set_last_result_table(result_table: Optional[Any], items: Optional[List[Any]
|
||||
_LAST_RESULT_TABLE = result_table
|
||||
_LAST_RESULT_ITEMS = items or []
|
||||
_LAST_RESULT_SUBJECT = subject
|
||||
|
||||
# Sort table by Title/Name column alphabetically if available
|
||||
if result_table is not None and hasattr(result_table, 'sort_by_title') and not getattr(result_table, 'preserve_order', False):
|
||||
try:
|
||||
result_table.sort_by_title()
|
||||
# Re-order items list to match the sorted table
|
||||
if _LAST_RESULT_ITEMS and hasattr(result_table, 'rows'):
|
||||
sorted_items = []
|
||||
for row in result_table.rows:
|
||||
src_idx = getattr(row, 'source_index', None)
|
||||
if isinstance(src_idx, int) and 0 <= src_idx < len(_LAST_RESULT_ITEMS):
|
||||
sorted_items.append(_LAST_RESULT_ITEMS[src_idx])
|
||||
if len(sorted_items) == len(result_table.rows):
|
||||
_LAST_RESULT_ITEMS = sorted_items
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
|
||||
def set_last_result_table_overlay(result_table: Optional[Any], items: Optional[List[Any]] = None, subject: Optional[Any] = None) -> None:
|
||||
@@ -518,6 +503,22 @@ def set_last_result_table_overlay(result_table: Optional[Any], items: Optional[L
|
||||
_DISPLAY_TABLE = result_table
|
||||
_DISPLAY_ITEMS = items or []
|
||||
_DISPLAY_SUBJECT = subject
|
||||
|
||||
# Sort table by Title/Name column alphabetically if available
|
||||
if result_table is not None and hasattr(result_table, 'sort_by_title') and not getattr(result_table, 'preserve_order', False):
|
||||
try:
|
||||
result_table.sort_by_title()
|
||||
# Re-order items list to match the sorted table
|
||||
if _DISPLAY_ITEMS and hasattr(result_table, 'rows'):
|
||||
sorted_items = []
|
||||
for row in result_table.rows:
|
||||
src_idx = getattr(row, 'source_index', None)
|
||||
if isinstance(src_idx, int) and 0 <= src_idx < len(_DISPLAY_ITEMS):
|
||||
sorted_items.append(_DISPLAY_ITEMS[src_idx])
|
||||
if len(sorted_items) == len(result_table.rows):
|
||||
_DISPLAY_ITEMS = sorted_items
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
|
||||
def set_last_result_table_preserve_history(result_table: Optional[Any], items: Optional[List[Any]] = None, subject: Optional[Any] = None) -> None:
|
||||
@@ -567,7 +568,7 @@ def restore_previous_result_table() -> bool:
|
||||
True if a previous table was restored, False if history is empty
|
||||
"""
|
||||
global _LAST_RESULT_TABLE, _LAST_RESULT_ITEMS, _LAST_RESULT_SUBJECT
|
||||
global _RESULT_TABLE_HISTORY, _DISPLAY_ITEMS, _DISPLAY_TABLE, _DISPLAY_SUBJECT
|
||||
global _RESULT_TABLE_HISTORY, _RESULT_TABLE_FORWARD, _DISPLAY_ITEMS, _DISPLAY_TABLE, _DISPLAY_SUBJECT
|
||||
|
||||
# If we have an active overlay (display items/table), clear it to "go back" to the underlying table
|
||||
if _DISPLAY_ITEMS or _DISPLAY_TABLE or _DISPLAY_SUBJECT is not None:
|
||||
@@ -579,6 +580,9 @@ def restore_previous_result_table() -> bool:
|
||||
if not _RESULT_TABLE_HISTORY:
|
||||
return False
|
||||
|
||||
# Save current state to forward stack before popping
|
||||
_RESULT_TABLE_FORWARD.append((_LAST_RESULT_TABLE, _LAST_RESULT_ITEMS, _LAST_RESULT_SUBJECT))
|
||||
|
||||
# Pop from history and restore
|
||||
prev = _RESULT_TABLE_HISTORY.pop()
|
||||
if isinstance(prev, tuple) and len(prev) >= 3:
|
||||
@@ -595,6 +599,44 @@ def restore_previous_result_table() -> bool:
|
||||
return True
|
||||
|
||||
|
||||
def restore_next_result_table() -> bool:
|
||||
"""Restore the next result table from forward history (for @,, navigation).
|
||||
|
||||
Returns:
|
||||
True if a next table was restored, False if forward history is empty
|
||||
"""
|
||||
global _LAST_RESULT_TABLE, _LAST_RESULT_ITEMS, _LAST_RESULT_SUBJECT
|
||||
global _RESULT_TABLE_HISTORY, _RESULT_TABLE_FORWARD, _DISPLAY_ITEMS, _DISPLAY_TABLE, _DISPLAY_SUBJECT
|
||||
|
||||
# If we have an active overlay (display items/table), clear it to "go forward" to the underlying table
|
||||
if _DISPLAY_ITEMS or _DISPLAY_TABLE or _DISPLAY_SUBJECT is not None:
|
||||
_DISPLAY_ITEMS = []
|
||||
_DISPLAY_TABLE = None
|
||||
_DISPLAY_SUBJECT = None
|
||||
return True
|
||||
|
||||
if not _RESULT_TABLE_FORWARD:
|
||||
return False
|
||||
|
||||
# Save current state to history stack before popping forward
|
||||
_RESULT_TABLE_HISTORY.append((_LAST_RESULT_TABLE, _LAST_RESULT_ITEMS, _LAST_RESULT_SUBJECT))
|
||||
|
||||
# Pop from forward stack and restore
|
||||
next_state = _RESULT_TABLE_FORWARD.pop()
|
||||
if isinstance(next_state, tuple) and len(next_state) >= 3:
|
||||
_LAST_RESULT_TABLE, _LAST_RESULT_ITEMS, _LAST_RESULT_SUBJECT = next_state[0], next_state[1], next_state[2]
|
||||
elif isinstance(next_state, tuple) and len(next_state) == 2:
|
||||
_LAST_RESULT_TABLE, _LAST_RESULT_ITEMS = next_state
|
||||
_LAST_RESULT_SUBJECT = None
|
||||
else:
|
||||
_LAST_RESULT_TABLE, _LAST_RESULT_ITEMS, _LAST_RESULT_SUBJECT = None, [], None
|
||||
# Clear display items so get_last_result_items() falls back to restored items
|
||||
_DISPLAY_ITEMS = []
|
||||
_DISPLAY_TABLE = None
|
||||
_DISPLAY_SUBJECT = None
|
||||
return True
|
||||
|
||||
|
||||
def get_display_table() -> Optional[Any]:
|
||||
"""Get the current display overlay table.
|
||||
|
||||
@@ -637,9 +679,15 @@ def get_last_result_items() -> List[Any]:
|
||||
# Prioritize items from display commands (get-tag, delete-tag, etc.)
|
||||
# These are available for immediate @N selection
|
||||
if _DISPLAY_ITEMS:
|
||||
if _DISPLAY_TABLE is not None and not _is_selectable_table(_DISPLAY_TABLE):
|
||||
return []
|
||||
return _DISPLAY_ITEMS
|
||||
# Fall back to items from last search/selectable command
|
||||
return _LAST_RESULT_ITEMS
|
||||
if _LAST_RESULT_TABLE is None:
|
||||
return _LAST_RESULT_ITEMS
|
||||
if _is_selectable_table(_LAST_RESULT_TABLE):
|
||||
return _LAST_RESULT_ITEMS
|
||||
return []
|
||||
|
||||
|
||||
def get_last_result_table_source_command() -> Optional[str]:
|
||||
@@ -648,7 +696,7 @@ def get_last_result_table_source_command() -> Optional[str]:
|
||||
Returns:
|
||||
Command name (e.g., 'download-data') or None if not set
|
||||
"""
|
||||
if _LAST_RESULT_TABLE and hasattr(_LAST_RESULT_TABLE, 'source_command'):
|
||||
if _is_selectable_table(_LAST_RESULT_TABLE) and hasattr(_LAST_RESULT_TABLE, 'source_command'):
|
||||
return _LAST_RESULT_TABLE.source_command
|
||||
return None
|
||||
|
||||
@@ -659,7 +707,7 @@ def get_last_result_table_source_args() -> List[str]:
|
||||
Returns:
|
||||
List of arguments (e.g., ['https://example.com']) or empty list
|
||||
"""
|
||||
if _LAST_RESULT_TABLE and hasattr(_LAST_RESULT_TABLE, 'source_args'):
|
||||
if _is_selectable_table(_LAST_RESULT_TABLE) and hasattr(_LAST_RESULT_TABLE, 'source_args'):
|
||||
return _LAST_RESULT_TABLE.source_args or []
|
||||
return []
|
||||
|
||||
@@ -673,7 +721,7 @@ def get_last_result_table_row_selection_args(row_index: int) -> Optional[List[st
|
||||
Returns:
|
||||
Selection arguments (e.g., ['-item', '3']) or None
|
||||
"""
|
||||
if _LAST_RESULT_TABLE and hasattr(_LAST_RESULT_TABLE, 'rows'):
|
||||
if _is_selectable_table(_LAST_RESULT_TABLE) and hasattr(_LAST_RESULT_TABLE, 'rows'):
|
||||
if 0 <= row_index < len(_LAST_RESULT_TABLE.rows):
|
||||
row = _LAST_RESULT_TABLE.rows[row_index]
|
||||
if hasattr(row, 'selection_args'):
|
||||
@@ -696,13 +744,18 @@ def set_current_stage_table(result_table: Optional[Any]) -> None:
|
||||
_CURRENT_STAGE_TABLE = result_table
|
||||
|
||||
|
||||
def get_current_stage_table() -> Optional[Any]:
|
||||
"""Get the current pipeline stage table (if any)."""
|
||||
return _CURRENT_STAGE_TABLE
|
||||
|
||||
|
||||
def get_current_stage_table_source_command() -> Optional[str]:
|
||||
"""Get the source command from the current pipeline stage table.
|
||||
|
||||
Returns:
|
||||
Command name (e.g., 'download-data') or None
|
||||
"""
|
||||
if _CURRENT_STAGE_TABLE and hasattr(_CURRENT_STAGE_TABLE, 'source_command'):
|
||||
if _is_selectable_table(_CURRENT_STAGE_TABLE) and hasattr(_CURRENT_STAGE_TABLE, 'source_command'):
|
||||
return _CURRENT_STAGE_TABLE.source_command
|
||||
return None
|
||||
|
||||
@@ -713,7 +766,7 @@ def get_current_stage_table_source_args() -> List[str]:
|
||||
Returns:
|
||||
List of arguments or empty list
|
||||
"""
|
||||
if _CURRENT_STAGE_TABLE and hasattr(_CURRENT_STAGE_TABLE, 'source_args'):
|
||||
if _is_selectable_table(_CURRENT_STAGE_TABLE) and hasattr(_CURRENT_STAGE_TABLE, 'source_args'):
|
||||
return _CURRENT_STAGE_TABLE.source_args or []
|
||||
return []
|
||||
|
||||
@@ -727,7 +780,7 @@ def get_current_stage_table_row_selection_args(row_index: int) -> Optional[List[
|
||||
Returns:
|
||||
Selection arguments or None
|
||||
"""
|
||||
if _CURRENT_STAGE_TABLE and hasattr(_CURRENT_STAGE_TABLE, 'rows'):
|
||||
if _is_selectable_table(_CURRENT_STAGE_TABLE) and hasattr(_CURRENT_STAGE_TABLE, 'rows'):
|
||||
if 0 <= row_index < len(_CURRENT_STAGE_TABLE.rows):
|
||||
row = _CURRENT_STAGE_TABLE.rows[row_index]
|
||||
if hasattr(row, 'selection_args'):
|
||||
@@ -735,23 +788,21 @@ def get_current_stage_table_row_selection_args(row_index: int) -> Optional[List[
|
||||
return None
|
||||
|
||||
|
||||
def get_current_stage_table_row_source_index(row_index: int) -> Optional[int]:
|
||||
"""Get the original source index for a row in the current stage table.
|
||||
|
||||
Useful when the table has been sorted for display but selections should map
|
||||
back to the original item order (e.g., playlist or provider order).
|
||||
"""
|
||||
if _is_selectable_table(_CURRENT_STAGE_TABLE) and hasattr(_CURRENT_STAGE_TABLE, 'rows'):
|
||||
if 0 <= row_index < len(_CURRENT_STAGE_TABLE.rows):
|
||||
row = _CURRENT_STAGE_TABLE.rows[row_index]
|
||||
return getattr(row, 'source_index', None)
|
||||
return None
|
||||
|
||||
|
||||
def clear_last_result() -> None:
|
||||
"""Clear the stored last result table and items."""
|
||||
global _LAST_RESULT_TABLE, _LAST_RESULT_ITEMS
|
||||
_LAST_RESULT_TABLE = None
|
||||
_LAST_RESULT_ITEMS = []
|
||||
|
||||
|
||||
def emit_list(objects: List[Any]) -> None:
|
||||
"""Emit a list of PipeObjects to the next pipeline stage.
|
||||
|
||||
This allows cmdlets to emit multiple results that are tracked as a list,
|
||||
enabling downstream cmdlets to process all of them or filter by metadata.
|
||||
|
||||
Args:
|
||||
objects: List of PipeObject instances or dicts to emit
|
||||
"""
|
||||
if _CURRENT_CONTEXT is not None:
|
||||
_CURRENT_CONTEXT.emit(objects)
|
||||
else:
|
||||
_PIPE_EMITS.append(objects)
|
||||
|
||||
@@ -106,7 +106,7 @@ dev = [
|
||||
mm = "medeia_macina.cli_entry:main"
|
||||
medeia = "medeia_macina.cli_entry:main"
|
||||
|
||||
[project.urls]
|
||||
[project.url]
|
||||
Homepage = "https://github.com/yourusername/medeia-macina"
|
||||
Documentation = "https://medeia-macina.readthedocs.io"
|
||||
Repository = "https://github.com/yourusername/medeia-macina.git"
|
||||
|
||||
198
result_table.py
198
result_table.py
@@ -114,6 +114,8 @@ class ResultRow:
|
||||
columns: List[ResultColumn] = field(default_factory=list)
|
||||
selection_args: Optional[List[str]] = None
|
||||
"""Arguments to use for this row when selected via @N syntax (e.g., ['-item', '3'])"""
|
||||
source_index: Optional[int] = None
|
||||
"""Original insertion order index (used to map sorted views back to source items)."""
|
||||
|
||||
def add_column(self, name: str, value: Any) -> None:
|
||||
"""Add a column to this row."""
|
||||
@@ -166,13 +168,14 @@ class ResultTable:
|
||||
>>> print(result_table)
|
||||
"""
|
||||
|
||||
def __init__(self, title: str = "", title_width: int = 80, max_columns: int = None):
|
||||
def __init__(self, title: str = "", title_width: int = 80, max_columns: int = None, preserve_order: bool = False):
|
||||
"""Initialize a result table.
|
||||
|
||||
Args:
|
||||
title: Optional title for the table
|
||||
title_width: Width for formatting the title line
|
||||
max_columns: Maximum number of columns to display (None for unlimited, default: 5 for search results)
|
||||
preserve_order: When True, skip automatic sorting so row order matches source
|
||||
"""
|
||||
self.title = title
|
||||
self.title_width = title_width
|
||||
@@ -187,10 +190,25 @@ class ResultTable:
|
||||
"""Base arguments for the source command"""
|
||||
self.header_lines: List[str] = []
|
||||
"""Optional metadata lines rendered under the title"""
|
||||
self.preserve_order: bool = preserve_order
|
||||
"""If True, skip automatic sorting so display order matches input order."""
|
||||
self.no_choice: bool = False
|
||||
"""When True, suppress row numbers/selection to make the table non-interactive."""
|
||||
|
||||
def set_no_choice(self, no_choice: bool = True) -> "ResultTable":
|
||||
"""Mark the table as non-interactive (no row numbers, no selection parsing)."""
|
||||
self.no_choice = bool(no_choice)
|
||||
return self
|
||||
|
||||
def set_preserve_order(self, preserve: bool = True) -> "ResultTable":
|
||||
"""Configure whether this table should skip automatic sorting."""
|
||||
self.preserve_order = bool(preserve)
|
||||
return self
|
||||
|
||||
def add_row(self) -> ResultRow:
|
||||
"""Add a new row to the table and return it for configuration."""
|
||||
row = ResultRow()
|
||||
row.source_index = len(self.rows)
|
||||
self.rows.append(row)
|
||||
return row
|
||||
|
||||
@@ -210,6 +228,50 @@ class ResultTable:
|
||||
self.source_command = command
|
||||
self.source_args = args or []
|
||||
return self
|
||||
|
||||
def init_command(self, title: str, command: str, args: Optional[List[str]] = None, preserve_order: bool = False) -> "ResultTable":
|
||||
"""Initialize table with title, command, args, and preserve_order in one call.
|
||||
|
||||
Consolidates common initialization pattern: ResultTable(title) + set_source_command(cmd, args) + set_preserve_order(preserve_order)
|
||||
|
||||
Args:
|
||||
title: Table title
|
||||
command: Source command name
|
||||
args: Command arguments
|
||||
preserve_order: Whether to preserve input row order
|
||||
|
||||
Returns:
|
||||
self for method chaining
|
||||
"""
|
||||
self.title = title
|
||||
self.source_command = command
|
||||
self.source_args = args or []
|
||||
self.preserve_order = preserve_order
|
||||
return self
|
||||
|
||||
def copy_with_title(self, new_title: str) -> "ResultTable":
|
||||
"""Create a new table copying settings from this one but with a new title.
|
||||
|
||||
Consolidates pattern: new_table = ResultTable(title); new_table.set_source_command(...)
|
||||
Useful for intermediate processing that needs to preserve source command but update display title.
|
||||
|
||||
Args:
|
||||
new_title: New title for the copied table
|
||||
|
||||
Returns:
|
||||
New ResultTable with copied settings and new title
|
||||
"""
|
||||
new_table = ResultTable(
|
||||
title=new_title,
|
||||
title_width=self.title_width,
|
||||
max_columns=self.max_columns,
|
||||
preserve_order=self.preserve_order
|
||||
)
|
||||
new_table.source_command = self.source_command
|
||||
new_table.source_args = list(self.source_args) if self.source_args else []
|
||||
new_table.input_options = dict(self.input_options) if self.input_options else {}
|
||||
new_table.no_choice = self.no_choice
|
||||
return new_table
|
||||
|
||||
def set_row_selection_args(self, row_index: int, selection_args: List[str]) -> None:
|
||||
"""Set the selection arguments for a specific row.
|
||||
@@ -252,6 +314,39 @@ class ResultTable:
|
||||
self.set_header_line(summary)
|
||||
return summary
|
||||
|
||||
def sort_by_title(self) -> "ResultTable":
|
||||
"""Sort rows alphabetically by Title or Name column.
|
||||
|
||||
Looks for columns named 'Title', 'Name', or 'Tag' (in that order).
|
||||
Case-insensitive sort. Returns self for chaining.
|
||||
|
||||
IMPORTANT: Updates source_index to match new sorted positions so that
|
||||
@N selections continue to work correctly after sorting.
|
||||
"""
|
||||
if getattr(self, "preserve_order", False):
|
||||
return self
|
||||
# Find the title column (try Title, Name, Tag in order)
|
||||
title_col_idx = None
|
||||
for row in self.rows:
|
||||
if not row.columns:
|
||||
continue
|
||||
for idx, col in enumerate(row.columns):
|
||||
col_lower = col.name.lower()
|
||||
if col_lower in ("title", "name", "tag"):
|
||||
title_col_idx = idx
|
||||
break
|
||||
if title_col_idx is not None:
|
||||
break
|
||||
|
||||
if title_col_idx is None:
|
||||
# No title column found, return unchanged
|
||||
return self
|
||||
|
||||
# Sort rows by the title column value (case-insensitive)
|
||||
self.rows.sort(key=lambda row: row.columns[title_col_idx].value.lower() if title_col_idx < len(row.columns) else "")
|
||||
|
||||
return self
|
||||
|
||||
def add_result(self, result: Any) -> "ResultTable":
|
||||
"""Add a result object (SearchResult, PipeObject, ResultItem, TagItem, or dict) as a row.
|
||||
|
||||
@@ -338,8 +433,7 @@ class ResultTable:
|
||||
|
||||
# Size (for files)
|
||||
if hasattr(result, 'size_bytes') and result.size_bytes:
|
||||
size_mb = result.size_bytes / (1024 * 1024)
|
||||
row.add_column("Size", f"{size_mb:.1f} MB")
|
||||
row.add_column("Size (Mb)", _format_size(result.size_bytes, integer_only=True))
|
||||
|
||||
# Annotations
|
||||
if hasattr(result, 'annotations') and result.annotations:
|
||||
@@ -385,8 +479,7 @@ class ResultTable:
|
||||
|
||||
# Size (for files) - integer MB only
|
||||
if hasattr(item, 'size_bytes') and item.size_bytes:
|
||||
size_mb = int(item.size_bytes / (1024 * 1024))
|
||||
row.add_column("Size", f"{size_mb} MB")
|
||||
row.add_column("Size (Mb)", _format_size(item.size_bytes, integer_only=True))
|
||||
|
||||
def _add_tag_item(self, row: ResultRow, item: Any) -> None:
|
||||
"""Extract and add TagItem fields to row (compact tag display).
|
||||
@@ -421,8 +514,8 @@ class ResultTable:
|
||||
row.add_column("Title", obj.title[:50] + ("..." if len(obj.title) > 50 else ""))
|
||||
|
||||
# File info
|
||||
if hasattr(obj, 'file_path') and obj.file_path:
|
||||
file_str = str(obj.file_path)
|
||||
if hasattr(obj, 'path') and obj.path:
|
||||
file_str = str(obj.path)
|
||||
if len(file_str) > 60:
|
||||
file_str = "..." + file_str[-57:]
|
||||
row.add_column("Path", file_str)
|
||||
@@ -467,8 +560,8 @@ class ResultTable:
|
||||
def is_hidden_field(field_name: Any) -> bool:
|
||||
# Hide internal/metadata fields
|
||||
hidden_fields = {
|
||||
'__', 'id', 'action', 'parent_id', 'is_temp', 'file_path', 'extra',
|
||||
'target', 'hash', 'hash_hex', 'file_hash'
|
||||
'__', 'id', 'action', 'parent_id', 'is_temp', 'path', 'extra',
|
||||
'target', 'hash', 'hash_hex', 'file_hash', 'tags', 'tag_summary', 'name'
|
||||
}
|
||||
if isinstance(field_name, str):
|
||||
if field_name.startswith('__'):
|
||||
@@ -551,15 +644,12 @@ class ResultTable:
|
||||
|
||||
# Only add priority groups if we haven't already filled columns from 'columns' field
|
||||
if column_count == 0:
|
||||
# Priority field groups - uses first matching field in each group
|
||||
# Explicitly set which columns to display in order
|
||||
priority_groups = [
|
||||
('title | name | filename', ['title', 'name', 'filename']),
|
||||
('title', ['title']),
|
||||
('ext', ['ext']),
|
||||
('origin | source | store', ['origin', 'source', 'store']),
|
||||
('size | size_bytes', ['size', 'size_bytes']),
|
||||
('type | media_kind | kind', ['type', 'media_kind', 'kind']),
|
||||
('tags | tag_summary', ['tags', 'tag_summary']),
|
||||
('detail | description', ['detail', 'description']),
|
||||
('size', ['size', 'size_bytes']),
|
||||
('store', ['store', 'origin', 'source']),
|
||||
]
|
||||
|
||||
# Add priority field groups first - use first match in each group
|
||||
@@ -568,14 +658,22 @@ class ResultTable:
|
||||
break
|
||||
for field in field_options:
|
||||
if field in visible_data and field not in added_fields:
|
||||
value_str = format_value(visible_data[field])
|
||||
# Special handling for size fields - format as MB integer
|
||||
if field in ['size', 'size_bytes']:
|
||||
value_str = _format_size(visible_data[field], integer_only=True)
|
||||
else:
|
||||
value_str = format_value(visible_data[field])
|
||||
|
||||
if len(value_str) > 60:
|
||||
value_str = value_str[:57] + "..."
|
||||
|
||||
# Special case for Origin/Source -> Store to match user preference
|
||||
col_name = field.replace('_', ' ').title()
|
||||
if field in ['origin', 'source']:
|
||||
# Map field names to display column names
|
||||
if field in ['store', 'origin', 'source']:
|
||||
col_name = "Store"
|
||||
elif field in ['size', 'size_bytes']:
|
||||
col_name = "Size (Mb)"
|
||||
else:
|
||||
col_name = field.replace('_', ' ').title()
|
||||
|
||||
row.add_column(col_name, value_str)
|
||||
added_fields.add(field)
|
||||
@@ -583,17 +681,7 @@ class ResultTable:
|
||||
break # Use first match in this group, skip rest
|
||||
|
||||
# Add remaining fields only if we haven't hit max_columns (and no explicit columns were set)
|
||||
if column_count < self.max_columns:
|
||||
for key, value in visible_data.items():
|
||||
if column_count >= self.max_columns:
|
||||
break
|
||||
if key not in added_fields: # Only add if not already added
|
||||
value_str = format_value(value)
|
||||
if len(value_str) > 40:
|
||||
value_str = value_str[:37] + "..."
|
||||
row.add_column(key.replace('_', ' ').title(), value_str)
|
||||
added_fields.add(key) # Track in added_fields to prevent re-adding
|
||||
column_count += 1
|
||||
# Don't add any remaining fields - only use priority_groups for dict results
|
||||
|
||||
# Check for selection args
|
||||
if '_selection_args' in data:
|
||||
@@ -637,8 +725,8 @@ class ResultTable:
|
||||
value_width
|
||||
)
|
||||
|
||||
# Calculate row number column width
|
||||
num_width = len(str(len(self.rows))) + 1 # +1 for padding
|
||||
# Calculate row number column width (skip if no-choice)
|
||||
num_width = 0 if self.no_choice else len(str(len(self.rows))) + 1
|
||||
|
||||
# Preserve column order
|
||||
column_names = list(col_widths.keys())
|
||||
@@ -647,7 +735,7 @@ class ResultTable:
|
||||
cap = 5 if name.lower() == "ext" else 90
|
||||
return min(col_widths[name], cap)
|
||||
|
||||
widths = [num_width] + [capped_width(name) for name in column_names]
|
||||
widths = ([] if self.no_choice else [num_width]) + [capped_width(name) for name in column_names]
|
||||
base_inner_width = sum(widths) + (len(widths) - 1) * 3 # account for " | " separators
|
||||
|
||||
# Compute final table width (with side walls) to accommodate headers/titles
|
||||
@@ -668,7 +756,7 @@ class ResultTable:
|
||||
# Title block
|
||||
if self.title:
|
||||
lines.append("|" + "=" * (table_width - 2) + "|")
|
||||
lines.append(wrap(self.title.center(table_width - 2)))
|
||||
lines.append(wrap(self.title.ljust(table_width - 2)))
|
||||
lines.append("|" + "=" * (table_width - 2) + "|")
|
||||
|
||||
# Optional header metadata lines
|
||||
@@ -676,8 +764,8 @@ class ResultTable:
|
||||
lines.append(wrap(meta))
|
||||
|
||||
# Add header with # column
|
||||
header_parts = ["#".ljust(num_width)]
|
||||
separator_parts = ["-" * num_width]
|
||||
header_parts = [] if self.no_choice else ["#".ljust(num_width)]
|
||||
separator_parts = [] if self.no_choice else ["-" * num_width]
|
||||
for col_name in column_names:
|
||||
width = capped_width(col_name)
|
||||
header_parts.append(col_name.ljust(width))
|
||||
@@ -688,7 +776,7 @@ class ResultTable:
|
||||
|
||||
# Add rows with row numbers
|
||||
for row_num, row in enumerate(self.rows, 1):
|
||||
row_parts = [str(row_num).ljust(num_width)]
|
||||
row_parts = [] if self.no_choice else [str(row_num).ljust(num_width)]
|
||||
for col_name in column_names:
|
||||
width = capped_width(col_name)
|
||||
col_value = row.get_column(col_name) or ""
|
||||
@@ -785,6 +873,11 @@ class ResultTable:
|
||||
If accept_args=False: List of 0-based indices, or None if cancelled
|
||||
If accept_args=True: Dict with "indices" and "args" keys, or None if cancelled
|
||||
"""
|
||||
if self.no_choice:
|
||||
print(f"\n{self}")
|
||||
print("Selection is disabled for this table.")
|
||||
return None
|
||||
|
||||
# Display the table
|
||||
print(f"\n{self}")
|
||||
|
||||
@@ -832,6 +925,9 @@ class ResultTable:
|
||||
Returns:
|
||||
List of 0-based indices, or None if invalid
|
||||
"""
|
||||
if self.no_choice:
|
||||
return None
|
||||
|
||||
indices = set()
|
||||
|
||||
# Split by comma for multiple selections
|
||||
@@ -1206,14 +1302,15 @@ def _format_duration(duration: Any) -> str:
|
||||
return ""
|
||||
|
||||
|
||||
def _format_size(size: Any) -> str:
|
||||
def _format_size(size: Any, integer_only: bool = False) -> str:
|
||||
"""Format file size as human-readable string.
|
||||
|
||||
Args:
|
||||
size: Size in bytes or already formatted string
|
||||
integer_only: If True, show MB as integer only (e.g., "250 MB" not "250.5 MB")
|
||||
|
||||
Returns:
|
||||
Formatted size string (e.g., "1.5 MB", "250 KB")
|
||||
Formatted size string (e.g., "250 MB", "1.5 MB" or "250 MB" if integer_only=True)
|
||||
"""
|
||||
if isinstance(size, str):
|
||||
return size if size else ""
|
||||
@@ -1223,11 +1320,22 @@ def _format_size(size: Any) -> str:
|
||||
if bytes_val < 0:
|
||||
return ""
|
||||
|
||||
for unit, divisor in [("GB", 1024**3), ("MB", 1024**2), ("KB", 1024)]:
|
||||
if bytes_val >= divisor:
|
||||
return f"{bytes_val / divisor:.1f} {unit}"
|
||||
|
||||
return f"{bytes_val} B"
|
||||
if integer_only:
|
||||
# For table display: always show as integer MB if >= 1MB
|
||||
mb_val = int(bytes_val / (1024 * 1024))
|
||||
if mb_val > 0:
|
||||
return str(mb_val)
|
||||
kb_val = int(bytes_val / 1024)
|
||||
if kb_val > 0:
|
||||
return str(kb_val)
|
||||
return str(bytes_val)
|
||||
else:
|
||||
# For descriptions: show with one decimal place
|
||||
for unit, divisor in [("GB", 1024**3), ("MB", 1024**2), ("KB", 1024)]:
|
||||
if bytes_val >= divisor:
|
||||
return f"{bytes_val / divisor:.1f} {unit}"
|
||||
|
||||
return f"{bytes_val} B"
|
||||
except (ValueError, TypeError):
|
||||
return ""
|
||||
|
||||
|
||||
10
scripts/check_cmdlets_import.py
Normal file
10
scripts/check_cmdlets_import.py
Normal file
@@ -0,0 +1,10 @@
|
||||
import importlib
|
||||
import traceback
|
||||
import sys
|
||||
|
||||
try:
|
||||
importlib.import_module('cmdlets')
|
||||
print('cmdlets imported OK')
|
||||
except Exception:
|
||||
traceback.print_exc()
|
||||
sys.exit(1)
|
||||
8
scripts/check_download_media.py
Normal file
8
scripts/check_download_media.py
Normal file
@@ -0,0 +1,8 @@
|
||||
import importlib, traceback, sys
|
||||
|
||||
try:
|
||||
importlib.import_module('cmdlets.download_media')
|
||||
print('download_media imported OK')
|
||||
except Exception:
|
||||
traceback.print_exc()
|
||||
sys.exit(1)
|
||||
5
scripts/inspect_shared_lines.py
Normal file
5
scripts/inspect_shared_lines.py
Normal file
@@ -0,0 +1,5 @@
|
||||
from pathlib import Path
|
||||
p = Path('cmdlets/_shared.py')
|
||||
for i, line in enumerate(p.read_text().splitlines(), start=1):
|
||||
if 1708 <= i <= 1720:
|
||||
print(f"{i:4}: {repr(line)}")
|
||||
24
scripts/normalize_shared_indent.py
Normal file
24
scripts/normalize_shared_indent.py
Normal file
@@ -0,0 +1,24 @@
|
||||
from pathlib import Path
|
||||
import re
|
||||
|
||||
p = Path('cmdlets/_shared.py')
|
||||
src = p.read_text(encoding='utf-8')
|
||||
lines = src.splitlines(True)
|
||||
changed = False
|
||||
new_lines = []
|
||||
for line in lines:
|
||||
m = re.match(r'^(?P<ws>[ \t]*)', line)
|
||||
ws = m.group('ws') if m else ''
|
||||
if '\t' in ws:
|
||||
new_ws = ws.replace('\t', ' ')
|
||||
new_line = new_ws + line[len(ws):]
|
||||
new_lines.append(new_line)
|
||||
changed = True
|
||||
else:
|
||||
new_lines.append(line)
|
||||
|
||||
if changed:
|
||||
p.write_text(''.join(new_lines), encoding='utf-8')
|
||||
print('Normalized leading tabs to spaces in', p)
|
||||
else:
|
||||
print('No leading tabs found; no changes made')
|
||||
160
scripts/refactor_download_careful.py
Normal file
160
scripts/refactor_download_careful.py
Normal file
@@ -0,0 +1,160 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Careful refactoring of download_data.py to class-based pattern.
|
||||
Handles nested functions and inner definitions correctly.
|
||||
"""
|
||||
|
||||
import re
|
||||
from pathlib import Path
|
||||
|
||||
def refactor_download_data():
|
||||
backup_file = Path('cmdlets/download_data_backup.py')
|
||||
output_file = Path('cmdlets/download_data.py')
|
||||
|
||||
print(f"Reading: {backup_file}")
|
||||
content = backup_file.read_text(encoding='utf-8')
|
||||
lines = content.split('\n')
|
||||
|
||||
output = []
|
||||
i = 0
|
||||
in_cmdlet_def = False
|
||||
skip_old_run_wrapper = False
|
||||
class_added = False
|
||||
|
||||
while i < len(lines):
|
||||
line = lines[i]
|
||||
|
||||
# Skip old _run wrapper function
|
||||
if line.strip().startswith('def _run(result: Any'):
|
||||
while i < len(lines):
|
||||
i += 1
|
||||
if lines[i] and not lines[i][0].isspace():
|
||||
break
|
||||
continue
|
||||
|
||||
# Skip old CMDLET definition
|
||||
if line.strip().startswith('CMDLET = Cmdlet('):
|
||||
while i < len(lines):
|
||||
i += 1
|
||||
if lines[i].strip() == ')':
|
||||
i += 1
|
||||
break
|
||||
output.append('')
|
||||
output.append('# Create and register the cmdlet')
|
||||
output.append('CMDLET = Download_Data()')
|
||||
output.append('')
|
||||
continue
|
||||
|
||||
# Insert class definition before first top-level helper
|
||||
if not class_added and line.strip().startswith('def _download_torrent_worker('):
|
||||
# Add class header with __init__ and run()
|
||||
output.extend([
|
||||
'',
|
||||
'',
|
||||
'class Download_Data(Cmdlet):',
|
||||
' """Class-based download-data cmdlet with self-registration."""',
|
||||
'',
|
||||
' def __init__(self) -> None:',
|
||||
' """Initialize download-data cmdlet."""',
|
||||
' super().__init__(',
|
||||
' name="download-data",',
|
||||
' summary="Download data from url with playlist/clip support using yt-dlp",',
|
||||
' usage="download-data <url> [options] or search-file | download-data [options]",',
|
||||
' alias=["download", "dl"],',
|
||||
' arg=[',
|
||||
' CmdletArg(name="url", type="string", required=False, description="URL to download (HTTP/HTTPS or file with URL list)", variadic=True),',
|
||||
' CmdletArg(name="-url", type="string", description="URL to download (alias for positional argument)", variadic=True),',
|
||||
' CmdletArg(name="list-formats", type="flag", description="List available formats without downloading"),',
|
||||
' CmdletArg(name="audio", type="flag", alias="a", description="Download audio only (extract from video)"),',
|
||||
' CmdletArg(name="video", type="flag", alias="v", description="Download video (default if not specified)"),',
|
||||
' CmdletArg(name="format", type="string", alias="fmt", description="Explicit yt-dlp format selector (e.g., bestvideo+bestaudio)"),',
|
||||
' CmdletArg(name="clip", type="string", description="Extract time range: MM:SS-MM:SS (e.g., 34:03-35:08) or seconds"),',
|
||||
' CmdletArg(name="section", type="string", description="Download sections (yt-dlp only): TIME_RANGE[,TIME_RANGE...] (e.g., 1:30-1:35,0:05-0:15)"),',
|
||||
' CmdletArg(name="cookies", type="string", description="Path to cookies.txt file for authentication"),',
|
||||
' CmdletArg(name="torrent", type="flag", description="Download torrent/magnet via AllDebrid (requires API key in config)"),',
|
||||
' CmdletArg(name="wait", type="float", description="Wait time (seconds) for magnet processing timeout"),',
|
||||
' CmdletArg(name="background", type="flag", alias="bg", description="Start download in background and return to prompt immediately"),',
|
||||
' CmdletArg(name="item", type="string", alias="items", description="Item selection for playlists/formats: use -item N to select format N, or -item to show table for @N selection in next command"),',
|
||||
' SharedArgs.STORAGE,',
|
||||
' ],',
|
||||
' detail=["Download media from url with advanced features.", "", "See help for full usage examples."],',
|
||||
' exec=self.run,',
|
||||
' )',
|
||||
' self.register()',
|
||||
'',
|
||||
' def run(self, result: Any, args: Sequence[str], config: Dict[str, Any]) -> int:',
|
||||
' """Main execution method."""',
|
||||
' stage_ctx = pipeline_context.get_stage_context()',
|
||||
' in_pipeline = stage_ctx is not None and getattr(stage_ctx, "total_stages", 1) > 1',
|
||||
' if in_pipeline and isinstance(config, dict):',
|
||||
' config["_quiet_background_output"] = True',
|
||||
' return self._run_impl(result, args, config, emit_results=True)',
|
||||
'',
|
||||
' # ' + '='*70,
|
||||
' # HELPER METHODS',
|
||||
' # ' + '='*70,
|
||||
'',
|
||||
])
|
||||
class_added = True
|
||||
|
||||
# Convert top-level helper functions to static methods
|
||||
if class_added and line and not line[0].isspace() and line.strip().startswith('def _'):
|
||||
output.append(' @staticmethod')
|
||||
output.append(f' {line}')
|
||||
i += 1
|
||||
# Copy function body with indentation
|
||||
while i < len(lines):
|
||||
next_line = lines[i]
|
||||
# Stop at next top-level definition
|
||||
if next_line and not next_line[0].isspace() and (next_line.strip().startswith(('def ', 'class ', 'CMDLET'))):
|
||||
break
|
||||
# Add indentation
|
||||
if next_line.strip():
|
||||
output.append(f' {next_line}')
|
||||
else:
|
||||
output.append(next_line)
|
||||
i += 1
|
||||
continue
|
||||
|
||||
output.append(line)
|
||||
i += 1
|
||||
|
||||
result_text = '\n'.join(output)
|
||||
|
||||
# NOW: Update function calls carefully
|
||||
# Only update calls in _run_impl, not in nested function definitions
|
||||
# Pattern: match _func( but NOT when it's after "def " on the same line
|
||||
helper_funcs = [
|
||||
'_download_torrent_worker', '_guess_libgen_title', '_is_libgen_entry',
|
||||
'_download_libgen_entry', '_libgen_background_worker',
|
||||
'_start_libgen_background_worker', '_run_pipeline_tail',
|
||||
'_download_http_background_worker', '_start_http_background_download',
|
||||
'_parse_torrent_file', '_download_torrent_file', '_is_torrent_file_or_url',
|
||||
'_process_torrent_input', '_show_playlist_table', '_parse_time_range',
|
||||
'_parse_section_ranges', '_parse_playlist_selection_indices',
|
||||
'_select_playlist_entries', '_sanitize_title_for_filename',
|
||||
'_find_playlist_files_from_entries', '_snapshot_playlist_paths',
|
||||
'_is_openlibrary_downloadable', '_as_dict', '_is_youtube_url',
|
||||
]
|
||||
|
||||
# Split into lines for careful replacement
|
||||
result_lines = result_text.split('\n')
|
||||
for idx, line in enumerate(result_lines):
|
||||
# Skip lines that are function definitions
|
||||
if 'def ' in line:
|
||||
continue
|
||||
# Replace helper function calls with self.
|
||||
for func in helper_funcs:
|
||||
# Pattern: _func( with word boundary before
|
||||
pattern = rf'\b({re.escape(func)})\('
|
||||
if re.search(pattern, line):
|
||||
result_lines[idx] = re.sub(pattern, r'self.\1(', line)
|
||||
|
||||
result_text = '\n'.join(result_lines)
|
||||
|
||||
output_file.write_text(result_text, encoding='utf-8')
|
||||
print(f"✓ Written: {output_file}")
|
||||
print(f"✓ Class-based refactor complete")
|
||||
|
||||
if __name__ == '__main__':
|
||||
refactor_download_data()
|
||||
131
scripts/refactor_download_data.py
Normal file
131
scripts/refactor_download_data.py
Normal file
@@ -0,0 +1,131 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Automated refactoring script for download_data.py
|
||||
Converts module-level functions to class-based cmdlet pattern.
|
||||
"""
|
||||
|
||||
import re
|
||||
from pathlib import Path
|
||||
|
||||
def main():
|
||||
backup_file = Path('cmdlets/download_data_backup.py')
|
||||
output_file = Path('cmdlets/download_data.py')
|
||||
|
||||
print(f"Reading: {backup_file}")
|
||||
content = backup_file.read_text(encoding='utf-8')
|
||||
lines = content.split('\n')
|
||||
|
||||
output = []
|
||||
i = 0
|
||||
in_cmdlet_def = False
|
||||
skip_old_run_wrapper = False
|
||||
class_section_added = False
|
||||
|
||||
# Track where to insert class definition
|
||||
last_import_line = 0
|
||||
|
||||
while i < len(lines):
|
||||
line = lines[i]
|
||||
|
||||
# Track imports
|
||||
if line.strip().startswith(('import ', 'from ')):
|
||||
last_import_line = len(output)
|
||||
|
||||
# Skip old _run wrapper function
|
||||
if 'def _run(result: Any' in line:
|
||||
skip_old_run_wrapper = True
|
||||
i += 1
|
||||
continue
|
||||
|
||||
if skip_old_run_wrapper:
|
||||
if line and not line[0].isspace():
|
||||
skip_old_run_wrapper = False
|
||||
else:
|
||||
i += 1
|
||||
continue
|
||||
|
||||
# Skip old CMDLET definition
|
||||
if line.strip().startswith('CMDLET = Cmdlet('):
|
||||
in_cmdlet_def = True
|
||||
i += 1
|
||||
continue
|
||||
|
||||
if in_cmdlet_def:
|
||||
if line.strip() == ')':
|
||||
in_cmdlet_def = False
|
||||
# Add class instantiation instead
|
||||
output.append('')
|
||||
output.append('# Create and register the cmdlet')
|
||||
output.append('CMDLET = Download_Data()')
|
||||
output.append('')
|
||||
i += 1
|
||||
continue
|
||||
|
||||
# Insert class definition before first helper function
|
||||
if not class_section_added and line.strip().startswith('def _download_torrent_worker('):
|
||||
output.append('')
|
||||
output.append('')
|
||||
output.append('class Download_Data(Cmdlet):')
|
||||
output.append(' """Class-based download-data cmdlet with self-registration."""')
|
||||
output.append('')
|
||||
output.append(' # Full __init__ implementation to be added')
|
||||
output.append(' # Full run() method to be added')
|
||||
output.append('')
|
||||
output.append(' # ' + '='*70)
|
||||
output.append(' # HELPER METHODS')
|
||||
output.append(' # ' + '='*70)
|
||||
output.append('')
|
||||
class_section_added = True
|
||||
|
||||
# Convert top-level helper functions to static methods
|
||||
if class_section_added and line.strip().startswith('def _') and not line.strip().startswith('def __'):
|
||||
# Check if this is a top-level function (no indentation)
|
||||
if not line.startswith((' ', '\t')):
|
||||
output.append(' @staticmethod')
|
||||
output.append(f' {line}')
|
||||
i += 1
|
||||
# Copy function body with indentation
|
||||
while i < len(lines):
|
||||
next_line = lines[i]
|
||||
# Stop at next top-level definition
|
||||
if next_line and not next_line[0].isspace() and (next_line.strip().startswith('def ') or next_line.strip().startswith('class ') or next_line.strip().startswith('CMDLET')):
|
||||
break
|
||||
# Add indentation
|
||||
if next_line.strip():
|
||||
output.append(f' {next_line}')
|
||||
else:
|
||||
output.append(next_line)
|
||||
i += 1
|
||||
continue
|
||||
|
||||
# Convert _run_impl to method (but keep as-is for now, will be updated later)
|
||||
if class_section_added and line.strip().startswith('def _run_impl('):
|
||||
output.append(' def _run_impl(self, result: Any, args: Sequence[str], config: Dict[str, Any], emit_results: bool = True) -> int:')
|
||||
i += 1
|
||||
# Copy function body with indentation
|
||||
while i < len(lines):
|
||||
next_line = lines[i]
|
||||
if next_line and not next_line[0].isspace() and next_line.strip():
|
||||
break
|
||||
if next_line.strip():
|
||||
output.append(f' {next_line}')
|
||||
else:
|
||||
output.append(next_line)
|
||||
i += 1
|
||||
continue
|
||||
|
||||
output.append(line)
|
||||
i += 1
|
||||
|
||||
# Write output
|
||||
result_text = '\n'.join(output)
|
||||
output_file.write_text(result_text, encoding='utf-8')
|
||||
print(f"✓ Written: {output_file}")
|
||||
print(f"✓ Converted {content.count('def _')} helper functions to static methods")
|
||||
print("\nNext steps:")
|
||||
print("1. Add full __init__ method with cmdlet args")
|
||||
print("2. Add run() method that calls _run_impl")
|
||||
print("3. Update function calls in _run_impl from _func() to self._func()")
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
Binary file not shown.
BIN
test/medios-macina.db
Normal file
BIN
test/medios-macina.db
Normal file
Binary file not shown.
BIN
test/yapping.m4a
Normal file
BIN
test/yapping.m4a
Normal file
Binary file not shown.
1
test/yapping.m4a.metadata
Normal file
1
test/yapping.m4a.metadata
Normal file
@@ -0,0 +1 @@
|
||||
hash:00beb438e3c02cdc0340526deb0c51f916ffd6330259be4f350009869c5448d9
|
||||
1
test/yapping.m4a.tag
Normal file
1
test/yapping.m4a.tag
Normal file
@@ -0,0 +1 @@
|
||||
title:yapping
|
||||
Reference in New Issue
Block a user