df
Some checks failed
smoke-mm / Install & smoke test mm --help (push) Has been cancelled

2025-12-29 17:05:03 -08:00
parent 226de9316a
commit c019c00aed
104 changed files with 19669 additions and 12954 deletions

92
ENHANCEMENT_SUMMARY.md Normal file

@@ -0,0 +1,92 @@
# get-url Command Enhancement Summary
## What Changed
Enhanced the `get-url` command in [cmdlet/get_url.py](cmdlet/get_url.py) to search for URLs across all configured stores using normalized, case-insensitive wildcard matching.
## Key Features Added
### 1. URL Normalization (`_normalize_url_for_search`)
- Strips protocol prefixes: `https://`, `http://`, `ftp://`, etc.
- Removes `www.` prefix (case-insensitive)
- Converts to lowercase for case-insensitive matching
**Examples:**
- `https://www.youtube.com/watch?v=xx` → `youtube.com/watch?v=xx`
- `http://www.google.com` → `google.com`
- `FTP://cdn.example.com` → `cdn.example.com`
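The normalization rules above can be sketched roughly as follows. This is a minimal illustration, not the code in `cmdlet/get_url.py`: the function name `normalize_url_for_search` and both regular expressions are assumptions chosen to reproduce the three examples.

```python
import re

# Hypothetical stand-in for _normalize_url_for_search; the real method may differ.
_SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.\-]*://", re.IGNORECASE)  # https://, ftp://, ...
_WWW_RE = re.compile(r"^www\.", re.IGNORECASE)                     # leading www. only


def normalize_url_for_search(url: str) -> str:
    """Strip the protocol prefix and a leading 'www.', then lowercase."""
    url = url.strip()
    url = _SCHEME_RE.sub("", url)
    url = _WWW_RE.sub("", url)
    return url.lower()
```

Note that lowercasing the whole string also lowercases the path and query, which is what makes matching case-insensitive but means the normalized form should only be used for comparison, not for fetching.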
### 2. Wildcard Pattern Matching (`_match_url_pattern`)
- Supports `*` (matches any sequence) and `?` (matches single character)
- Case-insensitive matching
- Uses Python's `fnmatch` for robust pattern support
**Examples:**
- `youtube.com*` matches `youtube.com/watch`, `youtube.com/shorts`, etc.
- `*.example.com*` matches `cdn.example.com`, `api.example.com`, etc.
- `google.com/search*` matches `google.com/search?q=term`, etc.
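A minimal sketch of the wildcard matching described above, assuming both the URL and the pattern are lowercased before comparison. The name `match_url_pattern` is a hypothetical stand-in for `_match_url_pattern`. It uses `fnmatch.fnmatchcase` on lowercased strings rather than `fnmatch.fnmatch`, because the latter's case handling depends on the operating system.

```python
from fnmatch import fnmatchcase


def match_url_pattern(url: str, pattern: str) -> bool:
    """Case-insensitive wildcard match: '*' is any sequence, '?' is one character."""
    # Lowercasing both sides gives the same result on every platform.
    return fnmatchcase(url.lower(), pattern.lower())
```

`fnmatch` also treats `[seq]` as a character class, so a pattern containing literal square brackets would need escaping.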
### 3. Cross-Store URL Search (`_search_urls_across_stores`)
- Searches all configured stores (hydrus, folder, etc.)
- Finds matching URLs across all files in all stores
- Returns results grouped by store
- Emits `UrlItem` objects for pipelining
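The cross-store search could look roughly like the sketch below. Everything here is illustrative: the stores are modeled as plain dictionaries (`store name -> file hash -> list of URLs`), the `UrlItem` fields are guesses, and the real method takes a `config` argument and talks to actual store backends.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Dict, List


@dataclass
class UrlItem:  # hypothetical shape; the real UrlItem may carry other fields
    url: str
    file_hash: str
    store: str


def _normalize(url: str) -> str:
    # Same idea as the normalization step: drop scheme and leading www., lowercase.
    url = re.sub(r"^[a-z][a-z0-9+.\-]*://", "", url.strip(), flags=re.IGNORECASE)
    url = re.sub(r"^www\.", "", url, flags=re.IGNORECASE)
    return url.lower()


def search_urls_across_stores(
    pattern: str, stores: Dict[str, Dict[str, List[str]]]
) -> Dict[str, List[UrlItem]]:
    """Return matching URLs grouped by store name."""
    needle = _normalize(pattern)
    grouped: Dict[str, List[UrlItem]] = {}
    for store_name, files in stores.items():
        for file_hash, urls in files.items():
            for url in urls:
                if fnmatchcase(_normalize(url), needle):
                    grouped.setdefault(store_name, []).append(
                        UrlItem(url=url, file_hash=file_hash, store=store_name)
                    )
    return grouped
```

Stores with no matches are simply absent from the result, so an empty dictionary corresponds to the "no matches" case.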
## Command Usage
### Search for URLs matching a pattern
```bash
get-url -url "www.google.com"
get-url -url "youtube.com*"
get-url -url "*.example.com*"
```
### Original usage (unchanged)
```bash
@1 | get-url
# Requires hash and store from piped result
```
## Implementation Details
### New Methods
- `_normalize_url_for_search(url)` - Static method to normalize URLs
- `_match_url_pattern(url, pattern)` - Static method to match with wildcards
- `_search_urls_across_stores(pattern, config)` - Search across all stores
### Modified Method
- `run()` - Enhanced to support the `-url` flag for searching; falls back to the original behavior when the flag is absent
### Return Values
- **Search mode**: List of `UrlItem` objects grouped by store, exit code 0 if found, 1 if no matches
- **Original mode**: URLs for specific file, exit code 0 if found, 1 if not found
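As a rough illustration of the two modes and their exit codes, the sketch below shows one way the dispatch might be expressed. The helper names `select_mode` and `exit_code_for` are hypothetical and do not come from `cmdlet/get_url.py`.

```python
from typing import Optional, Sequence


def select_mode(url_pattern: Optional[str], file_hash: Optional[str], store: Optional[str]) -> str:
    """Pick search mode when a URL pattern is given, else the original lookup."""
    if url_pattern:
        return "search"
    if file_hash and store:
        return "lookup"  # original mode: hash and store from the piped result
    return "error"       # nothing to search for and nothing to look up


def exit_code_for(results: Sequence[object]) -> int:
    """Exit code 0 when something was found, 1 otherwise (both modes)."""
    return 0 if results else 1
```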
## Testing
A test script is included: [test_get_url_search.py](test_get_url_search.py)
**All tests pass:**
- ✓ URL normalization (protocol/www stripping)
- ✓ Wildcard pattern matching
- ✓ Case-insensitive matching
- ✓ Complex patterns with subdomains and paths
## Files Modified
- [cmdlet/get_url.py](cmdlet/get_url.py) - Enhanced with URL search functionality
- [docs/GET_URL_SEARCH.md](docs/GET_URL_SEARCH.md) - User documentation
- [test_get_url_search.py](test_get_url_search.py) - Test suite
## Backward Compatibility
✓ Fully backward compatible - original usage unchanged:
- `@1 | get-url` still works as before
- `-query` flag still works for hash lookups
- `-store` flag still required for direct lookups
## Error Handling
- Returns exit code 1 if no matches found (search mode)
- Returns exit code 1 if no store configured
- Gracefully handles store backend errors
- Logs errors to stderr without crashing
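The error-handling behavior listed above can be sketched as follows, assuming each store backend is represented by a callable that returns its URLs. The function name `collect_urls`, the callable-based backend model, and the message wording are all assumptions made for illustration.

```python
import sys
from typing import Callable, Dict, Iterable, List


def collect_urls(backends: Dict[str, Callable[[], Iterable[str]]]) -> int:
    """Query every backend, skip the ones that fail, and return an exit code."""
    if not backends:
        print("get-url: no store configured", file=sys.stderr)
        return 1

    found: List[str] = []
    for name, list_urls in backends.items():
        try:
            found.extend(list_urls())
        except Exception as exc:  # a failing backend must not abort the search
            print(f"get-url: store '{name}' failed: {exc}", file=sys.stderr)
            continue

    return 0 if found else 1
```

A broken backend is reported on stderr and skipped, so results from the remaining stores are still returned.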