# get-url Command Enhancement Summary ## What Changed Enhanced the `get-url` command in [cmdlet/get_url.py](cmdlet/get_url.py) to support searching for URLs across all stores with smart pattern matching. ## Key Features Added ### 1. URL Normalization (`_normalize_url_for_search`) - Strips protocol prefixes: `https://`, `http://`, `ftp://`, etc. - Removes `www.` prefix (case-insensitive) - Converts to lowercase for case-insensitive matching **Examples:** - `https://www.youtube.com/watch?v=xx` → `youtube.com/watch?v=xx` - `http://www.google.com` → `google.com` - `FTP://cdn.example.com` → `cdn.example.com` ### 2. Wildcard Pattern Matching (`_match_url_pattern`) - Supports `*` (matches any sequence) and `?` (matches single character) - Case-insensitive matching - Uses Python's `fnmatch` for robust pattern support **Examples:** - `youtube.com*` matches `youtube.com/watch`, `youtube.com/shorts`, etc. - `*.example.com*` matches `cdn.example.com`, `api.example.com`, etc. - `google.com/search*` matches `google.com/search?q=term`, etc. ### 3. Cross-Store URL Search (`_search_urls_across_stores`) - Searches all configured stores (hydrus, folder, etc.) - Finds matching URLs across all files in all stores - Returns results grouped by store - Emits `UrlItem` objects for pipelining ## Command Usage ### Search for URLs matching a pattern ```bash get-url -url "www.google.com" get-url -url "youtube.com*" get-url -url "*.example.com*" ``` ### Original usage (unchanged) ```bash @1 | get-url # Requires hash and store from piped result ``` ## Implementation Details ### New Methods - `_normalize_url_for_search(url)` - Static method to normalize URLs - `_match_url_pattern(url, pattern)` - Static method to match with wildcards - `_search_urls_across_stores(pattern, config)` - Search across all stores ### Modified Method - `run()` - Enhanced to support `-url` flag for searching, fallback to original behavior ### Return Values - **Search mode**: List of `UrlItem` objects grouped by store, exit code 0 if found, 1 if no matches - **Original mode**: URLs for specific file, exit code 0 if found, 1 if not found ## Testing A test script is included: [test_get_url_search.py](test_get_url_search.py) **All tests pass:** - ✓ URL normalization (protocol/www stripping) - ✓ Wildcard pattern matching - ✓ Case-insensitive matching - ✓ Complex patterns with subdomains and paths ## Files Modified - [cmdlet/get_url.py](cmdlet/get_url.py) - Enhanced with URL search functionality - [docs/GET_URL_SEARCH.md](docs/GET_URL_SEARCH.md) - User documentation - [test_get_url_search.py](test_get_url_search.py) - Test suite ## Backward Compatibility ✓ Fully backward compatible - original usage unchanged: - `@1 | get-url` still works as before - `-query` flag still works for hash lookups - `-store` flag still required for direct lookups ## Error Handling - Returns exit code 1 if no matches found (search mode) - Returns exit code 1 if no store configured - Gracefully handles store backend errors - Logs errors to stderr without crashing