hj

2025-12-30 04:47:13 -08:00
parent 756b024b76
commit 925a1631bc
15 changed files with 1083 additions and 625 deletions
--- a/docs/BOOTSTRAP.md
+++ b/docs/BOOTSTRAP.md
@@ -1,134 +0,0 @@
-# Bootstrapping the development environment
-
-This project includes convenience scripts to create a Python virtual environment, install the package, and (optionally) create OS shortcuts.
-
-Files:
-- `scripts/bootstrap.ps1`: PowerShell script for Windows (creates venv, installs, optional Desktop/Start Menu shortcuts)
-- `scripts/bootstrap.sh`: POSIX shell script (Linux/macOS) (creates venv, installs, optional desktop launcher)
-
-Quick examples
-
-Windows (PowerShell):
-
-```powershell
-# Create a .venv, install in editable mode and add a Desktop shortcut
-powershell -ExecutionPolicy Bypass -File .\scripts\bootstrap.ps1 -Editable -CreateDesktopShortcut
-
-# Use a specific python.exe and force overwrite
-powershell -ExecutionPolicy Bypass -File .\scripts\bootstrap.ps1 -Python "C:\Python39\python.exe" -Force
-```
-
-Linux/macOS (bash):
-
-```bash
-# Create a .venv and install the project in editable mode
-./scripts/bootstrap.sh --editable
-
-# Create a desktop entry (GNU/Linux)
-./scripts/bootstrap.sh --editable --desktop
-```
-
-Notes
-
-- On Windows you may need to run PowerShell with an appropriate ExecutionPolicy (the examples above pass `-ExecutionPolicy Bypass`).
-- The scripts default to a venv directory named `.venv` in the repository root. Use `-VenvPath` (PowerShell) or `--venv` (bash) to choose a different directory.
-- The scripts will also install Playwright browser binaries by default (Chromium only) after installing Python dependencies. Use `--no-playwright` (bash) or `-NoPlaywright` (PowerShell) to opt out, or `--playwright-browsers <list>` / `-PlaywrightBrowsers <list>` to request specific engines (comma-separated, or use `all` to install all engines).
-- The scripts are intended to make day-to-day developer setup easy; tweak flags for your desired install mode (editable vs normal) and shortcut preferences.
-
-## Deno — installed by bootstrap
-
-The bootstrap scripts will automatically install Deno if it is not already present on the system. They use the official installers and attempt to add Deno's bin directory to the PATH for the current session. If the installer completes but `deno` is not available in your shell, restart your shell or add `$HOME/.deno/bin` (Windows: `%USERPROFILE%\.deno\bin`) to your PATH.
-
-Opinionated behavior
-
-Running `python ./scripts/bootstrap.py` is intentionally opinionated: it will create a local virtual environment at `./.venv` (repo root), install Python dependencies and the project into that venv, install Playwright browsers, install Deno, and write small launcher scripts in the project root:
-
-- `mm` (POSIX shell)
-- `mm.ps1` (PowerShell)
-- `mm.bat` (Windows CMD)
-
-These launchers prefer the local `./.venv` Python and console scripts so you can run the project with `./mm` or `mm.ps1` directly from the repo root.
-
-- When installing in editable mode from a development checkout, the bootstrap will also add a small `.pth` file to the venv's `site-packages` pointing at the repository root. This ensures top-level scripts such as `CLI.py` are importable even when using PEP 660 editable wheels (avoids having to create an egg-link by hand).
-
-Additionally, the setup helpers install a global `mm` launcher into your user bin so you can run `mm` from any shell session:
-
-- POSIX: `~/.local/bin/mm` (created if missing; the script attempts to add `~/.local/bin` to `PATH` by updating `~/.profile` / shell RCs if required)
-- Windows: `%USERPROFILE%\bin\mm.cmd` and `%USERPROFILE%\bin\mm.ps1` (created if missing; the script attempts to add the folder to your **User** PATH)
-
-The scripts back up any existing `mm` shims before replacing them and will print actionable messages when a shell restart is required.
-
-Debugging the global `mm` launcher
-
-- POSIX: set `MM_DEBUG=1` and run `mm` to print runtime diagnostics (resolved REPO, VENV, and Python import checks):
-
-  ```bash
-  MM_DEBUG=1 mm
-  ```
-
-- PowerShell: set `$env:MM_DEBUG = '1'`, then run `mm.ps1` or the installed `mm` shim:
-
-  ```powershell
-  $env:MM_DEBUG = '1'
-  mm
-  ```
-
-- CMD: `set MM_DEBUG=1` then run `mm`.
-
-These diagnostics help identify whether the global launcher is selecting the correct repository and virtual environment; please include the output when reporting launcher failures.
-
-To install Deno manually with the official installer, PowerShell (Windows):
-```powershell
-irm https://deno.land/install.ps1 | iex
-```
-
-Linux/macOS:
-```bash
-curl -fsSL https://deno.land/install.sh | sh
-```
-
-Pinning a Deno version
-
-You can pin a Deno release by setting the `DENO_VERSION` environment variable before running the bootstrap script. Examples:
-
-PowerShell (Windows):
-```powershell
-$env:DENO_VERSION = 'v1.34.3'; .\scripts\bootstrap.ps1
-```
-
-POSIX (Linux/macOS):
-```bash
-DENO_VERSION=v1.34.3 ./scripts/bootstrap.sh
-```
-
-If you'd like, I can also:
-- Add a short README section in `readme.md` referencing this doc, or
-- Add a small icon and polish Linux desktop entries with an icon path.
-
-## Troubleshooting: urllib3 / urllib3-future conflicts ⚠️
-
-On some environments a third-party package (for example `urllib3-future`) may
-install a site-packages hook that interferes with the real `urllib3` package.
-When this happens you might see errors like:
-
-  Error importing cmdlet 'get_tag': No module named 'urllib3.exceptions'
-
-The bootstrap scripts now run a verification step after installing dependencies
-and will stop if a broken `urllib3` is detected to avoid leaving you with a
-partially broken venv.
-
-Recommended fix (activate the venv first or use the venv python explicitly):
-
-PowerShell / Windows (from repo root):
-
-  .venv\Scripts\python.exe -m pip uninstall urllib3-future -y
-  .venv\Scripts\python.exe -m pip install --upgrade --force-reinstall urllib3
-  .venv\Scripts\python.exe -m pip install niquests -U
-
-POSIX (Linux/macOS):
-
-  .venv/bin/python -m pip uninstall urllib3-future -y
-  .venv/bin/python -m pip install --upgrade --force-reinstall urllib3
-  .venv/bin/python -m pip install niquests -U
-
-If problems persist, re-run the bootstrap script after applying the fixes.
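
The verification step described in the troubleshooting section can be approximated with a small import check. This is a minimal sketch, not the bootstrap scripts' actual code: the helper name `verify_package` is hypothetical, and the two symptoms it probes for (a missing `urllib3.exceptions` submodule and a missing `urllib3.__version__`) are the ones named in the docs.

```python
import importlib


def verify_package(name, submodules=(), attributes=()):
    """Return a list of problems found for a package; an empty list means healthy.

    Hypothetical helper: the real bootstrap check may be implemented differently.
    """
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return [f"cannot import {name}: {exc}"]

    problems = []
    for sub in submodules:
        try:
            importlib.import_module(f"{name}.{sub}")
        except ImportError as exc:
            problems.append(f"cannot import {name}.{sub}: {exc}")
    for attr in attributes:
        if not hasattr(module, attr):
            problems.append(f"{name} has no attribute {attr!r}")
    return problems


# Probe the two symptoms mentioned in the docs.
found = verify_package("urllib3", submodules=("exceptions",), attributes=("__version__",))
if found:
    print("urllib3 looks broken:")
    for line in found:
        print("  " + line)
else:
    print("urllib3 OK")
```

Run it with the venv interpreter (for example `.venv/bin/python`) so that it inspects the same `site-packages` the project uses.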
--- a/docs/GET_URL_ARCHITECTURE.md
+++ b/docs/GET_URL_ARCHITECTURE.md
@@ -1,234 +0,0 @@
-# get-url Architecture & Flow
-
-## Overview
-
-The enhanced `get-url` command supports two modes:
-
-```
-get-url
-├── SEARCH MODE (new)
-│   └── -url "pattern"
-│       ├── Normalize pattern (strip protocol, www)
-│       ├── Search all stores
-│       ├── Match URLs with wildcards
-│       └── Return grouped results
-│
-└── ORIGINAL MODE (unchanged)
-    ├── Hash lookup
-    ├── Store lookup
-    └── Return URLs for file
-```
-
-## Flow Diagram: URL Search
-
-```
-User Input
-    │
-    v
-get-url -url "youtube.com*"
-    │
-    v
-_normalize_url_for_search()
-    │ Strips: https://, http://, www.
-    │ Result: "youtube.com*" (unchanged, already normalized)
-    v
-_search_urls_across_stores()
-    │
-    ├─→ Store 1 (Hydrus)
-    │   ├─→ search("*", limit=1000)
-    │   ├─→ get_url(file_hash) for each file
-    │   └─→ _match_url_pattern() for each URL
-    │
-    ├─→ Store 2 (Folder)
-    │   ├─→ search("*", limit=1000)
-    │   ├─→ get_url(file_hash) for each file
-    │   └─→ _match_url_pattern() for each URL
-    │
-    └─→ ...more stores...
-    
-    Matching URLs:
-    ├─→ https://www.youtube.com/watch?v=123
-    ├─→ http://youtube.com/shorts/abc
-    └─→ https://youtube.com/playlist?list=xyz
-    
-    Normalized for matching:
-    ├─→ youtube.com/watch?v=123  ✓ Matches "youtube.com*"
-    ├─→ youtube.com/shorts/abc   ✓ Matches "youtube.com*"
-    └─→ youtube.com/playlist?...  ✓ Matches "youtube.com*"
-    
-    v
-Collect UrlItem results
-    │
-    ├─→ UrlItem(url="https://www.youtube.com/watch?v=123", 
-    │           hash="abcd1234...", store="hydrus")
-    │
-    ├─→ UrlItem(url="http://youtube.com/shorts/abc",
-    │           hash="efgh5678...", store="folder")
-    │
-    └─→ ...more items...
-    
-    v
-Group by store
-    │
-    ├─→ Hydrus
-    │   ├─→ https://www.youtube.com/watch?v=123
-    │   └─→ ...
-    │
-    └─→ Folder
-        ├─→ http://youtube.com/shorts/abc
-        └─→ ...
-    
-    v
-Emit UrlItem objects for piping
-    │
-    v
-Return exit code 0 (success)
-```
-
-## Code Structure
-
-```
-Get_Url (class)
-    │
-    ├── __init__()
-    │   └── Register command with CLI
-    │
-    ├── _normalize_url_for_search() [static]
-    │   └── Strip protocol & www, lowercase
-    │
-    ├── _match_url_pattern() [static]
-    │   └── fnmatch with normalization
-    │
-    ├── _search_urls_across_stores() [instance]
-    │   ├── Iterate stores
-    │   ├── Search files in store
-    │   ├── Get URLs for each file
-    │   ├── Apply pattern matching
-    │   └── Return (items, stores_found)
-    │
-    └── run() [main execution]
-        ├── Check for -url flag
-        │   ├── YES: Search mode
-        │   │   └── _search_urls_across_stores()
-        │   └── NO: Original mode
-        │       └── Hash+store lookup
-        │
-        └── Return exit code
-```
-
-## Data Flow Examples
-
-### Example 1: Search by Domain
-```
-Input:  get-url -url "www.google.com"
-        
-Normalize: "google.com" (www. stripped)
-
-Search Results:
-  Store "hydrus":
-    - https://www.google.com ✓
-    - https://google.com/search?q=hello ✓
-    - https://google.com/maps ✓
-  
-  Store "folder":
-    - http://google.com ✓
-    - https://google.com/images ✓
-
-Output: 5 matching URLs grouped by store
-```
-
-### Example 2: Wildcard Pattern
-```
-Input:  get-url -url "youtube.com/watch*"
-
-Pattern: "youtube.com/watch*"
-
-Search Results:
-  Store "hydrus":
-    - https://www.youtube.com/watch?v=123 ✓
-    - https://youtube.com/watch?list=abc ✓
-    - https://www.youtube.com/shorts/xyz ✗ (doesn't match /watch*)
-  
-  Store "folder":
-    - http://youtube.com/watch?v=456 ✓
-
-Output: 3 matching URLs (watch only, not shorts)
-```
-
-### Example 3: Subdomain Wildcard
-```
-Input:  get-url -url "*.example.com*"
-
-Normalize: "*.example.com*" (already normalized)
-
-Search Results:
-  Store "hydrus":
-    - https://cdn.example.com/video.mp4 ✓
-    - https://api.example.com/endpoint ✓
-    - https://www.example.com ✓
-    - https://other.org ✗
-
-Output: 3 matching URLs
-```
-
-## Integration with Piping
-
-```
-# Search → Filter → Add Tag
-get-url -url "youtube.com*" | add-tag -tag "video-source"
-
-# Search → Count
-get-url -url "reddit.com*" | wc -l
-
-# Search → Export
-get-url -url "github.com*" > github_urls.txt
-```
-
-## Error Handling Flow
-
-```
-get-url -url "pattern"
-    │
-    ├─→ No stores configured?
-    │   └─→ Log "Error: No stores configured"
-    │   └─→ Return exit code 1
-    │
-    ├─→ Store search fails?
-    │   └─→ Log error, skip store, continue
-    │
-    ├─→ No matches found?
-    │   └─→ Log "No urls matching pattern"
-    │   └─→ Return exit code 1
-    │
-    └─→ Matches found?
-        └─→ Return exit code 0
-```
-
-## Performance Considerations
-
-1. **Store Iteration**: Loops through all configured stores
-2. **File Scanning**: Each store searches up to 1000 files
-3. **URL Matching**: Each URL tested against pattern (fnmatch - O(n) per URL)
-4. **Memory**: Stores all matching items in memory before display
-
-Optimization opportunities:
-- Cache store results
-- Limit search scope with --store flag
-- Early exit with --limit N
-- Pagination support
-
-## Backward Compatibility
-
-Original mode (unchanged):
-```
-@1 | get-url
-    │
-    └─→ No -url flag
-        └─→ Use original logic
-            ├─→ Get hash from result
-            ├─→ Get store from result or args
-            ├─→ Call backend.get_url(hash)
-            └─→ Return URLs for that file
-```
-
-All original functionality preserved. New -url flag is additive only.
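
The normalization and matching described in this file can be sketched as follows. This is an illustration under stated assumptions, not the code of `Get_Url`: the function names mirror the documented helpers without the leading underscore, and the rule that a pattern without wildcards is treated as a prefix match is an assumption made here to reproduce Example 1; the actual implementation may handle that case differently.

```python
import fnmatch
import re

# Matches a leading scheme such as "https://", "http://" or "ftp://".
_SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.\-]*://", re.IGNORECASE)


def normalize_url_for_search(url: str) -> str:
    """Lowercase the text, then strip its protocol and a leading "www."."""
    text = url.strip().lower()
    text = _SCHEME_RE.sub("", text)
    if text.startswith("www."):
        text = text[4:]
    return text


def match_url_pattern(url: str, pattern: str) -> bool:
    """Test a stored URL against a search pattern after normalizing both."""
    normalized_url = normalize_url_for_search(url)
    normalized_pattern = normalize_url_for_search(pattern)
    # ASSUMPTION: a pattern with no "*" or "?" is treated as a prefix match,
    # so "google.com" also matches "google.com/maps" as in Example 1.
    if "*" not in normalized_pattern and "?" not in normalized_pattern:
        normalized_pattern += "*"
    # fnmatchcase avoids OS-dependent case folding; both sides are lowercase already.
    return fnmatch.fnmatchcase(normalized_url, normalized_pattern)


print(normalize_url_for_search("https://www.youtube.com/watch?v=xx_88TDWmEs"))
print(match_url_pattern("https://www.youtube.com/shorts/xyz", "youtube.com/watch*"))
```

Under this sketch, `*.example.com*` matches `https://cdn.example.com/video.mp4` but not `https://www.example.com`, because the `www.` prefix is stripped before matching and no dot is left in front of `example.com`. A pattern such as `*example.com*` covers both.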
--- a/docs/GET_URL_QUICK_REF.md
+++ b/docs/GET_URL_QUICK_REF.md
@@ -1,76 +0,0 @@
-# Quick Reference: get-url URL Search
-
-## Basic Syntax
-```bash
-# Search mode (new)
-get-url -url "pattern"
-
-# Original mode (unchanged)
-@1 | get-url
-```
-
-## Examples
-
-### Exact domain match
-```bash
-get-url -url "google.com"
-```
-Matches: `https://www.google.com`, `http://google.com/search`, `https://google.com/maps`
-
-### YouTube URL search
-```bash
-get-url -url "https://www.youtube.com/watch?v=xx_88TDWmEs"
-```
-Normalizes to: `youtube.com/watch?v=xx_88tdwmes`
-Matches: Any video with same ID across different protocols
-
-### Wildcard domain
-```bash
-get-url -url "youtube.com*"
-```
-Matches: All YouTube URLs (videos, shorts, playlists, etc.)
-
-### Subdomain wildcard
-```bash
-get-url -url "*.example.com*"
-```
-Matches: `cdn.example.com`, `api.example.com`, `www.example.com`
-
-### Specific path pattern
-```bash
-get-url -url "youtube.com/watch*"
-```
-Matches: Only YouTube watch URLs (not shorts or playlists)
-
-### Single character wildcard
-```bash
-get-url -url "example.com/file?.mp4"
-```
-Matches: `example.com/file1.mp4`, `example.com/fileA.mp4` (not `file12.mp4`)
-
-## How It Works
-
-1. **Normalization**: Strips the protocol (`https://`, `http://`) and the `www.` prefix from the pattern and from every stored URL, and lowercases both
-2. **Pattern Matching**: Uses `*` and `?` wildcards (case-insensitive)
-3. **Search**: Scans all configured stores for matching URLs
-4. **Results**: Groups matches by store, shows URL and hash
-
-## Return Values
-- Exit code **0** if matches found
-- Exit code **1** if no matches or error
-
-## Piping Results
-```bash
-get-url -url "youtube.com*" | grep -i video
-get-url -url "example.com*" | add-tag -tag "external-source"
-```
-
-## Common Patterns
-
-| Pattern | Matches | Notes |
-|---------|---------|-------|
-| `google.com` | Google URLs | Exact domain (after normalization) |
-| `youtube.com*` | All YouTube | Wildcard at end |
-| `*.example.com*` | Subdomains | Wildcard at start and end |
-| `github.com/user*` | User repos | Path pattern |
-| `reddit.com/r/*` | Subreddit | Path with wildcard |
--- a/docs/GET_URL_SEARCH.md
+++ b/docs/GET_URL_SEARCH.md
@@ -1,91 +0,0 @@
-# get-url Enhanced URL Search
-
-The `get-url` command now supports searching for URLs across all stores with automatic protocol and `www` prefix stripping.
-
-## Features
-
-### 1. **Protocol Stripping**
-URLs are normalized by removing:
-- Protocol prefixes: `https://`, `http://`, `ftp://`, etc.
-- `www.` prefix (case-insensitive)
-
-### 2. **Wildcard Matching**
-Patterns support standard wildcards:
-- `*` - matches any sequence of characters
-- `?` - matches any single character
-
-### 3. **Case-Insensitive Matching**
-All matching is case-insensitive for domains and paths.
-
-## Usage Examples
-
-### Search by full domain
-```bash
-get-url -url "www.google.com"
-# Matches:
-# - https://www.google.com
-# - http://google.com/search
-# - https://google.com/maps
-```
-
-### Search with YouTube example
-```bash
-get-url -url "https://www.youtube.com/watch?v=xx_88TDWmEs"
-# Becomes: youtube.com/watch?v=xx_88tdwmes
-# Matches:
-# - https://www.youtube.com/watch?v=xx_88TDWmEs
-# - http://youtube.com/watch?v=xx_88TDWmEs
-```
-
-### Domain wildcard matching
-```bash
-get-url -url "youtube.com*"
-# Matches any URL starting with youtube.com:
-# - https://www.youtube.com/watch?v=123
-# - https://youtube.com/shorts/abc
-# - http://youtube.com/playlist?list=xyz
-```
-
-### Subdomain matching
-```bash
-get-url -url "*example.com*"
-# Matches:
-# - https://cdn.example.com/file.mp4
-# - https://www.example.com
-# - https://api.example.com/endpoint
-```
-
-### Specific path matching
-```bash
-get-url -url "youtube.com/watch*"
-# Matches:
-# - https://www.youtube.com/watch?v=123
-# - http://youtube.com/watch?list=abc
-# Does NOT match:
-# - https://youtube.com/shorts/abc
-```
-
-## Get URLs for Specific File
-
-The original functionality is still supported:
-```bash
-@1 | get-url
-# Requires hash and store from piped result
-```
-
-## Output
-
-Results are organized by store and show:
-- **Store**: Backend name (hydrus, folder, etc.)
-- **Url**: The full matched URL
-- **Hash**: First 16 characters of the file hash (for compactness)
-
-## Implementation Details
-
-The search:
-1. Iterates through all configured stores
-2. Searches for all files in each store (limit 1000 per store)
-3. Retrieves URLs for each file
-4. Applies pattern matching with normalization
-5. Returns results grouped by store
-6. Emits `UrlItem` objects for piping to other commands
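
The six steps above, together with the documented exit codes, can be sketched as a loop over store backends. This is a rough illustration, not the project's code. The `UrlItem` fields and the `search("*", limit=1000)` / `get_url(file_hash)` calls come from the docs. The assumption that `search()` yields file hashes, the `matches` predicate, and the `FakeStore` class are hypothetical stand-ins.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class UrlItem:
    url: str
    hash: str
    store: str


class FakeStore:
    """Hypothetical in-memory backend used only for this demonstration."""

    def __init__(self, urls_by_hash: Dict[str, List[str]]):
        self._urls = urls_by_hash

    def search(self, query: str, limit: int) -> List[str]:
        # ASSUMPTION: search() returns file hashes; the real return type may differ.
        return list(self._urls)[:limit]

    def get_url(self, file_hash: str) -> List[str]:
        return self._urls[file_hash]


def search_urls_across_stores(stores, matches: Callable[[str], bool], limit: int = 1000) -> List[UrlItem]:
    """Steps 1-4: iterate stores, list files, fetch their URLs, keep the matches."""
    items: List[UrlItem] = []
    for name, backend in stores.items():
        try:
            for file_hash in backend.search("*", limit=limit):
                for url in backend.get_url(file_hash):
                    if matches(url):
                        items.append(UrlItem(url=url, hash=file_hash, store=name))
        except Exception as exc:
            # A failing store is logged and skipped; the search continues.
            print(f"Error searching store {name!r}: {exc}")
    return items


def group_by_store(items: List[UrlItem]) -> Dict[str, List[UrlItem]]:
    """Step 5: group results by the store they came from."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item.store].append(item)
    return dict(grouped)


def run(stores, matches: Callable[[str], bool]) -> int:
    """Return 0 when matches are found, 1 otherwise."""
    if not stores:
        print("Error: No stores configured")
        return 1
    items = search_urls_across_stores(stores, matches)
    if not items:
        print("No urls matching pattern")
        return 1
    for store_name, store_items in group_by_store(items).items():
        print(store_name)
        for item in store_items:
            # Show only the first 16 characters of the hash, as in the Output section.
            print(f"  {item.url}  {item.hash[:16]}")
    return 0


demo_stores = {
    "hydrus": FakeStore({"abcd1234" * 8: ["https://www.youtube.com/watch?v=123"]}),
    "folder": FakeStore({"efgh5678" * 8: ["http://youtube.com/shorts/abc", "https://other.org"]}),
}
exit_code = run(demo_stores, lambda url: "youtube.com" in url)
```

Passing the matcher in as a predicate keeps the store loop independent of how patterns are normalized. Any function that takes a URL and returns a boolean can be plugged in.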
--- a/docs/KNOWN_ISSUES.md
+++ b/docs/KNOWN_ISSUES.md
@@ -1,8 +0,0 @@
-Known issues and brief remediation steps
-
-- urllib3 / urllib3-future conflict
-  - Symptom: `No module named 'urllib3.exceptions'` or missing `urllib3.__version__`.
-  - Root cause: a `.pth` file or packaging hook from `urllib3-future` may mutate the
-    `urllib3` namespace in incompatible ways.
-  - Remediation: uninstall `urllib3-future`, reinstall `urllib3`, and re-install
-    `niquests` if required. See `docs/ISSUES/urllib3-future.md` for more details.
--- a/docs/cookies.md
+++ b/docs/cookies.md
@@ -1,13 +0,0 @@
-# Obtain cookies.txt for youtube.com
-1. You need a Google account; a throwaway account is fine.
-2. You need the browser extension Get cookies.txt LOCALLY:
-
-Chrome-based browsers: [cookies.txt LOCALLY](https://chromewebstore.google.com/detail/get-cookiestxt-locally/cclelndahbckbenkjhflpdbgdldlbecc)
-
-Firefox-based browsers: [cookies.txt LOCALLY](https://addons.mozilla.org/en-US/firefox/addon/get-cookies-txt-locally/)
-
-3. Open an incognito tab and sign in to YouTube with your account.
-4. Open the extension and click "export all cookies".
-5. Place the resulting cookies.txt file in the project folder.
-
-Restart the medios-macina app and verify that the cookies status is FOUND.
--- a/docs/hydrusnetwork.md
+++ b/docs/hydrusnetwork.md
@@ -1,41 +0,0 @@
-1. open a shell prompt in a good spot for hydrusnetwork, e.g. C:\hydrusnetwork
-2. send command "git clone https://github.com/hydrusnetwork/hydrus"
-3. send command "cd hydrus"
-4. send command "python -m venv .venv"
-
---------------------------------------------------
-5. Windows: send command ".\.venv\Scripts\Activate.ps1"
-
-5. Linux: send command "source .venv/bin/activate"
-
--------------------------------------------------
-your command line should have (.venv) in front of it now
-
-6. send command "pip install -r requirements.txt"
-7. send command "python hydrus_client.py"
---------------------------------------------------
-the gui application should be opened now
-8. in the top menu click on services > manage services > double-click "client api"
-9. check the boxes
-X run the client api?
-X allow non-local connections
-X supports CORS headers
-click apply
-
-10. click on services > review services > click on "client api"
-11. click "Add" > manually > change "new api permissions" to "medios"
-12. click apply > click "copy api access key", click "open client api base url"
-
--------------------------------------------
-edit the block below and place it in your config.conf
-
-<figure>
-  <figcaption>config.conf</figcaption>
-  <pre><code class="language-powershell">[store=hydrusnetwork]
-NAME="shortnamenospacesorsymbols"
-API="apiaccesskeygoeshere"
-URL="apibaseurlgoeshere"
-</code></pre>
-</figure>
--- a/docs/img/hydrus/edit-service.png
+++ b/docs/img/hydrus/edit-service.png