# Provider authoring: ResultTable & provider adapters ✅

This short guide explains how to write providers that integrate with the *strict* ResultTable API: adapters must yield `ResultModel` instances, and providers register via `SYS.result_table_adapters.register_provider` with a column specification and a `selection_fn`.

---

## Quick summary

- Providers register a *provider adapter* (a callable that yields `ResultModel`).
- Providers must also provide `columns` (static list or factory) and a `selection_fn` that returns CLI args for a selected row.
- For simple HTML table/list scraping, prefer `TableProviderMixin` from `SYS.provider_helpers` to fetch and extract rows using `SYS.html_table.extract_records`.

## Runtime dependency policy

- Treat required runtime dependencies (e.g., **Playwright**) as mandatory: import them unconditionally and let missing dependencies fail fast at import time. Avoid adding per-call try/except import guards for required modules; these silently hide configuration errors and add bloat.
- Use guarded imports only for truly optional dependencies (e.g., `pandas` for enhanced table parsing) and provide meaningful fallbacks or helpful error messages in those cases.
- Keep provider code minimal and explicit: fail early and document required runtime dependencies in README/installation notes.

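The dependency policy above can be sketched roughly as follows. This is an illustration only: `optional_import` and `table_backend` are hypothetical helper names, not part of `SYS`, and the required-dependency import is shown as a comment so the snippet runs without Playwright installed. The guard runs once at import time, not on every call.

```python
import importlib
from types import ModuleType
from typing import Optional

# Required dependency: import it plainly at module top so a missing package
# raises ImportError as soon as the provider module is loaded, e.g.:
#   import playwright


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the named module, or None if it cannot be imported.

    Hypothetical helper; use it only for genuinely optional dependencies.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Optional dependency: resolved once, at import time.
pandas = optional_import("pandas")


def table_backend() -> str:
    """Report which parsing path this sketch would take."""
    if pandas is not None:
        return "pandas"
    # Meaningful fallback instead of a late, confusing failure.
    return "fallback"
```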
---

## Minimal provider template (copy/paste)

```py
# Provider/my_provider.py
from typing import Any, Dict, Iterable, List

from SYS.result_table_api import ResultModel, ColumnSpec, title_column, metadata_column
from SYS.result_table_adapters import register_provider

# Example adapter: convert provider-specific items into ResultModel instances
SAMPLE_ITEMS = [
    {
        "name": "Example File.pdf",
        "path": "https://example.com/x.pdf",
        "ext": "pdf",
        "size": 1024,
        "source": "myprovider",
    },
]


def adapter(items: Iterable[Dict[str, Any]]) -> Iterable[ResultModel]:
    for it in items:
        title = it.get("name") or it.get("title") or str(it.get("path") or "")
        yield ResultModel(
            title=str(title),
            path=str(it.get("path")) if it.get("path") else None,
            ext=str(it.get("ext")) if it.get("ext") else None,
            size_bytes=int(it.get("size")) if it.get("size") is not None else None,
            metadata=dict(it),
            source=str(it.get("source")) if it.get("source") else "myprovider",
        )


# Optional: build columns dynamically from sample rows
def columns_factory(rows: List[ResultModel]) -> List[ColumnSpec]:
    cols = [title_column()]
    # add extra columns if metadata keys exist
    if any((r.metadata or {}).get("size") for r in rows):
        cols.append(ColumnSpec("size", "Size", lambda r: r.size_bytes or ""))
    return cols


# Selection args for `@N` expansion or `select` cmdlet
def selection_fn(row: ResultModel) -> List[str]:
    # prefer -path when available
    if row.path:
        return ["-path", row.path]
    return ["-title", row.title or ""]


# Register provider (done at import time)
register_provider("myprovider", adapter, columns=columns_factory, selection_fn=selection_fn)
```

---

## Table scraping: using TableProviderMixin (HTML tables / list-results)

If your provider scrapes HTML tables or list-like results (common on web search pages), use `TableProviderMixin`:

```py
from urllib.parse import quote_plus  # needed to URL-encode the query below

from ProviderCore.base import Provider
from SYS.provider_helpers import TableProviderMixin


class MyTableProvider(TableProviderMixin, Provider):
    URL = ("https://example.org/search",)

    def validate(self) -> bool:
        return True

    def search(self, query: str, limit: int = 50, **kwargs):
        url = f"{self.URL[0]}?q={quote_plus(query)}"
        return self.search_table_from_url(url, limit=limit)
```

`TableProviderMixin.search_table_from_url` returns `ProviderCore.base.SearchResult` entries. If you want to integrate this provider with the strict `ResultTable` registry, add a small adapter that converts `SearchResult` -> `ResultModel` and register it using `register_provider` (see `Provider/vimm.py` for a real example).

---

## Columns & selection

- `columns` may be a static `List[ColumnSpec]` or a factory `def cols(rows: List[ResultModel]) -> List[ColumnSpec]` that inspects sample rows.
- `selection_fn` must accept a `ResultModel` and return a `List[str]` representing CLI args (e.g., `['-path', row.path]`). These args are used by `select` and `@N` expansion.

  **Tip:** for providers that produce downloadable file rows, prefer returning explicit URL args (e.g., `['-url', row.path]`) so that downstream downloaders can clearly identify the selected URL and parsing stays unambiguous when provider hints (like `-provider`) are present.
- Ensure your `ResultModel.source` is set: either set it in the model or rely on the provider name set by `serialize_row`.

---

## Optional: pandas path for table extraction

`SYS.html_table.extract_records` prefers a pure-lxml path but will use `pandas.read_html` if pandas is installed and the helper detects it works for the input table. This is optional and **not required** to author a provider. Document in your provider whether it requires `pandas`, and add an informative error/log message when it is missing.

---

## Testing & examples

- Write `tests/test_provider_.py` that imports your provider and verifies `provider.build_table(...)` produces a `ResultTable` (has `.rows` and `.columns`) and that `serialize_rows()` yields dicts with `_selection_args`, `_selection_action` when applicable, and `source`.
- When you need to guarantee a specific CLI stage sequence (e.g., `download-file -url -provider `), call `table.set_row_selection_action(row_index, tokens)` so the serialized payload emits `_selection_action` and the CLI can run the row exactly as intended.
- For table providers you can test `search_table_from_url` using a local HTML fixture or by mocking `HTTPClient` to return a small sample page.
- If you rely on pandas, add a test that monkeypatches `sys.modules['pandas']` to a simple shim to validate the pandas path.

**Example test skeleton**

```py
from SYS.result_table_adapters import get_provider
from Provider import example_provider


def test_example_provider_registration():
    provider = get_provider("example")
    rows = list(provider.adapter(example_provider.SAMPLE_ITEMS))
    assert rows and rows[0].title
    cols = provider.get_columns(rows)
    assert any(c.name == "title" for c in cols)
    table = provider.build_table(example_provider.SAMPLE_ITEMS)
    assert table.provider == "example" and table.rows
```

---

## References & examples

- Read `Provider/example_provider.py` for a compact example of a strict adapter and dynamic columns.
- Read `Provider/vimm.py` for a table-provider that uses `TableProviderMixin` and converts `SearchResult` → `ResultModel` for registration.
- See `docs/provider_guide.md` for a broader provider development checklist.
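
For the pandas test suggested under "Testing & examples", one possible shape is sketched below. It is only a sketch under stated assumptions: `pick_backend` is a hypothetical stand-in for whatever detection `SYS.html_table.extract_records` performs internally, and `patched_module` and `make_pandas_shim` are illustrative helpers, not part of the codebase. The technique shown is the `sys.modules` substitution itself.

```python
import contextlib
import importlib
import sys
import types
from typing import Iterator, Optional


def pick_backend() -> str:
    """Hypothetical stand-in: choose pandas only if it imports and has read_html."""
    try:
        pd = importlib.import_module("pandas")
    except ImportError:
        return "lxml"
    return "pandas" if hasattr(pd, "read_html") else "lxml"


def make_pandas_shim() -> types.ModuleType:
    """Build a minimal fake pandas exposing only what the check looks for."""
    shim = types.ModuleType("pandas")
    shim.read_html = lambda html: []
    return shim


@contextlib.contextmanager
def patched_module(name: str, module: Optional[types.ModuleType]) -> Iterator[None]:
    """Temporarily place `module` in sys.modules; None makes imports of `name` fail."""
    missing = object()
    saved = sys.modules.get(name, missing)
    sys.modules[name] = module
    try:
        yield
    finally:
        # Restore the previous state so other tests are unaffected.
        if saved is missing:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = saved


def test_pandas_paths() -> None:
    with patched_module("pandas", make_pandas_shim()):
        assert pick_backend() == "pandas"
    with patched_module("pandas", None):
        assert pick_backend() == "lxml"
```

Under pytest, the built-in `monkeypatch.setitem(sys.modules, "pandas", shim)` fixture call achieves the same substitution and restores the original entry automatically at the end of the test.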