This commit is contained in:
2026-05-16 15:26:08 -07:00
parent 5048729b0c
commit 02d84f423e
10 changed files with 488 additions and 470 deletions
+66 -58
@@ -1,37 +1,48 @@
# Plugin authoring: ResultTable & plugin adapters
# Plugin authoring: ResultTable and plugin adapters
This short guide explains how to write plugins that integrate with the *strict* ResultTable API: adapters must yield `ResultModel` instances and plugins register via `SYS.result_table_adapters.register_plugin` with a column specification and a `selection_fn`.
This short guide explains how to write plugins that integrate with the strict
ResultTable API: adapters yield `ResultModel` instances, and plugins register
via `SYS.result_table_adapters.register_plugin` with columns and a
`selection_fn`.
Note: this file keeps its historical `provider_authoring` name, but the public
terminology is plugin-first. Some internal classes and metadata fields still use
`Provider` naming.
---
## Quick summary
- Plugins register a *plugin adapter* (callable that yields `ResultModel`).
- Plugins must also provide `columns` (static list or factory) and a `selection_fn` that returns CLI args for a selected row.
- For simple HTML table/list scraping, prefer `TableProviderMixin` from `SYS.provider_helpers` to fetch and extract rows using `SYS.html_table.extract_records`.
- Plugins register a plugin adapter, a `columns` definition, and a `selection_fn`.
- `selection_fn` returns CLI args for a selected row.
- For HTML table or list scraping, prefer `TableProviderMixin` from `SYS.provider_helpers`.
## Runtime dependency policy
- Treat required runtime dependencies (e.g., **Playwright**) as mandatory: import them unconditionally and let missing dependencies fail fast at import time. Avoid adding per-call try/except import guards for required modules—these silently hide configuration errors and add bloat.
- Use guarded imports only for truly optional dependencies (e.g., `pandas` for enhanced table parsing) and provide meaningful fallbacks or helpful error messages in those cases.
- Keep provider code minimal and explicit: fail early and document required runtime dependencies in README/installation notes.
- Treat required runtime dependencies such as Playwright as mandatory: import them unconditionally and let missing dependencies fail fast.
- Use guarded imports only for truly optional dependencies such as `pandas`.
- Keep plugin code minimal and explicit: fail early and document required runtime dependencies in README and installation notes.
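
The policy above might look like the following at the top of a plugin module. This is a minimal sketch, not code from the repository: `summarize_sizes` is a hypothetical helper, and the required-dependency import is shown only as a comment so the snippet runs without Playwright installed.

```python
from typing import Dict, List

# Required dependency: import unconditionally at module level so a missing
# install fails at import time, for example:
#     from playwright.sync_api import sync_playwright

# Optional dependency: guard the import once, at module level.
try:
    import pandas as pd
except ImportError:
    pd = None  # callers check this and use the pure-Python fallback


def summarize_sizes(sizes: List[int]) -> Dict[str, int]:
    """Hypothetical helper: use pandas when present, else plain Python."""
    if pd is not None:
        series = pd.Series(sizes, dtype="int64")
        return {"count": int(series.count()), "total": int(series.sum())}
    return {"count": len(sizes), "total": sum(sizes)}
```

Guarding the optional import once, rather than inside each call, keeps the fallback decision in one place and leaves required imports free to fail loudly.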
---
## Minimal provider template (copy/paste)
## Minimal plugin template
```py
# plugins/my_plugin.py
from typing import Any, Dict, Iterable, List
from SYS.result_table_api import ResultModel, ColumnSpec, title_column, metadata_column
from SYS.result_table_api import ResultModel, ColumnSpec, title_column
from SYS.result_table_adapters import register_plugin
# Example adapter: convert provider-specific items into ResultModel instances
SAMPLE_ITEMS = [
{"name": "Example File.pdf", "path": "https://example.com/x.pdf", "ext": "pdf", "size": 1024, "source": "myprovider"},
{
"name": "Example File.pdf",
"path": "https://example.com/x.pdf",
"ext": "pdf",
"size": 1024,
"source": "myplugin",
},
]
def adapter(items: Iterable[Dict[str, Any]]) -> Iterable[ResultModel]:
for it in items:
title = it.get("name") or it.get("title") or str(it.get("path") or "")
@@ -41,39 +52,38 @@ def adapter(items: Iterable[Dict[str, Any]]) -> Iterable[ResultModel]:
ext=str(it.get("ext")) if it.get("ext") else None,
size_bytes=int(it.get("size")) if it.get("size") is not None else None,
metadata=dict(it),
source=str(it.get("source")) if it.get("source") else "myprovider",
source=str(it.get("source")) if it.get("source") else "myplugin",
)
# Optional: build columns dynamically from sample rows
def columns_factory(rows: List[ResultModel]) -> List[ColumnSpec]:
cols = [title_column()]
# add extra columns if metadata keys exist
if any((r.metadata or {}).get("size") for r in rows):
cols.append(ColumnSpec("size", "Size", lambda r: r.size_bytes or ""))
if any((row.metadata or {}).get("size") for row in rows):
cols.append(ColumnSpec("size", "Size", lambda row: row.size_bytes or ""))
return cols
# Selection args for `@N` expansion or `select` cmdlet
def selection_fn(row: ResultModel) -> List[str]:
# prefer -path when available
if row.path:
return ["-path", row.path]
return ["-title", row.title or ""]
# Register plugin (done at import time)
register_plugin("myprovider", adapter, columns=columns_factory, selection_fn=selection_fn)
register_plugin("myplugin", adapter, columns=columns_factory, selection_fn=selection_fn)
```
---
## Table scraping: using TableProviderMixin (HTML tables / list-results)
## Table scraping with `TableProviderMixin`
If your provider scrapes HTML tables or list-like results (common on web search pages), use `TableProviderMixin`:
If your plugin scrapes HTML tables or list-like results, use `TableProviderMixin`:
```py
from ProviderCore.base import Provider
from SYS.provider_helpers import TableProviderMixin
class MyTableProvider(TableProviderMixin, Provider):
class MyTablePlugin(TableProviderMixin, Provider):
URL = ("https://example.org/search",)
def validate(self) -> bool:
@@ -84,58 +94,56 @@ class MyTableProvider(TableProviderMixin, Provider):
return self.search_table_from_url(url, limit=limit)
```
`TableProviderMixin.search_table_from_url` returns `ProviderCore.base.SearchResult` entries. If you want to integrate this plugin with the strict `ResultTable` registry, add a small adapter that converts `SearchResult` -> `ResultModel` and register it using `register_plugin` (see `plugins/vimm/__init__.py` for a real example).
`TableProviderMixin.search_table_from_url` returns
`ProviderCore.base.SearchResult` entries. If you want to integrate the plugin
with the strict `ResultTable` registry, add a small adapter that converts
`SearchResult` to `ResultModel` and register it using `register_plugin`.
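
As a rough illustration of such an adapter, the sketch below uses stand-in dataclasses so it runs on its own. The `SearchResult` fields shown here (`title`, `path`, `metadata`) are assumptions, and the `ResultModel` stand-in only mirrors the fields used in the template above. In a real plugin you would import both classes from the project and pass the adapter to `register_plugin`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Iterator, Optional


@dataclass
class SearchResult:
    """Stand-in for ProviderCore.base.SearchResult (field names are assumed)."""
    title: str
    path: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ResultModel:
    """Stand-in mirroring the ResultModel fields used in the template."""
    title: str
    path: Optional[str] = None
    ext: Optional[str] = None
    size_bytes: Optional[int] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    source: Optional[str] = None


def search_result_adapter(
    results: Iterable[SearchResult], source: str = "myplugin"
) -> Iterator[ResultModel]:
    """Convert SearchResult entries into ResultModel rows, one per result."""
    for res in results:
        meta = dict(res.metadata or {})
        size = meta.get("size")
        yield ResultModel(
            title=res.title or res.path,
            path=res.path or None,
            ext=str(meta["ext"]) if meta.get("ext") else None,
            size_bytes=int(size) if size is not None else None,
            metadata=meta,
            source=source,
        )
```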
---
## Columns & selection
- `columns` may be a static `List[ColumnSpec]` or a factory `def cols(rows: List[ResultModel]) -> List[ColumnSpec]` that inspects sample rows.
- `selection_fn` must accept a `ResultModel` and return a `List[str]` representing CLI args (e.g., `['-path', row.path]`). These args are used by `select` and `@N` expansion.
**Tip:** for plugins that produce downloadable file rows prefer returning explicit URL args (e.g., `['-url', row.path]`) so the selected URL is clearly identified by downstream downloaders and to avoid ambiguous parsing when plugin hints (like `-plugin`) are present.
- Ensure your `ResultModel.source` is set (either in the model or rely on the provider name set by `serialize_row`).
## Columns and selection
- `columns` may be a static `List[ColumnSpec]` or a factory that inspects sample rows.
- `selection_fn` must accept a `ResultModel` and return a `List[str]` representing CLI args.
- For downloadable file rows, prefer explicit URL args such as `['-url', row.path]` so downstream downloaders interpret the row unambiguously.
- Ensure `ResultModel.source` is set directly or falls back to the registered plugin name during serialization.
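
The bullets above can be sketched with a static column list and a `selection_fn` that prefers explicit URL args. `Row` and `Column` are stand-ins for `ResultModel` and `ColumnSpec` so the snippet is self-contained, and the fallback order (`-url`, then `-path`, then `-title`) is an assumption for illustration rather than a documented rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Row:
    """Stand-in for ResultModel with only the fields used here."""
    title: str
    path: Optional[str] = None
    size_bytes: Optional[int] = None


@dataclass
class Column:
    """Stand-in for ColumnSpec(name, header, extractor)."""
    name: str
    header: str
    extractor: Callable[[Row], object]


# Static columns: a plain list instead of a factory that inspects rows.
STATIC_COLUMNS: List[Column] = [
    Column("title", "Title", lambda row: row.title),
    Column("size", "Size", lambda row: row.size_bytes or ""),
]


def url_selection_fn(row: Row) -> List[str]:
    """Return CLI args for a selected row, preferring explicit URL args."""
    if row.path and row.path.startswith(("http://", "https://")):
        return ["-url", row.path]
    if row.path:
        return ["-path", row.path]
    return ["-title", row.title or ""]
```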
---
## Optional: pandas path for `<table>` extraction
`SYS.html_table.extract_records` prefers a pure-lxml path but will use `pandas.read_html` if pandas is installed and the helper detects it works for the input table. This is optional and **not required** to author a provider — document in your provider whether it requires `pandas` and add an informative error/log message when it is missing.
## Optional pandas support
`SYS.html_table.extract_records` prefers a pure-lxml path but can fall back to
`pandas.read_html` when pandas is installed and the helper detects it works for
the input table. This is optional. Document whether your plugin requires
`pandas` and emit a clear error or log message when it is missing.
---
## Testing & examples
## Testing and examples
- Write `tests/test_plugin_<name>.py` or follow the repo's older naming conventions when extending existing tests.
- Verify `plugin.build_table(...)` produces a `ResultTable` with rows and columns.
- Verify `serialize_rows()` yields dicts with `_selection_args`, `source`, and, when applicable, `_selection_action`.
- When you need an exact CLI stage sequence, call `table.set_row_selection_action(row_index, tokens)` so replay uses the row action verbatim.
- For table-oriented plugins, test `search_table_from_url` with a local HTML fixture or a mocked `HTTPClient`.
- Write `tests/test_provider_<name>.py` that imports your provider and verifies `provider.build_table(...)` produces a `ResultTable` (has `.rows` and `.columns`) and that `serialize_rows()` yields dicts with `_selection_args`, `_selection_action` when applicable, and `source`.
- When you need to guarantee a specific CLI stage sequence (e.g., `download-file -url <path> -plugin <name>`), call `table.set_row_selection_action(row_index, tokens)` so the serialized payload emits `_selection_action` and the CLI can run the row exactly as intended.
- For table providers you can test `search_table_from_url` using a local HTML fixture or by mocking `HTTPClient` to return a small sample page.
- If you rely on pandas, add a test that monkeypatches `sys.modules['pandas']` to a simple shim to validate the pandas path.
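
One way to exercise a pandas-dependent path without installing pandas is to place a shim module in `sys.modules` for the duration of a test. The sketch below is an assumption-laden illustration: `read_first_table` is a hypothetical function standing in for your plugin's pandas path, and the shim only implements `read_html`.

```python
import sys
import types
from unittest import mock


def read_first_table(html: str):
    """Hypothetical pandas path: import lazily and return the first table."""
    import pandas  # resolved through sys.modules, so a shim is picked up
    return pandas.read_html(html)[0]


def make_pandas_shim(tables):
    """Build a minimal fake pandas module whose read_html returns `tables`."""
    shim = types.ModuleType("pandas")
    shim.read_html = lambda html: tables
    return shim


def run_with_shim(html: str, tables):
    """Call read_first_table with the shim installed, then restore sys.modules."""
    with mock.patch.dict(sys.modules, {"pandas": make_pandas_shim(tables)}):
        return read_first_table(html)


def run_without_pandas(html: str) -> bool:
    """Simulate a missing pandas install; return True if ImportError is raised."""
    with mock.patch.dict(sys.modules, {"pandas": None}):
        try:
            read_first_table(html)
        except ImportError:
            return True
    return False
```

In a pytest suite the same effect is available through `monkeypatch.setitem(sys.modules, "pandas", shim)`, which is undone automatically when the test ends.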
**Example test skeleton**
Example test skeleton:
```py
from SYS.result_table_adapters import get_provider
from SYS.result_table_adapters import get_plugin
from plugins import example_provider
def test_example_provider_registration():
def test_example_plugin_registration():
plugin = get_plugin("example")
rows = list(provider.adapter(example_provider.SAMPLE_ITEMS))
rows = list(plugin.adapter(example_provider.SAMPLE_ITEMS))
assert rows and rows[0].title
cols = provider.get_columns(rows)
assert any(c.name == "title" for c in cols)
table = provider.build_table(example_provider.SAMPLE_ITEMS)
cols = plugin.get_columns(rows)
assert any(col.name == "title" for col in cols)
table = plugin.build_table(example_provider.SAMPLE_ITEMS)
assert table.provider == "example" and table.rows
```
---
## References & examples
## References and examples
- Read `plugins/example_provider.py` for a compact example of a strict adapter and dynamic columns.
- Read `plugins/vimm/__init__.py` for a table-provider that uses `TableProviderMixin` and converts `SearchResult` -> `ResultModel` for registration.
- See `docs/provider_guide.md` for a broader provider development checklist.
---
If you want, I can also add a small `plugins/myprovider_template.py` file and unit tests for it.
- Read `plugins/vimm/__init__.py` for a table-oriented plugin that uses `TableProviderMixin` and converts `SearchResult` to `ResultModel` for registration.
- See `docs/provider_guide.md` for a broader plugin development checklist.