14.76. DD 76: Paivana - Fighting AI Bots with GNU Taler

14.76.1. Summary

This design document describes the architecture of an AI Web firewall using GNU Taler, as well as new features that are required for the implementation.

14.76.2. Motivation

AI bots are causing enormous amounts of traffic by scraping sites like git forges. They neither respect robots.txt nor 5xx HTTP responses. Solutions like Anubis and IP-based blocking do not work anymore at this point.

14.76.3. Requirements

  • Must withstand high traffic from bots: requests made before a payment has happened must be very cheap, both in terms of response generation and database interaction.

14.76.4. Proposed Solution

Architecture

  • paivana-httpd is a reverse proxy that sits between ingress HTTP(S) traffic and the protected upstream service.

  • paivana-httpd is configured with a particular merchant backend.

  • A payment template must be set up in the merchant backend (called {template_id} from here on).

Steps:

  • Browser visits git.taler.net

  • paivana-httpd checks for a Paivana cookie

    • If cookie is set and valid, the request is reverse-proxied to upstream. Stop.

    • Otherwise, a paywall page is rendered, continue.

  • The browser (rendering the paywall page) generates a random paivana ID via JS.

  • Based on this paivana ID, a taler://pay-template/{paivana_backend}/.well-known/paivana/{template_id}?paivana_id={paivana_id} URI is generated and rendered as a QR code and link.

  • The browser long-polls on a {paivana_backend}/.well-known/paivana/paivanas/{paivana_id} endpoint that returns when an order with the given paivana ID has been paid for (regardless of the order ID, which is not known to the browser).

  • A wallet now needs to instantiate the pay template and pay for the resulting order. It does so by talking to the Paivana backend, which proxies the requests to the merchant backend and in the process learns the order ID and the payment status change. In the future, paivana-httpd may also implement the required subset of the merchant backend itself.

  • When the long-poller returns and the payment has succeeded, the HTTP response sets the Paivana cookie. The browser reloads the page.
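The ID and URI generation from the steps above can be sketched as follows. This is only an illustration in Python (the design has the browser do this in JS); the function names, the 32-byte ID length, and the URL-safe base64 encoding are assumptions that the design does not specify.

```python
import secrets
from urllib.parse import quote, urlencode


def new_paivana_id(nbytes: int = 32) -> str:
    """Return a random, URL-safe paivana ID.

    The length and encoding are assumptions; the design only
    requires that the ID is random.
    """
    return secrets.token_urlsafe(nbytes)


def pay_template_uri(paivana_backend: str, template_id: str, paivana_id: str) -> str:
    """Build the taler://pay-template URI that is shown as QR code and link."""
    query = urlencode({"paivana_id": paivana_id})
    return (
        f"taler://pay-template/{paivana_backend}"
        f"/.well-known/paivana/{quote(template_id, safe='')}?{query}"
    )


if __name__ == "__main__":
    # "git.taler.net" and "paywall" are placeholder values.
    print(pay_template_uri("git.taler.net", "paywall", new_paivana_id()))
```

Using a cryptographically strong random source matters here because the paivana ID is the only thing linking the waiting browser to the paid order.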

The Paivana cookie is computed as exp_timestamp || '-' || H(client_ip || paivana_server_secret || exp_timestamp), where H is a hash function and || denotes concatenation.
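A minimal sketch of issuing and checking such a cookie is shown below. The design does not name the hash function or the encodings, so SHA-256, a hex digest, and a decimal Unix timestamp are assumptions, as are the function names. Checking the cookie needs only the server secret and the client IP, so no database lookup is involved.

```python
import hashlib
import hmac
import time
from typing import Optional


def make_cookie(client_ip: str, server_secret: bytes, exp_timestamp: int) -> str:
    """Compute exp_timestamp || '-' || H(client_ip || secret || exp_timestamp).

    H is assumed to be SHA-256 with a hex-encoded digest.
    """
    material = client_ip.encode() + server_secret + str(exp_timestamp).encode()
    return f"{exp_timestamp}-{hashlib.sha256(material).hexdigest()}"


def cookie_is_valid(cookie: str, client_ip: str, server_secret: bytes,
                    now: Optional[int] = None) -> bool:
    """Return True if the cookie is well-formed, unexpired and matches the IP."""
    now = int(time.time()) if now is None else now
    exp_str, sep, _digest = cookie.partition("-")
    if not sep or not (exp_str.isascii() and exp_str.isdigit()):
        return False
    exp_timestamp = int(exp_str)
    if exp_timestamp <= now:
        return False  # expired
    expected = make_cookie(client_ip, server_secret, exp_timestamp)
    # Constant-time comparison to avoid leaking digest prefixes.
    return hmac.compare_digest(cookie.encode(), expected.encode())
```

A real implementation might prefer a keyed construction such as HMAC over plain concatenation; the sketch follows the formula as written.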

Problems:

  • A smart attacker might still create a lot of orders via the pay-template.

    • Solution A: Don’t care, unlikely to happen in the first place.

    • Solution B: Rate-limit template instantiation on a per-IP basis.
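Solution B could, for example, be realized with a per-IP token bucket. The sketch below is one possible approach, not part of the design: the class name, the in-memory dictionary, and the rate and burst parameters are all assumptions.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class PerIpRateLimiter:
    """In-memory token bucket keyed by client IP (hypothetical helper)."""

    rate: float   # tokens refilled per second
    burst: float  # maximum number of tokens a bucket can hold
    _buckets: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        """Return True if this IP may instantiate the template right now."""
        now = time.monotonic() if now is None else now
        # Unknown IPs start with a full bucket.
        tokens, last = self._buckets.get(client_ip, (self.burst, now))
        # Refill according to the time elapsed, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self._buckets[client_ip] = (tokens, now)
            return False
        self._buckets[client_ip] = (tokens - 1.0, now)
        return True
```

With rate=1.0 and burst=2.0, an IP can create two orders immediately and then one more per second. A production version would also need to evict idle entries so the table does not grow without bound.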

Implementation:

  • Paivana needs to support extended template instantiation with a paivana_id.

  • The Paivana component needs to be specified and implemented.

  • Wallet-core needs support for a paivana_id in pay templates.

14.76.5. Test Plan

  • Deploy it for git.taler.net

14.76.6. Definition of Done

N/A

14.76.7. Alternatives

14.76.8. Drawbacks

  • Requires JavaScript

    • Could be made to work without JS by returning some Paivana: ... header.

14.76.9. Discussion / Q&A