Every now and then, I wonder: can AI do [insert simple task]? And I decide to let it have a whirl.

In this case, I wanted to make a list of IANS Faculty who are giving talks at this year's RSAC Conference.

The IANS faculty page is public: https://www.iansresearch.com/our-faculty/faculty

The RSAC Agenda page is public: https://path.rsaconference.com/flow/rsac/us26/FullAgenda/page/catalog

Seems simple, right?

  • Scrape site A
  • Scrape site B
  • Make a list of all people in list A that have a talk on list B.

It understood what it needed to do, but failed miserably.
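
For what it's worth, the third step is the easy part once both lists actually exist. Here is a minimal sketch of it, assuming the two scrapes have already produced plain lists of name strings. The function names `normalize` and `faculty_with_talks` are made up for illustration, and the matching is exact after normalization, so middle initials, nicknames, or "Dr." prefixes would still slip through.

```python
import unicodedata


def normalize(name: str) -> str:
    """Lowercase a name, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks so accented letters compare equal to plain ones.
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything that is not a letter, digit, or space with a space.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(cleaned.lower().split())


def faculty_with_talks(faculty: list[str], speakers: list[str]) -> list[str]:
    """Return the faculty names (as originally spelled) that also appear as speakers."""
    speaker_keys = {normalize(s) for s in speakers}
    return sorted(name for name in faculty if normalize(name) in speaker_keys)
```

Because the result only ever contains names drawn from the faculty list, a speaker who is not faculty cannot end up in it, which is exactly the check the AI tools got wrong.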

  • It cannot handle websites made for humans. For performance, each of these sites is paginated in a different way to avoid loading all the data at once. It gets the first few pages and gives up after that.
  • Even with a small subset of data, it hallucinated and pulled in Hugh Thompson (he's not faculty).

Conclusion: I would be done with this task by now if I had ignored the existence of AI. Instead, I tried using AI, got tired and frustrated after trying half a dozen prompt approaches and different products (Claude, ChatGPT, and Gemini), and no longer have the time or energy to do it myself.

    AI continues to be frustrating for these tasks that are too small to write a script for, but too large to want to do manually (open dev tools, copy JSON, manually extract/parse data on the CLI, etc.). Unfortunately, SO MANY tasks still require pulling information from a website built for humans, with no API.
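
The manual route (copying JSON out of dev tools page by page) can often be turned into a loop that keeps requesting pages until one comes back empty. The following is only a sketch under assumptions: `fetch_page` is a hypothetical callable you would have to write per site, returning the list of records for a given page number. Neither site's real endpoint nor its response shape is known here, so that part is deliberately left out.

```python
from typing import Callable, Iterator


def iter_all_records(
    fetch_page: Callable[[int], list],
    max_pages: int = 1000,
) -> Iterator[dict]:
    """Yield records from page 1 onward until a page comes back empty.

    fetch_page is a hypothetical, site-specific function (an assumption).
    max_pages is a safety cap so a misbehaving site cannot loop forever.
    """
    for page in range(1, max_pages + 1):
        records = fetch_page(page)
        if not records:
            return  # An empty page is treated as the end of the data.
        yield from records
```

A site that paginates by cursor or offset rather than page number would need a different loop, which is part of why each site ends up needing its own handling.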

    @sawaba I've had decent success with exactly these kinds of things, but only with the local agents, not with the online ones. I put it into a container and run it with --dangerously-skip-permissions

    I tried it and it seemed to work; it took 10 minutes. I spot-checked two and those were correct. No guarantees. This project is definitely "good luck with that" licensed.