💻 PowerShell can scrape anything, even if it’s not meant to. At #PSConfEU 2025, @james-oneill.bsky.social shows how to grab data from APIs, HTML & apps with browser tools and native cmdlets. 🎟️ 2026 → Wiesbaden: psconf.eu #PowerShell #ScreenScraping #DevTools #Automation

- YouTube
Home - PSConfEU

Discover PowerShell scripting & automation at psconf.eu. Join experts, learn, & boost productivity. Elevate your skills today!

PSConfEU

#Roku just threw me an "accept this" about automatic content detection.

Maybe I'm wrong, but it smelled like they wanna scrape the screen.

It let me disagree two clicks in, but now I trust them even less than before, and that was already zero.

#tv
#privacy
#ScreenScraping

What is #screenscraping? Find out what it is, how it works, the use cases of screen #scraping, its advantages and disadvantages, the legal implications, and best practices for your #business. https://shorturl.at/7dxsM
💻 PowerShell can scrape anything — even if it’s not meant to. At #PSConfEU 2025, @james-oneill.bsky.social shows how to grab data from APIs, HTML & apps with browser tools and native cmdlets. 🎟️ 2026 → Wiesbaden: psconf.eu #PowerShell #ScreenScraping #DevTools #Automation

- YouTube
James O'Neill (@james-oneill.bsky.social)

Exasperated curmudgeon Ex Scuba instructor, Ex Formula team employee, Ex Microsoftie. Still doing Photography with Pentax and coding with PowerShell. Occasional drone pilot, but more prone to droning on.

Bluesky Social
I played with my #python script to tweak the prompt and finally got to ...

prompt = f"Today's date is {today_date}. I want to know today's custard flavor of the day, the description of today's custard flavor of the day, today's date, the numeric day of the month, the current month (spelled out), the name of the custard shop, the URL of the custard shop's website, the physical street address of where the custard shop is located, the city the shop is physically located in, and the hours the custard shop is open today. The 'flavor of the day' could be listed as 'TODAY’S FLAVOR OF THE DAY', 'Gilles Flavor of the Day', 'Flavor of the Day', 'Today’s Flavor', or the '{today_date} flavor of the day'. Do not include any other information beyond what I am requesting. If you cannot find the value of an attribute, set it to the value 'NA' in the object that you return. The description of the flavor of the day must not include the phrase 'They are currently hiring and offer competitive wages' and the description must not include the phrase 'Gilles Frozen Custard is a Milwaukee institution serving up delicious frozen custard in a variety of flavors'. The flavor of the day must not be for a future day. The result must look like this without any modifications ... 'date': '[Day], [Month] [Date]', 'flavor_of_the_day': '[Flavor]', 'description': '[Flavor Description]', 'hours': '[Hours they are open, today]', 'url': '[The address for their website]', 'name': '[The name of the custard shop]', 'location': '[Street Address Of Where They Are Located]', 'city': '[The city the shop is located in]', 'day': '[The numeric day of the month]', 'month': '[The current month, spelled out]'"

... But it still isn't playing nicely. I could reduce the number of attributes returned or try to get a 70b model working on it.
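
Since the model does not always stick to the requested shape, one option is to validate its reply before storing anything. Below is a minimal sketch of that idea, not the script from the post: `parse_reply` and `EXPECTED_KEYS` are hypothetical names, and it assumes the reply is the bare `'key': 'value', ...` text the prompt asks for (with or without surrounding braces). Anything that fails to parse, or any missing attribute, falls back to 'NA'.

```python
import ast

# Attribute names taken from the prompt's requested result shape.
EXPECTED_KEYS = (
    "date", "flavor_of_the_day", "description", "hours", "url",
    "name", "location", "city", "day", "month",
)


def parse_reply(reply):
    """Turn the model's reply into a dict with exactly the expected keys.

    Hypothetical helper: assumes the reply is a Python-style dict literal,
    possibly without the surrounding braces.
    """
    text = reply.strip()
    if not text.startswith("{"):
        text = "{" + text + "}"
    try:
        # literal_eval only accepts literals, so it never runs model-supplied code.
        parsed = ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError):
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    # Keep only the expected attributes; default the missing ones to 'NA'.
    return {key: str(parsed.get(key, "NA")) for key in EXPECTED_KEYS}


if __name__ == "__main__":
    sample = "'date': 'Friday, June 6', 'flavor_of_the_day': 'Butter Pecan'"
    print(parse_reply(sample))
```

This keeps the downstream code simple, because every record has the same ten keys regardless of how well the model behaved on a given day.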

#Firebase #Firestore #AI #ScreenScraping
I have roughly 50 lines of code that convert ...

custard_shops = {
    "Gilles": "https://gillesfrozencustard.com",
    "Murfs": "https://www.murfsfrozencustard.com",
    "Georgie Porgies": "https://georgieporgies.com",
    "Pops": "https://popscustard.com",
    "Golden Gyros": "https://goldengyro.com",
    "Leons": "https://leonsfrozencustardmke.com",
    "Kraverz": "https://www.kraverzcustard.com",
    "Hefners": "https://www.hefnerscustard.com",
}

... into what are supposed to be JSON objects that look like ...

shop: {'date': '[Day], [Month] [Date]', 'flavor_of_the_day': '[Flavor]', 'description': '[Flavor Description]', 'hours': '[Hours they are open, today]', 'address': '[Street Address Of Where They Are Located]'}

... but this needs a bit of attention. The goal is to run a process once daily in Docker that scrapes custard shop data (hours, the flavor of the day, etc.) so that I can use it to update Firebase Cloud Firestore.
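
For comparison, here is a rough, non-AI sketch of the extraction step using only the standard library. It is an illustration, not the roughly 50 lines mentioned above: `find_flavor`, `TextExtractor`, and `LABELS` are hypothetical names, and it assumes the flavor name is the first piece of text following a "Flavor of the Day" style label, which may not hold for every shop's page. Fetching the pages and writing to Firestore are left out.

```python
from html.parser import HTMLParser

# Label variants mentioned in the prompt, lowercased, with straight apostrophes.
LABELS = ("today's flavor of the day", "flavor of the day", "today's flavor")


class TextExtractor(HTMLParser):
    """Collect the non-empty text chunks of a page in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def find_flavor(html_text):
    """Return the text chunk right after a flavor label, or 'NA' if none is found.

    Assumption: the flavor name immediately follows its label in the markup.
    """
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    # Normalize curly apostrophes so "Today’s" matches "today's".
    chunks = [chunk.replace("\u2019", "'") for chunk in parser.chunks]
    for index, chunk in enumerate(chunks[:-1]):
        lowered = chunk.lower().rstrip(":")
        if any(lowered.endswith(label) for label in LABELS):
            return chunks[index + 1]
    return "NA"


if __name__ == "__main__":
    sample = "<h2>Today\u2019s Flavor of the Day</h2><p>Butter Pecan</p>"
    print(find_flavor(sample))
```

A daily job could loop over `custard_shops`, download each page, and call a function like this, falling back to the model only for pages where it returns 'NA'.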

#Python #ScreenScraping #AI #Custard

@si_irini
You really ought to read #TimChambers' article again.
Massive data extraction through #screenscraping has been happening for years:

https://mastodon.social/@HistoPol/111575306870943698

The only real defense is complete anonymity and deleting old posts after a week or so (the latter being unacceptable to me.)

Someone has been creating a meta-#searchengine for the #Fediverse. He respects #nobots (and something else.)

Don't recall the name at present.

@OlDude82 @mina @evelynefoerster @MaJ1

@_steve
IDK, I would need to know first if #ScreenScraping by #AI firms (and others) is an issue if I opt in.
If posts' content can be extracted, I'm out.
Still, it cannot be that I cannot even search my own stuff with fulltext search. That is just ridiculous.

@Gargron

Max props to the developer of puppeteer-extra-plugin-stealth, who I just bought a coffee for.

The screen-scraper I wrote to bulk-export data from my Garmin sports tracker (because Garmin's API is "only for corporate partners", which is a magic spell you can say to make me write and open source a screen-scraper that targets your systems) stopped working today. Turns out Cloudflare could detect my automation.

Installed puppeteer-extra-plugin-stealth. Fixed instantly. Awesome.

#ScreenScraping #puppeteer #OpenSource