OpenAI violated Canadian privacy laws, federal and provincial watchdogs say

Commissioners from four of Canada’s privacy watchdogs have found that OpenAI violated Canadian privacy laws while developing and training its early models of ChatGPT.

Philippe Dufresne, Canada’s privacy commissioner, was joined by his provincial counterparts from British Columbia, Alberta, and Québec to announce the findings of a joint investigation into the tech giant. The investigation examined how OpenAI sourced training data for its early, GPT-3.5 and GPT-4 models, which included scraped content from publicly accessible internet sources like social media and blog posts, licensed third party sources like media outlets and stock image vendors, and user interactions with ChatGPT.

Dufresne noted that all four regulators found OpenAI had violated various federal and provincial privacy laws, including the federal Personal Information Protection and Electronic Documents Act (PIPEDA), and its provincial counterparts in Alberta, BC, and Québec. 

Read more at BetaKit

#Alberta #BritishColumbia #consent #OPC #scraping

I improved my X / Twitter likes scaper.

https://github.com/norandom/x_likes_scraper/tree/main

1. added MCP support
2. X algorithm inspired ranking with a vector store
3. OpenAI embeddings
4. LLM fencing against prompt injection
5. sanitization
6. command line search
7. improved code base health
8. readability now human friendly
9. ruff and mypy precommit hooks
10. easier config
11. moved to uv
12. ...

Robots at work.

#scraping #twitter #likes #mcp #search

GitHub - norandom/x_likes_scraper: Twitter / X likes exporter for Python for headless social network intelligence data sync

Twitter / X likes exporter for Python for headless social network intelligence data sync - norandom/x_likes_scraper

GitHub

Los actuales modelos comerciales de IA generativa han sido desarrollados vulnerando el Reglamento General de Protección de Datos (RGPD) y la Ley de Propiedad Intelectual (LPI). Jamás hubo un pedido de consentimiento por parte de las empresas tecnológicas. Por eso hablamos de ROBO DE DATOS.

#AI #genAI #generativeAI #data #datos #robo #robodedatos #theft #stolen #illegal #technology #bigtech #author #scraping

#Nvidia & co are bit like the #Monsanto of cyberspace. genAI does favour the very wealthy, thus, independent creative companies will be unable to compete, in the so-called "AI" era, namely, the #scraping era:

AI doesn't produce data ex nihilo.

Creative companies will be forced to sell their #data to content brokers for cheap, there's no competition. Thus, creative companies, will no longer be creative : R&D is expensive, and slow, they'll use Nvidia & co products to sprint faster. #fashion

Reescribiendo Nuestro Scraper co…

Procesos técnicos involucrados La arquitectura de un scraper asíncrono se basa en la gestión eficiente de las conexiones.

https://norvik.tech/news/analisis-asyncio-scraper-norvik-tech

#Technology #Asyncio #Scraping #DesarrolloWeb #Python #NorvikTech #DesarrolloSoftware #TechInnovation

Nos está escrapeando AliyunSecBot/Aliyun de Alibaba, ya esta bloqueado, en breve el bloqueo lo levanta el firewall y chau pinela... nos estan escrapeando nuestra instancia de Wikipedia local #bot #scraping #ddos
J'ai open-sourcé mon pipeline de prospection B2B 🧲
Lead Scraper Pro v2.1
→ 6 sources scrappées en parallèle
→ Déduplication + enrichissement email auto
→ 1 CSV propre pour Mailchimp / Lemlist / CRM
Node.js + Playwright · Licence MIT · Gratuit
⭐ github.com/molokoloco/lead-scraper-pro
#B2B #OpenSource #Scraping #LeadGen
⚖️ Landgericht München, Urteil vom 29.09.2023, 11 O 1884-22: Öffentliche Profildaten müssen nicht vor der Erhebung durch Dritte geschützt werden. #Scraping #Soziale #Netzwerke #Schadensersatz #teamdatenschutz #dsgvoportal https://www.dsgvo-portal.de/gerichtsentscheidungen/2023-09-29-LGM-11-O-1884-22-Scraping-Soziale-Netzwerke-Schadensersatz-2709.php
Öffentliche Profildaten müssen nicht vor der Erhebung durch Dritte geschützt werden. | 24.04.2026 | dsgvo-portal.de

Kein DSGVO-Schadensersatz bei Scraping öffentlich zugänglicher Daten ohne nachweisbaren Verstoß, Schaden und Kausalität.

Compliance Essentials GmbH

Onlinewerbung: Mangelhafte Umsetzung von Betroffenenrechten?

Onlinewerbung führt zunehmend zu datenschutzrechtlichen Beschwerden, insbesondere weil automatisierte Verfahren wie das Scraping von personenbezogenen Daten immer häufiger eingesetzt werden. Viele Unternehmen, die solche Methoden nutzen, verfügen jedoch nicht über die notwendigen Prozesse oder das e(...)
https://www.dr-datenschutz.de/onlinewerbung-mangelhafte-umsetzung-von-betroffenenrechten/

#Betroffenenrechte #E-Mail-Werbung #Informationspflicht #Marketing #Scraping

Onlinewerbung: Mangelhafte Umsetzung von Betroffenenrechten?

Online-Werbung rückt stärker in den Datenschutzfokus; insbesondere Scraping und die Gewährleistung der Betroffenenrechten werden anhand aktueller Hinweise des HmbBfDI beleuchtet.

Dr. Datenschutz

RE: https://mastodon.online/@NatureMC/116442840702472493

And it works! Since blocking the worst #scraping bots, all these "visitors" staying only some seconds on a blog article, are gone. 👍 And I can tell you that they were *many*, often more than real humans reading my blog. Illegal training of LLMs & Co. is a bigger problem than many are aware of.

#noAI #scraper #blogging #blogger #bloggingCommunity