#Microsoft has given three months' notice of the retirement of API access to its #Bing #search index [1], following several years of price hikes. This abrupt change will impact alternative search engines and third-party app developers who rely on Microsoft’s search results to power their services, highlighting the risks inherent in relying on US #BigTech companies to build services. (1/2)

[1] https://azure.microsoft.com/en-us/updates?id=492574

(2/2) Fortunately, an alternative is in the works. #CERN is a member of the #OpenWebSearch consortium, which is building a new web index based on #European values of #fairness and #privacy. The Open Web Index project aims to build a public index that offers an alternative to existing indexes held by companies like #Microsoft and #Google.

The system is crawling and indexing about 9 million URLs per hour, aiming to index 30–50% of the text-based web by the end of 2025.

https://home.cern/news/news/computing/european-project-make-web-search-more-open-and-ethical

European project to make web search more open and ethical

On 6 June, the OpenWebSearch.eu consortium released a pilot of a new infrastructure that aims to make European web search fairer, more transparent and commercially unbiased. With strong participation by CERN, the European Open Web Index (OWI) is now open for use by academic, commercial and independent teams under a general research licence, with commercial options in development on a case-by-case basis.

The OpenWebSearch.eu initiative was launched in 2022, with a consortium made up of 14 leading research institutions from across Europe, including CERN. The project aims to build a public web index that offers an alternative to existing indexes held by companies like Google (USA), Microsoft (USA), Baidu (China) and Yandex (Russia). Web indexes provide the back-end data infrastructure behind search engines, and today the companies that manage them determine what content is searchable and how it is ranked.

Currently, Europe does not have a search index of its own, making it vulnerable to digital dependence. The OWI offers a clear alternative based on European values. The project’s cross-disciplinary nature, ensuring continuous dialogue between technical teams and legal, ethical and social experts, ensures that fairness and privacy are built into the OWI from the start.

“Over thirty years since the World Wide Web was created at CERN and released to the public, our commitment to openness continues,” says Noor Afshan Fathima, IT research fellow at CERN. “Search is the next logical step in democratising digital access, especially as we enter the AI era.”

The OWI facilitates AI capabilities, allowing web search data to be used for training large language models (LLMs), generating embeddings and powering chatbots.

The CERN team has built key parts of the infrastructure that power the OWI’s crawling and indexing capabilities. This means that it tracks which webpages should be scanned. The system handles about 9 million URLs per hour, which equates to roughly 3 terabytes of public web data a day, with the aim of indexing 30–50% of the text-based web by the end of 2025. “We have already hit our target of indexing one petabyte of openly licensed web data, and our public dashboard helps users monitor that progress,” says Noor.

CERN is also contributing to other parts of the project. For example, it is scanning its own public physics content to enhance the OWI, as well as developing an internal index and its own search tools and services. Currently, a prototype of a use case for the OWI is in development: known as “Nooon”, this research-driven search engine is dedicated to people with disabilities who require search engines that surface structured, accessible and representative information while ensuring privacy in both access and contribution.

The release of the OWI, which has received funding from the European Union’s Horizon research and innovation programme, comes at a pivotal time. The European Commission’s Invest AI initiative is set to mobilise 200 billion euros for artificial intelligence, and the OWI offers a powerful foundation of open data for innovation. Furthermore, as Microsoft plans to retire access to the Bing index, the OWI will be able to offer an alternative index for European search engines.

After two and a half years of intensive research and development, anybody can now request access to the OWI by signing up at openwebindex.eu/auth/login. Note that the project provides a web index, and not a search engine or API, and users wishing to build their own search engines or chatbots will need a working knowledge of how to apply web index data.
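As a rough sanity check on the throughput figures quoted above, the short Python sketch below converts 9 million URLs per hour into a daily count and divides the quoted 3 terabytes per day by it to get an implied average size per URL. It assumes decimal terabytes (10^12 bytes) and a constant crawl rate around the clock; neither assumption is stated in the article, and the function names are purely illustrative.

```python
# Figures quoted in the article.
URLS_PER_HOUR = 9_000_000
# "Roughly 3 terabytes" per day; decimal TB (10**12 bytes) is an assumption.
BYTES_PER_DAY = 3 * 10**12


def urls_per_day(urls_per_hour: int) -> int:
    """Daily URL count, assuming a constant rate over 24 hours."""
    return urls_per_hour * 24


def avg_bytes_per_url(bytes_per_day: int, urls_per_hour: int) -> float:
    """Implied average amount of data stored per URL."""
    return bytes_per_day / urls_per_day(urls_per_hour)


if __name__ == "__main__":
    daily = urls_per_day(URLS_PER_HOUR)
    per_url_kb = avg_bytes_per_url(BYTES_PER_DAY, URLS_PER_HOUR) / 1000
    print(f"{daily:,} URLs per day, about {per_url_kb:.1f} kB per URL")
```

Under these assumptions the two figures are consistent with each other at an average in the low tens of kilobytes per URL, which is plausible for text-oriented page data.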
Read more: OpenWebSearch website; Ethical, open and non-commercial: the Open Web Search project is designed to provide Europe with the right alternative to existing search engines (home.cern); Towards an unbiased digital world (CERN Courier); Empowering data sovereignty through OpenWebSearch.eu (CERN Computing blog, behind the CERN SSO)

CERN
@strangequark "In the works" does not sound like something that'll be ready to address a crisis in less than three months. With Google being useless and any search left not based on it being based on Bing, what are we supposed to do in the meantime?...

@pteryx According to the #Wired article [1], companies that have signed long-term deals with Microsoft will maintain access for the time being: "the largest customers of the #Bing APIs will retain their access after August 11, while smaller developers that were less profitable for Microsoft to support are being cut off".

#DuckDuckGo is cited as one of the companies that will retain access. #Ecosia plans to switch to #OWI.

#Mojeek and #Brave have their own indexes.

[1] https://www.wired.com/story/bing-microsoft-api-support-ending/

Microsoft Cuts Off Access to Bing Search Data as It Shifts Focus to Chatbots

Microsoft is limiting access to tools that boosted its rivals, but larger customers like DuckDuckGo say they won’t be affected.

WIRED
@pteryx I should also mention that #OWI is on #Fedi (because of course it is) @openwebsearcheu