Seeking help! No matter how many keywords/names you search related to this story, the article doesn't appear on Google

This was a front page A1 story I wrote for WaPo on how smear campaigns and abuse women journalists endure are a press freedom issue. Can someone explain why the article does not appear on Google? https://www.washingtonpost.com/investigations/2023/02/14/women-journalists-global-violence/

These women journalists were doing their jobs. That made them targets.

A Forbidden Stories consortium: Female reporters are often pushed out of their jobs as global news organizations struggle to respond to disinformation campaigns

The Washington Post
One of the journalists featured in this story raised it to me; she is wondering if bad actors have been able to get the story hidden from Google. These women journalists were so brave to detail their abuse and harassment, and now the story is essentially wiped from the web. What is going on!?
washington post "these women journalists were doing their jobs. that made them targets." - Bing
Ask The Post AI - The Washington Post

What if all you had to do was ask? For concise, factual answers to the questions you have right now, our new tool, Ask The Post AI, delivers AI-generated summaries from published Washington Post reporting.

@thisismissem @taylorlorenz It's also unfindable on DuckDuckGo. My guess is that it has been hidden (and then delisted in search crawlers) by WaPo itself.

@leftyknowitall
It's not in WaPo's results for the primary subject but other pages are found, including an unrelated story from the same day that links to it as a "Lunchtime read."

https://www.washingtonpost.com/search/?query=Gharidah+Farooqi

Only the video included in the story is found for Juliana Dal Piva:

https://www.washingtonpost.com/search/?query=Juliana+Dal+Piva

The story is included in their sitemaps, used by search engines to index sites.

https://www.washingtonpost.com/sitemaps/sitemap-2023-02.xml
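
For anyone who wants to repeat the sitemap check, here is a minimal sketch. It assumes the file follows the standard sitemaps.org XML format, and it runs against a tiny hypothetical sample rather than the real file; `sitemap_urls` and `is_listed` are illustrative names. Keep in mind that a sitemap entry only tells crawlers the URL exists. A `noindex` directive on the page itself can still keep it out of results.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol (assumed to apply to this file).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

ARTICLE = (
    "https://www.washingtonpost.com/investigations/2023/02/14/"
    "women-journalists-global-violence/"
)

# Hypothetical two-entry sitemap for illustration; the real file is much larger.
SAMPLE_SITEMAP = f"""<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>{ARTICLE}</loc></url>
  <url><loc>https://www.washingtonpost.com/example/other-story/</loc></url>
</urlset>"""


def sitemap_urls(xml_text: str) -> set[str]:
    """Return every <loc> URL found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def is_listed(url: str, xml_text: str) -> bool:
    """True if the exact URL appears in the sitemap."""
    return url in sitemap_urls(xml_text)


print(is_listed(ARTICLE, SAMPLE_SITEMAP))
```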

There are zero results for Zartaj Gul Wazir:

https://www.washingtonpost.com/search/?query=Zartaj+Gul+Wazir

@thisismissem @taylorlorenz


@taylorlorenz Even searching for the URL directly in Google won't show it. It only returns links to posts that link to the article.

The only reason I can think is this feature somehow made it on their suppression list. This is just wrong.

@taylorlorenz I’ve been able to find syndicated copies of the story (on sites like rsn.org and thefridaytimes.com) which credit WaPo for the original. But definitely not finding the original easily. Seems like it must be on an explicit “no index” list, which is certainly something either Google or WaPo could do. Given that the behaviour is consistent across other search engines, I’d suspect WaPo has set a “no index” flag. Interesting that they still have the article online if you know the link.
@taylorlorenz Perhaps Alphabet has patriarchal elements that are hidden from the public gaze...
@njsm11 @taylorlorenz Oh, no, I reckon they're right out in plain sight; techbros gonna techbro, no matter where they work.
@taylorlorenz It’s not findable using @kagihq. The closest it gets is this tweet about the article by the WP: https://x.com/washingtonpost/status/1625550937937874945
The Washington Post (@washingtonpost) on X

Women who are targeted in these campaigns are doing some of the most crucial journalistic work in their regions: investigating powerful cultural leaders, exposing government wrongdoing and revealing corruption. https://t.co/emlznfw7jJ

X (formerly Twitter)
@taylorlorenz Is it paywalled? I can't get more than the first couple of paragraphs.
@megatronicthronbanks @taylorlorenz exactly. And paywalled sites are deprecated by search engines as most clickers just get frustrated
@Bruce_Ak @megatronicthronbanks @taylorlorenz Try an A/B test with other paywalled articles from the Washington Post.
@taylorlorenz It looks very much like some of the men you describe in that article are working for WaPo. Were I your editor, I'd find that person and show them the door; this is disgusting.
@taylorlorenz It would be extremely interesting to know when media orgs do this. I doubt this is the first time.
@taylorlorenz if I had to guess, which I am, there's probably an active lawsuit open between the publication and AI companies, so it's on a 'no-fly' list publicly even though they're still gonna scrape the data
@taylorlorenz So I happen to use Firefox on Ubuntu, but google search, and it shows up for me right now.
@taylorlorenz it's because there's a "noindex" tag in the head. This is telling search engines not to display the article in search results

@fromjason @taylorlorenz Also “noarchive.” It’s been specifically barred from the Internet Archive.

EDIT: This may not be a unique thing for WaPo stories, that said

@fromjason @taylorlorenz Based on the dates in the Wayback Machine, the blacklisting happened sometime before October 2023; the last archived date for the story was June 2023.

EDIT: That may be a widespread thing on the site though. I found other stories with “noarchive.”

But yeah @taylorlorenz, what @fromjason found is likely your answer. It looks like an internal block. (Possibly a legal thing? Though you probably have a better handle on that than us.)
@fromjason @taylorlorenz
But Brave search finds the article. Because they're going rogue?😉
@ShutterbugDoug @taylorlorenz yeah. The tag is mostly a "request". Search engines can choose to honor or ignore it.

@ShutterbugDoug

@fromjason @taylorlorenz Or they crawled it before the noindex tag was added.

@fromjason @taylorlorenz Wow, WaPo putting noindex on their past articles that are inconvenient to their owner's far-right interests??
What was this #WashingtonPost slogan again? "Democracy dies in indexes"? "Democracy dies from A to Z"? I keep forgetting it…
USA Today, Washington Post & Others Can't Get Their Sponsored Content Out Of Google News

Typically you do not want Google News to index and rank sponsored (paid for) content in the news results. But sites like USA Today, The Washington Post and many others simply cannot get their sponsor

Search Engine Roundtable
@fromjason
@taylorlorenz
Correct, the article's HTML head has an extra 'robots' meta tag containing 'noindex', which blocks it from search engines. This tag is on top of the 'robots' meta tag containing 'noarchive' (and some other directives) that all WP articles have. So some WP process (human or automated) added this extra noindex tag for some reason.
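
A rough way to check this yourself without reading the page source by hand is sketched below. It uses only Python's standard library and runs on a small HTML sample built from the two tags quoted in this thread, not on a live fetch. `RobotsMetaParser` and `robots_directives` are made-up names for illustration. The sketch pools the directives from every 'robots' meta tag it finds, which mirrors the point above: two separate tags, one effective set of directives.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags like <meta ... /> are also routed here by default.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        for part in (attr.get("content") or "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html: str) -> set[str]:
    """Return the combined set of robots directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample head built from the two tags quoted in the thread.
SAMPLE_HEAD = (
    '<head><meta name="robots" content="noindex"/>'
    '<meta name="robots" content="noarchive, max-image-preview:large"/></head>'
)

found = robots_directives(SAMPLE_HEAD)
print("noindex" in found, sorted(found))
```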

@fromjason @taylorlorenz

It's difficult to reconcile "democracy dies in darkness" with "content=noindex, content=noarchive".

@mhoye @taylorlorenz maybe the real democracy was the articles we scrubbed along the way

JK I don't know why they did this. Perhaps it was an act of God. Or maybe the index grinch.

@taylorlorenz The article contains the tag

<meta name="robots" content="noindex"/>

(view source and search for "noindex"), which prevents indexing by any search engine (e.g., https://developers.google.com/search/docs/crawling-indexing/block-indexing).

It would be interesting to know when and why WaPo added this tag.

Block Search Indexing with noindex | Google Search Central  |  Documentation  |  Google for Developers

A noindex tag can block Google from indexing a page so that it won't appear in Search results. Learn how to implement noindex tags with this guide.

Google for Developers

@taylorlorenz Does WaPo’s SEO suck? (Probably, I don’t know) Any way to get them to look into it?

Too bad google bombing doesn’t work anymore. We could mount a linking campaign.

@taylorlorenz I tried putting quotes (searches as string instead of separate words) on the article title and it did come up in Google. “These Women Journalists Were Doing Their Jobs. That Made Them Targets” Without quotes the results are very different. The results will also vary by region. https://www.weglot.com/blog/how-to-see-google-search-results-for-other-locations.

@taylorlorenz not sure this is what may be playing a role, but WaPo’s robots.txt file (one way webmasters can request robots handle things on their domain) has this:

User-agent: Google-Extended
Disallow: /

That is requesting any machine that makes a request with the user-agent set to “Google-Extended” not access (and therefore not index) anything at all on that domain.

Odd thing to have in a robots.txt file, but it is there at https://www.washingtonpost.com/robots.txt

@davidaugust @taylorlorenz Google-Extended is the AI crawler token; normal search still uses Googlebot, which is allowed by the robots file.

Google, however, has no index entry for the URL. That could have been done by Google in the background or by WaPo themselves in Search Console. With Google making it very difficult to submit URLs that you don't own, you would probably have to ask WaPo to submit it for crawling manually to try to fix it.
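
To illustrate the difference between the two user agents, here is a small sketch using Python's standard `urllib.robotparser`. It parses only the two lines quoted above, not the full robots.txt (which contains many more rules), and `allowed` is just an illustrative helper name. Under those assumptions, a rule aimed at Google-Extended does not apply to Googlebot, the regular search crawler.

```python
from urllib.robotparser import RobotFileParser

# Only the two lines quoted in the thread; the real file has many more rules.
RULES = """\
User-agent: Google-Extended
Disallow: /
"""

ARTICLE = (
    "https://www.washingtonpost.com/investigations/2023/02/14/"
    "women-journalists-global-violence/"
)


def allowed(agent: str, url: str, robots_txt: str = RULES) -> bool:
    """Return True if robots_txt lets the given user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for agent in ("Google-Extended", "Googlebot"):
    print(agent, allowed(agent, ARTICLE))
```

So this robots.txt rule on its own would not explain a page missing from ordinary search results.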

@tristan @taylorlorenz I agree. The WaPo web team (or even search marketing or organic search team, not sure how WaPo is structured) may be able to get into it as Tristan says.
@taylorlorenz Somebody archived it using archive.ph at this link https://archive.ph/5bs3U
@taylorlorenz That is because someone and his super computer is related to Google and has been able to refuse access to some articles or even finding words ...computers have been contaminated...

@taylorlorenz I found it in duckduckgo, which is Bing with a better tailor. But it's buried without adding "Washington Post" to the search. It popped right up when I did.

I didn't find it on Alexandra search, but they don't seem to include Washington Post articles. They have some bizarre sources I just noticed. Newt Gingrich 360? Wtaf?

Google's algorithms must not like you.

Maybe the fediverse needs a search engine.

Alleged Censorship By Google and YouTube

@taylorlorenz It's marked as "noindex".

You can look up the source code.
<meta name="robots" content="noindex"/><meta name="robots" content="noarchive, max-image-preview:large"/>

@taylorlorenz
Perhaps because Bezos prefers plastique women over the real thing.
https://mastodon.online/@davidaugust/113698879493205706
David August (@davidaugust@mastodon.online)

Amazon is flooding striking workers at DBK4 in Queens with freezing water in sub-zero weather, endangering everyone. If you can, please do not shop at Amazon right now. Social media posts confirming the flooding: https://www.instagram.com/reel/DD2nNx1Sl6i/ https://www.instagram.com/reel/DD2oNicJ96-/ https://www.facebook.com/share/r/bjYgQLcLVv9NVBRn/ https://twitter.com/nycdsa/status/1870569037194989688 #UnionStrong #union #amazon #strike #law

Mastodon
@taylorlorenz Found it on DDG using first 5 words of the title and also using these words from part of a key sentence:
“torment and humiliate women journalists”
DDG had several citations of the article.
Google responded with one using same search terms.

@taylorlorenz seems like it's not in the index, but you have to be the owner of the site to use the tool to inspect the index I think.

https://support.google.com/webmasters/answer/7474347?hl=en

Why is my page missing from Google Search? - Search Console Help

Troubleshooting missing pages and sites. Here's how to troubleshoot and fix the most common problems when your page or site is missing from Google Search results. Did you recently create the page or

@taylorlorenz Maybe a failure of SEO optimization? That sometimes happened with my articles in a big newspaper.
@taylorlorenz
Forgive my laziness in not checking myself, but have they told Google not to index the article via a robots.txt file?
@taylorlorenz Google is not a search engine, it's an advertisement platform. If you use it as a search engine you will always be disappointed.

@taylorlorenz
if you do a site search on Feb 23 some articles don't have the "noindex" tag.

site:https://www.washingtonpost.com/investigations/2023/02/