Now another esteemed project/repo I can wreck... 🤣🤣🤣

But seriously, thank you @sebnagel and the entire #ApacheNutch team for this invitation. I'm honored to join.

Thank you!

[FEATURE] Reimplement BulkProcessor · Issue #181 · opensearch-project/opensearch-java

Are there any plans on adding an equivalent of the BulkProcessor API, which was available in the High Level Rest Client, so that Index and Delete requests can be batched?

file-observatory/commoncrawl-fetcher at main · tballison/file-observatory

Single server/laptop grade file-observatory.

@willoremus I ❤️ that Google uses #CommonCrawl and thereby the fruits of #ApacheTika and #ApacheNutch.

@elan also see #Heritrix and of course #StormCrawler as alternatives to #ApacheNutch.

#OpenSearch #ApacheNutch index writer on its way?

Please take a look and offer feedback!

I started with #OpenSearch 1.x.

https://github.com/apache/nutch/pull/761

NUTCH-2920 -- first working attempt at an OpenSearchIndexWriter by tballison · Pull Request #761 · apache/nutch
