Hello :)
At https://mastodon.help we have a crawler that, once a week, crawls Mastodon instances in search of new ones, and once a day updates the info about those it has already discovered, which are searchable at https://mastodon.help/instances.

I'm thinking of making the crawler optionally create and maintain a searchable "global directory of public Mastodon accounts", which would also be updated once a day. It would retrieve account info *only* for accounts that are published in the opt-in public directories of Mastodon instances, and *only* for users who have not activated "opt-out of search engine indexing" ("preferences" -> "other" -> "opt-out of search engine indexing").

Each instance's public directory is opt-in, and even users who opt in to it can still opt out of search engine indexing. So I'm not sure whether the crawler should also exclude accounts that have the #nobot or #nobots tags in their bio. Those tags seem to me to refer to bots, particularly follow-bots, and not to search engines. What do you think about it?
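
To make the question concrete, here is a rough sketch of the filter I have in mind, not the actual crawler code. It assumes each account is a dict shaped like the Account entity returned by an instance's directory API, with `discoverable`, `noindex` and `note` (the bio, as HTML) fields. The function name `is_listable` and the `respect_nobot` switch are made up for illustration, and treating a missing or null `noindex` as an opt-out is my own conservative assumption.

```python
import re

# Matches #nobot or #nobots as a whole tag, case-insensitively.
NOBOT_TAG = re.compile(r"#nobots?\b", re.IGNORECASE)
# Crude tag stripper: bios arrive as HTML, and hashtags may be wrapped
# in <a>/<span> markup, so the raw text can be "#<span>nobot</span>".
HTML_TAG = re.compile(r"<[^>]+>")


def is_listable(account, respect_nobot=True):
    """Return True if the account may appear in the global directory.

    `account` is assumed to be a dict with the fields `discoverable`,
    `noindex` and `note`. `respect_nobot` is a hypothetical switch for
    the open question in this post.
    """
    # Only accounts that explicitly opted in to the instance directory.
    if account.get("discoverable") is not True:
        return False
    # Assumption: a missing or null `noindex` (e.g. on older instances)
    # is treated as an opt-out, so only an explicit False passes.
    if account.get("noindex") is not False:
        return False
    if respect_nobot:
        bio_text = HTML_TAG.sub("", account.get("note") or "")
        if NOBOT_TAG.search(bio_text):
            return False
    return True
```

The open question is basically whether `respect_nobot` should default to `True` or `False`. For example, `is_listable(acct)` and `is_listable(acct, respect_nobot=False)` only differ for accounts whose bio carries one of the two tags.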
