This question is for folks who have done some kind of computing research.

Did you ever get formal training in how to do a literature review? What about informal training?

Some options, in case that lowers the barrier to entering the conversation:

Learned in a formal course: 7.3%
Learned from peers: 27.3%
Learned from advisor: 34.5%
Other: 30.9%
As a follow-up question: what platform do you use for search?
Google Scholar: 76.9%
JSTOR/EBSCOhost/via library: 0%
ArXiv: 7.7%
Other: 15.4%
@etosch
for computers: acm
for everything else: library search
in general: following citation chains from any work that i'm starting from
@cxli I originally had ACM in here but could only have four options, so I replaced it with other!

@cxli For context: the #acmdl frictions make systematic reviews painful. It feels borderline unusable as a research tool and is incomplete.

#googlescholar is more complete, but the accuracy of the metadata drops off. I've found that historic searches (e.g., <1950) are mostly incorrectly dated.

I was curious whether this is corroborated by research and came across: https://pmc.ncbi.nlm.nih.gov/articles/PMC7079055/

@cxli Interestingly, this study (conducted in 2019) reports that the #ACMDL allows bulk download. I don't know if this feature is just hard to find or if it's been removed since then.

(Maybe @JonathanAldrich would know?)

@etosch @cxli I don't know the history but right now I think they are doing it as a defense against unauthorized LLM training and other things that act like DDOS. It can cause problems for certain kinds of academic use; given this, I'm honestly not sure it's worth the cost.
@JonathanAldrich @cxli I've had several research threads over the past 3-4 years that have more or less stalled out because while the DL seems like the best resource for them, it's just too labor intensive to manually search, click, download, refine the search, exclude papers already read, etc.
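
(A minimal sketch of one step in that loop, "exclude papers already read", assuming search results can be exported as records with a DOI field. The record layout, the function names, and the DOIs are hypothetical and are not tied to any ACM DL export format.)

```python
from typing import Iterable


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common URL/scheme prefixes so variants compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def unseen(results: Iterable[dict], already_read: Iterable[str]) -> list[dict]:
    """Return records whose DOI is not in already_read, dropping duplicates in results."""
    seen = {normalize_doi(d) for d in already_read}
    fresh = []
    for record in results:
        key = normalize_doi(record["doi"])
        if key not in seen:
            seen.add(key)  # also filters repeats across refined searches
            fresh.append(record)
    return fresh


if __name__ == "__main__":
    # Hypothetical records from two overlapping searches.
    results = [
        {"doi": "10.0000/example.1", "title": "Paper A"},
        {"doi": "https://doi.org/10.0000/EXAMPLE.2", "title": "Paper B"},
        {"doi": "doi:10.0000/example.1", "title": "Paper A (duplicate hit)"},
    ]
    already_read = ["10.0000/example.2"]
    for record in unseen(results, already_read):
        print(record["title"])
```

This only helps once the results are in hand; getting them out of the library in bulk is the part the thread is about.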
@JonathanAldrich @cxli I'm curious what their threat model is for LLMs (aside from DDOS) and how that relates to their costs and revenue. Like, what I really want is a database connection and _maybe_ some UI and querying features. I'd prefer to work locally, but I could also see value in working on an ACM-hosted private notebook (which could become public upon publication). I _do not_ want an "AI research assistant." I would accept certain constrained AI/ML tools, if I understood their affordances.

@JonathanAldrich @cxli What I'm wondering is whether people like me are even the target audience for ACM DL subscriptions. If yes, then surely others would be interested in these features! If no, I'd like to know what our alternatives are.

I'd love to hear any insights you have on this, @JonathanAldrich! I really appreciate having some insight into the mechanics of these orgs.

@etosch @cxli I can't speak definitively, but beyond DDOS (which seems implausible anyway) I think the threat model is simply that a lot of ACM members don't want to allow LLMs to train on their papers. And while ACM wouldn't necessarily mind allowing training on the remainder, ACM would like to get paid for it, enabling us to provide more services to members at lower cost.
@etosch @cxli And the people who don't want LLMs to train on their papers generally feel that way for ethical reasons. It's not an issue for me, but I respect those who feel that way and I think ACM wants to also respect those wishes. Of course, when your whole library is Open Access (and honestly, even if it is not), this is very hard to enforce. It may be a losing battle.
@JonathanAldrich @cxli whoops, you mentioned open access here; this is what happens when I reply to messages one at a time. 🙃

@JonathanAldrich @cxli Hm. I suspect a lot of the ACM members who don't want their work to be training data are also proponents of open access. I don't know if these options are as mutually exclusive as they appear.

I'm also not convinced that firms selling LLM services would have a competitive advantage over what a usable ACMDL UI could provide, but maybe I'm alone here?

@etosch @cxli Yeah LLMs.txt is supposed to allow training limits to coexist with open access (and indeed robots.txt could also be used). But compliance is voluntary. ACM's rate-limiting tools are a backstop--and lawsuits could be another one--but it's hard to be sure how effective they are.
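
(To illustrate why robots.txt compliance is voluntary: the check happens on the client side, so only a crawler that chooses to run it is constrained. A minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content, the user-agent names, and the URLs are made up for illustration and are not ACM's actual configuration.)

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler entirely,
# and keep every other agent out of the search endpoint.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Disallow: /search
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    paper = "https://example.org/papers/1234.pdf"
    search = "https://example.org/search?q=literature+review"
    # A well-behaved client asks before fetching; nothing forces it to.
    print(may_fetch(ROBOTS_TXT, "ExampleTrainingBot", paper))
    print(may_fetch(ROBOTS_TXT, "ResearchScript", paper))
    print(may_fetch(ROBOTS_TXT, "ResearchScript", search))
```

A client that skips `may_fetch` sees no difference, which is why server-side measures such as rate limiting end up as the backstop.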
@JonathanAldrich @cxli It's not just rate limiting though! Someone who definitely isn't me has managed to download papers programmatically (even open access ones!), but only by jumping through hoops (copying real browser headers, creating sessions, some other stuff I'd have to...ask them about). I assume it wasn't always like this?