This question is for folks who have done some kind of computing research.

Did you ever get formal training in how to do a literature review? What about informal training?

Some options, in case that lowers the barrier to entering the conversation:

Learned in a formal course: 7.3%
Learned from peers: 27.3%
Learned from advisor: 34.5%
Other: 30.9%
As a follow-up question: what platform do you use for search?
Google Scholar: 76.9%
JSTOR/EBSCOhost/via library: 0%
ArXiv: 7.7%
Other: 15.4%
@etosch
for computers: acm
for everything else: library search
in general: following citation chains from any work that i'm starting from

@cxli re: citation chains.

Ever notice that older papers have fewer citations? Is it because there's been more growth in the field? Or is it because citation practices have become bloated at best and polluting at worst?

Put another way: perhaps citation practices perform one function during review but two (possibly conflicting) functions once published?

@etosch i feel mixed on this but i also tend to have a different sense of what needs to be cited bc norms of english citation are different than in CS

crediting authors is very important to me but i tend to value citations that actually Add Substance to a piece of writing but have much less tolerance for citations that serve the purpose of "yes we read this paper don't bother me about it"

@cxli I agree the fields' norms feel qualitatively different, but I also never formally published in English, so my impression is based solely on my professors' feedback at the time!

One of the main pieces of work I really wanted to put together for the Helical project was an encoding of a specific model/hypothesis in some area of computer science that evolved over time, due to experimentation. Citation practices made this a challenging task.

@cxli The fundamental issue was that I couldn't differentiate _why_ a paper was being cited without diffing through the citing text. I'd characterize most of the citations to major papers as "junk."
@cxli As an example, there's a highly cited paper from CCS that does an empirical analysis of different fuzzers. I started painstakingly tracking down which papers cited it as a proxy for "generalized knowledge" being "passed down." I found that the first 20 or so papers cited it when justifying the number of independent runs they used for their fuzzers. They didn't explicitly engage with the paper otherwise.
@cxli Now, it's possible that "do at least 30 runs" or whatever _is_ the main generalized knowledge. I was just surprised to see so little variability in the context for those citations.
@cxli I ran into a similar issue when looking at benchmarks in ML. There's been a proliferation of new datasets, corpora, and "benchmark suites" over the past 5-10 years --- I was even part of a group that made one! (toybox.rs) Our work had a small amount of reuse and most of the citations were like "this thing exists." Looking into similar efforts, I realized this is the norm.
@etosch yeah i guess my main frustration w CS citation practice is how superficial it feels so much of the time.....? the way papers r written always make me feel like it's super individualistic work that only faintly ekes out connections to other stuff

@cxli STRONG AGREE.

Within smaller subfields, I think this is less true --- for better or for worse, PL has remained "niche" and IMO the community actually cares about getting the "process knowledge" right through apprenticeship.

I would note that I was trained to be the thing I hate and have absolutely contributed to the problem (ish: I'm a research nobody, so I actually just experience the psychological wound of the practice, without the benefit. :P)