Are you planning to attend (virtually or physically) #IMC23 next week? Be sure to catch our (@nrllah @thorstenholz @chwress @pizzahax @ProfPohlmann) talk on the similarity of Web measurements in the Security and Privacy session on Wednesday, the 25th!
In our paper, we perform a large-scale Web measurement (1.7M page visits) and discuss the differences and similarities between multiple measurements. We find that the results differ even when a website is crawled in parallel from two machines using the same setup.
However, we found significant differences between first- and third-party components. While first-party components load in a relatively stable and similar way, third-party elements (e.g., trackers) account for a significant share of the observed variance.
Finally, we learned three things in our study: (1) drawing conclusions based on loading dependencies is error-prone, (2) understanding whether the phenomenon of interest is present in the dynamic or static content of a page is vital for planning the experiments, and (3) our approach confirms that researchers should use different profiles and execute multiple measurements to assess the potential of ‘randomized’ findings.