Robert Gove

Senior Data Visualization Engineer at CrowdStrike. Award-winning cocktail maker. I also do math and machine learning and cybersecurity and stuff.
@scheidegger But people *do* evaluate things only on synthetic data, and then we all treat it as authoritative, e.g. readability of node-link vs. matrix-based diagrams http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.180.5768&rep=rep1&type=pdf But I take your point that both standard and synthetic data can be useful and serve a purpose. I just object when synthetic is the only means of evaluation, which happens too often.
@scheidegger I agree standard datasets are useful for those reasons. But I'm genuinely skeptical of techniques demonstrated on synthetic datasets. There are too many cases where techniques have wildly different results on synthetic and real datasets because the synthetic datasets are too abstract. E.g. most graphs are not nearly uniform meshes, and Watts-Strogatz/Barabási-Albert graphs have substantial topological differences from real social nets, yet these are standard graph layout datasets.
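
A minimal sketch of the Watts-Strogatz/Barabási-Albert point, assuming plain Python with hand-rolled generators. The function names (`watts_strogatz`, `barabasi_albert`, `average_clustering`, `max_degree`) and the parameter values are illustrative choices, not taken from any library or from the linked paper. It computes two simple statistics, average clustering and maximum degree, on one graph from each model. Real social networks are typically reported to have both high clustering and a heavy-tailed degree distribution, so running the same statistics on a real dataset is the comparison that would actually matter; no real data is included here.

```python
import random
from itertools import combinations


def watts_strogatz(n, k, p, rng):
    """Ring lattice (each node joined to its k nearest neighbours),
    then each lattice edge is rewired with probability p."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for j in range(1, k // 2 + 1):
            u = (v + j) % n
            adj[v].add(u)
            adj[u].add(v)
    for j in range(1, k // 2 + 1):
        for v in range(n):
            u = (v + j) % n
            if rng.random() < p and u in adj[v]:
                # Candidate endpoints: not v itself, not already a neighbour.
                choices = [w for w in range(n) if w != v and w not in adj[v]]
                if choices:
                    w = rng.choice(choices)
                    adj[v].discard(u)
                    adj[u].discard(v)
                    adj[v].add(w)
                    adj[w].add(v)
    return adj


def barabasi_albert(n, m, rng):
    """Preferential attachment: each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    adj = {v: set() for v in range(n)}
    targets = list(range(m))
    repeated = []  # every node appears once per incident edge
    for v in range(m, n):
        for u in targets:
            adj[v].add(u)
            adj[u].add(v)
        repeated.extend(targets)
        repeated.extend([v] * m)
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (d * (d - 1))
    return total / len(adj)


def max_degree(adj):
    return max(len(nbrs) for nbrs in adj.values())


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    graphs = {
        "Watts-Strogatz": watts_strogatz(1000, 6, 0.05, rng),
        "Barabasi-Albert": barabasi_albert(1000, 3, rng),
    }
    for name, g in graphs.items():
        print(name, "clustering:", round(average_clustering(g), 3),
              "max degree:", max_degree(g))
```

With these settings the Watts-Strogatz graph should come out highly clustered but with no hubs, and the Barabási-Albert graph should have hubs but little clustering. Each model captures one of the two properties, which is one concrete way a layout technique tuned on either could behave differently on real social data.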