Holy shit, this made me realize I have *tremendous* room for improvement in my data visualizations https://stackoverflow.blog/2022/03/03/stop-aggregating-away-the-signal-in-your-data/
Stop aggregating away the signal in your data - Stack Overflow

@alexkyllo Fantastic. Thanks for sharing. 🙌🏻
@alexkyllo fantastic post. This is my kind of #DataViz . “Embrace the complexity of your data.” Thanks Alex.
@brohrer @alexkyllo Agreed. I'm only halfway through and already I feel like I'm learning an entirely new approach to take. Also can't wait to learn how to work w the tools they used to make these visuals.
@austin_bradley @brohrer Observable Plot looks really interesting, though I'm confident one could make plots of similar quality with ggplot2. On the message of the piece, I've long been in a "simplify as much as possible" mindset--aggregating the data to the point that the analysis is just simple comparisons between two to several groups--and this really challenges me on that.
@austin_bradley @brohrer I found the article in a link from this site, which also challenges my approach of using stock standard generalized linear models for everything, and recommends defaulting to using splines and nonparametric smoothers, and *never* discretizing data with quantiles or presenting it in tables (!!) https://hbiostat.org/rflow/analysis.html

@alexkyllo @austin_bradley oh wow, so many gems here. I’ve always had the best luck keeping as close to the raw data as possible, making as few aggregations and modeling simplifications as I can get away with.

Sometimes it’s necessary to collapse it in order to communicate it. And once you have a story, you can aggregate the hell out of it to streamline your storyline. But for me aggregation and modeling are mostly a necessary evil.

@brohrer @austin_bradley I never really even understood what something like LOESS was useful for, but now I see how it fits into EDA. I think I need to go read RMS now, I've seen so much good stuff from Harrell on Cross Validated and Twitter
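
For anyone else wondering what LOESS does during EDA, here is a minimal sketch of the idea: fit a small weighted straight line around each point and keep the fitted value there. It is a simplified illustration, not the implementation in R's `loess` or statsmodels (there are no robustness iterations). The function names `tricube` and `loess`, the `frac` parameter, and the simulated sine-plus-noise data are all assumptions made up for this example.

```python
import math
import random


def tricube(u):
    """Tricube kernel: weight is 1 at u = 0 and falls to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess(xs, ys, frac=0.2):
    """Local linear smoother evaluated at every x in xs (illustrative only)."""
    n = len(xs)
    k = min(n, max(3, math.ceil(frac * n)))  # neighbours used per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest neighbour.
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
        w = [tricube((x - x0) / h) for x in xs]
        # Weighted least squares for y = a + b * (x - x0); the fit at x0 is a.
        sw = sum(w)
        sx = sum(wi * (x - x0) for wi, x in zip(w, xs))
        sy = sum(wi * y for wi, y in zip(w, ys))
        sxx = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
        denom = sw * sxx - sx * sx
        if abs(denom) < 1e-12:
            fitted.append(sy / sw)  # degenerate case: weighted mean
        else:
            fitted.append((sxx * sy - sx * sxy) / denom)
    return fitted


# Simulated data (made up for illustration): a sine wave plus noise.
random.seed(0)
xs = [i * 0.05 for i in range(200)]
truth = [math.sin(x) for x in xs]
ys = [t + random.gauss(0, 0.3) for t in truth]

smooth = loess(xs, ys, frac=0.2)
flat_mean = sum(ys) / len(ys)  # what a single aggregate would report


def mse(estimate, target):
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


loess_mse = mse(smooth, truth)
flat_mse = mse([flat_mean] * len(ys), truth)
print(f"error of one overall mean: {flat_mse:.3f}")
print(f"error of LOESS curve:      {loess_mse:.3f}")
```

The single overall mean throws the wave away entirely, while the smoother follows it without anyone having to pick a model form first. In practice you would likely reach for a library smoother (for example the smoothing layers in ggplot2 or plotnine) rather than hand-rolling this.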

@alexkyllo @brohrer @austin_bradley Ok! Finding out what LOESS is useful for seriously made me decide to read all the linked material tonight 😃.

Thanks for sharing! Can't wait 😊

@alexkyllo I skimmed this as I've spent nearly a decade visualizing time series data. It looks like a wonderful article. I'll add it to my list to take a deeper dive into tomorrow when my brain can properly digest it. Thanks for sharing.
@jerodwaldman If you have any other content you like on this topic I'd love to see it!
@alexkyllo Makes me think of this comic by the awesome @allison_horst
@alexkyllo I noticed a lot of the examples used Observable Plot (https://observablehq.com/@observablehq/plot), a JavaScript library, for the visualisations. I was thinking of playing with some very similar visualisations of Energy trading related time series this weekend. Would you recommend that library as an easy one to get started with quickly? Any other good ways to do some quick hacking with #visualisation?
@stevie I haven't used Observable Plot myself yet. I like to use ggplot2 in R and plotnine in Python. Plotly sometimes, for interactive stuff.
@alexkyllo Thanks - I’ve never done anything in R but I’m happy in Python. I’d probably use that to parse these spreadsheets I have anyway. I’ll take a look at plotnine.