Data discovery software developer at JMP Discovery, LLC. Focused on data visualization and exploration. Prefer smoothers over fitted lines. Views my own.
Creator of Graph Builder UI within JMP.
Creator of Packed Bars chart type for high-cardinality Pareto data.
#DataViz #DataScience #TieDye #LessIsMore
Blog: https://rawdatastudies.com
Packed Bars: https://packedbars.com
Bluesky: https://bsky.app/profile/xangregg.bsky.social
New blog post: It started with me trying to understand some radar charts, but getting there required a side-quest of fitting a panel of sigmoidal curves. https://rawdatastudies.com/2025/12/15/from-radar-charts-to-curve-fitting-and-back/
My new blog post exploring data from a paper comparing step count changes of people who move to cities with different walk scores. #dataviz https://rawdatastudies.com/2025/08/23/step-count-versus-city-walkability/
I've created a little web app to experiment with alternative "data strip" views. That is, one-dimensional views to summarize a distribution, like a box plot.
In this screenshot, there are 14 data views and the 8 in green are my experimental alternatives.
app: https://xangregg.github.io/data-strips/
blog post: https://rawdatastudies.com/2025/07/05/data-strips-experiment/
I logged all my two-digit authentication codes in 2024, after sensing more high-value codes in 2023. Maybe I was right. Low first digits are relatively scarce. Hard to come up with an explanation, though. #dataviz
Here's an example of the difficulties of box plots for discrete data (Likert in this case, from poetry assessment study). In the second view the last group is clearly different from the others, but not so in the box plot version. #dataviz
I learned via #Numberphile about Sloane's Gap, an empty band that appears when you plot counts of integers appearing in the On-Line Encyclopedia of Integer Sequences (OEIS.org). Here's the original 2011 plot and my updated version with 2024 data and some categories colored.
arxiv.org/pdf/1101.4470

I learned from Heather Cox Richardson that today is the anniversary of standard time zones in the US (in 1883).

Here's a #dataviz I made unintentionally showing a time zone effect from a survey question about wake-up times, from the BLS American Time Use Survey.

Maine kids may have a beef with the standard. Not surprisingly, being east within a TZ goes with getting up earlier.

I found the origin of the term "boxen plot," which the seaborn library uses for letter-value plots, in this GitHub pull request from 2018. Looks like a seaborn invention connoting both a plural of box and a blend of box and violin. https://github.com/mwaskom/seaborn/pull/1490
New packed bar chart: US car sales by model. Didn't realize big trucks were still so dominant nor that Model Y was right up there with the RAV4. #dataviz
The staircase pattern in this WaPost moving average seems odd. Looking closer, there's a pretty strong sawtooth pattern in the dots, which is also a mystery. Even birth years lead to higher song ratings???
https://wordpress.com/post/rawdatastudies.com/2347