David McClure

@arnicas The mall episode stressed me out so much that I'm not sure I want to keep watching.
@TedUnderwood I wonder how many wordpieces are in the average novel - 75k, 100k? Getting into the right order of magnitude.

Really liking Polars as a replacement for Pandas, after using it for ~a week on a real project - https://www.pola.rs/. Thoughts so far -

- The speedup over Pandas really is noticeable and meaningful.

- API is more SQL-ish, very similar to Spark. I bet a lot of Spark pipelines could be converted to Polars without much structural change.

- Integrations with other Python packages (sklearn, seaborn) are somewhat less smooth than with Pandas. But nothing too hard to work around, so far.

Polars, lightning-fast DataFrame library

Polars is a blazingly fast DataFrame library completely written in Rust, using the Apache Arrow memory model. It exposes bindings for the popular Python and soon JavaScript languages. Polars supports a full lazy execution API allowing query optimization.

Shibboleths rule every statistical system trained on text. The meaning of a word is deeply coupled to the characteristics of the slice of people who use it.

It would not surprise me at all if fine-tuning to use a lot of emojis is what unleashed the unhingedness characteristic of someone who uses a lot of emojis.

Two-word prompt: "lenticular night." #midjourney #aiart
@heuser Looks awesome, will check this out! @paulgb's Treeverse might be food for thought on some of the layout questions - https://treeverse.app/.
Treeverse for Bluesky

Welcome to 2023, where we just go back to treating everything as a language problem, and it might just be better.

https://muse-model.github.io/

Muse: Text-To-Image Generation via Masked Generative Transformers

@arnicas @TedUnderwood @rivershavewings What a delightful paper!
Hi all, first post here! Does anyone know where I might find a big list of nouns that represent physical objects - tree, car, chair, mountain, etc? The bigger and more varied the better. Could that be extracted from something like Wordnet or Wikidata?