Big new release of LLM, my CLI tool and Python library for working with Large Language Models (Llama 2, GPT-4 etc)

LLM 0.9 adds support for embedding models, installed via plugins

If you aren't familiar with embeddings I have a very detailed explanation of what they can do and how you can use them here:

https://simonwillison.net/2023/Sep/4/llm-embeddings/

Here's a fun example of something you can now do with LLM: search for every README.md file in your home directory and store embeddings for all of them in a collection called "readmes":

```
llm embed-multi readmes \
--model sentence-transformers/all-MiniLM-L6-v2 \
--files ~/ '**/README.md'
```

Then run a similarity search for "sqlite" like this:

```
llm similar readmes -c sqlite
```
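
For anyone curious what a similarity search like this boils down to, here is a minimal Python sketch of ranking stored vectors by cosine similarity. It is an illustration of the general idea rather than LLM's actual code: the function names, the three-dimensional toy vectors and the file names are all made up for the example, and a real embedding model produces vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_vector, collection, number=3):
    """Return the `number` best (id, score) pairs, highest score first."""
    scored = [
        (cosine_similarity(query_vector, vector), doc_id)
        for doc_id, vector in collection.items()
    ]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:number]]


# Hypothetical toy "embeddings" keyed by file path (assumed data, not real output)
readmes = {
    "sqlite-utils/README.md": [0.9, 0.1, 0.0],
    "datasette/README.md": [0.7, 0.3, 0.1],
    "cooking/README.md": [0.0, 0.2, 0.9],
}

# Stand-in for the embedding of the search term
query = [1.0, 0.0, 0.0]

results = most_similar(query, readmes, number=2)
for doc_id, score in results:
    print(f"{score:.3f}  {doc_id}")
```

The search term is embedded with the same model as the documents, so "closest vector" ends up meaning "closest in meaning" rather than "contains the same keyword".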

Also new today: the llm-cluster plugin, which derives clusters of documents from a collection of embeddings

A fun trick with that is that you can ask it to pass the items in each cluster through an LLM in order to generate titles for each cluster!

https://github.com/simonw/llm-cluster
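
To give a feel for what "deriving clusters" from embeddings means, here is a small k-means sketch in plain Python. This is an assumption about the general approach, not the plugin's code: k-means is just one common clustering technique, the helper names and toy two-dimensional vectors are invented for the example, and the initial centroids are simply the first `k` items so that the result is deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(items, k, iterations=10):
    """Group the ids in `items` (id -> vector) into at most k clusters."""
    ids = list(items)
    # Naive, deterministic initialisation: use the first k vectors as centroids
    centroids = [list(items[doc_id]) for doc_id in ids[:k]]
    assignments = {}
    for _ in range(iterations):
        new_assignments = {}
        for doc_id in ids:
            distances = [squared_distance(items[doc_id], c) for c in centroids]
            new_assignments[doc_id] = distances.index(min(distances))
        if new_assignments == assignments:
            break  # nothing moved, so the clusters are stable
        assignments = new_assignments
        for index in range(k):
            members = [items[d] for d in ids if assignments[d] == index]
            if members:
                centroids[index] = mean_vector(members)
    clusters = {}
    for doc_id, index in assignments.items():
        clusters.setdefault(index, []).append(doc_id)
    return list(clusters.values())


# Hypothetical toy embeddings: two obvious groups
embeddings = {
    "a": [0.0, 0.0],
    "b": [0.0, 1.0],
    "c": [10.0, 10.0],
    "d": [10.0, 11.0],
}

clusters = kmeans(embeddings, k=2)
print(clusters)
```

Generating a title for each cluster would then be a separate step: gather the text of the items in one cluster and hand it to a language model with a prompt asking for a short summary name.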

I just released a new version of my Symbex tool, which finds functions and classes in a Python codebase

It can now export the code it finds in a format that can then be piped to "llm embed-multi" to embed those functions

https://github.com/simonw/symbex/releases/tag/1.4

Release 1.4 · simonw/symbex

New output options: --json, --nl, --csv and --tsv. These can be used to produce structured output which can be consumed by other tools such as sqlite-utils or llm embed-multi. #40 To generate and s...
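
As a rough illustration of the kind of structured output that is useful here, the sketch below uses Python's standard `ast` module to find functions and classes in some source code and emit one JSON object per line. It is not Symbex's implementation: the `functions_as_nl_json` name, the `id` and `code` keys and the `filename:line` ID scheme are assumptions made for this example.

```python
import ast
import json


def functions_as_nl_json(source, filename="example.py"):
    """Return newline-delimited JSON, one record per function or class."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            record = {
                # Assumed ID scheme: file name plus the line the symbol starts on
                "id": f"{filename}:{node.lineno}",
                # Exact source text of the function or class
                "code": ast.get_source_segment(source, node),
            }
            lines.append(json.dumps(record))
    return "\n".join(lines)


source = '''def add(a, b):
    return a + b

class Greeter:
    def hello(self):
        return "hi"
'''

output = functions_as_nl_json(source)
print(output)
```

Each line pairs an ID with the text to embed, which is the general shape an embedding tool needs in order to store one vector per function.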

Ran "llm embed-multi" against a CSV file of all of my tweets, now I'm having fun running vibes-based searches against them

@simon Please excuse my ignorance, but what is the use case?
(Genuine q, python beginner)

@impersonal totally reasonable question! I'm still figuring out what you can do with embeddings of Python functions myself

My hunch is that they'll get interesting when combined with other tricks - like finding the example code most relevant to a question posed to an LLM