I think this #FutureLaw 2023 panel on GPT-4 gives a balanced view of the risks and opportunities of using tools like GPT-4. What surprises me is that no one is talking about how soon GPT-4 will be obsolete, replaced by something that improves over the previous iteration just as significantly. #LegalTech #LawFedi
OpenAI’s CEO confirms the company isn’t training GPT-5 and ‘won’t for some time’

OpenAI’s CEO Sam Altman has confirmed that the company is not currently training GPT-5 — the successor to its language model GPT-4, released this March. Altman was discussing fears about AI safety.

The Verge
@ltmccarty that article just says you can't assume that new versions are better than earlier versions to any stable degree. Granted. But no one is assuming. It is objectively getting significantly better. That they haven't started "training" GPT-5 yet doesn't mean anything. I don't see any evidence of a plateau, and I see lots of evidence to the contrary.

@lexpedite @ltmccarty — Two ways this could play out:

1. OpenAI feels threatened by the many free + open-source competitors (e.g., Eleuther, Dolly-2), and wonders whether the time and expense of training a foundation model is worth it — when they're competing with "free."

2. OpenAI — with Microsoft money — takes a run at improving the existing GPT-4 model incrementally, like they did with the davinci releases of GPT-3, GPT-3.5, etc.

Seems like they're choosing Option 2. Long Microsoft runway.

OpenAI’s CEO Says the Age of Giant AI Models Is Already Over

Sam Altman says the research strategy that birthed ChatGPT is played out and future strides in artificial intelligence will require new ideas.

WIRED

@ltmccarty @lexpedite Yes, that's really helpful. Thanks, Thorne.

I wonder if this comes from the lack of high-quality data sources. There are only so many human-created words. Reddit will only get you so far.

Last bastion of high-quality data: law? Judicial, statutory, and regulatory text seems like an evergreen source.

@damienriehl @ltmccarty I would be reluctant to categorize judicial writing as high-quality for general purposes. It's not even high-quality data for the legal purpose of predicting the outcome of cases, because it is biased toward cases where one or more of the parties is rich and/or crazy.

@lexpedite @ltmccarty

This is a "compared to what" and "what goal" problem.

Compared to Reddit and Twitter? Judicial writing is pretty high quality.

Gauging the law's current state (e.g., Roe v. Wade as no longer current law)? Pretty high quality.

Prediction? For that, is there *any* high quality source?

@damienriehl @ltmccarty We are talking about general-purpose language models, and compared to literally anything. No one wants an AI that will take 40 pages to explain something in esoteric language, for the benefit of the losing party, while hedging its bets against appeal. There is no use case except drafting judgments for which judgments are "good" data, and there they are good only if you don't care whether the judgment is correct. Nevergreen.
@lexpedite @damienriehl

The purpose of pre-training is not to make predictions for specific tasks, but to build some kind of model of the concepts underlying the lexical items in the texts. At least, that's the current understanding of how LLMs work.

So it does not matter what the outcome is in a particular case.
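To make the point concrete, here's a toy sketch of that objective — next-token prediction from raw text, with bigram counts standing in for what a real LLM estimates with a neural network. The corpus string and function names are just illustrations; the point is that no case outcomes or task labels appear anywhere in the training signal.

```python
# Toy illustration of the pre-training objective: predict the next token.
# Bigram counts are a crude stand-in for an LLM's learned distribution;
# note there are no labels for "who won the case" anywhere in training.
from collections import Counter, defaultdict

corpus = "the court held that the court denied the motion".split()

# Count which token follows which in the raw text.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(token):
    """Most likely next token under the bigram counts."""
    return follows[token].most_common(1)[0][0]

print(predict_next("the"))  # learned from the text alone: "court"
```

The model only ever sees "which word follows which," so the legal correctness of the judgment it was trained on never enters the objective.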