I have roughly 50 lines of code that convert ...

custard_shops = {
    "Gilles": "https://gillesfrozencustard.com",
    "Murfs": "https://www.murfsfrozencustard.com",
    "Georgie Porgies": "https://georgieporgies.com",
    "Pops": "https://popscustard.com",
    "Golden Gyros": "https://goldengyro.com",
    "Leons": "https://leonsfrozencustardmke.com",
    "Kraverz": "https://www.kraverzcustard.com",
    "Hefners": "https://www.hefnerscustard.com"
}

... to what are supposed to be JSON objects that look like ...

shop: {'date': '[Day], [Month] [Date]', 'flavor_of_the_day': '[Flavor]', 'description': '[Flavor Description]', 'hours': '[Hours they are open, today]', 'address': '[Street Address Of Where They Are Located]'}

... but this needs a bit of attention. The goal is to run a process once daily in Docker that scrapes custard shop data (hours, the flavor of the day, etc.) so that I can use it to update Firebase Cloud Firestore.

#Python #ScreenScraping #AI #Custard
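
The target shape above uses single quotes, which Python accepts as a dict literal but strict JSON parsers reject. A minimal sketch of a tolerant parser follows, assuming the model's reply arrives as a plain string. `parse_shop_record` and `EXPECTED_KEYS` are hypothetical names for illustration, not part of the original script.

```python
import ast
import json

# Field names copied from the target shape shown above.
EXPECTED_KEYS = ("date", "flavor_of_the_day", "description", "hours", "address")


def parse_shop_record(reply: str) -> dict:
    """Turn a model reply into a dict that always has every expected key."""
    # Keep only the outermost {...}; models often wrap the object in chatter.
    start, end = reply.find("{"), reply.rfind("}")
    data = {}
    if start != -1 and end > start:
        snippet = reply[start:end + 1]
        try:
            data = json.loads(snippet)  # strict JSON (double quotes)
        except json.JSONDecodeError:
            try:
                data = ast.literal_eval(snippet)  # Python-style single quotes
            except (ValueError, SyntaxError, TypeError):
                data = {}
    if not isinstance(data, dict):
        data = {}
    # Missing or empty values become "NA" so later steps see one shape.
    return {key: str(data.get(key) or "NA") for key in EXPECTED_KEYS}
```

Anything the parser cannot read comes back as a record full of "NA" values rather than raising, which keeps a once-daily job from crashing on one bad reply.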
I updated my #python script to write to Firebase Cloud Firestore instead of just printing a JSON array to the terminal, but I'm still not getting great results out of it. Sometimes I get the data that I am after and sometimes I don't. I've tried a dozen different AI models and eventually settled on gemma2:27b. I think that I need to start tweaking the prompt.

#Firebase #Firestore #AI
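
A rough sketch of what the Firestore write could look like, assuming one document per shop keyed by the shop name. The collection name `custard_shops` and the helper `save_shop` are hypothetical; the thread does not show the actual schema. The function takes the client as an argument so it can be exercised without credentials.

```python
def save_shop(db, shop_name: str, record: dict, collection: str = "custard_shops") -> None:
    """Create or overwrite one document per shop, keyed by the shop name."""
    # collection(...).document(...).set(...) is the Firestore client's
    # create-or-replace call for a single document.
    db.collection(collection).document(shop_name).set(record)


# With the firebase_admin SDK, the client would typically be created like this
# (the credentials path is a placeholder):
#   import firebase_admin
#   from firebase_admin import credentials, firestore
#   firebase_admin.initialize_app(credentials.Certificate("serviceAccount.json"))
#   db = firestore.client()
#   save_shop(db, "Gilles", {"flavor_of_the_day": "NA"})
```

Overwriting the same document each day keeps the collection at one entry per shop instead of growing a history.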
I played with my #python script to tweak the prompt and finally got to ...

prompt = f"Today's date is {today_date}. I want to know today's custard flavor of the day, the description of today's custard flavor of the day, today's date, the numeric day of the month, the current month (spelled out), the name of the custard shop, the URL of the custard shop's website, the physical street address of where the custard shop is located, the city the shop is physically located in, and the hours the custard shop is open today. The 'flavor of the day' could be listed as 'TODAY’S FLAVOR OF THE DAY', 'Gilles Flavor of the Day', 'Flavor of the Day', 'Today’s Flavor', or the '{today_date} flavor of the day'. Do not include any other information beyond what I am requesting. If you cannot find the value of an attribute, set it to the value 'NA' in the object that you return. The description of the flavor of the day must not include the phrase 'They are currently hiring and offer competitive wages' and the description must not include the phrase 'Gilles Frozen Custard is a Milwaukee institution serving up delicious frozen custard in a variety of flavors'. The flavor of the day must not be for a future day. The result must look like this without any modifications ... 'date': '[Day], [Month] [Date]', 'flavor_of_the_day': '[Flavor]', 'description': '[Flavor Description]', 'hours': '[Hours they are open, today]', 'url': '[The address for their website]', 'name': '[The name of the custard shop]', 'location': '[Street Address Of Where They Are Located]', 'city': '[The city the shop is located in]', 'day': '[The numeric day of the month]', 'month': '[The current month, spelled out]'"

... But it still isn't playing nicely. I could reduce the number of attributes returned or try to get a 70b model working on it.

#Firebase #Firestore #AI #ScreenScraping
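
One option besides a bigger model is asking Ollama to constrain the reply to JSON. The sketch below assumes the script talks to a local Ollama server through its `/api/generate` REST endpoint; the thread does not show how the model is actually called, and `build_payload` and `ask_ollama` are hypothetical helper names. The prompt should still ask for JSON explicitly when `format` is set.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_payload(model: str, prompt: str) -> dict:
    """Request body for a single, non-streamed, JSON-formatted completion."""
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}


def ask_ollama(model: str, prompt: str, url: str = OLLAMA_URL, timeout: float = 300.0) -> str:
    """POST the prompt and return the generated text from the 'response' field."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

A call would look like `ask_ollama("gemma2:27b", prompt)`, with the returned string then parsed into a record.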
I tried rerunning it using mixtral:8x22b (https://ollama.com/library/mixtral:8x22b). It requires 300GB of memory and 74.1GB of disk space, but I figured that it was worth a try.

I should try mixtral:8x7b next ...
mixtral:8x22b

A set of Mixture of Experts (MoE) model with open weights by Mistral AI in 8x7b and 8x22b parameter sizes.

It looks like mixtral:8x7b (https://ollama.com/library/mixtral:8x7b) has some promise. I might also try tweaking the prompt to see if I can get a good result.

I made a few changes:

- I went back to using gemma2:27b
- I added some new shops
- I reduced the number of attributes to just four
- I set a non-standard user agent for when it gets the page
- I set up some shops to get the flavor from their Facebook page instead of their website
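
Setting a custom user agent for the page fetch can be sketched with the standard library alone. The user-agent string, `build_request`, and `fetch_page` below are hypothetical; the thread does not say which HTTP library or user-agent value the script uses.

```python
import urllib.request

# Hypothetical value; any descriptive string works here.
USER_AGENT = "CustardBot/0.1 (personal flavor-of-the-day script)"


def build_request(url: str, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Build a GET request that sends a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_page(url: str, timeout: float = 30.0) -> str:
    """Download a page and decode it using the charset the server declares."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```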

If the script can't find a flavor, it returns the value as "NA", so I will probably have the final app show a shop only if its flavor value *is not* "NA".
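
The "NA" filter could be as small as this, assuming the records are held in a dict keyed by shop name and the flavor lives under `flavor_of_the_day`, as in the shape earlier in the thread. `shops_to_show` is a hypothetical name.

```python
def shops_to_show(records: dict) -> dict:
    """Keep only shops whose flavor was actually found (not 'NA' or missing)."""
    return {
        name: record
        for name, record in records.items()
        if record.get("flavor_of_the_day", "NA") != "NA"
    }
```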

Next, I need to see whether I should add any more custard shops.
I wrote a wrap-up for this on @[email protected]. It should be up on Thursday.