ProleWiki RAG MCP vs WSWS' (trots) Socialism AI - Lemmygrad
How many keywords can you stuff in a title, right? I’m posting this in the
prolewiki community because we’ll be discussing ProleWiki’s own in-development
RAG for LLMs, but first: you probably saw that the WSWS, i.e. the trots, published
‘Socialism AI’. In their press release
[https://www.wsws.org/en/articles/2025/12/12/gpid-d12.html], they basically congratulate themselves about how cool this is for the workers’ movement and socialism, great victory this and great victory that, blahblahblah. You know
how trots are. Their system is usable through ai.wsws.org [http://ai.wsws.org] or something iirc. It’s a web interface, so yes, it’s cool that it comes as a package you can just use from any device without having to fiddle with anything, but there are also a lot of problems with it, especially coming from self-proclaimed communists. Though with how much of a joke trots are to everyone, I feel like I’m not really adding fuel to the fire with this post
lol. We looked into how their system works, because they give absolutely zero indication of the technical implementation, and we found several notices of
copyright in the Terms of Service. They say that the output from their AI
belongs to them, for example. Courts in the US have found that purely AI-generated output can’t be copyrighted, but sure, I guess; not really my area of expertise. We’ll get into
it.

## Understanding what WSWS did

* WSWS did not train a model from the ground up
* WSWS did not fine-tune an existing open-source model
* WSWS is not running and hosting their own model

What WSWS does (and you can find this out
from just using browser tools, i.e. F12 on their homepage) is call the OpenAI (ChatGPT) and DeepSeek APIs. Their pipeline goes like this (as far as we can ascertain from
simple browser tools): You send your prompt -> they add their own instructions
to it -> LLM fetches WSWS blog articles to answer your prompt -> LLM reads blog
articles -> LLM answers your prompt with the WSWS blog articles as sources. This
is what we call RAG, or Retrieval-Augmented Generation. The technique is legit, I’m not disputing that; it’s just that the way they did it is both inefficient and
concerning.

## The Problems I have with that way of doing things

We’ll get into
the technical problems when I detail what the ProleWiki MCP will look like. For now: their system is also very closed-source and obfuscated. Mind you, I did not create an account (too much hassle if I want to retain my privacy on it), but you have to understand that your prompt + the LLM output transit through OpenAI and DeepSeek. Firstly, then, there is no privacy when using this service; with OpenAI it goes straight to the feds.
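We can’t see their backend, but from the requests browser tools show, the shape of such a pipeline is roughly this. A minimal sketch, and to be clear: every function name, model string, and instruction in it is my assumption for illustration, not their actual code:

```python
# Hypothetical sketch of a WSWS-style RAG proxy. We can't see their backend;
# this only shows the shape: your prompt gets wrapped with their instructions
# and retrieved articles, and the whole bundle is sent to a third-party API.

def retrieve_articles(prompt: str) -> list[str]:
    # Stand-in for their retrieval step (whatever search they run over wsws.org).
    return ["WSWS article text relevant to: " + prompt]

def build_payload(user_prompt: str) -> dict:
    articles = retrieve_articles(user_prompt)
    system = (
        "You are Socialism AI. Answer only from the provided WSWS articles.\n\n"
        + "\n\n".join(articles)
    )
    # This dict is what would be POSTed to api.openai.com or api.deepseek.com,
    # i.e. your prompt (and the model's answer) transit through their servers.
    return {
        "model": "gpt-4o-mini",  # assumed model name, not confirmed
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user_prompt},
        ],
    }

payload = build_payload("When was Lenin born?")
print(payload["messages"][1]["content"])  # your words, leaving your machine
```

The point is just that the instructions, your question, and the retrieved articles get shipped wholesale to a third-party API, and the answer comes back through it too.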
Secondly, they sell paid tiers, starting at $5 per month for 150 messages, which is… absolutely nothing. Thirdly, everything is closed off. They did not release any documentation on how this works or how you could run it yourself. Selling paid tiers is not a problem in itself, at least for me personally. You have to break even, and they do pay for API access to OpenAI and DeepSeek (though DeepSeek is very cheap). The problem I have is that they should at least offer an open-source implementation for people who know how to use it, or at the very least make the RAG files available. This is not the case. I’m also a proponent of paying it
forward. Yes this costs them money, but they could find a way to break even in
ways that don’t consist of just selling another SaaS (software-as-a-service).
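To put rough numbers on how cheap the API side is: here’s a back-of-the-envelope calculation, where the prices and token counts are my assumptions for illustration only (check current API pricing yourself):

```python
# Back-of-the-envelope cost of 150 RAG messages through a cheap API.
# All prices and token counts below are assumptions for illustration.

price_in_per_mtok = 0.27    # $ per million input tokens (assumed)
price_out_per_mtok = 1.10   # $ per million output tokens (assumed)

tokens_in_per_msg = 8000    # prompt + instructions + retrieved articles (assumed)
tokens_out_per_msg = 800    # answer length (assumed)

cost_per_msg = (
    (tokens_in_per_msg / 1e6) * price_in_per_mtok
    + (tokens_out_per_msg / 1e6) * price_out_per_mtok
)
cost_150 = 150 * cost_per_msg

print(f"~${cost_per_msg:.4f} per message, ~${cost_150:.2f} for 150 messages")
```

Even if you multiply these assumptions several times over, 150 messages cost well under a dollar in API fees, so a $5 tier carries a hefty margin.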
Let people pay it forward for others, or something. Accept that you will lose some money running this and cover the difference with dues, or with people in the party who have money and don’t mind maintaining the service. Accept donations. There are lots of ways to do this that are not so commercial, i.e. “if you can’t pay you must vacate the premises”.

## The technical implementation: ProleWiki MCP vs. Socialism AI

A few months ago we started working with a dev who was making the
Marxists Internet Archive available for RAG use. This project evolved and they
are now making a ProleWiki MCP with the pages we sent them. It’ll still be RAG,
but more efficient. So first, let’s look at how the Socialism AI RAG works. If
you remember the pipeline: You send your prompt -> they add their own
instructions to it -> LLM fetches WSWS blog articles to answer your prompt (<--
we are here) -> LLM reads blog articles -> LLM answers your prompt with the WSWS
blog articles as sources. The problem we’ve found lies in exactly what kind of data the LLM gets access to. Imagine it like a bin the LLM can sift through to make
an answer with. If you provide it with a link to the page, it parses that as HTML code, with all its tags, headers, script calls, etc. Imagine me giving you a page full of HTML code and asking you, “can you tell me when Lenin was born from
this info?” You can, but it’s gonna take a while, and a lot of what you read is simply unnecessary. And you only have this one page to make an answer from. If Lenin’s DOB is not neatly written on it, you have to do extra thinking to piece it together (this is the context window limit: the LLM simply can’t read through 250k WSWS articles, it has to pick and choose which articles are most likely to help answer the question). Therefore we can optimize this bin. Instead of giving you
full pages you can pick from, we can give you individual lines. In our RAG for ProleWiki, what our dev did was write some math that extracts every line from our pages on the principle of 1 line = 1 idea. Then it puts these ideas together in a matrix and sorts them by semantic closeness. What this means is that if you’re the
LLM, you don’t get a full page on the October Revolution or Lenin
[https://en.prolewiki.org/wiki/Vladimir_Lenin] to answer a question with. You
can see our page on Lenin is quite lengthy, and if you asked a question whose answer is not on this page when the LLM pulls it up before answering (for example, you can see the self-exile section is empty), it might not answer your question as well as it could. With the semantic matrix, instead of picking from
pages, it picks from lines to make a coherent answer. Instead of looking at just
Lenin’s page and filling its entire context window with it, it looks at semantic
information relating to Lenin’s self-exile on ProleWiki - or other sources you
add to the corpus, the ‘bin’ - and then builds an answer from that. This means that if
we have information about Lenin’s self-exile on say the USSR page (because why
not!), it will pull exactly that thread from that page. And this is much more
powerful than what the WSWS did, and it explains why they offer such measly usage rates. They are filling up the context window with noise tokens, because they’re sending an entire <!DOCTYPE HTML><head><meta-name>… HTML page instead of just the relevant content.

## But where does the MCP come in?

MCPs are kinda new, and were made
for AI to work with. I wouldn’t be the best person to explain them, but basically an MCP lets an LLM look at some data (websites, files, etc.) and work with that data in some way. They’re mostly used in agentic work: tools such as “view file” or “edit file” are exposed to the LLM, so it can perform these operations itself instead of having you do them and then confirm. So if you have an agent (such as crush
[https://github.com/charmbracelet/crush], our favorite here on lemmygrad), an
LLM can and will view and edit the files you tell it to. Those are an example of two tools. With an MCP, you give the LLM access to data it can read, and you can also give it its own tools. You could make a tool called “ProleWiki-fetch”: when the LLM decides it needs it (“okay, let’s use the prolewiki-fetch tool to look at data from ProleWiki to answer this question”), it communicates with the ProleWiki MCP you have installed locally, the MCP does its magic, and it sends the information back to the LLM. And not only that, but as we
said, you can also run this locally. We are still figuring out how we’ll package all of this, but most likely we’ll make the source files available so that anyone can build their own RAG or their own cloud web interface if they want. Likewise for the MCP: it will be downloadable with our source files, so that you can just add it to your agent interface and start querying the LLM and getting answers backed by ProleWiki content.

Communism is not in a position of strength currently.
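(A quick aside for the technically curious: nothing is finalized on our end, so here is only a toy sketch of what a tool like prolewiki-fetch amounts to. The names are hypothetical, the retrieval is stubbed, and a real server would speak the MCP protocol through an SDK rather than a plain dict of tools.)

```python
# Toy sketch of the tool flow: a server exposes named tools; the agent
# forwards the LLM's tool call and returns the result. Names are hypothetical,
# and the retrieval is a stand-in for the real semantic search.

LINES = {
    "lenin self-exile": "Line from ProleWiki about Lenin's years in self-exile...",
    "october revolution": "Line from ProleWiki about the October Revolution...",
}

def prolewiki_fetch(query: str) -> str:
    """Tool: return ProleWiki lines matching the query (stubbed lookup)."""
    hits = [text for key, text in LINES.items() if key in query.lower()]
    return "\n".join(hits) or "no match"

TOOLS = {"prolewiki-fetch": prolewiki_fetch}  # what the server advertises

def handle_tool_call(name: str, arguments: dict) -> str:
    # The agent sends this when the LLM decides to use a tool.
    return TOOLS[name](**arguments)

print(handle_tool_call("prolewiki-fetch",
                       {"query": "Tell me about the October Revolution"}))
```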
So, I don’t see any reason we should be trying to hide and obfuscate any of our
content. On the contrary, proletarian education demands it be accessible without
discrimination. Unlike trots, we trust the people to make the right decisions
collectively - if someone wants to use ProleWiki content to train a model and
paywall that, let them. There will be ten more who won’t. In fact, speaking of
models, our dev is also working on something there… but I was asked not to say
too much about it as it’s very experimental 🤐
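PS for the technically curious: here’s a toy version of the line-level retrieval described above. A real implementation uses a proper embedding model; this stand-in ranks lines by cosine similarity over word counts, just to show the shape of “pick lines by semantic closeness, not pages”:

```python
# Toy line-level RAG retrieval: rank individual lines (1 line = 1 idea) by
# similarity to the question, instead of stuffing whole pages into context.
# A real system would use an embedding model; word-count cosine is a stand-in.

import math
from collections import Counter

CORPUS = [  # (page, line) pairs pulled from different pages; content invented
    ("Vladimir Lenin", "Lenin was born on 22 April 1870 in Simbirsk."),
    ("Vladimir Lenin", "Lenin led the Bolsheviks during the October Revolution."),
    ("USSR", "During his self-exile, Lenin wrote extensively from abroad."),
    ("October Revolution", "The October Revolution took place in 1917."),
]

def vec(text: str) -> Counter:
    # keep letters, digits, spaces and hyphens; everything else becomes a space
    cleaned = "".join(ch if ch.isalnum() or ch in " -" else " " for ch in text.lower())
    return Counter(cleaned.split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_lines(question: str, k: int = 2):
    q = vec(question)
    ranked = sorted(CORPUS, key=lambda pl: cosine(q, vec(pl[1])), reverse=True)
    return ranked[:k]

for page, line in top_lines("What did Lenin do during his self-exile?"):
    print(page, "->", line)
```

Note how the best line comes from the USSR page, not Lenin’s own page: exactly the cross-page pull described above.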