Mastodawn

jslakro

Moving from GitHub to Codeberg, for lazy people

https://unterwaditzer.net/2025/codeberg.html

Moving from GitHub to Codeberg, for lazy people - Markus Unterwaditzer

INTPenis 2d ago

Lazy has nothing to do with it, codeberg simply doesn't work.

Most of my friends who use codeberg are staunch cloudflare-opponents, but cloudflare is what keeps Gitlab alive. Fact of life is that they're being attacked non-stop, and need some sort of DDoS filter.

Codeberg has that anubis thing now I guess? But they still have downtime, and the worst thing ever for me as a developer is having the urge to code and not being able to access my remote. That is what murders the impression of a product like codeberg.

Sorry, just being frank. I want all competitors to large monopolies to succeed, but I also want to be able to do my job/passion.

freedomben 2d ago

I've had the same experience.

Philosophically I think it's terrible that Cloudflare has become a middleman in a huge and important swath of the internet. As a user, it largely makes my life much worse. It limits my browser, my ability to protect myself via VPNs, etc, and I am just browsing normally, not attacking anything. Pragmatically though, as a webmaster/admin/whatever you want to call it nowadays, Cloudflare is basically a necessity. I've started putting things behind it because if I don't, 99%+ of my traffic is bots, and often bots clearly scanning for vulnerabilities (I run mostly zero PHP sites, yet my traffic logs are often filled with requests like /admin.php and /wp-admin.php and all the wordpress things, and constant crawls from clearly not search engines that download everything and use robots.txt as a guide of what to crawl rather than what not to crawl). I haven't been DDoSed yet, but I've had images and PDFs and things downloaded so many times by these things that it costs me money. For some things where I or my family are the only legitimate users, I can just firewall-cmd all IPs except my own, but even then it's maintenance work I don't want to have to do.

I've tried many of the alternatives, and they often fail even on legitimate use cases. I've been blocked more by the alternatives than I have by Cloudflare, especially that one that does a proof of work. It works about 80% of the time, but that 20% is really, really annoying to the point that when I see that screen pop up I just browse away.

It's really a disheartening state we find ourselves in. I don't think my principles/values have been tested more in the real world than the last few years.

rglullis 2d ago

Either I am very lucky or what I am doing has zero value to bots, because I've been running servers online for at least 15 years, and never had any issue that couldn't be solved with basic security hygiene. I use cloudflare as my DNS for some servers, but I always disable any of their paid features. To me they could go out of business tomorrow and my servers would be chugging along just fine.

j16sdiz 2d ago

Sometimes it is not security; it could just be bandwidth or CPU.

I have a website small enough not to attract too many bots, but sometimes something very innocent can bring my website down.

For example, I put up a PHP iCal viewer, and some crawler started loading the calendar page, taking up all the CPU cycles.

dspillett 2d ago

> and use robots.txt as a guide of what to crawl rather than what not to crawl

Mental note, make sure my robots.txt files contain a few references to slowly returning pages full of almost nonsense that link back to each other endlessly…

Not complete nonsense, that would be reasonably easy to detect and ignore. Perhaps repeats of your other content with every 5th word swapped with a random one from elsewhere in the content, every 4th word randomly misspelt, every seventh word reversed, every seventh sentence reversed, add a random sprinkling of famous names (Sir John Major, Arc de Triomphe, Sarah Jane Smith, Viltvodle VI) that make little sense in context, etc. Not enough change that automatic crap detection sees it as an obvious trap, but more than enough that ingesting data from your site into any model has enough detrimental effect to token weightings to at least undo any beneficial effect it might have had otherwise.

And when setting traps like this, make sure the response is slow enough that it won't use much bandwidth, and the serving process is very lightweight, and just in case that isn't enough make sure it aborts and errors out if any load metric goes above a given level.
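
A minimal Python sketch of the kind of garbling described above, for illustration only. The names `garble` and `overloaded`, the word pool, and the load limit are all assumptions made up for this example, not taken from any real tool. It swaps every 5th word for a random one, reverses every 7th word, and checks the 1-minute load average so a trap handler could bail out early.

```python
import os
import random


def garble(text: str, pool: list[str], seed: int = 0) -> str:
    """Swap every 5th word for a random pool word and reverse every 7th word."""
    rng = random.Random(seed)  # seeded so the same page garbles the same way
    out = []
    for i, word in enumerate(text.split(), start=1):
        if i % 5 == 0:
            out.append(rng.choice(pool))  # hypothetical pool of unrelated words
        elif i % 7 == 0:
            out.append(word[::-1])
        else:
            out.append(word)
    return " ".join(out)


def overloaded(limit: float = 2.0) -> bool:
    """True if the 1-minute load average exceeds the (assumed) limit."""
    try:
        return os.getloadavg()[0] > limit
    except (AttributeError, OSError):
        # Load average is not available on every platform.
        return False
```

A trap handler would presumably call `overloaded()` first and return an error instead of serving the page when it is true; the slow, throttled response is left out of this sketch.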

matrss 2d ago

So, basically iocaine (https://iocaine.madhouse-project.org/). It has indeed been very useful to get the AI scraper load on a server I maintain down to a reasonable level, even with its not so strict default configuration.

iocaine - the deadliest poison known to AI

willx86 2d ago

https://blog.cloudflare.com/ai-labyrinth/

A bit like this? (iocaine is newer)

Trapping misbehaving bots in an AI Labyrinth

How Cloudflare uses generative AI to slow down, confuse, and waste the resources of AI Crawlers and other bots that don’t respect “no crawl” directives.

The Cloudflare Blog

dwedge 2d ago

While I sympathise, I disagree with your stance. Cloudflare handle a large % of the Internet now because of people putting sites that, as you admitted, don't need to be behind it there.

[dead]

My own git server has been hit severely by scrapers. They're scraping everything. Commits, comparisons between commits, api calls for files, everything.

And pretty much all of them, ByteDance, OpenAI, AWS, Claude, various I couldn't recognize. I basically just had to block all of them to get reasonable performance for a server running on a mini-pc.

I was going to move to codeberg at some point, but they had downtime when I was considering it, I'd rather deal with that myself then.

marginalia_nu 2d ago

Anyone actually scraping git repos would probably just do a 'git clone'. Crawling git hosts is extremely expensive, as git servers have always been inadvertent crawler traps.

They generate a URL for every version of every file on every commit and every branch and tag, and if that wasn't enough, n(n+1)/2 git diffs for every file on every commit it has existed on. Even a relatively small git repo with a few hundred files and commits explodes into millions of URLs in the crawl frontier. Server side many of these are very expensive to generate as well so it's really not a fantastic interaction, crawler and git host.
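
A rough back-of-the-envelope check of that estimate in Python. This is only a sketch: it assumes, as an upper bound, that every file exists on every commit, and it ignores branches and tags. The function name `crawl_frontier_size` is made up for this example.

```python
def crawl_frontier_size(files: int, commits: int) -> int:
    """Upper-bound URL count: one view per file per commit, plus pairwise diffs."""
    file_views = files * commits
    # n(n+1)/2 diffs per file, where n is the number of commits it exists on
    diffs_per_file = commits * (commits + 1) // 2
    return file_views + files * diffs_per_file


print(crawl_frontier_size(300, 300))  # 13635000
```

Under those assumptions, 300 files over 300 commits already gives roughly 13.6 million URLs, nearly all of them diffs.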

If you run a web crawler, you need to add git host detection to actively avoid walking into them.
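
One way such detection could look, as a hedged Python sketch. The path patterns below are an illustrative, incomplete guess at what common forge web UIs use for history views (an assumption, not a verified list), and `looks_like_forge_history_url` is a hypothetical name. A real crawler would likely combine this with other signals.

```python
import re
from urllib.parse import urlparse

# Assumed, non-exhaustive path segments used by git web UIs for per-commit views.
FORGE_PATH = re.compile(
    r"/(?:-/)?(?:commits?|blob|blame|tree|compare|src/(?:commit|branch|tag))/"
)


def looks_like_forge_history_url(url: str) -> bool:
    """Heuristic: does the URL path look like a per-commit or diff view?"""
    return bool(FORGE_PATH.search(urlparse(url).path))
```

A crawler could drop matching URLs from its frontier and, if it really wants the code, fetch the repository once with a 'git clone' instead.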

Tharre 2d ago

And yet, it's exactly what all the AI companies are doing. However much it costs them in server costs and good will seems to be worth less to them than the engineering time to special case the major git web UIs.

frevib 2d ago

OP is about Github. Have you seen the Github uptime monitor? It’s at 90% [1] for the last 90 days. I use both Codeberg and Github a lot and Github has, by far, more problems than Codeberg. Sometimes I notice slowdowns on Codeberg, but that’s it.

[1] https://mrshu.github.io/github-statuses/

The Missing GitHub Status Page

Historical GitHub uptime reconstructed from archived status data.

kevinfiol 2d ago

To be fair, Github has several orders of magnitude more users running on it than Codeberg. I'm also a Codeberg user, but I don't think anyone has seen a Forgejo/Gitea instance working at the scale of Github yet.

[delayed]

I think evaluating alternatives to GitHub is going to become increasingly important over the coming years. At the same time, I think these kinds of migrations discount how much GitHub has changed the table stakes/raised the bar for what makes a valuable source forge: it's simply no longer reasonable to BYO CI or accept one that can't natively build for a common set of end-user architectures.

This on its own makes me pretty bearish on community-driven attempts to oust GitHub, even if ideologically I'm aligned with them: the real cost (both financial and in terms of complexity) of user expectations around source forges in 2026 is immense.

wongarsu 2d ago

CI needs good integration into the source forge. But I don't really perceive Github actions as a huge benefit over the times when everyone just set up CircleCI or whatever. As long as it can turn PR checks red, yellow and green and has a link to the logs I'm happy

The whole PR and code review experience is much more important to me. Github is striving to set a high bar, but is also hilariously bad in some ways. Similarly the whole issue system is passable on Github, but doesn't really reach the state of the art of issue systems from 20 years ago

[dead]

> it's simply no longer reasonable to BYO CI

Why? I know plenty of teams which are fine with repo and CI being separate tools as long as there is integration between the 2.

CuriouslyC 2d ago

Actions are bad, but they're free (to start) and just good enough that they're useful to set up something quick and dirty, and tempt you to try and scale it for a little while.

woodruffw 2d ago

Emphasis on teams; the median open source project has a fraction of a single person working on it.

usrbinenv 2d ago

I don't understand the hype around CI and that it's supposedly impossible to run something like that without Git, let alone Github. Like sure, a nice interface is fine, but I can do with a simpler one. I don't need a million features, because what is CI (in practice today, not in theory)? It's just a set of commands that run on a remote machine and then the output of those commands is displayed in the browser and it also influences what other commands may or may not run. What exactly is the big deal here? It can probably be built internally if needed and it certainly doesn't need to depend on git so much - git can trigger it via hooks, but that's it?

I think the real problem is we were sold all these complex processes that supposedly deliver better results, while in reality for most people and orgs it's just cargo culting, like with Kubernetes, for example. We can get rid of 90% of them and be just fine. You easily get away without any kind of CI in teams of less than 5-7 people I would argue - just have some sane rules and make everyone follow them (like run unit tests before submitting a PR).
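
To make the "set of commands on a remote machine" view concrete, here is a minimal Python sketch of such a runner. It is an illustration under assumptions, not a description of any existing CI product: `run_pipeline` and `PYTHON` are made-up names, and a real setup would still need isolation, secrets handling, and somewhere to publish the logs.

```python
import subprocess
import sys

PYTHON = sys.executable  # interpreter path, handy for building example steps


def run_pipeline(steps: list[list[str]]) -> tuple[bool, list[str]]:
    """Run each command in order, stopping at the first failure.

    Returns (passed, logs), with one combined stdout/stderr entry per step run.
    """
    logs = []
    for step in steps:
        result = subprocess.run(step, capture_output=True, text=True)
        logs.append(result.stdout + result.stderr)
        if result.returncode != 0:
            return False, logs  # later steps are skipped
    return True, logs
```

For example, `run_pipeline([[PYTHON, "-m", "unittest"]])` runs the unit tests and reports pass or fail. A server-side git hook such as post-receive could plausibly invoke a script like this, which is about all the git integration the comment argues is needed.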

duped 2d ago

> just have some sane rules and make everyone follow them (like run unit tests before submitting a PR)

and thus you discover the value of CI

IshKebab 2d ago

The big deal is that GitHub provides it for free. Plus it's integrated properly into the PR workflow.

Good luck implementing merge queues yourself. As far as I know there are no maintained open source implementations of merge queues. It's definitely not as trivial as you claim.

psychoslave 2d ago

Working with all these modern layers, I don't see why people bother so much about it. This is all an upper-level decision to centralize so they feel they keep control. As a dev I'm 100% confident life would be at least as pleasant without all these abysmal layers of remote services that could all be replaced with distributed solutions that work 100% locally with a thin sync step here and there.

noirscape 2d ago

I don't dislike Codeberg inherently, but it's not a "true" GitHub replacement. It can handle a good chunk of GitHub repositories (namely those for well established FOSS projects looking to have everything a proper capital P project has), but if you're just looking for a generic place to put your code projects that aren't necessarily intended for public release and support (i.e. random automation scripts, scraps of concepts that never really got off the ground, things not super cleaned up), they're not really for that - private repositories are discouraged according to their FAQ and are very limited (up to 100 MB).

They also don't want to host your homepage, so if GitHub Pages is why you used GitHub, they are not a replacement.

Unfortunately I don't think there's really an answer to that conundrum that doesn't involve just spinning up your own git server and accepting all the operational overhead that comes with it. At least Forgejo (software behind Codeberg) is FOSS, so you can do that and it should cover most of what you need (and while you're in the realm of having a server, a Pages-esque replacement is trivial since you're configuring a webserver anyway.) Maybe Gitlab.com, although I am admittedly unfamiliar with how Gitlab's "main" instance has changed over the years wrt features.

Here's their FAQ on the matter, it's worth a read: https://docs.codeberg.org/getting-started/faq/

Frequently Asked Questions | Codeberg Documentation

real_joschi 2d ago

> They also don't want to host your homepage, so if GitHub Pages is why you used GitHub, they are not a replacement.

https://docs.codeberg.org/codeberg-pages/

Codeberg Pages | Codeberg Documentation

noirscape 2d ago

From their FAQ:

> If you do not contribute to free/libre software (or if it is limited to your personal homepage), and we feel like you only abuse Codeberg for storing your commercial projects or media backups, we might get unhappy about that.

Emphasis mine. This isn't about if it's technically possible (it certainly is), it's whether or not it's allowed by their platform policies.

Their page publishing feature seems more like it's meant for projects and organizations rather than individual people. The way it's described here indicates that using them to host your own blog/portfolio/what have you is considered to be abusing their services.

johnisgood 2d ago

Reading what you quoted, no it is not, as long as you contribute to free software or you have projects that are open source. Not just your personal homepage. If you only have a personal homepage and nothing else that is open source, then they have a problem.

My 2 cents.

noirscape 2d ago

Which makes it not really a suitable replacement for GitHub, which is my entire point.

Keep in mind, I'm not saying Codeberg is bad, but its terms of use are pretty clear in the sense that they only really want FOSS and anyone who has something other than FOSS better look elsewhere. GitHub allowed you to basically put up anything that's "yours" and the license wasn't really their concern - that isn't the case with Codeberg. It's not about price or anything either; it'd be fine if the offer was "either give us $5 for the privilege of private repositories or only publish and contribute public FOSS code" - I'm fine paying cash for that if need be.

One of the big draws of GitHub (and what got me to properly learn git) back in the day with GitHub Pages in particular was "I can write an HTML page, do a git push and anyone can see it". Then you throw on top an SSG (GitHub had out of the box support for Jekyll, but back then you could rig Travis CI up for other page generators if you knew what you were doing), and with a bit of technical knowledge, anyone could host a blog without the full on server stack. Codeberg cannot provide that sort of experience with their current terms of service.

Even sourcehut has, from what I can tell, a more lenient approach to what they provide (and the only reason why I wouldn't recommend sourcehut as a GitHub replacement is because git-by-email isn't really workable for most people anymore). They encourage FOSS licensing, but from what I can tell don't force it in their platform policies. (The only thing they openly ban is cryptocurrency related projects, which seems fair because cryptocurrency is pretty much always associated with platform abuse.)

enraged_camel 2d ago

That FAQ snippet is insane to me. Maybe it's a cultural thing but I'd never do business with a company that has implicit threats in their ToS based on something so completely arbitrary.

0x3f 2d ago

The worst part is really the unclear procedure. If they set out terms that say they'll give me 4 weeks to migrate projects they don't like off the platform, with n email reminders in between, then that's not ideal but fine. As it is, I'd be worried I'll wake up to data loss if they get 'unhappy'. I have the same problem with sourcehut, actually, with their content policy.

shimman 2d ago

Seems fair to me, they're a nonprofit that exists in our lived reality and not an abusive monopolist that can literally throw a billion dollars to subsidize loss leaders.

All it shows the world is why there needs to be a VAT like tax against US digital services to help drive a public option for developers.

There's no reason why the people can't make our own solutions rather than be confined to abusive private US tech platforms.

ronsor 2d ago

The truth is that I publish OSS projects on GitHub because that's where the community is, and the issues/pull requests/discussions are a bonus.

If I just want to host my code, I can self host or use an SSH/SFTP server as a git remote, and that's usually what I do.

LinXitoW 2d ago

Considering that "the community" is now filled with vibe coding slop pull requesters, and non-coders bitching in issues, the filter that not-github provides becomes better and better.

Of course, that mostly goes for projects big enough to already have an independent community.

mtlynch 2d ago

>The by far nastiest part is CI. GitHub has done an excellent job luring people in with free macOS runners and infinite capacity for public repos.

This was my biggest blocker as well, as there weren't any managed CIs that supported Codeberg until recently.

NixCI[0] recently added support for Codeberg, and I've had a great experience with it. The catch is that you have to write your CI in Nix, though with LLMs, this is actually pretty easy. Most of my CI jobs are just bash scripts with some Nix wiring on top.[1] It also means you can reproduce all your CI jobs locally without changing any code.

[0] https://nix-ci.com

[1] https://codeberg.org/mtlynch/little-moments/src/commit/d9856... - for example

NixCI

Locally-reproducible zero-config hosted continuous integration for Nix