45 Followers · 10 Following · 40 Posts
Software developer, sysadmin and all-round IT guy for the Open University in Milton Keynes.
github: https://github.com/fisharebest
webtrees: https://webtrees.net
scuba: https://www.mksac.co.uk
rock climbing: http://www.mkmountaineering.org

I'm moving many of my projects from github to codeberg.

One of them has 20 years of history, 10K users and a git repository approaching 1GB. A lot of this comes from:

* third-party libraries, from the days before composer
* translation files, that could easily live in a separate repository

Should I take this opportunity to purge the git history and reduce storage/network/environmental costs for everyone?

No - historical integrity is too important
0%
Yes - nobody cares about ancient versions of code
100%
Poll ended.
@maccath - None of the above. In my experience, you need to be careful with ceramic pans to avoid scratching, and I'm wary of the chemicals in non-stick surfaces. I have the Le Creuset 3-ply stainless (I got them in the sales), but if that is out of your budget, then the JL classic stainless steel ones would be my next choice.

Several of my sites are now being hit by Microsoft's BingBot (valid, I checked the IPs).

It is abusing my "/search?query=..." URL to search for everything from Chinese motorcycle parts to train times.

Is Microsoft really running its searches through every search-form it can find on the internet?

/Sigh

@Daddaniele -
On my employer's website, they only fetch the home page.

On a genealogy site, they are following links in a calendar widget to fetch every day in history.

Other sites have different patterns.

The IPs are from every country. AFAICT, mostly residential.

Every website I control is being overwhelmed by robots.

It's all the same pattern: one request per IP address and random user-agent strings.

One site had 1.6 million unique IPs in the last week.

What kind of botnet is this?

Current fix:

If a request has no cookies and a chrome/firefox/etc. user-agent, send a 4xx response with a cookie and a body containing a meta-refresh tag.

Browsers see the HTML and reload with a cookie.
Robots see the 4xx.

It works - but the robots keep coming...
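The challenge described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual code: the cookie name, the browser-token list, and the `handle_request` helper are all hypothetical, and a real deployment would live in the web server or application framework.

```python
# Sketch of the cookie + meta-refresh bot challenge described above.
# All names here (CHALLENGE_COOKIE, looks_like_browser, handle_request)
# are illustrative assumptions, not from any real framework.

BROWSER_TOKENS = ("chrome", "firefox", "safari", "edge")
CHALLENGE_COOKIE = "humancheck=1"  # hypothetical cookie name

# A real browser renders this body and immediately reloads the page,
# this time sending the cookie set below.
CHALLENGE_BODY = (
    "<!DOCTYPE html><html><head>"
    '<meta http-equiv="refresh" content="0">'
    "</head><body>Reloading&hellip;</body></html>"
)

def looks_like_browser(user_agent: str) -> bool:
    """Does the user-agent string claim to be a mainstream browser?"""
    ua = user_agent.lower()
    return any(token in ua for token in BROWSER_TOKENS)

def handle_request(headers: dict) -> tuple[int, dict, str]:
    """Return (status, response_headers, body) for one request."""
    has_cookie = CHALLENGE_COOKIE in headers.get("Cookie", "")
    if not has_cookie and looks_like_browser(headers.get("User-Agent", "")):
        # Browsers store the cookie and follow the meta-refresh;
        # simple bots just see the 4xx and (ideally) give up.
        return (
            403,
            {"Set-Cookie": CHALLENGE_COOKIE, "Content-Type": "text/html"},
            CHALLENGE_BODY,
        )
    return 200, {"Content-Type": "text/html"}, "<p>Real content</p>"
```

One trade-off with this approach: any legitimate browser with cookies disabled gets stuck in the reload, and well-behaved crawlers that honestly identify themselves (no browser token in the user-agent) pass straight through.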

@ramsey - if the timestamps are generated by MySQL, then aren't they set to the start of the transaction, not to the actual time? So overlapping transactions would explain this...?

@jdecool - I have about 2000 errors in my baseline for about 250,000 lines of code. (Down from about 8000 when I started using phpstan.)

AFAICT, most are caused by two things:

1) I have an error-handler so functions like file_get_contents() will throw exceptions instead of returning false.

2) Every SQL query (Laravel QueryBuilder) gives a return type of mixed, and I don't see a way to tell it what columns/types are returned.

@paulshryock Using `FooInterface` frees up the name `Foo` for your implementation. If the interface is `Foo`, then I need to think of something else to call my implementation. One of the two hardest problems in computer science and all that... 😀
@paulshryock IMHO, interfaces should always have names ending in "Interface". No preference as to implementation names.
#webtrees 2.1.22 (PHP 7.4-8.4) and 2.2.1 (PHP 8.3-8.4) are now available for download. See https://webtrees.net/blog for the changelog.

webtrees is a web application that allows you to publish your genealogy online, collaborate with family members and take control of your data.