How will lemmy scale?

phoneymouse@lemmy.world · 1 year ago

How will lemmy scale?

marsara9@lemmy.world · 1 year ago

Lemmy is entirely open source, so you can see what their architecture looks like, etc… here: https://github.com/LemmyNet/lemmy.

Rate limits, as I understand them from the code, should only apply on a per-IP basis. So you should only be seeing rate limit errors if:

you’re behind a CGNAT and multiple people who use your ISP are using Lemmy
you’re sending A LOT of requests to your instance yourself
the admin of your instance has significantly lowered the rate limits (viewable here: /api/v3/site)
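
To illustrate why the CGNAT case matters, here is a minimal sketch of a per-IP limiter. This is not Lemmy’s actual code: the class name, the fixed-window strategy, and the numbers are all assumptions made for the example. The point is only that the limiter sees an address, not a person, so everyone behind the same public IP draws from one shared budget.

```python
import time


class PerIpRateLimiter:
    """Fixed-window limiter keyed by client IP (hypothetical, not Lemmy's implementation)."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows = {}  # ip -> (window_start, request_count)

    def allow(self, ip):
        now = self.clock()
        start, count = self._windows.get(ip, (now, 0))
        # Start a fresh window once the old one has expired.
        if now - start >= self.window_seconds:
            start, count = now, 0
        if count >= self.max_requests:
            self._windows[ip] = (start, count)
            return False  # a real server would answer with HTTP 429 here
        self._windows[ip] = (start, count + 1)
        return True


# Four requests from different people who share one CGNAT address:
limiter = PerIpRateLimiter(max_requests=3, window_seconds=60, clock=lambda: 0.0)
shared_ip = "100.64.0.1"  # example address from the CGNAT range
results = [limiter.allow(shared_ip) for _ in range(4)]
# results == [True, True, True, False]: the fourth request is rejected
# even if it came from a user who had sent nothing before.
```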

radix@lemm.ee · edit-2 1 year ago

I’m not an expert, but I thought the issue was generally that big instances like lemmy.world were getting overloaded on the server side, not that they were enforcing a manufactured rate limit on individual IPs.

Also, someone else mentioned that on the fediverse even simple things like an upvote are slower and require more work than on centralized platforms because they must be sent to all the instances that are indexing that user/community. As I understand, that’s inherent to the fediverse, a feature not a bug, designed for redundancy and resilience.

Again uninformed, but Lemmy seems like it should scale fine. Bigger instances will monetize, driving prospective users to smaller instances, and then rate limiting and server lag won’t be so bad anymore.

marsara9@lemmy.world · 1 year ago

> I’m not an expert, but I thought the issue was generally that big instances like lemmy.world were getting overloaded on the server side, not that they were enforcing a manufactured rate limit on individual IPs.

From what I can see it’s both. lemmy.world and others are getting overloaded, but there is also an inherent, built-in rate limit in the code itself. You can see what those limits are via /api/v3/site. In theory, if you’re actually getting rate-limited you should be seeing HTTP 429 responses from the server. If the server is just overloaded, you’ll get a 5xx response, the request will time out, or at best you’ll still get a response but after a significant delay (what most people are seeing).
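
If you want to tell the two cases apart on your own client, the distinction above can be sketched as a small classifier. The function name, the labels, and the 5-second "slow" threshold are my own assumptions for illustration; only the 429 versus 5xx split comes from the HTTP status codes themselves.

```python
def classify_response(status_code, elapsed_seconds, slow_threshold=5.0):
    """Label one request outcome. Pass status_code=None if the request timed out."""
    if status_code is None:
        return "timeout"        # no response at all: likely overload
    if status_code == 429:
        return "rate_limited"   # Too Many Requests: an enforced limit
    if 500 <= status_code <= 599:
        return "server_error"   # server-side failure: likely overload
    if elapsed_seconds > slow_threshold:
        return "slow"           # answered, but after a significant delay
    return "ok"


# A few hypothetical observations as (status, seconds) pairs:
observations = [(200, 0.3), (429, 0.1), (503, 2.0), (None, 30.0), (200, 12.0)]
labels = [classify_response(status, secs) for status, secs in observations]
# labels == ["ok", "rate_limited", "server_error", "timeout", "slow"]
```

If most of your failures land in the last three buckets rather than "rate_limited", the instance is probably overloaded rather than throttling you.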

> Also, someone else mentioned that on the fediverse even simple things like an upvote are slower and require more work than on centralized platforms because they must be sent to all the instances that are indexing that user/community. As I understand, that’s inherent to the fediverse, a feature not a bug, designed for redundancy and resilience.

I don’t want to comment on this too much as I’m not an expert here, but here’s how federation / ActivityPub works from what I understand looking at the code:

Whenever you take any action (or activity), your browser first sends that message to your instance. If your instance owns the community, the message is then propagated out to EVERY linked instance listed at /instances (or /api/v3/federated_instances). If your instance doesn’t own the community, the message is forwarded to the instance that does, and that instance sends it out to EVERYONE on its federated instances list. As you can see, this creates A LOT of network traffic.
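
The flow described above can be sketched as a toy model that just counts server-to-server deliveries for one activity. Everything here is hypothetical (the function name, the host names, the dictionary layout), and it follows my description rather than the real Lemmy code, which may well skip instances that have no subscribers.

```python
def deliveries_for_activity(origin, community_host, federated):
    """Return the (sender, receiver) hops for a single activity.

    origin: instance the user is logged in to
    community_host: instance that owns the community
    federated: dict mapping an instance to the list of instances it is linked with
    """
    hops = []
    # If the user's instance doesn't own the community, forward to the owner first.
    if origin != community_host:
        hops.append((origin, community_host))
    # The owning instance then sends the activity to every linked instance.
    for peer in federated[community_host]:
        hops.append((community_host, peer))
    return hops


federated = {"big.example": ["a.example", "b.example", "c.example"]}

# An upvote from a user on a.example in a community hosted on big.example:
remote_vote = deliveries_for_activity("a.example", "big.example", federated)
# len(remote_vote) == 4: one forward plus three fan-out deliveries

# The same upvote from a user who is on big.example itself:
local_vote = deliveries_for_activity("big.example", "big.example", federated)
# len(local_vote) == 3: fan-out only
```

In this model a single upvote costs roughly one delivery per linked instance, so traffic per activity grows with the length of the owning instance’s federation list.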

This poses an interesting problem… the number of ActivityPub messages goes up as the number of instances increases. But at the same time, as more and more users join a single instance, that instance has to send more and more traffic to individual users’ browsers as they view and respond to posts. So the problem here is finding a good balance. And to top it off, the default behavior of most users is going to be to join the largest instances, making those instances incur more and more traffic just to serve content.

> Again uninformed, but Lemmy seems like it should scale fine. Bigger instances will monetize, driving prospective users to smaller instances, and then rate limiting and server lag won’t be so bad anymore.

Will it though? How would an individual instance monetize? They would have to rely on donations. If an instance tries to add ads, users will leave for an instance that doesn’t, so it won’t get any income. They could charge a subscription fee, but again users would just leave and the admins would get nothing.

The ideal configuration of the fediverse, as I see it, would have two types of servers: 1) content servers that only host communities but don’t have any real number of users, and 2) user servers that have no communities but most of the users. This way the number of API requests between instances is rather limited. When you end up with a server that has both most of the content and most of the userbase, the workload of that server appears to grow exponentially instead of linearly as the number of new instances rises.