Every political thread is chock full of people being angry and unreasonable. I did some data mining and found that most of the hate comes from a very small percentage of the community, and that the rest of the community is very consistent in downvoting them.

The problem is that even with human moderators enforcing a series of rules, most of those people are still in the comments making things miserable. So I made a bot to do it instead.

!santabot@slrpnk.net is a bot that uses an algorithm similar to PageRank to analyze the Lemmy community, and preemptively bans the roughly 1-2% of posters who consistently get a negative reaction. Take a look at an example of the early results. See how nice that is? It’s just people talking, and when they disagree, they say things like “clearly that part is wrong” and “your additions are good information though.”
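
For a feel of what "an algorithm similar to PageRank" can look like, here is a minimal sketch. It is not the bot's actual code: the function name `trust_ranks`, the `damping` and `iterations` parameters, the toy vote data, and the "negative score" cutoff are all assumptions made up for illustration. The idea it shows is only the general one: a vote counts for more when it comes from someone the community already rates well.

```python
def trust_ranks(votes, users, damping=0.85, iterations=50):
    """Toy PageRank-style ranking over signed votes (illustrative only).

    votes: list of (voter, author, value) with value +1 or -1.
    Returns a dict mapping each user to a signed score.
    """
    n = len(users)
    # Everyone starts with equal voting weight.
    weight = {u: 1.0 / n for u in users}
    score = {u: 0.0 for u in users}
    for _ in range(iterations):
        # Each vote counts in proportion to the voter's current weight.
        score = {u: 0.0 for u in users}
        for voter, author, value in votes:
            score[author] += weight[voter] * value
        # Only positive scores earn extra voting weight next round.
        positive = {u: max(s, 0.0) for u, s in score.items()}
        total = sum(positive.values())
        if total > 0:
            share = {u: positive[u] / total for u in users}
        else:
            share = {u: 1.0 / n for u in users}
        # Damping keeps a small baseline weight for every user.
        weight = {u: (1 - damping) / n + damping * share[u] for u in users}
    return score


# Hypothetical data: three users who mostly upvote each other, one who gets downvoted.
users = ["alice", "bob", "carol", "dave"]
votes = [
    ("alice", "bob", 1), ("bob", "alice", 1), ("carol", "alice", 1),
    ("alice", "carol", 1), ("bob", "carol", 1),
    ("alice", "dave", -1), ("bob", "dave", -1), ("carol", "dave", -1),
    ("dave", "alice", -1),
]
ranks = trust_ranks(votes, users)
# Assumed cutoff for this sketch: anyone with a negative score is a ban candidate.
candidates = [u for u in users if ranks[u] < 0]
```

In this toy run only `dave` ends up below zero, and his downvote of `alice` barely moves her score because his own voting weight stays at the baseline.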

It’s too early to tell how well it will work on a larger scale, but I’m hopeful. So, welcome to my experiment. Let’s talk politics without all the abusive people coming into the picture. Please come in and test whether this thing can work in the long run.

Pleasant Politics

!pleasantpolitics@slrpnk.net

  • auk@slrpnk.net (OP) · 12 upvotes · 6 months ago

    The code for the bot is open source. It’s not an AI model. It’s based on a classical technique for analyzing networks of relative trust and turning them into a master list of community trust, combined with a lot of studying its output and tweaking parameters. The documentation is sparse, but if someone is skilled in these things they can probably take a few hours to study it and its conclusions and see what’s going on.

    If you’re interested in looking at it for real, I can write some better documentation for the algorithm parts, which will probably be necessary to make sense of it beyond the surface level.

    • driving_crooner · 3 upvotes · 6 months ago

      Thank you. I’m personally more interested in the statistics used in the parameter search, but given that it’s Python, I’m checking it out to see what I can learn.

      • auk@slrpnk.net (OP) · 5 upvotes · 6 months ago

        Don’t let the Python fool you. It is not simple Python. I’ll try to add some comments later on to make it clearer what’s going on.

        For tuning parameters, it was complicated. Mostly, I did spot-checks on random users at different ranking levels, to check that the boundary for banning matched up pretty well with what I thought was the boundary of an acceptable level of jerkishness. That was combined with deeper dives into which comments had made what contributions to a user’s overall ranking, and with talking to existing moderators, looking over the banlists, and bringing up users where they thought the bot was getting it wrong. There were a lot of corner cases, and a lot of parameter changes to fix them. Sometimes it was increasing SMOOTHING_FACTOR to make users more equal in rank with each other, when we found some user who was banned because of one bad interaction with some high-rank person who downvoted them. Sometimes it was changing parameters to change how easy it is to overcome a few negatively-ranked postings by being generally positive with the rest of your postings. There are always users for whom the right answer is a matter for debate or opinion, but as long as the bot isn’t making decisions that are clearly wrong, I think it’s doing pretty well.
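
To make the SMOOTHING_FACTOR corner case concrete, here is a small sketch of one way a smoothing parameter could behave. This is an assumption, not the bot's real formula: the helper names `smoothed_weights` and `score`, the linear blend toward the average weight, and all the numbers are invented for illustration. The only thing taken from the description above is the effect: more smoothing makes voters more equal, so a single downvote from a high-rank user matters less.

```python
def smoothed_weights(raw_weights, smoothing_factor):
    """Blend each voter's weight toward the average weight (assumed formula).

    smoothing_factor = 0 leaves weights untouched; 1 makes all voters equal.
    """
    mean = sum(raw_weights.values()) / len(raw_weights)
    return {
        user: (1 - smoothing_factor) * w + smoothing_factor * mean
        for user, w in raw_weights.items()
    }


def score(votes_on_user, weights):
    """Sum of votes on one user, each scaled by the voter's weight."""
    return sum(weights[voter] * value for voter, value in votes_on_user)


# Hypothetical voters: one high-rank user and three ordinary ones.
raw = {"high_rank": 10.0, "a": 1.0, "b": 1.0, "c": 1.0}
# Three ordinary upvotes and one downvote from the high-rank user.
votes_on_user = [("a", 1), ("b", 1), ("c", 1), ("high_rank", -1)]

unsmoothed = score(votes_on_user, smoothed_weights(raw, 0.0))  # negative
smoothed = score(votes_on_user, smoothed_weights(raw, 0.8))    # positive
```

With no smoothing, the one heavy downvote outweighs three ordinary upvotes; with heavier smoothing, the same votes net out positive.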

        You can look over some places where I talked with people about the bot’s opinion of their user, in this post and this post. I don’t want to publicly do those breakdowns for people who haven’t agreed to have it done to them, but that might give you an idea of how the tuning went. What I did to tune the parameters was the same type of thing as I showed in those comments, just a whole lot more of it.

      • auk@slrpnk.net (OP) · 5 upvotes · 6 months ago

        I added an explanation of the details of how it works to the source file that implements the main rank algorithm. The math behind it is not simple, but it’s also not rocket science, if you have some data science abilities and want to check it out.