KICK TECH BROS OUT OF 196

    • huginn@feddit.it
      arrow-up
      2
      ·
      9 months ago

      The paper states that the graphs representing those relations are the result of training LLMs on a very small subset of unambiguous true and false statements.

      While these emergent properties may provide interesting avenues for model refinement and output inspection, that doesn’t change the fact that these weird little dictionaries aren’t doing anything truly unexpected. We are just learning the extra data associated with the training data.

      It’s not far removed from the primary complaint of “On the Dangers of Stochastic Parrots” (Bender, Gebru, et al.), which points out the ways that our biases are implicitly trained into LLMs because of uncontrolled and unexamined inputs. Except in this case, those biases are the linguistics of truth and lies in unambiguous boolean inputs.

      • Even_Adder@lemmy.dbzer0.com
        English
        arrow-up
        1
        ·
        9 months ago

        This may provide interesting avenues for model refinement that don’t involve the model spitting things out and being retrained by a “consciousness” telling it yes or no, or feeding it additional info.

        • huginn@feddit.it
          arrow-up
          1
          ·
          9 months ago

          Only if the “direction of truth” exists in the wild with unchecked training data.

          That clustering is a representation of the nature of the data fed to the model: all their training data was unambiguously true or false, so it’s not surprising that it clusters.