• Kerb@discuss.tchncs.de
        arrow-up
        71
        ·
        1 year ago

        Unicode is great, isn't it?
        It supports almost all writing systems.

        For example, here is the transcript of one of the complaints about Ea-nasir's shitty copper:

        𒀀 𒈾 𒂍 𒀀 𒈾 𒍢 𒅕

        𒀀 𒈾 𒂍 𒀀 𒈾 𒍢 𒅕
        𒆠 𒉈 𒈠
        𒌝 𒈠 𒈾 𒀭 𒉌 𒈠
        𒀀 𒉡 𒌑 𒈠 𒋫 𒀠 𒇷 𒆪
        𒆠 𒀀 𒄠 𒋫 𒀝 𒁉 𒄠
        𒌝 𒈠 𒀜 𒋫 𒀀 𒈠
        𒄖 𒁀 𒊑 𒁕 𒄠 𒆪 𒁴
        𒀀 𒈾 𒄀 𒅖 𒀭 𒂗𒍪 𒀀 𒈾 𒀜 𒁲 𒅔
        𒋫 𒀠 𒇷 𒅅 𒈠 𒋫 𒀝 𒁉 𒀀 𒄠
        𒌑 𒆷 𒋼 𒁍 𒍑
        𒄖 𒁀 𒊑 𒆷 𒁕 𒄠 𒆪 𒁴
        𒀀 𒈾 𒈠 𒅈 𒅆 𒅁 𒊑 𒅀
        𒋫 𒀸 𒆪 𒌦 𒈠 𒌝 𒈠 𒀜 𒋫 𒈠
        𒋳 𒈠 𒋼 𒇷 𒆠 𒀀 𒇷 𒆠 𒀀
        𒋳 𒈠 [𒆷] 𒋼 𒇷 𒆠 𒀀 𒀜 𒆷 𒅗
        𒅀 𒋾 𒀀 𒈾 𒆠 𒈠 𒈠 𒀭 𒉌 𒅎
        𒌅 𒅆 𒅎 𒈠 𒉌 𒈠
        𒆠 𒀀 𒄠 𒋼 𒈨 𒊭 𒀭 𒉌
        𒈠 𒊑 𒀀 𒉿 𒇷 𒀀 𒈾 𒆠 𒈠 𒅗 𒋾
        𒀀 𒈾 𒆠 𒋛 𒅀 𒈠 𒄩 𒊑 𒅎
        𒀸 𒁍 𒊏 𒄠 𒈠
        𒌅 𒈨 𒄿 𒊭 𒄠 𒈠
        𒄿 𒈾 𒂵 𒂵 𒅈 𒈾 𒀝 𒊑 𒅎
        𒅖 𒋾 𒅖 𒋗 𒅇 𒅆 𒉌 𒋗
        𒊑 𒆪 𒋢 𒉡 𒌅 𒋼 𒅕 𒊏 𒄠
        𒄿 𒈾 𒀀 𒇷 𒅅 𒋼 𒂖 𒈬 𒌦
        𒈠 𒀭 𒉡 𒌝 𒊭 𒆠 𒀀 𒄠
        𒄿 𒁍 𒊭 𒀭 𒉌 𒄿 𒈠
        𒀜 𒋫 𒈠 𒅈 𒅆 𒅁 𒊑 𒅀 𒌅 𒈨 𒂊 𒅖
        𒀀 𒈾 𒈠 𒆷 𒅗 𒊍 𒉿 𒅎
        𒊭 𒄿 𒈾 𒂵 𒋾 𒅀 𒌅 𒊺 𒍪 𒌑
        𒆠 𒀀 𒄠 𒋫 𒁕 𒁍 𒌒
        𒅇 𒀸 𒋳 𒄿 𒅗
        𒀀 𒈾 𒂍 𒃲 𒇷
        𒌋 𒐍 𒄘 𒍏 𒀀 𒈾 𒆪 𒀜 𒁲 𒅔
        𒅇 𒋗 𒈪 𒀀 𒁍 𒌝
        𒌋 𒐍 𒄘 𒍏 𒄿 𒁲 𒅔
        𒂊 𒍣 𒅁 𒊭 𒀀 𒈾 𒂍 𒀭 𒌓
        𒆪 𒉡 𒊌 𒅗 𒄠 𒉌 𒍣 𒁍
        𒀀 𒈾 𒉿 𒊑 𒅎 𒊭 𒀀 𒋾
        𒆠 𒄿 𒋼 𒁍 𒊭 𒀭 𒉌
        𒆠 𒋛 𒄿 𒈾 𒂵 𒂵 𒅈 𒈾 𒀝 𒊑
        𒌅 𒊌 𒋾 𒅋
        𒆠 𒋛 𒀀 𒈾 𒂵 𒋾 𒅀
        𒋗 𒇻 𒈠 𒄠 𒂊 𒇷 𒅗 𒄿 𒋗
        𒆠 𒈠 𒀭 𒉌 𒆠 𒀀 𒄠
        𒉿 𒊑 𒀀 𒄠 𒆷 𒁺 𒈬 𒂵 𒄠
        𒆷 𒀀 𒈠 𒄩 𒊒 𒅗 𒋫 𒆷 𒈠 𒀜
        𒄿 𒈾 𒆠 𒊓 𒇷 𒅀
        𒅖 𒋾 𒈾 𒀀 𒌑 𒈾 𒍝 𒀝 𒈠
        𒂊 𒇷 𒆠
        𒅇 𒀀 𒈾 𒊭 𒌅 𒈨 𒄿 𒊭 𒀭 𒉌
        𒈾 𒋛 𒄴 𒋫 𒄠 𒂊 𒁍 𒍑 𒅗

        in the original cuneiform as a copypasta

      • sarmale@lemmy.zip
        arrow-up
        8
        ·
        1 year ago

        How many Unicode characters could you add to the standard until it becomes unreliable?

        • Kerb@discuss.tchncs.de
          arrow-up
          27
          ·
          edit-2
          1 year ago

          Apparently Unicode supports about 1.1 million characters, and we currently only use 96,382 as of version 4.0.

          EDIT: I just read that Unicode 4.0 is very outdated; the current version is Unicode 15.1 with 149,878 characters.
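To put those numbers side by side, here is a rough sketch of the arithmetic. It assumes the usual layout of 17 planes of 2^16 code points each, and takes the 149,878 figure from the comment above as given; the variable names are only illustrative.

```python
import sys

# Assumption: the code space is 17 planes of 2**16 code points each.
PLANES = 17
CODE_POINTS_PER_PLANE = 2 ** 16

total_code_points = PLANES * CODE_POINTS_PER_PLANE  # the "about 1.1 million"
highest_code_point = total_code_points - 1          # numbering starts at 0

# Character count for Unicode 15.1 as quoted in the comment above.
assigned_15_1 = 149_878
share_used = assigned_15_1 / total_code_points

print(f"{total_code_points:,} code points, highest is U+{highest_code_point:X}")
print(f"roughly {share_used:.1%} assigned as of Unicode 15.1")

# Python 3 exposes the same upper bound as sys.maxunicode.
print(highest_code_point == sys.maxunicode)
```

So even after decades of additions, most of the code space is still unassigned.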

        • A Unicode character can be up to 4 bytes, so 2^32 or 4,294,967,296 potential unique characters. And it’d be easy enough to adjust the standard to allow for an extra byte(s) if necessary – it’s been done before.

          • Turun@feddit.de
            arrow-up
            4
            ·
            edit-2
            1 year ago

            This is incorrect. While a character (actually a code point) takes 4 bytes in UTF-32 and up to 4 bytes in UTF-8, the Unicode standard is limited to 17*2^16 code points. (Edit: apparently because that is the limit of UTF-16. 4-byte UTF-8 can encode 2^21 code points, and the encoding scheme itself is not technically limited to four bytes: the original design allowed longer sequences, so in total it is able to encode 2^31 code points.)

            Unicode is the standard that says “the thing we call capital A is character number 65”, literally defining a mapping from numbers to concepts.
            UTF-8 and UTF-32 are ways to encode a list of numbers in a more (UTF-8) or less (UTF-32) efficient way.
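The split between "code point" and "encoding" can be seen directly in Python. This is only a small sketch: `encoded_sizes` is a hypothetical helper name, and the little-endian codec variants are used so that no byte order mark is counted.

```python
def encoded_sizes(ch: str) -> dict[str, int]:
    """Return how many bytes one character needs in each encoding."""
    # The "-le" codecs are chosen so no byte order mark is prepended.
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


# Unicode itself only assigns the number; ord() returns that code point.
print(ord("A"))

samples = [
    "A",            # U+0041, basic Latin
    "\u00e9",       # U+00E9, e with acute accent
    "\u20ac",       # U+20AC, euro sign
    chr(0x12000),   # U+12000, CUNEIFORM SIGN A (outside the first plane)
]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The code point stays the same whichever encoding is used; only the number of bytes changes. UTF-32 always spends 4 bytes, while UTF-8 uses between 1 and 4 depending on how large the number is.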