I’ve noticed a rise in people sharing links to YouTube, Instagram, Twitter, TikTok, and reddit that include tracking parameters in the URL.
It might largely be harmless for now, but it’s not good to let companies build a web of links between users of this site, and to link the usernames of users on this site to their off-site accounts, which may include sensitive info.
Site | URL Part | Appearance in URL | Filtration technique |
---|---|---|---|
YouTube | Query | ?si=* | Remove query string |
Instagram | Query | ?igshid=* | Remove query string |
Twitter | Query | ?t= | Remove query string |
TikTok | Subdomain and path | (vm/vt).tiktok.com/(random_string) | Block |
Reddit | Path | /(sub_name)/s/(random_string) | Block |
This site should only allow canonical links to the content to limit the information exposed.
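
As a rough illustration, the rules in the table could be applied with a small filter like the one below. This is only a sketch under a few assumptions: the function name `filter_link` and the host lists are made up for the example, and pairing `igshid` with Instagram and `t` with Twitter follows the order of sites in the opening sentence. It also drops only the listed tracking parameters rather than the whole query string, since a full strip would also remove the `v=` video ID from ordinary YouTube watch links.

```python
import re
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Tracking parameters to drop, keyed by host without "www.".
# The host list is an assumption made for this sketch.
TRACKING_PARAMS = {
    "youtu.be": {"si"},
    "youtube.com": {"si"},
    "instagram.com": {"igshid"},
    "twitter.com": {"t"},
}

# Short share links that only resolve through a tracked redirect.
TIKTOK_SHORT_HOSTS = {"vm.tiktok.com", "vt.tiktok.com"}
REDDIT_SHARE_PATH = re.compile(r"^/r/[^/]+/s/[^/]+/?$")


def filter_link(url: str) -> Optional[str]:
    """Return a cleaned URL, or None if the link should be blocked."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    bare_host = host[4:] if host.startswith("www.") else host

    # "Block" rows: the link itself is the tracker, so nothing can be salvaged.
    if host in TIKTOK_SHORT_HOSTS:
        return None
    is_reddit = bare_host == "reddit.com" or bare_host.endswith(".reddit.com")
    if is_reddit and REDDIT_SHARE_PATH.match(parts.path):
        return None

    # "Remove query string" rows: drop the tracking parameters, keep the rest.
    tracked = TRACKING_PARAMS.get(bare_host)
    if tracked:
        kept = [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in tracked
        ]
        parts = parts._replace(query=urlencode(kept))
    return urlunsplit(parts)
```

A blocked link comes back as `None`, so the site could reject the post or ask the user to paste the full link instead.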
There are often many ways to represent a webpage link in a URL format. For example, a random reddit post has several forms of links, even without any tracking:
https://www.reddit.com/r/me_irl/comments/18xheeg/me_irl/
https://redd.it/18xheeg
Both go to the same reddit post. However, if I were to share this link from the new reddit redesign or reddit mobile, it would look something like https://www.reddit.com/r/me_irl/s/stxMlEtK5H (not a real link). If you press on that, it might go to the more expanded form https://www.reddit.com/r/me_irl/comments/18xheeg/me_irl?share_id=5168327, but it will have a share_id parameter. Both clicking the /s/stxMlEtK5H link and landing on the page with ?share_id=5168327 will register on reddit’s servers as some user following some other user’s link, and of course they know who both of these users are. They can then correlate this and form a graph (a structure that represents a network) that links these users because they interacted by sharing this link, even though they might have shared it on a second medium like WhatsApp or Hexbear and never interacted directly on reddit itself.
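
To make the correlation step concrete, here is a toy sketch of how share events could be joined into such a graph. Everything in it is hypothetical: the record layout, the user names, and the function `build_share_graph` are invented for illustration and say nothing about how reddit's systems actually work.

```python
from collections import defaultdict

# Hypothetical server-side records: who generated each share ID...
shares_created = {"stxMlEtK5H": "alice"}
# ...and which logged-in users later opened a link carrying that ID.
share_clicks = [("stxMlEtK5H", "bob"), ("stxMlEtK5H", "carol")]


def build_share_graph(created, clicks):
    """Map each sharer to the set of users who followed their links."""
    graph = defaultdict(set)
    for share_id, clicker in clicks:
        sharer = created.get(share_id)
        # Skip unknown share IDs and people opening their own link.
        if sharer is not None and sharer != clicker:
            graph[sharer].add(clicker)
    return dict(graph)


graph = build_share_graph(shares_created, share_clicks)
print(graph)
```

In this toy data, alice ends up connected to bob and carol even though none of them ever interacted on the site where the link was generated.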
Canonical links are just the most normal links to the content, without the ?share_id stuff and without pointless random letters. When Google finds reddit pages to show in its results, it only shows the full form, which is https://www.reddit.com/r/me_irl/comments/18xheeg/me_irl/. This is the canonical link form for reddit.
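
For reddit specifically, rewriting an expanded link into the canonical form could look roughly like this. It is a minimal sketch: the helper name `canonical_reddit_link` is made up, and it assumes post links always follow the /r/(sub_name)/comments/(post_id)/(slug) layout shown above. Short links such as redd.it or /s/ ones cannot be expanded without asking reddit, so the sketch just returns `None` for them.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Full post path: /r/<sub_name>/comments/<post_id>/<slug>, trailing slash optional.
POST_PATH = re.compile(r"^/r/([^/]+)/comments/([^/]+)/([^/]+)/?$")


def canonical_reddit_link(url: str) -> Optional[str]:
    """Return the canonical post URL, or None if it cannot be built offline."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in {"reddit.com", "www.reddit.com"}:
        return None
    match = POST_PATH.match(parts.path)
    if match is None:
        return None
    sub_name, post_id, slug = match.groups()
    # Rebuilding from the path drops the query string (e.g. share_id) and fragment.
    return f"https://www.reddit.com/r/{sub_name}/comments/{post_id}/{slug}/"
```

Fed the share_id link from the example above, this gives back the same full form that Google shows.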
gotcha i see how this is a privacy issue now, thank you!