gcheliotis@lemmy.world to Technology@lemmy.world • Someone Made a Dataset of One Million Bluesky Posts for 'Machine Learning Research' • English
6 days ago
The real question here is why the researcher “librarian” didn’t even attempt to anonymize the dataset before making it available. Full anonymization isn’t a trivial task, but at least removing unique identifiers or replacing them with randomly generated ones would be good practice.
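For what it's worth, swapping identifiers for random ones is only a few lines. Below is a rough sketch, not what the dataset author did or should have run: the `author` field name, the `did:example:` values, and the `pseudonymize` helper are all made up for illustration, and the real dataset's schema may look different.

```python
import secrets


def pseudonymize(records, id_field="author"):
    """Return copies of records with id_field replaced by random tokens.

    The same original identifier always maps to the same token, so posts
    by one account stay linked to each other without exposing the account.
    """
    mapping = {}  # original id -> random token; throw this away afterwards
    cleaned_records = []
    for record in records:
        original = record[id_field]
        if original not in mapping:
            # 8 random bytes -> 16 hex characters, unrelated to the original id
            mapping[original] = secrets.token_hex(8)
        cleaned = dict(record)  # shallow copy so the input is left untouched
        cleaned[id_field] = mapping[original]
        cleaned_records.append(cleaned)
    return cleaned_records


# Hypothetical sample data; field names and identifiers are assumptions.
posts = [
    {"author": "did:example:alice", "text": "hello"},
    {"author": "did:example:bob", "text": "hi"},
    {"author": "did:example:alice", "text": "again"},
]
anonymized = pseudonymize(posts)
```

This is pseudonymization rather than full anonymization: the post text itself can still identify someone, which is the non-trivial part mentioned above.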
It is odd that nobody could tell this wouldn’t be for you, @SHOW_ME_YOUR_ASSHOLE@lemm.ee