• 0 Posts
  • 17 Comments
Joined 1 year ago
Cake day: June 17th, 2023


  • This was my core point. I don’t consider it “enshittification” when a business raises prices or gates features because those features directly increase its costs. Making stickers or custom emojis paid is enshittification, since those cost Discord nothing to provide; but when a feature costs the business actual money to provide, does everyone just expect them to eat that cost forever, in many cases for no revenue at all from those users?

    Calling out businesses for not giving away, for free, things that cost them money just doesn’t make sense to me. Why is it expected that Discord pay to store all your large files? A lot of “freemium” services like Gmail recoup some of that money by mining your email for data to sell to advertisers, or by eating the cost to lock you into an ecosystem where you’ll spend money. Storing files on Discord is neither of those things.

    Don’t get me wrong, a lot of services are enshittifying, making their products worse so you spend more money with them, but adjusting your quotas and pricing to reflect your real-world cost of business is not that. Framing it as though you’re entitled to free compute and resources from companies that don’t owe you anything comes off as just that: entitled. The cloud isn’t free. If you want to use a service, you should pay for it if you can.

  • I’m aware the model doesn’t literally contain the training data, but for many models and applications the training set is by nature small enough, and the application restrictive enough, that it’s trivial to get snippets of near-verbatim training data back out.

    One of the primary models I work on involves code generation, and in that application we’ve actually observed the model outputting verbatim code from its training data, even though it was trained on a fair amount of data. This has spurred concerns about license violations for the open source code it was trained on.
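    The kind of verbatim-output observation described here can be approximated with a simple n-gram overlap test between model output and the training corpus. A minimal, brute-force sketch with invented stand-in strings (real tooling would use suffix arrays or hashing at scale):

```python
# Flag character n-grams that a model's output shares verbatim with the
# training corpus. Long shared runs suggest memorized, possibly
# license-encumbered code. The corpus and output below are made up.

def shared_ngrams(generated: str, corpus: str, n: int = 20) -> set:
    """Return the character n-grams of `generated` that occur verbatim in `corpus`."""
    grams = {generated[i:i + n] for i in range(len(generated) - n + 1)}
    return {g for g in grams if g in corpus}

# Invented stand-ins for training data and a model completion:
corpus = "def crc16(data: bytes) -> int:\n    reg = 0xFFFF\n    for byte in data:"
suspect = "sure, here you go:\ndef crc16(data: bytes) -> int:\n    reg = 0xFFFF"

print(bool(shared_ngrams(suspect, corpus)))  # True: a 20-char run matches verbatim
```

    A non-empty result doesn’t prove copying on its own (short idioms repeat everywhere), which is why the minimum run length `n` matters.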

    There’s also the concept of less-verbatim but still “copied” style. Making a movie in the style of Wes Anderson is legitimate artistic expression, but what about a graphic designer making a logo in the “style of McDonald’s”? The law is intentionally murky in this department; in the US, even certain colors are trademarked for specific product categories. There’s no clear line here, and LLMs are well positioned to challenge what’s already on the books. IMO this is not an AI problem, it’s a legal one that AI just happens to exacerbate.


  • That’s not what’s happening, though: they’re using that data to train their AI models, which pretty irreparably embeds identifiable aspects of it into the model. The only way to remove that data from the model would be an incredibly costly retrain. It’s not embedded verbatim anywhere, but it’s almost as if you took a photograph of a book: the data is stored differently, but if you read it (i.e. make the right prompts, or enough of them), there’s the potential to get parts of the original data back.
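
    The “right prompts, or enough of them” idea can be illustrated with a toy model that has memorized a single sentence: repeatedly feeding it sliding prefixes stitches the original back together. Everything here is invented; a real extraction attempt would target an actual model API, not a dictionary-style lookup.

```python
# Toy illustration of extraction-by-prompting: a fake "model" that has
# memorized one training sentence. Greedily re-prompting with the tail of
# the text so far recovers the memorized string piece by piece.

MEMORIZED = "The quick brown fox jumps over the lazy dog."  # stand-in training data

def fake_model(prompt: str, k: int = 8) -> str:
    """Pretend completion: if the prompt matches memorized text, continue it."""
    idx = MEMORIZED.find(prompt)
    if idx == -1:
        return ""
    start = idx + len(prompt)
    return MEMORIZED[start:start + k]

def extract(seed: str) -> str:
    """Grow a seed by repeatedly querying the model until it stops producing text."""
    text = seed
    while True:
        nxt = fake_model(text[-12:])  # only the tail fits in the toy "context window"
        if not nxt:
            return text
        text += nxt

print(extract("The quick"))  # recovers the full memorized sentence
```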