• 4 Posts
  • 152 Comments
Joined 1 year ago
Cake day: June 20th, 2023

  • I think if your photos are on any kind of public website, AI idiots will scrape them regardless of the provider. So at minimum you have to password protect them. That said, I’d feel ok using this:

    https://www.hetzner.com/storage/storage-share/

    It basically runs Nextcloud. You’d configure it so that only logged-in users can view the pictures, and give accounts to your friends and family. I don’t think Hetzner is likely to train AI with it, though you could check through their privacy policy. Part of the issue with e.g. Google Drive is that everyone wants stuff for free, so Google recovers some of its costs by advertising, AI training, etc. Hetzner charges enough to actually make a profit, while still being IMHO affordable at the level we’re discussing. That means they don’t have to do crap with advertising etc. I have 5TB in their Storage Box product and am happy with it.

    If you want to be more hardcore, you could set up a dedicated server with an encrypted HDD, but now you have to deal with the hassles of self-hosting, including backups. It still wouldn’t be end-to-end encryption, which would require your users to run some kind of special client, or maybe use some awful JavaScript client.


  • It would help if you gave some numbers. How much data, within a factor of 1000 say? A few megabytes? A few gigabytes? A few terabytes? A few petabytes? The approach you need will change depending on the level. What is your budget?

    What bothers you about cloud storage? Are any of the photos edgy?

    Anyway it sounds to me like you would be fine with a decent web hosting plan and a basic photo gallery app.





  • Sure. Normally I think of visiting a site in a browser as navigating to that site on purpose. If Mozilla sells placement in the browser so that the browser navigates to that site automatically (unless you disable that), it’s invasive. That said I do remember Mozilla sets the default start page to something annoying and I had to reconfigure it to about:blank when I set up the system.


  • I think you are saying Mozilla sold tabs screen placement to Accuweather, which means Accuweather gets user IP addresses (and therefore approximate location among other things) when the user launches the browser. So I guess the answer to OP’s question is yes.

    And yes, opt-out is possible, but a pro-privacy approach would require opt-in.



  • If whatever they are doing has been working for stuff written in languages other than Rust, we have to ask what makes Rust special. Rust is a low level language, so its dependencies if anything should be simpler than most, with just a minimal shim between its runtime and the C world. Why does any production software have a version <= X constraint in any of its dependencies anyway? I can understand version >= X, but the other way implies that the APIs are unstable and you’re going to get tons of copies of stuff around. I remember seeing that in Ruby at a time when Python was relatively free of it, but now Python has it too. Microsoft at least understood in the 1990s that you can’t go around breaking stuff like that.

    No it’s not all C99. I’m using Calibre (written in Python), Pandoc (written in Haskell), GCC (written in C, C++, and Ada), and who knows what else. All of these are complex applications with many dependencies. Eclipse (written in Java) is also in Debian though I don’t use it. Bcachefs though is apparently just special.

    Joe Armstrong (inventor of Erlang) said of OOP, “you wanted a banana but what you got was a gorilla holding the banana, and the entire jungle”. Rust begins to sound like that too. It might not be inherent in the language, but it looks like the way the community thinks.

    I also still don’t understand why the Bcachefs userspace stuff is written in Rust. I can understand about the kernel part, but the concept of a low level language is manual resource management that a HLL handles for you automatically. Writing the userspace in a LLL seems like more pain for unclear gain. Are there intense performance or memory constraints or what?

    Actually I see now that the kernel part of Bcachefs is also considered unstable, so maybe the whole thing is not yet ready for production.


  • Talks about different developer styles, slightly interesting and not too long winded I guess, but not much about the actual situation.

    I think this is still not such a great look for Rust. I had expected interfacing Rust to C to present fewer problems than it seems to. I had hoped the Rust compiler could produce object code with almost no runtime dependencies, the way C compilers can. So integrating Rust code into the kernel should be fairly painless from the C side, if things were as one would hope.

    It does sound to me in the earlier post that there was some toxicity going on. Maybe it had something to do with the context being a DRM driver.

    I looked at a few Rust tutorials but they seemed to take forever to get to any interesting parts. I will keep looking.












  • How many ebooks are you talking about (millions)? Is it just a question of finding duplicated files? That’s easy with a shell script. For metadata, see if the books already have it since a lot do. After that, you can use fairly crude hacks as an initial pass at matching library records. There’s code like that around already, try some web searches, maybe code4lib (library related programming) if that is still around. I saw your earlier comment before you deleted it and it was perfectly fine.
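
For the duplicate-finding part, here is a rough Python sketch of the same idea as the shell script. It is only an illustration: the function names are made up for this example, and it uses SHA-256 where md5 would work just as well for this purpose. It groups files by size first so that only files that could possibly match get hashed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group regular files under root whose contents are byte-identical."""
    # Cheap first pass: files of different sizes cannot be duplicates.
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file() and not p.is_symlink():
            by_size[p.stat().st_size].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Second pass: hash only the files that share a size.
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates(Path("books"))` (with `books` standing in for wherever the collection lives) returns a list of groups, and it is up to you whether to delete the extras or replace them with links.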


  • If the files are literally duplicated (exact same bytes in the files, so matching md5sums) then maybe you could just delete the duplicates and maybe replace them with links.

    Automatically sorting books by category isn’t so easy. Is the metadata any good? Are there categories already? ISBNs? Even titles and authors? It starts to be kind of a project but you could possibly import MARC records (library metadata) which have some of that info in them, if you can match up the books to library records. I expect that the openlibrary.org API still works but I haven’t used it in ages.
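
As a starting point for matching books to library records, something like the sketch below might help. It is a crude first pass, not a real record-matching system: `is_valid_isbn13` applies the standard ISBN-13 check digit rule to weed out junk identifiers, and `match_key` (a name invented for this example) squashes author and title into a normalized string so that small differences in punctuation, case, and accents don't block a match.

```python
import re
import unicodedata


def is_valid_isbn13(isbn: str) -> bool:
    """Check the ISBN-13 check digit; hyphens and spaces are ignored."""
    digits = re.sub(r"[-\s]", "", isbn)
    if not re.fullmatch(r"[0-9]{13}", digits):
        return False
    # Digits are weighted 1, 3, 1, 3, ... and the total must be divisible by 10.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def match_key(title: str, author: str) -> str:
    """Build a crude comparison key from author and title."""
    text = unicodedata.normalize("NFKD", f"{author} {title}")
    # Drop combining marks so accented letters compare equal to plain ones.
    text = "".join(c for c in text if not unicodedata.combining(c))
    # Lowercase and collapse every run of non-alphanumerics into one space.
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return " ".join(text.split())
```

Books with a valid ISBN can be matched on that directly. For the rest, comparing `match_key` values between your files and the library records gives a rough list of candidates that you would still want to check by hand.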