

It’s not just that, it’s also the fact they scored the responses based on user feedback, and users tend to give better feedback for more confident, even if wrong, responses.


If you’re putting in that much work, please submit those edits to MusicBrainz! We need all the help we can get 😭


As a MusicBrainz editor: don’t depend entirely on Picard and MusicBrainz for correct tagging either, because shit isn’t as well curated as you think.


I don’t agree with a total ban, but I think the writers of the article, in downplaying the harmful content on YouTube, have forgotten the multiple times YouTube has gotten in trouble with advertisers over shit like “Elsagate,” where videos targeted at children showed mutilation and worse of Disney characters.
There needed to be some kind of regulation, but an outright ban is a bit much.
This feels like trying to drive a tack with a sledgehammer.


I don’t think that’s accurate, because they asked DLsite before them to restrict content based on American law. DLsite tried to remove access from outside Japan to the content Visa was complaining about, but Visa still told them to remove it entirely (I guess because people were using VPNs), so they had to drop the ability to pay with Visa and Mastercard altogether.


“Unlawful” based on what? American law?
These are global payment companies; they can’t just have a blanket “we don’t allow payment for illegal content” policy, because what counts as illegal varies by country (and even by state).
What an absolutely nothing statement.


Well, Stable Diffusion 3 supposedly removed all porn from its training data on purpose and negatively trained the model on porn, and that apparently destroyed the model’s ability to generate proper anatomy.
Regardless, image generation models need some porn in training to at least know what porn is, so that they also know what it is not.
It’s part of a process called regularization: keeping a model from over-fitting to any particular slice of its data.
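To make the over-fitting idea concrete, here’s a toy NumPy sketch of regularization in the classic curve-fitting sense (this is an illustration of the general concept only; it’s not how diffusion models are trained, and all the numbers are made up for the demo). An unconstrained model with as many parameters as data points chases every bit of noise, while an L2 penalty pulls it back toward the real trend:

```python
import numpy as np

rng = np.random.default_rng(0)

# 16 noisy samples of a simple underlying linear relationship y = x
x = np.linspace(-1.0, 1.0, 16)
y = x + 0.3 * rng.standard_normal(16)

# A deliberately over-flexible degree-15 polynomial basis (16 parameters)
X = np.vander(x, 16)

# Unregularized fit: with as many parameters as data points it
# interpolates every noisy sample exactly, i.e. it over-fits
w_plain = np.linalg.solve(X, y)

# L2-regularized ("ridge") fit: the lam * I penalty term discourages the
# huge coefficients needed to chase noise, so the fit stays smoother
lam = 1.0
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(16), X.T @ y)

# Compare both fits against the true relationship on a dense grid
xs = np.linspace(-1.0, 1.0, 400)
Xs = np.vander(xs, 16)
rmse_plain = np.sqrt(np.mean((Xs @ w_plain - xs) ** 2))
rmse_ridge = np.sqrt(np.mean((Xs @ w_ridge - xs) ** 2))
print(rmse_ridge < rmse_plain)
```

The unregularized polynomial oscillates wildly between the sample points, so its error against the true line is far worse than the regularized fit’s, even though it matches the training data perfectly.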


I was definitely in the same camp (I mean, Hindenburg, duh). But there have been a bunch of studies showing that, because hydrogen dissipates up and away almost immediately, it’s much safer in collisions and unexpected containment breaches unless you’re in an extremely cramped area.
Even then, it actually poses less of a threat to life because it doesn’t create smoke or keep burning for a while the way gasoline does.


> Hydrogen is more dangerous than gasoline if it leaks

I’d love to see a source on that.
This report by the US Department of Energy says otherwise.


You didn’t say “most” in your original post. You might want to edit it if that’s what you meant.


? You can get hydrogen through simple water electrolysis. In fact, you can do it at home. That’s how roughly 4% of all hydrogen is manufactured.
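For a sense of scale, here’s a back-of-envelope calculation using Faraday’s law (the constants are standard chemistry; the 10 A, one-hour bench cell is purely a hypothetical example):

```python
# Back-of-envelope for water electrolysis using Faraday's law.
# Overall reaction: 2 H2O -> 2 H2 + O2, and producing each H2
# molecule takes 2 electrons at the cathode.
FARADAY = 96485.0  # coulombs per mole of electrons
M_H2 = 2.016       # grams per mole of H2

def grams_of_h2(current_amps, seconds, efficiency=1.0):
    charge = current_amps * seconds * efficiency  # total charge in coulombs
    moles_e = charge / FARADAY                    # moles of electrons passed
    return (moles_e / 2.0) * M_H2                 # 2 e- per H2 molecule

# A hypothetical 10 A bench cell running for one hour:
print(round(grams_of_h2(10, 3600), 2))  # ~0.38 g of H2
```

Which is to say: easy to demonstrate at home, but you need serious current (and serious efficiency) before it competes with industrial production.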


Is this just hosted Nextcloud with Collabora Office pre-installed?


I don’t think chain of trust and security through kernel-level access are fighting the same problem.
Usually chain of trust is to prevent app tampering, and kernel-level access is to prevent memory tampering.
I assume Windows is creating a new API for applications to monitor certain regions of memory for tampering without needing kernel access.


I have also been done in many times by git-filter-repo. My condolences to the chef.


There are a lot of assumptions laced into that about LLMs reliably getting better over time…
But so far they have gotten steadily better, so I suppose there’s enough fuel for optimists to extrapolate that into a positive outlook.
I’m very pessimistic about these technologies, and I feel like we’re at the top of the sigmoid curve of “improvements,” so I don’t see LLM tools getting substantially better than this at analyzing code.
If that’s the case, I don’t feel like hundreds and hundreds of false security reports create the mental arena that lets researchers actually spot the genuine report among all the slop.


It found it 8/100 times, and only when the researcher gave it just the code paths he already knew contained the exploit. Essentially the garden path.
The test with the actual full suite of commands passed in the context only found it 1/100 times, and we didn’t get any info on the number of false positives they had to wade through to find it.
This is also assuming you can automatically and reliably filter out the false positives.
He even says the ratio is too high in the blog post:

> That is quite cool as it means that had I used o3 to find and fix the original vulnerability I would have, in theory, done a better job than without it. I say ‘in theory’ because right now the false positive to true positive ratio is probably too high to definitely say I would have gone through each report from o3 with the diligence required to spot its solution.


I’m not sure that, if the Gutenberg press had produced only one readable copy for every 100 printed, it would have been the literary revolution that it was.


I’m not sure if it would work for your situation, but you seem to be able to SSH into a server on that network? If so, you can run a browser on that machine and tunnel the X session over SSH:
https://www.cyberciti.biz/tips/running-x-window-graphical-application-over-ssh-session.html
Otherwise neko seems neat; I’ve actually been looking for something for watch parties.


The blog post from the researcher is a more interesting read.
Important points here about benchmarking:

> o3 finds the kerberos authentication vulnerability in the benchmark in 8 of the 100 runs. In another 66 of the runs o3 concludes there is no bug present in the code (false negatives), and the remaining 28 reports are false positives. For comparison, Claude Sonnet 3.7 finds it 3 out of 100 runs and Claude Sonnet 3.5 does not find it in 100 runs.

> o3 finds the kerberos authentication vulnerability in 1 out of 100 runs with this larger number of input tokens, so a clear drop in performance, but it does still find it. More interestingly however, in the output from the other runs I found a report for a similar, but novel, vulnerability that I did not previously know about. This vulnerability is also due to a free of sess->user, but this time in the session logoff handler.

I’m not sure if a signal-to-noise ratio of 1:100 is uh… great…


They’re both “immutable” in the sense that they either set up a read-only filesystem hierarchy (as in Bazzite, which uses ostree) or symlink the entire filesystem hierarchy into a read-only “store” (as in NixOS).
Bazzite uses ostree to “diff” the filesystem hierarchy much like git does, while Nix builds a giant read-only store of hashed files and weaves them together into a “view” of a filesystem that gets symlinked into the context of a running program.
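The store-plus-symlinked-view idea can be sketched in a few lines. To be clear, this is a toy model of the concept only: it is not Nix’s actual layout, hash scheme, or tooling, and all the paths and names are made up for illustration:

```python
import hashlib
import os
import subprocess
import tempfile

# Toy model: a read-only, content-addressed "store" plus a "view" of
# symlinks weaving store paths into one runnable filesystem hierarchy.
root = tempfile.mkdtemp()
store = os.path.join(root, "store")
view_bin = os.path.join(root, "view", "bin")
os.makedirs(store)
os.makedirs(view_bin)

# "Build" a tiny program, then file it in the store under a name
# derived from the hash of its contents
program = b"#!/bin/sh\necho hello\n"
digest = hashlib.sha256(program).hexdigest()[:16]
pkg_bin = os.path.join(store, f"{digest}-hello", "bin")
os.makedirs(pkg_bin)
exe = os.path.join(pkg_bin, "hello")
with open(exe, "wb") as f:
    f.write(program)
os.chmod(exe, 0o555)  # store contents are read-only

# The "view" is nothing but symlinks into the store
os.symlink(exe, os.path.join(view_bin, "hello"))

# Running the program through the view transparently hits the store
out = subprocess.run([os.path.join(view_bin, "hello")],
                     capture_output=True, text=True).stdout.strip()
print(out)  # hello
```

Two different "views" could symlink two different hashed versions of the same package, which is roughly why these systems get atomic upgrades and rollbacks almost for free: switching generations is just repointing symlinks.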