Rust 1.78.0 released

[Posted May 2, 2024 by corbet]

Version 1.78.0 of the Rust language has been released. Changes include a new mechanism for diagnostic attributes, changes to how assertions around unsafe blocks are handled, and more.

Rust now supports a #[diagnostic] attribute namespace to influence compiler error messages. These are treated as hints which the compiler is not required to use, and it is also not an error to provide a diagnostic that the compiler doesn't recognize. This flexibility allows source code to provide diagnostics even when they're not supported by all compilers, whether those are different versions or entirely different implementations.
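As a rough illustration of the attribute namespace described above, here is a minimal sketch using `#[diagnostic::on_unimplemented]`, the attribute stabilized alongside the namespace in 1.78. The `Sensor` trait, the `Thermometer` type and the `sample` function are hypothetical names invented for this example.

```rust
// Hypothetical trait for illustration; none of these names come from the release notes.
// The attribute is only a hint: a compiler may ignore it entirely.
#[diagnostic::on_unimplemented(
    message = "`{Self}` cannot be used as a sensor",
    label = "this type does not implement `Sensor`",
    note = "implement `Sensor` for `{Self}` to read values from it"
)]
pub trait Sensor {
    fn read(&self) -> i32;
}

/// Hypothetical type that does implement the trait.
pub struct Thermometer(pub i32);

impl Sensor for Thermometer {
    fn read(&self) -> i32 {
        self.0
    }
}

/// Accepts anything that implements `Sensor`.
pub fn sample<S: Sensor>(sensor: &S) -> i32 {
    sensor.read()
}
```

Calling `sample(&Thermometer(21))` compiles normally. Calling `sample` with a type that lacks the impl is a compile error either way, but a compiler that honours the hint may show the custom message, label and note in place of its default wording.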


Big changes stuck in nightly

Posted May 2, 2024 18:13 UTC (Thu) by proski (subscriber, #104) [Link]

While it's nice to see Rust releasing new versions on a schedule, it's disappointing that many important features are stuck in nightly for years. The `error_in_core` feature (Error trait with no_std) is not moving anywhere, and its dependencies (`provide_any` and `error_generic_member_access`) are not moving towards stabilization either. The last comments on the tracking tickets are months old.

Supporting the Error trait without standard library would make low-level programming in Rust more Rust-like and less like a separate programming language.
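For readers unfamiliar with what this would look like, here is a minimal sketch. It assumes a toolchain where the `core::error::Error` path is usable (at the time of this thread that meant nightly with the `error_in_core` feature enabled). `BusError` and `probe` are hypothetical names, and the `#![no_std]` crate attribute is left out so the snippet also builds as an ordinary program.

```rust
use core::fmt;

/// Hypothetical low-level error type; it uses nothing from `std`.
#[derive(Debug, PartialEq)]
pub enum BusError {
    Timeout,
    Nack(u8),
}

impl fmt::Display for BusError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BusError::Timeout => write!(f, "bus timeout"),
            BusError::Nack(addr) => write!(f, "no acknowledgement from device {addr:#04x}"),
        }
    }
}

// Assumption: `core::error::Error` is available on the toolchain in use.
impl core::error::Error for BusError {}

/// Hypothetical probe routine: address 0 is treated as "nobody answered".
pub fn probe(addr: u8) -> Result<(), BusError> {
    if addr == 0 {
        Err(BusError::Nack(addr))
    } else {
        Ok(())
    }
}
```

The point is that the same `Error` trait object (`&dyn core::error::Error`) could then be passed around in firmware and in hosted code alike, instead of each no_std crate inventing its own error convention.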

Only simple changes seem to be going to most releases.

Big changes stuck in nightly

Posted May 3, 2024 0:37 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

Well, necessarily things won't move by themselves, so, if you particularly want something to move, one option is certainly to help make that happen. I want Pattern Types (specifically I want types like BalancedI8), but since I haven't actually done much work to make that happen I can hardly complain that not much has happened.

The feature error_in_core seems currently to depend on Rust choosing a particular way to do things which has been rejected, so that's not going to make forward progress on its current route - maybe you can identify, champion and help move the feature towards some viable alternative.

Big changes stuck in nightly

Posted May 3, 2024 16:13 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

Personally, I'd love to help,* but many of the problems with Rust (including the one you describe!) appear to be structural or political rather than technical. How are new contributors realistically supposed to improve things on that front?

* I would have to have a conversation with my employer in order to help, but that's a separate issue.

Big changes stuck in nightly

Posted May 3, 2024 16:16 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

(And yes, I'm aware that Rust probably does have technical problems too. I'm talking about perceptions, specifically how Rust is perceived by newcomers and outsiders. This is the language that turned a simple keynote address into megabytes of internet drama.)

Big changes stuck in nightly

Posted May 3, 2024 23:35 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

I can't see how the one I was talking about is "structural or political" except in the very broad sense that in my head is best expressed as the Skunk Anansie lyric "Yes it's fucking political. Everything's Political".

There's a question of how Rust should handle the nuts and bolts of this feature, one path forward was Provider, the library team decided (at a meeting you can probably find a summary of if you care enough) that Provider isn't a good idea, but the ticket for the error_in_core feature is still predicated on that feature. So "Wait until Provider is stabilised" is futile, that won't happen, meanwhile the next step is blocked on work by somebody who "needs to decomputer for a while". So that's a great place to start where they left off if you feel like you're not contributing by tidying up tickets to say they shouldn't wait on things that won't happen.

tl;dr Read through say 99301, figure out what must happen next, what's good to have but not crucial, and then if you feel like you can start to solve that, write down what you think needs solving and start. If you realise this requires skills you don't have and don't care to acquire, then yeah, I'm sorry, you won't be able to contribute, but this doesn't look like it's about delicate political negotiation, looks like it's type theory and a bunch of programming to me.

Big changes stuck in nightly

Posted May 4, 2024 0:30 UTC (Sat) by NYKevin (subscriber, #129325) [Link]

> [...] write down what you think needs solving and start.

Here's the problem: Rust has such a history of leadership dysfunction, that I have exactly zero confidence that this would be a useful thing for me to do. For all I know, I'll write a document, get mildly positive feedback, spend weeks or even months on implementation, and then get told "actually, we should've rejected this when you first proposed it, whoops, sorry." I get paid to put up with that sort of thing at $DAY_JOB. I'm not doing it in my free time.

Rust's leadership is aware of the problem. The difficulty is, they were already aware of it when keynotegate happened. They were, concurrent with that incident, already in the process of restructuring to make things more transparent, in the wake of the *previous* "leadership dropped the ball and miscommunicated something important" crisis. Given that backdrop, my position is that I'm not going to be interested in touching anything Rust that involves any degree of "figuring out what the solution should be" for at least another year, possibly longer if they manage to have another crisis, unless I'm getting paid to do so.

Rust 1.78.0 released

Posted May 3, 2024 1:10 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

One interesting barely mentioned change in 1.78 is that now Rust's core synchronization primitives are futex flavoured on Windows as well as (non-Apple) modern Unix systems. Previously on Windows the Windows specific APIs were used, but it so happens that somebody was developing a way to provide a futex API on Windows, and of course Rust already has implementations of all its desired primitives against a futex since that's what you get on Linux and several other Unix or Unix-like systems. [ On a Mac AFAIK you're just out of luck ]

This was lucky because in March a C++ programmer discovered that under MSVC std::shared_mutex doesn't have the required semantics. Upon digging into this, Microsoft's brilliantly named Stephen T Lavavej (a maintainer of their C++ stdlib which they call "STL") realised actually that's not just an MSVC C++ bug, the cause is in a Windows API, SRWLock, which was documented to have the behaviour C++ wants but actually just doesn't. Why should Rust care? Well this behaviour is (not coincidentally) also what Rust specified for its RWLock type, and so Rust like MSVC used SRWLock - after all that's what it's for. So the exact same undocumented defect affects Rust (1.77 and earlier). Getting off SRWLock and similar APIs escapes from this menace.

It does seem as though Microsoft probably just aren't interested in actually fixing this. Maybe even to the point where it just becomes accepted that oh yeah, C++ std::shared_mutex doesn't work properly on Windows. Rust won't be stuck in a similar position now if that happens, so that's great news.

Rust 1.78.0 released

Posted May 3, 2024 6:33 UTC (Fri) by mb (subscriber, #50428) [Link]

>that somebody was developing a way to provide a futex API on Windows

Do you have more information about that?

Rust 1.78.0 released

Posted May 3, 2024 8:12 UTC (Fri) by p2mate (subscriber, #51563) [Link]

https://github.com/rust-lang/rust/issues/93740

Rust 1.78.0 released

Posted May 3, 2024 8:18 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

That's Mara's 2022 work which uses SRWLock, I assume mb is interested in the new work, which landed in 1.78 https://github.com/rust-lang/rust/pull/121956/

Rust 1.78.0 released

Posted May 3, 2024 9:40 UTC (Fri) by khim (subscriber, #9252) [Link]

> it so happens that somebody was developing a way to provide a futex API on Windows

Ugh. That's… an interesting take on the whole thing, I suppose. That “someone” is actually… Microsoft.

The catch? It's a Windows 8+ API. And till Rust 1.76 Windows 7 was Tier 1. Now that it's Tier 3 instead, the road to the use of that API is open.

I guess {x86_64,i686}-win7-windows-msvc uses some kind of emulation, but that's not as important since it's Legacy platform now.

Rust 1.78.0 released

Posted May 3, 2024 10:01 UTC (Fri) by mjg59 (subscriber, #23239) [Link]

Windows 8 was over a decade ago, and Windows 7 stopped receiving extended life security updates last year. Worrying about Windows 7 isn't far off worrying about Amiga Unix.

Rust 1.78.0 released

Posted May 3, 2024 10:14 UTC (Fri) by khim (subscriber, #9252) [Link]

> Worrying about Windows 7 isn't far off worrying about Amiga Unix.

And that thinking is the reason why Year of Linux desktop never happened and would, probably, never happen.

Rust was listing Windows 7 as tier 1 platform till February 8 2024. It may have been old, at this point, but it's still supported (as tier 3 platform now).

You don't go and just randomly break supported platform (except if you are typical desktop Linux developer, I guess).

Amiga Unix, on the other hand, was never officially supported by Rust AFAIK.

Rust 1.78.0 released

Posted May 3, 2024 11:49 UTC (Fri) by zdzichu (subscriber, #17118) [Link]

But this is still Windows, a fringe platform from our perspective. By “our” I mean open source, Linux-centric Rust community.

Rust 1.78.0 released

Posted May 3, 2024 11:55 UTC (Fri) by khim (subscriber, #9252) [Link]

It may be “fringe platform” from your personal POV, but for Rust it's one of the most important platforms.

In fact there are eight tier1 platforms listed on the Rust website and precisely half of them are some kind of Windows.

Treating Windows as a fringe, irrelevant platform was another reason why Rust developers ignored many demands of the “open source, Linux-centric” community.

Rust 1.78.0 released

Posted May 3, 2024 13:59 UTC (Fri) by cesarb (subscriber, #6266) [Link]

> In fact there are eight tier1 platforms listed on the Rust website and precisely half of them are some kind of Windows.

The number of Windows platforms listed only means it has lots of variation (32-bit versus 64-bit, MinGW versus MSVC), and gives us no information about its relative importance. Its presence in the tier1 list is enough to tell us it's considered an important platform for Rust, but that doesn't mean the other platforms in the tier1 list are less (or more) important than it.

(We have a 2x2 matrix of Windows variants depending on the tooling and the CPU architecture, the 3 Linux variants all use the same compiler and have an extra CPU architecture, and Apple has a single variant in tier1.)

Rust 1.78.0 released

Posted May 3, 2024 16:10 UTC (Fri) by ballombe (subscriber, #9523) [Link]

aarch64-apple-darwin is in tier 2 while x86_64-apple-darwin is in tier 1.
So being in tier 1 is not an indication of freshness.

Rust 1.78.0 released

Posted May 3, 2024 20:20 UTC (Fri) by atnot (subscriber, #124910) [Link]

In general T3 vs T2 is a question of active maintainers, while T2 vs T1 is just a question of whether someone could be bothered to run a build farm for it.

Rust 1.78.0 released

Posted May 3, 2024 15:32 UTC (Fri) by mb (subscriber, #50428) [Link]

There is not one single reason why Rust should keep supporting platforms that Microsoft itself dropped and does not support anymore.
It's a waste of time.

Rust 1.78.0 released

Posted May 3, 2024 16:46 UTC (Fri) by khim (subscriber, #9252) [Link]

It all depends on what people are using. And old platforms are used for a very long time. After all, Windows 3.11 is still in use, and compared to that Windows XP is practically bleeding edge and Windows 7 is something they may contemplate upgrading to soon!

Rust 1.78.0 released

Posted May 3, 2024 17:04 UTC (Fri) by mb (subscriber, #50428) [Link]

>It all depends on what people are using.

No. Not at all. It depends on what Rust developers want to invest their time in.

But even if your argument was true, then it would be another reason to *drop* support for Win7.

https://www.google.de/search?q=windows+7+market+share

If you want to support this 3% market share, please spend *your* time and continue to maintain a Rust fork for them.

Rust 1.78.0 released

Posted May 3, 2024 17:38 UTC (Fri) by khim (subscriber, #9252) [Link]

> If you want to support this 3% market share, please spend *your* time and continue to maintain a Rust fork for them.

Seriously? 3% are not enough to warrant tier 3 support but 4% should deserve special status? Why?

Thank god Rust people are more reasonable. Rust for Windows 95 is an unofficial port because there are apparently enough people who care about it but not enough people to do regular builds, while Rust for Windows 7 is tier 3 because at this point supporting it as a host platform is too much hassle while supporting it as a target platform is valuable to many developers.

Rust 1.78.0 released

Posted May 3, 2024 17:51 UTC (Fri) by mb (subscriber, #50428) [Link]

>3% are not enough to warrant tier 3 support but 4% should deserve special status? Why?

Because you are completely missing the non desktop uses of Linux? Because these percentage numbers have completely different bases and as such can't be compared? Because the Rust people decide what to do with their thing?

Seriously, if you want Rust for Win7 so much, please maintain it. Nobody stops you.
But please stop demanding.

Rust 1.78.0 released

Posted May 3, 2024 21:49 UTC (Fri) by rgmoore (✭ supporter ✭, #75) [Link]

> It all depends on what people are using.

It depends on what people are developing for. My guess is that most of the people still running Windows 3.11 are running legacy code on it, rather than writing anything new. That makes it uninteresting as a target for language developers. That's different from Windows 7, where there are still a surprising number of users who want to run up-to-date applications, like web browsers.

Rust 1.78.0 released

Posted May 6, 2024 8:12 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

Users want a lot of things.

Running a web browser on Windows 7 (for the usual purpose of browsing the public web) is not and cannot be made safe. Chromium-based browsers no longer support it,[1] Firefox is fully dropping support in September,[2] and Microsoft's extended support program (which, for the record, was never intended for home users) ceased in January of last year.[3] Let me repeat that: Even if your company paid a lot of money for extra security updates, your OS and (probably) browser was last updated a year and four months ago. If you're a home user or your IT department didn't feel like paying for those updates, then your OS was last updated over four years ago.[4]

We should not lie to users about this being a reasonable thing to do. It is not. You will get hacked. Your passwords will get stolen, and your accounts will get taken over. Do not do this under any circumstances. Do not suggest to users that this is an OK thing to do. If you absolutely have to use a Windows 7 box, airgap it.

[1]: https://support.google.com/chrome/thread/185534985/sunset...
[2]: https://support.mozilla.org/en-US/kb/firefox-users-window...
[3]: https://learn.microsoft.com/en-us/troubleshoot/windows-cl...
[4]: https://support.microsoft.com/en-us/windows/windows-7-sup...

Rust 1.78.0 released

Posted May 3, 2024 14:08 UTC (Fri) by cesarb (subscriber, #6266) [Link]

> Windows 8 was over a decade ago, and Windows 7 stopped receiving extended life security updates last year.

Just because some Windows version stops receiving security updates doesn't mean it stops being used. It only means Microsoft wishes it stopped being used.

> Worrying about Windows 7 isn't far off worrying about Amiga Unix.

Amiga Unix was an obscure operating system few people used. Windows 7 was a very popular operating system a lot of people used, and many people still use.

Rust 1.78.0 released

Posted May 3, 2024 11:19 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

When you say "that someone" is Microsoft, are you telling me that Chris Denton (who wrote the change we're talking about) is a Microsoft employee?

Or just that the Windows API itself is something Microsoft developed (which seems obvious and so not worth noting)? Or something else?

> I guess {x86_64,i686}-win7-windows-msvc uses some kind of emulation, but that's not as important since it's Legacy platform now.

It's stuck with the SRWLock and similar Windows 7 era "native" features which it turns out are broken nonsense. Microsoft is, of course, free to ship fixes but since this is out of support I expect they just don't care. It's not clear to me how "Microsoft shipped garbage and don't care" is somehow a Rust bug and proof that Linux (?) isn't fit for desktop use.

Within Microsoft my impression is they've decided the nonsense is an "important performance optimisation" and that "Most Windows software relies on it" which is very odd for an undocumented "feature" (I'd say "bug") which nobody knew existed and has never actually been benchmarked, but I guess that's the state of software "engineering" at Microsoft these days.

In terms of the Rust change, there's literally a rename in the git change, renaming the "Windows" stuff (which was what you were pleased with in say Rust 1.77 and earlier) to "Windows7" because now actual modern Windows uses futex style APIs instead, and then code to ensure the Windows7 code is used on er... Windows 7.

Rust 1.78.0 released

Posted May 3, 2024 11:49 UTC (Fri) by khim (subscriber, #9252) [Link]

> When you say "that someone" is Microsoft, are you telling me that Chris Denton (who wrote the change we're talking about) is a Microsoft employee?

No, I'm saying that the actual futex API that the new implementation uses was done by Microsoft and included in Windows 8+.

Sure, without Chris Denton we wouldn't have gotten it, but without the Windows 7 deprecation his work wouldn't have been usable.

> Or just that the Windows API itself is something Microsoft developed (which seems obvious and so not worth noting)?

It's worth noting because from your description it looks as if someone took Windows API as it existed years ago and implemented futexes on top of that.

That's not what happened: Windows 8 got a futex-like API added to it and Rust 1.78 uses it; there is no emulation of futexes anywhere.

> It's not clear to me how "Microsoft shipped garbage and don't care" is somehow a Rust bug and proof that Linux (?) isn't fit for desktop use.

Rust had the bug precisely because it doesn't want to repeat the fate of the Linux desktop. Sure, sometimes people ship garbage. No, that's not a good enough excuse to drop support for said garbage and force everyone to switch to something else every year.

It's really strange how we ended up with a system which has such a deep disconnect between layers: low-level layers (kernel, foundational libraries like glibc or libstdc++) developers understand the issue, desktop Linux developers… nope.

> Within Microsoft my impression is they've decided the nonsense is an "important performance optimisation" and that "Most Windows software relies on it" which is very odd for an undocumented "feature" (I'd say "bug") which nobody knew existed and has never actually been benchmarked, but I guess that's the state of software "engineering" at Microsoft these days.

It was always like this. If it took 17 years to notice then it's probably not common enough to be a problem worth fixing. And if std::mutex which was introduced 10 years after the low-level API was introduced doesn't like it then it's std::mutex that should be fixed, not API that worked fine for more than a decade.

Fixing it in code in the future would make sense (AFAICS std::mutex-mode doesn't make old software fail), but I don't think “turning the world upside down” because of some subtle bug somewhere makes any sense. Yet that's exactly what aforementioned CADT developers like to do: they rewrite everything because some issues are hard to fix in old design without thinking… and in the process break the workflow people had based on old designs.

I have heard that they have finally started to learn (e.g. PipeWire is trying to be bug-to-bug compatible with PulseAudio), but the damage is [mostly] done: most people who actually need something done left the Linux desktop long ago (myself included).

> In terms of the Rust change, there's literally a rename in the git change, renaming the "Windows" stuff (which was what you were pleased with in say Rust 1.77 and earlier) to "Windows7" because now actual modern Windows uses futex style APIs instead, and then code to ensure the Windows7 code is used on er... Windows 7.

Yes, Rust got a differently implemented rwlock as a happy accident while Windows still uses the one it was using for years. That's another thing that both kernel developers and Rust do right while CADT developers do wrong: if you explicitly make some ABI or API unstable and unsupported then you may break it. But it's a very different thing from providing some ABI or API as the only way of doing things and then breaking it.

Rust 1.78.0 released

Posted May 3, 2024 13:11 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

> from your description it looks as if someone took Windows API as it existed years ago and implemented futexes on top of that.

The Windows API in question did exist years ago, specifically more than a decade ago in Windows 8

> there is no emulation of futexes anywhere

I probably wouldn't call this "emulation" because of course the API you actually want isn't very negotiable, so Windows ends up with a very similar arrangement, but there are differences and if you want to call adapting to those differences, or even just the naming, "emulation" then that's what this does. For example:

    pub fn futex_wait<W: Waitable>(futex: &W::Atomic, expected: W, timeout: Option<Duration>) -> bool {
        // return false only on timeout
        wait_on_address(futex, expected, timeout) || api::get_last_error().code != c::ERROR_TIMEOUT
    }
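
The boolean expression in that wrapper is easy to misread, so here is a small standalone sketch of just the decision it makes. Everything in it is a stand-in invented for illustration: `wait_ok` plays the role of the value returned by `wait_on_address`, `last_error` plays the role of the thread's last error code, and the constant hard-codes the Windows `ERROR_TIMEOUT` value (1460) instead of taking it from real bindings.

```rust
/// Assumption: mirrors the Windows `ERROR_TIMEOUT` code, hard-coded for this sketch.
pub const ERROR_TIMEOUT: u32 = 1460;

/// Hypothetical stand-in for the decision made in `futex_wait`:
/// returns `false` only when the wait failed *and* the failure was a timeout.
pub fn woke_or_not_timeout(wait_ok: bool, last_error: u32) -> bool {
    wait_ok || last_error != ERROR_TIMEOUT
}
```

So a successful wait reports `true`, a wait that failed for any reason other than a timeout also reports `true`, and only a genuine timeout reports `false`, which matches the comment in the quoted code.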

> Rust had the bug precisely because it doesn't want to repeat the fate of the Linux desktop.

Rust had this bug for the same reason the STL had this bug, because Microsoft is bad at software engineering as a practice and it routinely ships garbage. In this case the garbage isn't so bad that everybody noticed immediately... and so it was just left dormant for many years. Now, on supported Windows builds (and always on every decent OS) it does not have this bug.

When I say they're bad at engineering, this same incident provides an excellent demonstration. Rather than fix the bug in SRWLock, the preferred "solution" from that side of Microsoft was to document that this is just how that API worked, and everybody who has been using it (ie all C++, Rust and other downstream software) was doing it wrong. So they wrote that documentation change, and... it's stuck somewhere for weeks, their own autogenerated documentation doesn't work, nobody seems sure exactly why, but hey, they can close their "bug" ticket and too bad the published document is wrong, software doesn't work, everything is broken but at least the people who screwed up are rewarded...

This seems to surprise non-Windows people, but I've worked with Microsoft APIs before, so I'm used to it, broken garbage that's kept kinda-sorta working by people like Raymond Chen. "Oh yeah, always use values between 53 and 140 to make it work right". One of the first bits of work I did at my current job should just have mostly been a copy-paste of a Microsoft example, but it didn't work. Then I realised the example didn't work either. Plenty of Microsoft employees just don't even bother testing that their examples even work, they write something that sounds about right and hit publish, there is no verification step and there are no negative consequences if it's broken, you shipped, get your trophy. The results very much speak for themselves. That incident stuck in my head because I'd been writing a lot of Rust and appreciating the fact Rust's test automation *assumes your examples should work* and so your tests fail if you wrote a bad example. But probably Microsoft's tests don't pass either so...

As to the idea that somehow SRWLock is the correct idea and it's std::shared_mutex (or even sillier Rust's RWLock) that should be altered, this makes no sense. The reader-writer lock is a *really old* idea. It's in POSIX back when that mattered. Obviously the POSIX reader-writer locks, which is what Rust will use on a really crusty Unix with no futex, are slower, but they're not broken, they, unlike SRWLock, provide a useful reader-writer lock feature.

std::shared_mutex despite being newer than Windows 8 isn't "For compatibility with Windows 7" or something silly, it's the canonical way to write a reader-writer lock in pure standard C++ -- before that existed you'd either use platform APIs such as pthreads or you'd use a third party library or you'd go without. RWLock was in Rust 1.0 because duh, obviously you want a reader-writer lock, this is the 21st century.

You seem to just take the people who screwed up at their word that this is an optimisation. Is that how you'd feel in a similar situation for say, Linux filesystem APIs? "Oh of course the file close feature doesn't always close the file, it's an optimisation, people should know that closing a file might sometimes not close it". Without the optimisation, closing a file might close the file even if that was non-optimal somehow.

That's what is going on here. The "optimisation" in Microsoft's SRWLock as implemented is that if you try to take a reader lock, sometimes you silently have the writer lock instead, when you release it things will be back to normal, but meanwhile it's exclusively locked to you. So, if you needed (say) another of the readers to also take this lock, too bad they can't because it wasn't a reader lock after all. This is a crucial optimisation apparently, so crucial that it was unmentioned and nobody suspected it existed for over a decade. Also nobody else seems to have invented a similar optimisation on other platforms, and nobody else got excited and rushed off to implement it. "Weird".
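To make the failure mode concrete, here is a minimal sketch of the "readers that need each other" pattern, written against Rust's standard `RwLock`. The function name `readers_rendezvous` is made up for this example. Each reader takes the shared lock and then waits at a barrier until all the other readers hold it too. With no writer involved, a lock that really grants shared access lets every reader through. A lock that silently hands one reader exclusive ownership, as described above, would leave that reader waiting at the barrier forever.

```rust
use std::sync::{Arc, Barrier, RwLock};
use std::thread;

/// Spawns `n` reader threads. Each takes the read lock, then waits at a
/// barrier until every other reader also holds the read lock.
/// Returns how many readers finished.
pub fn readers_rendezvous(n: usize) -> usize {
    let lock = Arc::new(RwLock::new(0u32));
    let barrier = Arc::new(Barrier::new(n));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let lock = Arc::clone(&lock);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                // Shared acquisition: several readers may hold this at once.
                let guard = lock.read().unwrap();
                // Only passes once all `n` readers are inside the lock together.
                barrier.wait();
                *guard
            })
        })
        .collect();

    handles.into_iter().filter_map(|h| h.join().ok()).count()
}
```

This is only a demonstration of the semantics under discussion, not a reproduction of the original bug report, and it deliberately involves no writers, since Rust's documentation does not promise any particular reader/writer priority policy.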

Rust 1.78.0 released

Posted May 3, 2024 14:11 UTC (Fri) by chris_se (subscriber, #99706) [Link]

To add to your post a bit of additional context for Microsoft's SRWLock issue, in case someone wants to read up on the details:

Reddit Thread about this issue
A StackOverflow question about this issue (a bit later than the reddit thread)
The Microsoft STL GitHub issue (Microsoft's STL is Apache 2.0 licensed nowadays)

All of these also contain comments from people who work/worked at Microsoft and can be an interesting read if you want to know more about the details.

Rust 1.78.0 released

Posted May 3, 2024 16:37 UTC (Fri) by kleptog (subscriber, #1183) [Link]

>This seems to surprise non-Windows people, but I've worked with Microsoft APIs before, so I'm used to it, broken garbage that's kept kinda-sorta working by people like Raymond Chen.

Thank you for confirming it's not in my head. I've not had to do anything with the Microsoft ecosystem for two decades and then all of a sudden it's "must use Azure". The implementation is unbelievably flaky in places and the documentation is just, in a word, terrible. It's verbose, yet somehow manages to avoid explaining any of the important details you need to know.

I'm lucky that ChatGPT came along at the right time because it manages to distil the wisdom of the internet into explanations and examples that are actually reasonably correct. I'd sort of vaguely understood the Windows API was a bit flaky, but now nothing surprises me any more.

Rust 1.78.0 released

Posted May 3, 2024 17:25 UTC (Fri) by khim (subscriber, #9252) [Link]

> This seems to surprise non-Windows people, but I've worked with Microsoft APIs before, so I'm used to it, broken garbage that's kept kinda-sorta working by people like Raymond Chen.

Why should it surprise non-Windows people? That's how all popular OSes work: be it Windows, Android, or even Linux (kernel, not Linux desktop). Heck, even GLibC works like that.

It's only CADT Linux desktop developers who think that if you find garbage in your implementation then the proper approach is just to fix it and break the user's perfectly working code. Well, I guess OpenBSD folk do that, too, but who cares about it?

> You seem to just take the people who screwed up at their word that this is an optimisation.

No. I take their word on the fact that it's deliberate behavior that was in use for more than a decade. It's enough. You don't go and break code that was in use for so long even if it's “wrong and stupid”; the only reason to break code that's in use by billions is if you really have no other choice.

Here is similar example from another OS which I personally helped to investigate and fix: see something interesting, no? Yup: on Android 6.0 or below sem_wait is not conforming to the specification. And, of course, it was fixed only for new versions of Android because older apps may rely on erroneous behavior (hint: they actually do, some old games wouldn't work if you would add that fix).

> Is that how you'd feel in a similar situation for say, Linux filesystem APIs?

Do you even have to ask? Sure. Linux developers invent very complicated tricks to keep old programs working, including adding pretty crazy warts to filesystem APIs.

> This is a crucial optimisation apparently, so crucial that it was unmentioned and nobody suspected it existed for over a decade.

Nobody suspected it may be perceived as broken implementation. Apparently there were (and are!) more than enough people who think it's something that should be permitted by RWLock.

> Rather than fix the bug in SRWLock, the preferred "solution" from that side of Microsoft was to document that this is just how that API worked, and everybody who has been using it (ie all C++, Rust and other downstream software) was doing it wrong.

They “weren't doing it wrong”, they were “documenting it wrong” from their POV. And yes, fixing the documentation is kind of important in such cases.

Whether to fix it for new applications or not is separate issue.

Rust 1.78.0 released

Posted May 4, 2024 1:23 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

> Nobody suspected it may be perceived as broken implementation. Apparently there were (and are!) more than enough people who think it's something that should be permitted by RWLock.

This "Nobody" seems to comprise a handful, perhaps only two, of Microsoft insiders who don't even seem clear on what a reader-writer lock is. So it's hardly surprising they don't see why their implementation is broken.

Was RWLock a typo for SRWLock, or are you claiming that people want this behaviour from Rust's std::sync::RWLock ? I haven't seen any sign of this, and it would astonish me.

You linked an article about how glibc memcpy won't get slowed down to be "compatible" with insane C programmers at Adobe who decided to ignore the definition of memcpy and assume it's actually memmove. Was that what you intended? Am I missing a story and actually on my Linux today the memcpy is just memmove so that obsolete 1990s software which won't run still works? I can't really follow your reasoning.
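For anyone who hasn't run into the memcpy/memmove distinction, here is a small sketch of why overlapping buffers matter. Both functions are written for this example: `forward_copy` is a naive front-to-back loop standing in for one way a memcpy-style routine might behave (real memcpy on overlapping regions is simply undefined in C), and `overlap_safe_copy` uses `slice::copy_within`, which is documented to behave like memmove.

```rust
/// Naive front-to-back copy within one buffer. Illustrative stand-in only:
/// it clobbers source bytes it has not read yet when the regions overlap.
pub fn forward_copy(buf: &mut [u8], src_start: usize, dst_start: usize, len: usize) {
    for i in 0..len {
        buf[dst_start + i] = buf[src_start + i];
    }
}

/// Overlap-safe copy within one buffer (memmove-like semantics).
pub fn overlap_safe_copy(buf: &mut [u8], src_start: usize, dst_start: usize, len: usize) {
    buf.copy_within(src_start..src_start + len, dst_start);
}
```

Starting from `[1, 2, 3, 4, 5]` and copying four bytes from index 0 to index 1, the naive loop smears the first byte across the buffer and yields `[1, 1, 1, 1, 1]`, while the overlap-safe version yields `[1, 1, 2, 3, 4]`. Code that called memcpy but relied on the second result is the kind of caller the linked article is about.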

> Here is similar example from another OS which I personally helped to investigate and fix

It doesn't seem like a similar example at all? It looks as though the Android bug you've linked was an immediately observable defect, and so correcting it meant breaking all the software which depended on the faulty behaviour.

Maybe the problem is that you haven't thought about the consequence of the SRWLock "optimisation" for most software which uses reader-writer locks. Sometimes, all but one of your readers stalls for no reason. Not often, but once in a while. That's the "optimisation". If your readers don't care, which will be the usual case, this "optimisation" makes your software a little um, slower. If your readers do need to work together, this might sometimes deadlock as one reader gets the write lock, then waits forever for another reader. That's a pretty nasty bug, but you'll probably suspect some other cause, and hey if you get lucky it won't happen, you might just decide to add a watchdog timer to fix the occasional deadlocks on customer machines. This is probably a rare pattern anyway, it's legal but I can't remember working on any software which relied on it.

Your contention is that fixing the bug is wrong, and instead it's crucial to keep the "optimisation". Inside Microsoft your position seems to be winning. But the reason Rust even exists is that Hoare was annoyed that a software bug put his elevator out of action. I'm sure somewhere at Otis (or whichever elevator company) somebody felt that "optimising" the software to just be wrong was better than this stupid correctness other people keep asking for but fortunately the hardware people at Otis have this feeling that "Oh, yeah, for optimisation reasons we sometimes kill everybody in the elevator" won't result in many successful sales and so the result is the elevator is taken out of use.

This resonates with me because for a while when I was a starving postgraduate student I helped supervise electronics undergrads learning to write real time software, and the demo hardware for them was (toy) elevators. So every week I'd get to watch these idiots smash the elevator into the ceiling or floor of the shaft because hard real time deadlines are not negotiable. The toy just non-destructively disassembles when crashed and can be reassembled (something I got good at) but of course in the real world those passengers would be injured or dead.

Rust 1.78.0 released

Posted May 6, 2024 8:20 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

> Am I missing a story and actually on my Linux today the memcpy is just memmove so that obsolete 1990s software which won't run still works? I can't really follow your reasoning.

I believe they did eventually decide to do that, not (just) to support Flash, but because Linus convinced them that there was no material performance advantage if you assume no aliasing (and therefore they could support Flash "for free" and might as well do it). But this all happened many years ago, so it might no longer look like that today.

Rust 1.78.0 released

Posted May 6, 2024 8:24 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

Note also that, if there really isn't any performance advantage, then it also makes sense from a code health perspective: Redundant-but-slightly-different implementations of the same function are not exactly something you *want* in a codebase.

Rust 1.78.0 released

Posted May 6, 2024 9:16 UTC (Mon) by khim (subscriber, #9252) [Link]

> Note also that, if there really isn't any performance advantage, then it also makes sense from a code health perspective: Redundant-but-slightly-different implementations of the same function are not exactly something you *want* in a codebase.

This depends on your goals, but if you want to write a codebase which people would take seriously you have to have them. And, of course, there are quite a few versions of memcpy/memmove, including a backward-compatible version, too.

Rust 1.78.0 released

Posted May 6, 2024 10:36 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

What you're describing doesn't make much sense. If you assume no aliasing you will break Flash. The Flash problem is precisely that they've got aliasing and they're calling an API where it's forbidden. Maybe there's a problem where you didn't grasp what aliasing is exactly?

If we have at least two pointers, A and B, aliasing is the situation where modifying (part of) the object pointed to by A also modifies (any part of) the object pointed to by B - hence the name alias. The most obvious case is that they're literally just pointing to the same object, but it's still aliasing when the overlap is less obvious.

For example if I have a pointer A to the ASCII text "Goose=cooked" and I also have a pointer B to the = sign, these pointers are aliased. A compiler which, having been told they cannot alias, re-arranges writes so that changes to replace B with a zero byte, then later put it back are both elided, breaks our token splitting C code. This is one of the performance leaks of C and C++ compared to Rust - Rust can almost always correctly tell the compiler it's sure there is no aliasing and get nicer code emitted.
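For contrast, here is a rough sketch of the same token split in Rust, where the two halves are handed out as mutable slices that cannot overlap. The helper name `split_key_value` is hypothetical, invented for this illustration. Because the borrow checker guarantees the two `&mut` slices are disjoint, the compiler is free to assume no aliasing between them.

```rust
/// Split "key=value" into two disjoint mutable slices, dropping the '='.
/// Returns None if there is no '=' in the buffer.
fn split_key_value(buf: &mut [u8]) -> Option<(&mut [u8], &mut [u8])> {
    let eq = buf.iter().position(|&b| b == b'=')?;
    // split_at_mut yields two non-overlapping mutable views of the buffer.
    let (key, rest) = buf.split_at_mut(eq);
    // rest starts with the '=' byte; skip it.
    let (_eq_sign, value) = rest.split_first_mut()?;
    Some((key, value))
}

fn main() {
    let mut buf = *b"Goose=cooked";
    if let Some((key, value)) = split_key_value(&mut buf) {
        // Both halves can be modified independently; neither can see the other.
        key.make_ascii_uppercase();
        value.make_ascii_uppercase();
    }
    println!("{}", String::from_utf8_lossy(&buf));
}
```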

Looking into it, it seems glibc "fixed" this by welding obsolete software's memcpy to memmove via symbol versioning. So you could have compatibility for Flash (and worse performance, memmove doesn't have the restrict keyword so the compiler must assume these pointers alias) or build new software to get back your performance.
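To make the overlap problem concrete, here is a small sketch. It is not glibc's code: `copy_forward` and `copy_backward` are hypothetical helpers written for this illustration. It shows that when source and destination overlap, the result depends on the direction of the copy, so a caller who passes overlapping buffers to a memcpy-like routine is depending on an implementation detail. Rust's `copy_within` has memmove semantics and gets the right answer regardless.

```rust
/// Naive byte-by-byte copy, lowest address first.
fn copy_forward(buf: &mut [u8], src: usize, dst: usize, len: usize) {
    for i in 0..len {
        buf[dst + i] = buf[src + i];
    }
}

/// Naive byte-by-byte copy, highest address first.
fn copy_backward(buf: &mut [u8], src: usize, dst: usize, len: usize) {
    for i in (0..len).rev() {
        buf[dst + i] = buf[src + i];
    }
}

fn main() {
    // Move "cooked" (bytes 6..12) two bytes to the left, so the ranges overlap.
    let mut forward = *b"Goose=cooked";
    copy_forward(&mut forward, 6, 4, 6);

    let mut backward = *b"Goose=cooked";
    copy_backward(&mut backward, 6, 4, 6);

    // copy_within is specified to handle overlap, like memmove.
    let mut moved = *b"Goose=cooked";
    moved.copy_within(6..12, 4);

    println!("forward:     {}", String::from_utf8_lossy(&forward));
    println!("backward:    {}", String::from_utf8_lossy(&backward));
    println!("copy_within: {}", String::from_utf8_lossy(&moved));
}
```

For this particular overlap (destination below source) the forward copy happens to match `copy_within`, while the backward copy reads bytes it has already overwritten.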

Rust 1.78.0 released

Posted May 6, 2024 11:08 UTC (Mon) by khim (subscriber, #9252) [Link]

> Maybe there's a problem where you didn't grasp what aliasing is exactly?

I know what aliasing is. Do you know what a bug is?

> Looking into it, it seems glibc "fixed" this by welding obsolete software's memcpy to memmove via symbol versioning.

Not enough, but close: they have replaced memcpy which always copied data “starting from the bottom” with memmove. Technically that's a change in the behavior, but since no real-world apps rely on it, it's fine.

> So you could have compatibility for Flash (and worse performance, memmove doesn't have the restrict keyword so the compiler must assume these pointers alias) or build new software to get back your performance.

Precisely. These are the rules:
• if I upgrade something (doesn't matter what: shared library, kernel, OS) and my app, that worked perfectly before, is broken, then said something has a bug and needs to be fixed.
• if the documentation says one thing and the implementation does another, then it's a bug, too, but a less critical one, and thus it should be fixed on a best-effort basis.

Why is it so hard to understand? People don't need your OS. People don't need these apps! They need something done: watch the video, print the document or something else. And bugs are prioritized not on the basis of some mathematical properties but on the basis of their impact on that real-world need.

For some reason in GNU/Linux desktop world this idea is thoroughly ignored. I wanted to say that it's ignored in FOSS world, but that wouldn't be true: it's accepted and embraced by a lot of people, from Linus and kernel developers, to Rust developers, to Firefox developers and so on.

Only Linux desktop developers ignored it for years. Starting from the initial madness when every release of RedHat or Debian included new shared libraries and dropped old shared libraries. And even today something absolutely trivial and obvious, somehow, is not understood by a significant percentage of developers: don't break my apps, or I would find something else to run them!

Rust 1.78.0 released

Posted May 6, 2024 15:19 UTC (Mon) by mb (subscriber, #50428) [Link]

>don't break my apps

Hm, what about: Don't build apps with UB in them, if you want them to not break.

The app never worked. Its behavior was undefined from the very beginning. It only gave the correct results by pure coincidence.

The root cause of all the problems is neither that glibc changed behavior (under the given rules), nor that the app triggered UB.
The real problem is that the app development process did not detect the UB.

Yes, back then it was hard to detect such bugs. But today it is very easy, if you use the right tools (Rust).

Rust 1.78.0 released

Posted May 6, 2024 15:33 UTC (Mon) by khim (subscriber, #9252) [Link]

> Hm, what about: Don't build apps with UB in them, if you want them to not break.

I, as user, don't build apps. I use them. And I, as user, don't even know what UB is, I just know if app works or not.

> The app never worked.

Sure, if app crashed and was unstable before then I agree, it would probably be crashy and unstable on a new OS and with new library. Otherwise… nope. I had working OS and app, if upgrade of app breaks it then app is to blame, if upgrade of OS breaks app then OS is to blame. It's as simple as that.

> The real problem is that the app development process did not detect the UB.

Yes. But we still ended up with app that worked before OS upgrade and then stopped. OS was fixed to make it work again. And that's how things are supposed to work.

> Yes, back then it was hard to detect such bugs. But today it is very easy, if you use the right tools (Rust).

Rust makes things easier to detect, yes. But it's not a panacea and with unsafe can lead to all these same effects, of course.

The gray area is not the case of UB at all, nope. The gray area are various apps that are violating certain explicit and documented design specifications on purpose. Many anti-cheat or copy-protection modules do that.

In these cases where apps are borderline malicious I can accept that perfect compatibility is just impossible (and end users usually accept that, too). But if app worked reliably and developer of said app haven't done anything on purpose to make it extra-fragile it have to work after upgrade, too.

Because… why not? User needs something apps are providing, OS and computer that said app runs on are just the means to run these apps.

Rust 1.78.0 released

Posted May 6, 2024 17:13 UTC (Mon) by mb (subscriber, #50428) [Link]

>I had working OS and app

Yes. And it ever only worked in that exact combination.
Don't upgrade the OS, if you want to use an app that requires a feature that has never been a supported feature.
In fact, by marking the pointers restrict, it was an officially *not* supported feature that the app depended on.

>and developer of said app haven't done anything on purpose to make it extra-fragile

The developer introduced UB into the program execution. Which makes it a completely invalid program.
Whether that has happened on purpose or not, is not relevant.

>Because… why not?

Because improving performance is a perfectly valid reason to change internal implementations, as long as the new implementation still adheres to the API contract.

If your app violates the API contract, don't upgrade the lib.

Rust 1.78.0 released

Posted May 6, 2024 17:32 UTC (Mon) by khim (subscriber, #9252) [Link]

> Don't upgrade the OS, if you want to use an app that requires a feature that has never been a supported feature.

That's a fine proposal, except Linux desktop distros don't give you a choice there, either: a typical desktop computer lasts for 10 years, but very few desktop OSes are supported for that long. None in the desktop Linux realm AFAICS.

> In fact, by marking the pointers restrict, it was an officially *not* supported feature that the app depended on.

“What is a pointer?” a typical user would ask. And s/he would be right: who cares about what feature my app depends on if it works?

> The developer introduced UB into the program execution.

So what? That wasn't the developer's fault. When you have hundreds of UBs in your language and thousands in your library, introducing them is just way too easy.

Maybe if the OS had included some tools that made introducing UB hard then it would have worked… that's how we got that abomination called managed languages, which made development of Windows Vista a nightmare and then made Microsoft lose the mobile market.

It also destroyed Sun, but, strangely enough worked somewhat Ok for Google. Perhaps because Google haven't tried to create their own language, but picked the existing one.

> Because improving performance is a perfectly valid reason to change internal implementations, as long as the new implementation still adheres to the API contract.

Should I include links to dozen of Linus rants where he explains how contracts don't matter if real apps rely on details of the implementation? I can find quite a few.

This exact attitude is why Linux is the most popular kernel in existence yet desktop Linux one of the least popular OS in existence.

Even after billions spent on it.

> If your app violates the API contract, don't upgrade the lib.

Bundle it? Desktop linux makers start wailing, if you do that, too.

Well, not anymore, they have finally grown a bit, but it took decades. Literally. Thus missing a golden opportunity.

They are poised to get another one soon, when world would split into few separate regions and, hopefully, this time someone sane would be in charge. But they wasted about two decades fighting the obvious.

Rust 1.78.0 released

Posted May 6, 2024 18:05 UTC (Mon) by mb (subscriber, #50428) [Link]

>Linus rants where he explains how contracts don't matter if real apps rely on details of the implementation

So? It's forbidden to disagree with Linus? No, it's not, of course.
But in fact I actually *don't* disagree with Linus. He's right.

This is not just an implementation detail that apps depend on, though. It's UB. That is undefined by definition. Anything could have broken it at any time. Not only the libc itself. It only ever worked by pure luck.

If you start demanding that every UB that some app uses is now written in stone and can't be changed anymore, then you can't effectively make any change to anything at all anymore.

Apps do stupid things and apps receive their undefined behavior that they asked for.
Stupid apps must be fixed.

>This exact attitude is why Linux is the most popular kernel in existence

Supporting UB makes Linux the most popular kernel in existence? I doubt it. Very much.

Rust 1.78.0 released

Posted May 6, 2024 18:20 UTC (Mon) by khim (subscriber, #9252) [Link]

> This is not just an implementation detail that apps depend on, though. It's UB. That is undefined by definition.

Is it really UB if it behaved in exact same way for many decades, though? Hyrum's Law says someone, somewhere would depend on such a thing. Flash wasn't the only one such app, BTW, only the most prominent one.

> It only ever worked by pure luck.

It worked, on old version of GLibC, 100% reliably. How can you call something “pure luck” if it works 100 times out of 100? That's “implementation detail” at this point, not “pure luck”.

> If you start demanding that every UB that some app uses is now written in stone and can't be changed anymore, then you can't effectively make any change to anything at all anymore.

Yes, that's more-or-less the case. That's why OSes that are actually used by billions have so much redundancy and so many different ways of doing “the same thing”.

Fringe OSes that don't care about their users, like OpenBSD, may do things differently, of course.

> Apps do stupid things and apps receive their undefined behavior that they asked for.
> Stupid apps must be fixed.

Is “we no longer support your OS” an acceptable fix? Because that's what developers of fringe OSes with an attitude get as an answer.

> Supporting UB makes Linux the most popular kernel in existence? I doubt it. Very much.

There are many components for success. But if your OS doesn't even want to support apps properly then it would have no apps.

It may even be, in some sense, popular (e.g. Intel ships Minix on billions of systems, after all), it just would have no apps (and thus no users in traditional sense).

Rust 1.78.0 released

Posted May 6, 2024 18:43 UTC (Mon) by mb (subscriber, #50428) [Link]

>How can you call something “pure luck” if it works 100 times out of 100?

Because you are interpreting what I have said differently.

I didn't mean "pure luck at execution time". I meant "pure luck at implementation time".
Of course one implementation does the same thing over and over again, if you don't change anything.
I thought that would be obvious.

>> then you can't effectively make any change to anything at all anymore.
>Yes, that's more-or-less the case.

Ok. Got it.

So we must stop adding syscalls to Linux now. Because my app randomly calls syscalls with random identifiers. My app works fine, because it just gets ENOSYS back. No other effects. It's fine as-is.

*You* will break my app, if you add new syscalls! My app aint broken! It works 100 of 100 times on a current Linux! The kernel must not break my app.

Rust 1.78.0 released

Posted May 6, 2024 19:01 UTC (Mon) by khim (subscriber, #9252) [Link]

> Because you are interpreting what I have said differently.

I'm interpreting it like normal layman user would. S/he wouldn't care about UBs, documentation, restricted pointers and other such things. S/he would just look on what component broke everything.

And if app worked before 100% and doesn't work now, when only OS was changed and nothing else, then the culprit is obvious, isn't it?

> I didn't mean "pure luck at execution time". I meant "pure luck at implementation time".

What is “an implementation time” and why should I care about it? Why are you breaking my apps? They worked fine before.

That's what would you hear.

> Because my app randomly calls syscalls with random identifiers.

And now you are doing reductio ad absurdum without trying to understand anything. Yes, apps which are doing really silly things on purpose deserve to be broken. At some point you just have to give up and say “sorry, this app is too crazy to be supported, we would have to take the reputation hit”.

But that's very-very different from an app that was developed with honest attempt to make it behave, just with some silly dependency on some fringe property of your system.

Most app developers are not malicious (some copy-protection and anti-cheat software are borderline) yet they sure as hell are sloppy.

You either accept it and have apps and users. Or you don't accept it and then have no apps and no users (in the traditional sense: embedded use where apps are never provided by end-user is still possible).

> *You* will break my app, if you add new syscalls! My app aint broken! It works 100 of 100 times on a current Linux! The kernel must not break my app.

Show me an app, tell me how many users use it and then we would have something to talk about. Because if someone is deranged enough to create such an app then I would just laugh and ignore it. But if there are millions, or, worse yet, billions of users that may be affected… then I would have to care. Why do you think GetProductInfo is the same on Windows 10 and Windows 11? Too many apps would have been broken if it would have changed.

That's not a new phenomenon, SETVER was added in MS-DOS 5, decades ago.

Rust 1.78.0 released

Posted May 6, 2024 19:37 UTC (Mon) by mb (subscriber, #50428) [Link]

> At some point you just have to give up and say “sorry, this app is too crazy to be supported

Finally, we agree.

Rust 1.78.0 released

Posted May 7, 2024 16:52 UTC (Tue) by tialaramex (subscriber, #21167) [Link]

> *You* will break my app, if you add new syscalls! My app aint broken! It works 100 of 100 times on a current Linux! The kernel must not break my app.

Making people not do this sort of thing is why TLS 1.3 has GREASE and indeed Google have ensured it smears stuff in GREASE as appropriate.

Suppose one of your employees has Chrome, and they try to connect to popular web site https://pop.example/ with GREASE the Chrome browser may (more or less at random) choose to ask pop.example if it can do some arbitrary non-existent stuff, hey pop.example, can we Foozle Bingle 0x3057 ? No? That's fine. And likewise of course pop.example can propose equally non-existent extensions, Chrome can we Woop F4 Awooga? No? Don't worry about it then.

The reason to do this is that if we let them middleboxes will insist that nothing should ever change, like your hypothetical syscall allergic app. Their software is written by lazy incompetent people (ie humans) of course, so far this has meant we've always found a way to hide changes (e.g. TLS 1.3 is spelled as a TLS 1.2 resumption of a session which never existed) but rather than trust to always being lucky we should specifically set fire to software which incorrectly believes the protocol is frozen and can't evolve, and GREASE means customers do that for us.
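As a rough sketch of what tolerating GREASE looks like on the receiving side: RFC 8701 reserves sixteen values of the form 0x0A0A, 0x1A1A, … 0xFAFA for this purpose, and a well-behaved peer simply ignores anything it doesn't recognise. The `KNOWN` list and the function names below are made up for the illustration; this is not code from any real TLS stack.

```rust
/// True for the sixteen reserved GREASE values (0x0A0A, 0x1A1A, ..., 0xFAFA).
fn is_grease(value: u16) -> bool {
    let [hi, lo] = value.to_be_bytes();
    hi == lo && (lo & 0x0f) == 0x0a
}

/// Hypothetical set of extension IDs this peer understands
/// (0x0000 = server_name, 0x002b = supported_versions).
const KNOWN: [u16; 2] = [0x0000, 0x002b];

/// A tolerant peer keeps what it knows and silently ignores the rest,
/// rather than rejecting the whole handshake over an unknown value.
fn accepted_extensions(offered: &[u16]) -> Vec<u16> {
    offered
        .iter()
        .copied()
        .filter(|id| KNOWN.contains(id))
        .collect()
}

fn main() {
    // The peer sprinkles in a GREASE value alongside the real extensions.
    let offered = [0x0000, 0x3a3a, 0x002b];
    for id in offered {
        println!("{:#06x} grease={}", id, is_grease(id));
    }
    println!("accepted: {:x?}", accepted_extensions(&offered));
}
```

A middlebox that instead rejects any connection carrying an unknown value breaks immediately and visibly for its own customers, which is the point of the exercise.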

Rust 1.78.0 released

Posted May 8, 2024 11:03 UTC (Wed) by foom (subscriber, #14868) [Link]

If the app that did something like this was popular, Windows might add a compatibility mode, automatically triggered for that app only, which disabled the new syscalls which the app requires not to exist.

Both Microsoft's and Apple's OSes are chock full of those sorts of per-app compatibility hacks, because they both consider that they simply cannot release an update that breaks widely-used apps.

Rust 1.78.0 released

Posted May 6, 2024 17:19 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

> What you're describing doesn't make much sense. If you assume no aliasing you will break Flash. The Flash problem is precisely that they've got aliasing and they're calling an API where it's forbidden. Maybe there's a problem where you didn't grasp what aliasing is exactly?

Read it again. I'm saying that (Linus said that) there was no performance advantage if you assume no aliasing. Ergo, that assumption does nothing for you and can be discarded, at which point you have memcpy == memmove, which incidentally also fixes Flash.

Rust 1.78.0 released

Posted May 6, 2024 17:57 UTC (Mon) by khim (subscriber, #9252) [Link]

In some alternate reality this is what would have happened. But in our reality they still kept separate implementation for old apps.

From my understanding on some CPUs memcpy is even still implemented as memmove, but the effect is that Flash works.

Sure, there are fuzzy corner cases where breakage is confined to some obscure feature which nobody notices when it's broken (e.g. N_HDLC line discipline for TTY), but it's one thing to break something by accident and entirely different thing to break something on purpose.

It may be sorta-kinda-maybe acceptable for a library or a language (because the developer would, presumably, test things after an upgrade), but that's entirely unacceptable for the OS (at least as long as we normally run only one on our devices; in a world where dozens of OSes are routinely used simultaneously, this never-ever-break-things role would go to the hypervisor).

Rust 1.78.0 released

Posted May 6, 2024 18:03 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

I understand what you meant now, but it's extraordinary.

Do you have a quote from Linus I can look at ? I just can't imagine why he'd think this.

The memmove implementation has to do extra work, you can choose to do this anyway (which is what happens for older binaries to deliver the compatibility hack) but it's clearly a disadvantage.

Rust 1.78.0 released

Posted May 7, 2024 5:45 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

He has a lot of comments in this Bugzilla entry, but here's the one about performance: https://bugzilla.redhat.com/show_bug.cgi?id=638477#c132

Rust 1.78.0 released

Posted May 7, 2024 12:16 UTC (Tue) by tialaramex (subscriber, #21167) [Link]

Thanks for digging that out. Wow, angry Linus, it's nice that he's calmer these days. It seems to me that Linus agrees that yeah, actually it's slightly more expensive, but then (which would make a pattern around this thread) insists without measuring that doesn't matter because the cost of this is smaller (how much? No measurement) than the other thing (also not measured).

Rust 1.78.0 released

Posted May 9, 2024 0:22 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

It is quite absurd to argue that the cost of an extra few CPU cycles matters, when you are about to perform an operation that could easily take multiple microseconds or longer. The only case where this conceivably makes a difference is if you're doing lots of tiny memcpy calls. But the people who care about individual CPU cycles are not doing lots of tiny memcpy calls, because the function call overhead is ruinous anyway.

Rust 1.78.0 released

Posted May 6, 2024 9:58 UTC (Mon) by khim (subscriber, #9252) [Link]

> Your contention is that fixing the bug is wrong

No. My contention is that demanding that people fix something that was never broken in the first place is wrong. Take a look at the SRWLock documentation from 2022: this part is specifically there because SRWLock may give you exclusive ownership when you requested shared: “Shared mode SRW locks should not be acquired recursively as this can lead to deadlocks when combined with exclusive acquisition.”

It wasn't added as a response to bug report about std::shared_mutex, it was always there.

> Sometimes, all but one of your readers stalls for no reason.

Have you actually read what these people wrote? There is a reason for that behavior: when exclusive lock is not relinquished but moved to another user who requested shared access (but got exclusive instead) you avoid losing scheduling quantum. Whether this would be an optimisation or pessimisation depends heavily on what exactly you are trying to do, but I wouldn't be surprised to find out that in general it's an optimisation.

> It looks as though the Android bug you've linked was an immediately observable defect, and so correcting it meant breaking all the software which depended on the faulty behaviour.

Any change in the software may be characterised like that: if a change has no immediately observable defect, then how may it ever fix a bug? The question is always whether a change to the behavior is likely to trigger problems in real-world programs (and not just ex-post-facto programs written specifically after discovery of the “bug”).

> Was RWLock a typo for SRWLock, or are you claiming that people want this behaviour from Rust's std::sync::RWLock ?

Why do you think it may only ever be about Windows SRWLock or about Rust's std::sync::RWLock? No, I was talking about abstract concept. Windows people believe lock convoys are serious enough issue to fight them in making this pattern disallowed, C++ and Rust people think differently:
> If your readers do need to work together, this might sometimes deadlock as one reader gets the write lock, then waits forever for another reader.

Why do you think that C++ and Rust people have the right to make rules for something they haven't made?

> That's a pretty nasty bug

Yes, that's a nasty bug, but whose bug is that? I would say: not an SRWLock bug, for sure. SRWLock always warned about not taking locks recursively, but it didn't include a warning that if you take a lock in one thread and then make another thread take it, too… that's not actually recursion, but it would have the exact same deadlocking effect.

Simple documentation issue, basically.

> This is probably a rare pattern anyway, it's legal but I can't remember working on any software which relied on it.

And that's precisely the issue: it's illegal with SRWLock, legal (according to the documentation, anyway) with std::shared_mutex and Rust's std::sync::RwLock… why should it be fixed in the SRWLock, then? Especially if you can't remember working on any software which relied on it (me too, BTW).

> Am I missing a story and actually on my Linux today the memcpy is just memmove so that obsolete 1990s software which won't run still works?

No, they did exact same thing Android did: made sure old apps would get old, compatible, behavior while new ones would get new version.

Rust 1.78.0 released

Posted May 6, 2024 11:24 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

The 2022 documentation you linked says SRWLock isn't suitable for recursive use. Which is correct, and sure enough you'll see that std::shared_mutex and Rust's RWLock likewise are not suitable for recursive use.

It doesn't say oh, you might get silently given "upgraded" access which will arbitrarily disrupt performance of your software. It doesn't say that because that's not what anybody who wanted this specified. It may well not even have been understood (though they won't admit this now) by the people who wrote it.

You're correct that the authors now maintain they were concerned about lock convoys - a performance question. What do we tell anybody who has a performance question? Measure. Measure, measure, measure. Do you see any measurements? No. There aren't any measurements because this is only an excuse.

The history of SRWLock is that it's largely a copy-paste of some (NT) kernel code. That's a familiar story here on LWN (but with code from the Linux kernel). Of course in the kernel it's OK to have things with slightly weird behaviour, it's a small community you can all learn about the weirdness and you all benefit to some extent. Outside the kernel weird is bad. I would not be astonished if the NT kernel people have measurements showing their internal SRWLock-like feature has desirable behaviour including convoy breaking, but those measurements don't apply to userspace and of course their internal feature continues to evolve in parallel with SRWLock over the subsequent decade plus.

SRWLock's only real consumers are standard library folks, who want (duh) a reader-writer lock, not this weird kernel internal allegedly convoy breaking thing. Even documenting what this is now "for" is a challenge. The proposed C++ wording is to say these basically don't promise to be anything better than an ordinary mutex. Why bother? Lots of programmers will accordingly just not use them.

> why should it be fixed in the SRWLock, then

Because the thing people actually want is useful, and has been used, on a wide variety of platforms for decades. SRWLock could just deliver that, and it doesn't. The thing it chooses to do instead is very weird, and the excuse offered is "performance" but with no measurements whatsoever. Rust has therefore moved off SRWLock, and I'd guess that reluctantly STL will task somebody to go write a fresh C++ implementation too, leaving this orphaned.

Rust 1.78.0 released

Posted May 6, 2024 13:52 UTC (Mon) by khim (subscriber, #9252) [Link]

> The 2022 documentation you linked says SRWLock isn't suitable for recursive use.

No, it very clearly says it may lead to deadlock when combined with exclusive acquisition. Which happens precisely because of automatic upgrade from shared to exclusive.

> Rust's RWLock likewise are not suitable for recursive use.

Yes, but details are different: this function might panic when called if the lock is already held by the current thread is entirely different thing, the possibility of deadlock was never mentioned AFAICS.

> It may well not even have been understood (though they won't admit this now) by the people who wrote it.

It was most definitely understood correctly by people who wrote SRWLock (that's why they immediately reacted to it as a documentation deficiency), although, probably, not people who wrote std::shared_mutex and std::sync::RWLock.

> Do you see any measurements? No.

That's the pot calling the kettle black: you also asserted that this change made things worse without any measurements AFAICS.

> SRWLock's only real consumers are standard library folks

Where does that idea comes from? It was added to Windows Vista in 2006. Years before C++17 or Rust were dreamed of. And I assume there were tons of users in various game engines before standard libraries of either languages got appropriate types.

> who want (duh) a reader-writer lock, not this weird kernel internal allegedly convoy breaking thing

How do you know? Have you asked? Have you measured? You, yourself, have admitted that the pattern this behavior breaks is very uncommon and you have never seen it.

At least Microsoft developers have some kernel-internal benchmarks. You, on the other hand, loudly proclaim that everyone else should show you measurements, yet don't present any.

> Why bother?

That's the question that may be asked about any RWLock in general. The only reason to use RWLock 99% of the time is performance. And while you demand some benchmarks you are not presenting any.

This being said I suspect the actual best solution is to leave SRWLock alone and just use futexes (okay, Windows analogue). Like Rust already did and, hopefully, C++ would do.

Simply because, most of the time, the ability to take a lock without a roundtrip to the kernel would beat any optimizations that SRWLock may offer.

But, again, like always one has to do that without affecting existing apps. Issue that you ignore entirely. What you call “broken garbage that's kept kinda-sorta working by people like Raymond Chen” normal people call “something you may actually trust”.

And it's not dissimilar to heroic work of Linux kernel and/or GLibC maintainers (but yes, very different from what desktop Linux used to do and somewhat different from what Rust is doing now).

It would be interesting to see how Rust developers would behave when there would be actual billions of lines of software written in it and they would discover something strange in their standard library.

>> why should it be fixed in the SRWLock, then
>Because the thing people actually want is useful, and has been used, on a wide variety of platforms for decades

But the exact same thing may be said about SRWLock, too. Well, the variety of platforms may not be as wide as with pthread_rwlock, but it was used on a comparable number of systems, at least. And for almost two decades, too. And, in fact, for many years users of std::shared_mutex lived with this behaviour without complaints, even.

Thus it's definitely not a bug which deserves dropping everything and rushing to break working things to fix it in hurry.

> Rust has therefore moved off SRWLock

Rust hasn't moved off SRWLock because it's behaving like it does, though. This whole tempest in a teapot about SRWLock started way after Rust code that stopped using it was written. And that code was written because, outside of the kernel, something that doesn't use syscalls, most of the time, is preferable to something that uses them. Fine details of how SRWLock is behaving weren't the reason Rust stopped using it (has it actually dropped it, BTW? what does it do on Windows 7 now?).

Rust 1.78.0 released

Posted May 6, 2024 19:48 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

> It was most definitely understood correctly by people who wrote SRWLock (that's why they immediately reacted to it as a documentation deficiency)

Was it? You seem much more confident than they do. This "Optimisation" is a really weird thing to do as a conscious choice, it looks much more like a bug that wasn't caught to me. One reason to have the measurements is it shows intentionality. "We tried this, we expected this, here's what happened". I do not for a moment buy that they were trying to prevent convoys, they measured and this very weird "optimisation" fixed the convoys.

You can read articles by numerous people about SRWLock specifically (including Raymond Chen) and none of them describe this "optimisation". Because it's a bug.

Rust 1.78.0 released

Posted May 6, 2024 20:53 UTC (Mon) by khim (subscriber, #9252) [Link]

> This "Optimisation" is a really weird thing to do as a conscious choice

Yet basically impossible to implement by accident. Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth.

> You seem much more confident than they do.

How can I be even more confident than that:
…@adr26 reached out to me, as he worked on the SRWLOCK implementation before leaving Microsoft, and confirmed that this behavior, while not documented on Microsoft Learn, is a consequence of a performance optimization
… @cpkleynhans also confirmed this in OS-49268777
… NeillC originally implemented the original 2-phase unlock algorithm which can allow the shared->exclusive upgrade path some time after pushlocks were first created, but before Win7 (I can't remember how far back it was). I then expanded the use of 2-phase unlock to more cases in 2011. In 2011, I made hitting this more likely, but the risk had been there previously as well.

Lots of people chimed in and explained how that was an optimisation implemented on purpose and not a single one who claimed that it happened by accident.

> You can read articles by numerous people about SRWLock specifically (including Raymond Chen) and none of them describe this "optimisation".

Raymond Chen does describe that optimization, he even describes why it exists. It just doesn't tell one tiny detail: that if in that tiny window shared acquirer wins the race it gets exclusive lock as a prize. It may even be mentioned in discussion that followed, but that was lost when his blog was converted to a new format.

> Because it's a bug.

Sure. But where exactly? If thing is behaving like designed and not like documented then it's a documentation bug, isn't it?

And it's not as if this was done by gnomes in the dungeon who are no longer accessible, they have found the exact guys who added this behavior and discussed the issue with them… I guess you may argue that they all conspired against you and all that code was added as a result of use of illegal drugs, but at this point it's their words against yours and I would trust words of people who actually did the job more than some random guys who haven't touched it ever.

P.S. Also: documentation is not even wrong BTW, since it doesn't mention anywhere how many other shared readers there can be in parallel at any given time. It's just not entirely clear about lack of these guarantees. C++ documentation includes such guarantees, though, and they are incompatible with SRWLock which means that behavior is worth documenting.

And it was documented (in a sense that change was sent to the docs team, when would it be released is a different question).

Now the only issue remains as to what to do with STL and promises it gives to developers.

Rust 1.78.0 released

Posted May 7, 2024 14:06 UTC (Tue) by tialaramex (subscriber, #21167) [Link]

> It just doesn't tell one tiny detail: that if in that tiny window shared acquirer wins the race it gets exclusive lock as a prize

But that's the bug! That's the entire bug. Nobody is astonished that this happens if it's a writer and it wanted an exclusive lock, of course that gets an exclusive lock. Nobody is astonished that the unfair "sniping" event causes the fresh reader to get a reader lock even though there's at least one writer queued, that makes sense too (and Chen provides a link to explain it). The astonishment is that when there's *no writer at all* it gives a reader the exclusive lock!

The thing you consider "impossible" is just a mental lapse. After thinking so hard about how the SRWLock can keep track of state in the few available bits (squirrelled away in the bottom of a pointer) you might forget that you don't necessarily need to keep state about your own nature. The code written to steal locks is in a context where it knows whether it wanted to get a shared or exclusive lock - but as a result of this lapse it just doesn't, they both steal an exclusive lock.
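For readers who haven't met the "bits squirrelled away in the bottom of a pointer" trick, a minimal sketch follows. The mask width and the helper names `pack` and `unpack` are assumptions for illustration; this is not SRWLock's actual layout, which is not described in this thread.

```rust
/// Low four bits, which are always zero in a 16-byte-aligned address
/// (the width is an assumption for this sketch).
const FLAG_MASK: usize = 0b1111;

/// Combine an aligned address and a few flag bits into one word.
pub fn pack(addr: usize, flags: usize) -> usize {
    assert_eq!(addr & FLAG_MASK, 0, "address must be 16-byte aligned");
    assert_eq!(flags & !FLAG_MASK, 0, "flags must fit in the low bits");
    addr | flags
}

/// Split a packed word back into (address, flags).
pub fn unpack(word: usize) -> (usize, usize) {
    (word & !FLAG_MASK, word & FLAG_MASK)
}
```

With so few bits to spend, what the acquiring code wanted (shared or exclusive) has to come from its own context rather than from the lock word.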

It's the sort of lapse that a code review should have caught. But it didn't.

If you agree that (obviously) this isn't an "Optimisation" it's just a bug, it should be fixed. But fixing it would be annoying, somebody has to do a bunch of actual work. Hence the push back that somehow the C++ ISO documents should be changed. That's not work for the engineers who wrote the bug, that's just a bunch of paperwork for somebody else, much easier.

There are no gnomes here, no illegal drugs, just the two inevitable human problems, laziness and incompetence.

I have sometimes suspected other motives (especially greed) but I find that almost always what I see can be explained with just laziness and incompetence. At the Post Office for example, there's a good chance there was some greed in there, but you can explain all of it as laziness and incompetence if you just sprinkle on some deceit as well. Will any of them go to prison for that? Alas probably not, even though sufficient laziness and incompetence + deceit is indeed punishable and there's a clear public interest.

Rust 1.78.0 released

Posted May 7, 2024 14:11 UTC (Tue) by Wol (subscriber, #4433) [Link]

> Will any of them go to prison for that? Alas probably not,

There's a decent chance some people will lose their career over that, but seeing as I doubt they'll lose their pension, and they're probably close to retirement (if not retired), will that be any real punishment?

Some of them probably ought to be charged with manslaughter, but that's very unlikely :-(

Cheers,
Wol

Rust 1.78.0 released

Posted May 7, 2024 16:53 UTC (Tue) by paulj (subscriber, #341) [Link]

Jarnail Singh surely will face criminal proceedings after this inquiry. The behaviour of Rodric Williamson and Hugh Flemington also was awful, but they may get away with it, as they were not criminal lawyers. Crichton as GC surely has a lot to answer for, though she did at least try ensure Second Sight remained independent (so Vennells and the board pushed her out and then fired Second Sight). On the business side, who knows.

So much more to say. It is an incredible case. The banality of evil. From a tech perspective there should be huge ethical lessons for anyone who supplies software that manages data that must balance (stock, financial transactions, etc.), or software that could affect the integrity of such data.

But.. this is a Rust story.

Rust 1.78.0 released

Posted May 7, 2024 18:14 UTC (Tue) by khim (subscriber, #9252) [Link]

> But that's the bug!

It's done on purpose, though. So it's entirely not clear whether that's a bug in documentation or implementation.

And it doesn't contradict the existing implementation so it's pretty minor bug, if you'll ask me.

> The astonishment is that when there's *no writer at all* it gives a reader the exclusive lock!

But there is writer! That's the issue! This maybe-bug-maybe-not-bug happens when there is a writer! Just in different thread!

The “prize” given to the “lucky reader” is limited precisely to give said reader a chance to “try again” faster!

Because the horde of readers was denied access and only one reader was allowed to go before “unlucky writer” the hope is that said writer would get a chance to try his luck again faster!

> The code written to steal locks is in a context where it knows whether it wanted to get a shared or exclusive lock - but as a result of this lapse it just doesn't, they both steal an exclusive lock.

No, but this is EXPLICIT and CONSCIOUS optimization! And with PRETTY OBVIOUS AND UNDERSTANDABLE GOAL, to boot!

True, these benchmarks that they were doing on machines that were current 20 years ago may not be 100% correct today anymore and different users may benefit from that behavior more or less, but it's not as if that was done without some goal in mind.

The goal was reduction of lost data race effect for writers. Whether it's good enough effect to pay for with this “exclusive reader” side-effect is different question, but the idea behind this optimization is pretty clear: make writer “less bitter” for its loss of a race.

Rust 1.78.0 released

Posted May 8, 2024 12:25 UTC (Wed) by chris_se (subscriber, #99706) [Link]

> > But that's the bug!
>
> It's done on purpose, though. So it's entirely not clear whether that's a bug in documentation or implementation.
>
> And it doesn't contradict the existing implementation so it's pretty minor bug, if you'll ask me.

From my perspective, it does break the underlying definition of a reader-writer lock. If I ask for a shared lock, I expect that other threads are also able to obtain additional shared locks while I am holding that lock. But if my shared lock is actually exclusive, that assumption is broken.

This optimization by Microsoft doesn't work well if the shared lock is held for a long time (which is often the case when a rwlock is used), and it causes a deadlock if threads holding a shared lock are waiting for synchronization with other threads that are also supposed to hold shared locks. (Plenty of code out there does this.)

Now, I don't dispute that the design decisions Microsoft made created a data structure that can be useful for synchronization. It's just that this is a different data structure from a reader-writer lock that has been discussed in the computer science literature. Even other people within Microsoft (those who wrote Microsoft's STL implementation) were surprised by this behavior. And independently from them the same applied to the Rust developers.

Rust 1.78.0 released

Posted May 8, 2024 14:03 UTC (Wed) by khim (subscriber, #9252) [Link]

> This optimization by Microsoft doesn't work well if the shared lock is held for a long time (which is often the case when a rwlock is used)

If shared lock is held for a long time then how writers are ever supposed to get this lock?

And if shared lock is held for a long time then problem is writers starving would be even more acute which would, most likely, make that optimization even more important.

> it causes a deadlock if threads holding a shared lock are waiting for synchronization with other threads that are also supposed to hold shared locks

But SRWLock never promised such thing to be sound and reliable.

> Plenty of code out there does this.

Can you show us at least one example to gauge how often that actually happen?

> It's just that this is a different data structure from a reader-writer lock that has been discussed in the computer science literature.

Blame marketing guys. What Microsoft designed was called a pushlock. Then someone decided to expose that to userspace under the name of SRWLock. Perhaps that was a mistake, but it's hard to rename it today.

Rust 1.78.0 released

Posted May 10, 2024 10:40 UTC (Fri) by smurf (subscriber, #17840) [Link]

> If shared lock is held for a long time then how writers are ever supposed to get this lock?

That's immaterial to the current discussion. The reader will hold the lock for however long it does so, regardless of whether the lock was upgraded to a write lock behind the scenes. In fact the opposite is true: by artificially serializing readers you block the writer for even longer.

Also, I can write perfectly cromulent non-recursive code that depends on shared reads being, well, *shared* reads. If I want exclusive access I request that.

This is a prime example of the "it's not a bug, it's a feature" mindset that MS is justifiably famous for.

Rust 1.78.0 released

Posted May 8, 2024 22:58 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

> But there is writer! That's the issue! This maybe-bug-maybe-not-bug happens when there is a writer! Just in different thread!

Where is this writer? Look at the reproducer, it's not difficult C++. Spell out for us, where the writer is that nobody else can see? I can't see it. The author can't see it. STL can't see it. Where is this writer?

That's the crucial insight I think you've still missed. There is no writer! Nobody wants the exclusive lock or ever will again. But, the buggy code may give the exclusive lock to a "lucky" reader anyway, and in doing so it stalls all other readers.

I must say that when I try to follow the SRWLock / push lock algorithm I'm certain I don't understand why it's correct. Maybe it's badly documented, but maybe it's just wrong even aside from this "optimisation", it's striking how much easier to understand the (futex based, Rust) Linux RWLock code is.

RWLock is still subtle, I wouldn't recommend it to a beginner trying to learn how to read sync code, but with fewer bit flags and a typical Rust focus on writing what you mean I found it very readable.

Rust 1.78.0 released

Posted May 8, 2024 23:32 UTC (Wed) by khim (subscriber, #9252) [Link]

> That's the crucial insight I think you've still missed. There is no writer!

Of course there is. The main loop is constantly racing with all other threads for the writing lock. Usually it wins the race and gets its exclusive lock. Sometimes it loses the race and some other thread gets that lock. It's still exclusive to ensure that writer that lost the race would have a chance to get writer lock ASAP, but since it was handed to the reader that is not prepared to deal with exclusive lock hilarity ensues.

> Nobody wants the exclusive lock or ever will again.

In that case there is no race and the issue shouldn't be reproducible. Perhaps you have some other example in mind but the main example presented on all these links has a main thread which tries to get the writer lock in a loop and N helper threads that are playing ping-pong with shared access.

And sometimes the lock that was intended to end up with the main thread is “won” by some other thread.

Remove the writer from main thread and deadlock goes away.

> But, the buggy code may give the exclusive lock to a "lucky" reader anyway, and in doing so it stalls all other readers.

Yes, but it gives the writer a chance to try to win the race again. In many cases this may help writer to, finally, acquire that coveted writer's lock faster and do what it was supposed to do ASAP.

Rust 1.78.0 released

Posted May 9, 2024 0:30 UTC (Thu) by tialaramex (subscriber, #21167) [Link]

Oh! No, you've misunderstood. Each loop is making a completely new std::shared_mutex

Let's imagine three iterations, each will make a new std::shared_mutex, let's call them S, T and U

Iteration one makes S, it takes the exclusive (writer) lock on S, it makes five threads, each of them will (when it runs) try to take the shared (reader) lock on S, and then our main thread releases the writer lock on S. It so happens all of the reader threads were either queued waiting, or hadn't begun to try to take a reader lock, so things go swimmingly. All of them acquire the reader lock on S. Having all done so, they're done. The main thread reaps the corpses, and S is destroyed. Notice there was only ever a writer for S once, at the very start.

Next iteration, it's roughly the same, this time we make T, maybe this time all of the reader threads are eagerly begun and are queued. T's exclusive lock is released, the queued threads are awoken, they take the shared / read lock, and everything proceeds much the same again, eventually T is destroyed this time. Notice again, just one writer at the start and then nothing more.

Finally third iteration, makes U, takes exclusive lock on U, makes five new threads again. This time however, by a fluke one of the five reader threads happens to try to take a shared (reader) lock on U at just the moment after that exclusive (writer) lock was released by the main thread. Due to the "Optimisation" it actually steals the lock and so it doesn't know but it was "upgraded" to an exclusive lock. Notice there is no writer, and there never will be. The program is deadlocked.
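For anyone who'd rather see the shape of one iteration than read my prose, here's a rough Rust sketch of it. This is not the original C++ reproducer; `run_iteration` is a name I made up, and the `Barrier` stands in for "every reader waits until all of them hold the shared lock". On a correct reader-writer lock it returns the number of readers. On a lock that quietly hands one reader the exclusive lock it would hang at the barrier.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Barrier, RwLock};
use std::thread;

/// One iteration: fresh lock, writer held while readers are spawned,
/// writer released, then every reader must hold the shared lock at once.
pub fn run_iteration(readers: usize) -> usize {
    let lock = RwLock::new(()); // a completely new lock each iteration
    let barrier = Barrier::new(readers);
    let passed = AtomicUsize::new(0);

    thread::scope(|s| {
        // The only writer this lock will ever see.
        let writer = lock.write().unwrap();
        for _ in 0..readers {
            s.spawn(|| {
                let _shared = lock.read().unwrap();
                // Needs all readers inside the shared lock simultaneously.
                barrier.wait();
                passed.fetch_add(1, Ordering::SeqCst);
            });
        }
        drop(writer); // from here on there is no writer at all
    });

    passed.load(Ordering::SeqCst)
}
```

Calling `run_iteration(5)` in a loop mirrors the S, T, U iterations described above.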

It's a bug.

Rust 1.78.0 released

Posted May 9, 2024 8:21 UTC (Thu) by tialaramex (subscriber, #21167) [Link]

> It's a bug.

And checking my mail this morning, turns out Microsoft ended up agreeing, "Disable implicit upgrade for PushLock and SRWLock"

They did exactly what STL wanted, in their private git repo - some future Windows product (probably not a security patch for a Windows version which exists today, although that isn't ruled out) will fix this. This means STL can choose to say eh, product bug and move on with their life. Not their job to make C++ work any better on Windows than any other software does.

Rust 1.78.0 released

Posted May 9, 2024 8:40 UTC (Thu) by khim (subscriber, #9252) [Link]

> Oh! No, you've misunderstood. Each loop is making a completely new std::shared_mutex

Oh, you are right. They are promoting a lock which has expired, not the one that is waiting.

> It's a bug.

I would say it's more of “failed optimization”: if there is no writer that may try to win the race again then this whole dance doesn't provide any benefit, but just makes things slower.

It's even possible that 20 years ago it was an actual optimization with benchmarks and everything, but then, over the course of 20 years, it degraded into what we have today.

Rust 1.78.0 released

Posted May 8, 2024 12:05 UTC (Wed) by chris_se (subscriber, #99706) [Link]

> That's the question that may be asked about any RWLock in general. The only reason to use RWLock 99% of the time is performance. And while you demand some benchmarks you are not presenting any.

After having written quite a bit of concurrent code in C++ in the last couple of years, from my experience any time I had an implementation with a shared_mutex at some point, at a later point I refactored the entire piece of code, and got rid of the shared_mutex. Thinking about how the data structure could be designed differently and NOT using a shared_mutex always led to much more understandable code, and better performance.

I personally consider shared_mutex to be an anti-pattern nowadays. Given that Rust's design is typically very opinionated, I'm actually quite surprised the Rust standard library even implements this synchronization primitive.

Rust 1.78.0 released

Posted May 8, 2024 13:14 UTC (Wed) by atnot (subscriber, #124910) [Link]

I agree, I've also come to believe there's almost no valid use cases for RWLocks.

As a pitch it sounds great. It's like a lock, except you don't have to worry about readers blocking each other! You may even swap it out quickly and see a performance boost.

The problems start happening when you start adding more writes to the mix, and fairness. If your rwlock is not fair, your writes will just block, potentially forever. If it is fair, it needs to block every new read until all existing reads have ended and the write has been performed. That is, in the presence of writes, readers suddenly do block each other. This is a disaster waiting to happen.

What's worse, people usually look at RWlocks when there's a lot of contention. That is, there's a lot of readers and/or the critical sections are already very long. The rwlock just masks this problem, which in practice means your critical sections grow unchecked until someone decides to hit "delete all" at 3am and there's no time to rearchitect things. The number of writes you need to completely kill the performance of an rwlock is just shockingly low in practice.

The better solution in almost all cases is to either stop sharing that data, or to stop writing to it. Either by copying more, or using immutable objects with reference counts.
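As a minimal sketch of the "immutable objects with reference counts" option, assuming a plain `Mutex` is acceptable for the brief pointer swap: readers take a cheap snapshot and then work without holding any lock, while writers publish a whole new value. `Shared`, `snapshot` and `publish` are hypothetical names for illustration.

```rust
use std::sync::{Arc, Mutex};

/// Holds the current immutable value behind a reference count.
pub struct Shared<T> {
    current: Mutex<Arc<T>>,
}

impl<T> Shared<T> {
    pub fn new(value: T) -> Self {
        Self { current: Mutex::new(Arc::new(value)) }
    }

    /// Readers hold the mutex only long enough to bump a reference count.
    pub fn snapshot(&self) -> Arc<T> {
        let guard = self.current.lock().unwrap();
        Arc::clone(&*guard)
    }

    /// Writers swap in a fresh value; existing snapshots stay valid and
    /// are freed when their last holder drops them.
    pub fn publish(&self, value: T) {
        *self.current.lock().unwrap() = Arc::new(value);
    }
}
```

The critical section no longer grows with the amount of work a reader does, only with the cost of cloning an `Arc`.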

Rust 1.78.0 released

Posted May 8, 2024 15:09 UTC (Wed) by Wol (subscriber, #4433) [Link]

The system I worked on years ago had a variety of locks - NR/1W, NR&1W, 1W, NR&NW as I remember. Can't remember the details.

What I do remember is that when I wrote our accounting package, all the files were NR&1W and opened by default read-only. When the user hit "commit", I re-opened all the files read-write in strict order, and re-ran the transaction updating ledger by ledger. So everybody could work completely unhindered by anybody else, and only very rarely did people get a message complaining about a write collision because they'd both hit "post" at the same time.

And I took the attitude if people entering data didn't see consistent data, well it didn't matter. What mattered was the write was clean and atomic.

I guess, however, that the kernel is rather more complex than that :-)

Cheers,
Wol

Rust 1.78.0 released

Posted May 8, 2024 23:46 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

I don't think this type of lock promises fairness, and nor does Rust's Mutex. Fairness is expensive and rarely that important in practice.

The thing RWLock does on Linux today, and how SRWLock was intended to work on Windows (unless you believe khim), is a writer-priority reader-writer lock. Once a writer is queued, no further readers will be given the shared lock. No matter what happens, a writer gets the exclusive lock when all the current readers are done. That should be soon, and I agree it's a problem if poor design means you hold the read lock for long periods when you needn't.
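To spell out that admission policy, here's a single-threaded model of it. It only tracks who would be let in, with no real blocking or atomics, and it is not the standard library's code; `WriterPriority` and its methods are names I've invented for the sketch.

```rust
/// Bookkeeping for a writer-priority reader-writer lock (model only).
#[derive(Debug, Default)]
pub struct WriterPriority {
    active_readers: u32,
    writer_active: bool,
    queued_writers: u32,
}

impl WriterPriority {
    /// A reader is admitted only if no writer holds or is queued for the lock.
    pub fn try_read(&mut self) -> bool {
        if self.writer_active || self.queued_writers > 0 {
            return false;
        }
        self.active_readers += 1;
        true
    }

    pub fn release_read(&mut self) {
        assert!(self.active_readers > 0, "no reader holds the lock");
        self.active_readers -= 1;
    }

    /// A writer announces itself; from now on new readers are turned away.
    pub fn queue_writer(&mut self) {
        self.queued_writers += 1;
    }

    /// A queued writer gets the lock once the current readers have drained.
    pub fn try_grant_write(&mut self) -> bool {
        if self.queued_writers == 0 || self.writer_active || self.active_readers > 0 {
            return false;
        }
        self.queued_writers -= 1;
        self.writer_active = true;
        true
    }

    pub fn release_write(&mut self) {
        assert!(self.writer_active, "no writer holds the lock");
        self.writer_active = false;
    }
}
```

Note that nothing in this model ever turns a reader into the exclusive holder: `try_read` only increments the reader count.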

[SRWLock's bug means a reader might be given the exclusive lock even though a writer should be next...]

However I'd expect a great many systems actually thrive with a RWLock and if you didn't offer this type they'd be tempted (well maybe not in Rust) to "risk it" and just not bother locking, then pay the price when they get unlucky and a write in the middle of a read causes chaos. Or, if they were sufficiently risk averse to reject that, they'd use a Mutex and be very unhappy with the resulting performance as the Mutex serializes everything.

There are indeed fancy tricks you can do if you can't afford RWLock, Linux obviously uses Read-Copy-Update extensively internally, but Hazard Pointers are another strategy I like - however I don't agree with the idea that you should reach for such drastic measures immediately, they make sense if RWLock isn't cutting it, but they're pretty invasive and you're going to have to learn a whole lot more to use them.

Rust 1.78.0 released

Posted May 3, 2024 17:51 UTC (Fri) by tuna (guest, #44480) [Link]

"I have heard that they have started to, finally, learn (e.g. PipeWire is trying to be bug-to-bug compatible with PulseAudio), but damage is [mostly] done: most people who actually need something done have left linux desktop long ago (me including)."

I seriously doubt bug-to-bug emulation of Pulseaudio was a design target for Pipewire.

Rust 1.78.0 released

Posted May 3, 2024 18:10 UTC (Fri) by khim (subscriber, #9252) [Link]

> I seriously doubt bug-to-bug emulation of Pulseaudio was a design target for Pipewire.

Straight from their front page: Seamless support for PulseAudio, JACK, ALSA, and GStreamer applications.

I haven't used PipeWire thus I couldn't say if they achieved their goal or not, but it was definitely a design target. A good one.

Flatpak and Snap are also promising (it would have been even more promising if there would have been one format instead of two, but two are better than hundred).

Maybe, few years down the road, developers would start making apps for Linux and users would finally start using Linux.

But they wouldn't be using it if CADT developers would find another way to break things every year.

Rust 1.78.0 released

Posted May 3, 2024 18:55 UTC (Fri) by tuna (guest, #44480) [Link]

" > I seriously doubt bug-to-bug emulation of Pulseaudio was a design target for Pipewire.

Straight from their front page: Seamless support for PulseAudio, JACK, ALSA, and GStreamer applications."

"Seamless support" != bug-to-bug reproduction. It was a very smooth transition and great work by all devs and distros involved!

Rust 1.78.0 released

Posted May 3, 2024 18:25 UTC (Fri) by intelfx (subscriber, #130118) [Link]

> but damage is [mostly] done: most people who actually need something done have left linux desktop long ago (me including)

Perhaps you should leave this site then, too, and save us all your grandstanding?

Rust 1.78.0 released

Posted May 3, 2024 19:20 UTC (Fri) by khim (subscriber, #9252) [Link]

> Perhaps you should leave this site then, too, and save us all your grandstanding?

Why should I? I still use Linux, mostly as part of Android, but also as part of development environment in WSL. Just not all that CADT nonsense.

Let's stop this here

Posted May 3, 2024 19:29 UTC (Fri) by corbet (editor, #1) [Link]

I really don't think that the conversation is helped by throwing around insulting terms and linking to 20-year-old jwz rants. Let's not do that anymore. Then perhaps we can all get along and nobody needs to leave...?

Making unsafe code much safer

Posted May 3, 2024 14:39 UTC (Fri) by CChittleborough (subscriber, #60775) [Link]

Most (all?) unsafe functions have rules callers need to observe, which should be (1) spelled out in the function documentation and (2) checked using assertions. But assertions are not checked in release builds. Worse still, development builds were linked against release builds of the standard library in which those assertions were compiled out.

The new assert_unsafe_precondition! macro lets coders write unsafe code which will check arguments properly in development releases.

Enabling automatic checking of these preconditions in development releases seems like a big win to me, though of course it may take some time for libraries to take advantage of this.

Making unsafe code much safer

Posted May 3, 2024 16:55 UTC (Fri) by bluss (subscriber, #47454) [Link]

I wanted to clarify that in Rust the rule is that anything named `assert!` is checked in all compilation modes and `debug_assert!` is checked when debug assertions are enabled (typically debug builds).

Here's an easier to read doc link for that macro: https://doc.rust-lang.org/nightly/core/macro.assert_unsaf... (nightly docs, link will break when/if it changes name.)

Making unsafe code much safer

Posted May 4, 2024 0:08 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

While such checks are desirable, it's a bit much to say they *should* be checked. Some of the unsafe pre-requisites are just difficult facts about how the machine works, either the physical target machine or the LLVM abstract machine. Pointer validity is an example in a few of these APIs. The documentation typically says you must provide a "valid pointer" and it links a fairly wordy and vague document which can be informally summarised as an exasperated sigh - no garbage you've made probably isn't a "valid pointer".

There's no sane way to "check" the pointer is "valid". You as caller promise this is so, and if you're lying then probably terrible things could happen which is why it's a pre-requisite.

The reason to ask for "valid" pointers is that it's very easy in Rust for anybody (even safely) to mint garbage invalid pointers. After all mechanically they're the same as integers (on platforms Rust supports). These are not valid, and they won't be valid even if you wish hard. However, they might work anyway (unlikely on CHERI), and as we see in another part of this thread, some programmers believe if you got lucky once everybody else in the universe owes it to you to make your software work forever for some reason. Rust does not provide such an insane promise.
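A tiny sketch of what "minting" looks like, with hypothetical helper names and an arbitrary address: nothing here needs `unsafe`, because creating and inspecting a raw pointer is harmless. Only dereferencing it would need `unsafe`, and for this pointer that would be a broken promise.

```rust
/// Safe code: builds a raw pointer from an arbitrary integer.
/// The address is made up and points at nothing we own.
pub fn mint_garbage_pointer() -> *const u32 {
    0xdead_beef_usize as *const u32
}

/// Safe code: looking at the address is fine. Reading through the
/// pointer would require `unsafe` and a validity promise we can't keep.
pub fn address_of(p: *const u32) -> usize {
    p as usize
}
```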

Making unsafe code much safer

Posted May 4, 2024 14:44 UTC (Sat) by CChittleborough (subscriber, #60775) [Link]

Fair enough: "should" is too strong here. I was possibly drawing too heavily from chapter 22 of Programming Rust, 2nd Edition, which argues that unsafe features impose contracts that the compiler won't enforce and are typically "just explained in the feature’s documentation".

Though for the specific case of pointers, you can and probably should check for null and misalignment, and assert_unsafe_precondition! lets you do that only in development builds if you think the overhead too big for releases. OTOH, functions which require (for example) two pointers that must both be in a single heap object are in the Too Hard basket.
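As a rough sketch of the cheap checks, assuming ordinary user code where `debug_assert!` is the closest stand-in for the standard library's macro: `read_at` is a hypothetical function, and the checks below vanish when debug assertions are disabled. Swapping in `assert!` would keep them in every build.

```rust
use std::mem::align_of;

/// Reads the `index`-th `u32` from `ptr` (hypothetical example function).
///
/// # Safety
/// `ptr` must be non-null, aligned for `u32`, and point to at least `len`
/// readable `u32` values, and `index` must be less than `len`.
pub unsafe fn read_at(ptr: *const u32, len: usize, index: usize) -> u32 {
    // Checkable parts of the contract; only compiled in when debug
    // assertions are enabled.
    debug_assert!(!ptr.is_null(), "read_at: null pointer");
    debug_assert!(
        (ptr as usize) % align_of::<u32>() == 0,
        "read_at: misaligned pointer"
    );
    debug_assert!(index < len, "read_at: index out of bounds");

    // SAFETY: the caller promises the conditions listed above. Whether the
    // memory is really live and readable cannot be checked here.
    unsafe { *ptr.add(index) }
}
```

The "is this memory actually valid" part of the contract stays unchecked, which is the Too Hard basket again.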

I also quite agree about Rust avoiding insane promises!

Making unsafe code much safer

Posted May 7, 2024 7:24 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

> Rust does not provide such an insane promise.

To be fair, neither does C.

I think the main difference is the scope of this lack-of-a-promise. In C, pretty much every nontrivial API takes pointers, and pretty much no nontrivial API is prepared to deal with an "invalid" pointer (for any value of "invalid," with the possible exception of NULL). So you inevitably run into Hyrum's law (i.e. people write code that does something invalid, it accidentally works, time passes, more people write more dubious code, and now it's part of your API contract even though you never signed up for it). Rust does not work that way. The language statically checks the validity of references, and at runtime, references are (usually*) a zero-cost abstraction, so pretty much every nontrivial API uses references in preference to raw pointers, unless you are doing something where unsafe pointers are absolutely required (e.g. interfacing to C or another language), or significantly more performant than the safe equivalent. This makes it much harder for pointer validity to become an optional part of the API - the language won't let you construct an invalid reference, much less pass it as an argument.

You also can't cheat your way around this by casting or otherwise. The language assumes that shared references point to immutable objects, and that mutable references are exclusive (restrict, in C terminology), and both of those assumptions are pervasively emitted in LLVM IR, so if you break them, then it will all fall apart very quickly with numerous bizarre aliasing violations, even if you personally wrote 100% of the code. There is simply no way to "get away with" invalid references for more than the most trivial of cases, and raw pointers are right there if you really want them, so most people are content to use references correctly... most of the time. Unfortunately, there are still a few infelicities in unsafe rust where the most convenient way to create a possibly-invalid pointer is to first create a possibly-invalid reference and then cast it to pointer. This is in the process of being fixed, and will soon no longer be a problem (hopefully).
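One case where going through a reference is the wrong tool is a field of a packed struct, which may be misaligned. A minimal sketch, assuming that `std::ptr::addr_of!` is one of the tools relevant here (the comment above does not name any): the macro produces a raw pointer directly, with no intermediate reference. `Header` and `read_length` are hypothetical names.

```rust
use std::ptr::addr_of;

/// Packed layout: `length` sits at offset 1, so it is not aligned for `u32`.
#[repr(C, packed)]
pub struct Header {
    pub tag: u8,
    pub length: u32,
}

pub fn read_length(h: &Header) -> u32 {
    // `&h.length` would create a reference to a misaligned field, which the
    // compiler rejects. `addr_of!` yields a raw pointer without doing that.
    let p: *const u32 = addr_of!(h.length);
    // SAFETY: `p` points into `*h`, which is live for this call, and
    // `read_unaligned` does not require alignment.
    unsafe { p.read_unaligned() }
}
```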

* The exception is dyn Trait and other fat pointer types. But those are generally negligible-cost, like C++'s virtual keyword.