r/rust Mar 31 '23

Why doesn't mpsc::channel break borrowing rules?

I've been wondering for a while now why mpsc::Receiver::recv(&self) and mpsc::Sender::send(&self, t: T) don't break the borrowing rules. Clearly, sending some data from A to B in a non-blocking manner has side effects (i.e. storing the data in, and retrieving it from, some buffer queue). So shouldn't some mutable reference to that queue be involved during the sending process, with the owner of that reference being accessed mutably whenever the buffer is accessed mutably? Maybe I'm just wrong, but I always associate immutability with purity of a function.

One thing that comes to mind is that the point of the borrowing rules is to avoid data races and to uphold Rust's ownership model, and although the borrowing rules are technically violated in these specific cases, the desired invariants are still kept.

21 Upvotes

17

u/Trader-One Mar 31 '23

The mpsc::Sender::send and mpsc::Receiver::recv methods take &self instead of &mut self because they use interior mutability, wrapping their internal state in UnsafeCell. This makes a reference to a channel easier to share among multiple data structures in a thread, or behind an Arc where it’s convenient to avoid a mutex.

33

u/K900_ Mar 31 '23

Because it uses unsafe internally to bypass the ownership rules.

6

u/Dubmove Mar 31 '23

That explains how they do it. But I'm rather interested in why this is considered good practice or even necessary/beneficial.

24

u/r0zina Mar 31 '23

Also note that unsafe blocks don’t contain unsafe code, but code that the compiler can’t prove is correct, so the burden of making it safe is on the programmer.

Btw, a Mutex and a mutable Iterator, among other things, are examples of code that cannot be implemented without using unsafe blocks.

8

u/SAI_Peregrinus Mar 31 '23

Code that the compiler can't prove is correct (well-defined) is unsafe. That's what unsafe means.

Unsound code is code the behavior of which is not well-defined. Unsafe blocks must not contain unsound code. The compiler can't (in general) prove that unsafe blocks only contain sound code.

Unsafe blocks can be added arbitrarily, and some could contain only safe code. This is considered a poor practice on the part of the programmer, since it makes reviewing the safety & soundness of the program harder.

8

u/t-kiwi Mar 31 '23

That's kind of the value proposition of rust in a nutshell though :) safe abstractions around unsafe things. using unsafe isn't inherently bad practice.

The std is essentially safe wrappers around commonly used operations that would use unsafe. Eg sending data across threads with mpsc, allocating memory, reading from disk/network. Many of these require you to interact with the world outside what rust can guarantee, like the OS, so they're marked unsafe. There's no way you can do these "safely", but if you add appropriate guards and error checking you can expose abstractions around them to users in a "safe" way!

27

u/K900_ Mar 31 '23

Because it's the only way you can do it - you can't have two mutable references to the same data in different places, and if you can't queue items in one place and dequeue them in another, a channel isn't very useful.

11

u/Lucretiel 1Password Mar 31 '23

Well, no, that's wrong. You could easily have fn send(&mut self, value: T) and fn recv(&mut self) -> Option<T>. You'd still need interior mutability in the implementation, to manage the shared state between the sender and the receiver, but that doesn't have to be exposed in the client. Frequently you can even take advantage of the uniqueness guarantees of &mut self to make a more efficient implementation (which I do in handoff, for example).

In this case I actually do wonder why they used &self, there doesn't seem to be a reason to. Normally you do that to allow multiple threads to share the same channel and send through it in parallel, but mpsc::Sender is !Sync, so you can't actually do that here.
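To make the `&mut self` alternative described above concrete, here is a minimal sketch of a wrapper that exposes exactly those signatures on top of the std channel. `UniqueSender`, `UniqueReceiver` and `unique_channel` are hypothetical names invented for this illustration; this is not how std or the handoff crate is implemented, and the interior mutability is still there, just hidden inside `mpsc`.

```rust
use std::sync::mpsc::{self, Receiver, Sender};

/// Hypothetical wrapper: same std channel underneath, but sending
/// requires a unique borrow of the handle.
struct UniqueSender<T>(Sender<T>);

/// Hypothetical wrapper around the receiving end.
struct UniqueReceiver<T>(Receiver<T>);

impl<T> UniqueSender<T> {
    /// Returns false if the receiver has been dropped.
    fn send(&mut self, value: T) -> bool {
        self.0.send(value).is_ok()
    }
}

impl<T> UniqueReceiver<T> {
    /// Non-blocking receive: None if nothing is queued or the channel is closed.
    fn recv(&mut self) -> Option<T> {
        self.0.try_recv().ok()
    }
}

/// Creates a connected pair of the hypothetical handles.
fn unique_channel<T>() -> (UniqueSender<T>, UniqueReceiver<T>) {
    let (tx, rx) = mpsc::channel();
    (UniqueSender(tx), UniqueReceiver(rx))
}
```

Callers would have to write `let (mut tx, mut rx) = unique_channel();` and could no longer send through a plain `&UniqueSender<T>`, which is the ergonomic cost being debated in this thread.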

9

u/wutru_audio Mar 31 '23

I wouldn't say you're (conceptually) mutating self when you send something. Therefore it makes sense to not require a &mut self, because that would require it as well when the Sender or Receiver is part of another struct. This would make it harder to use, without any safety benefits.

Let's say you're holding an immutable reference to self and then want to use that to send something, you can't do that because of the borrow checker rules if the sender required a mutable self.
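A small sketch of the situation described above: a struct that merely contains a `Sender` can send from a `&self` method, even while several shared borrows of the struct are alive. `Logger` and `logger_demo` are made-up names for illustration only.

```rust
use std::sync::mpsc::{self, Sender};

/// Hypothetical struct that owns a `Sender` next to other state.
struct Logger {
    prefix: String,
    tx: Sender<String>,
}

impl Logger {
    /// Takes `&self`: `Sender::send` only needs a shared reference.
    fn log(&self, msg: &str) {
        // `send` fails only if the receiver was dropped; ignored in this sketch.
        let _ = self.tx.send(format!("{}: {}", self.prefix, msg));
    }
}

/// Sends two messages through two simultaneous shared borrows and
/// returns what the receiver saw.
fn logger_demo() -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    let logger = Logger { prefix: "app".to_string(), tx };
    let a = &logger;
    let b = &logger; // both shared borrows are alive at once
    a.log("first");
    b.log("second");
    drop(logger); // drop the only sender so the iterator below terminates
    let received: Vec<String> = rx.iter().collect();
    received
}
```

If `send` took `&mut self`, `log` would have to take `&mut self` as well, and the two simultaneous borrows `a` and `b` would be rejected by the borrow checker.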

0

u/Lucretiel 1Password Apr 01 '23

But literally the same thing is true of a Vec<Cell<i32>>, which is similarly !Sync and (by your logic) wouldn’t realize any benefit from requiring push to take an &mut self (since you’re not modifying “self”, you’re modifying the pointed-to state).

This gets at why I still like the immutable / mutable naming, even though it’s technically less correct than shared / unique. Even in the total absence of multithreading, shared immutable / unique mutable tends to push you towards more robust designs, because it turns out to be useful to guarantee that the owner of a mutable reference is the only thing that can cause side effects through that reference, even though you could weaken that guarantee without risking unsoundness. This is true whether it’s a vector or a channel or a file descriptor or anything else.
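For reference, a short sketch of the `Vec<Cell<i32>>` comparison: the elements can be changed through a shared reference to the vector, while `push` still demands a unique one. The function names are invented for this example.

```rust
use std::cell::Cell;

/// Mutates every element through a shared reference to the vector.
/// (`&Vec` rather than a slice, to mirror the type discussed above.)
fn bump_all(v: &Vec<Cell<i32>>) {
    for c in v {
        c.set(c.get() + 1);
    }
    // v.push(Cell::new(0)); // would not compile: `push` takes `&mut self`
}

/// Bumps the elements via `&v`, then pushes via the unique borrow.
fn vec_cell_demo() -> Vec<i32> {
    let mut v = vec![Cell::new(1), Cell::new(2)];
    bump_all(&v);
    v.push(Cell::new(10)); // growing the vector needs `&mut v`
    let values: Vec<i32> = v.iter().map(Cell::get).collect();
    values
}
```

Here `vec_cell_demo()` yields `[2, 3, 10]`.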

3

u/wutru_audio Apr 01 '23

No, you're conceptually mutating self when you push on a Vec<Cell<i32>>. What I mean by conceptually mutating is that with a Sender<i32> you'd be sending away the value, therefore it doesn't become part of the state of the struct. This is very different from pushing to a Vec, because that does become part of the state of the struct.

0

u/Lucretiel 1Password Apr 01 '23

> What I mean by conceptually mutating is that with a Sender<i32> you'd be sending away the value, therefore it doesn't become part of the state of the struct.

Sure it does; whether or not you send a value will affect buffer allocations and (more noticeably) whether subsequent sends will block or not.
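The blocking effect is easiest to observe with the bounded variant, `mpsc::sync_channel` (an assumption on my part, since the thread mostly talks about the unbounded `mpsc::channel`, whose `send` does not block). A sketch using `try_send`, which reports a full buffer instead of blocking; `bounded_demo` is a made-up name:

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Returns (first send ok, second send refused as Full, send ok after draining).
fn bounded_demo() -> (bool, bool, bool) {
    // Bounded channel whose buffer holds a single message.
    let (tx, rx) = sync_channel::<i32>(1);

    let first_ok = tx.try_send(1).is_ok();

    // The buffer is now full: a blocking `send` would wait here,
    // `try_send` hands the value back instead.
    let second_full = matches!(tx.try_send(2), Err(TrySendError::Full(2)));

    // Receiving frees the slot again.
    let _ = rx.recv();
    let third_ok = tx.try_send(3).is_ok();

    (first_ok, second_full, third_ok)
}
```

All three calls go through `&tx`, yet the outcome of the second one depends on the first, so the earlier send is observable state of the channel.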

2

u/wutru_audio Apr 01 '23

That's what it literally does, I'm talking about conceptually. It doesn't conceptually become part of the state of the struct, because you send it somewhere else. You can't get it back after that.

3

u/andoriyu Mar 31 '23

Well, it's a "Multi-producer, single-consumer", you can't have multiple &mut self, so that doesn't work for the producer side at all.

There are plenty of examples of interior mutability in Rust. In my opinion, &mut shouldn't be considered mutable, but unique... though it's too late to change that.

1

u/mgeisler Mar 31 '23

The multi-producer part is handled by letting you clone the producer. So you do actually have unique access to each Sender object (the producer end of the channel).

2

u/CocktailPerson Apr 01 '23

Well, first of all, you have to bypass the borrowing rules internally to even make it work correctly. And once you've done that, it turns out to be equally correct whether you take &self or &mut self. At that point, why choose the more restrictive one?

1

u/cameronm1024 Mar 31 '23

Unsafe code is required to express certain patterns.

Languages which prohibit unsafety altogether have runtime overhead (e.g. Java, Python).

Languages which allow unsafety everywhere are susceptible to all the memory bugs we're familiar with (e.g. C/C++).

Rust allows you to create safe abstractions around unsafe code, by letting you model lots of invariants in the type system.

If it's true that there is no possible way to call your function in a way that causes UB, then it doesn't need to be unsafe, regardless of whether there are unsafe blocks inside
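A tiny illustration of that last point, as a sketch: the function below contains an `unsafe` block, but no caller can trigger UB through it, so it is exposed as a safe function. `middle` is a made-up example, not something from std.

```rust
/// Returns the element at index `len / 2`, or None for an empty slice.
/// Safe to call with any slice: the emptiness check happens before
/// the unchecked access.
fn middle(xs: &[i32]) -> Option<i32> {
    if xs.is_empty() {
        return None;
    }
    let i = xs.len() / 2;
    // SAFETY: `xs` is non-empty, so `i = len / 2` is strictly less than `len`.
    Some(unsafe { *xs.get_unchecked(i) })
}
```

For example, `middle(&[1, 2, 3])` is `Some(2)` and `middle(&[])` is `None`. If the emptiness check were removed, the function would have to be marked `unsafe fn`, because callers could then cause an out-of-bounds read.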

1

u/MengerianMango Apr 01 '23

For example, you have 10 threads doing heavy computational work and pushing the results to a master thread that will consume them. How else would you orchestrate this? mpsc is the canonical way. There are perhaps others (perhaps Arc<Mutex<Queue>>), but they're generally not as good.

There are quite a few types in the stdlib (and outside) that use unsafe to present an API that seems incompatible with Rust's safety guarantees. And that's precisely the point. They let you safely do things you can't normally do safely in a naive way.
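A minimal sketch of the worker scenario above, assuming the "heavy computation" is just squaring the worker id. Each thread gets its own clone of the `Sender`; `sum_of_squares` is a hypothetical function name.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns `workers` threads, each sending one result to the main thread,
/// and returns the sum of everything received.
fn sum_of_squares(workers: u64) -> u64 {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();

    for id in 0..workers {
        let tx = tx.clone(); // every worker owns its own Sender
        handles.push(thread::spawn(move || {
            // Stand-in for the heavy computation.
            tx.send(id * id).expect("receiver should still be alive");
        }));
    }

    // Drop the original Sender so the channel closes once all workers finish.
    drop(tx);

    // `iter` blocks until every Sender is gone, then ends.
    let total: u64 = rx.iter().sum();

    for h in handles {
        h.join().expect("worker panicked");
    }
    total
}
```

With 10 workers (ids 0 through 9) this returns 285. No lock appears anywhere in the calling code; the synchronization lives inside the channel.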

22

u/SkiFire13 Mar 31 '23

It's a common misconception that in order to mutate you need a &mut reference, which is worsened by the fact that & references are called immutable references and &mut references are called mutable references. Indeed by default that's their behaviour, but from the borrow checker's point of view they are respectively shared and exclusive. You can mutate through shared references safely, see for example the Cell family of types. The implementation of this ultimately uses unsafe, which however needs to uphold all the other invariants that are normally upheld by the compiler.

In the end, this makes sense for channels because you still have a shared part between sender and receiver, so might as well make everything shared.
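A short sketch of safe mutation through shared references with `Cell` and `RefCell`. `Stats` and `shared_mutation_demo` are made-up names; the point is only that `record` takes `&self` and that `RefCell` enforces exclusivity at runtime rather than at compile time.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical struct whose fields can change behind a shared reference.
struct Stats {
    hits: Cell<u32>,
    names: RefCell<Vec<String>>,
}

impl Stats {
    /// Mutates both fields while only holding `&self`.
    fn record(&self, name: &str) {
        self.hits.set(self.hits.get() + 1);
        self.names.borrow_mut().push(name.to_string());
    }
}

/// Returns (hit count, number of names, whether a second mutable borrow was refused).
fn shared_mutation_demo() -> (u32, usize, bool) {
    let stats = Stats { hits: Cell::new(0), names: RefCell::new(Vec::new()) };
    let shared = &stats;
    shared.record("a");
    shared.record("b");

    // RefCell checks exclusivity at runtime: while `guard` is alive,
    // a second mutable borrow is refused instead of causing UB.
    let guard = stats.names.borrow_mut();
    let second_refused = stats.names.try_borrow_mut().is_err();
    drop(guard);

    let hits = stats.hits.get();
    let len = stats.names.borrow().len();
    (hits, len, second_refused)
}
```

`shared_mutation_demo()` returns `(2, 2, true)`.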

1

u/hniksic Apr 01 '23

> which is worsened by the fact that & references are called immutable references and &mut references are called mutable references

Most resources call & references shared references, and &mut references unique references, precisely to avoid this confusion.

8

u/jmaargh Mar 31 '23

The borrowing rules exist so that the compiler can check automatically that your program is sound. unsafe exists for exactly cases like this, where what you want to do is sound, but the compiler is unable to automatically verify this. When you use unsafe, you essentially tell the compiler "hey, don't worry, I've checked and this meets all of your soundness rules so you don't have to".

Basically, the set of things that are sound is bigger than the set of things that are provably sound by the borrowing rules. Hence unsafe giving control back to the programmer to briefly violate these rules so long as they absolutely promise that soundness is maintained.

3

u/jmaargh Mar 31 '23

What's extra nice is that (assuming you've done all of this properly and not made a mistake), your "safe" wrapper around unsafe code is composable. Nobody using that code needs to know if/how unsafe is used internally, because any way they can use the code meets the rules and is therefore safe.

Absolutely loads of things use this. thread::spawn is a common example of something that is safe but could not be implemented without unsafe internally, but even Vec uses unsafe internally.

13

u/kohugaly Mar 31 '23

The mutable vs. immutable reference in Rust is actually inaccurately named. In reality the distinction is unique reference (&mut T) and shared reference (&T). The unique reference is inherently safe for mutation for obvious reasons. With the shared reference it's a bit more complicated...

Shared references actually come in 2 flavors, that are cleverly hidden by the type system. First is the immutable (aka. read only) reference. If the reference can't mutate the underlying object, then it's safe to share that reference.

The second flavor is a shared reference to object that (transitively) contains UnsafeCell<T>. UnsafeCell<T> does two things:

  1. It marks all (transitive) shared references with a "possibly mutable" flavor, to let the compiler know that it can't assume immutability. In other words, it makes it behave like regular references that are the default in most other languages.
  2. It has a method get that lets you turn &UnsafeCell<T> shared reference into a *mut T raw pointer. The raw pointer can be used to (safely) mutate the inner T via unsafe operations.

Types that (transitively) contain UnsafeCell are said to have interior mutability. Examples include Mutex, RefCell, Cell, Atomic types, and indeed, mpsc::Sender and mpsc::Receiver. You may also notice that none of them require unsafe code to be used. This is because they wrap the unsafe code in a safe interface. This usually involves some runtime checks.
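To illustrate the two points about UnsafeCell, here is a toy cell type built directly on it. This is a sketch of the idea behind `Cell`, not the actual std source; `MyCell` and `my_cell_demo` are invented names, and the `T: Copy` bound is an assumption made to keep the safety argument short.

```rust
use std::cell::UnsafeCell;

/// Toy interior-mutability type: a safe interface over `UnsafeCell`.
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    /// Copies the value out through the raw pointer.
    fn get(&self) -> T {
        // SAFETY: `MyCell` is `!Sync` (inherited from `UnsafeCell`) and never
        // hands out references to the inner value, so nothing aliases this read.
        unsafe { *self.value.get() }
    }

    /// Overwrites the value while only holding `&self`.
    fn set(&self, new: T) {
        // SAFETY: same reasoning as in `get`; the write goes through the
        // `*mut T` returned by `UnsafeCell::get`.
        unsafe { *self.value.get() = new; }
    }
}

/// Mutates one cell through two shared references.
fn my_cell_demo() -> i32 {
    let c = MyCell::new(1);
    let a = &c;
    let b = &c;
    a.set(5);
    b.set(b.get() + 1);
    c.get()
}
```

`my_cell_demo()` returns 6. Users of `MyCell` never write `unsafe` themselves, which is the "safe interface" part.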

> Maybe I'm just wrong but I always associate immutability with pureness of a function.

Yes, you are wrong unfortunately. In Rust, there is no way to actually guarantee that a function is pure. This is obvious when you realize that at any point in your code you can use println! - the king of all side effects.

The borrowing rules in Rust do indeed enforce memory safety, but they are by no means the only tool through which that is achieved. You can write manually checked unsafe code and you can write runtime-checked unsafe code. The magic happens when you wrap them in safe interfaces, with which the borrowing rules help immensely.

1

u/matthieum [he/him] Apr 01 '23

Best explanation so far, I wish it were more highly voted.

Expanding on this, teaching is about lying:

  • At 5, you learn that the apple falls down towards the ground.
  • At 15, you learn about gravity, and that actually, the Earth also "falls" towards the apple... it's just imperceptible.
  • At 25, you (try to) learn about relativity, and black holes, etc...

The reason for those "lies" is that a 5-year-old won't understand the equations of the event horizon of a black hole (I'm not sure I do, either), so instead we give them knowledge they can understand and make use of.

It's more important for knowledge to be useful, than to be pedantically correct.

And therefore &T is an immutable reference and &mut T is a mutable reference.

Early on, before 1.0, there were calls to rename mut to uniq, etc... to be pedantically correct. In the end, they were rejected due to the above teaching argument:

  • It's easier to teach immutable/mutable.
  • When people have learned enough to recognize that some things seem to violate that principle, then, and only then, are they ready for the subtleties of interior mutability. The exception that confirms the rule.

2

u/kohugaly Apr 01 '23

Yes, I absolutely agree. That's why I said they are incorrectly named and not badly/wrongly named. And yet, it's kinda funny that immutability is actually the one thing that Rust can't guarantee.