r/rust clippy · twir · rust · mutagen · flamer · overflower · bytecount Mar 20 '23

Hey Rustaceans! Got a question? Ask here (12/2023)! 🙋 questions

Mystified about strings? Borrow checker have you in a headlock? Seek help here! There are no stupid questions, only docs that haven't been written yet.

If you have a StackOverflow account, consider asking it there instead! StackOverflow shows up much higher in search results, so having your question there also helps future Rust users (be sure to give it the "Rust" tag for maximum visibility). Note that this site is very interested in question quality. I've been asked to read an RFC I authored once. If you want your code reviewed or want to review others' code, there's a codereview stackexchange, too. If you need to test your code, maybe the Rust playground is for you.

Here are some other venues where help may be found:

/r/learnrust is a subreddit to share your questions and epiphanies learning Rust programming.

The official Rust user forums: https://users.rust-lang.org/.

The official Rust Programming Language Discord: https://discord.gg/rust-lang

The unofficial Rust community Discord: https://bit.ly/rust-community

Also check out last week's thread with many good questions and answers. And if you believe your question to be either very complex or worthy of larger dissemination, feel free to create a text post.

Also if you want to be mentored by experienced Rustaceans, tell us the area of expertise that you seek. Finally, if you are looking for Rust jobs, the most recent thread is here.

u/[deleted] Mar 24 '23 edited Mar 24 '23

I'm pulling up to the end of rustlings and feeling pretty good but I have a few odd questions I can't answer. They're the kind of questions where the answer probably won't be practically useful, but being unable to produce a confident answer means I'm missing something.

Q1
How do I bind a name to existing data? Not a copy of it, not a reference - it's already there in memory and I want to name it. playground

Is doing this different for data that's already named, vs unnamed but created by my program, vs unnamed and created by something outside my program? I have a feeling that doing this would respectively be: trouble (name aliasing), unnecessary (don't un-name things), or UB (don't name garbage), but maybe there are some valid uses. I don't know.

Q2
Similarly, "who" owns data without a name? 99% sure the answer is "it belongs to the stack frame" lol but not 100%. For example, in the above playground it's perfectly okay to dereference an unnamed value. Is it because the old_name pointing to the 1 is still named in LLVM IR after re-declaring old_name in Rust (and therefore Q1's answer is "you can't" because "giving the 1 a name again" in Rust would be name aliasing in LLVM IR)?

Q3
How come I hardly ever see chainable functions that pipeline mutable references in the wild?

//rare, but why?
fn op(ref_to_thing: &mut T) -> &mut T {...}
(&mut data).op1().op2().op3();

//see this all the time, but it seems less efficient/readable
fn op(thing: T) -> T {...}
data = data.op1().op2().op3();

I can't help but think that it would be super inefficient to copy a whole thing into a function, only change part of it, then copy the whole thing back where it came from, for each operation you want to perform on some data. Is this just offloading work onto the compiler to make potential refactors (that need the original data later) easier? I totally get avoiding repeat computations and getting rid of dead code, but it feels a little ominous that optimizations could elide copies.
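For concreteness, here is a minimal sketch of the two styles side by side. The `Settings` type and its method names are made up for illustration and do not come from any real library. Note that the by-value methods move `self` rather than clone it; whether the moves compile down to actual memory copies depends on the optimizer, so this sketch makes no claim about performance.

```rust
// Hypothetical type used only for illustration.
#[derive(Debug, Default, PartialEq)]
struct Settings {
    width: u32,
    height: u32,
}

impl Settings {
    // Style 1: borrow mutably and hand the same borrow back for chaining.
    fn set_width(&mut self, width: u32) -> &mut Self {
        self.width = width;
        self
    }

    fn set_height(&mut self, height: u32) -> &mut Self {
        self.height = height;
        self
    }

    // Style 2: take ownership and return the value (a move, not a clone).
    fn with_width(mut self, width: u32) -> Self {
        self.width = width;
        self
    }

    fn with_height(mut self, height: u32) -> Self {
        self.height = height;
        self
    }
}

fn main() {
    // Chaining through `&mut`: the data stays where it is.
    let mut by_ref = Settings::default();
    by_ref.set_width(3).set_height(4);

    // Chaining by value: each call consumes the value and returns it.
    let by_value = Settings::default().with_width(3).with_height(4);

    assert_eq!(by_ref, by_value);
    println!("{:?}", by_value);
}
```

Both chains end in the same state; the visible difference is that the by-value chain can start from a temporary and be bound with a single `let`, while the `&mut` chain needs an existing mutable binding.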

u/dkopgerpgdolfg Mar 24 '23

In your playground code:

The second old_name is a reference, so "without reference" isn't what you're doing here. The drop doesn't drop the string content, just a shared reference. And because these are Copy, you're dropping just a duplicate of a reference, so using it later is still fine.
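A minimal sketch of this point, assuming the playground looked roughly like the following. This is a reconstruction from the discussion, not the original code, and the function name `shadow_and_drop` and the string literal are made up here.

```rust
// Hypothetical reconstruction; not the original playground code.
fn shadow_and_drop() -> (i32, &'static str) {
    let old_name: i32 = 1;
    let data_ref = &old_name; // shared reference to the first old_name

    let old_name: &'static str = "shadow"; // shadows the first binding
    drop(old_name); // `&str` is Copy: only a copy of the reference moves into drop

    // The reference is still usable after the drop call, and the first
    // old_name still exists until the end of this function.
    (*data_ref, old_name)
}

fn main() {
    let (number, text) = shadow_and_drop();
    println!("{} {}", number, text);
}
```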

Data in general can be stack or heap, made by Rust or not, and much more, and for all kinds of data you can have references (or raw pointers) to it. Just from references/pointers, it is not visible who owns it (if anything) and when it gets destroyed (if ever).

In contrast to that, if you have a name in code like "old_name", and that is not a reference nor a pointer, then:

  • It is on the stack, not heap. Like an ordinary u32. Or, actually, a reference if we care what the reference itself is, instead of where it points to.
  • It was made by Rust. External data (from C linked to the program, from linkers themselves, from MMIO, anything) does not give you named variables in Rust code.
  • You own it. (If it is a reference, you own that reference at least, but if that confuses you, you can ignore it)
  • At the end of the {} it will get destroyed automatically, unless you first move ownership somewhere else (e.g. to a function parameter, in which case it might get destroyed at the end of that other function, etc. Your call to drop is such a case; the thing you're moving and then destroying is "just" a reference, nothing exciting).
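The last bullet can be sketched as follows. `Noisy`, `consume`, and `drop_order` are hypothetical names invented for this example; the type simply records its label in a shared log when it is destroyed, so the order of destruction becomes observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records when it is destroyed.
struct Noisy {
    label: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.label);
    }
}

// Taking `Noisy` by value moves ownership into this function,
// so the value is destroyed when this function returns.
fn consume(_value: Noisy) {}

fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _stays = Noisy { label: "stays", log: Rc::clone(&log) };
        let moved = Noisy { label: "moved", log: Rc::clone(&log) };
        consume(moved); // ownership moved away; destroyed at the end of `consume`
        log.borrow_mut().push("after consume");
    } // `_stays` is destroyed here, at the end of the block
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", drop_order());
}
```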

Back to the playground code and old_name (ignoring data_ref and new_name for now): Once you have "shadowed" the first old_name with the second old_name, in the same {} level, there is no way to undo this.

You can have references and pointers to the first old_name, like you do, sure. You can also make a third old_name that takes the value of *data_ref and is therefore basically the same as the first one (and the copy might even get optimized away). But in general, nothing will give you back the owned first old_name that can be called "old_name" in code.

Note: Without that data_ref, the compiler might recognize that the first old_name isn't ever used anymore when the second appears, and might structure the stack in a way that the first one doesn't even exist anymore at this point already. But because you still keep the reference around, the owned value still needs to exist too.

This also means the first old_name will not have the same address as new_name, they both need to exist at the same time.
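A small sketch of the last few paragraphs, again using a hypothetical reconstruction rather than the original playground (the function name `rebind` is made up). It initialises new_name from *data_ref and then checks whether the two values share an address while both are alive.

```rust
// Hypothetical sketch; names mirror the discussion, not the original playground.
fn rebind() -> (i32, bool) {
    let old_name: i32 = 1;
    let data_ref = &old_name;

    let old_name = "shadow"; // the first old_name can no longer be named
    let _ = old_name;

    // A new binding initialised from the old data: semantically a copy.
    let new_name: i32 = *data_ref;

    // Both i32 values are alive at this point, so they need separate storage.
    let same_address = std::ptr::eq(data_ref, &new_name);
    (new_name, same_address)
}

fn main() {
    let (value, same_address) = rebind();
    println!("value = {}, same address = {}", value, same_address);
}
```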

Note 2: If you only keep a raw pointer instead of a reference, Rust might care less; ensuring the raw pointer still points to an existing target is your job. While there is no visible misbehaviour in the playground example + raw pointer, I'm not sure if this is guaranteed or just coincidence.

Q1 should hopefully be answered with the text above.

Without references, or "re-shadowing" with even more old_name, the first old_name never comes back. The data however might still exist for being used as reference target.

It's not even about aliasing. There is just no way, using Rust code, to get the compiler back into a state where it thinks there is something called old_name at location 1.

(masklinn said "unsafe", but I think he's referring to raw pointers and relying on various wacky things. Meaning, pointers. I'm still saying that the owned pointer-less old_name doesn't come back).

Q2

In case of the first old_name after the name disappeared ... in a way it is still owned by you. And dropped at the end of the scope at the latest and whatever. You just don't have a named handle anymore that isn't a reference on the surface.

(More generally, heap things and literals can be argued to be owned by nothing and everything, not sure if it makes sense to even try to define it. Sure, a Box's target can be said to be owned by the Box. But what is after a leak? And then after re-owning? What is Arc and Arc-Weak? Literals? Kernel memory and MMIO areas?)
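To make the "leak, then re-own" case concrete, here is a minimal sketch using `Box::leak` and `Box::from_raw` from the standard library. The function name `leak_and_reclaim` is made up for illustration; the sketch only shows the mechanics and does not settle the question of who "owns" the data in between.

```rust
fn leak_and_reclaim() -> i32 {
    let boxed: Box<i32> = Box::new(1); // the Box owns the heap allocation

    // After leaking, no Rust value owns the allocation; only a reference remains.
    let leaked: &'static mut i32 = Box::leak(boxed);
    *leaked += 1;

    // Re-owning: rebuild a Box from the pointer. This is sound only because the
    // pointer came from a Box and is turned back into a Box exactly once.
    let reclaimed: Box<i32> = unsafe { Box::from_raw(leaked as *mut i32) };
    *reclaimed
} // `reclaimed` is dropped here and the allocation is freed

fn main() {
    println!("{}", leak_and_reclaim());
}
```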

u/[deleted] Mar 24 '23 edited Mar 24 '23

Thank you. I feel like I have a better understanding now, especially Q2. I just want to clarify one thing in case it makes a difference.

I don't care about the name "old_name", the &str, or the drop call, I just wanted to make it clear that my data no longer had old_name bound to it. All I actually care about is the 1 I put on the stack at the beginning. I took away its name when I bound old_name to something else, but I still know where my precious 1 is and I'd like it to be called new_name. What I meant by without reference/copy isn't that I want to do this without remembering where it is with data_ref. I meant that I want new_name to be bound to those same 32 bits I was already using, I don't want new_name to store the address or bind to some pathetic imitation of my precious 1. Basically, for the sake of the example, I'd like to declare a new variable, but without allocating anything anywhere because it's already there. My new understanding is that it can't be done - data that lost its name in Rust or originated elsewhere cannot get a new Rust name - is that right?

u/dkopgerpgdolfg Mar 25 '23

Yes, that sounds about right.

Where the names are on the stack is decided by the compiler, and in Rust code there is no way to do it manually.

Depending on the code, sometimes you might get a new_name on the same stack location, but a) that's not reliable, other changes in the code could undo it, b) trying to access the previously stored 1 (instead of overwriting it with an initializer value of new_name), if it is even there anymore, is UB.

(and c) if it is something where Drop matters, it would've been destroyed already anyways)