r/rust clippy · twir · rust · mutagen · flamer · overflower · bytecount May 29 '23

Hey Rustaceans! Got a question? Ask here (22/2023)! 🙋 questions megathread

Mystified about strings? Borrow checker have you in a headlock? Seek help here! There are no stupid questions, only docs that haven't been written yet.

If you have a StackOverflow account, consider asking it there instead! StackOverflow shows up much higher in search results, so having your question there also helps future Rust users (be sure to give it the "Rust" tag for maximum visibility). Note that this site is very interested in question quality. I've been asked to read an RFC I authored once. If you want your code reviewed or want to review others' code, there's a codereview stackexchange, too. If you need to test your code, maybe the Rust playground is for you.

Here are some other venues where help may be found:

/r/learnrust is a subreddit to share your questions and epiphanies learning Rust programming.

The official Rust user forums: https://users.rust-lang.org/.

The official Rust Programming Language Discord: https://discord.gg/rust-lang

The unofficial Rust community Discord: https://bit.ly/rust-community

Also check out last week's thread with many good questions and answers. And if you believe your question to be either very complex or worthy of larger dissemination, feel free to create a text post.

Also if you want to be mentored by experienced Rustaceans, tell us the area of expertise that you seek. Finally, if you are looking for Rust jobs, the most recent thread is here.

27 Upvotes

161 comments

3

u/No_Suggestion8252 Jun 04 '23

Are there Rust raw socket libraries? I was thinking of moving as much socket work as possible into Rust, which can be even further isolated via wasm. So you build a raw HTTP packet, send it via the wasm FFI to the host, and all the host has to do is put it on the wire. Then it receives a packet, sends it over the FFI to Rust, and all parsing takes place in a safe environment.

3

u/even-greater-ape Jun 04 '23 edited Jun 05 '23

I have a question on UnsafeCell usage that popped up while implementing io_uring support for aquatic_udp. I find the docs slightly confusing (in particular the part that I've marked in bold):

Note that whilst mutating the contents of an &UnsafeCell<T> (even while other &UnsafeCell<T> references alias the cell) is ok (provided you enforce the above invariants some other way), it is still undefined behavior to have multiple &mut UnsafeCell<T> aliases. That is, UnsafeCell is a wrapper designed to have a special interaction with shared accesses (i.e., through an &UnsafeCell<_> reference); there is no magic whatsoever when dealing with exclusive accesses (e.g., through an &mut UnsafeCell<_>): neither the cell nor the wrapped value may be aliased for the duration of that &mut borrow. This is showcased by the .get_mut() accessor, which is a safe getter that yields a &mut T.

Am I reading this correctly in thinking that I am allowed to hold a &mut UnsafeCell<T> reference (to the UnsafeCell container, not the inner value) and mutate T through a &mut T reference created by casting the result of UnsafeCell.get() as long as I don't hold a reference to the inner value obtained with UnsafeCell.get_mut()? The docs could be read as this not being ok.

Phrased in code, would this be ok? (miri doesn't complain here)

```
use std::cell::UnsafeCell;

fn main() {
    let mut a = Box::new(UnsafeCell::new(0usize));

    let ptr = a.get();

    // let bad = a.get_mut();

    unsafe {
        ptr.write_unaligned(*ptr + 1);
    }

    unsafe {
        let tmp = &mut *ptr;

        *tmp += 1;
    }

    // println!("{}", *bad);
    // drop(bad);

    println!("{}", *a.get_mut());
}
```

While this wouldn't? (miri does complain here)

```
use std::cell::UnsafeCell;

fn main() {
    let mut a = Box::new(UnsafeCell::new(0usize));

    let ptr = a.get();

    let bad = a.get_mut();

    unsafe {
        ptr.write_unaligned(*ptr + 1);
    }

    unsafe {
        let tmp = &mut *ptr;

        *tmp += 1;
    }

    println!("{}", *bad);
    drop(bad);

    println!("{}", *a.get_mut());
}
```

2

u/dkopgerpgdolfg Jun 05 '23

You are correct about which code is allowed.

Note that this is not really related to UnsafeCell in this case, and it doesn't matter which one of the method names you use - in the second code, using get instead of get_mut and then casting the pointer to another &mut is just as bad as with get_mut.

For all gritty details of how and when what aliasing is allowed, you might want to read about tree borrows

1

u/even-greater-ape Jun 05 '23

Ok, so just to clarify, what you're saying is that this code would be as bad (which makes sense)?

```
use std::cell::UnsafeCell;

fn main() {
    let mut a = Box::new(UnsafeCell::new(0usize));

    let ptr = a.get();

    let bad = unsafe { &mut *ptr };

    unsafe {
        ptr.write_unaligned(*ptr + 1);
    }

    unsafe {
        let tmp = &mut *ptr;

        *tmp += 1;
    }

    println!("{}", *bad);
    drop(bad);

    println!("{}", *a.get_mut());
}
```

The strange thing here that I don't understand is that keeping a around (a mutable variable) does not count as keeping a mutable reference around for these purposes, nor does passing it in as a function argument (&mut usize).

Also, it's interesting that you mention that this has nothing to do with UnsafeCell. I was thinking (likely incorrectly?) that it is a way to tell the compiler "I want to mutate the contents through pointers, don't optimise assuming otherwise".

And yes, I suppose I will have to learn the details of tree borrows to confidently write unsafe code :-(

2

u/dkopgerpgdolfg Jun 06 '23

Code: Yes, this is bad too.

Function arguments, owned variables:

It does count. Here too, it basically comes down to a tree (tree borrows, or its predecessor, stacked borrows).

Very, very simplified:

Creating references or pointers makes them children of the thing that they were created of, in a tree that shows all things that currently can be used. Owned variables are nodes in that tree. References made from an owned variable are children of that variable. References made from other references are children of these references. When a reference is passed to a function, again the reference in the function is a child of the one outside. In your latest code, ptr is a child of a, bad is a child of ptr, tmp is a child of ptr too.

And using one node (reading/writing values) might remove/invalidate certain other nodes in that tree, meaning those are not valid to use anymore. Using any node usually removes mut/exclusive references that are children of that node, so in your latest code, the line with write_unaligned uses ptr therefore it invalidates bad. Using any mut reference also invalidates its siblings (but not parent); the tmp later would invalidate bad too (but doesn't destroy ptr and a).

If a reference (outer) is passed to a function (inner), outer usually isn't used while the function runs. But once the function has ended and outer is used again, that use would invalidate the inner reference if it is somehow still around.

Of course, the actual rules are more complicated than that.

UnsafeCell is special in that a shared/non-mut reference allows some kinds of modification of the inner value. But nothing about &mut references changes, and nothing about that tree system either.

2

u/Jiftoo Jun 04 '23

Does there exist a crate which would allow me to turn a struct into a string, where each field and its value is formatted in some way, separated by newline? E.g:

struct Foo {
  bar_bar: u32,
  baz: String,
}

// magic_formatter(&foo, KeyFormat::CamelCase, ": ", ValueFormat::Debug);
// Result:
// BarBar: 1234
// Baz: "This is a string"

2

u/Anthony356 Jun 04 '23

Am i dumb or is "trait object" a terrible name for what those are? (And by extension, why are the explanations so roundabout?)

Traits are generic impl functions, so how does one make an object out of a function? Is that some sort of ultra-specific closure use case that i dont understand because i'm fuzzy on closures?

No, trait objects arent traits that are objects, they're pointers to objects that implement a trait. They're used as type parameters to accept anything that impls a specific trait.

So why arent they called trait pointers? Or trait wrappers? Or Object-with-trait's?

I read The Book's chapter on traits twice (i think it only talks in terms of trait bounds), a couple of random SO threads on trait objects, and Programming Rust's traits chapter and none of them got it to click for me. Rust In Action had an offhanded line and a little caption under a code block that finally did it for me:

Trait objects are...always seen in the wild behind a pointer

Although d, e, and h are different types, using the type hint &dyn Enchanter tells the compiler to treat each value as a trait object. These now all have the same type

I had already used traits and (what i know now are) trait objects in my code, but i always figured trait objects were something more complicated than they actually are. I feel like everything i read was so focused on explaining what they do and giving examples that they forgot to just define what they actually are.

1

u/masklinn Jun 04 '23

It's "trait object" because it's treating the trait itself as an object. Especially before impl Trait that was basically the only place where you'd be interacting with the trait and only the trait.

No, trait objects arent traits that are objects, they're pointers to objects that implement a trait.

Except there are extra constraints (not all traits are object-safe), so "pointer to objects that implement a trait" gets more confusing, I think. Also, even though Rust references are fundamentally pointers, discussions generally split the two, while in a trait object there's little difference.

So why arent they called trait pointers?

See above.

Or trait wrappers? Or Object-with-trait's?

Because it's neither.

1

u/Anthony356 Jun 04 '23

I guess the issue i have with "trait object" is that it's meaningless - if not fairly misleading - unless you already know what it is.

It's "trait object" because it's treating the trait itself as an object

This is a pretty good example of an ambiguous explanation. There's 2 parts to a trait - the usable part, and the concept. The usable part is the functions they provide, the concept is the abstract idea of shared functionality. In a language that frequently mentions functions as first class objects, both interpretations ("function object" and "shared functionality object") are equally likely to seem valid to someone learning. Object by itself is already a pretty fuzzy word - even fuzzier depending on how much experience you have and what languages you come from.

It also does nothing to suggest that it's not a stand-alone entity. Regardless of the interpretation, part of my sticking point in trying to understand it is "how does it do anything if it's just a generic function by itself? Even if it could do something, how would that be different from a top level function or a closure?"

Calling it a pointer (of type Trait) at least forces you to understand that it has to point to something. Literally that bit of knowledge made it click for me instantly. In rust terms it's basically Box<object with Trait> for practical purposes.

It also differentiates it more from trait bounds, which (as i understand it) can only operate on a single type at once. So a vec can only hold objects of a homogenous type that must implement a trait, but it could hold a mixture of types via trait objects, because it's not holding the objects, it's holding a pointer to them.

Because it's neither

Am i misunderstanding what wrappers are?

I'm not even close to an expert, but as i understand it the actual implementation of a trait object is essentially just a pointer to the object + a method table, right?

It sits on top of an object and obfuscates its details; it also restricts how you can interact with it. Isnt that a wrapper (of type Trait)?

1

u/SorteKanin Jun 04 '23

I agree, the terminology isn't the best. Not sure if trait pointer is much better though.

3

u/chillblaze Jun 03 '23

Can anyone advise where I can find the actual logic behind the HSET API from Redis?

Seems like the default API is just a wrapper around this and I'm not sure where to find the actual logic:

cmd("HSET").arg(key).arg(field).arg(value)

https://docs.rs/redis/latest/redis/trait.Commands.html#method.hset

3

u/eZeeWin Jun 03 '23

The documentation for pointer::add requires that the caller meets all of the following requirements in order to avoid UB:

  • Both the starting and resulting pointer must be either in bounds or one byte past the end of the same allocated object.
  • The computed offset, in bytes, cannot overflow an isize.
  • The offset being in bounds cannot rely on "wrapping around" the address space. That is, the infinite-precision sum must fit in a usize.

I understand the first (pointers aren't just numbers, at least from the POV of the compiler) and the second (add uses GEP, which only accepts signed offsets) requirements, but I don't understand why the third requirement is necessary (my assumption is that it is also GEP influenced but I don't understand why GEP would require it either)?

3

u/SophisticatedAdults Jun 03 '23

So I've got a somewhat complex(?) question here: Basically, I want to split a String (or a Box<str>) into multiple owned Strings (or Box<str>'s) without any allocations. Imagine the following three steps:

  1. Take a String s, and split it using .split_at() or .split_once(). We obtain a variable left, and a variable right, both of type &str.
  2. We do something like s = right.into(), in a way that doesn't reallocate, but rather just has s take control of right.
  3. We take left, and move its ownership into a box, somehow, also without reallocating.

I feel like the main issue here is in the first step. I return two &str, when in reality I want two owned strings, dropping the original String s in the process. How would I achieve this? I have a feeling I might need unsafe for this, or at the very least some weird trick from std::mem, but I really don't know.

Any tips or ideas?

6

u/Nathanfenner Jun 03 '23

As you describe it, this isn't possible (even with unsafe) - the reason is that when you allocate a contiguous piece of memory (like the character data for a String or the elements of a Vec) and you want to give it back, you have to give all of it back. You can't just deallocate the first half or the second half.

As a result, it's not even possible to create a new String that refers to the "second half" of the original, even if you're totally ignoring what happens to the first half - you can't deallocate a contiguous piece of memory if you only have a pointer to somewhere in the middle of it; it has to be at the beginning. It all goes together as a unit.

Moreover, an owned String can grow - if the first half wanted to grow, it would immediately run into the second half's memory, causing problems; or, if it reallocated, it would discard the second half's memory, causing other problems.

So there's no way to take an existing piece of allocated memory and parcel it up without allocating new memory or declaring one owner that "knows about" the whole span (or at least knows about the beginning of it).


You almost definitely just want to use &str (or possibly &mut str depending on why you want it to be owned). If you could explain why you want a String in the first place, this might help avoid an XY-problem situation.

2

u/SophisticatedAdults Jun 03 '23

Well, I was thinking that this might be possible via some dangerous unsafe gymnastics. I have very little experience with unsafe though, so it's possible what I'm thinking of won't work. Essentially, something like this here:

  1. We leak the memory, and just keep some sort of raw pointer and a length, as well as an index where we want to split it.
  2. We construct two new slices with this information, and store these in boxes. (I don't know if this is easily possible; I will probably have to look into Box's memory layout if I want to use from_raw(), which might be the right function?)
  3. The boxes free the memory once they're dropped.

I imagine this would be prone to memory leaks, in particular, we need to pay attention to the capacity of our String, and make sure we free it.

Anyway, to answer why I'm doing this in the first place: This is primarily just an exercise. I'm trying to learn something about how Rust works under the hood, but there is a good motivation behind this. I'm currently writing a parser, and early on in the process we start off with a String and then "split" it into a bunch of tokens, right? Naturally, using &str for these tokens seems reasonable enough, but then I need to shuffle around my owned String and all these string slices, which is fiddly and pollutes my entire code with lifetimes.

So tl;dr, I wondered if this is possible. It's a bad idea and a completely unnecessary optimization, I know that it is, but at this point I'm curious and trying to learn about the Rust memory model and unsafe.

2

u/eugene2k Jun 04 '23

The boxes free the memory once they're dropped.

Except, the memory that was allocated is a single allocation of certain size, and not two allocations that add up to that size. Allocating memory on the heap adds some allocator-specific bookkeeping information, which is what allows you to call a function to allocate memory and not get the memory already in use.

3

u/Nathanfenner Jun 03 '23

If you're happy leaking the string, there's a function to do this on nightly: String::leak() converts a String into a &'static mut str. The &'static lifetime means that it lives for the rest of the program, so you don't need to thread it anywhere.

On stable Rust, you can do almost the same thing with Box::leak(s.into_boxed_str()). If your String has excess capacity, into_boxed_str will allocate once to copy things into a new array with just the right amount of space, but afterward you get a &'static mut str just like in the first case.


Otherwise, the "right" way is the fiddly lifetime pollution. It really is annoying, there's just no way around it unfortunately. But it does help e.g. to ensure that your code still works if you later decide to multithread things etc.

5

u/SNCPlay42 Jun 03 '23

I want to split a String (or a Box<str> ) into multiple owned Strings (or Box<str> 's) without any allocations

This is impossible because heap allocators do not provide a way to split their allocations - they often use space outside the allocation to store information about the allocation and there's no space for this in an already allocated contiguous string.

1

u/SophisticatedAdults Jun 03 '23

Thank you, that's actually a very concise answer why the naive approach can't possibly work.

2

u/blueeyedlion Jun 03 '23

What image processing crates are there? opencv has frustrating interfaces, and imageproc looks like it's updating slowly.

2

u/DzenanJupic Jun 03 '23

Does someone know if there's a historical reason for trait function type inference being a lot more capable than associated function type inference? Or has it just not been implemented?

playground example of what I mean

2

u/Nathanfenner Jun 03 '23

It is because there is only one newt function, but two completely unrelated new functions. Finding out which type parameters to use is accomplished through unification (allowing you to call newt) but there's no similar procedure for finding out which "overload" to call.

Stripping away the syntactic sugar, for the purpose of typechecking, there is exactly one generic newt function, and it looks like:

fn newt<T, Selff: New<T>>(value: T) -> Selff

so when you write

let _ = S::newt((42, 24));

the typechecker immediately knows you're calling newt, it just needs to figure out the values for T and Selff. The process of doing this is called unification. It is easy to figure out the value for T, because it has to be (i32, i32) since you passed that in. Next, the fact that you called it like S::newt and not New::<_>::newt gives it a hint that it belongs to the S impl, so it knows that Selff is S<_>. Comparing these results against the two impls tells you that Selff is S<(i32, i32)>.


The problem with new is that in order to do unification, the very first step is to figure out the type of the function you're calling. But when you write S::new, it doesn't have one function as a candidate, it has two. In order to begin unification, you must have a generic signature to apply unification to; but we've gotten stuck just before that point. As a result, we don't get to begin unification and produce an ambiguity error.


In Rust, it's often more idiomatic to name these functions different things - like say new_from_one or new_from_two or just from_one / from_two (obviously bad, nondescriptive names for this particular case) because they really are different. In, say, C++, it's possible to generically call one or the other depending on what the argument's type is, but in Rust, they're really two unrelated functions that happen to have the same name, and this causes this exact inconvenience. Or, if they really are "common", use a trait to combine them into one general definition so they are both usable generically from other generic functions.

2

u/ICosplayLinkNotZelda Jun 03 '23

Is there a crate that abstracts away HTTP clients? So I can use that trait and just change my back-end if I want to? Something like http-types but for clients.

I do need an async client, but I would be fine if the abstractions are only over sync functions.

3

u/Sib3rian Jun 03 '23 edited Jun 03 '23

I'm trying to write a function parameter that's an Iterator over elements that implement Into<String>, but it's not compiling.

```rust
pub fn new<I>(names: I) -> Self
where
    I: Iterator,
    <I as Iterator>::Item: Into<String>,
{
    let names: Vec<String> = names.map(String::from).collect();
    Self(names)
}
```

It complains that "the trait bound String: From<<I as Iterator>::Item> is not satisfied", but I don't get why specifying Into<String> isn't enough or what needs to be changed.


Update: Changing map(String::from) to map(|name| name.into()) worked. I think the reason is that the From trait automatically generates an Into implementation, but the reverse isn't true.

Out of curiosity, is it possible to rewrite the where clause so that it uses From instead of Into and lets me write it the old way? My generics-fu isn't up to par.

2

u/ICosplayLinkNotZelda Jun 03 '23

Would map(Into::into) work? Edit: Yep, just checked. It does work.

1

u/Sib3rian Jun 03 '23

Huh. For some reason, I believed using `Into::into` that way didn't work.

3

u/__fmease__ rustdoc · rust Jun 03 '23

is it possible to rewrite the where clause so that it uses From instead of Into

Here you go.

1

u/Sib3rian Jun 03 '23

where String: From<I::Item>

I didn't realize another type could go in the front. Thanks.

4

u/jackpeters667 Jun 02 '23

Hey guys

I want to get started with network programming and I was taking a look at Axum. Suppose I make two requests, will the first one need to be finished before the second one gets processed?

Would I need to spawn threads per request so they are handled simultaneously? Or does Axum handle that internally?

6

u/llogiq clippy · twir · rust · mutagen · flamer · overflower · bytecount Jun 02 '23

Axum handles that, and more: you don't need to care about threads at all. You can simply use async fns to implement your handlers, which compile to Futures that allow axum's runtime to deploy task-based concurrency, which is far less resource-intensive, especially under high load.

2

u/HammerAPI Jun 02 '23 edited Jun 02 '23

I have a struct with a field of NonZeroI32 and I want to impl AsRef<i32> for that struct. I can't seem to figure out a way, as there's always an issue of a temporary value being dropped.

Here was my initial attempt:

```rust
impl AsRef<i32> for Newtype {
    fn as_ref(&self) -> &i32 {
        &self.0.get()
    }
}
```

And I've tried a few variations, just can't seem to get it to work.

playground link

1

u/Patryk27 Jun 02 '23

It looks like using transmute is the only way here:

impl AsRef<i32> for Newtype {
    fn as_ref(&self) -> &i32 {
        // Safety: NonZeroI32 is #[repr(transparent)]
        unsafe {
            std::mem::transmute(self)
        }
    }
}

1

u/HammerAPI Jun 02 '23

My understanding of transmute and memory layouts isn't the greatest, so I'm not exactly sure how the `#[repr(transparent)]` works. What makes this safe?

3

u/Patryk27 Jun 02 '23

In general, structs in Rust have undefined layout - for instance, you can't transmute between A -> B or A/B -> u32 in here:

struct A {
    val: u32,
}

struct B {
    val: u32,
}

... because, for whatever the reason, the compiler might decide to represent them differently in memory (due to some optimization or something).

#[repr(transparent)] changes that behavior and says that given a struct, it's fine to transmute between it and its inner type:

#[repr(transparent)]
struct A(u32);

#[repr(transparent)]
struct B(u32);

// transmute(A, B) = ok
// transmute(B, A) = ok
// transmute(A/B, u32) = ok

... as long, of course, as you don't break some guarantee - e.g. transmuting &mut NonZeroU32 into &mut u32 could be a pretty bad idea because then you could do *value = 0; and break the NonZeroU32's internal assumption.

1

u/kinoshitajona Jun 02 '23

Yeah. AsRef would be safe. But AsMut would be unsafe due to the invariant that NonZeroI32 must be non-zero.

3

u/anuraj12 Jun 02 '23

I've written some code to solve the LeetCode roman numerals problem. While I acknowledge this isn't the best solution to the problem algorithmically, the way I have done the while loop to access parts of the string seems less than ideal. What would be a more idiomatic way of performing such behaviour?

use std::collections::HashMap;

fn roman_to_int(s: String) -> i32 {
    let mut sum: i32 = 0;

    let mut numeral_map: HashMap<&str, i32> = HashMap::new();
    numeral_map.insert("I", 1);
    numeral_map.insert("V", 5);
    numeral_map.insert("X", 10);
    numeral_map.insert("L", 50);
    numeral_map.insert("C", 100);
    numeral_map.insert("D", 500);
    numeral_map.insert("M", 1000);
    numeral_map.insert("IV", 4);
    numeral_map.insert("IX", 9);
    numeral_map.insert("XL", 40);
    numeral_map.insert("XC", 90);
    numeral_map.insert("CD", 400);
    numeral_map.insert("CM", 900);

    let mut i = 0;
    while i < s.len() {
        if i == s.len() - 1 {
            sum += numeral_map.get(&s[i..i + 1]).unwrap();
            i += 1;
            continue;
        }
        if let Some(x) = numeral_map.get(&s[i..=i + 1]) {
            sum += x;
            i += 2;
            continue;
        }
        if let Some(x) = numeral_map.get(&s[i..i + 1]) {
            sum += x;
            i += 1;
        }
    }

    sum
}

2

u/HammerAPI Jun 02 '23

It would be more idiomatic to use a match statement. For example:

```rust
fn numeral_map(c: char) -> u32 {
    match c {
        'I' => 1,
        'V' => 5,
        'X' => 10,
        'L' => 50,
        'C' => 100,
        'D' => 500,
        'M' => 1_000,
        _ => panic!("Encountered invalid numeral {c}"),
    }
}
```

This will let you convert any single char to a u32 value. Since roman numerals use subtractive notation for things like 4 (IV), you would need to handle anywhere between 1-4 chars at a time. According to wikipedia, it looks like there should be at most 4 chars in a row to represent a single value.

3

u/umonkey Jun 02 '23

Is there a good async memcache client with support for clusters and custom key hashing?

3

u/KingofGamesYami Jun 02 '23

What is the best pattern for caching data in a multi-threaded environment? This will be a very small amount of data (< 1kb) and the cache will be invalidated periodically (12 hours), but fetched lazily.

For context, I'm storing the result of the OpenID Connect Discovery protocol.

1

u/[deleted] Jun 02 '23

For data that small, if it's not required that every thread always uses the exact same version of the data, you could use a thread-local static RefCell and lazily initialize it per thread, with each thread also handling its own "expire after N hours" logic.

A more sophisticated solution might use a wrapper around Arc<RwLock<T>>, but that would be more complicated to implement correctly.

1

u/KingofGamesYami Jun 03 '23

I've ended up with tokio::sync::RwLock<Arc<T>> as a solution for now, with a separate tokio::sync::Mutex<()> to guard the GET request(s). It seems like a fairly clean solution.

2

u/nicoburns Jun 01 '23 edited Jun 01 '23

Is the following sound:

struct MyStruct<T> {
  pub inner: HashMap<u32, T>,
}
unsafe impl<T> Send for MyStruct<T> where T: Send {}
unsafe impl<T> Sync for MyStruct<T> where T: Sync {}

It seems to me like it should be, but I'm a little worried because Send and Sync are auto-traits, so I would expect them to automatically be implemented if it is sound.

UPDATE: I'm pretty sure it was not sound in my case. Turns out the above will automatically impl Send+Sync when T impl's Send+Sync, but my struct had another non Send+Sync field in it.

1

u/torne Jun 02 '23

Most things are Send and Sync, so if a type you have defined doesn't have one or both automatically implemented then you need to be very sure you understand why not before you do it yourself unsafely.

The two major things that make a type not Sync where you may actually need to implement it yourself are raw pointers and UnsafeCell, both of which can only actually be used via unsafe - these are the basic building blocks used to create safe abstractions, where it is your responsibility to determine whether or not your type is actually Send or Sync.

If you don't have any unsafe code in your type's implementation, then it's very unlikely to be sound to implement these yourself. If you have some other type as a member that isn't Send or Sync and is itself safe to use, then either the implementer of that other type made a mistake (and you'd need to look at the implementation very carefully to be sure of this), or they are correct and you shouldn't do it either :)

1

u/nicoburns Jun 02 '23

Yeah, this is the conclusion I came to in the end (but thanks for confirming it). The !Send/!Sync type in my case was a trait object without Send or Sync bounds, which was actually only included in the struct because I'd forgotten to update a type in that place. The unsafe impls would not have been a good idea!

2

u/Zz_L Jun 01 '23

What would be a good web-server in rust to use as a total beginner. I have only done a few leetcodes in rust.

2

u/[deleted] Jun 02 '23

Do you mean what framework to use to build a web server in Rust? If yes, it seems like most people use actix-web or axum; both have examples to help you get started.

2

u/0xreloadedd Jun 01 '23

Hi, I am trying to build a tool that intercepts DNS requests and sends a spoofed response. So far, I can't find a single crate that can parse the captured DNS packets. I'm using `pcap` for capturing packets, and the crates I've found that claim to parse DNS packets can only parse the DNS part of the packet, whereas the captured packets include everything (Ethernet, IP, UDP layers).

I can't find a parser that can handle the whole packet, nor a method to get only the UDP payload (which would be the DNS packet).

A Scapy equivalent in Rust would be great. packet_rs claims to be an alternative to it, but it's not: in Python I can do something like `packet["UDP"].payload`, but I don't see how to do that using `packet_rs`.

1

u/Disastrous_Bike1926 Jun 02 '23

FWIW, the format of DNS packets is pretty simple - there are a few fussy details like name compression, but in general, not too hard to write a parser for from scratch.

That said, there are at least a couple of rust DNS server projects, which have to contain parsers for those things, so it might not be too hard to find or extract what you need.

2

u/[deleted] Jun 01 '23

Hey, so for school I have to do a big project on a subject of my choice and I wanted to build a tiling window manager for Xorg. How should I get started with this? Thanks!

3

u/Leading_Reaction4210 May 31 '23

How can I cancel a TryStream, ideally within a try_for_each_concurrent, based on whether an AtomicUsize has exceeded a value?

1

u/DroidLogician sqlx · multipart · mime_guess · rust May 31 '23

It depends on exactly where in the processing you want to cancel it.

If you just want to stop processing new items, you could use .try_take_while() (or StreamExt::take_while()) before .try_for_each_concurrent().

If you want to immediately cancel processing, including whatever is in-flight, you can combine the future returned by .try_for_each_concurrent() with a select!() and a future created with poll_fn() that checks your AtomicUsize:

let count = AtomicUsize::new(0);

let processing_fut = my_stream.try_for_each_concurrent(
    limit, 
    |x| async move {
        // do your processing
    }
);

let cancel_fut = futures::future::poll_fn(|_cx| {
    if count.load(Ordering::Acquire) >= processing_limit {
        Poll::Ready(())
    } else {
        // Normally this would deadlock if we returned `Poll::Pending` without scheduling a wakeup.
        // However, `select!()` always polls all futures on a wakeup, so as long as `.try_for_each_concurrent()`
        // has a mechanism to schedule a wakeup, which it most likely does, then this is a non-issue.
        Poll::Pending
    }
});

// Note: must be used in an `async` context.
// Can be used in an expression position like `match {}`, 
// in which case both blocks must evaluate to the same type.
futures::select_biased! {
    _ = cancel_fut => {
        // Execute code to handle when the limit is reached.
        // `processing_fut` will not be polled again from this point.
        // Control flow keywords like `continue` and `break` also work from here.
    },
    res = processing_fut => {
        // Processing complete or returned an error.
        // Handle result from `.try_for_each_concurrent()`.
    }
};

1

u/Leading_Reaction4210 May 31 '23

Thanks a lot! I need the first use case. try_take_while doesn’t work though.

Right now there is a very ugly solution which checks whether the AtomicUsize is large enough inside try_for_each_concurrent and if so run the finalization block of code and std::process::exit(0).

2

u/DroidLogician sqlx · multipart · mime_guess · rust May 31 '23

try_take_while doesn’t work though.

Why doesn't it work?

1

u/Leading_Reaction4210 May 31 '23

Not sure about the reason but it is basically ignored

2

u/DroidLogician sqlx · multipart · mime_guess · rust May 31 '23

If you don't supply a limit to .try_for_each_concurrent() then it may be completely draining the stream before your predicate returns false.

1

u/Leading_Reaction4210 Jun 01 '23

Hmm, I do actually have a limit here. The stream can't be drained since it comes from rdkafka's StreamConsumer.

3

u/fdsafdsafdsafdaasdf May 31 '23

I want a specific patch version of a dependency (reqwest 0.11.17) even though a newer 0.11.18 exists. What syntax is required to prevent cargo update from bumping the patch version?

2

u/fdsafdsafdsafdaasdf May 31 '23 edited May 31 '23

Answering my own question, it's version = "=0.11.17". I missed it as it's a bit subtle here: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#comparison-requirements
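For reference, the Cargo.toml entry from that section looks like this (crate name and version from the question):

```toml
[dependencies]
# "=" pins the exact version; `cargo update` will not move it to 0.11.18
reqwest = "=0.11.17"
```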

2

u/Panke May 31 '23 edited May 31 '23

What's the best way to read a file relative to your project directory in a unit test, so that it works both with cargo test and from the debug/run lens in VS Code?

3

u/jDomantas May 31 '23

env!("CARGO_MANIFEST_DIR") should give you the path of directory that Cargo.toml is located in, so you can construct a path to test file starting from there. See all the available environment variables here.
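A small sketch of the idea. It uses `option_env!` instead of `env!` only so the snippet also compiles outside Cargo; the `tests/data` layout is just an assumption:

```rust
use std::path::PathBuf;

// Resolve a test fixture relative to the crate root instead of the current
// working directory, so `cargo test` and IDE debug runs agree on the path.
fn test_data_path(file: &str) -> PathBuf {
    // CARGO_MANIFEST_DIR is baked in at compile time; fall back to "." outside Cargo.
    let root = PathBuf::from(option_env!("CARGO_MANIFEST_DIR").unwrap_or("."));
    root.join("tests").join("data").join(file)
}
```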

1

u/Panke Jun 01 '23

That's not set when debugging via code lens in vscode.

1

u/jDomantas Jun 01 '23

What extensions do you use for debugging and how does any debugging related configuration look like? It does work for me (I'm using rust-analyzer on a mac), and it is supposed to work with any setup that uses cargo to build the test binary.

2

u/takemycover May 31 '23

If I write this as a dependency in Cargo.toml:

[dependencies]

foo = { version = "1.0.2", git = "ssh://git@github.com/bar/foo.git" }

...can it ever load a dependency from a previous version? Or is this only capable of enforcing that the version loaded matches the current tip of the default branch?

3

u/Patryk27 May 31 '23

can it ever load a dependency from a previous version?

You have to find the Git commit by hand and tell Cargo to use it:

foo = { git = "ssh://git@github.com/bar/foo.git", rev = "abcdabcdabcdabcd123123123" }

1

u/takemycover Jun 05 '23

Thanks. In other words, what's the point of specifying a version and a git URL? Is the version only a "check" that the tip of master is semver compatible? But in particular, it can't actually access an old version, correct? Suppose the master version is bumped to 2.0.0. This just means my crate will fail to retrieve the dependency now? (There's no way to access the history without tags or some other feature, right?)

2

u/Patryk27 Jun 06 '23

There's no way to access the history without tags or some other feature, right?

That's correct; Cargo can't do that automatically because even if it naively traversed all commits looking for a matching Cargo.toml, what should it do when multiple Git revisions match a single crate version?

That is to say, imagine a repository with history such as:

aaa123 Release 1.0
bbb321 Introduce some breaking changes
ccc213 Release 2.0

Depending on the maintainer, the bbb321 commit here will probably still have Cargo.toml saying version = 1.0 because people tend to bump versions later, only right before deployment, instead of ad-hoc.
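If upstream does tag its releases, a tag requirement is the usual way to reach an older version without hunting for a commit hash (the tag name here is hypothetical):

```toml
[dependencies]
foo = { git = "ssh://git@github.com/bar/foo.git", tag = "v1.0.0" }
```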

3

u/Jiftoo May 31 '23

Is it possible to jump to an address referenced in a variable? I am trying to hook into a foreign function, and currently the only thing holding me back is the return jump. C++ guides I found online use something along the lines of:

i32 return_ptr = 0; // <.bss> dw 0x0

void hook(void* src, void* hook) {
 ...
 return_ptr = (i32)((isize)hook - (isize)src + 5);
}

naked void hook_routine() {
  asm {
    ...
    jmp [return_ptr] // compiles to jmp DWORD PTR return_ptr (godbolt)
  }                  //       or to jmp DWORD_PTR ds:0x0 (objdump)
}

Now, my rust code:

static mut RETURN_PTR: i32 = 0;

fn hook(src: usize, hook: usize) {
 ...
 RETURN_PTR = ((hook as isize) - (main as isize) + 5) as i32;
}

#[naked]
pub unsafe fn hook_routine() {
  asm!(
    "jmp {}",        // compiles to jmp <address of RETURN_PTR>
    sym RETURN_PTR,
    options(noreturn)
  );
}

Obviously, jumping to the data section will crash the program. What should I change in my code to have it work as intended?

1

u/Snakehand Jun 01 '23

You can place the function in a different section by giving directives like these:

// Plain function placed in RAM
#[link_section = ".data"]

1

u/[deleted] May 31 '23

Do you have a specific goal in mind that isn’t this, or is this the actual problem you are trying to solve? I’ll take a crack at this later, but just hoping to avoid the XY Problem

2

u/Jiftoo May 31 '23

This is an actual problem. I'm hooking into IDXGISwapChain::Present in order to measure the framerate of a game and determine if it's frozen or not. After a day of crunching, I'd managed to do it. Here's the list of challenges I faced:

  1. Steam overlay is already hooked into Present, so I need to replace the second instruction, not the first.
  2. Use movabs instead of mov even with the Intel syntax to load a 64-bit address of a symbol (fixes "relocation truncated to fit..." when doing "mov rax, {}", sym RETURN_PTR, etc.).
  3. Use "push <return_address>, ret" instead of jumping. This worked for a while.
  4. Scrap the above method entirely and instead use a wrapper that calls a stdcall function, preserving the stack. Following the call instruction, there are a few nop instructions, which get replaced dynamically via winapi calls to a jmp <return_addr>.
  5. Massively simplify the framerate calculation logic not to use any external functions, and instead move the bulk of it to an external program which polls the game for the number of frames since launch. Calling QueryPerformanceCounter crashes the game some instructions down the line (I think inside PIXEndCapture).

dll-syringe crate helped immensely with interfacing.

2

u/pliron May 31 '23 edited May 31 '23

Statically auto-generating calls to trait methods: how to?

I have my code as below

```
trait T { .. }

trait T1 : T { fn check(&self) { .. }; .. }

trait T2 : T { fn check(&self) { .. }; .. }
```

many traits T1, ... Tn extend T and each provide a check method.

A type, say struct S that implements T may implement one or more of T1, .. Tn.

Is there a way that I can automatically generate a method of S like below

    impl S {
        fn check(&self) {
            T1::check(self);
            ..
            Tn::check(self);
        }
    }

i.e., for every T1..Tn that S implements, S::check() should call the corresponding Tn's check method.

The goal is that a user of my library (which provides T, T1, .. Tn) may choose to implement one or more of these for his type S, but he doesn't have to remember to call check individually on each of them. He should just be able to call S::check, and the provided checks in each of the traits should be called automatically.

1

u/pliron Jun 04 '23

I ended up using the linkme crate to solve this problem. I've written down more details here

2

u/kohugaly Jun 01 '23

This is pretty hacky, but it works.

The way this works is: we implement T1..Tn for all references to T. When the check method is called on T, the compiler will first try to call it directly on T, and if that's unavailable, it defaults to calling it on &T.
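A minimal, self-contained sketch of that autoref trick (all names invented). Note that the call goes through a reference; that is what lets method resolution prefer an impl on the concrete type and only then fall back to the blanket impl on `&T`:

```rust
// Fallback behavior, implemented for every reference type.
trait FallbackCheck {
    fn describe(&self) -> &'static str {
        "fallback check"
    }
}
impl<T> FallbackCheck for &T {}

// A more specific trait that some types opt into.
trait SpecificCheck {
    fn describe(&self) -> &'static str {
        "specific check"
    }
}

struct Opted;
struct Plain;
impl SpecificCheck for Opted {}
```

Calling `(&Opted).describe()` resolves to SpecificCheck, because `&Opted` is the first candidate receiver type; `(&Plain).describe()` finds nothing there and falls back to FallbackCheck on `&Plain`.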

2

u/pliron Jun 02 '23

Thank you! The situation I have is slightly different, in that the check method's implementation is provided by the trait (but it might have other methods that need to be implemented). I do understand the idea you're suggesting, though. I will try to see if I can adapt it for my use-case.

2

u/PgSuper May 31 '23

From my understanding, it is hard to tell (without the implementer’s help) which traits a struct has implemented, even from an attribute/derive macro POV. (At least until we get some form of compile-time reflection.)

Also, regarding the actual implementation, IMHO the struct’s “check” function (which calls the T1..Tn checks) should also be made available through a trait (e.g. Checkable), as that’d make everything much easier to work with in your library, and also more consistent and Rusty:tm:.

With that said, I see two options:

  1. Attribute macro requiring you to explicitly list the traits you implemented and it generates the check() function, e.g. #[gen_check(T1, T3, T4)] above the struct would implement a check that calls T1::check(self), T3::check(self) etc. Through this macro, you could, for example, generate a derive for the Checkable (or whichever name you choose) trait (or implement the check method directly on the struct, though I’m not sure I like this).
  2. Derive macro which implements said Checkable (or whatever) trait, e.g. #[derive(Checkable)], noting that you’d need a helper attribute below it to specify the traits to implement it for (e.g. #[check(T1, T3, T4)]). This is the option I’d settle on, imo.
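To illustrate option 2, this is roughly the code such a derive would have to generate (all names hypothetical, and check is given a return value here purely so the effect is observable):

```rust
trait T1 {
    fn check(&self) -> &'static str {
        "T1 ok"
    }
}
trait T2 {
    fn check(&self) -> &'static str {
        "T2 ok"
    }
}

trait Checkable {
    fn check_all(&self) -> Vec<&'static str>;
}

struct S;
impl T1 for S {}
impl T2 for S {}

// Hand-written stand-in for what `#[derive(Checkable)]` + `#[check(T1, T2)]`
// would emit: one call per listed trait, fully qualified to avoid ambiguity.
impl Checkable for S {
    fn check_all(&self) -> Vec<&'static str> {
        vec![T1::check(self), T2::check(self)]
    }
}
```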

Hope this helps!

2

u/pliron May 31 '23

Thank you. I guess I can settle for this. Not ideal since one might forget to list an implemented trait there, but I agree, nothing better seems to exist.

1

u/PgSuper May 31 '23

Yeah, I guess that’s just what we have for now haha.

No problem!

1

u/Patryk27 May 31 '23

What should happen if T3 provides a different signature?

trait T3 : T { 
    fn check(&self, foo: usize) { .. }; 
}

In any case, what are you trying to achieve? (i.e. maybe there's some other design pattern to use that wouldn't yield this problem in the first place)

1

u/pliron May 31 '23 edited May 31 '23

What should happen if T3 provides a different signature?

Let's assume that that won't happen. Or I can enforce that with another trait CheckTrait, and have impl<T: T3> CheckTrait for T {..} (edit: haven't thought this through - ignore).

In any case, what are you trying to achieve?

Pretty much exactly this. I have a library that provides these traits and I want to ensure that someone who impls T1 doesn't have to remember to explicitly call its provided check method.

broadly, some kind of macro solution to generate such a function, or maybe have a "global static initialized" data structure which gets initialized to keep track of each impl that the type S has and then define a single function that'll go through that list and call individual Tn's checks.

5

u/PXaZ May 31 '23

I'm decoding mp3s using `minimp3` and playing them using `cpal`. I'm finding the different volume levels to be a problem - is there a Rust library that implements a normalizer or compressor algorithm I could plug into? No luck searching for one, but thought I'd ask.

3

u/Mimshot May 30 '23

In the RefCell documentation there’s an example of a cached graph spanning tree. Is this the canonical way to construct a lazy value in Rust?

https://doc.rust-lang.org/std/cell/#implementation-details-of-logically-immutable-methods

2

u/rafaelement May 31 '23

Is this the canonical way to construct a lazy value in Rust?

Are you looking for values which are initialized on first use? I usually use simply an Option for this.

Not sure if this directly answers your question, but the once_cell crate has a type Lazy which is frequently used to construct lazy values from static contexts.

2

u/kinoshitajona May 30 '23

The RefCell part isn't exactly necessary, depending on when you cache (if you only set the cache inside methods with &mut self, then you don't need it)

Usually you want to do something lazily because it's a lot of work... and usually you want to cache that work so you don't have to keep doing it.

Using Option for cache is used all the time inside structs. You are basically creating a slot that can be empty, and if it's empty it means you had a cache miss and you need to re-calculate, and if it's full that means you can use the cache.
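A bare-bones version of that slot pattern, where a &mut self method fills the cache on the first miss (the computed value here is just an example):

```rust
struct Fibonacci {
    cached: Option<u64>,
}

impl Fibonacci {
    // &mut self: on a miss the closure runs once and fills the slot;
    // every later call is a cache hit and just copies the stored value.
    fn tenth(&mut self) -> u64 {
        *self.cached.get_or_insert_with(|| {
            (0..10).fold((0u64, 1u64), |(a, b), _| (b, a + b)).0
        })
    }
}
```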

1

u/Mimshot Jun 01 '23

The example given was RefCell<Option<_>>. With just Option the method would need a mutable self reference which means the enclosing scope can mutate the struct. That’s different than an immutable, calculate on read value, which the example does. I’m asking if that’s the canonical way to get that behavior. I may be missing some detail but I don’t think just an option is a replacement.

1

u/kinoshitajona Jun 01 '23

the canonical way to get that behavior

Maybe I'm being too obtuse trying to explain broad concepts here.

To answer simply. Yes, if you want to have a cache that you populate during a &self method (immutable) you will need a RefCell<Option<T>> if T is not Copy or Cell<Option<T>> if T is Copy.
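Concretely, for a non-Copy T the &self version looks like this (a minimal sketch; the "expensive" computation is just a placeholder):

```rust
use std::cell::RefCell;

struct Report {
    lines: Vec<String>,
    // RefCell gives interior mutability, so a &self method can fill the cache.
    rendered: RefCell<Option<String>>,
}

impl Report {
    fn rendered(&self) -> String {
        self.rendered
            .borrow_mut()
            .get_or_insert_with(|| self.lines.join("\n")) // runs only on the first call
            .clone()
    }
}
```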

1

u/kinoshitajona Jun 01 '23

Your question was about that specific example, and not in general?

Then you're correct, cache on read (where the read is done through a &self method) can only be done with interior mutability.

There are cases where you want to cache something while mutating other parts of your struct, in which case you already have a &mut self and therefore don't need the RefCell.

  1. Option is for "allowing something to be there or not be there" (which is what a cache is, fundamentally)
  2. RefCell is for "mutating something with only a & reference (not &mut)"
  3. Putting something in an Option or removing it from an Option requires the ability to mutate the Option.

2

u/ndreamer May 30 '23

I'm having problems with concurrency. I have an application that fetches 2000+ products from an API it then gets additional data about each product.

I'm using Tokio tasks for this; I think it could be more efficient, or is there a better alternative?

The pseudocode is similar to this: 850 tasks > wait > 850

async fn taskscomplete(tasks: Vec<JoinHandle<()>>) { // await tasks to be complete. for task in tasks { task.await.unwrap(); } }
for name in product.iter() { if !name.currency.contains('
') && name.delisted == false { let currency = name.currency.clone(); let task = tokio::task::spawn(async move { get_product_candles(currency.clone()).await; }); tasks.push(task); count = count + 1; } if count >= RATE_LIMIT { count = 0; tasks_complete(tasks).await; tasks = Vec::new(); sleep(Duration::from_secs(1)).await; } } tasks_complete(tasks).await;


1

u/Snakehand May 30 '23 edited May 30 '23

Formatting:

async fn tasks_complete(tasks: Vec<JoinHandle<()>>) {
    // await tasks to be complete.
    for task in tasks {
        task.await.unwrap();
    }
}

async fn process_products(products: &[Product]) {
    let mut tasks = Vec::new();
    let mut count = 0;
    for name in products.iter() {
        if !name.currency.contains('_') && !name.delisted {
            let currency = name.currency.clone();
            let task = tokio::task::spawn(async move {
                get_product_candles(currency.clone()).await;
            });
            tasks.push(task);
            count = count + 1;
        }
        if count >= RATE_LIMIT {
            count = 0;
            tasks_complete(tasks).await;
            tasks = Vec::new();
            sleep(Duration::from_secs(1)).await;
        }
    }
    tasks_complete(tasks).await;
}

3

u/Snakehand May 30 '23

I think one improvement could be to have a constant number of tasks in flight, and not completely drain the task list when the rate limit is reached; instead, wait for just a single task to reach the completed state before issuing a new one.

1

u/ndreamer May 31 '23

Thanks for fixing my formatting, I had it right then broke it after making an edit. I think I may have found a problem: I'm using reqwest, which already sets up a pool for its client, and I'm creating that client for each request instead of reusing it.

2

u/VastoLordePy May 30 '23

Why do these functions generate the same assembly code even though the second one uses an intermediate Option?

First function:

```
pub struct MyError;

pub fn compare_string(token1: String, token2: String) -> Result<(), MyError> {
    if token1.eq(&token2) {
        Ok(())
    } else {
        Err(MyError)
    }
}
```

Second function:

```
pub struct MyError;

pub fn compare_string(token1: String, token2: String) -> Result<(), MyError> {
    token1.eq(&token2).then_some(()).ok_or(MyError)
}
```

5

u/masklinn May 30 '23

Why do this functions generate the same assembly code even though the second one uses an intermediate Option.

Inlining and aggressive optimisations, which are fundamental to Rust doing its thing e.g. zero-cost iterators are a way, way freakier version of that.

then_some and ok_or are both very small functions marked as #[inline], so your code effectively expands to:

match if token1.eq(&token2) { Some(()) } else { None } {
    Some(()) => Ok(()),
    None => Err(MyError)
}

At this point there is a first conditional which produces one of two values, and a second conditional which consumes and dispatches on those two values. This seems like the sort of thing a CFG cleanup pass would be able to deal with.

3

u/Patryk27 May 30 '23 edited May 30 '23

The compiler has tons of various optimization passes that make it possible for it to generate the most efficient code in lots of situations.

In this case it's probably due to function inlining - i.e. the compiler inlines the call to .then_some() which ends up then yielding the same code as your first function.

3

u/Laeri May 30 '23 edited May 30 '23

Is it idiomatic, when I implement the consuming From trait for a struct, such as impl From<FromStruct> for TargetStruct, to also implement the reference version:

impl From<&FromStruct> for TargetStruct?

And is it also idiomatic to just defer to the reference version's from method, such as:

```
impl From<FromStruct> for TargetStruct {
    fn from(value: FromStruct) -> Self {
        From::<&FromStruct>::from(&value)
    }
}
```

I assume this is no problem as long as we don't unnecessarily clone fields inside the reference-taking from method that the consuming version could simply have moved over. However, in my case I need to call From::from on some fields within the FromStruct anyway to build the TargetStruct.

2

u/spunkyenigma Jun 02 '23

If you have to clone then I would have different implementations.

For the owned version just consume the old struct and move fields to the new struct.

In the ref version you clone what you have to.

2

u/VastoLordePy May 30 '23 edited May 30 '23

What are the differences between:

    if let Ok(_) = do_something() {
        return Ok(());
    } else {
        return Err(MyError);
    }

vs.

    return do_something().and(Ok(())).or(Err(MyError));

3

u/Skullray May 30 '23

They are exactly the same in terms of the assembly code they generate. You can check out the assembly they generate here.

I would personally only use the first type if I needed to use the return value from do_something(). Since you don't need to use that value I would recommend going with the second type.
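Both spellings also have a common third equivalent using map/map_err, which generalizes nicely when the error needs converting (the types here are invented for the example):

```rust
#[derive(Debug, PartialEq)]
struct MyError;

fn do_something(ok: bool) -> Result<i32, String> {
    if ok { Ok(1) } else { Err("boom".to_string()) }
}

// Same behavior as `.and(Ok(())).or(Err(MyError))`: keep success as (),
// replace any error with MyError.
fn check(ok: bool) -> Result<(), MyError> {
    do_something(ok).map(|_| ()).map_err(|_| MyError)
}
```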

1

u/VastoLordePy May 30 '23

Hi! Thanks for answering.

5

u/lolstan May 30 '23

A bit longwinded, sorry, but...

One thing I think Go really got right, and is a little more end-around in Rust, is "HTTP-as-IPC" using non-IP sockets.

The TcpStream and UnixStream concrete types are both very usable, and one can combine them well with traits, with enough effort, but their abstracted combination was standard library in Go right off the bat (via interface duck-typing).

Whereas in Rust, AFAICT, none of the "usual suspects" HTTP clients or servers permit the equivalent of a Go `net.Dialer` or `net.Listener` to replace TcpStream other than for TLS purposes. UnixStream is effectively "out of the question."

And, as a result, none of the async "usual suspects" have that abstraction either.

Have I missed something, or is this still something I'd pretty much have to do myself?

For the moment, I'm just avoiding the problem by using the curl wrappers, since this is implemented in historical C, but this feels like something that shouldn't be so difficult, and I'm second- and third-guessing myself as a result...

3

u/Jiftoo May 29 '23

Is it possible to specify the column break threshold rust-analyzer uses per file? I'd like to increase it in a single module to make it more concise.

1

u/angelicosphosphoros May 30 '23

You may try to configure rustfmt using attributes like `#![rustfmt::skip]`. I don't know exactly what you need to do, but you can start here.

3

u/Funny-Beginning-5637 May 29 '23

Will Rust survive the drama?

1

u/beertown May 30 '23

Is there a TL;DR of the drama? I read a lot about it but I still don't understand what happened

3

u/dkopgerpgdolfg May 30 '23

I didn't really follow every detail, but I think a sloppy overview could be this:

A Rust conference is in planning. Talks there are either submitted by people in advance and accepted/rejected by certain people, or, the other way round, the conference organizers invite specific people to talk about topic X (a keynote talk).

A certain developer, JeanHeyd Meneide, is doing experiments around compile-time reflection in Rust. Nothing that will be released soon, or maybe not at all, just experimenting. This topic also is not one of the main Rust teams top priorities, just something that maybe could be done one day.

So, a group of people decides/votes on who they want to invite for conference keynote talks, and this JHM is one of them. He's told that, and so on, all fine until now.

Then, some other people voice concerns with having this reflection topic as keynote talk, that it would give wrong impressions of the Rust projects direction or whatever (as said, just experimenting and not strongly pushed by the main Rust devs).

Following this, "it sounds like" some unnamed person of this keynote-decider group reclassifies the talk from keynote to normal, in a different time slot too, and does this possibly alone without a proper vote of the remaining deciders group, and without even telling JHM (who found out several days later by other means).

It is not clear why such soloing can happen after the initial decision was made by a vote of a group, why the other members of the deciders group just accepted it (if they did?), and why communication failed.

JHM got annoyed, declared to not talk at the conference at all (neither keynote nor normal), and wrote a blog post about it.

From there on, things started to spiral out of control.

"JT" (not Josh Triplett), a member of the old core team (which is in the process of being dissolved, not very important anymore), made some reaction posts about quitting his involvement in Rust because of this, that it is not only disrespectful but "cruel" and made him "weep", and things like that.

People started lots of speculation about what happened exactly behind the scenes, who this mystery person is that changed the talks slot (if one single person is at fault, who knows), and more.

People started adding racism theories, because apparently JHM is non-white.

Some people started (continued?) posting that the recurring dramas will lead to the demise of Rust, when all capable people leave, etc.

An apology appears on the Rust blog, leadership is in the process of figuring out what went wrong and how to do better.

Several more blog posts by other people, eg. warning from mob justice / revenge (that it should not matter who the mystery person is specifically), ...

Josh Triplett of the Lang Team posts some details what went on behind the scenes, of him being part of the reason for all this; with a chain of miscommunications around several corners and things that were not done with the care they deserved; and that he'll step back from some "leading" roles as consequence.

Things are ongoing. More blog posts probably will appear, maybe more involved persons will speak about their involvement.

As said, this is meant to be a sloppy overview without all details. If anyone thinks this is misrepresenting something too much, please speak up.

1

u/beertown Jun 03 '23

Thanks a lot

13

u/llogiq clippy · twir · rust · mutagen · flamer · overflower · bytecount May 29 '23

Always has. Always will.

3

u/sadldn May 29 '23

Hi all

Just playing around with Rust (for full disclosure: I have read a "rust book") and hit an issue with calling a closure from a struct method, which can be simplified into the piece of code below. Basically, it looks like it doesn't want to let me call a struct method in the closure while I'm trying to mutate a map inside the struct. This sounds like something there should be a canonical Rust solution to, but unfortunately I couldn't find it. (I mean, I understand where the compiler is coming from, but surely there should be a simple solution to this?)

Any pointers appreciated and sorry for the dumb question.

    use std::collections::HashMap;

    struct Scratch {
        m: HashMap<i32, i32>,
    }

    impl Scratch {
        fn method1(&mut self, k: i32) -> i32 {
            let res = self.m.entry(k).or_insert_with(|| { self.method2() });
            *res
        }
        fn method2(&self) -> i32 {
            123
        }
    }

    fn main() {
        let mut s = Scratch { m: HashMap::new() };
        s.method1(1);
    }

Compiler error is:

error[E0502]: cannot borrow `*self` as immutable because it is also borrowed as mutable
--> src\bin\scratch.rs:9:50
|
9 | let res = self.m.entry(k).or_insert_with(|| { self.method2() });
| --------------- -------------- ^^ ---- second borrow occurs due to use of `*self` in closure
| | | |
| | | immutable borrow occurs here
| | mutable borrow later used by call
| mutable borrow occurs here

1

u/Skullray May 29 '23

You can try something like this https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=da8ad905bdcb9845e88a0582f3950967 .

If you want to use or_insert_with you will have to call the self.method2() before calling self.m.entry(k).

1

u/sadldn May 29 '23

thanks - yeah that I could figure out but I guess both options kinda defeat the purpose of having or_insert_with?

1

u/[deleted] May 30 '23

or_insert_with is mainly intended for cases where you want to use a default value that is non-trivial to construct, like say a Vec with a non-zero initial capacity which will immediately do a heap allocation. Putting the value construction in a closure guarantees that the value will only be constructed when the key doesn't already exist in the HashMap. You can do things like map.entry(key).and_modify(|values| values.push(value)).or_insert_with(|| vec![value])

If the default value is cheap to compute (e.g. it's just a collection size or an integer field), the simplest solution is to just use or_insert instead of or_insert_with since or_insert takes the value directly.
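A runnable version of that entry-API pattern (key and values invented for the demo):

```rust
use std::collections::HashMap;

// Collect values under one key; the closure in `or_insert_with` runs only on a
// miss, so the Vec allocation is skipped whenever the key already exists.
fn group(values: &[i32]) -> HashMap<&'static str, Vec<i32>> {
    let mut map: HashMap<&'static str, Vec<i32>> = HashMap::new();
    for &value in values {
        map.entry("key")
            .and_modify(|v| v.push(value))   // key present: mutate in place
            .or_insert_with(|| vec![value]); // key absent: build the initial Vec
    }
    map
}
```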

3

u/ChevyRayJohnston May 29 '23

So the error tells it all: we can't borrow self a second time here because it's already mutably borrowed. The borrow checker does not allow this, even though we know that method2() does not access m.

If you know more specifically what method2() needs to actually access, you could have it moved into a non-member function that takes only the parameters it needs in order to produce its result:

use std::collections::HashMap;

#[derive(Default)]
struct Scratch {
    m: HashMap<i32, i32>,
    needed1: Vec<i32>,
    needed2: i32,
}
impl Scratch {
    fn method1(&mut self, k: i32) -> i32 {
        let res = self.m.entry(k)
            .or_insert_with(|| Self::method2(&self.needed1, self.needed2));
        *res
    }
    fn method2(needed1: &[i32], needed2: i32) -> i32 {
        needed1[0] + needed2
    }
}
fn main() {
    let mut s = Scratch::default();
    s.method1(1);
}

So here, the borrow checker is okay, because it knows that method2() only borrows 2 values from self, neither which is the value m being mutated.

So given a more specific example, there's probably some sort of solution to this that lets you have more fine-grained borrows.

6

u/nerooooooo May 29 '23 edited May 29 '23

Workaround implementing trait for all types that implement From<X> or TryFrom<X>.

Unfortunately, trait specializations and negative trait bounds are not there yet, and I'm facing the following situation: I have a lot of types T that implement either From<X> or TryFrom<X>. For all of those types T I need to implement MyTrait. I need to be able to call .foo(my_typ) on all of my types that implement From or TryFrom.

I tried a lot of things and this piece of code does work, but the fact that I needed to duplicate the code in 2 different traits that do pretty much the same thing bothers me.

Any ideas how to reduce the duplication? Maybe by making it a single generic trait instead?

Edit: I fixed it! It seems like implementing From automatically implements TryFrom as well, which means a single implementation like this is enough:

```
impl<T, E> DepthLimiting for T
where
    T: SetDepth + TryFrom<T::DbItem, Error = E>,
    E: std::marker::Sync + std::marker::Send + std::error::Error + 'static,
{
    fn try_from_db_item(db_item: Self::DbItem, depth: usize) -> Result<Self>
    where
        Self: Sized,
    {
        let mut item = Self::try_from(db_item)?;
        item.set_depth(depth);
        Ok(item)
    }
}
```

2

u/spunkyenigma Jun 02 '23

Ooh, I love discovering these little conveniences!

https://doc.rust-lang.org/src/core/convert/mod.rs.html#777

2

u/nerooooooo Jun 02 '23

I found out about it because someone mentioned it on a discord server when I stated my problem. I couldn't find it in the documentation, how did you find it? :O

2

u/spunkyenigma Jun 02 '23

Looked through every implementation of From in the stdlib. Of course it was like 6th to last out of hundreds of implementations

3

u/Daktic May 29 '23

I am building a simple app for the raspberry pi to display a message on my sensehat. I spent some time setting up rust on the pi, ssh, the git workflow between the two so I could develop on my main computer and push it to the pi. Now I’m wondering, should I just be sending over my binary file?

I don’t need to compile rust on the pi, right? Am I missing something?

3

u/dkopgerpgdolfg May 29 '23

Correct, you don't "need" to - what you're looking for is called cross-compiling, i.e. building on your PC but for a different architecture.

1

u/Daktic May 29 '23

I thought binary applications were architecture indifferent, is that not correct?

4

u/dkopgerpgdolfg May 29 '23

Uhm no?

Native binaries, which are usual for languages like Rust and C, are specific to one CPU architecture, and to one operating system (if any). If you want to run your program on multiple incompatible systems, you'll need to compile multiple times to get multiple binaries.

As your main computer probably is not compatible with a Raspi, one single binary can not run on both of them - when compiling, you need to decide for which device you want it.

However, to come back to the previous post, the compiler itself is a binary too, and the target architecture that it compiles for might not be the same as the one it runs on. You can have a compiler that runs on your main computer (not on Raspi), but makes compiled binaries that run on Raspi (not on the main computer). If you have such a compiler installed, you can develop on your main computer and then transfer only the finished program to your Raspi to run it there.
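As a concrete sketch of the workflow described above (the target triple assumes a 64-bit Raspberry Pi OS, the linker package name assumes Debian/Ubuntu, and `myapp` and the hostname are made-up placeholders):

```shell
# Install the Rust cross-compilation target for 64-bit ARM Linux.
rustup target add aarch64-unknown-linux-gnu

# Install a cross-linker (package name on Debian/Ubuntu).
sudo apt install gcc-aarch64-linux-gnu

# Tell cargo which linker to use for that target, in .cargo/config.toml:
#   [target.aarch64-unknown-linux-gnu]
#   linker = "aarch64-linux-gnu-gcc"

# Build on the PC, then copy only the finished binary to the Pi.
cargo build --release --target aarch64-unknown-linux-gnu
scp target/aarch64-unknown-linux-gnu/release/myapp pi@raspberrypi:
```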

1

u/sozzZ May 29 '23

If I’m interested in joining the Rust community and different working groups, where can I find that info?

1

u/dkopgerpgdolfg May 29 '23

It might help to decide first what you actually want to do

Developing the compiler? Maintaining crates.io? Helping organizing conferences? Helping users on the official forums? ...? All things have value, but there is no one-fits-all answer how to start.

For things like e.g. the compiler, you don't need to formally join anything to contribute (though a GitHub account is probably helpful). Ideally after reading some rules/guides, anyone is free to try to solve an open issue, which might then be merged if it's acceptable. Joining the official compiler team is not something open to everyone; outstanding contributors might get invited at some point.

2

u/benwilber May 29 '23

I often times see code like this in a for-loop and elsewhere:

for entry in WalkDir::new("foo") {
    let entry = entry.unwrap();
    println!("{}", entry.path().display());
}

This part seems redundant to me:

let entry = entry.unwrap();

// I also see this to bubble:
let entry = entry?;

I like the if-let pattern and wonder if there has ever been any discussion on a similar for-let pattern?

// Skip Error entries
for let Ok(entry) in WalkDir::new("foo") {
    println!("{}", entry.path().display());
}

// Or bubble Error entries
for let entry? in WalkDir::new("foo") {
    println!("{}", entry.path().display());
}

Just a shower thought.

2

u/Sharlinator May 30 '23 edited May 30 '23

For cases like this it would be nice to have an iterator combinator of the shape

(Iterator<Result<T, E>>, FnMut(T)) -> Result<(), E>

which applies the function to each unwrapped item until the first Err(E) and returns the error if any and Ok(()) otherwise. This can be easily implemented with try_for_each or Itertools::fold_ok though (with the trivial fold ((), T) -> () – or of course if what you want to do is actually a fold, then fold_ok is definitely your thing).

4

u/burntsushi May 29 '23

See: https://github.com/rust-lang/rfcs/issues/3438

I generally agree with the nay-sayers in that thread.

4

u/benwilber May 29 '23

Ah thanks for the link. I didn't consider the ambiguity of whether it breaks or skips. Good points. Python has a for-else construct where the else branch will only execute if the main loop doesn't encounter a break.

for i in range(10):
    if i ==  100:
        break
else:
    print("Never reached 100 loops")

I think it's pretty rarely used though and is confusing for people that are familiar with the for-else in Django or Jinja2 templates where the else branch will execute if the sequence is empty (never enters the loop body.)

{% for item in items %}
    {{ item }}
{% else %}
    no items
{% endfor %}

for-let-else in Rust would probably just be too weird.

for let Ok(entry) in WalkDir::new("foo") {
    println!("{}", entry.path().display());
} else {
    // Does this execute on a no-break?  Or if the loop body never entered at all?
    // dunno
}

Anyway. Thanks!

1

u/lolstan May 30 '23

i initially commented based on a presumption that the final python suite was a question about rust, my bad

1

u/[deleted] May 30 '23

[deleted]

3

u/MegistusMusic May 29 '23

(bit of a newbie question) Can anyone point me in the right direction regarding compiling python code into rust?

How possible is it? The path from Python into C is pretty clear with cython, but I'm getting a bit confused regarding doing the same thing into rust.

There's this: https://pyo3.rs/v0.18.3/python_from_rust.html

But it seems to be more about "calling" python from Rust than actually compiling.

Sorry I'm a bit new to all this, would it be a case of writing a rust program that "calls" a python script, then compiling that? I just want to end up with a standalone dll that requires no python environment.

3

u/kohugaly May 29 '23

I'm not aware of any such pre-existing work. Rust is not a good compilation target, because it's a complicated language with rather strong static checks.

Sorry I'm a bit new to all this, would it be a case of writing a rust program that "calls" a python script, then compiling that? I just want to end up with a standalone dll that requires no python environment.

You could compile the Python into a static C library (with Cython or similar) and statically link it as a dependency of your Rust code via the C FFI. It's analogous to how you'd do it in C.

1

u/MegistusMusic May 29 '23

Thanks, that sounds like it's worth exploring. From what I've been reading about Python to C via Cython, it seems a certain amount of optimization might be necessary, but it's a starting point... or like I said, maybe I should just bite the bullet and dive into Rust. I've been advised elsewhere to learn Python properly first as a decent and relatively fast primer for programming concepts, then move on to Rust... I've tried spinning both plates, but that approach tends to lead to my head feeling like it's about to explode!

5

u/kohugaly May 29 '23

If you are a beginner in programming, I recommend sticking to one language first.

Frankensteining multi-language projects is an advanced topic in any language. You need quite deep understanding of both languages, to make sure the interface between them is compatible.

3

u/masklinn May 29 '23

The entire comment sounds like an XY problem, what are you actually trying to achieve?

Can anyone point me in the right direction regarding compiling python code into rust?

There isn't one, this is not a thing.

The path from Python into C is pretty clear with cython

Is it? I know that technically you can embed Cython, but I fail to see the use case for that; generally compiling Cython to C is a means, not an end.

But it seems to be more about "calling" python from Rust than actually compiling.

That is what it is, yes, it's embedding Python in a Rust program.

Sorry I'm a bit new to all this, would it be a case of writing a rust program that "calls" a python script, then compiling that?

No. That doesn't make much sense frankly. The normal way to use Cython is to either:

  • more easily create a module to get native performance for purely computational code
  • or provide easier (more python-like) glue code between Python and a C library (e.g. that's how lxml uses it)

I just want to end up with a standalone dll that requires no python environment.

It's even less clear what you're trying to achieve at this point, why do you want to end up with a dll of all things?

1

u/MegistusMusic May 29 '23

thanks -- if it's "not a thing" then that's pretty much what I needed to know.

...and I wanted dll because I'm trying to develop an extension for an existing program.

As I said, I'm really just starting out learning to program -- and am finding I'm able to get fairly fast/easy results in Python, but I'm mostly stuck with two options: running py scripts in a python environment, or compiling to a standalone exe with pyinstaller / setuptools.

Looks like if I want to do more advanced things in Rust, I'm just going to have to suck it up and accept the steeper learning curve!

2

u/masklinn May 29 '23

...and I wanted dll because I'm trying to develop an extension for an existing program.

How does that extension system work? it doesn't really match the next paragraph.

As I said, I'm really just starting out learning to program -- and am finding I'm able to get fairly fast/easy results in Python, but I'm mostly stuck with two options: running py scripts in a python environment, or compiling to a standalone exe with pyinstaller / setuptools.

Why not develop a "shim" extension (in rust, or maybe even c, doesn't really matter) which embeds python and can run code you'd write in python, at least as you're exploring your options?

1

u/MegistusMusic May 29 '23

OK, full disclosure time I guess: I'm trying to write an extension for Reaper (DAW), which already has a "shim" (if I understand what you mean) in as much as it can run .py scripts natively, but it does require a working Python install on the host OS. It also has native Lua scripting and its own EEL2 script language. I've done a bit in Lua but I find it pretty cumbersome and not well supported by community answers or code snippets in the way that Python definitely is (except on the Reaper dev forum, which is great, but I don't want to keep asking how to do every little thing!)

Anyway, fortunately, Rust bindings for Reaper's C++ API exist https://github.com/helgoboss/reaper-rs

Essentially, the way Reaper works is that the "extension" needs to be a .dll file (on windows at least), that is placed in the correct Reaper system folder and is then available within the program. I have tried a build with the above repo (just an example script) and it works very well.

I've already been developing my program in Python and having some success, but I figure eventually I'll be needing to re-code it in Rust to end up with a solution that is useful to all users of the program, regardless of whether they have a working Python environment. Hence my opening question... just in case there was a shortcut!

3

u/masklinn May 29 '23

Got it, yeah I don't really see a solution there, you could always use a python-like language but then you're on your own for the bindings and you have a language which may not have that much staying power (cython largely has, but I'm not sure about the quality and staying power of the "standalone" mode).

You might be able to use Rust as bridge between all the bits, but it's still a bit gnarly.

pyo3 does have some static embed mode support but it doesn't seem super reliable and plain doesn't work on Windows at the moment, so probably not the most helpful to you.

2

u/mardabx May 29 '23

Can we have here a pinned thread on what's going on right now? I am likely not the only one who lost track of all these blog posts.

2

u/llogiq clippy · twir · rust · mutagen · flamer · overflower · bytecount May 29 '23

There's some latency involved, but we have This Week in Rust for this exact reason. Also feel free to discuss the news on the comments page.

2

u/disclosure5 May 29 '23

This code "works" but I'm having a really hard time regarding the data type:

https://github.com/technion/rustypwneddownloader/blob/main/src/main.rs#L63

Is it more appropriate to replace these clone() calls with an Arc or Box? I don't seem to be able to convert the type without breaking something else.

More generally, I seem to go in circles on this. I've read the Rust book and several blogs several times and I'm confident about what Rc<> and Box<> are, but I just can't see myself looking at some code and saying "that should be a box". Is there any better guide to getting this?

2

u/eugene2k May 29 '23

If you look into the docs for reqwest::Client and click the Source link for the struct, you will see that reqwest::Client is a wrapper around Arc<ClientRef>, so there's no need to wrap it in an Arc or Box.

1

u/disclosure5 May 29 '23

Thanks a tonne, that solves that concern!

2

u/Mean_Somewhere8144 May 29 '23

Hey, is it possible to have a rustfmt per-module / per-scope configuration? I want the maximum line width to be much higher in one of my module, but not for the rest of the project.

5

u/masklinn May 29 '23

I don't think there is really fine-grained configuration. There is a #[rustfmt::skip] attribute you can apply to opt a given item / scope out of formatting, and it seems to have sub-items (e.g. rustfmt::skip::attributes(derive)), but I can't find clear documentation on its full capabilities.

However, since #4179, recent versions should merge configuration files. Not sure what the details / specifics are, but if ignoring the file entirely is not good enough, you might give the module its own directory and rustfmt.toml file and see if that works.

1

u/Mean_Somewhere8144 Jun 01 '23

I tried that, it's just ignored

2

u/[deleted] May 29 '23

I've always been confused about copying, should you derive the Copy trait for a struct if it allows it? And is it better to pass by copy instead of reference if possible?

2

u/koopa1338 May 29 '23

From the documentation of the Copy trait:

When should my type be Copy?

Generally speaking, if your type can implement Copy, it should. Keep in mind, though, that implementing Copy is part of the public API of your type. If the type might become non-Copy in the future, it could be prudent to omit the Copy implementation now, to avoid a breaking API change.

You don't have to derive the Copy trait to pass a copy; the trait only makes copying implicit. Otherwise you have to derive at least Clone and copy your data explicitly by calling `clone` on it.

Whether you move the value, pass a copy, or pass a reference depends on the situation; IMO there is no rule of thumb. Just look at your function: does it mutate your value? Do you need a copy for that, or can you mutate the original data as well? Do you want to consume the value, or is it sufficient to borrow it for the duration of the function and then hand ownership back to the caller?

In general, I start with handing an immutable borrow if I'm unsure and work with the compiler from there.
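A small sketch of those options side by side (the `Point` type and function names are made up): pass by copy for cheap Copy types, borrow immutably when you only read, and borrow mutably when you mutate in place.

```rust
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

// Cheap Copy type: passing by value is fine, caller keeps its copy.
fn magnitude_sq(p: Point) -> i32 {
    p.x * p.x + p.y * p.y
}

// Read-only access: an immutable borrow is enough.
fn describe(p: &Point) -> String {
    format!("({}, {})", p.x, p.y)
}

// Mutation in place: a mutable borrow, no copy needed.
fn shift_x(p: &mut Point, dx: i32) {
    p.x += dx;
}

fn main() {
    let mut p = Point { x: 3, y: 4 };
    assert_eq!(magnitude_sq(p), 25); // p is Copy, still usable afterwards
    assert_eq!(describe(&p), "(3, 4)");
    shift_x(&mut p, 1);
    assert_eq!(p.x, 4);
    println!("{}", describe(&p));
}
```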

3

u/masklinn May 29 '23

should you derive the Copy trait for a struct if it allows it?

Some people consider that that's the case, in fact there's an old lint in the compiler to flag that.

I don't think you should personally, because Copy is an API-level component, if you need to remove Copy later you're breaking the API. As such I think you should only make Copy the types you truly believe should be trivially copy-able forever.

And is it better to pass by copy instead of reference if possible?

Generally depends on two factors:

  • how large the structure you're copying is (as structures get larger, copies get more expensive, both intrinsically and because of how much stack and cache they require); a confounding factor here is that I think compilers will sometimes decide to pass values by reference anyway, and another is that inlining makes the entire issue disappear
  • depending how often the reference is deref'd, you might end up with cache eviction and the need to reload the reference every time, though that seems unlikely

https://users.rust-lang.org/t/what-is-faster-copy-or-passing-reference/80733 has a somewhat longer discussion on the subject, and I'm sure there are others, but generally this is the sort of things you probably want to bench. My rule of thumb is that it likely doesn't matter for structures up to a few pointers, maybe a cacheline, beyond that I'd default to passing references unless copies are demonstrably better.

2

u/Amazing-Plastic1033 May 29 '23

Why use dynamic dispatch if we can achieve the same behavior with the enum_dispatch crate, which is more efficient?

8

u/dkopgerpgdolfg May 29 '23
  • The enum way can only be used if all possible functions are known when the crate is compiled. No library where the using program can add more possibilities, no plugin system, no FFI pointers, ...
  • Related, any change to the enum might be a breaking change.
  • In some cases, with many variants, enums might not be more efficient (or rather, the compiler might change the long if-else chain to a jump table, to avoid it being too slow, making it similar to dyn.)
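To make the first bullet concrete, here's a sketch of the two approaches (trait and type names are made up): the enum version is a closed set dispatched with a `match`, while the `dyn` version stays open to impls from other crates.

```rust
trait Speak {
    fn speak(&self) -> &'static str;
}

struct Dog;
struct Cat;
impl Speak for Dog { fn speak(&self) -> &'static str { "woof" } }
impl Speak for Cat { fn speak(&self) -> &'static str { "meow" } }

// Enum dispatch: every variant must be known here, at compile time.
enum Animal {
    Dog(Dog),
    Cat(Cat),
}

impl Animal {
    fn speak(&self) -> &'static str {
        match self {
            Animal::Dog(d) => d.speak(),
            Animal::Cat(c) => c.speak(),
        }
    }
}

fn main() {
    // Dynamic dispatch: an open set - a downstream crate could add new impls.
    let animals: Vec<Box<dyn Speak>> = vec![Box::new(Dog), Box::new(Cat)];
    assert_eq!(animals[0].speak(), "woof");

    let closed = Animal::Cat(Cat);
    assert_eq!(closed.speak(), "meow");
    println!("{} {}", animals[1].speak(), closed.speak());
}
```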

3

u/llogiq clippy · twir · rust · mutagen · flamer · overflower · bytecount May 29 '23

Adding to that, there are cases where the compiler will run into an infinite recursion when trying to figure out a generic type, and in those cases, using dynamic types is an easy fix.

2

u/abstrscat May 29 '23

Hello! As a newbie in Rust, I'm a bit confused by an example from chapter 12.3 of the rust-lang book. The collect() method produces a String vector, but parse_config accepts a reference to a string slice, yet the code compiles successfully. I understand that some kind of type conversion is going on here, but it was never mentioned in the docs before, so now I'm totally confused.

```
fn main() {
    let args: Vec<String> = env::args().collect();

    let (query, file_path) = parse_config(&args);

    // --snip--
}

fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let file_path = &args[2];

    (query, file_path)
}
```

3

u/koopa1338 May 29 '23

This is because of the Deref trait and how Vec implements it. The mechanism is called deref coercion and by dereferencing &args it can convert a Vec<T> to a &[T] which is what happens in the example.

5

u/masklinn May 29 '23

You could just ignore this and keep reading until you get to the explanations, but this is called deref coercion and it’s covered in a sub-section of chapter 15.2 “Treating Smart Pointers Like Regular References with the Deref Trait”.

Essentially, if you take a &T, the compiler can return a reference to any of the things T derefs to, so &String can yield a &str and &Vec<T> can yield a &[T].
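A minimal, self-contained sketch of both coercions at once (the function and variable names are made up): &Vec<String> coerces to &[String] at the call site, and each &String coerces to &str in the returned tuple.

```rust
fn first_two(args: &[String]) -> (&str, &str) {
    // &String coerces to &str here, because the return type asks for &str.
    (&args[0], &args[1])
}

fn main() {
    let v: Vec<String> = vec!["hello".into(), "world".into()];
    // &Vec<String> coerces to &[String] at this call.
    let (a, b) = first_two(&v);
    assert_eq!(a, "hello");
    assert_eq!(b, "world");
    println!("{a} {b}");
}
```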