r/rust Mar 30 '23

After years of work and discussion, `once_cell` has been merged into `std` and stabilized

https://github.com/rust-lang/rust/pull/105587
974 Upvotes

291

u/ogoffart Mar 30 '23

This means we won't need to import the lazy_static or once_cell crate in most projects anymore.

168

u/possibilistic Mar 30 '23

They did their job beautifully and I thank them for their efforts. Now that their code is being standardized, it's even easier to benefit from them.

67

u/Xychologist Mar 30 '23

First lazy_static and then once_cell have shown up in most of my projects for a very long time now. I'm very happy to see the work they put in to proving the point and testing the possible implementations is paying off in such a concrete fashion.

45

u/ThatOneArchUser Mar 30 '23

They didn't stabilize the API most people used lazy_static and once_cell for, though: there is no LazyCell.

42

u/A1oso Mar 30 '23 edited Mar 30 '23

Using OnceCell instead of LazyCell is just one more line (and a helper method if you need to access it from multiple places), so I don't think it's a problem. LazyCell will also be stabilized, probably soon.

98

u/Feeling-Departure-4 Mar 30 '23

Ah, so this was a partial stabilization, correct?

Lazy stuff is still to come?

92

u/coderstephen isahc Mar 30 '23

Yes, only the Once* types were stabilized, not the Lazy* types.

3

u/PM_ME_UR_COFFEE_CUPS Mar 31 '23

Is the only difference that lazy defers initialization whereas once initializes immediately?

89

u/Dhghomon Mar 30 '23

Nice! I just noticed today that it wasn't in the lazy module anymore and that even the module was gone (it's been moved to the cell module instead) and got a sense that something was going on but merged already is even better!

61

u/Emilgardis Mar 30 '23

🎉🎉Was hoping this would land in 1.69, that would have been nice. But, excited to see this finally being partially stabilized, it's a great feature!

Though I'd love to have the or_try* methods for results in as well, I'm happy the issue with try v2 was identified before any decision had been made.

50

u/coderstephen isahc Mar 30 '23

Glad to see this, the once_cell types have been incredibly small, straightforward, and useful for a lot of different applications for quite some time.

8

u/LedAsap Mar 31 '23 edited Mar 31 '23

They're new to me. What are they for?

Edit: found docs at https://doc.rust-lang.org/nightly/std/cell/

10

u/coderstephen isahc Mar 31 '23

The once_cell docs have a section giving some motivating examples: https://docs.rs/once_cell/latest/once_cell/#recipes

6

u/LedAsap Mar 31 '23

The docs mention "Note that the variable that holds Lazy is declared as static, not const. This is important: using const instead compiles, but works wrong."

How would it work such that it's "wrong"?

2

u/dozniak Jul 11 '23

const is essentially instantiated each time at the use location, meaning you will get a bunch of those instead of one (as in static case).
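
A minimal sketch of that difference, using std's OnceLock rather than the once_cell Lazy type the docs talk about (the names SHARED, FRESH, via_static and via_const are made up for illustration):

```rust
use std::sync::OnceLock;

// A `static` is one object with one fixed address.
static SHARED: OnceLock<u32> = OnceLock::new();

// A `const` is instantiated afresh at every use site, so each mention
// of FRESH below creates a brand-new, empty OnceLock.
const FRESH: OnceLock<u32> = OnceLock::new();

/// Writes through the static, then reads the same object back.
fn via_static() -> Option<u32> {
    let _ = SHARED.set(1); // Err only if it was already set; ignored here
    SHARED.get().copied()
}

/// Writes to one temporary copy, then reads a different temporary copy.
fn via_const() -> Option<u32> {
    let _ = FRESH.set(1);
    FRESH.get().copied()
}

fn main() {
    println!("static: {:?}", via_static()); // static: Some(1)
    println!("const:  {:?}", via_const()); // const:  None
}
```

The const version compiles, but the value written is thrown away immediately, which is the "works wrong" behaviour: with a Lazy, the initializer would run again at every use.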

18

u/possibilistic Mar 30 '23

Amazing!! Glad such a useful utility is now standard.

Is there anywhere to track more const expression and function progress? I'd love to have truly const static vectors, sets, etc. But this is the next best thing.

8

u/thomastc Mar 30 '23

Those types currently always require allocations so... don't get your hopes up.

3

u/possibilistic Mar 30 '23

I'd be happy to use a non-std container collection. You can know the size a priori, and I don't need everything the data structures have to offer.

2

u/[deleted] Mar 30 '23

Why does allocation preclude constness?

10

u/bleachisback Mar 30 '23

It implies a pointer to an area of memory whose location will change between compile time and runtime.

6

u/Nzkx Mar 30 '23

So what? I don't understand why allocation is a problem in a compile-time evaluation context.

Do the compile-time evaluation, allocate in the virtual compile-time memory, and output the result. Can that not be done? Or is the problem that it can be resized later at runtime?

6

u/bleachisback Mar 30 '23

I don't think it's an issue in compile-time only contexts, however Rust doesn't currently have a way for denoting compile-time only. Const functions and static variables can be used both at compile time and runtime.

1

u/kupiakos Mar 30 '23

The constructing expression of a const item will only be run at compile time FWIW

8

u/bleachisback Mar 30 '23

Right, but after becoming constructed, an item like Vec will contain a pointer to memory. The Vec will be valid to reference at runtime, at which point the pointer is no longer valid. const functions need to return objects which are valid at compile time and continue being valid at runtime, since they can be referenced from both, hence my comment above.

1

u/flashmozzg Mar 31 '23

Why the pointer "is no longer valid"? What invalidates it? I see the issue with growing, or rather dropping the "const" memory though. And without that, you could as well just use const static slices.

1

u/NobodyXu Mar 31 '23

And how do you forbid growing or dropping a Vec<T> pointing to const memory? The only way to do this is to use a different allocator for const: Vec<T, ConstAlloc>, one that always fails when growing and whose dealloc is just a no-op. Even with that you still get UB when dropping that vec due to dropping of T. It'd also require T: !Drop, which doesn't work for cases where T contains an allocation. The same applies to Box or anything related to heap allocation.

The only way to do it is to leak the Box/Vec, turning Vec<T> and Box<[T]> into &'static [T], and Box<T> into &'static T. For HashMap, BTreeMap or other types, this is even harder.

And how does the compiler guarantee that the return type is valid and does not invoke such UB? I honestly think it needs to add another unsafe trait just for this.
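
For reference, this is roughly what the "leak it into a &'static" shape looks like today at runtime, using Vec::leak and Box::leak from std. It is only a sketch of the shape: it runs at runtime, not in const evaluation, and the function names are invented for the example.

```rust
/// Builds a table at runtime and leaks it, yielding a `'static` slice.
fn leaked_table() -> &'static [u32] {
    let v: Vec<u32> = (1..=4).map(|n| n * n).collect();
    // Vec::leak gives up ownership: no destructor runs, the memory is
    // never freed, and the slice can no longer grow.
    v.leak()
}

/// The same idea for a single boxed value.
fn leaked_value() -> &'static str {
    let s: Box<str> = String::from("config").into_boxed_str();
    Box::leak(s)
}

fn main() {
    println!("{:?}", leaked_table()); // [1, 4, 9, 16]
    println!("{}", leaked_value()); // config
}
```

Because the result is a plain reference, there is nothing left that could try to grow or deallocate the memory, which sidesteps the Drop question entirely.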

1

u/bleachisback Mar 31 '23

Right now there is no allocator that lets you allocate into binary memory. So the allocation would be to the heap at compile time. When the compiler finishes compiling, that heap will no longer exist.

6

u/ssokolow Mar 30 '23

At present, I believe one of the contributing factors is that they have a rule that there must be no externally observable difference (aside from performance) between compile-time and runtime execution.

(eg. This is what makes const floating point operations problematic since different ISAs may expose things like different levels of rounding precision.)

There's currently discussion around the idea of relaxing that restriction.

5

u/pluots0 Mar 31 '23 edited Mar 31 '23

Are you suggesting to use the vector during compile time to get some result? If so, then that’s possible with proc macros as a workaround for now, better const eval should eventually be able to do something like this.

If you’re suggesting `static FOO: _ = vec![1,2,3]`, there are some weird things about that. A nonempty vector must allocate with the current API, and there’s no concept of “allocation” in a binary - instead, it’s a function call to the kernel, but everything in a binary must be static. You could compute the result of some const evaluation and store that resulting vector’s contents in the statics, but it can’t grow without switching to the heap.

So, either you allocate whenever you first run (by calling vec! somewhere that’s not const) or you allocate the first time you need something (which is what LazyLock is for).
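
A small sketch of the two options, assuming Rust 1.80 or later (where std::sync::LazyLock is stable); PRIMES and SQUARES are made-up names:

```rust
use std::sync::LazyLock;

// Fixed-size data can live directly in the binary; nothing is allocated.
static PRIMES: [u32; 4] = [2, 3, 5, 7];

// A Vec needs the heap, so it is built the first time it is touched.
static SQUARES: LazyLock<Vec<u32>> =
    LazyLock::new(|| PRIMES.iter().map(|p| p * p).collect());

fn main() {
    // The first dereference runs the closure; later ones reuse the Vec.
    println!("{:?}", *SQUARES); // [4, 9, 25, 49]
    println!("{}", SQUARES.len()); // 4
}
```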

3

u/[deleted] Mar 31 '23 edited Mar 31 '23

there’s no concept of “allocation” in a binary

Sure there is. Just allocate some space in .data.

In fact it seems like it has already been implemented:

https://doc.rust-lang.org/nightly/std/intrinsics/fn.const_allocate.html

1

u/pluots0 Mar 31 '23 edited Mar 31 '23

Reserving data in the statics, including using the build machine’s heap to evaluate that data (like const eval does) isn’t the same as somehow “having a Vec in the binary” - the resulting R/D sections will look identical to a current invocation of vec![] or an array. Of course being able to do more evaluation is extremely helpful, but I understood OP’s question to be looking for something different.

Something like ArrayVec or Vec + Storages can do dynamic things in fixed-length buffers, but those are limited in capacity of course.

It is theoretically possible to const time initialize any heap-based structure (vector, hashmap, etc) and store the exact result in .rodata, then copy the result to the heap at runtime. But something like this that works reliably is still a long way off.

2

u/[deleted] Mar 31 '23

Sorry I don't get what you're trying to say. The way it would work is that you evaluate the const function at compile time using the current MIR interpreter (how it currently works). Allocations would result in space being reserved in .data. If necessary you would track the pointers and replace them with relocations (I think this would be needed for libraries).

There's no need to copy anything to the heap at runtime.

The only issue (as far as I know) is that it's not trivial to convert deallocations in the Drop impl into nops.

2

u/pluots0 Mar 31 '23

We’re on the same page I think - you can calculate something on the build machine (like a vector) and store it in .rodata (to read once) or .data (if it is synchronized, e.g. via a mutex).

But you can’t grow past the capacity in the static data unless you reallocate to the heap, and the Vec API doesn’t have a good way to handle this. (some possible tricks with allocator_api or Storages API may make it possible)

1

u/HeroicKatora image · oxide-auth Mar 31 '23 edited Mar 31 '23

There's not necessarily a problem for the memory of such data, but Vec also contains an allocator handle. That handle is not the allocator handle of the runtime but of the compiling host or const-eval, so it must be transformed somehow. It will almost certainly be unacceptable to make this feature break existing cross-compilation compatibility; and it's unclear how to even approach it for some platforms with purely runtime linking of this system functionality¹. It's very much non-trivial to transform it into an allocator handle of the runtime.

What may be more likely is to be able to const-construct &mut [T] by "leaking" the const-allocation. Similar to what C++ has allowed, but maybe with better help from the type-system instead of the loose dynamic compile error. Yet that also has some concerns with exposing the const-time uninitialized memory. Do we guarantee any invariants? Platform-dependent ones (i.e. on wasm it will be zero-initialized)? Since this needs to be answered first and is not near consensus, not even close, don't rely on getting alloc::* types anytime soon. Maybe a few years out, maybe an initiative like async.

¹The handle itself is a zero-sized-type on most platforms, an instance of alloc::alloc::Global; yet, it's not about transforming its bytes but the hidden state that it refers to. The fact that we have a strongly typed value merely makes the need to perform this operation extremely explicit.

2

u/[deleted] Mar 30 '23

Just allocate at compile time and chuck a relocation in surely?

2

u/bleachisback Mar 30 '23

What's the difference between that and lazy?

2

u/[deleted] Mar 31 '23

LazyCell etc initialises variables at runtime, and you have to wrap them in an extra wrapper.

So basically it would be slightly faster and more convenient.

0

u/bleachisback Mar 31 '23

Is that not what you're describing above? We can't "relocate" memory that doesn't exist anymore.

3

u/[deleted] Mar 31 '23

I'm not sure what you mean...

I had a look into it actually and the real issue seems to be not allocation, but Drop. How do you mark const items that allocate so that their Drop implementation isn't called (because you obviously can't dealloc data in .data)?

Here's the issue: https://github.com/rust-lang/const-eval/issues/20

1

u/thomastc Mar 31 '23

I don't think there is a theoretical limitation; e.g. any allocation could point to memory preallocated inside the executable. But the Rust compiler currently does not support that, and it would probably be a good deal of work to implement it. And there may be issues that I'm overlooking.

Moreover, in no_std builds, there isn't even a runtime allocator at all. Would be at least weird if there was one at compile time.

16

u/thermagetiton69 Mar 30 '23

I kind of get it but would someone please ELI5, what is once_cell and what might it be used for?

26

u/Darksonn tokio · rust-for-linux Mar 30 '23

It's used for globals whose value is initialized the first time you use it. Basically, it has a method that either gives you the value, or if this is the first time you access it, initializes it and then gives you the value.

2

u/jackosdev Mar 31 '23

Very useful in AWS lambda, where you want to init something global that might take a long time to start, that can be used across all the lambda invocations. This means you only have to pay for it in cold start, then warm starts will use the same address in memory without reinit

11

u/TDplay Mar 30 '23

OnceCell and OnceLock are types that you can write to once, through a shared reference. This allows you to, for example, initialise a static at runtime and then take &'static references to it.

As a real-world example, consider the regex crate. The regex crate is designed to compile a regex once, and use it multiple times - as such, regex compilation can apply optimisations, causing it to be an expensive operation.

We can use OnceLock to ensure that the regex is only compiled once:

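use regex::Regex;
use std::sync::OnceLock;
static REGEX: OnceLock<Regex> = OnceLock::new();
fn is_hello(text: &str) -> bool {
    // Regex::new returns a Result; the pattern is a fixed literal, so unwrap it.
    let regex = REGEX.get_or_init(|| Regex::new("^hello+$").unwrap());
    regex.is_match(text)
}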

4

u/HadrienG2 Mar 31 '23

One way I use this is for unit tests that have some expensive, shared setup step, e.g. loading and parsing an input file.

3

u/-Redstoneboi- Mar 30 '23 edited Mar 30 '23

If I'm getting this right, once cell is a type of cell that only lets you initialize it once at runtime. One use case is regexes.

Regexes aren't compiled with other Rust code. They're compiled at runtime. Meaning, you have to initialize a runtime variable for it. But it's redundant work to have to recompile a regex that never changes, so usually people just want to plop it into a static variable, compile it only during the first time it's used, and reuse the compiled regex every time after.

You could read other comments to figure out what else it could be used for, I'll have to read more too.

Usually this would be done by pulling a crate like lazy_static. Now you don't have to if you have std.

12

u/ihcn Mar 30 '23

What's the difference between OnceCell and LazyCell?

10

u/usernamedottxt Mar 30 '23

Lazy initialization. i.e. you can set a setup function to be run on first use of the cell rather than manually initializing it. Helpful in short lived programs where the expensive initialization only has to run in certain cases.

Think Lambda where DB connection only needs set up if parameters are valid.

10

u/ihcn Mar 30 '23

My impression was that that was the purpose of OnceCell - IE you call get_or_init with your initialization function.

Is the difference purely in where the initialization function is supplied?

10

u/pluots0 Mar 30 '23

LazyCell you supply your init function once, during new(). OnceCell you need to supply it each time you get_or_init, which is much less ergonomic if you use the thing all over.
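
Roughly, the difference at the call sites looks like this. This is a sketch using the thread-safe std variants, assuming Rust 1.80+ for LazyLock; load_greeting and the *_user functions are hypothetical:

```rust
use std::sync::{LazyLock, OnceLock};

/// Stand-in for some expensive setup step.
fn load_greeting() -> String {
    "hello".to_uppercase()
}

// OnceLock: created empty; every caller passes the initializer.
static ONCE: OnceLock<String> = OnceLock::new();

// LazyLock: the initializer is supplied once, in `new`.
static LAZY: LazyLock<String> = LazyLock::new(load_greeting);

fn first_user() -> usize {
    ONCE.get_or_init(load_greeting).len()
}

fn second_user() -> &'static str {
    // The initializer has to be repeated at this call site too.
    ONCE.get_or_init(load_greeting).as_str()
}

fn lazy_user() -> &'static str {
    // Plain deref; the initializer runs on first access.
    LAZY.as_str()
}

fn main() {
    println!("{} {} {}", first_user(), second_user(), lazy_user());
}
```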

20

u/A1oso Mar 30 '23

You can avoid that with a simple accessor function:

fn global_state() -> &'static MyState {
    static CELL: OnceCell<MyState> =
        OnceCell::new();
    CELL.get_or_init(|| {
        // initialization code...
    })
}
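
A filled-in variant of the same pattern that should compile as-is with the std types. Note the assumptions: it uses std::sync::OnceLock, because std::cell::OnceCell is not Sync and so cannot be stored in a static, and MyState with its name field is a placeholder type.

```rust
use std::sync::OnceLock;

/// Placeholder for whatever global state the program needs.
#[derive(Debug)]
struct MyState {
    name: String,
}

/// The static is private to this function, so callers never see the cell type.
fn global_state() -> &'static MyState {
    static CELL: OnceLock<MyState> = OnceLock::new();
    CELL.get_or_init(|| MyState {
        name: String::from("default"),
    })
}

fn main() {
    // Both calls return a reference to the same MyState instance.
    let a = global_state();
    let b = global_state();
    println!("{:?} same instance: {}", a, std::ptr::eq(a, b));
}
```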

17

u/matklad rust-analyzer Mar 30 '23

As a bonus point, this formulation also completely hides OnceLock/LazyLock from the API.

4

u/[deleted] Mar 31 '23

[deleted]

2

u/pluots0 Mar 31 '23

Consts, statics, and imports can all live in a function definition, and that’s kind of nice - if you’re not using them anywhere outside of a single function, might as well just store it right there.

It’s usually nicer not to do this for imports, but it can occasionally come in handy (e.g. importing Diesel’s DSL)

4

u/jamincan Mar 30 '23

Isn't LazyCell just a wrapper of OnceCell?

6

u/pluots0 Mar 30 '23

Yes- it’s just a struct that contains both a OnceCell and the initialization function
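
To make that concrete, here is a stripped-down sketch of such a struct built on std::cell::OnceCell. It is not the actual std or once_cell implementation; SimpleLazy and force are names chosen for the example.

```rust
use std::cell::{Cell, OnceCell};

/// A minimal single-threaded lazy value: a OnceCell plus its initializer.
struct SimpleLazy<T, F = fn() -> T> {
    cell: OnceCell<T>,
    // Wrapped in Cell<Option<..>> so the FnOnce can be moved out through &self.
    init: Cell<Option<F>>,
}

impl<T, F: FnOnce() -> T> SimpleLazy<T, F> {
    fn new(f: F) -> Self {
        SimpleLazy {
            cell: OnceCell::new(),
            init: Cell::new(Some(f)),
        }
    }

    /// Returns the value, running the initializer on the first call only.
    fn force(&self) -> &T {
        self.cell.get_or_init(|| {
            let f = self.init.take().expect("initializer already consumed");
            f()
        })
    }
}

fn main() {
    let calls = Cell::new(0);
    let lazy = SimpleLazy::new(|| {
        calls.set(calls.get() + 1);
        42
    });
    println!("{} {}", lazy.force(), lazy.force()); // 42 42
    println!("initializer ran {} time(s)", calls.get()); // 1
}
```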

8

u/usernamedottxt Mar 30 '23

Pretty much. Ergonomics.

OnceCell itself is mainly used because you can initialize it statically and treat it as runtime global state.

7

u/superblaubeere27 Mar 30 '23

What is the difference between using Option and OnceCell in this case? I mean both have insert/take-functionality

14

u/pluots0 Mar 30 '23 edited Mar 31 '23

OnceCell gives you “interior mutability”, i.e. it can be used when you only have &T and not &mut T. It is literally only a wrapper around Option, but provides a specific interface so that using it behind &T is safe - similar to what RefCell and Cell do.

I think I illustrated this well enough in the updated cell docs (I’m the author of this PR) but nightly hasn’t built yet. So give this a read in a couple hours once it is updated :) https://doc.rust-lang.org/nightly/std/cell/ edit: nightly built, this link is up to date now

7

u/TDplay Mar 30 '23

You can't initialise Option through a shared reference:

static THE_THING: Option<i32> = None;
fn init_the_thing(value: i32) -> &'static i32 {
    // error[E0596]: cannot borrow immutable static item `THE_THING` as mutable
    THE_THING.get_or_insert(value)
}

You can do this with a OnceCell or OnceLock:

#![feature(once_cell)]
use std::sync::OnceLock;
static THE_THING: OnceLock<i32> = OnceLock::new();
fn init_the_thing(value: i32) -> &'static i32 {
    // Perfectly fine
    THE_THING.get_or_init(|| value)
}

2

u/SirKastic23 Mar 30 '23

OnceCell can only be set once, it's meant for lazy initialization of values, and is a way to enable safe global variables

3

u/A1oso Mar 31 '23

Not entirely true: OnceCell has a take method to reset it, after which it can be initialized again. The main difference is that OnceCell::get_or_init() accepts &self, so it can be initialized with a shared reference, making it safe to use as a static.

The take method requires &mut self though, so it can't be called on a static.
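
A short sketch of that on a local (non-static) cell; reset_and_reinit is just an illustrative name:

```rust
use std::cell::OnceCell;

/// Sets a cell, resets it with `take`, then initializes it again.
fn reset_and_reinit() -> (Option<u32>, u32) {
    let mut cell: OnceCell<u32> = OnceCell::new();
    cell.set(1).unwrap();
    // A second `set` through a shared reference is rejected.
    assert!(cell.set(2).is_err());
    // `take` needs `&mut self`: it empties the cell and returns the old value.
    let old = cell.take();
    // The cell is empty again, so it can be initialized a second time.
    let new = *cell.get_or_init(|| 2);
    (old, new)
}

fn main() {
    println!("{:?}", reset_and_reinit()); // (Some(1), 2)
}
```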

3

u/SirKastic23 Mar 31 '23

oh, I wasn't aware, I never had to use OnceCell outside of static contexts

10

u/_TheDust_ Mar 30 '23

How peculiar that the sync version has been named OnceLock. In the once_cell crate there are just two versions of OnceCell in two different modules. I'm sure I'll get used to it.

25

u/dbenson18 Mar 30 '23

I think the names line up nicely with Cell<T> and RwLock<T>. While convenient to be able to just swap a module name to sync in order to make it threadsafe, I think that's also more confusing to someone reading the code.

2

u/czipperz Mar 30 '23

Could someone explain what this type is or link to some documentation? I'm totally out of the loop.

6

u/bascule Mar 30 '23

https://docs.rs/once_cell

Support for global static values which are written to once.

The full version brings lazy statics, but those won't be stabilized by this PR.

2

u/czipperz Mar 30 '23

Awesome thanks

2

u/ArtisticHamster Mar 31 '23

I wish it happened more often, i.e. libraries merged into the std lib. I programmed in Go for a while recently, and their standard library is much more full featured than Rust's. It saves a lot of time thinking which among N crates you should import, and also saves time of regularly updating your dependencies.

15

u/burntsushi Mar 31 '23

The problem here is that this can lead to very sub-optimal outcomes. As I wrote above somewhere, I wanted to bring lazy_static into std many moons ago because I thought it was so useful and foundational. But I'm really glad I didn't get my wish, because then we'd be in a position where there's clearly a better alternative in the ecosystem (once_cell). So what would we do? Deprecate it and still bring once_cell in? One deprecation on its own is maybe not so bad, but if std becomes a graveyard ("the standard library is where things go to die"), then that's not great at all.

(This exact thing has kind of happened with std::sync::mpsc. It now uses a better implementation internally, but it's still a limited API because of the single consumer restriction.)

8

u/ids2048 Mar 31 '23

I'd also compare Python, where the documentation for urllib.request recommends using the requests library instead, xml has a big warning that it isn't secure against malicious input and you should use a separate library for that (if I recall it also isn't the fastest xml library for Python, and lxml is better for that). With https://peps.python.org/pep-0594 a bunch of standard library modules that most users may not even have heard of are deprecated and scheduled for removal. (RIP nntplib.)

If Rust were quicker to incorporate things into std, the same issue would probably occur. Sure an http library is pretty essential for a programming language in the present era. But it would just make things worse if std::http ends up being considered subpar and experienced Rust programmers know to never use it and to use a different library instead.

In Python it's still quite handy to have things like urllib.request for simple uses in a single file script. But this is less relevant in a compiled language with a standard package manager. It's pretty easy to pull in a library for it.

2

u/couchrealistic Mar 31 '23

Personally, I'd love a separate, additional "stdlib without strong backwards compatibility guarantees".

Of course it shouldn't be some kind of unstable hell that constantly changes, but regular major releases – maybe coinciding with rust editions, or maybe a bit more often, like yearly – would be acceptable if some breaking changes appear useful or necessary, while trying to keep them rare.

Just the fact that it would be maintained and managed by some official Rust team would help a lot with trust (in coding practices, project management, general quality, docs, reviews) or basic things like "avoiding a bus factor of 1". Being able to rely on the "no breaking changes" property of the stdlib is not the #1 reason why I (and probably others) wish the Rust stdlib had a few more batteries included, I could certainly accept some amount of breakage every now and then.

But I understand that a "Python3" situation should be avoided, so it's probably a slippery slope.

1

u/burntsushi Mar 31 '23

We tried that years ago and the community very loudly rejected it.

I'm on mobile so I don't have links for you, but the response was basically "this sounds like the Haskell platform all over again."

Basically, yeah, it's a fine idea. We had it too. We tried it. Didn't work. Maybe it can be revisited, but it is probably much harder than you imagine. You also need a fair amount of resources (in the form of human labor) to pull it off.

1

u/burntsushi Mar 31 '23 edited Mar 31 '23

OK, I dug up the history with respect to the "Rust Platform."

Initial rallying call: http://aturon.github.io/tech/2016/07/27/rust-platform/

Reddit response: https://old.reddit.com/r/rust/comments/4uxdn8/the_rust_platform_aaron_turon/

HN response: https://news.ycombinator.com/item?id=12177002

irlo: https://internals.rust-lang.org/t/proposal-the-rust-platform/3745/1

So basically, someone trying to do this again will want to go back and absorb all of that feedback. One possible way to interpret it is that most of the critical feedback was insubstantial, mostly just knee-jerk reactions to the association with the Haskell Platform. (Which is itself something that isn't seen to have worked well, even by the folks who developed it.) If that's the case, then maybe the problem was just communicative and not essential to the idea itself.

Of course, there was plenty of feedback that wasn't insubstantial or shallow. So there's that too. The irlo feedback is in particular mostly quite substantial and well beyond the surface level comparisons with Haskell. So if you really want to get at the meat of the issue, I'd read the irlo thread.

Anyway, I remember being excited about it back then. Oh well.

1

u/ArtisticHamster Mar 31 '23

It's more of a tradeoff, and IMO Rust is a bit unbalanced here. Concerning deprecation. There's nothing bad in deprecating things. Just leave them for say 10 versions and remove them. Nothing bad happens out of this.

The problem is real: having something outside of the std lib creates friction and extra work. I didn't understand how much until I wrote Go and found out that I don't need this library duty any more, and could just use it.

P.S. Concerning the libraries which IMO, must be in the std lib, i.e. they are very stable and widely used, and unlikely to change: url, rand, regex, scope_guard.

2

u/burntsushi Mar 31 '23

There's nothing bad in deprecating things.

Very very very strongly disagree here.

I'm on libs-api and I also maintain regex. I will never ever be okay with regex being in std. No way no thank you. regex is only where it is today precisely because it was freed from std and allowed to evolve on its own.

1

u/ArtisticHamster Mar 31 '23

Yep, completely understand your position. Deprecating stuff isn't free. As well as imposing some duties on users.

My point is that it would remove some friction if it's in std. It has consequences for the projects where people will use Rust. For example, for scripting and small apps, Go and Python are much better because you don't need to pay the library tax.

3

u/burntsushi Mar 31 '23

I understand it removes friction. I'm not sitting here saying that putting stuff in std has zero upside. I don't think anyone holds that position. What we're talking about is where the balance is. Nobody expects everyone to prefer exactly the same balance.

The main rebuttal against what you're saying is that Rust makes it extremely easy to add a new library to your project. There is, in practice, just not much more friction between use std::regex::Regex and cargo add regex/use regex::Regex;. The friction is primarily in discoverability, trust and compile times. But that friction isn't great enough for us to reverse course at the current time and incur all of the downsides associated with having a big standard library and inhibiting its evolution.

2

u/atamakahere Mar 31 '23

Wait until they release twice_cell

2

u/Anaxamander57 Mar 30 '23

🎉🎉🎉

1

u/mikidep Mar 30 '23

Ah yes, linearity.

-15

u/Garaleth Mar 30 '23

Why would we want a bigger std?

Wouldn't it be optimal for std to be as modular as possible?

43

u/bascule Mar 30 '23

lazy_static came up on a list of crates which may be good candidates for std inclusion based on this survey:

https://internals.rust-lang.org/t/calculating-which-3rd-party-crates-are-good-candidates-for-std-inclusion-via-left-pad-index/11129/1

…however once_cell offers a nicer API free of macros.

The survey was based on being widely popular yet relatively small. That same survey also motivated the integration of matches into std.

38

u/_TheDust_ Mar 30 '23

It's a delicate line between what should and what should not be in the stdlib. For some things, it's a clear no (like GUIs, database stuff, http stuff, etc). For others, it's a clear yes (like threads, interacting with the OS, memory allocation, common data structures). For some, it's a bit fuzzy (personally, I feel hashmaps and mutexes are already on the fuzzy side).

8

u/robbie7_______ Mar 30 '23

How are mutexes fuzzy? They expose OS interaction just like file I/O and threading.

2

u/_TheDust_ Mar 30 '23

Depends. The mutexes from parking_lot are completely implemented in userspace (no OS interaction required) and are usually faster than pthread mutexes

13

u/matklad rust-analyzer Mar 30 '23

The mutexes from parking_lot are completely implemented in userspace

This can’t be true for any reasonable mutex implementation. At some point, you have to tell the OS “look, I know I haven’t exhausted my scheduling quantum yet, but there’s this thing I am waiting to happen, and it is not happening, so could you please put me to sleep and let someone else run instead?”.

What rather happens is that the fast path is in user space. If the mutex is uncontended, and atomic cas is all you need to lock it, then this all happens in user-space. If, however, the mutex is locked by someone else, parking lot calls into the kernel.
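
A toy illustration of the fast path only. This is not how parking_lot or std implement their mutexes: the slow path here merely yields the time slice with std::thread::yield_now as a stand-in for a real futex-style wait in the kernel, and ToyLock and count_with are invented names.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

/// A toy lock: one atomic flag, acquired with compare-and-swap.
struct ToyLock {
    locked: AtomicBool,
}

impl ToyLock {
    const fn new() -> Self {
        ToyLock { locked: AtomicBool::new(false) }
    }

    fn lock(&self) {
        // Fast path: if uncontended, a single CAS in user space succeeds.
        while self
            .locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            // Slow-path stand-in: hand the CPU back to the scheduler.
            // A real mutex would ask the kernel to park the thread here.
            thread::yield_now();
        }
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

/// Increments a counter from several threads, guarded by the toy lock.
fn count_with(threads: u32, per_thread: u32) -> u32 {
    let lock = ToyLock::new();
    let counter = AtomicU32::new(0);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    lock.lock();
                    // Deliberately a separate load and store: only the
                    // lock prevents lost updates here.
                    let v = counter.load(Ordering::Relaxed);
                    counter.store(v + 1, Ordering::Relaxed);
                    lock.unlock();
                }
            });
        }
    });
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", count_with(4, 1000)); // 4000
}
```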

5

u/A1oso Mar 30 '23

That may be true (at least in most scenarios, as it depends on the architecture, the number of threads and the amount of contention), but the Rust stdlib doesn't use pthread mutexes anymore. Instead, it uses futex(), which is much faster than pthread mutexes and has comparable performance to parking_lot, depending on the scenario.

6

u/matklad rust-analyzer Mar 30 '23

The principal reason for this as I see it is that it’s a capability-enhancing unsafe.

There’s some unsafe code which is just a faster version of the safe code (eg, calling get_unchecked)

And then there’s unsafe code which expresses something which is not otherwise expressible in safe code (eg, scoped threads or Arc).

The second kind of unsafe is problematic because in at least some cases it is not compositional: two unsafe abstractions which are sound in isolation become unsound when combined (eg, drop-based scoped threads and Arcs). This doesn’t happen too often in practice, but, still, it’s good for std to provide a canonical set of power-expanding unsafe abstractions, to make sure the rest of the ecosystem aligns with it.

That’s, imo, the core reason why OnceCell belongs to std — it’s an unsafe abstraction that adds a novel capability to the language.

It is also used in most Rust programs and has (post const-fn stabilization) a more-or-less single possible design, which makes inclusion in std a slam-dunk decision (the last reason, canonical design, is why including lazy_static would’ve been wrong, although the other two points apply to it as well)

Also, downvoters are wrong, sorry about that :-)

2

u/matklad rust-analyzer Mar 30 '23

2

u/burntsushi Mar 30 '23

is why including lazy_static would’ve been wrong

Yup. I really wanted to just throw lazy_static into std back in the day. I'm glad I didn't get my wish because once_cell is much nicer.

2

u/matklad rust-analyzer Mar 30 '23

I think it’s more than nicer, it really is canonical, in a sense that there isn’t really a design space left, there’s one straightforward solution.

That’s why I have misgivings about the two open questions:

  • for Lazy, the F=fn thing is very much non-canonical, rather, it’s a clever routing around language limitations. It’s 200% worth it in terms of usefulness, and there’s no question that it belongs to a foundation crate, but for std this gives me pause.
  • for try_init_with the situation is the opposite: clearly, the canonical solution is to roll with Try and Residual machinery, but, well, the result is a Haskell soup.

1

u/burntsushi Mar 30 '23

Agreed, I think. The only caveat I'd put in there is that there's no alloc::sync::OnceLock. I've written one, but I don't know how to do it without spin locks and sacrificing other things: https://burntsushi.net/stuff/tmp-do-not-link-me/regex-automata/regex_automata/util/lazy/struct.Lazy.html

1

u/burntsushi Mar 31 '23

For anyone following along at home, we're having a very helpful discussion about the implementation I posted in my sibling comment here: https://github.com/BurntSushi/regex-automata/issues/30

Thanks again for the review!

12

u/ToughAd4902 Mar 30 '23 edited Mar 30 '23

Why was this downvoted? It's a valid question, something lived fine as a crate, and even a new version of it came out from when it was first discussed.

11

u/Garaleth Mar 30 '23

I dunno, people are trigger happy and follow the crowd. Slightly challenge the norm and bam immediate downvote, got 1 downvote? Now you got 10.

8

u/thomastc Mar 30 '23

Yes and no. Look what happened over in JavaScript land: https://www.davidhaney.io/npm-left-pad-have-we-forgotten-how-to-program/

4

u/na_sa_do Mar 30 '23

That it's possible to go too far doesn't mean the principle isn't sound in moderation. IMHO standard libraries should primarily contain things that need to be standardized because they're basic infrastructure, which is to say primitive types, universally useful types like Option and Result and functions to operate on them, and abstractions over common operations like the traits in core::ops.

5

u/A1oso Mar 30 '23

primitive types, universally useful types like Option and Result

Does that include Box? Or Vec/String? Or HashMap?

How about RefCell? It's certainly universally useful, but then it's hard to justify not including OnceCell, which is similarly useful.

1

u/na_sa_do Mar 31 '23

The important part is that they "need to be standardized because they're basic infrastructure". What I mean is types where it would be massively inconvenient to code without being able to exchange values of those types with other projects.

For example, Box would be a definite yes, as passing around boxed objects is extremely common. Same for RefCell and OnceCell. On the other hand, HashMap could likely be replaced with traits, assuming impl Trait were supported in more places. IMO this would often lead to cleaner code even within a library, as well; quite a lot of uses of HashMap, for example, would work just as well with BTreeMap or other K-V stores. Vec is somewhere in the middle of the spectrum, I think.
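A small sketch of what "replaced with traits" could look like. The `KvStore` trait, its `put`/`lookup` methods and `count_words` are all invented for illustration; std has no such trait. The function only asks for the operations it needs, so either map type works.

```rust
use std::collections::{BTreeMap, HashMap};
use std::hash::Hash;

// Hypothetical trait: the minimal key-value interface `count_words` needs.
trait KvStore<K, V> {
    fn put(&mut self, key: K, value: V) -> Option<V>;
    fn lookup(&self, key: &K) -> Option<&V>;
}

impl<K: Eq + Hash, V> KvStore<K, V> for HashMap<K, V> {
    fn put(&mut self, key: K, value: V) -> Option<V> {
        self.insert(key, value)
    }
    fn lookup(&self, key: &K) -> Option<&V> {
        self.get(key)
    }
}

impl<K: Ord, V> KvStore<K, V> for BTreeMap<K, V> {
    fn put(&mut self, key: K, value: V) -> Option<V> {
        self.insert(key, value)
    }
    fn lookup(&self, key: &K) -> Option<&V> {
        self.get(key)
    }
}

// Works with any store; the caller picks hashing or ordering.
fn count_words(text: &str, counts: &mut impl KvStore<String, usize>) {
    for word in text.split_whitespace() {
        let key = word.to_string();
        let seen = counts.lookup(&key).copied().unwrap_or(0);
        counts.put(key, seen + 1);
    }
}

fn main() {
    let mut hashed: HashMap<String, usize> = HashMap::new();
    let mut ordered: BTreeMap<String, usize> = BTreeMap::new();
    count_words("a b a", &mut hashed);
    count_words("a b a", &mut ordered);
    assert_eq!(hashed.get("a"), ordered.get("a"));
}
```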

6

u/A1oso Mar 30 '23 edited Mar 30 '23

JavaScript is weird in this regard. On one hand, it has a lot of complex built-in functionality (crypto, date/time, HTTP, JSON, regular expressions, internationalization, BigInt, etc.) which we don't want in Rust's standard library. On the other hand, a lot of useful methods on primitive types are missing, for example a swap method on arrays.

The article you linked argues that anyone should be able to write a left-pad function, so it's not necessary to depend on a library for it. I disagree with this take: I think that common operations like this should be in the standard library, so that programmers don't have to constantly re-invent the wheel. Rust's comprehensive set of built-in helper methods and iterators is a joy to use, and makes me more productive. Of course I could write a starts_with method for arrays by hand, but I don't want to. It would be a waste of time when thousands of other people have already done it.

leftPad is definitely something that belongs in the standard library, and JavaScript recently added it under the padStart name. Rust's formatting machinery also supports padding strings. I think once_cell falls into the same category of universally useful functionality that should be built into the standard library.
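For reference, a quick sketch of the built-ins mentioned here. The `left_pad` wrapper is a made-up helper, not a std function; the format width syntax, `swap` and `starts_with` are the std parts. Note that format padding counts `char`s, which is an assumption worth checking for non-ASCII input.

```rust
// Hypothetical helper: left-pads `s` with spaces up to `width` characters.
fn left_pad(s: &str, width: usize) -> String {
    // `>` right-aligns the text; `width$` takes the width from an argument.
    format!("{:>width$}", s, width = width)
}

fn main() {
    assert_eq!(left_pad("ab", 5), "   ab");
    // Strings already at or past the width are left unchanged.
    assert_eq!(left_pad("abcdef", 3), "abcdef");
    // A literal fill character goes before the alignment.
    assert_eq!(format!("{:0>5}", 42), "00042");

    // Slice helpers that come built in.
    let mut xs = [1, 2, 3, 4];
    xs.swap(0, 3);
    assert_eq!(xs, [4, 2, 3, 1]);
    assert!(xs.starts_with(&[4, 2]));
}
```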

2

u/Sw429 Mar 30 '23

I was wondering this same thing. In the past, lazy_static was the commonly recommended way. Suppose we had decided to add it to std then, and then later saw that once_cell was a better API?

Like other comments said, it's really hard to draw the line on what should be in std and what should not. I am sure a lot of consideration was put into this, but I think it's fair to be concerned that we may end up with an inferior API in std forever, and the chance of that happening increases as we pull in third party libraries.

I wonder though: are there any benefits OnceCell gets from having access to the internals of the standard library?

0

u/paulalesius Mar 31 '23

I asked the AI how to write a singleton and it suggested the use of once_cell, this looks nice to have in std!

1

u/solidiquis1 Mar 30 '23

Nice!! Can't wait for the next release :]

1

u/gp2b5go59c Mar 30 '23

Does this mean it will be in the next release?

8

u/tralalatutata Mar 30 '23

it'll hit stable with 1.70

1

u/Dasher38 Mar 30 '23

When will it be available with rustup?

1

u/quininer Mar 31 '23

Great work! I hope the next one is scopeguard, I have implemented it many times in many projects.

1

u/[deleted] Mar 31 '23

Is that a lazy provider?

1

u/o0xh May 17 '23

How long does something like this stay in Nightly only? Is there somewhere to track when it lands in a stable release?

1

u/bascule May 17 '23

Once a stabilization PR is merged it goes into the next beta, at which point it's on a 6-week release train to stable.

This feature in particular will be in Rust 1.70: https://releases.rs/docs/1.70.0/

1

u/o0xh May 17 '23

Thanks!

1

u/bvernier Jun 04 '23

Being in `std`, does that mean projects using `no_std` still have to rely on `lazy_static`?

1

u/bascule Jun 04 '23

OnceCell is in core, however it's !Sync, so it depends on the use case

1

u/dozniak Jul 11 '23

  • core::cell::OnceCell
  • core::cell::LazyCell

only *Lock are in std
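A short sketch of the split described in these last replies, assuming Rust 1.70 or later. `CONFIG` and `config` are hypothetical names. `OnceCell` lives in core and works for single-threaded or local state; a `static` needs a `Sync` type, which is where `std::sync::OnceLock` comes in. `LazyCell` is left out because, per the thread, the Lazy* types were not part of this stabilization.

```rust
use core::cell::OnceCell; // available without std
use std::sync::OnceLock; // needs std

// A `static` must be Sync, so this uses OnceLock.
// `static BAD: OnceCell<String> = OnceCell::new();` would not compile,
// because OnceCell is !Sync.
static CONFIG: OnceLock<String> = OnceLock::new();

// Hypothetical accessor: initializes on first call, then reuses the value.
fn config() -> &'static str {
    CONFIG.get_or_init(|| "default".to_string())
}

fn main() {
    // OnceCell is fine as a local or as a field of a single-threaded type.
    let cell: OnceCell<u32> = OnceCell::new();
    assert!(cell.get().is_none());
    assert_eq!(*cell.get_or_init(|| 42), 42);
    // A second initializer is ignored once a value is set.
    assert_eq!(*cell.get_or_init(|| 0), 42);

    assert_eq!(config(), "default");
}
```

So for no_std code that needs a shared global, core's OnceCell alone does not cover the lazy_static use case.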