Inferred types _::Enum #3444

Open
wants to merge 14 commits into
base: master

JoshuaBrest

@JoshuaBrest JoshuaBrest commented Jun 7, 2023

This RFC is all about allowing types to be inferred without any compromises. The syntax is as follows. For additional information, please see below.

struct MyStruct {
    value: usize
}

fn my_func(data: MyStruct) { /* ... */ }

my_func(_ {
    value: 0
});

I think this is a much better and more concise syntax.

If you plan on pressing the dislike button, please leave a comment explaining your disapproval. Every piece of constructive feedback helps.

Rendered

@JoshuaBrest JoshuaBrest changed the title Infered enums Infered types Jun 7, 2023
@Lokathor
Contributor

Lokathor commented Jun 7, 2023

I'm not necessarily against the RFC, but the motivation and the RFC's change seem completely separate.

I don't understand how "people have to import too many things to make serious projects" leads to "and now _::new() can have a type inferred by the enclosing expression".

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jun 7, 2023
@JoshuaBrest
Author

I don't understand how "people have to import too many things to make serious projects" leads to "and now _::new() can have a type inferred by the enclosing expression".

In crates like windows-rs even in the examples, they import *. This doesn't seem like good practice and with this feature, I hope to avoid it.

use windows::{
    core::*, Data::Xml::Dom::*, Win32::Foundation::*, Win32::System::Threading::*,
    Win32::UI::WindowsAndMessaging::*,
};

@Lokathor
Contributor

Lokathor commented Jun 7, 2023

Even assuming I agreed that's bad practice (which, I don't), it is not clear how that motivation has led to this proposed change.

@JoshuaBrest
Author

Even assuming I agreed that's bad practice (which, I don't), it is not clear how that motivation has led to this proposed change.

How can I make this RFC more convincing? I am really new to this and seeing as you are a contributor I would like to ask for your help.

@Lokathor
Contributor

Lokathor commented Jun 7, 2023

First, I'm not actually on any team officially, so please don't take my comments with too much weight.

That said:

  • the problem is that you don't like glob imports.
  • glob imports are usually done because listing every item individually is too big of a list, or is just annoying to do.
  • I would expect the solution to somehow be related to the import system. Instead you've expanded how inference works.

Here's my question: Is your thinking that an expansion of inference will let people import fewer types, and then that would cause them to use glob imports less?

Assuming yes, well this inference change wouldn't make me glob import less. I like the glob imports. I want to write it once and just "make the compiler stop bugging me" about something that frankly always feels unimportant. I know it's obviously not actually unimportant but it feels unimportant to stop and tell the compiler silly details over and over.

Even if the user doesn't have to import as many types they still have to import all the functions, so if we're assuming that "too many imports" is the problem and that reducing the number below some unknown threshold will make people not use glob imports, I'm not sure this change reduces the number of imports below that magic threshold. Because for me the threshold can be as low as two items. If I'm adding a second item from the same module and I think I might ever want a third from the same place I'll just make it a glob.

Is the problem with glob imports that they're not explicit enough about where things come from? Because if the type of _::new() is inferred, whatever the type of the _ is it still won't show up in the imports at the top of the file. So you still don't know specifically where it comes from, and now you don't even know the type's name so you can't search it in the generated rustdoc.

I hope this isn't too harsh all at once, and I think more inference might be good, but I'm just not clear what your line of reasoning is about how the problem leads to this specific solution.

@JoshuaBrest
Author

Is your thinking that an expansion of inference will let people import fewer types, and then that would cause them to use glob imports less?

Part of it, yes, but I also get really frustrated that I keep having to specify types, and that simple things like match statements require me to specify the type every single time.

whatever the type of the _ is it still won't show up in the imports at the top of the file. So you still don't know specifically where it comes from, and now you don't even know the type's name so you can't search it in the generated rustdoc.

It's imported in the background. Although we don't need the exact path, the compiler knows it, and it can be listed in the rustdoc.

I hope this isn't too harsh all at once, and I think more inference might be good, but I'm just not clear what your line of reasoning is about how the problem leads to this specific solution.

Definitely not. You make some great points, and your constructive feedback is welcome.

@BoxyUwU
Member

BoxyUwU commented Jun 7, 2023

Personally, _::new() and _::Variant "look wrong" to me, although I can't tell why. I would expect <_>::new() and <_>::Variant to be the syntax. I have no suggestion for _ { ... } in struct expressions, which tbh also looks wrong, but <_> { ... } isn't better, and we don't even support <MyType> { field: expr }.

[unresolved-questions]: #unresolved-questions


A few kinks on this are whether it should be required to have the type in scope. Lots of people could point to traits and say that they should but others would disagree. From an individual standpoint, I don’t think it should require any imports but, it really depends on the implementers as personally, I am not an expert in *this* subject.
Member

I think this question needs to be resolved before the RFC is landed, since it pretty drastically changes the implementation and behavior of the RFC.

Author

I would love to discuss it (:

@SOF3

SOF3 commented Jun 12, 2023

I would like to suggest an alternative rigorous definition that satisfies the examples mentioned in the RFC (although not very intuitive imo):


When one of the following expression forms (set A) is encountered as the top-level expression in the following positions (set B), the _ token in the expression form should be treated as the type expected at the position.

Set A:

  • Path of the function call (e.g. _::function())
  • Path expression (e.g. _::EnumVariant)
  • When an expression in set A appears in a dot-call expression (expr.method())
  • When an expression in set A appears in a try expression (expr?)
  • When an expression in set A appears in an await expression (expr.await)

Set B:

  • A pattern, or a pattern option (delimited by |) in one of such patterns
  • A value in a function/method call argument list
  • The value of a field in a struct literal
  • The value of an element in an array/slice literal
  • An operand in a range literal (i.e. if an expression is known to be of type Range<T>, expr..expr can infer that both exprs are of type T)
  • The value used with return/break/yield

Set B only applies when the type of the expression at the position can be inferred without resolving the expression itself.


Note that this definition explicitly states that _ is the type expected at the position in set B, not the expression in set A. This means we don't try to infer from whether the result is actually feasible (e.g. if _::new() returns Result<MyStruct>, we still set _ as MyStruct and don't care whether new() actually returns MyStruct).

Set B does not involve macros. Whether this works for macros like vec![_::Expr] depends on the macro implementation and is not part of the spec (unless it is in the standard library).

Set A is a pretty arbitrary list for things that typically seem to want the expected type. We aren't really inferring anything in set A, just blind expansion based on the inference from set B. These lists will need to be constantly maintained and updated when new expression types/positions appear.
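For comparison, today's Rust already propagates an expected type into several of the set B positions when the expression is a trait call such as Default::default(). The following is only a sketch of that existing behavior, not of the proposed `_` syntax; the names Limits, Config, item_cap and default_config are hypothetical and exist only for illustration.

```rust
#[derive(Debug, Default, PartialEq)]
struct Limits {
    max_items: usize,
}

#[derive(Debug, Default, PartialEq)]
struct Config {
    limits: Limits,
    name: String,
}

// Function argument position: the parameter type is the expected type.
fn item_cap(limits: Limits) -> usize {
    limits.max_items
}

// Return position: the declared return type is the expected type.
fn default_config() -> Config {
    Default::default()
}

fn main() {
    // Struct field position: `limits` is expected to be `Limits`.
    let config = Config {
        limits: Default::default(),
        name: "demo".to_owned(),
    };

    // Array element position: the annotation fixes every element to `Limits`.
    let table: [Limits; 2] = [Default::default(), Default::default()];

    assert_eq!(item_cap(Default::default()), 0);
    assert_eq!(config.limits, Limits { max_items: 0 });
    assert_eq!(table.len(), 2);
    assert_eq!(default_config().name, "");
    println!("{config:?}");
}
```

In each position the compiler knows the expected type before it looks at the expression, which is the same precondition the definition above places on set B.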

@JoshuaBrest
Author

When one of the following expression forms (set A) is encountered as the top-level expression in the following positions (set B), the _ token in the expression form should be treated as the type expected at the position.

That is so useful! Let me fix it now.

@SOF3

SOF3 commented Jun 12, 2023

One interesting quirk to think about (although unlikely):

fn foo<T: Default>(t: T) {}

foo(_::default())

should this be allowed? we are not dealing with type inference here, but more like "trait inference".

@JoshuaBrest
Author

One interesting quirk to think about (although unlikely):

fn foo<T: Default>(t: T) {}

foo(_::default())

should this be allowed? we are not dealing with type inference here, but more like "trait inference".

I think you would have to specify the type arg on this one because Default is a trait and the type is not specific enough.

fn foo<T: Default>(t: T) {}

foo::<StructImplementingDefault>(_::default())
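The same situation can be reproduced in current Rust without the proposed syntax by calling Default::default() directly. This is a minimal sketch under that assumption; Settings and describe are hypothetical names, not part of the RFC.

```rust
#[derive(Debug, Default)]
struct Settings {
    retries: u8,
}

// Generic over any `T: Default`; the bound alone does not pick a concrete type.
fn describe<T: Default + std::fmt::Debug>(value: T) -> String {
    format!("{value:?}")
}

fn main() {
    // describe(Default::default()); // rejected: type annotations needed
    let text = describe::<Settings>(Default::default());
    assert_eq!(text, "Settings { retries: 0 }");
    println!("{text}");
}
```

Once the turbofish fixes T, the argument position has a concrete expected type and the call resolves.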

@SOF3

SOF3 commented Jun 12, 2023

oh never mind, right, we don't really need to reference the trait directly either way.

@clarfonthey

I've been putting off reading this RFC, and looking at the latest version, I can definitely feel like once the aesthetic arguments are put aside, the motivation isn't really there.

And honestly, it's a bit weird to me to realise how relatively okay I am with glob imports in Rust, considering how I often despise them in other languages like JavaScript. The main reason for this is that basically all of the tools in the Rust ecosystem directly interface with compiler internals one way or another, even if by reimplementing parts of the compiler in the case of rust-analyzer.

In the JS ecosystem, if you see a glob import, all hope is essentially lost. You can try and strip away all of the unreasonable ways of interfacing with names like eval but ultimately, unless you want to reimplement the module system yourself and do a lot of work, a person seeing a glob import knows as much as a machine reading it does. This isn't the case for Rust, and something like rust-analyzer will easily be able to tell what glob something is coming from.

So really, this is an aesthetic argument. And honestly… I don't think that importing everything by glob, or by name, is really that big a deal, especially with adequate tooling. Even renaming things.

Ultimately, I'm not super against this feature in principle. But I'm also not really sure if it's worth it. Rust's type inference is robust and I don't think it would run into technical issues, just… I don't really know if it's worth the effort.

@SOF3

SOF3 commented Jun 12, 2023

@clarfonthey glob imports easily have name collision when using multiple globs in the same module. And it is really common with names like Context. Plus, libraries providing preludes do not necessarily have the awareness that adding to the prelude breaks BC.
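A small sketch of the collision being described, using two hypothetical modules (web and db) that each export a Context. In current Rust the overlapping globs compile, and the conflict only surfaces when the bare name is used.

```rust
mod web {
    pub struct Context {
        pub route: &'static str,
    }
    pub fn route_of(ctx: &Context) -> &'static str {
        ctx.route
    }
}

mod db {
    pub struct Context {
        pub table: &'static str,
    }
    pub fn table_of(ctx: &Context) -> &'static str {
        ctx.table
    }
}

// Both globs bring a `Context` into scope. The overlap is only reported
// when the bare name is actually used.
use db::*;
use web::*;

fn describe() -> String {
    // let ctx = Context { route: "/" }; // rejected: `Context` is ambiguous
    let request = web::Context { route: "/users" };
    let query = db::Context { table: "users" };
    format!("{} -> {}", route_of(&request), table_of(&query))
}

fn main() {
    assert_eq!(describe(), "/users -> users");
}
```

The functions route_of and table_of still resolve through the globs because their names do not overlap; only Context needs a qualified path.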

@JoshuaBrest
Author

And honestly, it's a bit weird to me to realise how relatively okay I am with glob imports in Rust, considering how I often despise them in other languages like JavaScript. The main reason for this is that basically all of the tools in the Rust ecosystem directly interface with compiler internals one way or another, even if by reimplementing parts of the compiler in the case of rust-analyzer.

In the JS ecosystem, if you see a glob import, all hope is essentially lost. You can try and strip away all of the unreasonable ways of interfacing with names like eval but ultimately, unless you want to reimplement the module system yourself and do a lot of work, a person seeing a glob import knows as much as a machine reading it does. This isn't the case for Rust, and something like rust-analyzer will easily be able to tell what glob something is coming from.

I can understand your point, but when using large libraries in conjunction, as @SOF3 said, it can be easy to run into name collisions. I use actix and seaorm, and they often have similar type names.

@JoshuaBrest
Author

JoshuaBrest commented Jun 12, 2023

Personally, _::new() and _::Variant "look wrong" to me, although I can't tell why. I would expect <_>::new() and <_>::Variant to be the syntax. I have no suggestion for _ { ... } in struct expressions, which tbh also looks wrong, but <_> { ... } isn't better, and we don't even support <MyType> { field: expr }.

In my opinion, it's really annoying to type that set of keys. On a QWERTY layout it requires a lot of hand movement. Additionally, syntax similar to what you mentioned is already used to infer lifetimes, and I am concerned people will confuse the two.

@clarfonthey

Right, I should probably clarify my position--

I think that not liking globs is valid, but I also think that using globs is more viable in Rust than in other languages. Meaning, it's both easier to use globs successfully, and also easier to just import everything you need successfully. Rebinding is a bit harder, but still doable.

Since seeing how useful rust-analyzer is for lots of tasks, I've personally found that the best flows for these kinds of things involve a combination of auto-import and auto-complete. So, like mentioned, _ is probably a lot harder to type than the first letter or two of your type name plus whatever your auto-completion binding is (usually tab, but for me it's Ctrl-A).

Even if you're specifically scoping various types to modules since they conflict, that's still just the first letter of the module, autocomplete, two colons, the first letter of the type, autocomplete. Which may be more to type than _, but accomplishes the goal you need to accomplish.

My main opinion here is that _ as a type inference keyword seems… suited to a very niche set of aesthetics that I'm not sure is worth catering to. You don't want to glob-import, you don't want to have to type as much, but also auto-completing must be either too much or not available. It's not even about brevity in some cases: for example, you mention cases where you're creating a struct inside a function which already has to be annotated with the type of the struct, which cannot be inferred, and therefore you're only really saving typing it once.

Like, I'm not convinced that this can't be better solved by improving APIs. Like, for example, you mentioned that types commonly in preludes for different crates used together often share names. I think that this is bad API design, personally, but maybe I'm just not getting it.

@programmerjake
Member

I do think inferred types are useful when matching for brevity's sake:
e.g. in a RV32I emulator:

use std::num::NonZeroU8;

#[derive(Copy, Clone, Default, Eq, PartialEq, Ord, PartialOrd, Debug, Hash)]
pub struct Reg(pub Option<NonZeroU8>);

#[derive(Debug)]
pub struct Regs {
    pub pc: u32,
    pub regs: [u32; 31],
}

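impl Regs {
    pub fn reg(&self, reg: Reg) -> u32 {
        // Register x0 (`None`) always reads as zero.
        reg.0.map_or(0, |reg| self.regs[usize::from(reg.get()) - 1])
    }
    pub fn set_reg(&mut self, reg: Reg, value: u32) {
        // Writes to x0 (`None`) are discarded.
        if let Some(reg) = reg.0 {
            self.regs[usize::from(reg.get()) - 1] = value;
        }
    }
}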

#[derive(Debug)]
pub struct Memory {
    bytes: Box<[u8]>,
}

impl Memory {
    pub fn read_bytes<const N: usize>(&self, mut addr: u32) -> [u8; N] {
        let mut retval = [0u8; N];
        for v in &mut retval {
            *v = self.bytes[usize::try_from(addr).unwrap()];
            addr = addr.wrapping_add(1);
        }
        retval
    }
    pub fn write_bytes<const N: usize>(&mut self, mut addr: u32, bytes: [u8; N]) {
        for v in bytes {
            self.bytes[usize::try_from(addr).unwrap()] = v;
            addr = addr.wrapping_add(1);
        }
    }
}

pub fn run_one_insn(regs: &mut Regs, mem: &mut Memory) {
    let insn = Insn::decode(u32::from_le_bytes(mem.read_bytes(regs.pc))).unwrap();
    match insn {
        _::RType(_ { rd, rs1, rs2, rest: _::Add }) => {
            regs.set_reg(rd, regs.reg(rs1).wrapping_add(regs.reg(rs2)));
        }
        _::RType(_ { rd, rs1, rs2, rest: _::Sub }) => {
            regs.set_reg(rd, regs.reg(rs1).wrapping_sub(regs.reg(rs2)));
        }
        _::RType(_ { rd, rs1, rs2, rest: _::Sll }) => {
            regs.set_reg(rd, regs.reg(rs1).wrapping_shl(regs.reg(rs2)));
        }
        _::RType(_ { rd, rs1, rs2, rest: _::Slt }) => {
            regs.set_reg(rd, ((regs.reg(rs1) as i32) < regs.reg(rs2) as i32) as u32);
        }
        _::RType(_ { rd, rs1, rs2, rest: _::Sltu }) => {
            regs.set_reg(rd, (regs.reg(rs1) < regs.reg(rs2)) as u32);
        }
        // ...
        _::IType(_ { rd, rs1, imm, rest: _::Jalr }) => {
            let pc = regs.reg(rs1).wrapping_add(imm as u32) & !1;
            regs.set_reg(rd, regs.pc.wrapping_add(4));
            regs.pc = pc;
            return;
        }
        _::IType(_ { rd, rs1, imm, rest: _::Lb }) => {
            let [v] = mem.read_bytes(regs.reg(rs1).wrapping_add(imm as u32));
            regs.set_reg(rd, v as i8 as u32);
        }
        _::IType(_ { rd, rs1, imm, rest: _::Lh }) => {
            let v = mem.read_bytes(regs.reg(rs1).wrapping_add(imm as u32));
            regs.set_reg(rd, i16::from_le_bytes(v) as u32);
        }
        _::IType(_ { rd, rs1, imm, rest: _::Lw }) => {
            let v = mem.read_bytes(regs.reg(rs1).wrapping_add(imm as u32));
            regs.set_reg(rd, u32::from_le_bytes(v));
        }
        // ...
    }
    regs.pc = regs.pc.wrapping_add(4);
}

pub enum Insn {
    RType(RTypeInsn),
    IType(ITypeInsn),
    SType(STypeInsn),
    BType(BTypeInsn),
    UType(UTypeInsn),
    JType(JTypeInsn),
}

impl Insn {
    pub fn decode(v: u32) -> Option<Self> {
        // ...
    }
}

pub struct RTypeInsn {
    pub rd: Reg,
    pub rs1: Reg,
    pub rs2: Reg,
    pub rest: RTypeInsnRest,
}

pub enum RTypeInsnRest {
    Add,
    Sub,
    Sll,
    Slt,
    Sltu,
    Xor,
    Srl,
    Sra,
    Or,
    And,
}


pub struct ITypeInsn {
    pub rd: Reg,
    pub rs1: Reg,
    pub imm: i16,
    pub rest: ITypeInsnRest,
}

pub enum ITypeInsnRest {
    Jalr,
    Lb,
    Lh,
    Lw,
    Lbu,
    Lhu,
    Addi,
    Slti,
    Sltiu,
    Xori,
    Ori,
    Andi,
    Slli,
    Srli,
    Srai,
    Fence,
    FenceTso,
    Pause,
    Ecall,
    Ebreak,
}
// rest of enums ...

@Aloso

Aloso commented Jun 12, 2023

I do like type inference for struct literals and enum variants.

However, type inference for associated functions doesn't make sense to me. Given this example:

fn expect_foo(_: Foo) {}
expect_foo(_::bar());
  • According to this RFC, the _ should be resolved to Foo (the function argument's type), but this isn't always correct. I suspect that this behavior is often useful in practice, but there are cases where it will fail, and people may find this confusing. For example, Box::pin returns a Pin<Box<T>>, so _::pin(x) couldn't possibly be inferred correctly.

  • Even when Foo has a bar function that returns Foo, there could be another type that also has a matching bar function. Then _ would be inferred as Foo, even though it is actually ambiguous.

  • Another commenter suggested that we could allow method calls after the inferred type (e.g. _::new().expect("..."), or _::builder().arg(42).build()?). But this still wouldn't help in a lot of cases, because methods often return a different type than Self (in contrast to associated functions, where Self is indeed the most common return type).

    For example, the _ in _::new(s).canonicalize()? can't be inferred as Path, because Path::canonicalize returns io::Result<PathBuf>.

  • Another issue is that it doesn't support auto-deref (e.g. when a function expects a &str and we pass &_::new() [1], which should be resolved as &String::new(), but that may be ambiguous).

All in all, it feels like this would add a lot of complexity and make the language less consistent and harder to learn.
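The return-type mismatches mentioned in the list can be checked against today's standard library. This is a minimal sketch; read_pinned and resolve are hypothetical helper names used only to pin down the types involved.

```rust
use std::io;
use std::path::{Path, PathBuf};
use std::pin::Pin;

// Expects the pinned form, which is what `Box::pin` produces.
fn read_pinned(value: Pin<Box<u32>>) -> u32 {
    *value
}

// `Path::new` yields `&Path`, while `canonicalize` yields `io::Result<PathBuf>`.
fn resolve(raw: &str) -> io::Result<PathBuf> {
    Path::new(raw).canonicalize()
}

fn main() {
    // The constructor lives on `Box`, although the expected type is `Pin<Box<u32>>`.
    assert_eq!(read_pinned(Box::pin(7)), 7);
    // The result depends on the file system, so it is only printed.
    println!("{:?}", resolve("."));
}
```

In both calls the type that owns the function differs from the type expected at the call site, so substituting the expected type for `_` would look up the function on the wrong type.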

Footnotes

  1. I realize this is a contrived example

@pitaj
Contributor

pitaj commented Aug 1, 2024

From a pure lang design pov, inferring enum variants has little use because we have glob imports for variants. If people dislike glob imports used for enum variants, I will suspect that they will dislike this syntax for enum variants too. So for that reason I find the proposed part about enum variants to be of little use.

In my view, it's more about not needing to import in the first place.

If a library I'm using has

fn foo(bar: BarEnum);

Then this feature allows using the enum without importing it.

foo(_::VariantA);

@kennytm
Member

kennytm commented Aug 1, 2024

Why does "glob import" keep being raised here? You can't restrict the scope of the glob import to exactly the patterns of a match statement. It has to pollute the current block at minimum, and it will bring name conflicts in places where _::X can be unambiguous.

enum RustFlag {
    Name(String),
    Size(u64),
}

#[repr(C)]
#[derive(Copy, Clone)]
enum CFlagTag {
    Name = 1,
    Size = 2,
}

impl RustFlag {
    fn tag(&self) -> CFlagTag {
/* // not going to work:

        use RustFlag::*;
        use CFlagTag::*;
        match self {
            Name(_) => Name,
            Size(_) => Size,
        }
*/
        match self {
            _::Name(_) => _::Name,
            _::Size(_) => _::Size,
        }
    }
}

@spunit262

I've actually been secretly messing around with implementing this for a while (to be clear, well before #3444 (comment)), mainly just for personal learning, and ended up with a very simple minimal implementation. Getting to that simple implementation was very hard, and I think I understand why the compiler team is so skeptical of its feasibility. I'll try to get a draft PR of it up soon.

@traviscross
Contributor

traviscross commented Aug 1, 2024

There's been discussion in this thread and elsewhere about what the lang team is trying to say here by signaling openness to an experiment according to our process. Let me try to respond to that.

The lang team is not endorsing this RFC. Our endorsement of it would be acceptance of the RFC, by FCP, and we haven't even proposed FCP here.

However, we know that the kind of feature request embedded in this RFC comes up over and over again in different forms and places. We know that many people want something like this. But we also know that it's going to need a lot of refinement. Many lang design questions would still need to be answered. Many type system questions would need to be answered. Much consensus building would need to be done.

So what lang is trying to say here, in my view, is that we're open to someone who has demonstrated the ability to navigate this sort of thing (an "experienced contributor") picking this up as an owner (in the project goals sense) and driving the kind of experimentation (according to our process) that may lead to a design with which the types team would be happy and to an RFC that we could accept. It's likely that RFC would be very different than this one.

We're serious about this experienced contributor bit. There's no green light to go forward here without that.[1] Something this complicated needs an experienced contributor who is enthusiastic about this to drive it forward and mentor any work by others.

Footnotes

  1. Who exactly counts as an experienced contributor, as it pertains to any particular experiment, is subject to our discretion, and may vary depending on how we perceive the complexity or domain of that experiment.

@JoshuaBrest
Author

There's been discussion in this thread and elsewhere about what the lang team is trying to say here by signaling openness to an experiment according to our process. Let me try to respond to that.

The lang team is not endorsing this RFC. Our endorsement of it would be acceptance of the RFC, by FCP, and we haven't even proposed FCP here.

However, we know that the kind of feature request embedded in this RFC comes up over and over again in different forms and places. We know that many people want something like this. But we also know that it's going to need a lot of refinement. Many lang design questions would still need to be answered. Many type system questions would need to be answered. Much consensus building would need to be done.

So what lang is trying to say here, in my view, is that we're open to someone who has demonstrated the ability to navigate this sort of thing (an "experienced contributor") picking this up as an owner (in the project goals sense) and driving the kind of experimentation (according to our process) that may lead to a design with which the types team would be happy and to an RFC that we could accept. It's likely that RFC would be very different than this one.

We're serious about this experienced contributor bit. There's no green light to go forward here without that.[1] Something this complicated needs an experienced contributor who is enthusiastic about this to drive it forward and mentor any work by others.

Footnotes

  1. Who exactly counts as an experienced contributor, as it pertains to any particular experiment, is subject to our discretion, and may vary depending on how we perceive the complexity or domain of that experiment.

Is it your suggestion that a new owner create their own RFC, replacing this one?

@fee1-dead
Member

it will bring name conflict in places where _::X can be unambiguous.

Maybe even better, just use an import that renames it to a single letter. So your example would be

        use RustFlag as R;
        use CFlagTag as C;
        match self {
            R::Name(_) => C::Name,
            R::Size(_) => C::Size,
        }
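Filled out into a complete program, the renaming approach compiles on current Rust. This sketch reuses the RustFlag and CFlagTag definitions from the earlier comment; the extra derives and the main function are additions for the sake of a runnable example.

```rust
#[allow(dead_code)] // the payloads are never read in this sketch
enum RustFlag {
    Name(String),
    Size(u64),
}

#[repr(C)]
#[derive(Copy, Clone, Debug, PartialEq)]
enum CFlagTag {
    Name = 1,
    Size = 2,
}

impl RustFlag {
    fn tag(&self) -> CFlagTag {
        // The aliases are scoped to this function body.
        use CFlagTag as C;
        use RustFlag as R;
        match self {
            R::Name(_) => C::Name,
            R::Size(_) => C::Size,
        }
    }
}

fn main() {
    assert_eq!(RustFlag::Name("a".to_owned()).tag(), CFlagTag::Name);
    assert_eq!(RustFlag::Size(4).tag() as u8, 2);
}
```

Because each variant is qualified by its alias, the identically named variants of the two enums never collide.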

@kennytm
Member

kennytm commented Aug 2, 2024

@fee1-dead that's not a "glob import" then

(Abbreviation was mentioned before in #3444 (comment) and #3444 (comment) (the "Bitflags" section) and #3444 (comment).)

@fee1-dead
Member

In my view, it's more about not needing to import in the first place.

If a library I'm using has

fn foo(bar: BarEnum);

Then this feature allows using the enum without importing it.

foo(_::VariantA);

I have a feeling this might introduce complications in privacy if not precisely specified. There would be sealed enums in the parameters of public functions. Similarly for sealed structs in parameters of public functions. Normally people can't refer to those types because they are sealed but adding this could allow them.
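A sketch of the pattern in question, assuming "sealed" here means a public type inside a private module. The library, hidden and Mode names are hypothetical.

```rust
mod library {
    // Private module: code outside `library` cannot write `library::hidden::Mode`.
    mod hidden {
        #[derive(Debug, Clone, Copy, PartialEq)]
        pub enum Mode {
            Fast,
            Safe,
        }
    }

    // Public functions that mention the unnameable type in their signatures.
    pub fn default_mode() -> hidden::Mode {
        hidden::Mode::Safe
    }

    pub fn is_fast(mode: hidden::Mode) -> bool {
        mode == hidden::Mode::Fast
    }
}

fn main() {
    // A value can only be obtained through the API the library exposes.
    let mode = library::default_mode();
    assert!(!library::is_fast(mode));
    // library::is_fast(library::hidden::Mode::Fast); // rejected: module `hidden` is private
}
```

Today a caller can pass a Mode along but cannot construct Mode::Fast directly. If `_::Fast` resolved purely from the parameter type, the caller could, unless the specification says otherwise.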

@fee1-dead that's not a "glob import" then

Yes. My bad (was writing that comment very late), but still, I see less value in this feature because of the ability to do this in a different way.

@traviscross
Contributor

traviscross commented Aug 2, 2024

Is it your suggestion that a new owner create their own RFC, replacing this one?

That's one of the questions that the experienced contributor who would be driving and mentoring this work would decide.

@JoshuaBrest
Author

@fee1-dead Is there somewhere I should go to ask for this? I don't want this issue to fade.

@kanashimia

You can emulate inferred types with TAIT and -Znext-solver=globally

// compile flags: -Znext-solver=globally
#![feature(type_alias_impl_trait)]
#![allow(unused)]

struct Bar {
    x: u32
}

struct Foo {
    bar: Bar,
    y: u32,
}

fn bar(lol: Foo) {}

macro_rules! init {
    ($($tt:tt)+) => {
        'block: {
            type InferredType = impl ?Sized;

            if false {
                let fake_value: InferredType = loop {};
                break 'block fake_value;
            }

            InferredType { $($tt)+ }
        }
    }
}

fn main() {
    bar(init! { 
        bar: init! { x: 1 },
        y: 2,
    });
}

But it isn't that nice for match blocks with inferred type names, or for inferring inherent methods.

@cyqsimon

cyqsimon commented Nov 8, 2024

I feel like this is one of those language features that sounds nice but would run into many complications during implementation. Not to mention the detriment to readability and added risk to refactoring. Take this for example:

struct TypeA(String);
struct TypeB(String);
struct SomeLargerType {
    a: TypeA,
    b: TypeB,
}
fn big_business_logic_function(t: &mut SomeLargerType) {
    // imagine lots of logic above and below
    t.a = _("foo".to_owned());
}

If during a refactoring we changed t.a to t.b but forgot to update the assigned value, the compiler would happily accept it. Some big safeguards of strong typing have been lost here. Obviously this is not a mistake you would make in such a trivial case, but hopefully you can see how this could happen when the refactoring job is big and generics are involved.


For a slightly more realistic example, consider this more sinister case:

use std::path::PathBuf;

trait MyTrait {}
#[derive(Default)]
struct TypeA {
    key: String,
    field_alpha: u8,
    field_bravo: u8,
}
impl MyTrait for TypeA {}
#[derive(Default)]
struct TypeB {
    key: PathBuf,
    field_alice: String,
    field_bob: u64,
}
impl MyTrait for TypeB {}
fn do_some_generic_logic<T: MyTrait>(t: T) -> T {
    todo!()
}
fn big_business_logic_function() -> TypeA {
    let thing = _ { key: "foo".into(), ..Default::default() };
    do_some_generic_logic(thing)
}

The fact that changing the return type to TypeB wouldn't cause an error is concerning.

@SOF3

SOF3 commented Nov 8, 2024

@cyqsimon the same argument can be applied against type inference in general. Consider

use derive_more::From;
#[derive(From)]
struct TypeA(String);
#[derive(From)]
struct TypeB(String);
struct SomeLargerType {
    a: TypeA,
    b: TypeB,
}
fn big_business_logic_function(t: &mut SomeLargerType) {
    // imagine lots of logic above and below
    t.a = String::from("foo").into();
}

You would run into the same problem when changing t.a to t.b, since the .into() changes from <String as Into<TypeA>>::into to <String as Into<TypeB>>::into.
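The same point can be checked without derive_more by writing the From impls by hand. This is a sketch only; set_a and set_b are hypothetical stand-ins for the before and after of the refactoring.

```rust
struct TypeA(String);
struct TypeB(String);

impl From<String> for TypeA {
    fn from(s: String) -> Self {
        TypeA(s)
    }
}

impl From<String> for TypeB {
    fn from(s: String) -> Self {
        TypeB(s)
    }
}

struct SomeLargerType {
    a: TypeA,
    b: TypeB,
}

// Before the refactoring: `.into()` resolves to `Into<TypeA>`.
fn set_a(t: &mut SomeLargerType) {
    t.a = String::from("foo").into();
}

// After the refactoring: the right-hand side is unchanged and still compiles,
// now resolving to `Into<TypeB>`.
fn set_b(t: &mut SomeLargerType) {
    t.b = String::from("foo").into();
}

fn main() {
    let mut t = SomeLargerType {
        a: TypeA(String::new()),
        b: TypeB(String::new()),
    };
    set_a(&mut t);
    assert_eq!(t.a.0, "foo");
    set_b(&mut t);
    assert_eq!(t.b.0, "foo");
}
```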

@cyqsimon

cyqsimon commented Nov 8, 2024

@SOF3 Yeah that's fair; on second thought I retract my opposition.

That being said I do like my explicit types, so if this goes through I personally will likely refrain from using it.

@lukasvrenner

lukasvrenner commented Jan 6, 2025

This may be difficult to distinguish from the unnamed match-all _, which has very different behavior. One is binding an unused variable and the other is matching a specific type.

match foo {
    _ { x: 1, y: 2 } => { ... },
    _ { x: 2, y: _ } => { ... },
    _ => { ... },
}

let _ { x, y } = foo;
let _ = bar;

Currently, _ can be used for two purposes: discarding a variable or inferring type annotation. They have different meanings but that's okay because they're never used in the same place. I think it's a bad idea to allow it to be used to infer struct or enum names because that overlaps with the other, very different, usage of _.

While it might be considered more concise, I believe it hides too much and makes the code harder to understand -- especially for people unfamiliar with the codebase -- without adding much more value than having to type a few fewer characters.
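For reference, the two existing roles of `_` look like this in current Rust. It is a small sketch; Point and classify are hypothetical names.

```rust
struct Point {
    x: i32,
    y: i32,
}

fn classify(p: &Point) -> &'static str {
    match p {
        // `_` as a pattern: the field is ignored.
        Point { x: 0, y: _ } => "on the y axis",
        Point { x: _, y: 0 } => "on the x axis",
        // `_` as a pattern: matches anything that is left.
        _ => "elsewhere",
    }
}

fn main() {
    // `_` as a type placeholder: the element type is inferred as `i32`.
    let doubled: Vec<_> = [1, 2, 3].iter().map(|n| n * 2).collect();
    // `_` as a pattern in `let`: the value is discarded, not bound.
    let _ = doubled.len();

    assert_eq!(doubled, vec![2, 4, 6]);
    assert_eq!(classify(&Point { x: 0, y: 5 }), "on the y axis");
    assert_eq!(classify(&Point { x: 3, y: 4 }), "elsewhere");
}
```

The placeholder use only ever appears in type position and the wildcard use only in pattern position, which is the separation the proposed `_ { .. }` pattern would blur.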

@igotfr

igotfr commented Jan 11, 2025

Is it possible to infer it this way?

match dir as Direction {
     ::North => { .. }
     ::East => { .. }
     ::South => { .. }
     ::West => { .. }
}

@joshtriplett
Member

@igotfr That would be ambiguous with other meanings of ::name, and it wouldn't make it obvious that something was omitted.
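To illustrate one existing meaning of a leading ::, here is a small sketch in stable Rust (2018 edition or later), where ::name already starts a path at a crate name. The function classify is a hypothetical example, not something from the RFC:

```rust
use std::cmp::Ordering;

// Hypothetical example function comparing two integers.
fn classify(a: i32, b: i32) -> &'static str {
    match a.cmp(&b) {
        // A leading `::` already means "resolve this path from a crate name",
        // so `::std::...` is a complete path today, not an abbreviated one.
        ::std::cmp::Ordering::Less => "less",
        Ordering::Equal => "equal",
        Ordering::Greater => "greater",
    }
}

fn main() {
    println!("{}", classify(1, 2));
}
```

A pattern such as ::North would therefore read as a path rooted at a crate called North, rather than as a variant with its enum name left out.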

@lemon-gith

Hiya, @kennytm made a note in the discussion I originally posted in that this is the RFC to be in, so I'm duplicating this here for ease of access :)

Just a thought, but what if this form of syntax were to be introduced:

match dir using Foo::Bar::Direction {
    North => { ... }
    East => { ... }
    South => { ... }
    West => { ... }
}

where using implicitly scopes Foo::Bar::Direction::*, for ease of use?

  • I don't think people will mind if the line with match gets a little longer
  • This allows for narrower scoping than a preceding use statement
  • I'm impartial to the exact keyword used, but using is somewhat akin to use and so identifies the usage more clearly?
    • @berkus brings up the possible use of in instead, which I also find quite intuitive
  • I personally don't like the keyword as for this use-case, since as tends to be reserved as a keyword for renaming things
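For comparison, the closest thing available in stable Rust today is a glob import scoped to a block or function body. The sketch below assumes a hypothetical module layout foo::bar::Direction (lowercased from the Foo::Bar path in the proposal) and a hypothetical function turn_right:

```rust
// Hypothetical module layout mirroring the path used in the proposal.
mod foo {
    pub mod bar {
        #[derive(Debug, Clone, Copy, PartialEq)]
        pub enum Direction {
            North,
            East,
            South,
            West,
        }
    }
}

fn turn_right(dir: foo::bar::Direction) -> foo::bar::Direction {
    // The glob import is visible only inside this function body.
    use foo::bar::Direction::*;
    match dir {
        North => East,
        East => South,
        South => West,
        West => North,
    }
}

fn main() {
    println!("{:?}", turn_right(foo::bar::Direction::North));
}
```

One known hazard with this approach is that a misspelled variant in a pattern becomes a catch-all binding, which the compiler only flags with warnings. A dedicated syntax tied to the scrutinee's type could presumably reject such names outright.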

However, this comment on #2830 brings up the very valid point of match statements that destructure composite types, and hence different (possibly conflicting) type-names.
For which I'd like to offer an extension to the syntax, to mirror type-specification in function declarations:

match (fruit, company) using (
    fruit: Foo::Fruits,
    company: Bar::Companies
) {
    (Apple, Google) => { ... }
    (Orange, Samsung) => { ... }
    (Durian, Apple) => { ... }
    _ => { ... }
}

This is another reason I would prefer a keyword like using, because it lends itself well to pluralisation.
But, I do understand that it's probably annoying to have to add a new keyword to the lexer, especially with this strange extended syntax that the parser will have to make heads or tails of.
Please see Amendment, for revised thoughts

I might be very wrong, but as far as I can tell, any super deep nesting would require either:

  • manual construction of a complicated scrutinee
    • in which case, the one that constructed that scrutinee can also construct a complicated type specification :)))
  • existing types to pull from
    • which the compiler should(?) be able to infer, since those types will have to have been explicitly defined somewhere

Well, these are my two cents on the matter (ok, a bit more than two). Like most other programmers I dislike writing more code than I need to, but I also appreciate both explicit and strong typing, and the power that that affords match statements; thus, I feel that being able to scope a type just where you need it would be a great solution to the verbosity. This is an idea that borrows from a great many other ideas, but I would love to see something added to the language to address this, either way.
And for anyone that's read all my ramblings, thank you, I appreciate it, have a lovely day ^-^

Amendment: it has occurred to me that implementing such a complex new syntax would perhaps be quite annoying. What I wonder is whether it would be possible to simply allow type annotations within the match line, e.g.

match (fruit, company): (Foo::Fruits, Bar::Companies)
{
    (Apple, Google) => { ... }
    (Orange, Samsung) => { ... }
    (Durian, Apple) => { ... }
    _ => { ... }
}

These type annotations (or something similar) should be able to afford the compiler the right amount of information, while still being quite simple and familiar.
The points I made above about more complicated scrutinees still hold true here, I'm just modifying the syntax a little.

@JoshuaBrest
Author

JoshuaBrest commented Jan 14, 2025 via email

@lemon-gith

I don’t like it. You have to know what you’re writing before you write it to use this syntax.

Interesting, please forgive my confusion, but could you please give me an example of when you would write code that you didn't know the type of?

Also, as a note, I'm not asking for this explicit typing to be enforced in all match statements, just to be permitted for those that we would like to simplify.

let fred = (fruit, company);
match fred: (Foo::Fruits, Bar::Companies) {
    (x, Google) if x != Apple => { ... },
    (Orange, Samsung) => { ... },
    y if (<y condition>) => { ... },
    _ => { ... }
}

This would also be fine, using match guards to filter things out. Since the exact type of the value being passed is known, the compiler should be able to type its constituents whenever the type can be destructured, no?

Please do let me know if I'm missing something glaringly obvious.

@idanarye

when you would write code that you didn't know the type of?

I think "didn't know" is too strong. The way I see it, this feature is for cases where fully writing the type will be cumbersome and redundant to understanding the code.

Here is a simplistic example:

use std::collections::hash_map::Entry;
use std::collections::HashMap;

fn main() {
    let mut hashmap = HashMap::<usize, usize>::new();

    for (i, num) in [1, 2, 3, 2, 1].into_iter().enumerate() {
        match hashmap.entry(num) {
            Entry::Occupied(entry) => {
                println!("Already seen {num} at index {}", entry.get());
            }
            Entry::Vacant(entry) => {
                println!("Inserting new entry for {num} - index {i}");
                entry.insert_entry(i);
            }
        }
    }
}

The important type here is HashMap. Entry is an implementation detail: it isn't even re-exported (pub use) from std::collections alongside HashMap, so you have to reach into std::collections::hash_map to get it.

Does spelling out Entry (which means being forced to import it or fully qualify it) help understanding what this code does?

@igotfr

igotfr commented Jan 23, 2025

@lemon-gith Why not just use as instead of using, :, or anything else?

@daniel-pfeiffer

daniel-pfeiffer commented Feb 2, 2025

This has also long annoyed me. I find it should be possible to put declarative statements at the beginning of every block, like:

match (fruit, company) {
    use Foo::Fruits::*;
    use Bar::Companies::*;

    (_::Apple, Google) => { ... } // Hmmm, ambiguous example, but that’s probably rare.
    (Orange, Samsung) => { ... }
    (Durian, _::Apple) => { ... }
    _ => { ... }
}

I came here after having felt a need for generalised _ in values.

fn f_unit() -> SomeLongnamedUnitType {
    // SomeLongnamedUnitType
    _
}

fn f_struct(x: f32) -> SomeComplicatedStruct<Type> {
    // SomeComplicatedStruct::<Type> { x }
    _ { x }
}

/* Edit: commented out, because this is actually pretty much a function, not a type syntax
fn f_tuple(x: f32) -> SomeComplicatedTuple<Type> {
    //  SomeComplicatedTuple::<Type>(x)
    _(x)
} */

I guess the objective should not be to squeeze it into every edge case. Instead, accept it only where there is unambiguously one type that _ can stand for. Otherwise, the error message would list the alternatives found.
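For context, stable Rust already lets the expected return type pick the constructed type in a few trait-based ways. The sketch below is only an approximation of the examples above: it assumes f32 as the type parameter, and the Default derive and From impl are additions made for illustration:

```rust
#[derive(Debug, Default, PartialEq)]
struct SomeLongnamedUnitType;

#[derive(Debug, PartialEq)]
struct SomeComplicatedStruct<T> {
    x: T,
}

// Assumed conversion, added for this example only.
impl<T> From<T> for SomeComplicatedStruct<T> {
    fn from(x: T) -> Self {
        // `Self` already avoids repeating the type name inside an impl.
        Self { x }
    }
}

fn f_unit() -> SomeLongnamedUnitType {
    // The return type determines which `Default` impl is called.
    Default::default()
}

fn f_struct(x: f32) -> SomeComplicatedStruct<f32> {
    // The return type determines which `Into` impl is called.
    x.into()
}

fn main() {
    println!("{:?} {:?}", f_unit(), f_struct(1.5));
}
```

These workarounds need a trait impl per type, whereas the _ { x } form would work directly with the struct's fields.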
