How can I ensure that the dynamic type of my custom Scala collection is preserved during a map()?

If the static type of the rna variable is IndexedSeq[Base], the automatically inserted CanBuildFrom cannot be the one defined in the RNA companion object, as the compiler is not supposed to know that rna is an instance of RNA. So where does it come from? The compiler falls back on an instance of GenericCanBuildFrom, the one defined in the IndexedSeq object.

GenericCanBuildFroms produce their builders by calling genericBuilder[B] on the originating collection, and a requirement for that generic builder is that it can produce generic collections that can hold any type B, since the return type of the function passed to map() is not constrained. In this case, RNA is only an IndexedSeq[Base] and not a generic IndexedSeq, so it's not possible to override genericBuilder[B] in RNA to return an RNA-specific builder: we would have to check at runtime whether B is Base or something else, but we cannot do that. I think this explains why, in the question, we get a Vector back. As to how we can fix it, it's an open question…

Edit: Fixing this requires map() to know whether it's mapping to a subtype of A or not. A significant change in the collections library would be needed for this to happen. See the related question "Should Scala's map() behave differently when mapping to the same type?".
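The behavior described above can be reproduced with a small sketch. It assumes Scala 2.12 or earlier (CanBuildFrom was removed in 2.13), and the RNA class here is a simplified, Vector-backed stand-in rather than the asker's actual implementation; the enclosing object name RnaMapDemo is purely illustrative.

```scala
import scala.collection.IndexedSeqLike
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.Builder

object RnaMapDemo {
  sealed trait Base
  case object A extends Base
  case object G extends Base
  case object T extends Base
  case object U extends Base

  // Simplified stand-in for the RNA collection (assumption: backed by a Vector).
  final class RNA private (bases: Vector[Base])
      extends IndexedSeq[Base] with IndexedSeqLike[Base, RNA] {
    // Used by filter, take, drop, ...: looked up dynamically on the instance.
    override protected[this] def newBuilder: Builder[Base, RNA] = RNA.newBuilder
    def apply(idx: Int): Base = bases(idx)
    def length: Int = bases.length
  }

  object RNA {
    def apply(bases: Base*): RNA = new RNA(bases.toVector)
    def newBuilder: Builder[Base, RNA] =
      Vector.newBuilder[Base].mapResult(v => new RNA(v))
    // Only selected when the static type of the receiver is RNA.
    implicit def canBuildFrom: CanBuildFrom[RNA, Base, RNA] =
      new CanBuildFrom[RNA, Base, RNA] {
        def apply(): Builder[Base, RNA] = newBuilder
        def apply(from: RNA): Builder[Base, RNA] = newBuilder
      }
  }

  def main(args: Array[String]): Unit = {
    // Static type deliberately weakened to IndexedSeq[Base].
    val rna: IndexedSeq[Base] = RNA(A, G, T, U)

    // filter goes through newBuilder, so the dynamic type is still RNA.
    println(rna.filter(_ != T).isInstanceOf[RNA]) // true

    // map picks IndexedSeq's generic CanBuildFrom, which calls genericBuilder[B]
    // and therefore cannot produce an RNA.
    println(rna.map(identity).isInstanceOf[RNA]) // false
  }
}
```

Declaring the variable as `val rna: RNA` (or letting the type be inferred) would make the compiler select `RNA.canBuildFrom` instead, and map would then return an RNA.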

This is about why I think it's not a good idea to statically type to a weaker type than RNA. It should really be a comment (because it's more of an opinion), but that would be harder to read.

From your comment to my comment:

"Why not? As a subclass of IndexedSeq[Base], RNA is able to do everything IndexedSeq[Base] does, as per the Liskov substitution principle. Sometimes, all you know is that it's an IndexedSeq, and you still expect filter, map and friends to keep the same specific implementation. Actually, filter does it, but not map."

filter does it because the compiler can statically guarantee it. If you keep elements from a particular collection, you end up with a collection of the same type. map cannot guarantee that; it depends on the function that is passed.

My point is more about the act of explicitly specifying a type and expecting more than it can deliver. As a user of the RNA collection, I may write code that depends on certain properties of this collection, such as efficient memory representation.

So let's assume I state in val rna: IndexedSeq[Base] that rna is just an IndexedSeq. A few lines later I call a method doSomething(rna) where I expect the efficient memory representation. What would be the best signature for that: def doSomething[T](rna: IndexedSeq[Base]): T or def doSomething[T](rna: RNA): T?

I think it should be the latter. But if that's the case, then the code won't compile because rna is not statically an RNA object. If the method signature should be the former, then in essence I'm saying that I don't care about the memory representation efficiency.

So I think the act of specifying a weaker type explicitly but expecting a stronger behavior is a contradiction, which is what you do in your example.

Now I do see that even if I did:

    val rna = RNA(A, G, T, U)
    val rna2 = doSomething(rna)

where somebody else wrote:

    def doSomething[U](seq: IndexedSeq[U]) = seq.map(identity)

I would like to have rna2 be an RNA object, but that won't happen... It means that this somebody else should write a method that takes a CanBuildFrom if they want callers to get more specific types:

    def doSomething[U, To](seq: IndexedSeq[U])(implicit cbf: CanBuildFrom[IndexedSeq[U], U, To]) = seq.map(identity)(cbf)

Then I could call:

    val rna2: RNA = doSomething(rna)(collection.breakOut)
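For reference, here is a self-contained sketch of the CanBuildFrom-taking variant. It again assumes Scala 2.12 or earlier and a simplified, Vector-backed RNA; the object name BreakOutDemo is hypothetical, and the element type parameter is named E instead of U only to avoid confusion with the base U.

```scala
import scala.collection.{IndexedSeqLike, breakOut}
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.Builder

object BreakOutDemo {
  sealed trait Base
  case object A extends Base
  case object G extends Base
  case object T extends Base
  case object U extends Base

  // Simplified stand-in for the RNA collection (assumption: backed by a Vector).
  final class RNA private (bases: Vector[Base])
      extends IndexedSeq[Base] with IndexedSeqLike[Base, RNA] {
    override protected[this] def newBuilder: Builder[Base, RNA] = RNA.newBuilder
    def apply(idx: Int): Base = bases(idx)
    def length: Int = bases.length
  }

  object RNA {
    def apply(bases: Base*): RNA = new RNA(bases.toVector)
    def newBuilder: Builder[Base, RNA] =
      Vector.newBuilder[Base].mapResult(v => new RNA(v))
    implicit def canBuildFrom: CanBuildFrom[RNA, Base, RNA] =
      new CanBuildFrom[RNA, Base, RNA] {
        def apply(): Builder[Base, RNA] = newBuilder
        def apply(from: RNA): Builder[Base, RNA] = newBuilder
      }
  }

  // The "somebody else" method: the result type To is decided by whichever
  // CanBuildFrom the caller supplies, not by the static type of seq.
  def doSomething[E, To](seq: IndexedSeq[E])(
      implicit cbf: CanBuildFrom[IndexedSeq[E], E, To]): To =
    seq.map(identity)(cbf)

  def main(args: Array[String]): Unit = {
    val rna = RNA(A, G, T, U)
    // breakOut lets the expected type (RNA) drive the choice of builder.
    val rna2: RNA = doSomething(rna)(breakOut)
    println(rna2 == rna)
  }
}
```

The cost of this design is that every such utility method has to carry the extra implicit parameter, and callers have to ask for the specific type explicitly.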

Thanks a lot for your detailed explanation. I mostly agree. I just find it a pity that, in this particular case, the consistency present in other parts of the collections library breaks down. Is it really impossible for map to be defined in a way where we could preserve the same type of collection, just like filter?

Maybe something like def smarterMap[B](f: (A) ⇒ B)(implicit sameTypeEv: A =:= B = null), where sameTypeEv is not null if we're mapping to the same type? … Just thinking out loud… – Jean-Philippe Pellet Apr 14 at 13:16

Or rather: (implicit canUseCalleeBuilderEvidence: B – Jean-Philippe Pellet Apr 17 at 18:26
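To make the first idea in the comment concrete, here is a toy sketch of the "implicit evidence with a null default" trick. The Box class and the Either-based return type are hypothetical illustrations and not part of the Scala library; the sketch only shows that the method body can tell at runtime whether the compiler found A =:= B.

```scala
object SmarterMapSketch {
  // Hypothetical wrapper, used only to illustrate the idea from the comment.
  final class Box[A](val items: Vector[A]) {
    // If the compiler can prove A =:= B, the evidence is passed in;
    // otherwise the default value null is used.
    def smarterMap[B](f: A => B)(
        implicit sameTypeEv: A =:= B = null): Either[Vector[B], Box[B]] = {
      val mapped = items.map(f)
      if (sameTypeEv != null) Right(new Box(mapped)) // same element type: keep Box
      else Left(mapped)                              // different type: fall back
    }
  }

  def main(args: Array[String]): Unit = {
    val box = new Box(Vector(1, 2, 3))
    println(box.smarterMap(_ + 1).isRight)      // Int => Int: evidence found
    println(box.smarterMap(_.toString).isRight) // Int => String: no evidence
  }
}
```

Note that =:= demands exact type equality, so a function whose inferred result type is a strict subtype of A would not produce evidence under this sketch.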
