0000299: Redesigning mappings, arrays, lvalues - MantisBT

ID	Project	Category	View Status	Date Submitted	Last Update

0000299	LDMud 3.5	Implementation	public	2004-11-27 00:01	2022-10-06 20:50

Reporter	~~lars~~	Assigned To	Gnomi
Priority	normal	Severity	feature	Reproducibility	N/A
Status	assigned	Resolution	open

Summary	0000299: Redesigning mappings, arrays, lvalues
Description	Short: Redesigning arrays and mappings, and lvalues.
From: Lars
Date: 2002-08-13
Type: Feature
State: New

The current problem with arrays and mappings is that even though both are by-reference objects, they implement this differently. This is most visible in the construct 'lhs += rhs', which creates a copy of lhs for arrays, but not for mappings. The reason is that arrays store their data in the array header structure, whereas mappings store it in a separate memory block.

Strings again act like arrays even though they use a separate memory block for the data, but on the other hand they are easy to duplicate, and it would be unnatural for them to be by-reference datatypes.

Having a by-reference type may feel unusual for programmers coming from other languages, but it is not a functional problem.

Note: Python handles everything by reference. list.append() changes in-place, list + x creates a new list.

Solution 1: Implement the by-reference semantics consistently

This means:

  'lhs = rhs1 + rhs2' always creates a new item, copies the content of rhs1 and rhs2 into it and then assigns the new item to lhs, freeing whatever was in lhs before.

  'lhs += rhs' takes the contents of rhs and adds them to the existing lhs.

LPC already has the semantic that lhs[] works directly on the given array/mapping, but it might help if programmers could specify directly that the lhs is to be duplicated if it is referenced by more than one owner. For example:

  'unique lhs += rhs' would act like 'lhs = lhs + rhs'
  'unique lhs[i] = j' would act like 'lhs = copy(lhs); lhs[i] = j'

'unique' could also be used in a rhs context and would act like copy(). The special form 'unique lhs1, lhs2, lhs3,...' would act like 'lhs1 = copy(lhs1); lhs2 = copy(lhs2); lhs3 = copy(lhs3);'

In order to implement this efficiently, it might be useful to have a separate svalue type for arrays with a fixed number of elements (structs, tuples).

Another idea would be to store the initial elements in the array header structure, and let later changes to the array replace the first svalue entry with a special svalue (T_ARRAY_EXTENSION) pointing to the additional data.

The disadvantage would be that ({}) != ({}) (but ([]) != ([]) holds already anyway).

Solution 2: Implement a by-value semantics.

Both 'lhs = rhs1 + rhs2' and 'lhs += rhs' create a new item, copy the content of both operands into it and then assign the new item to lhs, freeing whatever was in lhs before. The advantage of using the '+=' operator would be that the interpreter can avoid duplicating lhs if it doesn't have more than one reference.

To implement this efficiently, the driver would have to implement a copy-on-write semantic.

To allow the sharing of arrays and mappings, programs would explicitly create references to them, like 'return &foo'.

A good implementation of these references would be to use indirection like this:

  type a = value;      a = (T_TYPE, value)

  type b = &a;
  type c = &b;         a = (T_LVALUE)\
                       b = (T_LVALUE)- (3 refs, (T_TYPE, value))
                       c = (T_LVALUE)/

The lvalue-resolution code would then detect and collapse lvalue holders with only one ref left.

Handling references to subranges or single elements would require some more effort - views maybe? One view would cache the referenced element and write it back to the underlying structure when the underlying structure as a whole is read, or when another view to the same structure is about to be changed. This would imply a back-pointer from the lvalue-holder to the list of views.

With this, the language would need a way to ignore the lvalue mode:

  type a = value;
  type b = &a;

  b = 0;   --> removes value from a and b
  &b = 0;  --> removes value from b, but not from a
  &b = 1;  --> just assigns '1' to b, ignores the '&' as b is not an lvalue.

Another modification would be to allow only read access to the reference:

  type b = const ref a;
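The copy-on-write idea behind Solution 2 can be sketched outside the driver. The following is only an illustration in Python, not LDMud code: the names Block, CowArray, copy(), to_list() and shares_storage_with() are hypothetical, and the explicit 'refs' counter merely stands in for whatever reference count the driver would keep.

```python
class Block:
    """Shared storage with an explicit owner count (stand-in for a driver refcount)."""

    def __init__(self, items):
        self.items = list(items)
        self.refs = 1


class CowArray:
    """Hypothetical by-value array handle backed by copy-on-write storage."""

    def __init__(self, items=()):
        self._block = Block(items)

    def copy(self):
        """Model the assignment 'b = a': share the storage and bump the count."""
        other = CowArray.__new__(CowArray)
        other._block = self._block
        self._block.refs += 1
        return other

    def _make_unique(self):
        """Detach from shared storage before a write; a sole owner keeps its block."""
        if self._block.refs > 1:
            self._block.refs -= 1
            self._block = Block(self._block.items)

    def __iadd__(self, other):
        """Model 'lhs += rhs': duplicate lhs only if it has more than one owner."""
        self._make_unique()
        self._block.items.extend(other._block.items)
        return self

    def __setitem__(self, index, value):
        """Model 'lhs[i] = j' under by-value semantics."""
        self._make_unique()
        self._block.items[index] = value

    def to_list(self):
        return list(self._block.items)

    def shares_storage_with(self, other):
        return self._block is other._block


a = CowArray([1, 2])
b = a.copy()            # 'b = a': storage is shared, nothing is copied yet
b += CowArray([3])      # b has a second owner, so it detaches before growing
print(a.to_list(), b.to_list())  # [1, 2] [1, 2, 3]
```

The point of the sketch is the '+=' case: when the left-hand side has a single owner, _make_unique() does nothing and the data grows in place, which is the saving the description attributes to the operator form.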
Tags	No tags attached.

Gnomi 2022-10-06 20:49 manager ~0002702	We would like to change array semantics into a full reference type (similar to mappings), because the new behavior would be more intuitive than the current one. Changes to the array size (due to operator assignments like += or slice assignments) would then no longer create copies of arrays. The target for this would be LDMud 3.7. As this may break a lot of code, the next step would be to add a pragma (default on) that warns when an array operation (slice assignment, operator assignment) creates a new array and the original array has more than one reference (i.e. when the behavior would be different for an array with pure reference semantics). It would also warn on comparisons with the empty array (==, !=, member, in, -=, &=, mapping lookup).
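Which code the change affects can be illustrated with Python lists, whose operators happen to show both behaviors. This is only an analogy, not LPC: the function names below are hypothetical, and would_warn() simply restates the condition from the note above under the assumption that a reference count is available to the check.

```python
def current_semantics(shared, extra):
    """Model today's array '+=': the left side ends up pointing at a fresh copy."""
    alias = shared
    shared = shared + extra   # builds a new list; alias still sees the old contents
    return shared, alias


def planned_semantics(shared, extra):
    """Model a pure reference type: '+=' grows the one shared list in place."""
    alias = shared
    shared += extra           # list '+=' extends the existing object
    return shared, alias


def would_warn(ref_count, creates_copy):
    """Hypothetical check: warn only if a copy is made while other owners exist."""
    return creates_copy and ref_count > 1


grown, alias = current_semantics([1, 2], [3])
print(grown, alias)   # [1, 2, 3] [1, 2]

grown, alias = planned_semantics([1, 2], [3])
print(grown, alias)   # [1, 2, 3] [1, 2, 3]
```

Code that holds only a single reference to the array sees the same result either way, which is why the warning is limited to arrays with more than one reference.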

Date Modified	Username	Field	Change
2004-11-27 00:01	~~lars~~	New Issue
2009-10-06 02:11	zesstra	Relationship added	related to 0000546
2009-10-06 02:16	zesstra	Project	LDMud => LDMud 3.5
2018-02-04 00:19	Gnomi	Relationship added	has duplicate 0000572
2022-10-06 20:49	Gnomi	Note Added: 0002702
2022-10-06 20:50	Gnomi	Assigned To	=> Gnomi
2022-10-06 20:50	Gnomi	Status	new => assigned