mmtk/vm/slot.rs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363
//! This module provides the trait [`Slot`] and related traits and types which allow VMs to
//! customize the layout of slots and the behavior of loading and updating object references in
//! slots.
use std::hash::Hash;
use std::marker::PhantomData;
use std::{fmt::Debug, ops::Range};
use atomic::Atomic;
use crate::util::constants::{BYTES_IN_ADDRESS, LOG_BYTES_IN_ADDRESS};
use crate::util::{Address, ObjectReference};
/// `Slot` is an abstraction for MMTk to load and update object references in memory.
///
/// # Slots and the `Slot` trait
///
/// In a VM, a slot can contain an object reference or a non-reference value. It can be in an
/// object (a.k.a. a field), on the stack (i.e. a local variable) or in any other places (such as
/// global variables). It may have different representations in different VMs. Some VMs put a
/// direct pointer to an object into a slot, while others may use compressed pointers, tagged
/// pointers, offsetted pointers, etc. Some VMs (such as JVM) have null references, and others
/// (such as CRuby and JavaScript engines) can also use tagged bits to represent non-reference
/// values such as small integers, `true`, `false`, `null` (a.k.a. "none", "nil", etc.),
/// `undefined`, etc.
///
/// In MMTk, the `Slot` trait is intended to abstract out such different representations of
/// reference fields (compressed, tagged, offsetted, etc.) among different VMs. From MMTk's point
/// of view, **MMTk only cares about the object reference held inside the slot, but not
/// non-reference values**, such as `null`, `true`, etc. When the slot is holding an object
/// reference, we can load the object reference from it, and we can update the object reference in
/// it after the GC moves the object.
///
/// # The `Slot` trait has pointer semantics
///
/// A `Slot` value *points to* a slot, and is not the slot itself. In fact, the simplest
/// implementation of the `Slot` trait ([`SimpleSlot`], see below) can simply contain the address of
/// the slot.
///
/// A `Slot` can be [copied](std::marker::Copy), and the copied `Slot` instance points to the same
/// slot.
///
/// # How to implement `Slot`?
///
/// If a reference field of a VM is word-sized and holds the raw pointer to an object, and uses the
/// 0 word as the null pointer, it can use the default [`SimpleSlot`] we provide. It simply
/// contains a pointer to a memory location that holds an address.
///
/// ```rust
/// pub struct SimpleSlot {
/// slot_addr: *mut Atomic<Address>,
/// }
/// ```
///
/// In other cases, the VM need to implement its own `Slot` instances.
///
/// For example:
/// - The VM uses **compressed pointers** (Compressed OOPs in OpenJDK's terminology), where the
/// heap size is limited, and a 64-bit pointer is stored in a 32-bit slot.
/// - The VM uses **tagged pointers**, where some bits of a word are used as metadata while the
/// rest are used as pointer.
/// - The VM uses **offsetted pointers**, i.e. the value of the field is an address at an offset
/// from the [`ObjectReference`] of the target object. Such offsetted pointers are usually used
/// to represent **interior pointers**, i.e. pointers to an object field, an array element, etc.
///
/// If needed, the implementation of `Slot` can contain not only the pointer, but also additional
/// information. The `OffsetSlot` example below also contains an offset which can be used when
/// decoding the pointer. See `src/vm/tests/mock_tests/mock_test_slots.rs` for more concrete
/// examples, such as `CompressedOopSlot` and `TaggedSlot`.
///
/// ```rust
/// pub struct OffsetSlot {
/// slot_addr: *mut Atomic<Address>,
/// offset: usize,
/// }
/// ```
///
/// When loading, `Slot::load` shall load the value from the slot and decode the value into a
/// regular `ObjectReference` (note that MMTk has specific requirements for `ObjectReference`, such
/// as being aligned, pointing inside an object, and cannot be null. Please read the doc comments
/// of [`ObjectReference`] for details). The decoding is VM-specific, but usually involves removing
/// tag bits and/or adding an offset to the word, and (in the case of compressed pointers) extending
/// the word size. By doing this conversion, MMTk can implement GC algorithms in a VM-neutral way,
/// knowing only `ObjectReference`.
///
/// When GC moves object, `Slot::store` shall convert the updated `ObjectReference` back to the
/// slot-specific representation. Compressed pointers remain compressed; tagged pointers preserve
/// their tag bits; and offsetted pointers keep their offsets.
///
/// # Performance notes
///
/// The methods of this trait are called on hot paths. Please ensure they have high performance.
///
/// The size of the data structure of the `Slot` implementation may affect the performance as well.
/// During GC, MMTk enqueues `Slot` instances, and its size affects the overhead of copying. If
/// your `Slot` implementation has multiple fields or uses `enum` for multiple kinds of slots, it
/// may have extra cost when copying or decoding. You should measure it. If the cost is too much,
/// you can implement `Slot` with a tagged word. For example, the [mmtk-openjdk] binding uses the
/// low order bit to encode whether the slot is compressed or not.
///
/// [mmtk-openjdk]: https://github.com/mmtk/mmtk-openjdk/blob/master/mmtk/src/slots.rs
///
/// # About weak references
///
/// This trait only concerns the representation (i.e. the shape) of the slot, not its semantics,
/// such as whether it holds strong or weak references. Therefore, one `Slot` implementation can be
/// used for both slots that hold strong references and slots that hold weak references.
pub trait Slot: Copy + Send + Debug + PartialEq + Eq + Hash {
/// Load object reference from the slot.
///
/// If the slot is not holding an object reference (For example, if it is holding NULL or a
/// tagged non-reference value. See trait-level doc comment.), this method should return
/// `None`.
///
/// If the slot holds an object reference with tag bits, the returned value shall be the object
/// reference with the tag bits removed.
fn load(&self) -> Option<ObjectReference>;
/// Store the object reference `object` into the slot.
///
/// If the slot holds an object reference with tag bits, this method must preserve the tag
/// bits while updating the object reference so that it points to the forwarded object given by
/// the parameter `object`.
///
/// FIXME: This design is inefficient for handling object references with tag bits. Consider
/// introducing a new updating function to do the load, trace and store in one function.
/// See: <https://github.com/mmtk/mmtk-core/issues/1033>
///
/// FIXME: This method is currently used by both moving GC algorithms and the subsuming write
/// barrier ([`crate::memory_manager::object_reference_write`]). The two reference writing
/// operations have different semantics, and need to be implemented differently if the VM
/// supports offsetted or tagged references.
/// See: <https://github.com/mmtk/mmtk-core/issues/1038>
fn store(&self, object: ObjectReference);
/// Prefetch the slot so that a subsequent `load` will be faster.
fn prefetch_load(&self) {
// no-op by default
}
/// Prefetch the slot so that a subsequent `store` will be faster.
fn prefetch_store(&self) {
// no-op by default
}
}
/// A simple slot implementation that represents a word-sized slot which holds the raw address of
/// an `ObjectReference`, or 0 if it is holding a null reference.
///
/// It is the default slot type, and should be suitable for most VMs.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct SimpleSlot {
slot_addr: *mut Atomic<Address>,
}
impl SimpleSlot {
/// Create a simple slot from an address.
///
/// Arguments:
/// * `address`: The address in memory where an `ObjectReference` is stored.
pub fn from_address(address: Address) -> Self {
Self {
slot_addr: address.to_mut_ptr(),
}
}
/// Get the address of the slot.
///
/// Return the address at which the `ObjectReference` is stored.
pub fn as_address(&self) -> Address {
Address::from_mut_ptr(self.slot_addr)
}
}
unsafe impl Send for SimpleSlot {}
impl Slot for SimpleSlot {
fn load(&self) -> Option<ObjectReference> {
let addr = unsafe { (*self.slot_addr).load(atomic::Ordering::Relaxed) };
ObjectReference::from_raw_address(addr)
}
fn store(&self, object: ObjectReference) {
unsafe { (*self.slot_addr).store(object.to_raw_address(), atomic::Ordering::Relaxed) }
}
}
/// For backword compatibility, we let `Address` implement `Slot` with the same semantics as
/// [`SimpleSlot`] so that existing bindings that use `Address` as `Slot` can continue to work.
///
/// However, we should use `SimpleSlot` directly instead of using `Address`. The purpose of the
/// `Address` type is to represent an address in memory. It is not directly related to fields
/// that hold references to other objects. Calling `load()` and `store()` on an `Address` does
/// not indicate how many bytes to load or store, or how to interpret those bytes. On the other
/// hand, `SimpleSlot` is all about how to access a field that holds a reference represented
/// simply as an `ObjectReference`. The intention and the semantics are clearer with
/// `SimpleSlot`.
impl Slot for Address {
fn load(&self) -> Option<ObjectReference> {
let addr = unsafe { Address::load(*self) };
ObjectReference::from_raw_address(addr)
}
fn store(&self, object: ObjectReference) {
unsafe { Address::store(*self, object) }
}
}
#[test]
fn a_simple_slot_should_have_the_same_size_as_a_pointer() {
assert_eq!(
std::mem::size_of::<SimpleSlot>(),
std::mem::size_of::<*mut libc::c_void>()
);
}
/// A abstract memory slice represents a piece of **heap** memory which may contains many slots.
pub trait MemorySlice: Send + Debug + PartialEq + Eq + Clone + Hash {
/// The associate type to define how to access slots from a memory slice.
type SlotType: Slot;
/// The associate type to define how to iterate slots in a memory slice.
type SlotIterator: Iterator<Item = Self::SlotType>;
/// Iterate object slots within the slice. If there are non-reference values in the slice, the iterator should skip them.
fn iter_slots(&self) -> Self::SlotIterator;
/// The object which this slice belongs to. If we know the object for the slice, we will check the object state (e.g. mature or not), rather than the slice address.
/// Normally checking the object and checking the slice does not make a difference, as the slice is part of the object (in terms of memory range). However,
/// if a slice is in a different location from the object, the object state and the slice can be hugely different, and providing a proper implementation
/// of this method for the owner object is important.
fn object(&self) -> Option<ObjectReference>;
/// Start address of the memory slice
fn start(&self) -> Address;
/// Size of the memory slice
fn bytes(&self) -> usize;
/// Memory copy support
fn copy(src: &Self, tgt: &Self);
}
/// Iterate slots within `Range<Address>`.
pub struct AddressRangeIterator {
cursor: Address,
limit: Address,
}
impl Iterator for AddressRangeIterator {
type Item = Address;
fn next(&mut self) -> Option<Self::Item> {
if self.cursor >= self.limit {
None
} else {
let slot = self.cursor;
self.cursor += BYTES_IN_ADDRESS;
Some(slot)
}
}
}
impl MemorySlice for Range<Address> {
type SlotType = Address;
type SlotIterator = AddressRangeIterator;
fn iter_slots(&self) -> Self::SlotIterator {
AddressRangeIterator {
cursor: self.start,
limit: self.end,
}
}
fn object(&self) -> Option<ObjectReference> {
None
}
fn start(&self) -> Address {
self.start
}
fn bytes(&self) -> usize {
self.end - self.start
}
fn copy(src: &Self, tgt: &Self) {
debug_assert_eq!(src.bytes(), tgt.bytes());
debug_assert_eq!(
src.bytes() & ((1 << LOG_BYTES_IN_ADDRESS) - 1),
0,
"bytes are not a multiple of words"
);
// Raw memory copy
unsafe {
let words = tgt.bytes() >> LOG_BYTES_IN_ADDRESS;
let src = src.start().to_ptr::<usize>();
let tgt = tgt.start().to_mut_ptr::<usize>();
std::ptr::copy(src, tgt, words)
}
}
}
/// Memory slice type with empty implementations.
/// For VMs that do not use the memory slice type.
#[derive(Debug, PartialEq, Eq, Clone, Hash)]
pub struct UnimplementedMemorySlice<SL: Slot = SimpleSlot>(PhantomData<SL>);
/// Slot iterator for `UnimplementedMemorySlice`.
pub struct UnimplementedMemorySliceSlotIterator<SL: Slot>(PhantomData<SL>);
impl<SL: Slot> Iterator for UnimplementedMemorySliceSlotIterator<SL> {
type Item = SL;
fn next(&mut self) -> Option<Self::Item> {
unimplemented!()
}
}
impl<SL: Slot> MemorySlice for UnimplementedMemorySlice<SL> {
type SlotType = SL;
type SlotIterator = UnimplementedMemorySliceSlotIterator<SL>;
fn iter_slots(&self) -> Self::SlotIterator {
unimplemented!()
}
fn object(&self) -> Option<ObjectReference> {
unimplemented!()
}
fn start(&self) -> Address {
unimplemented!()
}
fn bytes(&self) -> usize {
unimplemented!()
}
fn copy(_src: &Self, _tgt: &Self) {
unimplemented!()
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn address_range_iteration() {
let src: Vec<usize> = (0..32).collect();
let src_slice = Address::from_ptr(&src[0])..Address::from_ptr(&src[0]) + src.len();
for (i, v) in src_slice.iter_slots().enumerate() {
assert_eq!(i, unsafe { v.load::<usize>() })
}
}
#[test]
fn memory_copy_on_address_ranges() {
let src = [1u8; 32];
let mut dst = [0u8; 32];
let src_slice = Address::from_ptr(&src[0])..Address::from_ptr(&src[0]) + src.len();
let dst_slice =
Address::from_mut_ptr(&mut dst[0])..Address::from_mut_ptr(&mut dst[0]) + src.len();
MemorySlice::copy(&src_slice, &dst_slice);
assert_eq!(dst.iter().sum::<u8>(), src.len() as u8);
}
}