The L2 cache line size is 32 bytes on T4, down from 64 bytes on earlier T-series processors. As a result, a BIS instruction prefetches only 32 bytes. Jbb2005 runs show that prefetching 64 bytes is still better on T4, so two BIS instructions should be issued.
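The arithmetic behind issuing two BIS instructions can be sketched as below. This is a standalone illustration, not HotSpot code: the helper name stores_needed() and the self-check function are hypothetical, and the sizes are the ones quoted above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of HotSpot): number of prefetching stores
// needed to cover `prefetch_bytes` when each store touches `line_bytes`.
constexpr std::size_t stores_needed(std::size_t prefetch_bytes,
                                    std::size_t line_bytes) {
    return (prefetch_bytes + line_bytes - 1) / line_bytes;  // round up
}

// Small self-check using the cache line sizes quoted in the description.
inline void check_stores_needed() {
    assert(stores_needed(64, 64) == 1);  // earlier T series: one BIS covers 64 bytes
    assert(stores_needed(64, 32) == 2);  // T4: two BIS are needed for 64 bytes
}
```

With a 32-byte line, covering the same 64 bytes takes two stores, which matches the two BIS instructions described above.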
BIS can't be used for general prefetching since it may fault. A new PrefetchAllocation node was added for allocation prefetching.
Changed prefetchAlloc_bis parameter from memory to regP.
Use AllocatePrefetchInstr on Sparc to specify which instruction to use for allocation prefetching (0: prefetch write, 1: BIS).
Added new cacheLineAdrX instructions on Sparc to reduce the number of instructions generated for finding the next cache line address.
Added a new flag, AllocateInstancePrefetchLines, to specify the number of lines to prefetch for instance allocation.
L1_data_cache_line_size() renamed to prefetch_data_size().
Prefetch instructions in the x86 .ad files now use MacroAssembler instructions.
Added Abstract_VM_Version::reserve_for_allocation_prefetch() method used in ThreadLocalAllocBuffer::end_reserve().
I have to use FLAG_SET_ERGO() for the AllocatePrefetchLines*2 setting since VM_Version::initialize() is called twice on Sparc.
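One plausible reading of why an ergonomic flag setting helps when initialization runs twice is sketched below. It assumes the doubling is guarded by a check that the flag is still at its default origin. The types and the function here (Origin, IntFlag, initialize_prefetch_lines) are hypothetical stand-ins, not the HotSpot flag API.

```cpp
#include <cassert>

// Hypothetical stand-in for where a flag's value came from.
enum class Origin { Default, CommandLine, Ergonomic };

// Hypothetical stand-in for an integer VM flag.
struct IntFlag {
    int value;
    Origin origin;
};

// Mimics an initialize() routine that may run more than once.
// The value is doubled only while the flag is still at its default origin;
// recording the Ergonomic origin turns any later call into a no-op, and a
// value given on the command line is never touched.
inline void initialize_prefetch_lines(IntFlag& lines) {
    if (lines.origin == Origin::Default) {
        lines.value *= 2;
        lines.origin = Origin::Ergonomic;
    }
}
```

Calling initialize_prefetch_lines() twice on a default flag with value 3 leaves it at 6 rather than 12. A plain assignment that did not record the changed origin would double the value on each call.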