Strand-based computing hardware and dynamically optimizing strandware are included in a high performance microprocessor system. The system operates in real time, automatically and unobservably, to parallelize single-threaded software into a plurality of parallel strands for execution by cores implemented in a multi-core and/or multi-threaded microprocessor of the system. The microprocessor executes a native instruction set tailored for speculative multithreading. The strandware directs hardware of the microprocessor to collect dynamic profiling information while executing the single-threaded software. The strandware analyzes the profiling information for the parallelization, and uses binary translation and dynamic optimization to produce native instructions that are stored in a translation cache. The translation cache is later accessed to execute the produced native instructions instead of some of the single-threaded software. The system is capable of parallelizing a plurality of single-threaded software applications (e.g., application software, device drivers, operating system routines or kernels, and hypervisors).
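The profile, translate, and cache flow described above can be illustrated with a minimal software sketch. It is not the claimed hardware or strandware: the names `TranslationCache`, `HOT_THRESHOLD`, `interpret_sum`, and `translate_sum` are hypothetical, the hot-region threshold is an assumed policy, and Python threads merely stand in for strands running on separate cores.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical policy: a region is translated once it has run this many times.
HOT_THRESHOLD = 3

Region = Callable[[List[int]], int]


class TranslationCache:
    """Toy model: profile regions by address, cache a translated version."""

    def __init__(self) -> None:
        self.profile: Counter = Counter()    # execution counts per address
        self.cache: Dict[int, Region] = {}   # address -> translated region

    def run_region(
        self,
        address: int,
        interpret: Region,
        translate: Callable[[], Region],
        data: List[int],
    ) -> Tuple[int, str]:
        # Prefer the translated code if the cache already holds it.
        translated = self.cache.get(address)
        if translated is not None:
            return translated(data), "translated"

        # Otherwise run the original code and record profiling information.
        self.profile[address] += 1
        result = interpret(data)

        # Once the region is hot, produce and store the translated version.
        if self.profile[address] >= HOT_THRESHOLD:
            self.cache[address] = translate()
        return result, "interpreted"


def interpret_sum(data: List[int]) -> int:
    """Stand-in for the original single-threaded code."""
    total = 0
    for value in data:
        total += value
    return total


def translate_sum() -> Region:
    """Stand-in for binary translation: returns a two-strand version."""

    def parallel_sum(data: List[int]) -> int:
        mid = len(data) // 2
        # Each half plays the role of one strand.
        with ThreadPoolExecutor(max_workers=2) as pool:
            return sum(pool.map(sum, (data[:mid], data[mid:])))

    return parallel_sum
```

In this sketch the first three calls for a given address run the original loop, and later calls are served from the cache with the same result, which mirrors the requirement that the parallelization be unobservable to the software.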