|
In image processing, demand is growing for recognition, including facial feature detection and computational radiology; for mining, in financial analytics and multimedia research; and for synthesis, in photorealistic rendering and physical simulation.
Typical Recognition-Mining-Synthesis applications are, for example, recognizing a tumour, finding its incidence in large datasets of tumours, and predicting how a particular tumour will progress. In the end, it is all about dealing efficiently with complex multimodal datasets and having an adequate amount of computing power.
Intel's terascale research programme tries to cover these new market needs. The company is developing scalable multi-core architectures on silicon. Memory performance, whether stacked or shared, is a particularly challenging issue, as are thermal constraints. High-bandwidth I/O and communications play a big role too. Evidently, the principal concern is cost. The execution environment is complex and must be highly intelligent because it has to support thousands of threads. It depends on the model-based applications, including virtual environments, educational simulation, financial modelling, media search and manipulation, and web mining.
Intel has evolved from a multi-core to a many-core policy. There are large and small cores, and the company looks at the type of application to decide on the right number of cores. There are other parameters too, such as the cache.
The company began to develop terascale experimental hardware, including the Polaris prototype and 3D stacked memory, and started Silicon Photonics research, investigating optical fiber, multiplexers, and so on. Then Intel started to move the experimental technology to the next level, the optical memory link. The question is how to provision large amounts of memory for this many-core technology. The proof of concept addressed latency impact and initialization protocols. Future work will consist of optical packaging and integration.
The blade form factor constrains near-memory capacity. One needs large remote memory with near-memory latency attributes. Electrical solutions are power- and wire-hungry. Optical links offer a scalable solution with minimal latency impact, according to the speaker.
The programming challenge is that engineers face irregular patterns and data structures. Scaling to multi-core today is hard, but scaling to many-core tomorrow will be even harder. One development raising much interest in the community is Ct, a throughput programming language. The user writes serial-like, core-independent C++ code. The primary data abstraction is the nested vector, which supports dense, sparse, and irregular data. The Ct parallel runtime scales automatically to increasing core counts. The Ct JIT compiler provides auto-vectorization for SSE, AVX, and Larrabee.
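To give a feel for why a nested vector suits irregular data, here is a minimal sketch in Python. It is not Ct code and does not reproduce the Ct API; the class NestedVector and the function sparse_matvec are hypothetical names chosen for illustration. The sketch assumes the common representation of ragged data as one flat array plus segment offsets, and uses it for a sparse matrix-vector product in which rows have different lengths.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class NestedVector:
    """Ragged data stored as one flat array plus segment offsets (illustrative only)."""
    values: List[float]
    offsets: List[int]  # segment i spans values[offsets[i]:offsets[i + 1]]

    @classmethod
    def from_rows(cls, rows: Sequence[Sequence[float]]) -> "NestedVector":
        values: List[float] = []
        offsets = [0]
        for row in rows:
            values.extend(row)
            offsets.append(len(values))
        return cls(values, offsets)

    def segment_sums(self) -> List[float]:
        # One reduction per segment; each segment is independent of the others,
        # which is what a parallel runtime could exploit.
        return [
            sum(self.values[start:end])
            for start, end in zip(self.offsets, self.offsets[1:])
        ]


def sparse_matvec(row_values, row_cols, x):
    """Sparse matrix-vector product written as element-wise multiply + segmented sum."""
    products = NestedVector.from_rows(
        [[v * x[c] for v, c in zip(vals, cols)]
         for vals, cols in zip(row_values, row_cols)]
    )
    return products.segment_sums()
```

The code reads like serial code over whole collections, yet every segment can be reduced independently, so the same program could in principle be spread over more cores without being rewritten.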
Intel is working with a Chinese partner on the Ct programming language. The universal parallel computing research centres are a catalyst for enabling the pervasive use of parallel computing.
Moving from theory to practice, Justin Rattner talked about Larrabee, a many-core architecture. Many cores and many threads enable scaling to teraflops. It uses the standard IA programming model. Typical application categories are gaming, graphics, and media; financial services; oil and gas exploration; and medical imaging. You need the best architecture for the best algorithms.
The architecture evolution is on a collision course. There is a battle going on for control of the computing platform, according to the speaker. The CPU is evolving toward throughput computing, motivated by energy-efficient performance. The GPU is evolving toward general-purpose computing.
In order to make parallel computing pervasive, Intel has set up a joint programme. There is a forthcoming collaboration with Cray. Intel is powering 75% of the TOP500 sites. The quad core IA is up to 257. In Europe, Intel is heavily engaged in setting up cost-effective systems at the multi-flop level. Real-world problems are taking us beyond the petascale level. The big issue for the coming years, between 2015 and 2018, is power. Maybe we should look at voltage scaling. Using subthreshold technology could be a solution.
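The appeal of voltage scaling can be illustrated with the textbook approximation that dynamic power is proportional to capacitance times the square of supply voltage times clock frequency. The short Python sketch below is an illustration under that assumption only; the function name relative_dynamic_power is hypothetical, and the model ignores leakage power, which becomes significant near and below the threshold voltage.

```python
def relative_dynamic_power(voltage_scale: float, frequency_scale: float = 1.0) -> float:
    """Dynamic power relative to baseline, assuming P is proportional to C * V^2 * f.

    Capacitance is taken as unchanged; leakage power is not modelled.
    """
    return voltage_scale ** 2 * frequency_scale


# Halving the supply voltage at the same frequency.
half_voltage = relative_dynamic_power(0.5)

# Halving both voltage and frequency, as lower voltage usually forces a slower clock.
half_both = relative_dynamic_power(0.5, 0.5)
```

Under this simplified model, halving the voltage alone cuts dynamic power to a quarter, and halving the frequency as well cuts it to an eighth, which is why running many slower cores at low voltage is attractive for throughput workloads.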
Justin Rattner ended with a Batman quote: "I do nothing that a man of unlimited funds, superb physical endurance, and maximum scientific knowledge could not do." |