PRE-SYMPOSIUM TUTORIALS, Tuesday, August 3, 1999
You may select either the all-day tutorial, or one tutorial from the morning session and/or one from the afternoon session. Each tutorial registration fee includes attendance at the tutorial session and materials. There are no student fees for the tutorials. Cancellations of tutorial registrations made after July 23, 1999 will be subject to the total fee. We reserve the right to cancel the tutorials due to insufficient participation or other unforeseeable problems, in which case tutorial fees will be refunded in full.
TUESDAY, AUGUST 3, 1999
7:00 a.m. - 6:00 p.m. Registration
9:00 AM - 4:30 PM - Full Day Tutorial
Tutorial 1: Cryptography, Security and Privacy
Prerequisite: Participants should be familiar with networked computing (the Internet, client/server applications, etc.) as well as basic mathematics and computer programming. This basic background information is essential to anyone involved in the Internet today, including technical staff as well as executives.
Tutorials 2, 3, 4, 5, 6 (half-day each)
8:30 AM - 12:00 PM
Tutorial 2: Performance Analysis and Prediction of Large-Scale Scientific Applications
We will begin with definitions (weak scalability, strong scalability, parallel efficiency, etc.) and a short overview of the performance analysis techniques mentioned above. We will then introduce rigorous metrics for performance, both serial and parallel. Performance expectations at a coarse level will be emphasized using examples. We will discuss, in detail, the single most important bottleneck in single-processor performance - the memory subsystem. We will demonstrate how users can obtain diagnostic information about memory performance of their codes and how such information can help predict achievable single-processor performance.
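As an illustration of the terms named above, the following is a minimal sketch of the commonly used textbook definitions of speedup and of strong and weak scaling efficiency. The function names and sample timings are hypothetical and are not taken from the tutorial materials.

```python
def speedup(t_serial, t_parallel):
    """Speedup: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def strong_scaling_efficiency(t_serial, t_parallel, n_procs):
    """Strong scaling: total problem size is fixed as processors are added.

    Ideal run time on n_procs processors is t_serial / n_procs, so the
    efficiency is the speedup divided by the processor count.
    """
    return speedup(t_serial, t_parallel) / n_procs


def weak_scaling_efficiency(t_one_proc, t_n_procs):
    """Weak scaling: problem size per processor is fixed.

    Ideal run time stays constant as processors are added, so the
    efficiency is the one-processor time over the n-processor time.
    """
    return t_one_proc / t_n_procs


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print(strong_scaling_efficiency(100.0, 30.0, 4))
    print(weak_scaling_efficiency(100.0, 125.0))
```

Under these definitions an efficiency of 1.0 corresponds to ideal scaling, and values below 1.0 indicate overhead such as communication or load imbalance.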
With performance goals properly defined, we will offer a discussion of commonly and not-so-commonly utilized techniques for performance optimization of Fortran codes. Serial and parallel performance optimization will be analyzed.
We will then discuss topics related to analytical modeling of performance and scalability of large-scale applications. We will adopt a top-down approach in which computation and communication components are analyzed separately and any overlap between them is considered. We will be careful to differentiate between algorithmic scalability and "real" scalability, where the latter takes into consideration constraints from a specific implementation. Codes from the ASCI workload will be utilized as examples throughout the lecture.
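One simple way to picture the top-down approach described above is a model in which computation time and communication time are estimated separately and the overlapped portion is subtracted. The sketch below is only an assumption about what such a model might look like: the latency-plus-bandwidth communication cost, the `overlap_fraction` parameter, and all names are hypothetical and are not the presenters' model.

```python
def comm_time(n_messages, bytes_per_message, latency_s, bandwidth_bytes_per_s):
    """Estimate communication time with a latency + size/bandwidth cost per message."""
    return n_messages * (latency_s + bytes_per_message / bandwidth_bytes_per_s)


def predicted_time(t_comp, t_comm, overlap_fraction=0.0):
    """Combine separately modeled computation and communication times.

    overlap_fraction is the (assumed) share of the shorter component that is
    hidden behind the longer one: 0.0 means no overlap, 1.0 means the shorter
    component is fully hidden.
    """
    if not 0.0 <= overlap_fraction <= 1.0:
        raise ValueError("overlap_fraction must be between 0 and 1")
    hidden = overlap_fraction * min(t_comp, t_comm)
    return t_comp + t_comm - hidden


if __name__ == "__main__":
    # Hypothetical numbers: 10 messages of 1 MB, 10 microsecond latency, 100 MB/s.
    t_comm = comm_time(10, 1.0e6, 1.0e-5, 1.0e8)
    print(predicted_time(t_comp=10.0, t_comm=t_comm, overlap_fraction=0.5))
```

With no overlap the prediction is the plain sum of the two components; with full overlap it reduces to the larger of the two.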
The tutorial will not emphasize any particular machine; rather, it will generally address performance of applications on RISC processors and on widely utilized parallel systems such as the SGI Origin 2000, IBM SP2 and Cray T3E.
The target audience is a mixture of computational scientists, computer scientists, and code developers interested in performance analysis of "real-life" applications. By carefully defining terms and metrics, we fully expect to overcome the "lingo" barrier associated with a diverse audience, while providing an in-depth understanding of the issues in a manner relevant to all backgrounds. The tutorial will also be useful to those trying to define future-generation, high-end computing needs.
Tutorial 3: The Globus Grid Programming Toolkit
in practice difficult and time consuming, because of the need to deal with complex and highly heterogeneous systems. The Globus grid programming toolkit is designed to help application developers and tool builders overcome these obstacles to the construction of "grid-enabled" scientific and engineering applications. It does this by providing a set of standard services for authentication, resource location, resource allocation, configuration, communication, file access, fault detection, and executable management. These services can be incorporated into applications and/or programming tools in a "mix-and-match" fashion to provide access to needed capabilities.
Our goal in this tutorial is both to introduce the capabilities of the Globus toolkit and to help attendees apply Globus services to their own applications. Hence, we will structure the tutorial as a combination of Globus system description and application examples.
Tutorial 4: Cluster Computing: The Commodity Supercomputing
The questions naturally arise: How do clusters redefine the concepts of traditional supercomputing? How is this different from traditional supercomputing or MPP computing? Does this offer a completely different programming paradigm? How can one build a cluster-based supercomputer, and what are its implications? This tutorial offers answers to all these questions and will also go beyond the hype.
1:30 - 5:00 PM
Tutorial 5: Distributed Systems Performance Analysis Using NetLogger and Pablo
In this tutorial, we will present the NetLogger and Pablo toolkits, both of which are targeted toward understanding and improving the performance of applications in distributed computing environments. The components of each toolkit will be covered, along with case studies showing their use with actual distributed applications.
Participants will gain an understanding of the approaches taken by the two toolkits, become familiar with the capabilities provided by each, and be equipped to assess how they might use the toolkits to improve performance in their own computing environments.
Tutorial 6: High-Performance Computing with Legion
much larger, more complex resource pools. With Legion, for example, a user can easily run a computation on a supercomputer at a national center while dynamically visualizing the results on a local machine. As another example, Legion makes it trivial to schedule and run a large parameter space study on several workstation farms simultaneously. Legion permits computational scientists to use cycles wherever they are, allowing bigger jobs to run in shorter times through higher degrees of parallelization.
Key capabilities include the following:
- Legion eliminates the need to move and install binaries manually on multiple platforms. After Legion schedules a set of tasks over multiple remote machines, it automatically transfers the appropriate binaries to each host. A single job can run on multiple heterogeneous architectures simultaneously; Legion will ensure that the right binaries go to each, and that it only schedules onto architectures for which it has binaries.
- Legion provides a virtual file system that spans all the machines in a Legion system. Input and output files can be seen by all the parts of a computation, even when the computation is split over multiple machines that don't share a common file system. Different users can also use the virtual file system to collaborate, sharing data files and even accessing the same running computations.
- Legion's object-based architecture dramatically simplifies building add-on tools for tasks such as visualization, application steering, load monitoring, and job migration.
- Legion provides optional privacy and integrity of communications for applications distributed over public networks. Multiple users in a Legion system are protected from one another.
These features also make Legion attractive to administrators looking for ways to increase and simplify the use of shared high-performance machines. The Legion implementation emphasizes extensibility, and multiple policies for resource use can be embedded in a single Legion system that spans multiple resources or even administrative domains.
This tutorial will provide background on the Legion system and teach how to run existing parallel codes within the Legion environment. The target audience is supercomputing experts who help scientists and other users get their codes parallelized and running on high performance systems.
6:00 - 7:30 p.m. Evening Reception <Seascape Room>