File Coordination Library
- I/O system resources such as the MDS and OSS/OST are shared among the processes of a single application and among multiple programs running simultaneously.
- As HPC systems grow larger, each I/O system resource receives more I/O requests and its load becomes heavier.
- Heavy load on I/O system resources degrades I/O performance and, in turn, application performance.
- This is one of the scalability issues of leadership-class high-performance computing systems.
- We target the case in which each process of a parallel application creates its own file and writes to it.
- Many parallel applications adopt this I/O pattern.
- We show the performance issue of parallel I/O on current parallel file systems.
- We propose an I/O coordination technique to mitigate the issue.
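The file-per-process pattern targeted above can be illustrated with a minimal sketch. It is not taken from the library: the names `rank_file_path`, `write_own_file`, and `simulate_file_per_process`, as well as the file naming scheme, are hypothetical, and a plain loop over ranks stands in for a real parallel launch.

```python
import os
import tempfile


def rank_file_path(out_dir, rank):
    """Hypothetical naming scheme: one output file per process rank."""
    return os.path.join(out_dir, f"output.{rank:06d}.dat")


def write_own_file(out_dir, rank, payload):
    """Each process creates its own file and writes only to that file."""
    path = rank_file_path(out_dir, rank)
    with open(path, "wb") as f:
        f.write(payload)
    return path


def simulate_file_per_process(num_procs, bytes_per_proc):
    """Stand-in for a parallel run: loop over ranks instead of launching processes."""
    out_dir = tempfile.mkdtemp(prefix="fpp_")
    paths = [
        write_own_file(out_dir, rank, bytes([rank % 256]) * bytes_per_proc)
        for rank in range(num_procs)
    ]
    return out_dir, paths
```

With N processes this produces N independent files, so the file system sees N concurrent create and write streams with no coordination among them.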
Parallel I/O performance of current parallel file systems
I/O Performance [Striping]
- File striping causes performance degradation.
- This is because striping increases the number of write processes accessing each OST.
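The effect of striping on the number of writers per OST can be checked with a small counting sketch. It rests on assumptions that are not stated in the slides: files are placed round-robin (process `rank` starts on OST `rank % num_osts`), and every file is large enough to touch all of its stripes. The function name `writers_per_ost` is hypothetical.

```python
from collections import Counter


def writers_per_ost(num_procs, num_osts, stripe_count):
    """Count how many writing processes touch each OST.

    Assumption: the file of process `rank` starts on OST `rank % num_osts`
    and spans `stripe_count` consecutive OSTs (wrapping around).
    """
    counts = Counter()
    for rank in range(num_procs):
        first = rank % num_osts
        for k in range(min(stripe_count, num_osts)):
            counts[(first + k) % num_osts] += 1
    return [counts[ost] for ost in range(num_osts)]


print(writers_per_ost(8, 4, 1))  # [2, 2, 2, 2]
print(writers_per_ost(8, 4, 4))  # [8, 8, 8, 8]
```

Under these assumptions, 8 processes on 4 OSTs give 2 writers per OST without striping, but 8 writers per OST when every file is striped across all 4 OSTs.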
I/O Performance [write size]
- Stripe count: 1 (no striping)
- Stripe offset: the OST on the same Z-axis
- When the number of write processes is small (1-4 writers per OST), a larger stripe size gives better I/O performance.
- As the number of write processes grows, the performance of the OST decreases.
Sequence of Write Operation
The Elapsed Time of File Output
The Basic Performance of IOC
- Stripe size: 16 MB
- Number of parallel writers per OST: 2 processes
- NICAM: Nonhydrostatic Icosahedral Atmospheric Model
- Under development in cooperation with CCSR and JAMSTEC
Environment and Configuration
Output Time (320 nodes)
Output Time (1280 nodes)
- We show the performance issue of the parallel I/O on current parallel file systems.
- As the number of processes accessing an OST grows, the workload of the OST grows and its I/O performance drops.
- We propose the I/O coordination technique to mitigate the issue.
- Coordinator processes coordinate I/O requests from computing nodes and control the number of I/O requests that each OST has to handle at the same time.
- The performance evaluation showed better performance than the original I/O.
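The coordination idea summarized above can be sketched as follows. This is not the library's implementation: it uses threads inside one process as a stand-in for coordinator processes and computing nodes, the class `OstCoordinator` and the function `run_demo` are hypothetical names, and the limit of 2 writers per OST is borrowed from the evaluation setting only as an example value.

```python
import threading
import time


class OstCoordinator:
    """Caps the number of writes that each OST handles at the same time."""

    def __init__(self, num_osts, max_writers_per_ost):
        # One counting semaphore per OST: a writer must hold a slot to write.
        self._slots = [threading.Semaphore(max_writers_per_ost) for _ in range(num_osts)]
        self._lock = threading.Lock()
        self._active = [0] * num_osts
        self.peak = [0] * num_osts  # highest concurrency observed per OST
        self.completed = 0

    def write(self, ost, do_write):
        with self._slots[ost]:  # blocks while the OST is at its limit
            with self._lock:
                self._active[ost] += 1
                self.peak[ost] = max(self.peak[ost], self._active[ost])
            try:
                do_write()
            finally:
                with self._lock:
                    self._active[ost] -= 1
                    self.completed += 1


def run_demo(num_writers=16, num_osts=2, max_writers_per_ost=2):
    """Launch writers that all want to write at once; a sleep simulates the write."""
    coord = OstCoordinator(num_osts, max_writers_per_ost)
    threads = [
        threading.Thread(
            target=coord.write,
            args=(rank % num_osts, lambda: time.sleep(0.01)),
        )
        for rank in range(num_writers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return coord
```

All 16 writers eventually finish, but no OST ever serves more than `max_writers_per_ost` of them concurrently, which is the property the coordinator processes are described as enforcing.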