Abstract: Static data is known to require sophisticated analysis. However, data generated in real time, e.g., via simulations, also requires advanced analytical approaches. This project involves developing and testing infrastructure to support the integration of ensemble analysis methods with the Apache Spark ecosystem. Focus points include the characterization and optimization of various Spark partitioning algorithms, along with a potential model for streaming applications. The motivation stems from the need to eventually integrate Spark's built-in streaming capabilities with iterative simulation-analysis workflows.