Carp - Cartesian Product (all-to-all) Computation Framework

Overview

Carp is software that parallelizes a computation over every possible combination of two records in a dataset.
Users do not need to write any parallel code; they write only sequential programs.
All parallelization is handled by Carp.
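
As a rough illustration of the all-to-all pattern described above, the following sequential C sketch visits every pair drawn from two small in-memory datasets. The names for_each_pair, pair_fn and sum_products are hypothetical and are not part of the Carp API; Carp itself distributes this kind of loop across MPI ranks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: processes one pair of records. */
typedef void (*pair_fn)(int x, int y, void *ctx);

/* Visit every (x, y) combination of two datasets (the Cartesian product).
 * Returns the number of pairs visited, which is nx * ny. */
size_t for_each_pair(const int *xs, size_t nx,
                     const int *ys, size_t ny,
                     pair_fn fn, void *ctx)
{
    size_t count = 0;

    assert(fn != NULL);
    for (size_t i = 0; i < nx; i++) {
        for (size_t j = 0; j < ny; j++) {
            fn(xs[i], ys[j], ctx);
            count++;
        }
    }
    return count;
}

/* Example callback: adds x * y to the long pointed to by ctx. */
void sum_products(int x, int y, void *ctx)
{
    *(long *)ctx += (long)x * y;
}
```

Calling for_each_pair with a dataset of 3 records and a dataset of 2 records invokes the callback 6 times, once per combination.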


Directories

src/ Source files
mk/ Makefile headers
bin/ [ps]Carp binaries
+ pCarp Parallel (MPI) version of Carp (for production runs)
+ sCarp Sequential version of Carp (for debugging)
include/ Header file of Carp
lib/ Libraries of Carp
sample/ Sample programs


How to Build

$ make [ COMPILER=<Compiler for pCarp> COMPILER_SCARP=<Compiler for sCarp> ]
Compiler names : gcc, fccpx, gccpx, icc, clang
After the build, you can find the binaries and libraries in src/ and lib/, respectively.



User Programs (Input program, Calculation/Output program)

To use Carp, you must write an Input program and a Calculation/Output program.

Input Program
Reads records and sends them to Carp.
Calculation/Output Program
Receives two records, processes them, and outputs the results.
When Carp executes these programs, it appends the MPI rank number and the number of MPI ranks to their arguments.
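
The sketch below shows one defensive way to read those two appended arguments. It follows the code skeleton in this document, where the rank is the second-to-last argument and the number of ranks is the last one. The function name parse_rank_args is hypothetical, and the range check assumes the usual MPI convention of 0-based ranks, which this document does not state explicitly.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse the two trailing arguments:
 *   argv[argc - 2] = MPI rank, argv[argc - 1] = number of MPI ranks.
 * Returns 0 on success, -1 if they are missing or malformed.
 * Assumption: ranks are 0-based, so 0 <= rank < total. */
int parse_rank_args(int argc, char *argv[], int *rank, int *total)
{
    char *end;
    long r, t;

    assert(rank != NULL && total != NULL);
    if (argc < 3)   /* need program name + rank + total */
        return -1;

    r = strtol(argv[argc - 2], &end, 10);
    if (end == argv[argc - 2] || *end != '\0')
        return -1;
    t = strtol(argv[argc - 1], &end, 10);
    if (end == argv[argc - 1] || *end != '\0')
        return -1;
    if (t <= 0 || r < 0 || r >= t)
        return -1;

    *rank = (int)r;
    *total = (int)t;
    return 0;
}
```

Unlike atoi, strtol lets the program detect a non-numeric argument, which helps when a user program is started by hand without the two extra arguments.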

Sample programs are located in the "sample" directory.
API documentation is here.



User Program Compilation

For pCarp
$ <CC> -I<INCLUDE_DIR> -L<LIB_DIR> source.c -lpcarp
For sCarp
$ <CC> -I<INCLUDE_DIR> -L<LIB_DIR> source.c -lscarp


Execution

Parallel Execution
$ mpiexec pCarp -x "<Reader Prog for Dataset1> [args...]"  [-y "<Reader Prog for Dataset2> [args...]"] -c "<Calculation Prog> [args...]"
Serial Execution
$ sCarp -x "<Reader Prog for Dataset1> [args...]"  [-y "<Reader Prog for Dataset2> [args...]"] -c "<Calculation Prog> [args...]"
* If you omit the -y option, Carp computes the Cartesian power (all possible combinations within a single dataset).

* You can use the following format specifiers in program arguments. Each is replaced by the corresponding value.
%[0][(width)]r MPI rank number
(width) : minimum number of chars
0 : padding with zeroes (default: padding with spaces)
%R MPI rank number, equalize width by padding with zeroes
%t The number of MPI ranks
%% percent sign(%)
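
To make the substitution rules above concrete, here is a small C sketch that expands these specifiers in a string. It is an illustration only, not Carp's implementation. The function name expand_args is hypothetical, and two details are assumptions: the width is written as plain decimal digits (for example %04r), and %R pads to the number of digits of the highest rank (total - 1).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Number of decimal digits needed to print v (v >= 0). */
static int digits(int v)
{
    int n = 1;
    while (v >= 10) { v /= 10; n++; }
    return n;
}

/* Expand %r, %Nr, %0Nr, %R, %t and %% from fmt into out (capacity cap).
 * Returns 0 on success, -1 if out is too small.
 * Unknown specifiers are copied through unchanged. */
int expand_args(const char *fmt, int rank, int total, char *out, size_t cap)
{
    size_t len = 0;

    assert(fmt != NULL && out != NULL && cap > 0 && total > 0);
    out[0] = '\0';
    while (*fmt != '\0') {
        const char *p;
        int n;

        if (*fmt != '%') {
            /* Copy a run of ordinary characters up to the next '%'. */
            size_t run = strcspn(fmt, "%");
            n = snprintf(out + len, cap - len, "%.*s", (int)run, fmt);
            p = fmt + run;
        } else {
            int zero = 0, width = 0;

            p = fmt + 1;
            if (*p == '0') { zero = 1; p++; }
            while (isdigit((unsigned char)*p) && width < 1000)
                width = width * 10 + (*p++ - '0');

            if (*p == 'r') {
                n = snprintf(out + len, cap - len,
                             zero ? "%0*d" : "%*d", width, rank);
            } else if (*p == 'R' && p == fmt + 1) {
                n = snprintf(out + len, cap - len,
                             "%0*d", digits(total - 1), rank);
            } else if (*p == 't' && p == fmt + 1) {
                n = snprintf(out + len, cap - len, "%d", total);
            } else if (*p == '%' && p == fmt + 1) {
                n = snprintf(out + len, cap - len, "%%");
            } else {
                /* Not a known specifier: emit the '%' and continue after it. */
                n = snprintf(out + len, cap - len, "%%");
                p = fmt;
            }
            p++;
        }
        if (n < 0 || (size_t)n >= cap - len)
            return -1;   /* output truncated */
        len += (size_t)n;
        fmt = p;
    }
    return 0;
}
```

Under these assumptions, expanding "out.%04r.of.%t" for rank 7 of 128 ranks yields "out.0007.of.128", which is a convenient way to give each rank its own output file name.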


Environment Variables

You can change the shared memory size by setting the following environment variables.

CARP_SHM_SIZE
Size of the ring buffer used for communication between the user program and Carp. (Default size: 1MB)
CARP_SHM_RING_LEN
Length of the ring buffer. (Default length: 32)
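
For readers who want to log or check these settings from their own program, the following C sketch reads an environment variable as a positive integer and falls back to a default. The helper names parse_positive and env_or_default are hypothetical. The sketch assumes the values are plain decimal integers; the unit and exact syntax that Carp itself expects for CARP_SHM_SIZE are not stated in this document.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse s as a positive decimal integer.
 * Returns fallback if s is NULL, does not start with a digit,
 * has trailing characters, overflows, or is zero. */
unsigned long parse_positive(const char *s, unsigned long fallback)
{
    char *end;
    unsigned long v;

    if (s == NULL || !isdigit((unsigned char)s[0]))
        return fallback;
    errno = 0;
    v = strtoul(s, &end, 10);
    if (errno != 0 || *end != '\0' || v == 0)
        return fallback;
    return v;
}

/* Read environment variable name, or return fallback if unset or invalid. */
unsigned long env_or_default(const char *name, unsigned long fallback)
{
    assert(name != NULL);
    return parse_positive(getenv(name), fallback);
}
```

For example, env_or_default("CARP_SHM_RING_LEN", 32) returns 32 when the variable is unset, matching the default length listed above.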

Code skeleton

Input Program
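#include <stdlib.h> // atoi
#include "carp.h"
int main(int argc, char *argv[]){
	// Carp appends the MPI rank and the number of ranks as the last two arguments
	int rank = atoi(argv[argc - 2]);
	int total = atoi(argv[argc - 1]);

	while( ... ){
		read_record_from_file(buf, &size, ...); // read one record into buf
		carp_put_datasize(size); // announce the size of the record
		carp_write(buf, size); // send the record to Carp
	}
	carp_finalize();
	return 0;
}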
Calculation/Output Program
#include <stdlib.h> // atoi
#include "carp.h"
int main(int argc, char *argv[]){
	// Carp appends the MPI rank and the number of ranks as the last two arguments
	int rank = atoi(argv[argc - 2]);
	int total = atoi(argv[argc - 1]);

	int rc;

	while( ... ){
		// get record 1
		rc = carp_get_datasize(&size1);
		if ( rc == CARP_DATA_FINISHED ) { // end of data
			break;
		}
		carp_read(buf1, size1);
		// get record 2
		carp_get_datasize(&size2);
		carp_read(buf2, size2);

		// calculation and output
		result = calc(buf1, buf2);
		output_to_file(result);
	}
	carp_finalize();
	return 0;
}

API Documentation


Generated on 15 Feb 2016 for Carp by  doxygen 1.6.1