Monday, September 7, 2009

Resources on demand - close but not there yet...

Following a user request a few weeks ago, we embarked on an ambitious plan to reserve computing resources on a nationwide scale. The plan was simple. The objective was to have weather predictions made daily using the WRF model for three European cities: Athens, Barcelona and Lisbon. As this model is parallelized using the MPI library, we needed enough resources on three separate Grid sites. We forwarded this request to our National Grid Initiative (NGI), HellasGrid, which in turn arranged for the reservation of the resources, and voila: a total of 36 physical cores was reserved daily for our user experiment, which we came to know as the Thermopolis Project.

The project was very successful. First of all, the reservation policies were quickly arranged by the Grid site administrators, and secondly, the overall calculations were completed more or less on time, thus qualifying as predictions and not just calculations.

It is thus good to know that such reservation policies can be arranged in coordination with the NGI, even though there is still ground to cover until we reach the point where a user will have resources on demand. Actually, I am not sure if this is the point we want to reach, but if you consider the phrase "Grid will provide CPU resources much like the power grid provides electrical power", which stuck in my mind at the beginning of my involvement with Grid technologies, this may well be it. Nonetheless, it is reassuring to know that in a couple of days or maybe less, a researcher doing environmental or seismological or whatever sort of research with a clear impact on our day-to-day life can ask for and get the resources needed for his or her work.

Wednesday, July 15, 2009

Can OpenMP be supported on the EGEE Grid?

Following a hack made about a month ago on the GRAM module, the question still remains open: is it possible to run OpenMP-parallelized code on the production EGEE infrastructure? The answer is still 'yes and no'. It is yes because one can either use preproduction services such as the CREAM CE or adopt hacked versions of the GRAM module of the PBS server, and it is no because none of the above can claim to be of production quality, at least for the moment.

It thus looks like an open race in which tech gurus and developers may compete against each other in the near future over best practices for adapting shared memory parallelism to a Grid infrastructure.

The most important issue to address in this informal competition is, IMHO, how to carry information supplied by the user on the UI across the WMS service to the gatekeeper on the CE and finally to the PBS service. Actually, whoever succeeds in this task will have solved not only the issue of running shared memory parallel applications on a production Grid but also the issue of exploiting the infrastructure in exact accordance with one's needs.
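To make the last hop of that chain a bit more concrete, here is a minimal sketch of what the translation on the CE side might look like for the OpenMP case, assuming the user's wishes somehow arrive there intact. The function name `pbs_directives` and its parameters are hypothetical and are not part of any gLite or GRAM component; only the `#PBS -l` resource syntax and the `OMP_NUM_THREADS` variable are standard.

```python
def pbs_directives(threads, memory_mb=None, walltime=None):
    """Turn (hypothetical) user-side requirements into PBS/Torque script lines."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # A shared memory job needs all of its cores on one and the same node.
    lines = [f"#PBS -l nodes=1:ppn={threads}"]
    if memory_mb is not None:
        lines.append(f"#PBS -l mem={memory_mb}mb")
    if walltime is not None:
        # walltime is expected as an HH:MM:SS string
        lines.append(f"#PBS -l walltime={walltime}")
    # The OpenMP runtime reads its thread count from this variable.
    lines.append(f"export OMP_NUM_THREADS={threads}")
    return lines


if __name__ == "__main__":
    print("\n".join(pbs_directives(4, memory_mb=2048, walltime="01:30:00")))
```

The hard part is of course not this translation but getting the numbers from the UI to this point in the first place.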

For example, memory requirements or wall-clock time estimates could then be specified by the users of the infrastructure, allowing more memory-demanding jobs to be executed on specific machines of the infrastructure and backfilling mechanisms to be implemented.
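As a rough illustration of why wall-clock estimates matter for backfilling, consider the toy sketch below. It is not how any real scheduler is implemented; the function `backfill` and the job tuples are made up for the example. The idea is simply that a scheduler can only slot small jobs into an idle window if it knows, from the user, how long they are expected to run.

```python
def backfill(queue, free_cores, window_minutes):
    """Pick queued jobs, in queue order, that fit into an idle window.

    queue: list of (name, cores, estimated_minutes) tuples (hypothetical format).
    """
    chosen = []
    for name, cores, est_minutes in queue:
        # A job fits only if it needs no more cores than are idle
        # and its user-supplied estimate ends before the window closes.
        if cores <= free_cores and est_minutes <= window_minutes:
            chosen.append(name)
            free_cores -= cores
    return chosen


if __name__ == "__main__":
    queue = [("a", 8, 30), ("b", 2, 20), ("c", 2, 90), ("d", 2, 10)]
    # 4 cores are idle for the next 60 minutes
    print(backfill(queue, free_cores=4, window_minutes=60))
```

Without the estimates, none of the jobs in the queue could safely be started in the window.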

One can only begin to imagine the boost in robustness that such a mechanism may bring to the wider EGEE community. It will lead us from a production infrastructure to a production quality infrastructure.