Computer Meeting Minutes - October 8, 2014

Date: 2014 October 8 2-3PM.

Attendees: G. Oakham, D. Rogers, P. Kalyniak, R. Thomson, A. Bellerive, T. Gregorie.
Staff:  Wade, Michael, Stephen

1. Office moves

   Due to space pressures, Stephen will be moving to a single office
   with his current large office being used to house Theory RAs. 
   Yeremia has moved to his new office in Earth Sciences.

   Mikhail departed at the end of September and will move his
   computers out of the room previously shared with Stephen and
   Yeremia by the end of the week.

2. Research Compute Cluster

   The Theory group, specifically Pat, has some funds available to
   supplement the departmental research computing cluster, with the
   primary objective of supporting the group's computational
   requirements. Based on recent server purchases in the Faculty of
   Science, the available funds would provide about 16 additional
   cores.

   The committee discussed whether other research groups would also be
   interested in supplementing the computing infrastructure, so that a
   combined purchase could obtain better pricing. There appears to be
   interest, but the timing may not be optimal. It was therefore
   suggested that a priced proposal be developed for the Theory group
   purchase and shared with the committee to further gauge interest in
   a larger purchase. The funds must be spent by the end of the
   calendar year, so the action item is to develop a configuration and
   obtain a quote or quotes by the end of October.

 

   Dave is going to follow up with his colleagues at the Ottawa
   Hospital to determine if there is interest in supplementing their
   contribution to the compute cluster. Gerald had spoken to Kevin and
   reported that there were no requirements at this time from EXO.

   Dave also expressed interest in the Intel Phi co-processor and has
   funds earmarked to obtain a server with these co-processors. His
   colleagues at NRC are looking to evaluate running EGS code on the
   Phi co-processors. Pending these results, Dave will determine how
   best to supplement the cluster for his group's computing
   requirements.

 

   A broader issue raised by Rowan concerned the aging compute cluster
   and whether there is a plan to keep the cluster current and
   relevant or to allow it to die slowly. Under the current CFI and
   NSERC rules, funding for new local computing infrastructure is
   generally not permitted. As a consequence, and given the high
   demand for computing, Rowan has submitted a request to the Compute
   Canada Resource Allocation committee for computing resources on
   Compute Canada's available platforms. A determination is expected
   by the end of the year.

   Gerald proposed an evergreening approach similar to the one adopted
   over the past few years for the departmental computing
   infrastructure with the CMO.

   Another discussion focused on the efficacy of the current batch
   queuing discipline. Currently there is a single monolithic
   first-in, first-out (FIFO) queue. With specific owners of resources
   and large-scale job submissions, users have been blocked from their
   own resources. Wade proposed a return to the resource owner-based
   priority queuing that was implemented before. However, at that time
   there were only two principal stakeholders, whereas now there are
   more resource owners, specifically the Ottawa Hospital, EXO, SNO,
   and soon Theory. Alain also suggested that the queuing system be
   flexible enough to accommodate quick, short jobs from
   non-resource-owner groups without relegating them to the bottom of
   the queues. Gerald asked for a proposal for the next Computer
   Committee meeting.
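
   As an illustration only, the ordering discussed above could be
   sketched as follows. This is not the scheduler configuration in
   use on the cluster; the Job fields, the three priority classes, and
   the 30-minute short-job threshold are all assumptions made for the
   example. Jobs from the resource owner run first, short jobs from
   other groups next, and everything else last, with first-in,
   first-out order preserved inside each class.

```python
from typing import List, NamedTuple


class Job(NamedTuple):
    """A queued batch job (hypothetical fields, for illustration)."""
    name: str
    group: str      # submitting research group, e.g. "exo" or "theory"
    minutes: int    # requested run time in minutes


# Assumed threshold below which a job counts as "quick and short".
SHORT_JOB_MINUTES = 30


def job_priority(job: Job, resource_owner: str) -> int:
    """Return a priority class; lower values are scheduled first."""
    if job.group == resource_owner:
        return 0    # owners are never blocked from their own resources
    if job.minutes <= SHORT_JOB_MINUTES:
        return 1    # short jobs from other groups are not sent to the back
    return 2        # long jobs from other groups wait


def order_jobs(jobs: List[Job], resource_owner: str) -> List[Job]:
    """Order jobs for one owner's resources.

    sorted() is stable, so submission (FIFO) order is kept within
    each priority class.
    """
    return sorted(jobs, key=lambda job: job_priority(job, resource_owner))


if __name__ == "__main__":
    submitted = [
        Job("sno-long", "sno", 600),
        Job("exo-long", "exo", 600),
        Job("theory-short", "theory", 10),
        Job("exo-short", "exo", 5),
    ]
    for job in order_jobs(submitted, resource_owner="exo"):
        print(job.name)
```

   A production batch system would normally express the same idea
   through its own priority or fair-share settings rather than custom
   code; the sketch is only meant to make the proposed ordering
   concrete for discussion.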

   A request was also received from a graduate student of Tong's to
   run OpenMPI parallel jobs, followed by a short discussion of how
   to accommodate it. It was suggested that the student should
   perhaps look at using HPCVL, which has support for OpenMPI.
 

3. Current IT issues

   Alain reported that tyr/thor is very slow, especially with
   processes such as firefox that have been running for a long time.
   He suggested setting up a daemon to kill such heavy processes once
   they have been running for a while.
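
   A minimal sketch of the selection step such a daemon might use is
   given below. The watched command list, the 24-hour age limit, the
   20% CPU threshold, and the ProcInfo record are all assumptions for
   illustration; none of them were decided at the meeting. The sketch
   only decides which processes qualify and does not gather process
   data or send any signals.

```python
from typing import Iterable, List, NamedTuple


class ProcInfo(NamedTuple):
    """Snapshot of one process (hypothetical record for this sketch)."""
    pid: int
    command: str
    elapsed_hours: float
    cpu_percent: float


# Assumed policy values; the real limits would need to be agreed on.
WATCHED_COMMANDS = {"firefox"}
MAX_ELAPSED_HOURS = 24.0
MIN_CPU_PERCENT = 20.0


def select_stale_pids(procs: Iterable[ProcInfo]) -> List[int]:
    """Return PIDs of watched processes that are both old and heavy."""
    return [
        p.pid
        for p in procs
        if p.command in WATCHED_COMMANDS
        and p.elapsed_hours >= MAX_ELAPSED_HOURS
        and p.cpu_percent >= MIN_CPU_PERCENT
    ]


if __name__ == "__main__":
    snapshot = [
        ProcInfo(101, "firefox", 72.0, 85.0),   # old and heavy: selected
        ProcInfo(102, "firefox", 1.5, 90.0),    # heavy but recent: kept
        ProcInfo(103, "firefox", 96.0, 0.5),    # old but idle: kept
        ProcInfo(104, "emacs", 200.0, 50.0),    # not a watched command
    ]
    print(select_stale_pids(snapshot))
```

   Requiring both age and CPU load, rather than age alone, avoids
   killing idle long-lived sessions that are not contributing to the
   slowdown.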

   Alain reported a SunRay lab issue: users lock their screens and
   leave, which prevents users in the next class from logging in at
   those terminals. Wade suggested that this was an implementation
   issue and that user screen locks should be disabled.

   Another issue raised was the slowness of pine due to the spam
   filter defined for pine. With large inboxes in excess of 200
   messages, it takes a long time to open mail with pine. Dave
   indicated that Bill had fixed this for him previously by removing
   the pine-defined spam filter(s). The proposed solution is to remove
   the pine pattern filter for spam. With SpamAssassin and
   greylisting, the volume of spam has been greatly reduced.

4. CMO

   Gerald asked for evergreening proposals for the current CMO funds
   for the current academic cycle.

5. State of current departmental infrastructure

   Alain asked about the adequacy of tyr and thor. It was agreed that
   they were long past their best-before date. Wade indicated that
   many of the services need an overhaul. These include the bastion
   and interactive nodes (tyr and thor), the SunRay service, mail
   service, web service, etc. New services are also planned, such as
   a web-based calendar booking system and a locally hosted ownCloud
   service similar to Dropbox.

   Gerald asked what resources would be required to do this. Wade
   indicated that these resources have been available through the
   evergreening plan for the departmental infrastructure for the past
   couple of years, but due to time constraints the plan had not been
   realized. The plan was to virtualize many of these services on
   virtual machines, removing the dependency on the physical
   infrastructure. The cloudshare service used for hiring new Physics
   faculty is an example of a service developed on a virtual server.
   Older hardware will be phased out as these services are migrated.

6. Proposed Agenda for next meeting on November 5th 2-3PM

   o Evergreening of Research Computing Cluster Proposal
   o Queuing Discipline Proposal
   o Departmental IT Infrastructure Changes Proposal
