Medicine iBrix pilot (MiBrix) project
Project Motivation
From an ITS perspective, this project is an opportunity to answer questions like:
- What is the full cost of providing a production service using this class of technology?
- How does the Ibrix software really work? (Answering this also gives staff hands-on experience with the product.)
- Is this a good way of networking storage elements and instruments into a Grid?
- How does it really scale in capacity, cost and performance?
- How does the MTBF scale with large arrays of disks, what maintenance workload does this generate in practice, and hence what is the overall service availability impact?
- How do Faculty of Medicine's storage requirements grow over time?
- What are good roles and other applications for this class of technology (e.g. instrument-facing data collectors; "harvesting" spare disk space; etc.)?
- How does this implementation compare with alternatives?
From an e-Research perspective, this pilot gives the opportunity to:
- explore the use of a commercially available distributed storage-grid software package in a real-world pilot;
- integrate Faculty of Medicine's IT facilities into the Monash Campus Grid (MCG);
- demonstrate a geographically distributed storage architecture operating over the network, e.g. including:
  o instrument-facing storage (data collectors attached to microscopes and other imaging systems)
  o faculty compute/storage facility
  o central ITS LaRDS storage facility
  o it will also be integrated with the LaRDS tape archive facility.
In summary, the project holds promise of:
Project conduct
The project will be conducted on the following basis:
1. This is a limited-period pilot (1-3 years maximum) of around 60 TB (45-90 TB max) of disk storage capacity.
2. The pilot uses Ibrix distributed storage software and Dell 15 TB storage "blocks", each "block" comprising its own server and disk modules.
3. At the end of the pilot, it is envisaged that any data that needs to be retained will be transitioned onto the then-available production ITS data storage (LaRDS) services under the then-standard funding arrangements. The actual hardware and software may be retired, redeployed or transitioned into the production service on terms to be agreed between the parties at the end of the pilot, having regard to the circumstances at that time.
4. Medicine will fund the purchase of the hardware and software, including vendor-supplied installation, warranty, licence, support and maintenance services, for the duration of the pilot. The SLA will be whatever Medicine negotiates with the vendor. The ITS response will be "best-effort", as is normal ITS practice for other pilots.
5. ITS will provide accommodation in central data centre/s, including power, air-conditioning, fire protection, network and related services.
6. The purpose of the pilot is to assess the performance and full cost (TCO) of providing production services using this class of technology. Hence:
6a. ITS staff will provide support and be involved as if this were a normal ITS production service. Similarly, normal ITS production procedures (e.g. regarding facility access, security, downtime, vendor call-out, maintenance, configuration, allocation, etc.) will apply. The intent of this section is to ensure that ITS staff have a meaningful level of involvement in and exposure to the technology. In the event that support effort becomes excessive, Faculty of Medicine IT staff could optionally assist with some aspects of the support functions (e.g. space allocation).
6b. In the event that support costs (either in terms of accommodation or ITS staff load) become unexpectedly excessive, or the service becomes unreliable or unsupportable, then the scope or duration of the pilot may be reviewed or curtailed. This unlikely eventuality would be an indication that the pilot had failed, in that the technology had not lived up to reasonable expectations. Costs and performance will be reviewed at least annually, or more frequently if problems occur or performance drops below expectations.
7. This pilot is seen within the broader context that Medicine and ITS are working together to achieve the following outcomes:
7a. Medicine seeks that all of its research data be held centrally, and not locally by individual researchers;
7b. An integrated solution with the best balance of local, faculty and central facilities;
7c. All research data to be backed-up/archived/preserved (as appropriate for the data) on central ITS (LaRDS) tape libraries: by itself, the Ibrix/Dell system can leave data exposed to loss due to hardware failure;
7d. Other software components (e.g. SRB) are being explored between ITS and Medicine to help researchers control data storage across the various available storage assets;
7e. Researchers be encouraged to record metadata along with the data sufficient to ensure the ongoing utility of the data. It is envisaged that in parallel during the period of the pilot, ITS will provide standardized data management environment/s (e.g. ARCHER) to capture such metadata, such that by the end of the pilot all data transitioning into long-term storage will be tracked via such a metadata-based information management environment.
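The sizing in items 1 and 2 above implies a small number of storage blocks. The following is a minimal sketch of that arithmetic, not part of the agreement itself. It assumes that the quoted 15 TB per block can be compared directly with the pilot capacity figures (i.e. it ignores any RAID or file system overhead, which this document does not specify). The function name blocks_needed and the labels "minimum", "target" and "maximum" are illustrative only.

```python
import math

# Capacity of one Dell storage "block" in TB, as quoted in item 2.
# Assumption: treated as directly comparable to the pilot capacity figures.
BLOCK_TB = 15


def blocks_needed(capacity_tb: float, block_tb: float = BLOCK_TB) -> int:
    """Return the smallest whole number of blocks that covers capacity_tb."""
    if capacity_tb < 0 or block_tb <= 0:
        raise ValueError("capacity must be >= 0 and block size must be > 0")
    return math.ceil(capacity_tb / block_tb)


# Pilot figures from item 1 (labels are illustrative, not from the source).
pilot_sizes_tb = {"minimum": 45, "target": 60, "maximum": 90}
block_counts = {name: blocks_needed(tb) for name, tb in pilot_sizes_tb.items()}

for name, count in block_counts.items():
    print(f"{name}: {pilot_sizes_tb[name]} TB -> {count} blocks")
```

Under these assumptions the pilot spans 3 to 6 blocks, with 4 blocks for the nominal 60 TB. Because blocks are whole units, any requirement that is not a multiple of 15 TB rounds up (for example, 61 TB would need 5 blocks).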
Background to the IBRIX product
IBRIX Fusion is a software-based, fully integrated, 'enterprise-class' scalable file storage suite comprising a highly scalable POSIX-compliant parallel file system, a logical volume manager, high-availability features and a comprehensive management interface that includes a graphical user interface (GUI) and a command-line interface (CLI).
IBRIX Fusion allows enterprise system administrators to build file systems that can scale up to 16 petabytes of capacity in a single namespace and provide up to 1 terabyte/s of aggregate I/O throughput. Capacity and throughput can be scaled independently and non-disruptively as the demands of the enterprise grow. IBRIX Fusion is hardware, network, and protocol independent and can be deployed either as a host-installed cluster file system or, exported over NFS, CIFS, or other industry-standard protocols, as a scalable NAS solution.
Based on a patented Segmented File System™ architecture, IBRIX Fusion enables enterprises to pull together their I/O and storage systems into a single environment that is multipurpose and sharable across a multitude of applications, and to manage that environment through a centralized interface.