An affordable, feature-rich iSCSI solution (x86_64 port) for general and commercial use, embracing the RapidDisk modules for speed.
Hello. My name is Petros Koutoupis and by profession (and passion) I am a Linux software developer. I am also a freelance author. You can read a bit more about me and my experience here and here. I primarily focus on kernel and device driver development, but over the past few years I have also focused on creating my own embedded Linux-based distributions for single-purpose applications.
Throughout my entire development life I have been fascinated by data storage and have written many drivers, kernel patches, and software applications to either do amazing things or just fix bugs. I am currently working on implementing my very own low-cost and energy-friendly (x86_64 port) iSCSI target software distribution (code named: RapidDisk LX 2.0), which can easily be used for home, small-to-medium business, and even entry-level enterprise data storage solutions.
The RapidDisk LX 2.0 iSCSI target software distribution project is intended to provide a low cost and power friendly minimal solution with most if not all enterprise level features. It will be utilizing a customized version of my Linux-based distribution: RapidDisk LX. RapidDisk LX is an OIN® Licensee.
My goal is to create an updated x86_64 port of my original minimal operating system with more features and functionality utilizing open source software. Part of this goal is to publish all open source packages, binary images, and documentation used in this project on the official project page.
*** All funding of this project will help pay for the test equipment (one 64-bit server) and partly for web hosting. Well, it will also help pay for some of your rewards. ***
iSCSI is a standardized and very stable protocol for data storage communication and aside from the SCSI command set it uses familiar networking standards: Ethernet and TCP/IP. Most modern operating systems support iSCSI as initiators, some of which may require the installation of additional software.
There have been several common misperceptions regarding how iSCSI compares to traditional Fibre Channel or Serial Attached SCSI (SAS) deployments. Thankfully, iSCSI has since become a mainstream technology, and large vendors such as Dell and EMC now offer iSCSI products, rendering these notions invalid. I have taken the following bullet points from the Dell website:
- Performance: The first misperception is that iSCSI cannot provide the performance necessary for enterprise applications. Many key applications have random I/O data patterns, and the performance bottleneck ends up being the time it takes to write and read data from hard disk drives, not the network bandwidth.
- Manageability: Another misperception is that iSCSI is more difficult to manage than Fibre Channel. The management applications for iSCSI SANs have become intelligent, wizard based or even self managing removing much of the complexity of day to day activities. Additionally, because iSCSI utilizes standard Ethernet equipment and because many IT staffs are more familiar with Ethernet than Fibre Channel, iSCSI networks are often easier to manage.
- Network security: A very common misperception is that iSCSI SANs are not as secure as Fibre Channel SANs. In fact, when logically or physically separated, iSCSI networks are just as secure as Fibre Channel.
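To put the performance point above into rough numbers, here is a back-of-the-envelope sketch. The disk figures (100 random IOPS, 4 KiB per I/O) are my own illustrative assumptions, not measurements, and protocol overhead on the link is ignored.

```python
# Back-of-the-envelope comparison. Every figure here is an assumption
# chosen for illustration, not a measurement of any real device.
GIGABIT_LINK_BITS_PER_SEC = 1_000_000_000  # raw 1 GbE line rate, overhead ignored
ASSUMED_DISK_RANDOM_IOPS = 100             # hypothetical single spinning disk
ASSUMED_IO_SIZE_BYTES = 4096               # hypothetical 4 KiB random I/O


def link_bytes_per_sec(bits_per_sec: int) -> float:
    """Convert a line rate in bits per second to bytes per second."""
    return bits_per_sec / 8


def disk_random_bytes_per_sec(iops: int, io_size_bytes: int) -> int:
    """Throughput of a disk that is limited by random I/O operations."""
    return iops * io_size_bytes


link = link_bytes_per_sec(GIGABIT_LINK_BITS_PER_SEC)
disk = disk_random_bytes_per_sec(ASSUMED_DISK_RANDOM_IOPS, ASSUMED_IO_SIZE_BYTES)

print(f"1 GbE ceiling:            {link / 1e6:.1f} MB/s")
print(f"one disk, random I/O:     {disk / 1e6:.2f} MB/s")
print(f"disks needed to fill it:  {link / disk:.0f}")
```

Under these assumptions a single disk doing random I/O uses well under 1% of the link, which is the sense in which the drives, not the network, become the bottleneck.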
The biggest selling point of iSCSI, though, is price. It is an extremely affordable technology, especially when compared to Fibre Channel, SAS, InfiniBand, etc. It is even more affordable when we are talking about a standard 1 Gigabit Ethernet solution.
The idea is for this solution to utilize commodity off-the-shelf hardware as can be provided by HP, Dell, Oracle, etc., although this Kickstarter project specifically focuses on an x86_64 port. Over time specific hardware solutions will be thoroughly tested, certified by the project's efforts, and listed on the project's website.
Again, iSCSI is the protocol of interest, running over 1 Gigabit Ethernet (GbE) and 10 GbE, which can be configured for:
- Load balancing; that is, if the same target device is mapped across both ports and to the same host.
- Host Clustering
- Storage Management (via a web hosted user interface accessible through most modern browsers)
Once installed, the operating system will run headless and be managed over an Ethernet management port. The local RS-232 serial port will also be configured as a serial console, offering another method of access and management.
Depending on hardware types, there may also be future software support to manage hardware states, trigger audible alarms, monitor power input status (and react appropriately if connected to a Battery Backup Unit or BBU), and possibly more. This is still being researched.
The reason for choosing to support commodity off-the-shelf hardware is a simple one. Why spend $10,000 or much more on a single proprietary storage array enclosure when you can spend a fraction of that cost and get the same features and functionality? Think about it. A single iSCSI controller alone (that means no chassis/enclosure, no hard drives, nothing else) can cost around $2000-3000 from a reseller, and sometimes more. Don't believe me? Just Google the 3 port (1 management) controller for a Drobo B1200i storage solution: DR-B1200I-1G11. You will see listings ranging from $2000 to $2800. For the same price, you can obtain a rack-mountable server fully populated with hard drives! Just imagine configuring the operating system on one or a bunch of HP ProLiant DL300, ProLiant DL100, and/or Dell PowerEdge servers. Again, using affordable, supported commodity hardware is the emphasis behind my goals.
As one would expect, all of the magic will occur in the software running on the hardware. It will be hosted by an updated version of my own custom minimal Linux distribution built straight from source code, RapidDisk LX. RapidDisk LX is an OIN® Licensee. Currently RapidDisk LX has been built and tested on the ARM-based BeagleBoard-xM. The goal of RapidDisk LX 2.0 is to release a more feature filled distribution and to port it to the x86_64 architecture. Some of the more notable features are listed below.
RapidDisk LX was originally designed to export RapidDisk and RapidCache enabled volumes over iSCSI to offer high performance. Since then, the operating system has evolved and begun to incorporate advanced features, ranging from basic volume management to hard drive power, profile, and port management. It is currently 10 Megabytes in size!!! Note that the operating system does not enable video and audio support. I will continue down this path (see software management section below) and on the x86_64 port will also disable other components such as the USB subsystem, etc. Eliminating all unused features in this way will ensure maximum stability and security.
The video below showcases an early build of RapidDisk LX running on the ARM-based BeagleBoard-xM. Notice how it takes approximately 6 seconds from the moment the bootloader countdown reaches 0 to the login prompt. Imagine an iSCSI target solution that boots up and loads the user configuration in under 10 seconds!!!
The following video showcases my work with the GPIO pins using the same RapidDisk LX build on the BeagleBoard-xM, plugged directly into a breadboard design I had laid out for signal testing purposes. While this is not applicable to the x86_64 port, it showcases the flexibility of the operating system.
Hard Drive and General Volume Management
Utilizing an updated and modified version of an existing project of mine, DrvAdm, the user will be able to schedule tasks to (or manually) spin down/up hard disk devices; immediately auto-detect recently inserted/removed hotplug disks; schedule a task to (or manually) reset the host controller, bus, and/or hard disk target; modify hard drive parameters; and more.
Relying on the Device Mapper framework and LVM2, the user will have the ability to pool all attached drives into redundant or non-redundant arrays and in turn dynamically create/resize/destroy logical volumes to map over the iSCSI network. Using LVM2, snapshot support can dynamically be enabled and mapped as a Logical Unit Number (or LUN) over the iSCSI Storage Area Network (or SAN) with read and read/write permissions.
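As a rough illustration of the LVM2 operations described above, the sketch below only builds the command lines for pooling drives and creating, growing, snapshotting, and destroying a logical volume. It does not run them. The volume group name, volume names, sizes, and device paths are hypothetical, and this is not the project's actual management code.

```python
from typing import List

# Helper names and all example values below are hypothetical. The functions
# only assemble argument lists for standard LVM2 tools; nothing is executed.


def pool_drives(vg: str, devices: List[str]) -> List[str]:
    """Pool physical devices into one volume group."""
    return ["vgcreate", vg, *devices]


def create_volume(vg: str, name: str, size: str) -> List[str]:
    """Create a logical volume of the given size (e.g. '100G')."""
    return ["lvcreate", "-L", size, "-n", name, vg]


def grow_volume(vg: str, name: str, extra: str) -> List[str]:
    """Grow an existing logical volume by the given amount."""
    return ["lvextend", "-L", f"+{extra}", f"{vg}/{name}"]


def snapshot_volume(vg: str, name: str, snap_name: str, size: str) -> List[str]:
    """Create a snapshot of a logical volume with its own change space."""
    return ["lvcreate", "-s", "-L", size, "-n", snap_name, f"{vg}/{name}"]


def destroy_volume(vg: str, name: str) -> List[str]:
    """Remove a logical volume without prompting."""
    return ["lvremove", "-f", f"{vg}/{name}"]


if __name__ == "__main__":
    plan = [
        pool_drives("pool0", ["/dev/sdb3", "/dev/sdc3"]),
        create_volume("pool0", "lun0", "100G"),
        grow_volume("pool0", "lun0", "20G"),
        snapshot_volume("pool0", "lun0", "lun0_snap", "10G"),
        destroy_volume("pool0", "lun0_snap"),
    ]
    for command in plan:
        print(" ".join(command))
```

Keeping the commands as argument lists makes them easy to hand to something like `subprocess.run` later without going through a shell.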
Another great feature of this project is the ability to dynamically create/resize/destroy RapidDisk RAM-based volumes and map them over the network for high-performance, low-power needs. With RapidCache, the administrator can utilize a RapidDisk volume as the front-end Write-Through/Read-Through caching node for existing volumes. This will significantly increase sequential and random read performance and in turn also consume less power when accessing recently cached data. You can view some of the performance data here. Note that the performance will be limited by the 1 Gigabit Ethernet cap, although the numbers will still be significantly higher than a normal SATA spinning disk and also consistent for both sequential and random access.
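For readers unfamiliar with the terms, here is a minimal conceptual sketch of Write-Through/Read-Through caching. It is not RapidCache's implementation (which is a kernel module). The class name, the dictionary standing in for a slow volume, and the least-recently-used eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class WriteThroughCache:
    """Conceptual model only: a small RAM cache in front of a slow volume.

    The backing volume is modeled as a dict of block number -> data, and
    eviction is least-recently-used. Both are illustrative assumptions.
    """

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _remember(self, block: int, data: bytes) -> None:
        # Store the block as most recently used, evicting the oldest if full.
        self.cache[block] = data
        self.cache.move_to_end(block)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)

    def read(self, block: int) -> bytes:
        # Read-through: serve from RAM when possible, otherwise fetch from
        # the backing volume and keep a copy for next time.
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)
            return self.cache[block]
        self.misses += 1
        data = self.backing[block]
        self._remember(block, data)
        return data

    def write(self, block: int, data: bytes) -> None:
        # Write-through: the backing volume is updated before the cache,
        # so losing the cache never loses acknowledged data.
        self.backing[block] = data
        self._remember(block, data)


if __name__ == "__main__":
    disk = {0: b"alpha", 1: b"beta"}
    cache = WriteThroughCache(disk, capacity=1)
    cache.read(0)            # first read goes to the backing volume
    cache.read(0)            # second read is served from RAM
    cache.write(1, b"gamma")
    print(disk[1], cache.hits, cache.misses)
```

Because every write reaches the backing volume immediately, only reads are accelerated, which matches the read performance gains described above.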
NOTE - Write-Back cache will be available as an option (via dm-cache) but must be used with extreme caution. It would be advisable to use it only in scenarios where the caching volume is a locally attached Solid State Disk (SSD) and/or when a Battery Backup Unit (BBU) is configured for last-minute data syncs to the target volume(s). I say this to prevent data corruption in the event of a system or power failure.
NOTE - All on-line (and in use) Logical Volumes and RapidDisk RAM disks can be dynamically resized without the need to remap the Logical Unit.
Storage and Network Management
A local web server will be installed to ease the process of storage/volume and network (volume mapping, firewall settings, etc.) management. The user can simply connect to this web-based user interface via any web browser. I will be honest with you, though, and admit that very little time will be spent testing Internet Explorer. Most of my focus will be on browsers utilizing the Gecko (Firefox) and WebKit (Chrome/Chromium, Safari, etc.) rendering engines.
Another method of management will be offered through the local serial port. A user will be able to connect via a serial enabled console and have access to the system via a terminal shell.
There are future plans to implement methods for gathering system statistics, performance data, etc. There are also plans to create facilities to send event and status notifications to key administrators.
All software updates will be handled via the web based user interface. A binary file will be provided to upload to the remote server through the user's browser. The update process can happen while the server is on-line and processing data. This is made possible because of the fact that the software image is immediately loaded into RAM on boot and resides there until shut down. So, when the update process occurs and the new image is written to storage, all else is uninterrupted and the new image will not load until the system has been rebooted.
More Advanced (and Debugging) Features
The solution will also have the ability to enable detailed SCSI tracing either manually or automatically (on detected failure) for better problem isolation. All SCSI trace data will be captured to a log, which in turn could be used to diagnose and sometimes recreate the error (through an external I/O replay mechanism). This is similar to, but will be more powerful than, SCSI Trace (scsitrc), an older Linux SCSI device driver I wrote to accomplish a similar task. I only highlight SCSI Trace to show that it is very possible for me to implement such a feature on the iSCSI target solution.
File System Layout
As a result of my being an embedded Linux software developer, I decided to take a different and somewhat unique approach of how the operating system will run. During installation (and initialization of newly attached disk drives), each disk drive will be partitioned specifically for the following:
- MBR: Bootloader (in this case, GRUB)
- Partition 1: Fixed size. SquashFS of rootfs and kernel images. Set to Read-Only until image update.
- Partition 2: Fixed size. Configuration files for volumes and applications (includes firewall, authentication, etc.). Also contains all system trace logs.
- Partition 3: The remainder of the volume's free space intended for user defined storage.
Again, each disk device will hold redundant copies of Partitions 1 & 2 (and MBR data). Every time a new disk device is attached to the system, the user must initialize it. I am sure I can even enable a checkbox within the user interface to automate this task.
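The per-disk layout above can be sketched as a small planning function. The fixed sizes used here (64 MiB for the image partition, 256 MiB for configuration and logs) and the 1 MiB reserved at the start of the disk are placeholder assumptions of mine, since no sizes have been decided yet.

```python
from dataclasses import dataclass
from typing import List

# All sizes below are hypothetical placeholders, in MiB.
ASSUMED_RESERVED_MIB = 1    # space left at the start for the MBR/bootloader
ASSUMED_IMAGE_MIB = 64      # Partition 1: SquashFS rootfs and kernel images
ASSUMED_CONFIG_MIB = 256    # Partition 2: configuration files and trace logs


@dataclass(frozen=True)
class Partition:
    number: int
    purpose: str
    start_mib: int
    size_mib: int


def plan_layout(disk_mib: int,
                image_mib: int = ASSUMED_IMAGE_MIB,
                config_mib: int = ASSUMED_CONFIG_MIB,
                reserved_mib: int = ASSUMED_RESERVED_MIB) -> List[Partition]:
    """Return the three partitions for one disk; Partition 3 takes the rest."""
    fixed = reserved_mib + image_mib + config_mib
    if disk_mib <= fixed:
        raise ValueError("disk is too small to leave any user storage")
    p1 = Partition(1, "read-only SquashFS image", reserved_mib, image_mib)
    p2 = Partition(2, "configuration and trace logs",
                   p1.start_mib + p1.size_mib, config_mib)
    p3_start = p2.start_mib + p2.size_mib
    p3 = Partition(3, "user defined storage", p3_start, disk_mib - p3_start)
    return [p1, p2, p3]


if __name__ == "__main__":
    for part in plan_layout(disk_mib=500_000):
        print(part)
```

Because Partitions 1 and 2 are fixed size, the same plan can be applied to every disk regardless of capacity, which is what makes the redundant copies straightforward.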
Security is an important issue to any storage and networking administrator. For the most part, the file system layout adds a layer of security on its own, making it all the more difficult for anyone to gain access and alter anything unless they spend some time reading the physical hard drive while it is connected to another machine. As I mentioned earlier, video, USB, and other unused subsystems will be disabled, making it increasingly difficult to get any scripts or files onto the target image. Connections can only be made via the local web server through a remote browser or via the serial console (directly connected to the local serial port). There will not be any Secure Shell (SSH), FTP, or any other method by which anyone can connect to the target image. Lastly, iptables (yes, the well known and extremely powerful Linux firewall) will be installed, giving the administrator the ability to create whatever rules are necessary to make the target that much more secure.
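As an example of the kind of iptables rules an administrator might create, the sketch below builds (but does not apply) a small default-deny rule set. TCP port 3260 is the standard iSCSI target port. The web interface port (443) and the example SAN subnet are assumptions of mine, not project decisions.

```python
from typing import List, Optional

ISCSI_PORT = 3260           # standard registered iSCSI target port
ASSUMED_WEB_UI_PORT = 443   # hypothetical port for the management web server


def build_rules(san_subnet: Optional[str] = None,
                web_port: int = ASSUMED_WEB_UI_PORT) -> List[List[str]]:
    """Build iptables commands: allow iSCSI and the web UI, drop the rest.

    If san_subnet is given, iSCSI logins are only accepted from that subnet.
    The commands are returned as argument lists and are not executed.
    """
    source = ["-s", san_subnet] if san_subnet else []
    return [
        # Always allow loopback traffic.
        ["iptables", "-A", "INPUT", "-i", "lo", "-j", "ACCEPT"],
        # Allow replies to connections that are already established.
        ["iptables", "-A", "INPUT", "-m", "conntrack",
         "--ctstate", "ESTABLISHED,RELATED", "-j", "ACCEPT"],
        # Allow iSCSI initiators, optionally restricted to the SAN subnet.
        ["iptables", "-A", "INPUT", "-p", "tcp", *source,
         "--dport", str(ISCSI_PORT), "-j", "ACCEPT"],
        # Allow the management web interface.
        ["iptables", "-A", "INPUT", "-p", "tcp",
         "--dport", str(web_port), "-j", "ACCEPT"],
        # Set the default policy last so everything else is dropped.
        ["iptables", "-P", "INPUT", "DROP"],
    ]


if __name__ == "__main__":
    for rule in build_rules(san_subnet="192.168.10.0/24"):
        print(" ".join(rule))
```

The DROP policy is set last so that the accept rules are already in place when it takes effect, which matters on a headless box managed over the network.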
Notes On Rewards
- Below you will find an image of what the RapidDisk t-shirt will look like:
The final product will be distributed in binary format as an installable ISO image file. All open source packages used will be posted on the project's official website.
As seen above, I am already working with the ARM-based BeagleBoard-xM. I am looking to make a completely stable build of the distribution (for x86_64) available to the public no later than April of 2013, and possibly even sooner (in which case the alpha/beta releases to those who funded the appropriate levels may also be scheduled sooner). This schedule takes into account the development of custom software, thorough testing, debugging, and patching.
What kind of off the shelf hardware will this run on? Is this an alternative (on just the software side) to purchasing an expensive data storage array?
The idea is for this to install like any other operating system on any x86_64 server with multiple disk drives. So yes, the goal is to have an alternative solution to purchasing a very expensive data storage array.