Storage

Storage and backup

LinuxVTL kristof Mon, 02/22/2010 - 07:40

LinuxVTL is an implementation of a VTL (virtual tape library) in Linux. It's not really an appliance, but software which behaves like a real VTL. Don't expect the performance of a ProtecTIER or a FalconStor, as the changelog shows this is really a work in progress: the log for one of the latest changes mentions fixes for silent data corruption.

Data deduplication isn't included, but you can of course back the VTL with FUSE-ZFS and turn dedup on. Also, if you're running LinuxVTL on a separate server, you'll need either HBA cards or iSCSI to present it to your backup server. Check out the homepage for useful tips, like setting up an OpenSolaris client, or the settings for your favorite backup software.
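
If you go the ZFS route, enabling dedup on the pool that backs the virtual tapes only takes a couple of commands. A minimal sketch, assuming your zfs-fuse build is recent enough to support the dedup property, a spare disk /dev/sdb and a pool name of my own choosing (vtlpool):

zpool create vtlpool /dev/sdb
zfs set dedup=on vtlpool

Keep in mind that ZFS deduplication keeps its dedup table in memory, so give the box plenty of RAM.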

Eaten by a robot

We have a giant SL8500 tape library at work, which regularly needs to be fed. I once had the dull task of feeding the beast 600 tape media through the front-end CAP, 14 media at a time, which took me about 4 hours. Yesterday, we had a new batch of 300 tapes that had to be entered. Luckily, the StorageTek technicians were present, doing a firmware upgrade of the library. As they opened the library for easy access to some vital parts, I took the chance to enter the machine and put the tapes in their slots from the inside.

An SL8500 is a U-shaped library, which can be entered from the top of the 'U'. Along the long sides, tapes reside on both walls, and on the curved side of the U, the tape drives are mounted. So I found myself in a narrow corridor of 2 meter tall walls, completely filled with tapes containing the precious electronic data of my enterprise. These corridors sure are narrow, so I guess one of the requirements of being a StorageTek admin is not having a Burger King subscription. I only hoped the technicians hadn't forgotten to deactivate the handbots, because an encounter with an iron gripper moving at 5 meters/second would definitely send me on a one-way trip to the cemetery.

The picture doesn't do the library justice: in this dark corridor, lit only by some faint LEDs and shot with my crappy phone camera, only the lowest 50 centimeters are visible (I placed the camera on the floor). Imagine this being 2 meters high and 15 meters long, and you have an idea of what we're talking about.

I finished in an hour, and once the STK engineers had finished their work, I let the library rescan its new inventory, which it completed within 3 minutes (I love StorageTek hardware). 450 new terabytes ready to use. Happy munching, Optapemus Prime!

TSM 6.1: Proof of Technology

I got invited by IBM to attend their Proof of Technology session about TSM 6.1. It was an interesting opportunity to delve a bit deeper into the new functionality of TSM 6.1, and to chat with other TSM customers about their experiences. First, the 'good' news: I wasn't the only one who ran into problems with the installation on AIX; apparently some installations on AIX 6 were flawless, but on the AIX 5 platform no one had succeeded yet in installing TSM 6.1.

The session was introduced with an explanation of TSM FastBack, which is basically a CDP solution for remote branches, aimed at Windows installations. It captures data changes at the block level with minimal impact on the system, and also provides a near-instant restore capability that enables applications to be up and running within minutes after data recovery is initiated.

On to TSM 6.1 then: to my big surprise, we were invited to a hands-on session with an upgrade from 5.5 to 6.1. It all went faster than expected, so I used the remaining time to study data deduplication in TSM. First of all, data dedup can only be applied to sequential FILE device classes. It uses a SHA-1 hashing algorithm, which gives an infinitesimal probability of a hash collision (a 40PB archive gives a 0.5×10⁻²⁸ chance of a collision). Creating or upgrading your storage pool to a deduplicated one is done with this command:

update stgpool FILEPOOL deduplicate=yes

As soon as you execute this command, an extra process is started:

Process  Process Description   Status
  Number
--------  --------------------  -------------------------------------------------
     283  Identify Duplicates   Storage Pool FILEPOOL, Volume /tsmpool2/00006664.
                                 BFS, Files Processed: 2000, Duplicate
                                 Extents Found: 344, Duplicate Bytes Found:
                                 3,238,123 Current Physical File (bytes):
                                 2,626,676,296.
                                Status: Processing

This process starts identifying duplicate data blocks, and goes idle when nothing is left to process, so you'll always see at least one of these processes running per deduplicated storage pool. This process is the first part of data deduplication and, while running, can use lots of CPU and memory resources: existing FILE volumes are opened and read (I/O intensive), data is fingerprinted into chunks (CPU intensive) and a SHA digest is calculated on each chunk (CPU intensive). To avoid false positives, the chunk size and a second, quicker digest are also checked. Common chunks are replaced with pointers to the location of the common data (DB updates, lock contention). The second phase of data dedup is the effective removal of the redundant data blocks, by a reclamation process.
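
To make the identify phase a bit more concrete, here's a minimal sketch in Python of the general idea: read a FILE volume in chunks, fingerprint each chunk with SHA-1, and use the chunk length as a cheap extra check against false positives. The fixed chunk size and the in-memory index are my own simplifications, not TSM's actual implementation.

import hashlib

CHUNK_SIZE = 256 * 1024  # fixed-size chunks for simplicity; TSM's own chunking is more sophisticated

def identify_duplicates(volume_path, index=None):
    """Scan one FILE volume and count chunks that were already seen.

    `index` maps (sha1, length) -> (volume, offset) of the first occurrence.
    Keying on the length as well is the cheap guard against hash false
    positives mentioned above.
    """
    if index is None:
        index = {}
    duplicate_extents = 0
    duplicate_bytes = 0
    with open(volume_path, "rb") as vol:
        offset = 0
        while True:
            chunk = vol.read(CHUNK_SIZE)
            if not chunk:
                break
            key = (hashlib.sha1(chunk).digest(), len(chunk))
            if key in index:
                # chunk already stored elsewhere: a real server would replace it
                # with a pointer to index[key] and update the database
                duplicate_extents += 1
                duplicate_bytes += len(chunk)
            else:
                index[key] = (volume_path, offset)
            offset += len(chunk)
    return duplicate_extents, duplicate_bytes

# example: dups, saved = identify_duplicates("/tsmpool2/00006664.BFS")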

There was also a presentation about the reporting features introduced in TSM 6.1, which I had already studied. The reporting is based on a combination of ISC, ITM, TEP and DB2, and needs a whopping 2.2GB download. The reporting I had looked into was only the TEP part, but apparently there's also some reporting baked into ISC. The provided reports are pretty basic, but IBM is planning on creating a script library for all TSM customers. You can also extend your TSM-based reporting with BIRT, an Eclipse plugin. Customers who now rely on TSM Operational Reporting might be interested to know that this product will be discontinued with TSM 6.1...

IBM looks determined to push Sun and HP customers to Linux on the mainframe. At work, we compared the performance of TSM on AIX and on zLinux. IBM seemed *very* curious about our performance figures from these tests, and I had an IBM representative asking me lots of questions about them. At the time of the tests I was quite surprised to see TSM perform so well on zLinux, although it did come in about 18% behind the AIX version. It still outperformed TSM on Solaris by a whopping 300%...

SSD myths dispelled - sort of kristof Fri, 04/17/2009 - 22:36

Many eeePC installation guides recommend some precautions when formatting the internal SSD drive with ext3. Most articles warn that the continuous writes of journaled file systems or swap space might eventually trash the drive. I too chose a setup with ext2 and no swap partition on my netbook.

Robert Penz tries to dispel some of the prejudices around SSD drives: he states that with 2 million write cycles at 50MB/sec you'd still get a life span of 20 years.
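
That 20-year figure is easy to sanity-check with a back-of-the-envelope calculation. A rough sketch, where the 16GB drive size and the assumption of perfect wear-levelling are mine, not Robert's:

capacity_bytes = 16 * 10**9   # assumed drive size: 16 GB
erase_cycles = 2 * 10**6      # endurance per cell, as quoted
write_rate = 50 * 10**6       # 50 MB/s of sustained writes, non-stop
total_writable = capacity_bytes * erase_cycles   # perfect wear-levelling spreads writes evenly
seconds = total_writable / write_rate
print(seconds / (3600 * 24 * 365))               # roughly 20 years

The flip side is just as visible: without wear-levelling, hammering the same block (a busy journal, a thrashing swap partition) divides that figure dramatically.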

Interesting read, but I don't buy all the arguments - he makes some good remarks, though I think the truth lies somewhere in the middle:
- first of all, Robert probably quotes figures from enterprise SSD disks, which are of a different quality level than the regular SSD drives found in netbooks.
- I really would like to see results from writing the same block over and over again; a life span of 20 years is imho impossible under those conditions.
- misconfigurations can fill up logfiles pretty quickly, so yes, netbooks can experience heavy writes too.

Does it make such a difference overall? Hell, SSD drives are pretty fast, so fscks are too. My Linux boxes are pretty stable, and I still haven't seen my netbook crash and force an fsck, despite some moments of heavy usage or sudden battery drains. If I had to reinstall my eeePC, I would still go for the same setup. Maybe a swap partition would be handy, but keeping an extra 1GB of disk space out of 20GB is a fair trade-off against the memory it would gain me.

A quick look at TSM 6.1

TSM 6.1 was released a few weeks ago, but I hadn't had a chance yet to test it out. Many things have changed in IBM's backup flagship, so time for a quick glance. First the bad news:

  • TSM isn't available on 32-bit Linux, and at this time only for 64-bit SLES. No word yet on whether RHEL will be supported in the future.
  • the TSM downloads are hefty, ranging from 2 to 5 GB! So be prepared for some long download times. Many of the packages have been replaced by installation suites. TSM also contains a full-blown DB2 9.5, and the reporting software comes with ITM. Lots of software integration with different IBM solutions, so your procurement division might have lots of fun figuring out the licensing of these software bundles.
  • All installations are performed by installation wizards, which behave stupidly. I tried the installation on AIX, but it kept failing on AIX prerequisites, like the AIX Technology Level and some APARs. The reporting software behaves in the same moronic way. In despair I turned to a Windows 2003 VM in VMware, where I had more luck installing everything.

More info on TSM itself later, but I noticed that data dedup is present, albeit in the form of a separate housekeeping job. You'll want to test extensively in your own TSM 6.1 environment whether the performance cost of that dedup job doesn't cancel out the gains it brings to the other housekeeping jobs.
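
For that kind of testing it helps that the housekeeping job can also be kicked off by hand. If memory serves, the command below does it for a deduplicated pool; FILEPOOL is a name of my own choosing, and you should verify the exact syntax with the built-in help on your own server:

identify duplicates FILEPOOL duration=60 numprocess=1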

TSM now also contains default reporting, based on ITM. Reporting alone is a 2.2GB download, containing ISC, ITM and DB2. You can monitor and report on other TSM instances, even 5.5 versions. The reports are pretty basic, but if you're used to TSM Reporter, this might be a nice addition. Basic reporting covers things like backup jobs, schedules, database size and distributions of the number of objects inspected vs backed up. Still no real competitor to professional TSM reporting suites like EMC Data Protection Advisor or Aptare Backup Manager.

TSM V6.1 kristof Fri, 02/27/2009 - 07:25

Yesterday, IBM gave a technical overview of the upcoming 6.1 release of Tivoli Storage Manager (or TSM for short), which should be available for download starting March 27th. This long-awaited release will contain some interesting features like:

  • Database being a separate DB2 instance
  • Data deduplication as a separate, out-of-band housekeeping process, and of course only for FILE devclasses. Good news is that Active Data Pools are also supported.
  • Real-time monitoring and reporting built-in, based on ITM and TCR
  • NetApp Snap support
  • VMWare VCB support
  • Active Directory support

There is also less good news: the installer seems to be replaced with the same ISC installation procedure (which probably makes it impossible to install on non-supported platforms).

10 days without your datacenter == bankruptcy kristof Sun, 01/25/2009 - 12:03

Snow in London kristof Mon, 11/03/2008 - 08:54

I stayed in London last week for a course on Implementing Cisco SAN networks solutions. It was my sixth time in London, and I must say I like the melting pot of cultures, languages and people that London is. It even snowed on Tuesday night, the first time in 34 years it has snowed there in October.
The only drawback of the course was that the location changed at the last minute from central London to Brentford, 17 km from my hotel. This meant a 10 minute walk from my hotel to the underground station, 30 minutes on the tube, and finally a 20 minute walk to the classroom, and all of that twice a day. Aw, my aching feet!

Geek with guns kristof Sat, 05/24/2008 - 12:11

EMC World 2008 has ended; a great event, with lots of interesting stuff in the domain of storage and backup. Data deduplication certainly is a hot item in backup; funny that we were actually looking into this area before even knowing about data dedup in VTLs. It will be interesting to see how inband dedup (like Avamar), outband dedup (in VTLs) and the dedup technology in the upcoming TSM 6.1 will work together (or not at all ;)

I survived the last couple of nights mainly on Johnny Walker Black Label. Johnny Walker has always been one of my favorite blends, but the black one (12 yo) I had yet to taste. It tasted of spicy heat, sweet malt, creamy vanilla, and a very, very small amount of peat.

Time to leave Vegas, but first, why not stay a day longer and have some more fun in Sin City? The US is the land of milk, honey and lots of guns, so we decided to go for some target shooting at the Gun Shop. I chose a shooting session with two of the most renowned automatic weapons, the AK-47 and the Uzi 9mm.

The AK-47, better known as the Kalashnikov, is a truly legendary weapon, known for its extreme ruggedness, simplicity of operation and maintenance, and unsurpassed reliability even in the worst conditions possible. It is used not only as a military weapon, but also as a platform for numerous sporting civilian rifles and shotguns. The gun, which is much lighter and slimmer than I expected, is fed from 30-round stamped steel magazines of heavy but robust design. Shooting the AK felt very light, and the gun has little or no recoil. Very sensitive trigger too; it's no wonder lots of accidents happen with this baby.

The Uzi 9mm is a much smaller gun, with smaller bullets too. The UZI submachine gun was developed in Israel by designer Uziel Gal around 1949. The UZI has been adopted by the police and military of more than 90 countries, including Israel, Germany and Belgium, and was also produced under license in Belgium by FN Herstal. More compact versions, the Mini and Micro UZI, were developed in the early eighties and have been adopted by many police forces around the world. The Uzi offers the same shooting experience as the AK, though the trigger is less sensitive. Bullets from an Uzi leave larger bullet holes too: whereas the AK left nice round holes, this one almost shreds the target paper.

Shooting these arms literally showers the gunman with a rain of bullet shells. These empty shells fly all around, are burning hot, and leave an impressive burn mark whenever they hit a bare arm. Let's just say I have a nice souvenir from this experience ;)

EMC World 2008, Las Vegas

How big is your electronic footprint? We're all producing electronic material at an astonishing rate: pictures, e-mail, work documents. And all that stuff has to be stored somewhere. Luckily, storage devices are getting bigger and cheaper every year: I just bought an 8GB memory stick for under $40. Storage needs are expanding at an incredible rate: the megapixel count of digital cameras keeps growing every year, more and more people are taking more and more pictures, and video content is just starting to boom. And I'm not even talking about high-definition material yet; only a small group of early adopters is starting to use that, and probably only partially.

Storage is an incredibly intriguing area to work in. That's why I'm visiting EMC World right now, held in Las Vegas. EMC World is a yearly event by EMC for technical people, digging into the storage areas EMC is working on. EMC World is big, even by American standards. The current head count is over 9300 visitors, and that's not counting the people organising the event (catering, security, ...). The Mandalay Bay hotel and convention center is sized to match; getting from one convention room to another can be quite a walk. I received a pedometer which counted an astonishing 7.4km for the first convention day alone!

Las Vegas itself isn't quite my cup of tea: last year I already wrote about the plastic fake environment, but luckily the Luxor hotel is rather modest in this respect. It featured a sushi bar, which gave me my first sushi experience, and I must say I liked it. Unfortunately, they didn't offer Japanese whisky, so I had to content myself with a shot of Jim Beam American whisky. One glass is rather insufficient for a decent review, but I found Jim Beam rather sweet, and begging to be mixed with coke. I must say whisky and sushi make a very good combination.

Time to head off for the second EMC World day; there's lots of stuff to explore...