Thursday, April 18, 2024

Using File Virtualization for Disaster Recovery


Running a state attorney general’s office takes the right IT platform. The Pennsylvania Attorney General’s Office, for example, has a main data center, a secondary site for disaster recovery (DR) and 22 remote sites to look after. That network serves about 1,000 employees statewide.

About 75 ESX Server hosts operate around 145 virtual machines (VMs) courtesy of technology by VMware Inc., a division of EMC Corp. of Hopkinton, MA. On the storage side, the organization has two FAS 3050 filers by Network Appliance Inc. (NetApp) of Sunnyvale, CA. These reside at its main office. A NetApp FAS 3020 filer has been added at its DR site.

With 500 attorneys on staff, you wouldn’t think the office would have much in the way of legal troubles. Not so. The PA Attorney General’s Office, like everyone else, is having to rethink its approach to data retention and e-discovery. Recent legislation mandates that data remain available for a certain number of years, and e-discovery rules mean that data has to be found and produced quickly.

“We needed to revamp our storage environment to make it searchable and secure for a variety of retention periods,” said Paul Lubold, the infrastructure and operations manager at the PA Attorney General’s Office based in Harrisburg, PA. “That’s why we became interested in file virtualization technology.”

File virtualization is typically achieved using a file area network (FAN). A FAN is a way to aggregate file systems so they can be moved more easily and managed centrally. This technology makes sense to anyone who has deployed more than a handful of NAS boxes. As each box has its own file system, network connection, etc., adding another device requires duplicate installation, setup and maintenance functions. The organization used NetApp VFM to build a global namespace, which decouples the files from the physical filer they sit on.

“With a global namespace, we built a virtual environment that enables us to move data where it’s needed and that is transparent to the user,” said Lubold. “I don’t have to come in to work anymore in the middle of the night to make changes to our file systems.”
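The idea behind a global namespace can be illustrated with a short sketch. This is not how NetApp VFM or Microsoft DFS is implemented; it is a minimal Python model, with hypothetical class, path and filer names, showing how a logical path that users see can be kept separate from the physical share that holds the data, so that a migration only changes the mapping.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GlobalNamespace:
    """Toy model: maps logical folders to the physical share that holds them."""

    links: dict[str, str] = field(default_factory=dict)

    def publish(self, logical: str, physical: str) -> None:
        """Expose a physical share under a logical folder name."""
        self.links[logical] = physical

    def resolve(self, logical_path: str) -> str:
        """Translate a logical path to its current physical location.

        Uses the longest published prefix that matches the path.
        """
        parts = logical_path.strip("/").split("/")
        for end in range(len(parts), 0, -1):
            prefix = "/" + "/".join(parts[:end])
            if prefix in self.links:
                rest = "/".join(parts[end:])
                target = self.links[prefix]
                return f"{target}/{rest}" if rest else target
        raise KeyError(f"no namespace entry covers {logical_path!r}")

    def migrate(self, logical: str, new_physical: str) -> None:
        """Repoint a logical folder after its data has been copied elsewhere.

        Only the mapping changes; the path users see stays the same.
        """
        if logical not in self.links:
            raise KeyError(f"{logical!r} is not published")
        self.links[logical] = new_physical


# Hypothetical example: an attorney's folder moves from HQ to another filer.
ns = GlobalNamespace()
ns.publish("/users/jdoe", "//filer-hq-1/vol1/jdoe")
before = ns.resolve("/users/jdoe/briefs/memo.doc")
ns.migrate("/users/jdoe", "//filer-remote-7/vol2/jdoe")
after = ns.resolve("/users/jdoe/briefs/memo.doc")
print(before)
print(after)
```

In this model the user keeps opening /users/jdoe/briefs/memo.doc before and after the move; only the administrator's mapping entry is updated, which is the property Lubold describes as being transparent to the user.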

In the past, he said, it was a real challenge dealing with employee moves, and 500 attorneys meant a lot of changes. There were frequent shifts from one office to another, or sometimes remote users moving to a home office. Each move entailed manually moving data from one server or NAS filer to another. According to Lubold, it would take one to two hours to move that data across a WAN link. With 15 to 16 terabytes of file data all on NAS, he decided to accommodate those demands by putting in place a system that eliminated much of the scripting and manual labor involved.

“If I moved data from LUN to LUN in the past, I had to come in over the weekend, completely tear down the file environment, move it, reestablish connectivity, and change paths to a bigger box,” said Lubold. “Now NetApp VFM runs on an appliance and it allows data to follow users wherever they go. There is no such thing as a maintenance window anymore.”

VFM works in conjunction with Microsoft Distributed File System (DFS). HQ, the DR site and some of the larger remote offices have DFS running in order to replicate data back and forth. DFS is typically deployed on an existing domain controller. The smaller sites operate using client-based software, so there is no need to operate extra equipment there.

“We use DFS to do replication across WAN links,” said Lubold.

While DFS does the file replication, VFM takes care of file management and global namespace capabilities. The system has been tested at the main office in Harrisburg and at three nearby facilities. From there it is being rolled out to all offices statewide over a two-week period.

In the initial phase, the organization is using DFS/VFM for global namespace management, file virtualization, data migration and remote site replication. Snapshots are being used to back up data and send it offline for DR. This immediately simplifies the storage infrastructure, as drive mappings no longer have to be broken and re-established when data moves.

“We will experience no more mapping breaks due to the implementation of global namespace technology,” said Lubold. “Once you create your global namespace, you don’t have to move anything again.”

This article was first published on
