Storage for manufacturing

Manufacturing environments typically have different storage requirements than corporate apps, and have to deal with globally dispersed design teams as well as growing regulatory concerns. Here's how several prominent manufacturers have met the challenge.

Maybe it's because they make giant tractors and construction equipment that manufacturing companies like Caterpillar Inc. come across as large, ponderous operations. People imagine massive factories that house huge assembly lines, and assume the IT infrastructure needed to support the collaborative design and development of these monster machines will be equally massive and complex. It may have been this way once, but not today.

"The IT infrastructure is not as big as you might think. There are a lot of parts used in the design of one of our products, but for the most part we have a few basic designs and a lot of different configurations. There's not a tremendous amount of data," says Kenneth Olson, technical specialist in the storage management group at Peoria, IL-based Caterpillar, which touts itself as the world's largest maker of construction and forestry equipment.

The situation is similar at mailing product manufacturer Pitney Bowes Inc., Stamford, CT, which designs and manufactures mailing equipment ranging in size from desktop postage meters to mailing automation equipment that fills an entire room. Pitney Bowes uses the Windchill ProjectLink collaborative design and development product from Needham, MA-based PTC, and the data associated with its products amounts to 750GB--up from 500GB a few years ago--out of a total enterprise storage capacity of 170TB, reports Steve Blum, Pitney Bowes' director of ISS Enterprise Computing Services. Aerospace manufacturers, by comparison, can gobble up 10GB to 20GB of design data storage each day when they're in the midst of a major product development, notes Victor Gerdes, senior product manager for CAD/PDM at PTC.
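To put those figures in proportion, the short Python sketch below works out what share of Pitney Bowes' total capacity the product data represents, and how quickly an aerospace-style design effort would generate the same volume. The function names are illustrative, and the use of decimal units (1TB = 1,000GB) is an assumption; the article does not say which convention the quoted figures follow.

```python
# Figures quoted in the article. Decimal units (1TB = 1,000GB) are an assumption.
DESIGN_DATA_GB = 750
ENTERPRISE_TB = 170
GB_PER_TB = 1000


def share_of_total(part_gb, total_tb):
    """Return part_gb as a percentage of total_tb."""
    return 100.0 * part_gb / (total_tb * GB_PER_TB)


def days_to_accumulate(target_gb, daily_gb):
    """Days needed to generate target_gb at a steady rate of daily_gb per day."""
    return target_gb / daily_gb


if __name__ == "__main__":
    pct = share_of_total(DESIGN_DATA_GB, ENTERPRISE_TB)
    print(f"Product data share of total capacity: {pct:.2f}%")
    for rate in (10, 20):  # aerospace range quoted by PTC's Gerdes
        days = days_to_accumulate(DESIGN_DATA_GB, rate)
        print(f"At {rate}GB/day, 750GB accumulates in {days:.1f} days")
```

Under these assumptions the product data is well under 1% of total capacity, while an aerospace program at the quoted rates would produce the same volume in a matter of weeks.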

Regardless of the amount of data, the demands of manufacturing and product development--especially in a global, widely distributed collaborative product development environment--present unique challenges to corporate IT managers who are increasingly being asked to oversee these engineering systems along with the usual corporate information systems. "This is quite different from classic enterprise information systems," says Rick Villars, vice president of storage systems at IDC, Framingham, MA.

A collaborative world
The days when products were designed and built by one company are gone. Products are now designed by teams of often geographically dispersed designers and developers who share product designs, specifications and requirements across a global network. Outside suppliers and subcontractors play a vital role in a product's development, and components are designed and sourced from a variety of providers. "It's now a distributed world and you can have a thousand engineers, consultants and subcontractors spread out across the world all wanting to access the same data," says Villars.

This raises an issue manufacturers didn't have to confront when they kept designs inside or shared them among a group of trusted partners in the U.S. "Now our customers are realizing that they're sending designs to outsourcing partners, many of which are in Asia, and they're worrying about how to protect their designs," says Gerdes. He expects manufacturers to increasingly demand digital rights management (DRM) capabilities.

DRM aside, CNS Inc. may epitomize the new manufacturer even more than equipment manufacturers like Caterpillar and Pitney Bowes. CNS, a publicly held company headquartered in Minneapolis, makes Breathe Right nasal strips, a consumer product. "We have about 60 people and do all the design and development, but we outsource the actual manufacturing," says Don Himsl, IT director at CNS.

The company designs its products using AutoCAD and shares its designs among its product development staff. Design data is stored on CNS' 2.2TB SAN from Compellent Technologies, which automatically protects critical design files using Compellent's data instant-replay feature. Ironically, the product packaging designs, done in Adobe Illustrator, require much more storage than the AutoCAD designs.

Kichler Lighting, a manufacturer of lighting fixtures in Cleveland, follows the same approach. "We design and engineer the products and manage the product data, but we outsource the actual manufacturing to a number of companies," says Michael Sink, Kichler's manager of network and operations infrastructure. The company built its own product lifecycle management (PLM) tool using workflow software. Otherwise, it relies on basic file sharing over the network to enable engineers and designers to collaboratively develop products. Kichler Lighting manages the collaborative design process itself through the Artesia Teams digital asset management solution from Artesia Technologies Inc., which handles the shared files and provides the necessary design check-in/out process to ensure synchronization.

A different mindset
Beyond the collaborative aspects, IT managers need to think differently about their systems and data. Systems, for example, tend to be file- rather than database-oriented, notes IDC's Villars. Instead of huge volumes of data transactions churning through the systems every day or every hour with frequent reads and writes, the design groups tend to load only a few files and work on them for hours at a stretch. And the files may not be that large. Although a few files sometimes hit 100MB or more, most run less than 5MB, not much more than a large PowerPoint file. The designs may ultimately be rendered as rich graphics--a task often done in batch mode--but until then they consist primarily of mathematical equations that don't consume much storage space.

There's also a new vocabulary, as well as a new set of acronyms. Product data management (PDM) refers to the process of capturing and storing all data about a product and its design. "This is meta data, and it's easy to centralize meta data," says Mark Kerr, technical manager for IBM's industrial sector group.

PLM is concerned with managing product data from inception through the end of the product's lifecycle, which could stretch for decades. "We're still supporting equipment that was made in the 1940s and '50s," says Caterpillar's Olson. Kerr recalls RFPs from the aerospace industry that specified 50 years of retrievable data. PLM data for new products today may still be in use in 2050. "IT managers will need a strategy for moving the data to new media every five years," advises Kerr.
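Kerr's five-year rule can be turned into a rough planning schedule. The sketch below is only an illustration: it assumes the data is first written in year 0, that every migration happens exactly on the interval, and that no migration is needed in the final year of the retention period. The function name is hypothetical.

```python
def migration_years(retention_years, interval_years):
    """Years (counted from the first write) at which data moves to new media.

    Assumes the first copy is written in year 0 and that no migration is
    needed in the year the retention period ends.
    """
    return list(range(interval_years, retention_years, interval_years))


if __name__ == "__main__":
    schedule = migration_years(retention_years=50, interval_years=5)
    print(f"{len(schedule)} migrations, at years: {schedule}")
```

For a 50-year retention requirement like the aerospace RFPs Kerr describes, this works out to nine media migrations over the life of the data.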

Collaborative product development (CPD) describes a development process in which teams of product designers and engineers work together on a project. Essentially a groupware challenge, the data must be stored and managed to ensure that each participant is working with the latest copy of the data. CPD involves synchronizing data, tight version control and data caching at widely dispersed sites to reduce the amount of data repeatedly fetched over the network.
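The version control at the heart of CPD can be pictured as a simple check-out/check-in lock. The following Python sketch is a generic illustration, not the mechanism used by Windchill, Artesia Teams or any other product named here; all class and function names are hypothetical.

```python
class CheckoutError(Exception):
    """Raised when a check-out or check-in would break the locking rule."""


class DesignFile:
    """Hypothetical record for one shared design file."""

    def __init__(self, name):
        self.name = name
        self.version = 1       # latest checked-in version
        self.locked_by = None  # user currently holding the file, if any


def check_out(design_file, user):
    """Lock the file for one user and return the version being edited."""
    if design_file.locked_by not in (None, user):
        raise CheckoutError(
            f"{design_file.name} is checked out by {design_file.locked_by}"
        )
    design_file.locked_by = user
    return design_file.version


def check_in(design_file, user):
    """Release the lock and bump the version; only the lock holder may do so."""
    if design_file.locked_by != user:
        raise CheckoutError(f"{user} does not hold {design_file.name}")
    design_file.version += 1
    design_file.locked_by = None
    return design_file.version
```

Because only one participant can hold a file at a time, anyone who checks it out afterward starts from the latest checked-in version.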

"A lot of engineering that was once kept in our facility is now moving around the world," notes Olson. The process can also get complicated when subcontractors and other third parties are involved.

Centralized or distributed?
With the advent of global product development and manufacturing, large- and midsized manufacturers have to decide where product data will reside. If it's centralized, it's easier to manage, protect and secure. However, remote users may encounter performance issues if they must continually retrieve and store files across a WAN. When the operation is far-flung, the cost of multiple global links can quickly mount.

"A centralized or distributed [product data] vault is the big debate," says IBM's Kerr. If the organization has a big network in place and the data volumes are comparatively small, then a centralized approach is preferred. "If you're setting up multiple vaults, your challenge will be to synchronize them; [then] you'll have the challenge of backup and recovery," he adds.

Because the data primarily takes the form of files, the latest versions of NFS can simplify some of this. "NFS v.4 has replication, caching and a global namespace," Kerr reports. That allows a company to set up a single file system globally. Organizations can then set up multiple local NAS servers, point them to a centralized SAN-based file system and use NFS to move the files around the network. Such a global file system will enable companies to scale out their NAS storage.

"With a global file system and a global namespace, you can keep adding NAS servers and still have a single storage pool," says Kerr.

Another option, suggests PTC's Gerdes, is to store content and meta data centrally, but to deploy content caching servers at remote sites. Designers and engineers can work off the caching servers while the work is periodically saved at the central location and the cache is refreshed. "The nice thing about a content caching server is that you don't have to back it up since all the data has been stored and backed up centrally," he points out. In addition, content caching servers typically are inexpensive compared to other servers.
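The caching arrangement Gerdes describes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not PTC's implementation: the class names are hypothetical, saves are written through to the central store immediately rather than periodically, and the cache checks the central version number each time a file is opened.

```python
class CentralVault:
    """Hypothetical central store holding the authoritative copy of each file."""

    def __init__(self):
        self._files = {}  # name -> (version, content)

    def save(self, name, content):
        version = self._files.get(name, (0, b""))[0] + 1
        self._files[name] = (version, content)
        return version

    def version(self, name):
        return self._files[name][0]

    def fetch(self, name):
        return self._files[name]


class ContentCache:
    """Hypothetical remote-site cache. It holds copies only, so losing it
    loses no data and it does not need its own backup."""

    def __init__(self, vault):
        self.vault = vault
        self._cache = {}      # name -> (version, content)
        self.wan_fetches = 0  # full file transfers over the WAN

    def open(self, name):
        cached = self._cache.get(name)
        # Refresh only when the local copy is missing or out of date.
        if cached is None or cached[0] != self.vault.version(name):
            self._cache[name] = self.vault.fetch(name)
            self.wan_fetches += 1
        return self._cache[name][1]

    def save(self, name, content):
        # Write through to the central vault, then keep the new copy locally.
        version = self.vault.save(name, content)
        self._cache[name] = (version, content)
```

Repeated opens of an unchanged file are served locally; only the small version check, not the file itself, crosses the WAN.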

Santa Clara, CA-based National Semiconductor Corp. (NSC) operates multiple chip design centers in the U.S. and overseas. It prefers to store and manage design data centrally, where it has mirrored Network Appliance (NetApp) Inc. file servers, each with 20TB of usable capacity, and a mirrored data center. "But we still have local sites, too," says Klaus Preussner, director of information services.

To ensure that remote sites can get to the stored design data, NSC maintains a large VPN that uses multiple T1 and T3 links to its Santa Clara headquarters. At remote sites, "we have filers that we use for intelligent data staging," says Preussner, which are the equivalent of Gerdes' content caching servers. Data synchronization is performed in batch mode. The company is also evaluating the use of memory cache at remote sites.

Pitney Bowes also opted to centralize all of its CPD work at its Danbury, CT, data center. It maintains an EMC Corp. SAN with 170TB of raw disk and a Hewlett-Packard Co. (HP) 9000 Superdome server to run the Windchill CPD system for its product design and engineering group. Pitney's design engineers in the U.K. and France access the appropriate design files from Danbury over a T3 link. However, duplicate copies of active files are kept locally, says Patrick Leahy, senior analyst for enterprise business apps at Pitney. Windchill tracks the meta data and automatically synchronizes the local copies with any updates. "Whether the data is in Danbury or local to our users, it all looks like local storage," adds Leahy.

Caterpillar also prefers to centralize design data, which can be accessed by design engineers from as far away as Japan. The master copy is stored at the data center and can be directly accessed by engineers anywhere in the world. Some remote offices save and store their work locally as well as on the main system, but the central IT group isn't involved with that. "We are definitely not replicating data all over the place," says the firm's Olson. Caterpillar invested a considerable amount of money in building its own global T1 and T3 network. "We now own a lot of our fiber so we don't have to lease lines," adds Olson. The upshot: Caterpillar design engineers can get to any data they want over the network from anywhere in the world, and get it quickly.

Caterpillar's engineering data is stored primarily on the central SAN, which consists of multiple arrays from EMC, Hitachi Data Systems and IBM Corp. The documentation groups and some business units prefer NAS storage, for which the company uses EMC Celerra for enterprise NAS as well as a number of NetApp filers deployed by the various business units, Olson explains. Although IT would like to centralize and standardize, in practice the firm ends up mixing centralized and local storage of various types because the business units insist on their storage preferences.

Performance considerations
However IT managers deploy the manufacturing infrastructure, they must take performance into consideration. Moving design files over a crowded corporate network, even if the files are only the size of average PowerPoint presentations, raises performance issues.

But when it comes to performance for design tasks, IT managers need to set expectation levels for their users. "We're not talking about business transactions where people expect an instant response," says Kichler Lighting's Sink. The engineers typically download a drawing when they begin work and may work on it for the rest of the day. "If it takes 15 seconds for the drawing to load, they can wait. It all depends on how you set expectations," he says.

To deliver the expected performance, Kichler relies on its Gigabit Ethernet corporate backbone with 100Mb delivered to the desk. "Design and engineering work is predictable traffic, not like OLTP. We don't see a lot of sudden spikes," says Sink.
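A back-of-the-envelope calculation shows why expectations matter more than raw speed for files of this size. The Python sketch below estimates best-case transfer times over the link types mentioned in this article. It uses the standard nominal line rates for T1 (1.544Mbps) and T3 (44.736Mbps), treats 1MB as 8 megabits, and ignores protocol overhead, latency and competing traffic, so real transfers will be slower. The function name is illustrative.

```python
# Nominal line rates in megabits per second.
LINK_MBPS = {
    "T1": 1.544,
    "T3": 44.736,
    "100Mb Ethernet": 100.0,
}


def transfer_seconds(file_mb, link_mbps):
    """Best-case seconds to move file_mb megabytes over a link_mbps link.

    Ignores protocol overhead, latency and contention.
    """
    return file_mb * 8 / link_mbps


if __name__ == "__main__":
    for size_mb in (1, 5, 60):  # file sizes cited in the article
        for link, rate in LINK_MBPS.items():
            secs = transfer_seconds(size_mb, rate)
            print(f"{size_mb:>3}MB over {link:<14}: {secs:7.1f}s")
```

On these assumptions a typical 5MB design file moves in well under a second on a 100Mb desktop connection, while the same file takes roughly 26 seconds over a single T1.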

For storage performance, the company relies on its EMC Clariion CX600. Kichler expects a planned upgrade to the CX700, or possibly a DMX, to give it a 30% boost in performance. Although Kichler's most graphic-intensive design files can run 50MB to 60MB, these are the exception. Most of its AutoCAD files come in under 1MB. "These aren't very big, but we do have 10,000 to 15,000 of them," says Sink.

To control cost without sacrificing performance, NSC is implementing tiered storage. Much of its design data is saved on primary storage. However, the company plans to use a second tier of low-cost disk for less-frequently accessed design data, the firm's Preussner explains. This tier will offer lower performance at a lower cost. A third tier also consists of low-cost disk used by designers for scratch space--temporary storage that's not protected and, if lost, won't matter.
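A tiering policy of this kind boils down to a placement rule. The sketch below is a minimal illustration in Python, not NSC's actual policy: the 90-day threshold, the function name and the parameters are all assumptions made for the example.

```python
from datetime import date


def choose_tier(last_access, today, is_scratch=False, active_days=90):
    """Pick a storage tier for a design file.

    Tier 1: primary storage for recently used design data.
    Tier 2: low-cost disk for less-frequently accessed design data.
    Tier 3: unprotected low-cost scratch space.
    The 90-day default for active_days is a hypothetical threshold.
    """
    if is_scratch:
        return 3
    idle_days = (today - last_access).days
    return 1 if idle_days <= active_days else 2


if __name__ == "__main__":
    today = date(2024, 6, 30)
    print(choose_tier(date(2024, 6, 1), today))    # recently used design file
    print(choose_tier(date(2023, 1, 15), today))   # long-idle design file
    print(choose_tier(today, today, is_scratch=True))
```

In practice the threshold would be tuned to how long a design stays active, and the rule would run as a periodic batch job rather than on every access.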

Future directions
Manufacturing IT infrastructures are coming under closer scrutiny because of regulatory compliance and litigation concerns, as well as the need to support long product lifecycles and engineers' penchants "to keep things forever," says IDC's Villars. He adds that these pressures will force manufacturing organizations to find ways to cost-effectively store more data for longer periods of time. He expects manufacturers to turn to tiered storage and information lifecycle management in various forms.

"We also expect manufacturing companies to move to grid architectures," says Villars. Grid units (Hewlett-Packard calls them smart cells, while IBM calls them bricks) combine a CPU and storage as a single entity. For example, with 1,000 grid units some of the units can be grouped for NAS and dedicated to the engineering department, others can be reserved for primary block storage for critical applications and a few grid units can be reserved for low-cost archival storage.

An IT manager taking on the responsibility for a company's product design and development infrastructure will find the challenges different, but not terribly difficult. The added storage requirements won't be overwhelming, although network bandwidth for organizations that don't have robust networks in place will present some problems.

For IT managers coming from a corporate environment that is increasingly driven by the demand for 24/7 availability, and where every system, including mail and compliance, is becoming critical, the product design and development world may actually be a refreshing change. "Order entry and fulfillment are mission-critical applications. Engineering work isn't usually mission-critical," says Olson. "Should something go down, the business will manage if some engineers don't have their systems for a little while." How often do corporate IT storage managers hear that?
