File servers help consolidate data from multiple users into a single repository. This gives you common storage for all of the data and makes file sharing between multiple users & projects convenient. You allocate large volumes of RAID-protected disks to a file server and then allocate shares to your users. Users have full rights on their own folders and can have full or limited access to shared folders.
While most people think the real data is in databases, I believe the real data is on the file servers. Users create a lot of data from their knowledge & experience. The data here is not just a transaction; it is the intellectual property of the person creating it. I once saw someone build a final report in MS Excel that pulled data from 11 different Excel files. A department updates its data in one file and the entire report gets updated immediately. Imagine the impact if this is lost.
Apart from losing the data, the user has to spend a huge amount of time rebuilding the entire logic from scratch. Compare this with ERP software, where you may lose an important record but the basic application can be reinstalled. It therefore becomes even more important to protect this data from accidental deletion or overwriting.
Whether you are a small office or a large enterprise, file servers are the most critical data assets and need to be protected properly.
The biggest problem you face when backing up file servers is that small files take much longer to back up, and organizations hold terabytes of such data.
Many small organizations still prefer making copies of their data on an external USB drive. This starts as a good practice but rarely turns into a regular long-term routine, as you are not always available to sit down and perform the copy and paste. Investing in automatic backup infrastructure and managing it isn’t a simple task for small organizations with small IT setups.
I have seen many organizations lose their data over something as small as a missed manual copy. Many setups are too small to even have RAID protection.
Protecting the data on the cloud is a good option in such scenarios. Allow us to install the Virtual Vault software in the environment and schedule the backups. The first backup processes the entire data set, and the subsequent “always incremental” backups ensure that only the changes get backed up, and even those are compressed locally first. This minimizes the bandwidth requirement for the organization. Zero infrastructure investment and a pay-as-you-use model make it even easier to handle the cost of protection. You can also protect your desktops and laptops in the same environment without buying more infrastructure or licenses.
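The “always incremental” idea can be sketched in a few lines of Python. This is only an illustration and not how Virtual Vault is actually implemented: the function name `incremental_backup`, the JSON manifest and the gzip staging folder are all assumptions made for the example.

```python
import gzip
import hashlib
import json
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file's content in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def incremental_backup(source_dir, staging_dir, manifest_path):
    """Stage new or changed files as .gz copies; return their relative paths.

    The manifest (a hypothetical JSON file) remembers the content hash of
    every file seen in earlier runs, so the first run stages everything and
    later runs stage only what changed.
    """
    try:
        with open(manifest_path) as handle:
            manifest = json.load(handle)
    except FileNotFoundError:
        manifest = {}  # first run: nothing has been backed up yet

    staged = []
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_dir)
            digest = sha256_of(src)
            if manifest.get(rel) == digest:
                continue  # unchanged since the last run
            dst = os.path.join(staging_dir, rel + ".gz")
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            # Compress locally so less data has to cross the network.
            with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
                shutil.copyfileobj(fin, fout)
            manifest[rel] = digest
            staged.append(rel)

    with open(manifest_path, "w") as handle:
        json.dump(manifest, handle)
    return sorted(staged)
```

Run on a schedule, the first call stages every file and later calls stage only what has changed since. The staging folder stands in for whatever would actually be uploaded; it should live outside the source folder.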
Large enterprises can afford to deploy external storage with dual controllers, RAID protection and hot spare disks, since the more redundancy you add, the more reliable the storage becomes. However, what if a user is looking for data created a day earlier? Redundancy does not help there. Most of the file server recovery requests you get are for a previous version of a small file. You rarely see a complete crash.
For larger organizations, protecting file servers is the biggest challenge. The volumes are huge and the tape infrastructure is slow. I have seen organizations upgrade their tape infrastructure every 18-24 months, believing it will reduce their backup windows. It does look good initially but is not a long-term solution. The backup windows keep growing because the large number of small files takes much longer to back up.
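A back-of-the-envelope model shows why faster tape alone does not fix this. The sketch below assumes backup time is a fixed cost per file (open, catalog, close) plus the time to stream the bytes. The 5 ms per-file overhead and the 100 MB/s throughput are purely illustrative assumptions, not measurements from any real environment.

```python
def estimate_backup_hours(file_count, total_bytes,
                          per_file_overhead_s=0.005,
                          throughput_bytes_per_s=100e6):
    """Rough backup time: fixed cost per file plus streaming time for the data."""
    overhead_s = file_count * per_file_overhead_s
    transfer_s = total_bytes / throughput_bytes_per_s
    return (overhead_s + transfer_s) / 3600


ONE_TB = 1e12

# The same 1 TB of data, stored two different ways.
many_small = estimate_backup_hours(10_000_000, ONE_TB)  # about 100 KB per file
few_large = estimate_backup_hours(1_000, ONE_TB)        # about 1 GB per file

# Doubling the drive throughput only shrinks the streaming part.
many_small_fast = estimate_backup_hours(
    10_000_000, ONE_TB, throughput_bytes_per_s=200e6
)

print(f"small files: {many_small:.1f} h, large files: {few_large:.1f} h, "
      f"small files on a faster drive: {many_small_fast:.1f} h")
```

Under these assumed numbers the per-file overhead dominates: ten million small files take roughly six times as long as the same terabyte in large files, and doubling the drive speed barely moves the small-file case.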
Given the volume of data on file servers, you end up taking full backups only once a month, as a full cycle takes 4-5 days or even more to complete. Incremental backups for the rest of the month help shorten the backup window.
Recoveries during the month depend on that full backup plus the incrementals taken since, so they take an equally long time to complete. On top of this, managing & storing tapes needs a lot of care and investment. Organizations should consider categorizing their data by usage pattern and archiving the old data, which helps immensely in reducing the backup window.
I once dealt with daily incremental backups that took too long to complete; they were as time consuming & media consuming as the full backups. We diagnosed this further and found that the access rights were configured in such a way that every time you added or removed a user from a set of shared folders, the access bit of all the files in those folders changed and the files got a new timestamp.
Incremental backups work on a timestamp and the moment they see a change, all the files get backed up even if the content has not changed. We immediately initiated a pilot on sample data and could show the benefits of our solution with deduplication. It picked up only the changed data instead of picking up all the files again on a file access bit change.
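The difference between the two selection methods can be shown with a small Python sketch. It is a simplification under stated assumptions: bumping the modification time with `os.utime` stands in for the permission change described above, whole-file SHA-256 hashes stand in for deduplication (real products typically work at block level), and the function names are hypothetical.

```python
import hashlib
import os
import tempfile
import time


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's content."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_by_timestamp(paths, last_backup_time):
    """Timestamp-based selection: every file stamped after the last backup."""
    return [p for p in paths if os.path.getmtime(p) > last_backup_time]


def changed_by_content(paths, known_digests):
    """Content-based selection: only files whose digest differs from last time."""
    return [p for p in paths if known_digests.get(p) != file_digest(p)]


def demo():
    """Return (files picked by timestamp, files picked by content)."""
    with tempfile.TemporaryDirectory() as root:
        paths = []
        for i in range(3):
            path = os.path.join(root, f"report_{i}.txt")
            with open(path, "w") as handle:
                handle.write(f"data {i}")
            paths.append(path)

        # State recorded by the previous backup run.
        known = {p: file_digest(p) for p in paths}
        last_backup = time.time()

        # Really edit one file.
        with open(paths[0], "a") as handle:
            handle.write(" updated")

        # Simulate the metadata-only change: every file gets a new timestamp.
        stamp = last_backup + 60
        for p in paths:
            os.utime(p, (stamp, stamp))

        return (len(changed_by_timestamp(paths, last_backup)),
                len(changed_by_content(paths, known)))


print(demo())  # (3, 1)
```

The timestamp check selects all three files, while the content check selects only the one that was actually edited. The trade-off is that hashing has to read every file, which backup products usually avoid with smarter change tracking.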
Another good approach for large enterprises is to archive their file servers regularly. I recently met someone who does this manually every year. They take a full backup of the file server & find the files not accessed for over a year. Treating the backup as an archive, they delete these files from the production systems. If a user requests an old file, they recover it from the old backups.
This helps them reduce the file server workload as well as get better backup performance. The only issue here is the amount of manual effort involved.
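The "find the files not accessed for over a year" step is easy to script. The sketch below is a minimal example, assuming the file system records access times reliably (many servers disable or relax access-time updates, in which case only the modification time is meaningful). The function name `find_stale_files` is hypothetical, and the script only lists candidates; it deletes nothing.

```python
import os
import time


def find_stale_files(root, max_age_days=365, now=None):
    """List (path, size_in_bytes) for files not touched in max_age_days.

    A file counts as stale only if both its last access time and its last
    modification time are older than the cutoff.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            last_used = max(info.st_atime, info.st_mtime)
            if last_used < cutoff:
                stale.append((path, info.st_size))
    return stale
```

Summing the sizes in the returned list gives a quick estimate of how much data would leave the production file server, and therefore the nightly backup, if those files were archived.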
This can be automated: professional archiving tools can handle the entire process and give you easier access to your older data. In fact, you can let users retrieve their own data. Using cloud-based archival infrastructure can drastically reduce the cost of retention and provide much easier access and a better user experience when an old file is required.
Cloud-based backups push your encrypted data out of your network, reducing the risk of data corruption by internal threats, malware & ransomware attacks. You can also back up to the cloud and archive old backups to low-cost cloud storage. Cloud-based backups back up only the changed data in every cycle, which reduces the daily backup window tremendously. You can pick and choose what you want to back up, create policies based on data type, and devote your resources to other applications.
2. Case in point
I worked with a media agency in Delhi. The environment predominantly had MS Exchange and an 8 TB file server. Tape-based backups were implemented, and the file server was backed up only on weekends. These backups took 3 days to complete, which left no room for MS Exchange backups. MS Exchange backups were therefore taken on weekdays, which in turn left no room for file server backups during the week. We implemented the Virtual Vault backup solution for their file server. The first backup did take some time, as the large volume of data had to be populated manually through external disks. After that, file server backups completed in 15-20 minutes every day. The result was a phenomenal reduction in the backup window, in tape usage & maintenance, along with daily data protection.
Our 24x7 monitoring service looks after your backup cycles, failure troubleshooting, reporting, audit & recovery requirements. This brings big relief to your environment, so that you can focus your resources on your business and allow us to manage your data.
The best way to address your concerns over backing up large volumes of small files depends on the overall size of your file serving environment. Small or mid-size file server environments (up to 10-12 TB) should be backed up to the cloud, leveraging technologies like deduplication and archiving the backups of older files. This reduces the investment in on-prem solutions and tape infrastructure management.
Larger organizations should adopt a complete Information Lifecycle Management strategy. Deploy an automated, professional archival infrastructure. Archiving old data considerably reduces the amount of data that needs to be backed up. To back up the remaining, most recently used data, consider deduplication-based cloud backups. You get the advantage of investing in fewer resources, even for the production file servers, and a best-of-breed strategy for your environment.
Deploying a cloud strategy can also help reduce the damage internal threats can do to the data. Keeping the backups away from the production network reduces the risk of data loss in the event of a malware or ransomware attack.