Business in Arch Viz. Vol. 9 - IT Infrastructure & Networking (Part 2 of 2)
What does your data backup system and strategy look like?
ArX Solutions: We use the rule of three: one original protected with drive redundancy and hot spare disks, one copy on site to local storage, one copy out of the office.
Beauty and the Bit: Always having your main copy and several satellite copies done automatically week by week. We are really paranoid about losing data so we even make more local copies of several projects that are actually running. The strategy is to have lots of copies so you never regret anything.
Designstor: Our backup system uses best practices of daily, weekly and monthly sets. It also includes machine replication. We use disk backup and off site mirroring (no more tape drives).
Factory Fifteen: We run a 4 bay RDX tape drive and back up live projects, resources and admin nightly. These tapes are rotated weekly so there is always a set offsite. There is also a backup server (also a domain controller). In the case of the main server having a critical failure, the live projects would be copied from the tapes to the backup server. This would be enough to get by while the main server is rebuilt.
This of course relies on the live projects and the tapes being administered effectively.
Kilograph: We keep one on-site backup of our active projects and one off-site backup. Our archived projects are kept on site and we are currently in the process of integrating Amazon Glacier for deep storage backup of our archives.
MIR: This is a huge and kind of problematic issue. We create more data than it makes sense to back up, and have not yet found a good solution to this problem. We are working on new solutions to this now that will be a more integrated part of our workflow.
Neoscape: The Isilon has built-in snapshotting, and we can lose up to 3 hard drives or 2 nodes and still keep running. Beyond that we have disaster backups that go to an LTO tape library, which get cycled out and taken offsite.
The Digit Group: We have recently shifted from a hardware (tapes/CD/DVD) to an online (Cloud) strategy, using multiple providers to solve our real-time/once an hour/once a day/once a week/once a month and data banking needs.
PixelFlakes: We can't afford to have a day without our files, so we run a highly redundant system. Essentially, we have a second file server that fully replicates our primary file server throughout the day. They have automatic failover, should one of them decide to die. This also forms our primary onsite backup. We also regularly replicate the servers off site via this system using Synology Snapshot Replication.
Public Square: We have used a lot of methods - online backup, tapes, hard drives we throw in boxes, offloading to a smaller RAID array.
Pure: Size and safety. We now have 3 mirrored servers: 2 in-house and 1 external.
2G Studio: The backup system should be exactly the same as the main storage. But the main problem is that it is too expensive to build the same system twice. The most annoying part is that the copying process is really painful and takes a lot of time. It also makes your network traffic very high when the backup runs on business days, causing network disconnects and slowing the production flow by almost 100 percent.
You might say you can schedule the backup overnight or on the weekend, but the problem is that our output is quite big: almost every day our artists render overnight, and some of them use the render farm as well. If we always render on business days, it is going to make things more complicated. So we finally chose to do it manually. For all the project data, we archive it through 3ds Max, save it with all the PSDs and final images, and copy it manually to our backup system.
Ricardo Rocha: UPS, Local redundancy, supplemental storage for old projects and permanent offsite storage.
Steelblue: The primary server manages shadow copies for one week's worth of data, which allows users to restore previous versions without any assistance from IT. Incremental backups run constantly to tape, with a full backup run periodically and kept offsite.
Transparent House: We are using the most stable enterprise HDDs in our NAS solution, which gives us a little more stability. Everything is also built in RAID, so there is a high chance of recovering the data. We back up old projects to make sure that when a client comes back to us we have all the assets to continue working on the project. We do not have a mirror of the whole server, because it is very expensive. Cloud services are also not working for us for now: 100 TB of cloud space costs an insane amount of money and you are leashed to it. Stop paying and you lose what you had. We tried it, and realized that if you need files quickly, there is no way to download even a 3 TB project in an hour. As I mentioned before, we are not there yet.
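Transparent House's point about restore speed is easy to check with back-of-the-envelope arithmetic. The sketch below is a rough illustration, not a measurement: it estimates how long a download takes at a given sustained link speed, and what speed a one-hour restore would need. The function names and the 1 Gbit/s example link are assumptions made for illustration, and real throughput will be lower once protocol overhead and provider limits are taken into account.

```python
def transfer_hours(size_tb: float, link_mbps: float) -> float:
    """Hours needed to move size_tb terabytes over a link sustaining link_mbps megabits/s."""
    size_megabits = size_tb * 1_000_000 * 8  # 1 TB = 1,000,000 MB (decimal), 8 bits per byte
    return size_megabits / link_mbps / 3600


def required_mbps(size_tb: float, hours: float) -> float:
    """Sustained megabits/s needed to move size_tb terabytes in the given number of hours."""
    return size_tb * 1_000_000 * 8 / (hours * 3600)


if __name__ == "__main__":
    # Hypothetical 1 Gbit/s office link, assumed fully available for the restore.
    print(f"3 TB at 1 Gbit/s: {transfer_hours(3, 1000):.1f} h")
    print(f"3 TB in 1 h needs: {required_mbps(3, 1) / 1000:.1f} Gbit/s")
```

Under these assumptions a 3 TB project takes roughly 6.7 hours to pull down over a 1 Gbit/s link, and a one-hour restore would need about 6.7 Gbit/s sustained.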
Urban Simulations: A third server manages it. Instead of backing everything up, twice a day our system selects only the 3D files. No frames are stored, because they are easy to render again if something fails, whereas losing the 3D material created by our artists would be a great loss.
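The idea Urban Simulations describes, backing up scene files while skipping frames that can be rendered again, can be sketched as a simple extension filter. This is a minimal illustration and not their actual system: the extension list and the function name are assumptions, and would need to match a studio's own pipeline.

```python
from pathlib import PurePath
from typing import Iterable, List

# Assumed list of 3D scene formats; adjust it to what your pipeline produces.
SCENE_EXTENSIONS = frozenset({".max", ".fbx", ".obj"})


def select_for_backup(paths: Iterable[str],
                      keep: frozenset = SCENE_EXTENSIONS) -> List[str]:
    """Return only the paths whose extension marks them as 3D scene files.

    Everything else (rendered frames such as .exr or .png) is skipped,
    on the assumption that it can be rendered again from the scene.
    """
    return [p for p in paths if PurePath(p).suffix.lower() in keep]


if __name__ == "__main__":
    # Hypothetical project listing, for illustration only.
    project = [
        "tower/scenes/lobby_v12.max",
        "tower/assets/chair.FBX",
        "tower/frames/lobby_0001.exr",
        "tower/frames/lobby_0002.exr",
    ]
    for path in select_for_backup(project):
        print(path)
```

The resulting list could then be handed to whatever copy or sync tool runs the twice-daily job. Matching on the lowercased extension keeps files such as `chair.FBX` from being missed.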
Any specific advice for larger studios or smaller studios about building or growing a proper IT infrastructure?
ArX Solutions: Ask for advice, the more we all share our experience, the better for everybody. You'll be surprised how people share this information.
Beauty and the Bit: Don't get mad at it, be reasonable, build your structure by little steps and take decisions that bring possible scalability. The key point is never being overwhelmed by the technical part of business.
Designstor: For smaller studios, keep it simple and absolutely do not neglect backup when planning. Take advantage of cloud storage (Google offers unlimited storage for business accounts) to replicate critical systems and data off site. Backup is a big expense but it’s absolutely necessary.
Kilograph: My advice for building or growing a proper IT infrastructure is to invest heavily in the firm’s server components (network cards, hard drives, RAID cards, and switches, to name a few). Never buy computers with components that are more than a few years old: in the long run technology will pass you by, those computers will not be upgradeable enough to keep up with the times, and you will have to buy a new computer in the end. Lastly, develop a flowchart of every component, license, software, and plugin, because that flowchart will be what saves you from chasing that one thing that may be wreaking havoc on the visualization pipeline.
Neoscape: If you are able, get a dedicated IT manager who will have the best interests of the company in mind (not just be a barrier to getting things done). Track the problems, so that you can identify the recurring ones. For small studios, try to maintain discipline among users' workstations; discourage users from “going off the ranch” and installing random software, putting the company in jeopardy from a licensing standpoint or through network weaknesses.
PixelFlakes: Plan. Upgrading when it’s too late or sorting backups once you’ve lost important data is of course not an option.
Public Square: Get high quality machines, and try building them from the ground up. Buy some used parts if it means you will get a more powerful system up.
2G Studio: My advice would be... everyone needs to know about 3ds Max and the render engine, mostly when it comes to managing project file size. Test the scene with and without converting objects to proxies, and compare file size vs render speed vs network traffic. Although it seems like very simple advice, it took us years to figure out. We ran lots of tests before we finally came up with our current network setup.
Ricardo Rocha: A reliable system is better than a fancy, full-featured, consumer-grade one.
Steelblue: Supporting your company internally is a great way to keep a cost efficient handle on IT services. But knowing when support requires outside assistance is key. When you balance your own time and the concern over making a decision that could cost the company time and money it becomes obvious when outside support is necessary.
Transparent House: Invest in your tools. It will give you smoother results, and every dollar will pay off for sure. Separate long-term from short-term investments, and build better workstations. Most R&N happens locally while the farm is busy, so think ahead about upgrade possibilities. Build or upgrade your network to the maximum of today's technologies. Time is getting more expensive every day, and there is no reason to spend it on fixing or waiting.
Urban Simulations: Avoid getting the ultimate hardware or software. It's always expensive and overrated. Get 75% of the power, save money and use the easy thing. The thing with great, impressive features can lock you in for years, and very often you only use the same features the easy, cheap ones offer.
Are there any specific tools that you use on a daily basis that help you manage the network and its administration?
ArX Solutions: Yes, our IT staff monitor software installations, inventory, remote accesses, etc.
Beauty and the Bit: Sure, the NAS's built-in tools for backup are really good, since you don't have to be an illuminati to understand and use them.
Factory Fifteen: Remote Desktop Connection Manager v2.7
Kilograph: Spiceworks mostly as it is a one stop shop for all IT administration.
MIR: We use Windows remote desktop, and also a software that we are beta-testing (which is secret).
Neoscape: One primary useful tool is our monitoring system. It is essential to understanding any failures among the equipment in the data center. Along with that, there is the support system where users can submit trouble tickets.
PixelFlakes: Almost all our management tools are cloud based. Our network is run off Ubiquiti Unifi hardware, that has a fully cloud managed dashboard. They are in our opinion the best on the market now, and it makes a huge difference to the ease of administration.
Public Square: Mostly just Backburner. And remote desktop to keep an eye on things.
Pure: Yes, we created our own tools to see which machines are free to render, and made it very simple to always use the most power available.
Ricardo Rocha: Remote connection and SSH clients mostly, alongside specialized OS for the task.
Steelblue: Sonicwall VPN for remote access. Also exploring Silent Install Builder and Deadline for renderfarm management.
Transparent House: As we are using 3ds Max in our pipeline, we can use the included package tools. For years it’s been pretty stable, and in most cases has not required extra tools for network rendering. There are some useful small utilities to monitor the health and load of the farm, but we are using them on only a couple of machines. We have a chat specifically for the farm so that we are always on the same page with the team. For now it feels like enough, but maybe in the future we will need something more than that.
Urban Simulations: Common Windows tools… nothing fancy.
What are some lessons you’ve learned the hard way setting up a visualization network?
ArX Solutions: Don't trust the big guys (Dell, HP): they will sell you over-budgeted servers that can handle our workload. Try to look for advice in the industry.
Beauty and the Bit: That most of the time what seems cheap in the first place can be expensive later on.
Designstor: We implemented virtualization a number of years ago on the recommendation of an IT company we were working with. Virtualization sounds like a great thing (it was THE thing for a while), but in practice we were badly burned. When it’s working, it’s great; it maximizes resources and handles hardware failures automatically. When something goes wrong, it requires some serious knowledge to fix. We didn’t have that knowledge and relied on an outside company to help, so when something happened (and it did!) we were not able to respond quickly. We are implementing a plan soon to remove the last of our virtualized systems.
Factory Fifteen: There is no cheap and easy route.
Kilograph: Network efficiency and stability make or break a firm. A full day of downtime is thousands of dollars, and over the years I have learned that keeping the network as stable as possible will save you from sleepless nights of troubleshooting.
Neoscape: There are many times when we tried to be penny wise but ended up being pound foolish. The Isilon and the Cisco switches are both enterprise-level hardware that performs on a much higher plane than the consumer-level versions of those devices.
PixelFlakes: Even with the best planning, sometimes things don't work out. You might think a particular piece of hardware will resolve an issue, to no avail. For example, initially we thought moving to a RAID 6 system alone would work. We tried it and it was just too slow for us, leaving us with a very expensive but slow storage system. This required some on-the-spot brainstorming, which resulted in the trial of SSD caching, which worked! Another issue is physical space; we don't have a dedicated server room... yet. This means our racks must stand within the main, open plan office, which of course could lead to noise and heat issues. To solve this, we started using APC NetShelter CX cabinets. They are insulated/soundproofed and work beautifully.
Public Square: REDUNDANCY! You don’t want to lose your data.
Pure: Realizing that when the server doesn't work, everyone is screwed. It sounds obvious, but luckily we had this happen just once, for half a day, and it was a nightmare to experience how dependent we are on it.
Ricardo Rocha: Redundancy, and backup. These are the worst, along with downtime.
Steelblue: Never underestimate the amount of data that artists can quickly generate. When building out a data infrastructure, double the amount of storage you think you'll need.
Transparent House: A couple of things. First, memory and internal storage are very important for the node machines: they have to at least match your workstations, otherwise you will have situations where the farm cannot handle your project. Second is power consumption: make sure the place where your farm is located has enough power for it to run at maximum load. And one more thing, maybe very obvious: update your software in between projects. Maybe this is more for younger studios.
Urban Simulations: Getting the latest thing… whether hardware or software, makes your work harder rather than easier… wait 3 months after every release.
How do you manage network rendering? Do you use 3rd party tools to manage that process or have you developed your own tools?
ArX Solutions: No, we just use Autodesk's Backburner or the distributed rendering from V-Ray.
Beauty and the Bit: 3rd party tools for the moment.
Designstor: We use a highly customized version of Deadline as well as many scripts and programs developed in house.
Factory Fifteen: Deadline
Kilograph: We use Thinkbox software for our network rendering manager. Great tool and helps with IT oversight because it allows me to see all the specs per render machine as well as up time and in some cases downtime.
MIR: We are helping out with beta testing some tools. In general we don't use network rendering that much. We mainly work on images, and haven't had any problems with rendering times in recent years.
Neoscape: We use Deadline, which has been another hard lesson. After years and years of fighting with Backburner, Deadline just works. It is very powerful and can be used across platforms and with many software packages, which is a huge benefit.
PixelFlakes: We use backburner for animations / interactive work, however our primary usage for them is distributed bucket rendering / quick tests for our artists. We therefore don’t ‘queue’ our jobs as traditional CGI / FX artists would. We just need to funnel as much power as possible when the time comes to test a render / kick out a high-resolution draft.
Public Square: Just old-fashioned Backburner. Keep it simple.
Pure: We created our own tools.
2G Studio: For still images we just let the workstation render overnight and use several render farm nodes for each workstation. For animation we use Backburner. There are 3rd party tools to manage the process, but I heard some companies are still having some issues with the 3rd party tools. Right now we are still comfortable with Backburner and how we do the render.
Ricardo Rocha: We use off-the-shelf tools. Simplicity is sometimes underrated.
Steelblue: Currently exploring 3rd party tools. To date we have been able to manage with native tools within 3ds Max/V-Ray/Corona.
Transparent House: As I mentioned before, we are using the included utilities in our pipeline. There were just a couple of situations when we thought something else would be better, but we went back to the standard tools.
Urban Simulations: Backburner is a quick and easy solution but we have tested a lot of good tools.
Have you run into any issues with power consumption and the amount of hardware that needs to be run? Explain the challenges and what you had to change.
ArX Solutions: Not really
Beauty and the Bit: When we moved to our new office we were worried about it, so the smartest move we made was contracting the higher power tier with the electric company. We run some nice “power consuming” devices and a kitchen (that gets really crowded at some moments of the day), and the cooling system can also give you some headaches if you don't plan for it. Also coffee machines are really demanding…we love to drink high quality coffee.
Designstor: We had terrible issues with our server room cooling and power for a while, all of which were the fault of the systems in our building. Most buildings aren’t meant for such power-intensive operations. We spent lots of time putting systems in place to warn about problems and to react accordingly (automatic shutdowns, backup power, etc.). Like data backup, power backup is a huge but critical expense that can’t be underestimated.
Factory Fifteen: Yes, in our last two offices we had power cuts all the time. In our new office we built the electrics from scratch and we have had fewer issues. Something always blows at some point.
Kilograph: Not too many issues with power consumption. Being a small/medium firm, we just make sure all computers are on battery backups with voltage regulators.
MIR: Our workstations generate a lot of heat and noise. We are thinking about upgrading our offices so that the workstations can be located elsewhere than where we sit.
Neoscape: We have the whole Data Center on UPSs, which takes planning, but it grew rather organically. We started with one APC Symmetra LX 8kVA, then added another, then another. We just moved, decommissioned one and added an APC Symmetra PX 30kW 3 phase system. These 3 UPSs are able to run the entire data center in the case of a power failure.
PixelFlakes: Not really. There are power considerations, but this is one of the more predictable aspects of configuring a system. We calculate the maximum load power requirements of all our servers and make sure that we specify the appropriate power supplies. On that subject, when we first set everything up we accidentally placed our PSU on a 3 amp instead of a 12 amp kettle cable. This resulted in our fans failing and our MD having to make a 3am emergency trip to the office as our server was sending email warnings that all our hard drives were overheating. Good times.
Public Square: Sure, we were in a shitty old building years ago and we would lose power all the time once we sent a heavy animation sequence. I think it really depends on the building you are in and how their electrical is set up.
Pure: Our electricity bill is huge. When we moved into the office space, we made sure we got five special lines on top of what is normal, so we secured the power supply.
2G Studio: This is always the biggest challenge, mostly for a small studio like we were back then. As you know, I first worked as a freelancer at my own house, then kept growing. When you are at the growing stage, you will always be faced with this problem.
Here, there is a power limit, which the government sets per house. If you want a bigger limit, you need to pay a lot of money for it. The price of the electricity is also different. This was the hardest part of our life, when we were able to buy a render farm but could not run it because we didn’t have the power capacity. Fortunately we managed to buy three properties side by side in Bali, and we set the power limit quite high for each property. We can use it for our new office without any problem, and the running cost will not be that big.
Ricardo Rocha: Yes. Power consumption and proper power outlets are not to be ignored, nor is knowing the power requirements of your hardware. For this you’ll need specialised help; this is dangerous.
Steelblue: Yes, blowing circuits has led to the installation of additional 110 and 220 circuits in our office space for the workstations and servers.
Transparent House: Yes, I mentioned it in our hard lessons. At some point we replaced our old farm with a new and bigger one. Once it was installed everything was fine, and there were some slow projects for a couple of weeks. But then we got a bigger project with a huge amount of rendering, and once the farm was running at full speed, power outages started. A couple of times we didn’t spot the reason, but we solved it quickly by updating the wiring and some other things I am not really familiar with. Now the power capacity in the areas where the farm is located is almost three times what it needs to be, to be sure this will not happen even if we add more nodes later.
Urban Simulations: A well-known blade render system made us pay for 4 times our current energy consumption, because of the stability of the CPUs and a lot of fancy features we never used. Moving to a simple, tailor-made server without any cooling feature gave us a flexible, low-consumption solution.
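The maximum-load calculation PixelFlakes describes, and the blown circuits several other studios ran into, come down to simple arithmetic: add up the peak draw of every machine, convert it to current at your mains voltage, and compare that with the circuit rating. The sketch below assumes a hypothetical node list, a 230 V supply and an 80% continuous-load margin. All of these are illustrative values, and as Ricardo Rocha notes, the actual electrical work needs specialised help.

```python
from typing import Dict


def circuit_load(peak_watts: Dict[str, float], volts: float,
                 breaker_amps: float, margin: float = 0.8) -> Dict[str, float]:
    """Summarise the worst-case load of a group of machines on one circuit.

    margin is the fraction of the breaker rating treated as usable for
    continuous load (0.8 is an assumed rule of thumb, not a regulation).
    """
    total_watts = sum(peak_watts.values())
    amps = total_watts / volts           # I = P / V
    usable_amps = breaker_amps * margin
    return {
        "total_watts": total_watts,
        "amps": amps,
        "usable_amps": usable_amps,
        "headroom_amps": usable_amps - amps,  # negative means overloaded
    }


if __name__ == "__main__":
    # Hypothetical render nodes and storage, with peak draw in watts.
    farm = {"node01": 650, "node02": 650, "node03": 650, "nas": 250}
    report = circuit_load(farm, volts=230, breaker_amps=16)
    status = "OK" if report["headroom_amps"] >= 0 else "OVERLOADED"
    print(f"{report['total_watts']:.0f} W = {report['amps']:.1f} A "
          f"of {report['usable_amps']:.1f} A usable: {status}")
```

Running the same check with the node count you expect to have in a year or two gives an early warning of the situation Transparent House describes, where the farm only trips the supply once it runs at full speed.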
About this article
We talk to top studios about their IT infrastructure and networking.
About the author
Founder at CGarchitect