Can your system play x number of streams of my favorite codec?
When you’re searching for a new shared storage system, it’s common to compare a system’s published stream counts against those from other manufacturers, but there’s much more behind these numbers than meets the eye.
The purpose of this article is to explain how stream counts can be generated, and highlight some of the less obvious things that can affect a shared storage system’s ability to do (or not do!) a certain number of streams of video playback. We think it’s important when making comparisons to understand how stream counts are generated, and to know what other factors can affect streaming performance and playback. We will also explain some of the ways we generate our own video stream counts and discuss why we get them that way.
(This is a two-part article. We suggest first having a look at Part 1, which can be found here.)
The Not-So-Obvious Variables
At SNS we spend much of our engineering time researching new ways to improve shared storage for video editing. We investigate and test thousands of combinations. During one of our recent marathon testing and tuning sessions — ten in-depth weeks of performance/integration testing using different drives, NLEs, GbE and 10GbE adapters, Mac, Windows, workstation models, kernel tunings, configurations, etc. — we compiled a list of some not-so-obvious things that can affect a shared storage system’s ability to play a number of streams of real-world video. (The goal is always to be able to get as many streams of video playing as reliably as possible on a system, so the more we understand the nuances, the more we can improve our shared storage products for our customers.)
Below are some of the more interesting and important variables that can influence performance in a multi-user storage system.
(None of the items below are merely theoretical; each has been confirmed to have a noticeable influence on the number of video streams a storage system can concurrently deliver to multiple users without dropping a frame. When testing and tuning a complex, multi-user shared storage system it is imperative to take a holistic approach to the overall environment, which is why you will see things like “amount of RAM in the workstations” on the list below. Just as it is possible to cripple a storage system with something like a misconfigured or improperly spec’ed switch, it is likewise possible to compensate for a storage system’s shortcomings (for example, by loading a workstation with RAM) and inadvertently inflate how many streams it appears capable of handling.)
- Protocol (AFP, SMB, SMB2, SMB3, NFS, Fibre Channel, iSCSI, or a mix of them): Storage systems usually support a number of different protocols, and each of those protocols has its own strengths and weaknesses. It is unlikely that all protocols will yield the exact same stream counts, so if you’re comparing systems you should compare the number of streams supported with each protocol you plan to use. Our published stream counts err on the conservative end of the protocols we support.
- File sizes of clips in use: Surprising as it may sound, this goes hand-in-hand with how much RAM you have in your workstations! It is possible to inflate (or erroneously state) a stream count by using clip sizes that are smaller than the amount of RAM in the workstations. For example, say you have twenty 4GB clips and twenty workstations with 8GB of RAM each, and you start looped playback on all twenty systems. The storage system will be mostly idle during this test! That’s because a workstation with a sufficient amount of RAM can cache the video data and simply play that stream out of its own RAM. For this reason, our stream count tests use clips that are larger than the amount of RAM in the workstations we’re using.
- Physical placement of the data on a rotational drive: The physical location of the data on a drive’s platters can influence performance, including stream counts. This is because the outer area of a platter is larger and can therefore deliver more data in a single rotation than the inner area. Roughly speaking, the outermost 10% of a drive is the “sweet spot.” Some storage systems may attempt to leverage this phenomenon with strategies such as short stroking (using just the outer portions of the platters) or tiering (the system attempts to move frequently used data to the outer portions of the platters, or to faster storage like SSD). It is possible to overestimate stream counts by exclusively using the outer portions of the drives’ platters when testing. This is why we “burn” the first 25% of the drives in our tests, meaning we skip over the fastest area of the drives and don’t use it for stream count testing.
- Number of workstations playing the streams: The distribution of the streams being played can influence how many streams can be handled. This is a complex aspect to nail down, but all else being equal it can be easier for a storage system to play six streams on one computer than one stream each on six computers. Our published numbers are obtained using multiple workstations — we frequently have eight or more workstations of various models involved, even for a lower bit rate codec like ProRes 422 (145).
- All users using same files or different files: Would you say that 20 computers playing the same video clip (i.e., the same exact file) is equivalent to the storage system playing 20 streams? It doesn’t quite work that way! This is an extreme case, but it’s a much easier scenario for a system to handle because, for example, that one very popular file might fit completely into the storage system’s RAM. In this example the disks also have much less seeking to do, which means lower latency than when each computer plays its own unique file spread out across the disks. For this reason we test with absolutely unique files—20 streams means 20 individual files.
- Fragmentation of the file system: No, it’s not 1996, and yes, this is still a thing. The amount of fragmentation still matters, and as of today the Finder is still a notorious offender for fragmenting file systems, particularly when there are parallel copies occurring to a single file system. A storage system will be more capable of reliably playing a higher number of streams when the streams are not scattered all over the file system, period. In practice, though, fragmentation is not much of a problem until it becomes severe.
- What’s playing in the clip (for compressed/VBR codecs): Some codecs can use a lower data rate when there is less change/movement from frame to frame. This effect is very pronounced with higher throughput formats like 4K and 6K, and can therefore produce misleading results about how many streams can be played: even though these codecs have a known data rate range, the data rates aren’t perfectly constant. It may be possible to play many more streams of a relatively static compressed video than of the same codec used for footage with a lot of change from one frame to the next.
- Switch/network configuration and congestion: It is possible that a storage system can easily manage to give you, say, 30 streams of a codec, but the network infrastructure (switches, NICs, cables, configuration) could be holding it back from delivering the expected level of performance. This is particularly true in 10GbE <-> GbE environments (e.g., storage connected via 10GbE to a switch, clients connected via GbE), where switches with adequate buffering for the speed mismatch are required!
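A couple of the points above can be made concrete with some back-of-envelope arithmetic. The sketch below is illustrative only (the bitrate and RAM figures are assumptions, not published numbers): it shows the aggregate read throughput N streams demand, and why clips smaller than workstation RAM can leave the storage system mostly idle during a looped playback test.

```python
# Back-of-envelope math for multi-stream playback testing.
# All figures here are illustrative assumptions, not published specs.

def aggregate_throughput_mb_s(streams: int, bitrate_mbit_s: float) -> float:
    """Total sustained read throughput in MB/s for N concurrent streams."""
    return streams * bitrate_mbit_s / 8.0

def clip_fits_in_ram(clip_size_gb: float, workstation_ram_gb: float) -> bool:
    """If a looped clip fits in workstation RAM, the OS page cache can
    serve playback locally and the storage system sits mostly idle —
    inflating the apparent stream count."""
    return clip_size_gb < workstation_ram_gb

# 20 streams of a ~145 Mbit/s codec (e.g. ProRes 422 at 1080p):
print(aggregate_throughput_mb_s(20, 145))  # 362.5 (MB/s of sustained reads)

# A 4GB clip on an 8GB workstation can be cached entirely in RAM:
print(clip_fits_in_ram(4, 8))  # True
```

This is why tests with clips larger than workstation RAM give a truer picture: the reads are forced all the way back to the storage system on every loop.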
Outside of factors that are specific to the storage system’s ability to reliably play a certain number of streams, here are some more things to consider…
Playback isn’t the only thing to consider
How did those video files get into the storage system? How are new files going to get there?
In real-world use, new files will be placed onto the storage system at some point, perhaps many times throughout the day, meaning that the new data will probably be copied or ingested from some computer to the storage system. And if that happens during a period of high playback activity, it’s possible that a very busy storage system will be pushed to the limit (even if the aggregate throughput is within expected limits), because it’s more difficult for the drives to juggle a massive amount of concurrent reads and writes (compared to only reading/playing back) in a way that doesn’t result in dropped frames. In fact, it even matters what sorts of files are being written! Generally, it’s more demanding on the system to write a lot of small files, rather than a few larger files, while playback is occurring. This is where initial system configuration and a good understanding of your workflow can make a big difference.
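The small-files-versus-large-files point can be roughed out numerically. The sketch below is a toy model (the request size and per-file metadata cost are assumptions, and real file systems differ): the same 10GB of new material generates far more disk operations when it arrives as thousands of small files, and every one of those operations competes with playback reads.

```python
# Toy model: why many small files are harder on a busy system than a
# few large ones. REQUEST_SIZE_MB and METADATA_OPS_PER_FILE are
# illustrative assumptions, not measured values.

REQUEST_SIZE_MB = 1          # assumed size of one write request to disk
METADATA_OPS_PER_FILE = 4    # assumed: create, allocate, update, close

def write_workload_ops(file_count: int, file_size_mb: float) -> int:
    """Approximate total disk operations to land a batch of files."""
    data_ops = file_count * max(1, round(file_size_mb / REQUEST_SIZE_MB))
    return data_ops + file_count * METADATA_OPS_PER_FILE

# Same 10GB of new material, two very different shapes:
print(write_workload_ops(10_000, 1))   # 10,000 x 1MB files -> 50000 ops
print(write_workload_ops(10, 1_000))   # 10 x 1GB files     -> 10040 ops
```

Even in this crude model the small-file batch costs roughly five times as many operations, which is work the drives must interleave with the reads that keep playback from dropping frames.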
And how about scrubbing? Is that considered in the stream count? It is more demanding overall on a storage system when some workstations scrub while others are doing normal playback, and scrubbing is more likely to cause those other workstations to drop frames. This is because storage systems are usually pretty good at predicting what data you are going to want next when you’re just playing video normally, because that’s a typical sequential IO pattern. But things change when you’re scrubbing: that nice and orderly sequential IO pattern goes out the window and becomes random IO, for which the next data request is not nearly as easy to predict. On something like a USB drive you won’t observe this, because you are the only person using that drive, but it’s a different story on a shared storage system: when many other users are doing normal playback (sequential IO) and one user starts scrubbing (random IO), the system has to work much harder to juggle the scrubbing request against the other users’ demands for uninterrupted playback without dropped frames. So, can you play 20 streams of a codec and have someone scrubbing without causing frame drops? This is important to know if you intend to frequently be playing 20 streams! (We include scrubbing operations in our stream count tests…)
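The sequential-versus-random distinction can be demonstrated with a small sketch. This is a toy benchmark against a local scratch file, not a real shared-storage test: the same blocks are read twice, once in playback order and once in a shuffled, scrubbing-like order. On a rotational drive the shuffled order forces far more seeking; on an SSD or an already-cached file the gap shrinks, which is itself a reminder of how much test conditions matter.

```python
import os
import random
import tempfile
import time

BLOCK = 256 * 1024  # bytes read per request (illustrative chunk size)

def read_pattern(path, offsets):
    """Time reading BLOCK bytes at each offset, in the given order."""
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for off in offsets:
            f.seek(off)
            f.read(BLOCK)
    return time.perf_counter() - start

# Build a scratch file to read from (stands in for a video clip).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(BLOCK * 64))
    path = tmp.name

sequential = [i * BLOCK for i in range(64)]  # normal playback order
scattered = sequential[:]
random.shuffle(scattered)                    # scrubbing-like order

seq_t = read_pattern(path, sequential)
rnd_t = read_pattern(path, scattered)
print(f"sequential: {seq_t:.4f}s  random: {rnd_t:.4f}s")
os.unlink(path)
```

Note that both orders read exactly the same data; only the access pattern differs, and on a busy multi-user array that pattern is what separates easy prefetching from hard seeking.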
The stream that breaks the camel’s back
What’s the difference between 13 streams and 14 streams, or 36 streams and 37 streams? In all cases it comes down to some extra unit of work that pushes things over the edge. But maybe instead of one extra stream it’s someone scrubbing while other users are playing. Or maybe, in an extreme edge case, all it takes is one person simply opening another project and populating the timeline with media while other users are playing lots of streams. Sometimes the consequence is just that one computer drops a frame; in other cases, all computers drop frames. The fact is that every system has a breaking point, and the closer you get to it, the easier it is to nudge things over the edge. This is why it is important to have some headroom!
Can’t I just use SSD?
Yes! Using SSD can certainly improve (or at least mask) some of these issues, and undoubtedly the price/capacity ratio will improve to the point that SSD makes economic sense in most deployments. Unfortunately, we’re not there yet. If you need the speed of SSD for video purposes, then you probably need a huge amount of capacity to go along with it, and for now that means paying a premium for the same amount of capacity.
Criteria for success
Would you consider a stream count reliable for your needs if you could play that number of streams for 30 minutes without dropping a single frame, but no longer? How about 15 minutes? 5 minutes? 30 seconds? It’s rather easy for a shared storage system to manage a burst of a high workload for a short time, but it’s quite another to consistently sustain it for potentially hours. This could be very important to you, especially for long-form workflows, which is why we confirm that we’re able to do uninterrupted playback with a number of streams for at least an hour, and for some tests we let things roll overnight or longer for a soak test.
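The pass/fail criterion described above, sustained playback with zero dropped frames, can be modeled as a frame-deadline loop. The sketch below is a simplified model, not our actual test harness: `fetch_frame` is a hypothetical stand-in for reading one frame from storage, and any frame that arrives after its display deadline counts as dropped. Run long enough, this is essentially what a soak test measures.

```python
import time

FPS = 25                      # assumed frame rate for this model
FRAME_INTERVAL = 1.0 / FPS    # each frame's delivery budget (40 ms)

def soak_test(fetch_frame, duration_s: float) -> int:
    """Run for duration_s, returning the number of late (dropped) frames.

    fetch_frame is a hypothetical callable standing in for reading one
    frame's worth of data from the storage system.
    """
    dropped = 0
    deadline = time.perf_counter()
    for _ in range(int(duration_s * FPS)):
        deadline += FRAME_INTERVAL
        fetch_frame()
        now = time.perf_counter()
        if now > deadline:
            dropped += 1
            deadline = now  # resync the clock after a miss
        else:
            time.sleep(deadline - now)  # wait out the frame interval
    return dropped

# An instantaneous fake fetch should drop nothing over a short run:
print(soak_test(lambda: None, duration_s=0.5))  # expected: 0 on an unloaded machine
```

The point of the model is the duration parameter: a system that passes at 30 seconds may still rack up drops at one hour, which is why short bursts prove very little.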
A stream count may be intentionally conservative, or it may bump right up to the ceiling of what is possible, with no room for doing anything else. If you believe you will have a workload that is going to frequently be at the upper limit of the stream count then you should plan to increase the capability of the shared storage system, or at least have reliable procedures in place to load balance your daily use of it.
We hope this information helps to illustrate that the process of gathering stream counts is multi-faceted and the results of these kinds of tests can be interpreted in different ways.
For practical reasons the detailed background on how a shared storage system’s stream counts were generated will not always be available. If you’re comparing stream counts for shared storage systems from multiple companies, make sure you have all the information you need to be confident you’re making a fair comparison.
To be sure, some of the topics discussed above (headroom, for example) may be impossible to uncover by simply comparing one system’s published stream counts to another’s. Stream count estimates are helpful to know — you can see ours here — but they should not be a make-or-break item on your shared storage evaluation list. It is much more important to have good discussions with your storage provider so that your workflow and performance goals for your new system are well understood.