Month: March 2014

How to build your own 180TB RAID6 storage array for $9,305

Storage Pod 4.0, side by side

We’ve all been there: Your computer’s 2-terabyte drive has filled itself up again, and it’s time to delete some movies and uninstall some games. But wait! Instead of deleting data like some kind of chump, I have a better idea: Build your own 180-terabyte RAID6 storage array, and never run out of space ever again. With 180 terabytes of storage under the hood, never again will the Steam Summer Sale give you storage anxiety; never again will you have to decide which files get backed up. The best part? Building your own 180TB storage array will cost you just $9,305.

The 180TB storage array, like many of our other hard drive-related stories, comes from our friends at Backblaze. Backblaze is a cloud-based backup company that provides unlimited storage for a fixed monthly price — a service it can only provide because it builds its own Storage Pods, instead of using commercial devices that are well over twice the price. Backblaze originally open sourced the specifications of Storage Pod 2.0 in 2011 — and now, as the company continues to grow and seek out cheaper and higher density storage solutions, it has just published the details of Storage Pod 4.0.

First, the specifications. Storage Pod 4 consists of a custom-designed 4U server case containing 45 4TB hard drives, a single 850W power supply, and a motherboard/CPU/RAM that runs the controller software. The centerpiece of the installation, though, is a pair of Rocket 750 40-port SATA PCIe host adapter expansion boards, priced at around $700 each. These specs are a big step up from Storage Pod 2.0 and 3.0, which required two PSUs, and nine five-drive NAS backplanes that then connected to three SATA expansion cards. By wiring the hard drives directly into the host adapter, Backblaze says Storage Pod 4 has between four and five times the throughput of its predecessor.
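As a rough check on the headline figure: 180TB is the raw capacity (drive count times drive size). RAID6 reserves two drives' worth of parity per array, so the usable space is lower. The sketch below assumes, purely for illustration, that the 45 drives are split into equal RAID6 groups; the article does not say how Backblaze lays out its arrays, and the function names are hypothetical.

```python
def raw_capacity_tb(drive_count, drive_size_tb):
    """Raw capacity: every drive counted in full."""
    return drive_count * drive_size_tb


def raid6_usable_tb(drive_count, drive_size_tb, groups=1):
    """Usable capacity when the drives are split into equal RAID6 groups.

    RAID6 gives up two drives' worth of space per group to parity.
    The grouping is an assumption for illustration, not Backblaze's layout.
    """
    if drive_count % groups != 0:
        raise ValueError("drives must divide evenly into groups")
    per_group = drive_count // groups
    if per_group < 4:
        raise ValueError("RAID6 needs at least four drives per group")
    return groups * (per_group - 2) * drive_size_tb


print(raw_capacity_tb(45, 4))            # 180 (raw TB)
print(raid6_usable_tb(45, 4, groups=1))  # 172 (one big array)
print(raid6_usable_tb(45, 4, groups=3))  # 156 (three 15-drive arrays)
```

Smaller groups cost more capacity but limit how many drives are involved in any one rebuild, which is the usual trade-off when choosing a layout.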

Rocket 750 40-port SATA expansion cards, inside the Backblaze Storage Pod 4.0

If you want to build your own Storage Pod, Backblaze does provide a complete parts list and blueprint, but it would be a pretty epic endeavor. Instead, Backblaze suggests that you buy an empty Storinator chassis from 45 Drives, which is based on the Backblaze Storage Pod, and fill it up with your own drives. This method will cost you around $12,500, rather than Backblaze’s cheaper in-house cost of $9,305. In case you’re wondering, Backblaze is currently filling its Storage Pods with Hitachi (HGST) and Seagate 4TB hard drives, but it wants to try out Western Digital’s Red drives in the near future. (Read: Who makes the most reliable hard drives?)

The Thailand hard drive crisis, three years on

What’s odd about Storage Pod 4.0, however, is that its cost-per-gigabyte is almost identical to that of Storage Pod 2.0, released back in July 2011. Storage Pod 2.0 provided 135TB at a cost of $7,394, or 5.5 cents per gig; Storage Pod 4.0 is 180TB for $9,305, or about 5.2 cents per gig.
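The per-gigabyte figures are simple division, and are rounded in the text. A minimal sketch, assuming decimal terabytes (1TB = 1,000GB, as drive vendors count them) and using only the prices and capacities quoted above:

```python
def cents_per_gb(price_usd, capacity_tb):
    """Cost in US cents per gigabyte, assuming 1 TB = 1,000 GB."""
    capacity_gb = capacity_tb * 1000
    return price_usd / capacity_gb * 100


# Prices and capacities as quoted in the article.
pods = {
    "Storage Pod 2.0": (7394, 135),
    "Storage Pod 4.0": (9305, 180),
}

for name, (price_usd, capacity_tb) in pods.items():
    print(f"{name}: {cents_per_gb(price_usd, capacity_tb):.2f} cents/GB")
# Storage Pod 2.0: 5.48 cents/GB
# Storage Pod 4.0: 5.17 cents/GB
```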

Hard drive cost per gigabyte, from 2009 to 2013

If the Thailand flooding of 2011 hadn’t occurred, we’d probably be around 3 cents per gig. After the floods, hard drive prices shot up, and it took almost 30 months for hard drive prices to start trending below their July 2011 level. This is why, after almost three years, 4TB drives are still the most cost-effective (before the Thailand floods, the cost-per-gig was almost halving every two years, a pace comparable to Moore’s law).
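To see where an estimate like "around 3 cents" could come from, here is a back-of-the-envelope projection. It assumes a smooth exponential decline from the July 2011 figure of 5.5 cents and about 32 months elapsed (July 2011 to March 2014); both the smooth curve and the halving periods tried are assumptions for illustration, not data from Backblaze.

```python
def projected_cents_per_gb(start_cents, months_elapsed, halving_months=24):
    """Price after a steady exponential decline that halves every `halving_months`."""
    return start_cents * 0.5 ** (months_elapsed / halving_months)


START_CENTS = 5.5     # Storage Pod 2.0 figure, July 2011
MONTHS_ELAPSED = 32   # July 2011 to March 2014 (assumed)

# Strict halving every two years: roughly 2.2 cents per gig.
print(f"{projected_cents_per_gb(START_CENTS, MONTHS_ELAPSED, 24):.1f}")

# A slower, three-year halving period: roughly 3.0 cents per gig.
print(f"{projected_cents_per_gb(START_CENTS, MONTHS_ELAPSED, 36):.1f}")
```

A strict two-year halving gives a little over 2 cents, so the 3-cent estimate fits a decline that was, as the text says, "almost" halving every two years.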

The good news, though, is that 5- and 6-terabyte drives are now on the market; they’re just incredibly expensive. The WD/HGST helium-filled 6TB drive is one of the most exciting hard drives to hit the market in the last decade, but priced at around $750, or 12 cents per gig, it just doesn’t make economic sense for large storage arrays.

For a complete parts list, chassis blueprint, and info on how to build your own Storage Pod 4.0, hit up the Backblaze website. It’s worth noting that Backblaze’s controller/RAID6 software is proprietary, so if you do go down the DIY route, you’d probably end up using something like FreeNAS, or rolling your own software. (Let’s face it, 180TB storage arrays aren’t really for home users; this is enterprise- and supercomputing-level stuff.)

Facebook’s facial recognition technology is now as accurate as the human brain, but what now?

Facial recognition markers

Facebook’s facial recognition research project, DeepFace (yes, really), is now very nearly as accurate as the human brain. DeepFace can look at two photos and, irrespective of lighting or angle, say with 97.25% accuracy whether the photos contain the same face. Humans can perform the same task with 97.53% accuracy. DeepFace is currently just a research project, but in the future it will likely be used to help with facial recognition on the Facebook website. It would also be irresponsible if we didn’t mention the true power of facial recognition, which Facebook is surely investigating: Tracking your face across the entirety of the web, and in real life, as you move from shop to shop, producing some very lucrative behavioral tracking data indeed.

The DeepFace software, developed by the Facebook AI research group in Menlo Park, California, is underpinned by an advanced deep learning neural network. A neural network, as you may already know, is a piece of software that simulates a (very basic) approximation of how real neurons work. Deep learning is one of many methods of performing machine learning; basically, it looks at a huge body of data (for example, human faces) and tries to develop a high-level abstraction (of a human face) by looking for recurring patterns (cheeks, eyebrows, etc.). In this case, DeepFace consists of a bunch of neurons nine layers deep, and then a learning process that sees the creation of 120 million connections (synapses) between those neurons, based on a corpus of four million photos of faces. (Read more about Facebook’s efforts in deep learning.)

Once the learning process is complete, every image that’s fed into the system passes through the synapses in a different way, producing a unique fingerprint at the bottom of the nine layers of neurons. For example, one neuron might simply ask “does the face have a heavy brow?” — if yes, one synapse is followed, if no, another route is taken. This is a very simplistic description of DeepFace and deep learning neural networks, but hopefully you get the idea.
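To make the "fingerprint" idea concrete, here is a toy forward pass in plain Python. It is a sketch only: the layer sizes, the random weights standing in for learned synapses, and the `tanh` activation are all assumptions for illustration, and none of it reflects DeepFace's real architecture.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """One layer's 'synapses': an n_out x n_in grid of connection weights."""
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def forward(layers, pixels):
    """Push an input through every layer; the final activations are the fingerprint."""
    activations = pixels
    for weights in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)))
            for row in weights
        ]
    return activations


rng = random.Random(0)   # fixed seed so the toy network is reproducible
sizes = [16, 12, 8, 4]   # a 16-"pixel" input squeezed down to a 4-number fingerprint
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

image = [rng.random() for _ in range(16)]  # stand-in for a (tiny) face photo
fingerprint = forward(layers, image)
print(len(fingerprint))  # 4
```

In a trained network the weights come from the learning process rather than a random number generator, which is what makes similar faces land on similar fingerprints.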

Sylvester Stallone, going through DeepFace’s forward-facing algorithm. Notice how the slight tilt/angle in (a) is corrected in (g). (d) is the “average” forward-looking face that is used for the transformation. Ignore (h), it’s unrelated.

Anyway, the complexities of machine learning aside, the proof is very much in the eating: DeepFace, when comparing two different photos of the same person’s face, can verify a match with 97.25% accuracy. Humans, performing the same verification test on the same set of photos, scored slightly higher at 97.53%. DeepFace isn’t impacted by varied lighting between the two photos, and photos from odd angles are automatically transformed (using a 3D model of an “average” forward-looking face) so that all comparisons are done with a standardized, forward-looking photo. The research paper indicates that performance — one of the most important factors when discussing the usefulness of a machine learning/computer vision algorithm — is excellent, “closing the vast majority of [the] performance gap.”

Facebook facial recognition fail

Facebook tries to impress upon us that verification (matching two images of the same face) isn’t the same as recognition (looking at a new photo and connecting it to the name of an existing user)… but that’s a lie. DeepFace could clearly be used to trawl through every photo on the internet, and link it back to your Facebook profile (assuming your profile contains photos of your face, anyway). Facebook.com already has a facial recognition algorithm in place that analyzes your uploaded photos and prompts you with tags if a match is made. I don’t know the accuracy of the current system, but in my experience it only really works with forward-facing photos, and can produce a lot of false matches. Assuming the DeepFace team can continue to improve accuracy (and there’s no reason they won’t), Facebook may find itself in possession of some very powerful software indeed. [Research paper: “DeepFace: Closing the Gap to Human-Level Performance in Face Verification”]
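The step from verification to recognition is small in code, too. The sketch below is not DeepFace's method: it assumes each face has already been reduced to a numeric fingerprint, and it uses cosine similarity with an arbitrary threshold as the comparison. The function names, the threshold, and the gallery of named fingerprints are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two fingerprints: 1.0 means they point the same way."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def same_face(fp_a, fp_b, threshold=0.9):
    """Verification: do these two fingerprints belong to the same person?"""
    return cosine_similarity(fp_a, fp_b) >= threshold


def recognize(fp, gallery, threshold=0.9):
    """Recognition: run verification against every known face, keep the best match."""
    best_name, best_score = None, threshold
    for name, known_fp in gallery.items():
        score = cosine_similarity(fp, known_fp)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical gallery of already-tagged users.
gallery = {"alice": [1.0, 0.0, 0.2], "bob": [0.0, 1.0, 0.1]}

print(recognize([0.9, 0.1, 0.2], gallery))   # alice
print(recognize([0.5, 0.5, -0.7], gallery))  # None (no one is close enough)
```

Given a gallery of tagged photos, recognition here is just verification run in a loop, which is the point the paragraph above is making.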

What it chooses to do with that software, of course, remains a mystery. It will obviously eventually be used to shore up the existing facial recognition solution on Facebook.com, ensuring that every photo of you on the social network is connected to your account (even if they don’t show a visible tag). From there, it’s hard to imagine that Zuckerberg and co will keep DeepFace purely confined to Facebook.com; there’s too much money to be earnt by scanning the rest of the public web for matches. Another possibility would be branching out into real-world face tracking: there are obvious applications in security and CCTV, but also in commercial settings, where tracking someone’s real-world shopping habits could be very lucrative. As we’ve discussed before, Facebook (like Google) becomes exponentially more powerful and valuable (both to you and its shareholders) the more it knows about you.