The tools that make TensorFlow productive
Analytical frameworks come with an entire ecosystem.
Deployment is a big chunk of using any technology, and tools to make deployment easier have always been an area of innovation in computing. For instance, the difficulties and uncertainties of installing software and keeping it up-to-date were one factor driving companies to offer software as a service over the Web. Likewise, big data projects present their own set of issues: how do you prepare and ingest the data? How do you view the choices made by algorithms that are complex and dynamic? Can you use hardware acceleration (such as GPUs) to speed analytics, which may need to operate on streaming, real-time data? Those are just a few deployment questions associated with deep learning.
In the report Considering TensorFlow for the Enterprise, authors Sean Murphy and Allen Leis cover the landscape of tools for working with TensorFlow, currently one of the most popular frameworks for big data analysis. They explain the importance of treating deep learning as an integral part of a business environment, even while acknowledging that many of the techniques are still experimental, and review some useful auxiliary utilities. These exist for all of the major stages of data processing: preparation, model building, and inference (submitting requests to the model), as well as debugging.
Given that the decisions made by deep learning algorithms are notoriously opaque (it’s hard to determine exactly what combinations of features led to a particular classification), one intriguing part of the report addresses the possibility of using TensorBoard to visualize what’s going on in the middle of a neural network. The UI offers you a visualization of the stages in the neural network, and you can see what each stage sends to the next. Thus, some of the mystery in deep learning gets stripped away, and you can explain to your clients some of the reasons that a particular result was reached.
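To make that concrete, the sketch below shows one common way to wire TensorBoard into training. It is a minimal, illustrative example, not something drawn from the report: it assumes TensorFlow 2.x with its bundled Keras API, and the model, synthetic data, and log directory are made up for the demonstration.

```python
# A minimal sketch of wiring TensorBoard into training (assumes TensorFlow 2.x;
# the model, data, and log directory are illustrative, not from the report).
import numpy as np
import tensorflow as tf

# A small classifier whose graph and per-layer activity we want to inspect.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# The TensorBoard callback writes the model graph, weight histograms, and
# training curves to log_dir as training proceeds.
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs/demo",
                                                histogram_freq=1)

# Synthetic data, just to make the sketch runnable end to end.
x = np.random.rand(256, 20).astype("float32")
y = np.random.randint(0, 3, size=(256,))
model.fit(x, y, epochs=3, callbacks=[tensorboard_cb])
```

Running `tensorboard --logdir logs/demo` then serves the UI locally, where you can step through the graph and inspect what each stage emits.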
Another common bottleneck for many companies stems from the size of modern data sets, which often strain every stage of ingestion and processing. One study found that about 20% of businesses handle data sets in the terabyte range, with smaller ranges (gigabytes) being the most common and larger ones (petabytes) quite rare. For the 20% or more wrestling with unwieldy data sets, Murphy and Leis's report is particularly valuable, because specialized tools can connect TensorFlow to the systems that feed data into its analytics, such as Apache Spark. The authors also cover options for hardware acceleration: a good deal of research has gone into specialized hardware that can accelerate deep learning even more than GPUs do.
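One common pattern for bridging that gap, sketched here under assumptions rather than drawn from the report, is to have an upstream system such as Spark write records out as TFRecord files and let TensorFlow's tf.data API stream them into training without loading everything into memory. The file pattern and feature names below are hypothetical.

```python
# A minimal sketch of streaming a large data set into TensorFlow via tf.data.
# Assumption: an upstream system (e.g., Apache Spark) has already written the
# records as TFRecord files; the paths and feature names are hypothetical.
import tensorflow as tf

# Schema of each serialized tf.train.Example in the files.
feature_spec = {
    "features": tf.io.FixedLenFeature([20], tf.float32),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse(record):
    # Decode one serialized Example into a (features, label) pair.
    example = tf.io.parse_single_example(record, feature_spec)
    return example["features"], example["label"]

# Stream, shuffle, and batch records lazily instead of reading them all at once.
files = tf.io.gfile.glob("data/part-*.tfrecord")
dataset = (
    tf.data.TFRecordDataset(files)
    .map(parse, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(10_000)
    .batch(256)
    .prefetch(tf.data.AUTOTUNE)
)

# The dataset can be passed straight to model.fit() or iterated directly.
```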
The essential reason for using artificial intelligence in business is to speed up predictions. To reap the most benefit from AI, therefore, you should find the hardware and software combination best suited to running your analytics. You also want to reduce the time it takes to develop the analytics, which lets you react to changes in fast-moving businesses and reduces the burden on your data scientists. For all of these reasons, understanding the tools associated with TensorFlow makes its use more practical.
This post is part of a collaboration between O’Reilly and TensorFlow. See our statement of editorial independence.