Shooting The Machine Learning Rapids With Open Source

There are many kinds of machine learning, and not all of them are based exclusively on deep neural networks that learn from tagged text, audio, image, and video data to analyze that data and sometimes transpose it into a different form. In the business world, companies have to work with numbers, culled from interactions with millions or billions of customers, and GPU acceleration is just as vital for this style of machine learning as it is for the types mentioned above. Up until now, many of the popular open source machine learning tools have run exclusively on workstations or servers that use CPUs as their processing engines. To be fair, many of these tools support the SIMD engines inside popular CPUs. The Apache Arrow columnar in-memory format, which often underpins the data scientist workbench, is an important one, and the Apache Spark in-memory processing engine has been tweaked to make use of SIMD and vector units and also has other means of acceleration by compiling down to C instead of Java. But with the launch of Rapids, a collection of integrated machine learning tools that are popular among data scientists, Nvidia and the communities that maintain these tools are providing the same kind of GPU acceleration that HPC simulation and modeling and machine learning neural network training have enjoyed for years.