Artificial neural networks with millions of adjustable parameters and a comparable number of training examples are a potential solution to difficult, large-scale pattern recognition problems in areas such as speech and face recognition, classification of large volumes of web data, and finance. The bottleneck is that neural network training involves iterative gradient descent and is extremely computationally intensive. In this paper we present a technique for the distributed training of Ultra Large Scale Neural Networks (ULSNNs) on Bunyip, a Linux-based cluster of 196 Pentium III processors. To illustrate ULSNN training, we describe an experiment in which a neural network with 1.73 million adjustable parameters was trained to recognize machine-printed Japanese characters from a database containing 9 million training patterns. The training ran at an average sustained performance of 163.3 Gflops/s (single precision). At a machine cost of $150,913, this yields a price/performance ratio of 92.4¢/Mflops/s (single precision).
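The quoted price/performance ratio follows directly from the machine cost and the sustained throughput; a one-line check of the arithmetic (values taken from the abstract above):

```python
# Reproduce the abstract's price/performance figure.
machine_cost_dollars = 150_913
sustained_mflops = 163.3e3          # 163.3 Gflops/s expressed in Mflops/s

cents_per_mflops = machine_cost_dollars * 100 / sustained_mflops
print(round(cents_per_mflops, 1))   # 92.4 cents per Mflops/s
```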
Keywords: neural network, Linux cluster, matrix multiply
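The iterative gradient descent described above reduces, per minibatch, to a handful of large matrix multiplies, which is why the keywords pair neural networks with matrix multiplication and why single-precision Gflops/s is the relevant metric. A minimal NumPy sketch of one such training step (the layer sizes here are tiny stand-ins, not the paper's 1.73-million-parameter network, and the two-layer tanh/softmax architecture is assumed for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions; the real network had 1.73 million
# adjustable parameters trained on 9 million character patterns.
n_in, n_hidden, n_out = 64, 32, 10
W1 = (0.1 * rng.standard_normal((n_in, n_hidden))).astype(np.float32)
W2 = (0.1 * rng.standard_normal((n_hidden, n_out))).astype(np.float32)

def forward(X):
    """Forward pass: two dense layers, tanh hidden units, softmax output."""
    H = np.tanh(X @ W1)
    logits = H @ W2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return H, e / e.sum(axis=1, keepdims=True)

def sgd_step(X, Y, lr=0.1):
    """One minibatch gradient-descent step.  The matrix products
    (X @ W1, H @ W2, and the two gradient products below) are the
    flops-dominant kernels that a cluster implementation distributes."""
    global W1, W2
    H, P = forward(X)
    G = (P - Y) / len(X)                    # softmax cross-entropy gradient
    dW2 = H.T @ G
    dW1 = X.T @ ((G @ W2.T) * (1.0 - H**2)) # backprop through tanh
    W2 -= lr * dW2
    W1 -= lr * dW1

# Toy minibatch: random inputs, random one-hot labels.
X = rng.standard_normal((8, n_in)).astype(np.float32)
Y = np.eye(n_out, dtype=np.float32)[rng.integers(0, n_out, 8)]
sgd_step(X, Y)
```

Repeating `sgd_step` on successive minibatches is the iterative loop whose cost motivates the distributed implementation; at scale, each of the four matrix products becomes a call to an optimized single-precision GEMM kernel.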
Ultra Large Scale Neural Network Training at 92¢/Mflops/s on a PIII Cluster