Accelerating WinML and NVIDIA Tensor Cores | NVIDIA Technical Blog
Accelerating AI Training with NVIDIA TF32 Tensor Cores | NVIDIA Technical Blog
Understanding Tensor Cores
Deep Tensorized Learning — TensorLy-Torch 0.4.0 documentation
Types of NVIDIA GPU Architectures For Deep Learning
PyTorch 1.0 preview (Dec 6, 2018) packages with full CUDA 10 support for your Ubuntu 18.04 x86_64 systems. - vxlabs
Faster and Memory-Efficient PyTorch models using AMP and Tensor Cores | by Rahul Agarwal | Towards Data Science
Performance Debugging of Production PyTorch Models at Meta | PyTorch
How Fast GPU Computation Can Be. A comparison of matrix arithmetic… | by Andrew Zhu | Towards Data Science
Accelerating Inference Up to 6x Faster in PyTorch with Torch-TensorRT | NVIDIA Technical Blog
Understanding LazyTensor System Performance with PyTorch/XLA on Cloud TPU | PyTorch
Programming Tensor Cores in CUDA 9 | NVIDIA Technical Blog
Mixed Precision Training with NVIDIA Volta
Video Series: Mixed-Precision Training Techniques Using Tensor Cores for Deep Learning | NVIDIA Technical Blog
Tensor Cores: Versatility for HPC & AI | NVIDIA
Working With PyTorch Tensors -- Visual Studio Magazine
Pytorch Tutorial from Basic to Advance Level: A NumPy replacement and Deep Learning Framework that provides maximum flexibility with speed | by Kunal Bhashkar | Medium
Tensor Cores and mixed precision *matrix multiplication* - output in float32 - PyTorch Forums
Pytorch Core Code Research | Miracleyoo