🔖 Supercomputing in Julia

Supercomputing: HPC, distributed computing, cloud computing, cluster computing, grid computing, parallel computing, hardware architectures (ARM, CUDA, GPU, MIPS), kernels [1]

See also

  • 🏚️ means the package may not support current versions of Julia.
  • 🏗️ means the package may be a WIP.

Concurrency and Parallel Computing


General Concurrency Packages:

  • Actors.jl :: An Actor Model implementation in Julia.
  • FLoops.jl :: Provides the @floop macro, which generates fast generic iteration over complex collections.
  • Folds.jl :: A unified interface for sequential, threaded, and distributed fold.
  • Heptapus.jl :: The roofline function is a translation of the roofline code from "Accelerated finite element flow solvers".
  • Hwloc.jl :: The Portable Hardware Locality (hwloc) package wraps the hwloc library to provide a portable abstraction (across OS, versions, architectures, …) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading.
  • TiledIteration.jl :: Julia package to facilitate writing multithreaded, multidimensional, cache-efficient code.
WIP or may not work
  • 🏚️ Blocks.jl :: A framework to represent chunks of entities and parallel methods on them.
  • 🏚️ ScaLAPACK.jl :: Scalable Linear Algebra PACKage.
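
To make the "threaded fold" idea mentioned for Folds.jl concrete, here is a minimal sketch that uses only Base Julia, not Folds.jl itself. The helper name `threaded_fold` and its keyword arguments are hypothetical. It assumes `op` is associative and that `init` is an identity element for `op`; otherwise the chunked result can differ from a sequential fold.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical helper (not part of any package listed above).
# Splits `xs` into chunks, folds each chunk on its own task, then
# combines the partial results sequentially.
# Assumption: `op` is associative and `init` is its identity element.
function threaded_fold(op, xs::AbstractVector; init, ntasks::Int = nthreads())
    isempty(xs) && return init
    chunk_len = cld(length(xs), ntasks)
    chunks = Iterators.partition(xs, chunk_len)
    tasks = [@spawn foldl(op, chunk; init = init) for chunk in chunks]
    return foldl(op, fetch.(tasks); init = init)
end

total = threaded_fold(+, collect(1:1000); init = 0)
```

Start Julia with several threads (for example `julia -t 4`) to see the chunks run in parallel. With a single thread the code still runs, just sequentially.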

APIs and bindings

  • ArrayFire.jl by @JuliaComputing :: Julia Wrapper for the ArrayFire library.
  • Elly.jl :: Hadoop HDFS and Yarn client.
  • MPI.jl :: MPI wrappers for Julia
  • Slurm.jl :: Experimental Julia interface to Slurm.
WIP or may not work
  • 🏚️ ArrayFire.jl by @hshindo :: Julia bindings for ArrayFire.
  • 🏚️ HDFS.jl :: An interface wrapper over the Hadoop HDFS library that wraps the HDFS C library libhdfs and provides APIs similar to Julia Filesystem APIs which can be used for direct access to HDFS files.
  • 🏚️ OCCA.jl :: Julia interface into OCCA2 by @tcew, an extensible multi-threading programming API written in C++.

Cloud Computing

  • AWS.jl :: Supports the EC2 and S3 APIs, letting you start and stop EC2 instances dynamically.
  • AWSCore.jl :: Amazon Web Services Core Functions and Types.
  • AWSS3.jl :: AWS S3 Simple Storage Service interface for Julia.
  • GCloud.jl :: Tools for working with Google Compute Engine via the cloud CLI.
  • GoogleCloud.jl :: Google Cloud APIs for Julia.

Multiprocessing and Distributed Computing



  • ClusterManagers.jl :: Support for different clustering technologies.
  • Dagger.jl :: A framework for out-of-core and parallel computation and hierarchical Scheduling of DAG Structured Computations.
  • DistributedArrays.jl :: Distributed arrays for Julia.
  • FunHPC.jl :: Functional High-Performance Computing - A high-level API for distributed computing, implemented on top of MPI. Also on Bitbucket.
  • HPAT.jl :: High Performance Analytics Toolkit (HPAT) is a Julia-based framework for big data analytics on clusters.
  • JuliaMPIMonteCarlo.jl :: Illustrative examples using Julia and MPI to do Markov Chain Monte Carlo (MCMC) methods.
  • MessageUtils.jl :: A collection of utilities for messaging.
  • Persist.jl :: The package Persist allows running jobs independent of the Julia shell.
  • Schedulers.jl :: Provides elastic and fault-tolerant parallel map and parallel map-reduce methods. The main difference from Julia’s Distributed.pmap is elasticity: the cluster is permitted to grow and shrink dynamically.
  • SimJulia.jl :: A discrete-event, process-oriented simulation framework written in Julia, inspired by the Python library SimPy.
WIP or may not work
  • 🏚️ ChainedVectors.jl :: Few utility types over Julia Vector type.
  • 🏚️ ClusterDicts.jl :: Global and Distributed dictionaries for Julia.
  • 🏚️ Collectl.jl :: Plotting information from Collectl in Julia.
  • 🏚️ Dispatcher.jl :: Tool for building and executing a computation graph given a series of dependent operations.
  • 🏚️ DispatcherCache.jl :: A task persistency mechanism based on hash-graphs for Dispatcher.jl.
  • 🏚️ Dtree.jl :: Julia wrapper for a distributed dynamic scheduler for HPC applications.
  • 🏚️ Flume.jl :: A port of the Google Flume Data-Parallel Pipeline system.
  • 🏚️ HavenOnDemand.jl :: Julia package to access HPE Haven OnDemand API.
  • 🏚️ hpcc.jl :: Implementation of the HPC Challenge kernels in Julia.
  • 🏚️ IBFS.jl :: Grid simulation solver.
  • 🏚️ LCJC.jl :: Loosely Coupled Julia Clusters.
  • 🏚️ ParallelGLM.jl :: Parallel fitting of GLMs using SharedArrays.
  • 🏚️ PTools.jl :: A collection of utilities for parallel computing in Julia.
  • 🏚️ SGEArrays.jl :: SGEArray implements a simple iterator in Julia to efficiently handle Sun Grid Engine task arrays.
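
For reference, the baseline that the Schedulers.jl entry compares itself against is `pmap` from Julia's Distributed standard library. The following is a minimal sketch of that baseline only. It assumes two local worker processes standing in for a cluster, and `slow_square` is a hypothetical workload.

```julia
using Distributed

# Assumption: two local worker processes stand in for a cluster.
addprocs(2)

# `slow_square` is a hypothetical workload; @everywhere defines it on
# the master process and on every worker.
@everywhere slow_square(x) = (sleep(0.01); x^2)

# pmap distributes the elements across the available workers and
# returns the results in input order.
squares = pmap(slow_square, 1:10)

# Release the workers once the work is done.
rmprocs(workers())
```

The worker pool here is fixed for the duration of the `pmap` call, which is the limitation the elasticity of Schedulers.jl is meant to address.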

SIMD Computing



  • MPIArrays.jl :: Distributed arrays based on MPI one-sided communication.
  • SIMD.jl :: Explicit SIMD vector operations for Julia.
WIP or may not work
  • 🏚️ SIMDPirates.jl :: A library for SIMD intrinsics. The code was stolen from SIMD.jl, whose authors and maintainers deserve credit for most of the good work here. Aside from pirating code, SIMDPirates also provides an @pirate macro that lets you imagine you’re committing type piracy.
  • 🏚️ SIMDVectors.jl :: An experimental package that uses PR #15244 to create a stack-allocated, fixed-size vector that supports SIMD operations. It is very similar in spirit to the SIMD.jl package.
  • 🏚️ Yeppp.jl :: A low-level, high-performance library for vectorized elementwise operations on arrays.
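
The packages above expose explicit SIMD vector types. As a point of comparison, Base Julia can also vectorize a plain loop when it is annotated with `@simd`. The sketch below uses only Base, and the function name `simd_dot` is hypothetical. Whether the compiler actually emits SIMD instructions depends on the CPU and the Julia version, so treat it as an illustration rather than a guarantee.

```julia
# Hypothetical helper: dot product of two Float32 vectors.
# @simd permits the compiler to reorder the floating-point reduction,
# and @inbounds removes bounds checks inside the loop.
function simd_dot(a::Vector{Float32}, b::Vector{Float32})
    length(a) == length(b) || throw(DimensionMismatch("vectors must have equal length"))
    acc = zero(Float32)
    @inbounds @simd for i in eachindex(a, b)
        acc += a[i] * b[i]
    end
    return acc
end

a = fill(2.0f0, 1024)
b = fill(0.5f0, 1024)
result = simd_dot(a, b)
```

Because `@simd` allows reassociation of the sum, results for general floating-point data may differ in the last bits from a strictly sequential loop.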


WIP or may not work
  • 🏚️ RawMutex.jl :: A MUTual EXclusion program object in Julia that allows multiple program threads to share the same resource, such as file access, but not simultaneously.
  • 🏚️ MT-Workloads :: Multi-threaded workloads in Julia.

GPU computing


  • CVortex.jl :: Julia wrapper for GPU accelerated vortex filament and vortex particle methods.
  • CuCountMap.jl :: Fast StatsBase.countmap for small types on the GPU via CUDA.jl
  • CUDA.jl :: This package wraps key functions in the CUDA Driver API.
  • FoldsCUDA.jl :: Provides a Transducers.jl-compatible fold (reduce) implemented using CUDA.jl. This brings the transducers and reducing function combinators implemented in Transducers.jl to the GPU. Furthermore, using FLoops.jl, you can write parallel for loops that run on the GPU.
  • OpenCL.jl :: OpenCL 1.2 Julia bindings - a cross platform parallel computation API for programming parallel devices, with implementations from AMD, Nvidia, Intel, and others, similar in scope to PyOpenCL.
WIP or may not work
  • 🏚️ CLBLAS.jl :: CLBLAS integration for Julia.
  • 🏚️ CUBLAS.jl :: Julia interface to CUBLAS.
  • 🏚️ CUDAnative.jl :: Support for compiling and executing native Julia kernels on CUDA hardware.
  • 🏚️ CUDArt.jl :: Wrapper for CUDA runtime API.
  • 🏚️ CUDNN.jl :: Julia wrapper for the NVIDIA cuDNN GPU deep learning library.
  • 🏚️ CURAND.jl :: Wrapper for NVidia’s cuRAND library.
  • 🏚️ HSA.jl :: Julia Bindings for the HSA Runtime.
  • 🏚️ julia-CuMatrix :: CUDA Matrix library.
  • 🏚️ julia-kernels :: A small suite of tools aimed at being able to write kernels in Julia, which could be executed on the CPU, or as GPU kernels.
  • 🏚️ MXNet.jl :: The dmlc/mxnet Julia package that brings flexible and efficient GPU computing and state-of-the-art deep learning to Julia. MXNet.jl is now part of the Apache MXNet project.
  • 🏚️ Titan.jl :: Write GPU kernels using pure Julia.
  • 🏚️ UberSignals.jl :: Concept for a fast event signal system, using JIT and GPU acceleration, loosely inspired by Reactive.jl.
  • 🏚️ Transpiler.jl :: Transpiling from Julia’s typed AST to CUDA / OpenCL code.




  1. Julia.jl is under COPYRIGHT © 2012-Now SVAKSHA, dual-licensed for the data (ODbL-v1.0+) and the software (AGPLv3+), respectively.