🔖 DataScience in Julia

Data retrieval, manipulation, and storage. 1

  • 🏚️ means the package may not support current versions of Julia.
  • 🏗️ means the package may be a WIP.

Database API

  • DBInterface.jl :: An abstract DBI interface to provide a database-independent API protocol that all database drivers can be expected to comply with.
  • JDBC.jl :: Julia interface to Java database drivers.
  • LevelDB.jl :: Julia interface to Google’s LevelDB key value database.
  • Memcache.jl :: Julia memcached client.
  • ODBC.jl :: A low-level ODBC interface for the Julia programming language. See also: Tabular Data I/O in Julia.
WIP or may not work
  • 🏚️ Accumulo.jl :: Apache Accumulo client.
  • 🏚️ D4M.jl :: A D4M module for Julia. D4M was developed in MATLAB by Dr Jeremy Kepner and his team at Lincoln Labs.
  • 🏚️ DBAPI.jl :: A new database interface proposal.
  • 🏚️ DBPerf.jl :: The code repository that benchmarks all the Julia Database Drivers / Wrappers.
  • 🏚️ kyotocabinet.jl :: Implementation of Kyoto Cabinet in the Julia language.
  • 🏚️ Neo4j.jl :: Messing around with building a Neo4j driver for Julia.
  • 🏚️ SciDB-Julia :: The SciDB-Julia package allows users of Julia to interface with SciDB.
  • 🏚️ ViewDBI.jl :: View-based DBI for Julia.
  • 🏚️ Q.jl :: Julia for kdb+ database.


HDF5 format

  • HDF5.jl :: Save and load data in the HDF5 file format from Julia.
  • JLD2.jl :: HDF5-compatible file format in pure Julia.
WIP or may not work
  • 🏚️ EasyData.jl :: Simple/Fast(+HDF5) solution to writing datasets & plots to file.

NoSQL databases

  • CQLdriver.jl :: A Julia package for interfacing with CQL compliant databases. Used with DataFrames.jl.
  • DataKnots.jl :: An extensible, practical, and coherent algebra of query combinators.
  • LMDB.jl :: A Julia wrapper interface to Lightning Memory-Mapped Database (LMDB) key-value embedded data store developed by Symas for the OpenLDAP Project.
  • Mongo.jl :: Mongo bindings for the Julia programming language.
  • Mongoc.jl :: MongoDB bindings (newer) and a wrapper around libbson, for the Julia language.
  • Redis.jl :: A fully-featured Redis client for the Julia programming language.

Relational Database Management Systems and SQL

  • LibPQ.jl :: A Julia wrapper for the PostgreSQL libpq C library.
  • MySQL.jl :: Julia bindings and helper functions for MariaDB/MySQL C library.
  • Octo.jl :: An SQL query DSL in Julia.
  • SQLite.jl :: Julia interface to the SQLite library with support for operations on DataFrames.
  • SQLStrings.jl :: Provides the @sql_cmd macro, which lets SQL query strings be built with normal-looking string interpolation without risking SQL formatting errors or SQL injection attacks on your application.
WIP or may not work
  • 🏚️ MariaDB.jl :: A wrapper around the MariaDB C connector.
  • 🏚️ MySQL.jl :: MySQL DBI driver that uses the C MySQL API and obeys the DBI.jl protocol.
  • 🏚️ SQLAlchemy.jl :: Wrapper over Python’s SQLAlchemy library.
  • 🏚️ DBI.jl :: Abstract DBI interface meant to provide a database-independent API.
  • 🏚️ Postgres.jl :: Postgres database interface for the Julia language (unmaintained).
  • 🏚️ PostgreSQL.jl :: An interface to PostgreSQL from Julia, maintained from an older fork; use LibPQ.jl instead.
  • 🏚️ DBDSQLite.jl :: DBI-compliant driver for SQLite3.

Loading datasets

  • DataDeps.jl :: Reproducible data setup for reproducible science.
  • FaceDatasets.jl :: A package for easy access to face-related datasets.
  • Faker.jl :: A package that generates fake data.
  • MLDatasets.jl :: Utility package for accessing common machine learning datasets in Julia.
  • PubMedMiner.jl :: Return and analyze a PubMed/Medline search using MESH descriptors and their corresponding UMLS concept.
  • RDatasets.jl :: Julia package for loading many of the datasets available in R.
  • WorldBankData.jl :: A Julia interface for retrieving World Bank data.
WIP or may not work
  • 🏚️ CommonCrawl.jl :: Interface to the Common Crawl dataset on Amazon S3.
  • 🏚️ Maker.jl :: A tool like make for data analysis in Julia.
  • 🏚️ ModelerToolbox.jl :: Utilities for working with many different versions/parameterizations of models.
  • 🏚️ NetflixPrize.jl :: Julia package for handling the Netflix Prize data set of 2006.
  • 🏚️ PublicSuffix.jl :: Julia Interface for working with the Public Suffix List.
  • 🏚️ REDCap.jl :: A Julia frontend for the REDCap API.
  • 🏚️ Socrata.jl :: An API wrapper for accessing the Socrata Open Data API and importing data into a DataFrame. Socrata is an open data platform used by many local and state governments, as well as by the federal government, in the USA.
  • 🏚️ UCIMLRepo.jl :: A small package for easy access to and download of datasets from the UCI ML repository.

Data Manipulation

  • DataFrames.jl :: In-memory tabular data in Julia.
  • DataFramesMeta.jl :: Metaprogramming tools for DataFrames and AbstractDict objects. These macros improve performance and provide more convenient syntax.
  • IndexedTables.jl :: Tabular data structures where some of the columns form a sorted index.
  • Pandas.jl :: A Julia front-end to Python’s Pandas package.
WIP or may not work
  • 🏚️ StructuredQueries.jl :: Data manipulation facilities for Julia.
  • 🏚️ FastGroupBy.jl :: Some helper functions to make some group by operations on DataFrames and IndexedTables faster.


  1. Julia.jl is under COPYRIGHT © 2012-Now SVAKSHA, dual-licensed for the data (ODbL-v1.0+) and the software (AGPLv3+), respectively.