Strip Jupyter Notebook Output

Jupyter notebooks without multimedia outputs are friendlier to source control: git is poor at diffing the binary data (e.g., plots, pictures, videos) embedded in notebook outputs, and that data tends to bloat the size of git repositories.

nbconvert

You can use nbconvert to clear the cell outputs of a Jupyter notebook in place.

jupyter nbconvert --clear-output --inplace my_notebook.ipynb
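To make "clearing the output" concrete, here is a minimal sketch of what it amounts to at the level of the notebook's JSON. It assumes the nbformat 4 layout (code cells carrying `outputs` and `execution_count` fields) and uses only the standard library; `strip_outputs` and the `example` notebook are hypothetical names for illustration, not part of nbconvert, which may also touch some cell metadata.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    # Round-tripping through JSON is a cheap deep copy for JSON-only data.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop plots, text, tracebacks
            cell["execution_count"] = None  # drop the In[n] counter
    return cleaned


# A tiny hand-written notebook (illustrative data only).
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
            "source": ["print('hi')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

stripped = strip_outputs(example)
print(json.dumps(stripped["cells"][1], indent=1))
```

The source of every cell is left untouched; only the generated parts of code cells are reset, which is why the stripped file diffs cleanly.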

Git automation

You can use Git automation to strip the output automatically on git commit. The following git filter settings leave the full notebooks in your working tree as-is but commit the "clean" version.

In your project folder's .git/config:

.git/config
[filter "strip-notebook-output"]
    clean = "jupyter nbconvert --clear-output --inplace --stdin --stdout --log-level=ERROR"

And in your project folder's .gitattributes:

.gitattributes
*.ipynb filter=strip-notebook-output

How this works [1]:

  • The attribute tells git to run the filter's clean action on each notebook file before adding it to the index (staging).
  • The filter is our friend nbconvert, set up to read from stdin, write to stdout, strip the output, and only speak when it has something important to say.
  • When a file is extracted from the index, the filter's smudge action is run, but this is a no-op as we did not specify it. You could run your notebook here to re-create the output (nbconvert --execute --inplace).
  • Note that if the filter somehow fails, the file will be staged unconverted.
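The contract a clean filter has to honor (read the file on stdin, write the cleaned file to stdout, exit non-zero on failure) can be sketched in a few lines of Python. This is an illustration under assumptions, not how nbconvert is implemented: `clean_filter` is a hypothetical name, the notebook is assumed to follow the nbformat 4 JSON layout, and the streams are simulated with `io.StringIO` so the sketch runs without git.

```python
import io
import json


def clean_filter(src, dst) -> int:
    """Read a notebook from src and write it without outputs to dst.

    Returns a process exit code: 0 on success, 1 if src is not valid JSON.
    """
    try:
        notebook = json.load(src)
    except json.JSONDecodeError:
        return 1  # a non-zero exit is how a filter reports failure to git
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    # Nothing is written until parsing has succeeded.
    json.dump(notebook, dst, indent=1)
    dst.write("\n")
    return 0


# Simulate git piping a notebook through the filter.
raw = json.dumps({
    "cells": [{
        "cell_type": "code",
        "execution_count": 1,
        "metadata": {},
        "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
        "source": ["print(1)"],
    }],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})
out = io.StringIO()
status = clean_filter(io.StringIO(raw), out)

# A failing run: the input is not a notebook at all.
bad_status = clean_filter(io.StringIO("not json"), io.StringIO())

# As a real filter script, the entry point would be:
#     sys.exit(clean_filter(sys.stdin, sys.stdout))
```

Because the working-tree file is never modified, only the staged copy loses its outputs, which matches the behavior described in the list above.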

nbstripout

kynan/nbstripout is a Python package that strips notebook output and can set up the git filter for you automatically (nbstripout --install).

nbstripout-fast

nbstripout-fast is a reimplementation of nbstripout in Rust, installable as a Python package, that runs much faster than nbconvert --clear-output.

In your project folder's .git/config:

.git/config
[filter "nbstripout"]
    clean = nbstripout-fast
    # smudge = cat
    # required = true
[diff "ipynb"]
    textconv = nbstripout-fast -t

And in your project folder's .gitattributes:

.gitattributes
*.ipynb filter=nbstripout
*.ipynb diff=ipynb
