Strip Jupyter Notebook Output
March 22, 2024
Jupyter notebooks without multimedia outputs are friendlier to source control: git cannot meaningfully diff the binary data (e.g., plots, pictures, videos) embedded in notebooks, and such outputs tend to bloat the size of git repositories.
nbconvert
You can use nbconvert to clear the cell outputs of a Jupyter notebook in place.
jupyter nbconvert --clear-output --inplace my_notebook.ipynb
Git automation
You can use Git automation to strip the output automatically on git commit. The following git filter settings keep the full notebooks in your working tree as-is but commit the “clean” version.
In your project folder’s .git/config:
.git/config
[filter "strip-notebook-output"]
    clean = "jupyter nbconvert --clear-output --inplace --stdin --stdout --log-level=ERROR"
And in your project folder’s .gitattributes:
.gitattributes
*.ipynb filter=strip-notebook-output
How this works:
- The attribute tells git to run the filter’s clean action on each notebook file before adding it to the index (staging).
- The filter is our friend nbconvert, set up to read from stdin, write to stdout, strip the output, and only speak when it has something important to say.
- When a file is extracted from the index, the filter’s smudge action is run, but this is a no-op as we did not specify it. You could run your notebook here to re-create the output (nbconvert --execute --inplace).
- Note that if the filter somehow fails, the file will be staged unconverted.
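If you would rather not edit the files by hand, the same settings can be applied from the command line and then checked. The following is a minimal sketch, assuming git is installed; the scratch directory and the variable names (repo, attr_report, clean_cmd) are illustrative and not part of the original setup. In a real project you would skip the first three commands and run the rest from the project folder.

```shell
# Scratch repository so the experiment cannot touch a real project.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Same filter as the .git/config entry, written with `git config`.
git config --local filter.strip-notebook-output.clean \
  "jupyter nbconvert --clear-output --inplace --stdin --stdout --log-level=ERROR"

# Same rule as the .gitattributes entry.
printf '%s\n' '*.ipynb filter=strip-notebook-output' > .gitattributes

# Ask git which filter it would apply to a notebook path
# (the file itself does not need to exist for this check).
attr_report="$(git check-attr filter -- my_notebook.ipynb)"
echo "$attr_report"

# Read back the clean command git has stored for that filter.
clean_cmd="$(git config --get filter.strip-notebook-output.clean)"
echo "$clean_cmd"
```

If the attribute is picked up, the first echo prints my_notebook.ipynb: filter: strip-notebook-output. Note that .git/config is local to each clone, so every collaborator has to set up the filter themselves, whereas .gitattributes can be committed and shared.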
nbstripout
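nbstripout (https://github.com/kynan/nbstripout) is a dedicated tool that strips notebook output itself and can set up the git filter for you automatically (nbstripout --install), so you do not have to edit .git/config and .gitattributes by hand.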