For research purposes I'm trying to crawl the public Docker registry ( https://registry.hub.docker.com/ ) and find out 1) how many layers an average image has and 2) the sizes of these layers to get an idea of the distribution.
However, I have studied the API and the public libraries, as well as the details on GitHub, but I can't find any method to:
- retrieve all the public repositories/images (even if those are thousands I still need a starting list to iterate through)
- find all the layers of an image
- find the size for a layer (so not an image but for the individual layer).
Can anyone help me find a way to retrieve this information?
Thank you!
EDIT: is anyone able to verify that searching for '*' in Docker registry is returning all the repositories and not just anything that mentions '*' anywhere? https://registry.hub.docker.com/search?q=*
user134589
9 Answers
You can find the layers of the images in the folder /var/lib/docker/aufs/layers, provided the storage driver is configured as aufs (the default option).
Example:
To view the layers of the containers that were created from the image 'Ubuntu', go to the /var/lib/docker/aufs/layers directory and cat the file whose name starts with the container ID (here it is 0ca502fa6aae*).
This shows the same result as running the docker history command.
To view the full layer IDs, pass the --no-trunc option to the history command.
Viswesn
Here is a good article about Show Layers of Docker Image
You can first find the image ID:
Then find its layers and their sizes:
Note: I'm using Docker version 1.13.1
Yuci
They have a very good answer here: https://stackoverflow.com/a/32455275/165865
Just run the image below:
sunnycmf
This will inspect the docker image and print the layers:
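The command itself is not shown above. As a minimal sketch in Python, assuming the intent is `docker inspect`, whose JSON output lists the layer digests under RootFS.Layers, something like the following could be used. The function names are made up for illustration, and the image must already be available locally.

```python
import json
import subprocess
from typing import List


def layers_from_inspect(inspect_json: str) -> List[str]:
    """Return the RootFS layer digests found in `docker inspect` JSON output."""
    entries = json.loads(inspect_json)  # `docker inspect` prints a JSON array
    if not entries:
        return []
    # Assumption: each entry carries {"RootFS": {"Layers": [...]}}.
    return entries[0].get("RootFS", {}).get("Layers", [])


def inspect_layers(image: str) -> List[str]:
    """Run `docker inspect` on a local image and return its layer digests."""
    result = subprocess.run(
        ["docker", "inspect", image],
        capture_output=True,
        text=True,
        check=True,
    )
    return layers_from_inspect(result.stdout)
```

Calling inspect_layers("ubuntu") would then return one digest per layer. Note that these are diff_ids, and no sizes are included.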
lvthillo
In my opinion, docker history <image> is sufficient. This returns the size of each layer. What surprised me is that just changing the owner created a huge blob.
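If the sizes are needed in a script (for example to build a distribution), a rough Python sketch could wrap docker history like this. It assumes a Docker CLI recent enough to support the --format and --human=false options of docker history. The function names are hypothetical.

```python
import subprocess
from typing import List, Tuple


def parse_history(output: str) -> List[Tuple[str, int]]:
    """Parse lines of the form "<layer id> <size in bytes>" into pairs."""
    layers = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        layer_id, size = line.split()
        layers.append((layer_id, int(size)))
    return layers


def layer_sizes(image: str) -> List[Tuple[str, int]]:
    """Return (layer id, size in bytes) for each layer of a local image."""
    result = subprocess.run(
        [
            "docker", "history",
            "--no-trunc",        # full layer IDs instead of truncated ones
            "--human=false",     # sizes as plain byte counts
            "--format", "{{.ID}} {{.Size}}",
            image,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_history(result.stdout)
```

Layers of a pulled image often show <missing> as their ID, so the ID column is mostly useful for locally built images. The sizes are reported either way.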
030
1) https://hub.docker.com/search?q=* shows all the images in the entire Docker Hub. It's not possible to get this via the search command, as it doesn't accept wildcards.
2) As of v1.10 you can find all the layers in an image by pulling it and using these commands:
3) The size can be found in /var/lib/docker/image/aufs/layerdb/sha256/{LAYERID}/size, although LAYERID != the diff_ids found with the previous command. For this you need to look at /var/lib/docker/image/aufs/layerdb/sha256/{LAYERID}/diff and compare with the previous command output to properly match the correct diff_id and size.
Piet
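The diff_id-to-size matching described in the answer above could be sketched in Python roughly as follows. This assumes the layerdb layout mentioned there (one directory per LAYERID containing a diff file and a size file) and that the diff file holds the diff_id as plain text. The function name is made up, and reading /var/lib/docker normally requires root.

```python
import os
from typing import Dict


def layerdb_sizes(layerdb_dir: str) -> Dict[str, int]:
    """Map each layer's diff_id to its size in bytes.

    layerdb_dir is expected to look like
    /var/lib/docker/image/aufs/layerdb/sha256, with one subdirectory
    per LAYERID that contains a `diff` and a `size` file.
    """
    sizes = {}
    for layer_id in sorted(os.listdir(layerdb_dir)):
        layer_path = os.path.join(layerdb_dir, layer_id)
        diff_file = os.path.join(layer_path, "diff")
        size_file = os.path.join(layer_path, "size")
        # Skip entries that do not have both files.
        if not (os.path.isfile(diff_file) and os.path.isfile(size_file)):
            continue
        with open(diff_file) as f:
            diff_id = f.read().strip()
        with open(size_file) as f:
            sizes[diff_id] = int(f.read().strip())
    return sizes
```

The resulting dictionary can then be looked up with the diff_ids of an image to get the size of each of its layers.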
You can check out dive, which is written in Go. Awesome tool. You can adjust the source code so that it exports all the info it shows into a JSON file.
Levon
One more tool: https://github.com/CenturyLinkLabs/dockerfile-from-image
For a GUI, use ImageLayers.io.
resultsway
I've solved this problem by using the search function on Docker's website, where '*' is a valid search that returns 200k repositories, and then crawling each individual page. HTML parsing allows me to extract all the image names on each page.
Piet