Scoped Storage Stories: listFiles() Woe

Android 10 is greatly restricting access to external storage via filesystem APIs. Instead, we need to use other APIs to work with content. This is the seventh post in a series where we will explore how to work with those alternatives, starting with the Storage Access Framework (SAF).

I thought I was going to switch to covering the MediaStore approach for storing content in Android 10. Torsten Grote, though, had other ideas. 😀

On Twitter, Torsten pointed out a problem with listFiles() in DocumentFile: it might not list the files. This problem serves as a nice illustration of the twisty nature of the Storage Access Framework APIs and how not everything will work the way that you might expect.

Once developers make the leap into using the Storage Access Framework, some come to the unfortunate conclusion that the SAF is only for on-device storage. This is not the case.

Remember that when you use an Intent action like ACTION_OPEN_DOCUMENT_TREE, the user can choose from any document provider available on their device. For many users, there will be few options other than the built-in providers that handle external/removable storage and media. But, some users may install apps that offer other document providers, particularly for cloud-based document stores.

As a result, the Storage Access Framework APIs – as defined in DocumentsContract and DocumentsProvider – are set up to support cloud-based document stores. That manifests itself in many ways, such as the abstract notion of “document tree” (since a cloud provider might not have directories) and a “display name” for content (since a cloud provider might not use filenames).

Under the covers, listFiles() uses buildChildDocumentsUriUsingTree() on DocumentsContract. This returns a Uri that can be queried to get the child documents based on a tree Uri, such as the one that you would get from ACTION_OPEN_DOCUMENT_TREE. That query, in turn, will call queryChildDocuments() on the DocumentsProvider associated with the tree Uri. This returns a Cursor with the child document details… sometimes.
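To make that chain concrete, here is a rough sketch of querying the child documents yourself. It is not the DocumentFile source code. ChildDocument and queryChildren() are names that I made up for illustration, and the sketch assumes that treeUri came straight from ACTION_OPEN_DOCUMENT_TREE, so the root document ID can be read from the tree itself.

```kotlin
import android.content.Context
import android.net.Uri
import android.provider.DocumentsContract

// Hypothetical holder for the columns read below; not part of any Android API.
data class ChildDocument(
    val documentId: String,
    val displayName: String?,
    val mimeType: String?
)

// Hypothetical helper: lists the direct children of the tree's root document.
fun queryChildren(context: Context, treeUri: Uri): List<ChildDocument> {
    // Build the Uri that routes to queryChildDocuments() on the provider.
    val childrenUri = DocumentsContract.buildChildDocumentsUriUsingTree(
        treeUri,
        DocumentsContract.getTreeDocumentId(treeUri)
    )
    val projection = arrayOf(
        DocumentsContract.Document.COLUMN_DOCUMENT_ID,
        DocumentsContract.Document.COLUMN_DISPLAY_NAME,
        DocumentsContract.Document.COLUMN_MIME_TYPE
    )
    val results = mutableListOf<ChildDocument>()

    // query() may return null, so the safe call skips the loop in that case.
    context.contentResolver.query(childrenUri, projection, null, null, null)?.use { cursor ->
        // Look columns up by name rather than trusting the projection order.
        val idIndex = cursor.getColumnIndexOrThrow(DocumentsContract.Document.COLUMN_DOCUMENT_ID)
        val nameIndex = cursor.getColumnIndexOrThrow(DocumentsContract.Document.COLUMN_DISPLAY_NAME)
        val mimeIndex = cursor.getColumnIndexOrThrow(DocumentsContract.Document.COLUMN_MIME_TYPE)

        while (cursor.moveToNext()) {
            results += ChildDocument(
                cursor.getString(idIndex),
                cursor.getString(nameIndex),
                cursor.getString(mimeIndex)
            )
        }
    }
    return results
}
```

As with listFiles(), this does I/O and may involve the network for a cloud provider, so it should be called from a background thread.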

The documentation for queryChildDocuments() contains this note:

If your provider is cloud-based, and you have data cached locally, you may return the local data immediately, setting DocumentsContract#EXTRA_LOADING on Cursor extras to indicate that you are still fetching additional data. Then, when the network data is available, you can send a change notification to trigger a requery and return the complete contents. To return a Cursor with extras, you need to extend and override Cursor#getExtras().

(If you are asking yourself “Cursors have extras?”: yes, a Cursor can have extras, though normally it does not.)

DocumentFile ignores the EXTRA_LOADING flag, neither using it nor returning it to the caller of listFiles(). As a result, users of listFiles() have no way to know that their list of files does not represent the full list, but rather some subset (or even an empty list) while the provider is fetching the details. Torsten indicated that the Nextcloud app has a document provider that will return empty lists at the outset, though I have not attempted to reproduce that behavior. It fits the API, though, so not only might Nextcloud behave like this, other cloud document providers might as well. And, it’s all perfectly legitimate.
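Reading the flag yourself is not much code. The following is a minimal sketch, with isChildListingStillLoading() being a hypothetical helper name of my own. It runs its own query against the child-documents Uri and looks only at the Cursor extras.

```kotlin
import android.content.Context
import android.net.Uri
import android.provider.DocumentsContract

// Hypothetical helper, not part of DocumentFile or any Android API.
// Returns true if the provider reports that it is still fetching children.
fun isChildListingStillLoading(context: Context, treeUri: Uri): Boolean {
    val childrenUri = DocumentsContract.buildChildDocumentsUriUsingTree(
        treeUri,
        DocumentsContract.getTreeDocumentId(treeUri)
    )
    val projection = arrayOf(DocumentsContract.Document.COLUMN_DOCUMENT_ID)

    return context.contentResolver
        .query(childrenUri, projection, null, null, null)
        ?.use { cursor ->
            // Most cursors carry no extras, so treat a missing flag as "not loading".
            cursor.extras?.getBoolean(DocumentsContract.EXTRA_LOADING, false) ?: false
        }
        ?: false // a null Cursor tells us nothing, so assume "not loading"
}
```

Bear in mind that this is a second query, separate from the one that listFiles() makes, so the two results could disagree if the provider finishes loading in between.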

One can imagine some future KTX version of listFiles() that would return a Flow. It would register a ContentObserver to find out about file changes and emit a fresh list on the Flow when changes are detected. Unfortunately, DocumentFile does not have that today.

Lacking that, you still have a few options, including:

Ignore the problem
Ignore the problem but offer some sort of “refresh” option that the user can use to cause you to call listFiles() again, hoping that you will get the actual list of files on a subsequent call
Use listFiles() but also query the buildChildDocumentsUriUsingTree() Uri yourself, just to get the EXTRA_LOADING flag, so you can take that into account
Create your own reactive listFiles() alternative that uses EXTRA_LOADING and a ContentObserver to handle both the case when all the files are listed up front and the case where the list of files is loading
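The last of those options might look something like the sketch below. This is an illustration under assumptions, not a tested library. ChildListing, loadChildren(), and observeChildren() are hypothetical names, the sketch needs the kotlinx.coroutines library, and it assumes that the provider sends its change notification on the child-documents Uri. A provider that sets some other notification Uri on its Cursor would need a different observer registration.

```kotlin
import android.content.ContentResolver
import android.content.Context
import android.database.ContentObserver
import android.net.Uri
import android.os.Handler
import android.os.Looper
import android.provider.DocumentsContract
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.callbackFlow
import kotlinx.coroutines.flow.conflate
import kotlinx.coroutines.flow.flowOn
import kotlinx.coroutines.flow.map

// Hypothetical result type: the children seen so far, plus the loading flag.
data class ChildListing(val children: List<Uri>, val isLoading: Boolean)

// Hypothetical helper: one query for the children and the EXTRA_LOADING flag.
private fun loadChildren(
    resolver: ContentResolver,
    treeUri: Uri,
    childrenUri: Uri
): ChildListing {
    val children = mutableListOf<Uri>()
    var loading = false
    val projection = arrayOf(DocumentsContract.Document.COLUMN_DOCUMENT_ID)

    resolver.query(childrenUri, projection, null, null, null)?.use { cursor ->
        loading = cursor.extras?.getBoolean(DocumentsContract.EXTRA_LOADING, false) ?: false
        while (cursor.moveToNext()) {
            children += DocumentsContract.buildDocumentUriUsingTree(treeUri, cursor.getString(0))
        }
    }
    return ChildListing(children, loading)
}

// Emits a listing right away, then a fresh one whenever the observer fires.
fun Context.observeChildren(treeUri: Uri): Flow<ChildListing> {
    val resolver = contentResolver
    val childrenUri = DocumentsContract.buildChildDocumentsUriUsingTree(
        treeUri,
        DocumentsContract.getTreeDocumentId(treeUri)
    )

    return callbackFlow<Unit> {
        val observer = object : ContentObserver(Handler(Looper.getMainLooper())) {
            override fun onChange(selfChange: Boolean) {
                trySend(Unit) // only signal here; the query happens downstream
            }
        }
        resolver.registerContentObserver(childrenUri, false, observer)
        trySend(Unit) // trigger the initial load
        awaitClose { resolver.unregisterContentObserver(observer) }
    }
        .conflate() // collapse bursts of notifications into one reload
        .map { loadChildren(resolver, treeUri, childrenUri) }
        .flowOn(Dispatchers.IO) // keep the queries off the main thread
}
```

A collector could show a progress indicator while isLoading is true and treat the first emission with isLoading set to false as the complete list. The observer only sends a signal, and the actual query runs in map() on Dispatchers.IO, so the main thread never does the I/O.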

This is one area where the DocumentFile API is tuned more towards local providers and does not handle cloud providers all that well. The downside of offering an API that resembles File is that File works with filesystem contents, and its API does not match what we might want for a cloud-based document store. Perhaps some future library will offer an easy API that is more cloud-aware.


— Dec 14, 2019
