The Repository Pattern

There are lots of possible ways that your app’s data can be stored. It could be local, remote, or both. The local copy could be in SQLite, XML files, JSON files, or other forms. The server could be using REST, GraphQL, gRPC, or something else.

And, on the whole, your UI should not care.

Your UI code has enough problems to deal with. Figuring out where the data comes from (to show the user) and where it goes (after getting input from the user) is more than it should have to bear.

That is where the repository pattern comes into play. In a nutshell: you design a single API that abstracts away the storage details. The repository implementation handles all of the decision-making about where the data goes, what else has to be updated, what has to be refreshed from some remote source, and so on. The API just offers operations like “give me X” and “here is an update to Y”, the basic operations that the UI needs in order to function.

In many respects, the repository pattern is not significantly different from any other abstraction that you might use. However, since data storage and retrieval is usually the reason why the app exists, this pattern deserves some thought.

What the Repository Does

A repository has a few key roles inside of your app.

Manages Data Storage

First and foremost, this is where you isolate all of the details of the data storage, including all the esoteric rules that your app may require (e.g., we have to update the catalog after midnight in the server’s time zone).

The repository is responsible for:

Making any real-time requests of a server that may be necessary, to retrieve data that is not yet available locally
Managing or directing any in-memory cache of that data
Saving the data in a local persistent store, whether on a temporary or long-term basis
Orchestrating any background data transfers that may be necessary (e.g., responding to push requests, periodically synchronizing with a server)

The details of this will vary widely from app to app. Some of those details may be dictated by business requirements. Some of those details may be dictated by the server team. Some of those details might be under your control. As a result, there is no single recipe for implementing a repository — all books like this can do is explain the role, illustrate some implementations, and provide general guidance.
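As one illustration of that decision-making, here is a minimal sketch of a repository that consults the cheapest source first: an in-memory cache, then a local store, then the server. The names LocalStore, RemoteSource, and LayeredRepository are hypothetical, invented for this example; a real app might back them with Room and Retrofit. The sketch shows only the internal lookup order, which is why the method is blocking. How the result gets handed to the UI is a separate concern.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical local persistent store (e.g., something backed by SQLite).
interface LocalStore {
    Optional<String> load(String id);
    void save(String id, String value);
}

// Hypothetical remote source (e.g., a Web service call). May be slow.
interface RemoteSource {
    String fetch(String id);
}

class LayeredRepository {
    private final Map<String, String> memoryCache = new HashMap<>();
    private final LocalStore local;
    private final RemoteSource remote;

    LayeredRepository(LocalStore local, RemoteSource remote) {
        this.local = local;
        this.remote = remote;
    }

    // Blocking lookup: memory first, then the local store, then the server.
    // Whatever a slower layer returns is copied into the faster layers.
    synchronized String loadBlocking(String id) {
        String cached = memoryCache.get(id);
        if (cached != null) {
            return cached;
        }

        Optional<String> stored = local.load(id);
        if (stored.isPresent()) {
            memoryCache.put(id, stored.get());
            return stored.get();
        }

        String fetched = remote.fetch(id);
        local.save(id, fetched);
        memoryCache.put(id, fetched);
        return fetched;
    }
}
```

Code that calls loadBlocking() never learns which of the three layers supplied the value, which is the point of isolating these details.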

Normalizes Model Objects

Your UI code probably will work best with a nice clean object model representing the data that the app needs to allow the user to see and manipulate.

However, it is quite likely that you will not get a clean object model from the data storage code:

Plain SQLite uses Cursor and ContentValues, which do not resemble business objects
Object wrappers around SQLite like Room may impose their own limitations, such as Room’s approach for relations
Web service APIs often cannot directly model some data structures (e.g., M:N relations), requiring some amount of data conversion to craft the desired object model
Some Web service APIs will have further limitations, because the developers of the Web service had a different vision at the time they created the Web service (e.g., older approaches, targeting other platforms)
And so on

Your UI code should not have to deal with any of that.

So, another job of the repository is to normalize whatever the data storage code returns into clean model objects that the UI code can consume. The repository converts those Retrofit POJOs and those Room POJOs (which may not resemble each other) into consistent POJOs that form the object model that the rest of the app uses.
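To make that conversion concrete, here is a small sketch in plain Java. BookResponse stands in for a Retrofit-style POJO and BookEntity for a Room-style POJO. Both classes, their fields, and the BookMapper helper are assumptions made up for this example, with the library annotations omitted so the code stands alone. The two sources disagree on field names and on time units, and the mapper hides that from the rest of the app.

```java
import java.time.Instant;

// Hypothetical shape of a server response (a Retrofit-style POJO).
class BookResponse {
    String bookId;
    String title;
    long publishedEpochSeconds;
}

// Hypothetical shape of a local row (a Room-style POJO).
class BookEntity {
    long rowId;            // local-only detail that the UI should never see
    String serverId;
    String title;
    long publishedMillis;
}

// The one model class that the rest of the app works with.
final class Book {
    final String id;
    final String title;
    final Instant published;

    Book(String id, String title, Instant published) {
        this.id = id;
        this.title = title;
        this.published = published;
    }
}

// Converts each storage-specific shape into the shared model.
final class BookMapper {
    private BookMapper() { }

    static Book fromResponse(BookResponse response) {
        return new Book(response.bookId, response.title,
                Instant.ofEpochSecond(response.publishedEpochSeconds));
    }

    static Book fromEntity(BookEntity entity) {
        return new Book(entity.serverId, entity.title,
                Instant.ofEpochMilli(entity.publishedMillis));
    }
}
```

If the server team later renames a field, only BookResponse and BookMapper need to change, and Book stays as it is.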

Provides a Clean Reactive API

The UI code needs to be able to make generic requests for normalized data, with the repository handling all of the “dirty details” for making that happen.

At the same time, the UI code needs to have the patience to allow the repository to do its work. The responsiveness of the repository could range from microseconds to seconds, depending on a lot of environmental factors:

Is the data that the UI wants in a memory cache? A disk cache?
Do we have to perform a SQL request? How about a network call?
Do we have to do several of these things, because the UI is seeking a big object graph, and our data storage options deal in smaller slices?

Here, “reactive” could mean RxJava, or possibly LiveData. It could be some form of event bus. It could be a callback system. What it has to be is asynchronous — the API exposed by the repository has to force the UI code to receive the data at some time in the future, not immediately.
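As a library-neutral sketch of what “asynchronous” means here, the following uses CompletableFuture from the Java standard library as a stand-in for RxJava or LiveData. The AsyncRepository class and its slowLoad parameter are hypothetical names for this example. The load() method returns right away, and the value shows up later through the future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

class AsyncRepository<T> {
    // A single background thread does the slow work in this sketch.
    private final ExecutorService background = Executors.newSingleThreadExecutor();
    // Stand-in for the disk or network work; supplied by the caller here.
    private final Supplier<T> slowLoad;

    AsyncRepository(Supplier<T> slowLoad) {
        this.slowLoad = slowLoad;
    }

    // Returns immediately. The caller attaches a callback to the future
    // instead of waiting for the value.
    CompletableFuture<T> load() {
        return CompletableFuture.supplyAsync(slowLoad, background);
    }

    // Stops the background thread once the repository is no longer needed.
    void shutdown() {
        background.shutdown();
    }
}
```

UI code would then write something like repository.load().thenAccept(value -> show(value)), where show() is whatever your UI uses to render the data. Note that the callback in this sketch may run on the background thread. Getting the result back onto the UI thread is a separate step that this example leaves out.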

Isolates Rest of App from Strategy Changes

You might be tempted to cut corners on the previous point, and have some APIs exposed by the repository that return immediately. So long as those APIs are set up to gracefully fail (such as returning null if the data is not cached in memory), that can be fine. However, in general, that is still not a good idea, for one simple reason: things change. Today, your implementation might support those immediate-return (“real-time”) APIs. Tomorrow, your implementation might not, for any number of reasons:

You elect to switch to some newer approach that simply lacks an equivalent to the in-memory cache that you are using
You elect to switch to some newer approach that does not offer its own real-time API, and you need to “pass along” the reactive approach
You jettison this particular cache because you keep running out of memory
And so on

If you design a reactive API around a generalized object model, you should be able to change the implementation of the repository without requiring changes in the UI code. The only time that the UI code would change is if the data structure itself changes (e.g., new fields or objects added to the object model).
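Here is a hedged sketch of that kind of swap, again using CompletableFuture as the asynchronous type. TitleRepository, its two implementations, and TitleScreen are all hypothetical names. TitleScreen plays the part of the UI code and works the same way whether or not there is a memory cache behind the interface.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// The only thing the UI code knows about.
interface TitleRepository {
    CompletableFuture<String> titleFor(String id);
}

// Strategy 1: always go to the slow source (disk, network, ...).
class DirectTitleRepository implements TitleRepository {
    private final Function<String, String> slowLookup;

    DirectTitleRepository(Function<String, String> slowLookup) {
        this.slowLookup = slowLookup;
    }

    @Override
    public CompletableFuture<String> titleFor(String id) {
        return CompletableFuture.supplyAsync(() -> slowLookup.apply(id));
    }
}

// Strategy 2: the same API, with a memory cache in front of another repository.
// Assumes that titles are never null, since ConcurrentHashMap rejects nulls.
class CachingTitleRepository implements TitleRepository {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final TitleRepository delegate;

    CachingTitleRepository(TitleRepository delegate) {
        this.delegate = delegate;
    }

    @Override
    public CompletableFuture<String> titleFor(String id) {
        String hit = cache.get(id);
        if (hit != null) {
            // Even a cache hit goes through the asynchronous API.
            return CompletableFuture.completedFuture(hit);
        }
        return delegate.titleFor(id).thenApply(title -> {
            cache.put(id, title);
            return title;
        });
    }
}

// Stand-in for UI code: it depends on the interface, not on a strategy.
class TitleScreen {
    private final TitleRepository repository;

    TitleScreen(TitleRepository repository) {
        this.repository = repository;
    }

    CompletableFuture<String> describe(String id) {
        return repository.titleFor(id).thenApply(title -> "Title: " + title);
    }
}
```

Dropping the cache later means constructing TitleScreen with a DirectTitleRepository instead of a CachingTitleRepository. Nothing inside TitleScreen changes, because a cache hit was never exposed as a special immediate-return call.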

This book is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license.