Searching a Book

The above code snippets come from the TimeMachine/RoomFTS sample project, demonstrating the use of FTS virtual tables for searching through large blocks of text. The techniques outlined in this chapter, via this sample, resemble the techniques used in the APK edition of The Busy Coder’s Guide to Android Development, which offered a similar full-text search capability for a few years.

The Book

This sample app, though, does not allow searching through a book on Android app development. Instead, it allows the user to read and search a copy of the Project Gutenberg edition of H. G. Wells’ “The Time Machine”. This book is stored in a series of text files in the assets/ directory. On first run, the app will pour that text into a SQLite database, both in a regular table and in an FTS4 virtual table. It then has two fragments, each with a RecyclerView: BookFragment, which shows the paragraphs of the book, and SearchFragment, which shows the results of a search.

The Data Model

The text files in assets/ are subdivided into chapters, with the chapter title as the core of the filename:

Android Studio, Showing the Contents of assets/

We want to be able to full-text search the prose of those chapters. For that, we would only need an FTS table, as we can get the text itself from the assets. However, to make this sample a bit more realistic, we will pour the book contents into regular SQLite tables, then add an FTS table for searching. That will more closely resemble common FTS scenarios, where the data to be searched is also in SQLite.

To that end, we have a ChapterEntity to model a chapter:

package com.commonsware.android.room.fts;

import android.arch.persistence.room.Entity;
import android.arch.persistence.room.PrimaryKey;

@Entity(tableName = "chapters")
public class ChapterEntity {
  @PrimaryKey(autoGenerate = true)
  long id;
  String title;

  ChapterEntity(String title) {
    this.title=title;
  }
}

Notably, this has the chapter title, but it does not have the chapter prose. Instead, we have a 1:N relation with ParagraphEntity, which represents an individual paragraph from a chapter:

package com.commonsware.android.room.fts;

import android.arch.persistence.room.Entity;
import android.arch.persistence.room.ForeignKey;
import android.arch.persistence.room.Index;
import android.arch.persistence.room.PrimaryKey;

@Entity(tableName="paragraphs",
  foreignKeys=@ForeignKey(entity=ChapterEntity.class, parentColumns="id",
    childColumns="chapterId", onDelete=ForeignKey.CASCADE),
  indices={@Index(value="chapterId")})
public class ParagraphEntity {
  @PrimaryKey
  long sequence;
  String prose;
  long chapterId;

  ParagraphEntity(String prose) {
    this.prose=prose;
  }
}

Here, the sequence is the order in which this paragraph appears in the book overall. By dividing our prose into paragraphs, we can offer per-paragraph FTS searches. This app does not presently use the chapter information, though it could (e.g., section headings in the scrollable book, nav drawer to jump to a particular chapter).

The Database

Our database needs to hold the chapters, paragraphs, and full-text search index. Moreover, when we create the database, we need to set up those chapters, paragraphs, and full-text search index.

This gets a bit tricky, particularly if we want to use Room for as much of this as possible.

The RoomDatabase subclass in this app is BookDatabase, with a declaration that it is tied to ChapterEntity, ParagraphEntity, and a BookStore that is our DAO:

@Database(entities = {ChapterEntity.class, ParagraphEntity.class}, version=1)
abstract public class BookDatabase extends RoomDatabase {
  static final String DB_NAME="time-machine.db";

  abstract public BookStore store();

As with many of the other samples in this book, the BookDatabase is a singleton:

  private static volatile BookDatabase INSTANCE=null;

  synchronized static BookDatabase get(Context ctxt) {
    if (INSTANCE==null) {
      INSTANCE=create(ctxt);
    }

    return INSTANCE;
  }

And, as with many of the other samples, the create() method creates our database:

  private static BookDatabase create(Context ctxt) {
    RoomDatabase.Builder<BookDatabase> b=
      Room.databaseBuilder(ctxt.getApplicationContext(), BookDatabase.class,
        DB_NAME);

    b.addCallback(new Callback() {
      @Override
      public void onCreate(@NonNull SupportSQLiteDatabase db) {
        super.onCreate(db);

        db.execSQL("CREATE VIRTUAL TABLE booksearch USING fts4(sequence, prose)");
      }
    });

    BookDatabase books=b.build();

    populate(ctxt, books);

    return books;
  }

Our Callback gives us control when the database is created, and we directly create a booksearch FTS table using the supplied SupportSQLiteDatabase. This bypasses the need for a Room entity for this table. If this app had more than one schema version, and booksearch was added in a later schema, we would also be registering Migration objects, where one or more of those would also create this booksearch FTS table.

The other departure is the populate() call. In that method, we load the BookDatabase with our chapters and paragraphs, if that data does not already exist. In principle, we could do this in onCreate() of the Callback. However, in that method, all we have is a SupportSQLiteDatabase, meaning that we cannot use our Room entities and DAO and would instead have to write all of the database code by hand. If we do that, we might as well consider dumping Room altogether. So, instead, populate() will check the database to see if we already have our prose loaded, and if not, will load the prose into the database.

Our BookStore has a simple chapterCount() @Query method to return the number of chapters:

  @Query("SELECT COUNT(*) FROM chapters")
  abstract int chapterCount();

populate() starts by calling that method to determine whether our database is empty:

  private static void populate(Context ctxt, BookDatabase books) {
    if (books.store().chapterCount()==0) {
      try {
        AssetManager assets=ctxt.getAssets();
        int sequenceNumber=0;

        for (String path : assets.list("")) {
          List<String> paragraphs=paragraphs(assets.open(path));
          String title=title(path);
          ChapterEntity chapter=new ChapterEntity(title);
          List<ParagraphEntity> paragraphEntities=new ArrayList<>();

          for (String paragraph : paragraphs) {
            paragraphEntities.add(new ParagraphEntity(paragraph));
          }

          sequenceNumber=
            books.store().insert(chapter, paragraphEntities, sequenceNumber);
        }
      }
      catch (IOException e) {
        Log.e("BookDatabase", "Exception reading in assets", e);
      }
    }
  }

If it is empty, we use AssetManager to iterate over those 17 text files in assets/. For each, we use a paragraphs() method to divide the text into paragraphs, with a blank line serving as the delimiter between paragraphs:

  // inspired by https://stackoverflow.com/a/10065920/115145

  private static List<String> paragraphs(InputStream is) throws IOException {
    BufferedReader in=new BufferedReader(new InputStreamReader(is));
    List<String> result=new ArrayList<>();

    try {
      StringBuilder paragraph=new StringBuilder();

      while (true) {
        String line=in.readLine();

        if (line==null) {
          break;
        }
        else if (TextUtils.isEmpty(line)) {
          if (!TextUtils.isEmpty(paragraph)) {
            result.add(paragraph.toString().trim());
            paragraph=new StringBuilder();
          }
        }
        else {
          paragraph.append(line);
          paragraph.append(' ');
        }
      }

      if (!TextUtils.isEmpty(paragraph)) {
        result.add(paragraph.toString().trim());
      }
    }
    finally {
      is.close();
    }

    return result;
  }
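
To see the blank-line splitting in isolation, here is a minimal plain-Java sketch of the same idea. It is not the sample's code: the ParagraphSplitDemo class and its flush() helper are invented for illustration, it takes a String rather than an asset InputStream, and it swaps Android's TextUtils.isEmpty() for String.isEmpty() so that it can run off-device.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone demo; not part of the sample project.
public class ParagraphSplitDemo {
  // Joins consecutive non-empty lines with spaces; an empty line ends a paragraph.
  static List<String> split(String text) {
    List<String> result=new ArrayList<>();
    StringBuilder paragraph=new StringBuilder();

    for (String line : text.split("\\r?\\n", -1)) {
      if (line.isEmpty()) {
        flush(paragraph, result);
      }
      else {
        paragraph.append(line).append(' ');
      }
    }

    flush(paragraph, result); // the last paragraph may lack a trailing blank line

    return result;
  }

  // Adds the pending paragraph, if any, and resets the buffer.
  private static void flush(StringBuilder paragraph, List<String> result) {
    if (paragraph.length()>0) {
      result.add(paragraph.toString().trim());
      paragraph.setLength(0);
    }
  }

  public static void main(String[] args) {
    String text="first line\nsecond line\n\n\nnext paragraph\n";

    // prints [first line second line, next paragraph]
    System.out.println(split(text));
  }
}
```

As in paragraphs(), several blank lines in a row produce no empty paragraphs, and a line holding only spaces does not count as blank.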

We use a title() method to convert the “slug”-style filename into a normal title. It drops the four-character file extension, splits what remains on dashes, skips the first piece, and joins the rest with spaces:

  private static String title(String path) {
    String[] pieces=path.substring(0, path.length()-4).split("-");
    StringBuilder buf=new StringBuilder();

    for (int i=1;i<pieces.length;i++) {
      buf.append(pieces[i]);
      buf.append(' ');
    }

    return buf.toString().trim();
  }
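
A quick way to check what title() produces is to run the same logic outside of Android. In this sketch the TitleDemo class is a made-up wrapper and the filename is invented for illustration; the real asset names may differ.

```java
// Hypothetical stand-alone demo; the filename used in main() is invented.
public class TitleDemo {
  // Same logic as title(): strip a 4-character extension, split on dashes,
  // skip the first piece, and rejoin the rest with spaces.
  static String title(String path) {
    String[] pieces=path.substring(0, path.length()-4).split("-");
    StringBuilder buf=new StringBuilder();

    for (int i=1;i<pieces.length;i++) {
      buf.append(pieces[i]).append(' ');
    }

    return buf.toString().trim();
  }

  public static void main(String[] args) {
    System.out.println(title("02-The-Machine.txt")); // prints "The Machine"
  }
}
```

Because the first dash-separated piece is always skipped, a filename with no dashes at all would yield an empty title.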

We convert the paragraphs into ParagraphEntity instances, then call an insert() method on our BookStore, passing it the ChapterEntity, the list of ParagraphEntity instances to be inserted into our database, and the next sequence number to use (initialized at 0). insert() is supposed to put all of this data into the database and return the next unused sequence number, to be applied in the next pass of the loop.

insert() uses a @Transaction annotation, so it can perform multiple database operations inside of a single transaction:

  @Transaction
  int insert(ChapterEntity chapter, List<ParagraphEntity> paragraphs,
             int startingSequenceNo) {
    long chapterId=insert(chapter);

    for (ParagraphEntity paragraph : paragraphs) {
      paragraph.chapterId=chapterId;
      paragraph.sequence=startingSequenceNo++;
      insertFTS(paragraph);
    }

    insert(paragraphs);

    return startingSequenceNo;
  }

First, insert() calls another insert() method, to insert a ChapterEntity. That is just a standard @Insert method:

  @Insert
  abstract long insert(ChapterEntity chapter);

It returns the primary key used for this ChapterEntity, as the entity has @PrimaryKey(autoGenerate = true) on its long id field. We can then use that, plus an incremented sequence number, to fill in the missing details on the ParagraphEntity.

And, at this point, we cheat.

As was noted earlier in the chapter, we are supposed to use SupportSQLiteDatabase to work with booksearch. In reality, at least with Room 1.1.0, we can use @RawQuery for this. However, @RawQuery requires that its method take a SupportSQLiteQuery as a parameter:

  @RawQuery
  protected abstract long _insert(SupportSQLiteQuery queryish);

That is inconvenient. So, we wrap that in an insertFTS() method that generates the SupportSQLiteQuery:

  void insertFTS(ParagraphEntity entity) {
    _insert(new SimpleSQLiteQuery("INSERT INTO booksearch (sequence, prose) VALUES (?, ?)",
      new Object[] {entity.sequence, entity.prose}));
  }

The _insert() method has the leading underscore and is marked as protected to try to minimize the likelihood that anyone outside of BookStore would use that method. Unfortunately, we cannot make the method be private, because then the Room code generator cannot generate an implementation in a concrete subclass of BookStore.

Once we have iterated over all of the ParagraphEntity instances and added them to the FTS table, we insert them into their own table, using another insert() method:

  @Insert
  abstract void insert(List<ParagraphEntity> paragraphs);

Finally, we return the updated sequence number, for use in a future pass of this insert() method.
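
The hand-off of the sequence number between calls can be easier to follow without Room in the picture. The sketch below is a plain-Java stand-in: SequenceDemo, insertChapter(), and the in-memory list of rows are all invented for illustration, and no database is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in for the insert() bookkeeping; no Room involved.
public class SequenceDemo {
  // Numbers each paragraph starting at startingSequenceNo, records it as
  // "sequence:prose", and returns the first number that is still unused.
  static int insertChapter(List<String> rows, List<String> paragraphs,
                           int startingSequenceNo) {
    for (String prose : paragraphs) {
      rows.add(startingSequenceNo+":"+prose);
      startingSequenceNo++;
    }

    return startingSequenceNo;
  }

  public static void main(String[] args) {
    List<String> rows=new ArrayList<>();
    int next=0;

    next=insertChapter(rows, Arrays.asList("a", "b"), next);
    next=insertChapter(rows, Arrays.asList("c"), next);

    System.out.println(rows+" next="+next); // prints [0:a, 1:b, 2:c] next=3
  }
}
```

Numbering continues across chapters, so each paragraph's sequence is its position in the book as a whole.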

The net: when we open our BookDatabase and start working with it, we will lazy-populate the database, including the FTS table.

The BookStore also has a method for loading our paragraphs, in sequence order, using the paging library to maximize speed and minimize memory usage:

  @Query("SELECT * FROM paragraphs ORDER BY sequence ASC")
  abstract DataSource.Factory<Integer, ParagraphEntity> paragraphs();

The Searches

To search the booksearch FTS table, we turn once again to @RawQuery methods on the BookStore:

  @RawQuery(observedEntities = ParagraphEntity.class)
  protected abstract DataSource.Factory<Integer, BookSearchResult> _search(SupportSQLiteQuery query);

  @RawQuery
  protected abstract List<BookSearchResult> _searchSynchronous(SupportSQLiteQuery query);

There are two such methods, _search() and _searchSynchronous(). The latter is for testing purposes, and it does the database I/O synchronously. The former is for the actual UI, and it uses the paging library.

However, @RawQuery insists upon having an observedEntities property, if your method returns an asynchronous type: DataSource.Factory, LiveData, Observable, etc. This is a fairly bizarre limitation, particularly given the fact that we are supposed to use @RawQuery for FTS tables, which have no associated entities. With luck, this will get addressed someday. In the meantime, we use ParagraphEntity, as being the closest entity that matches this table, and hope that this holds up.

Both of these methods return instances of a POJO named BookSearchResult:

package com.commonsware.android.room.fts;

public class BookSearchResult {
  long sequence;
  String snippet;
}

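The sequence field provides the primary key for the ParagraphEntity, should we want to find the full paragraph. However, mostly, we will use the snippet field, which will supply us with a SQLite-prepared bit of text highlighting our search term in the result. By default, SQLite’s snippet() function marks each match with <b> and </b> tags, which is why SearchFragment renders the snippet through Html.fromHtml().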

As with _insert(), these two methods need a SupportSQLiteQuery, so we wrap them with methods that hide that detail:

  DataSource.Factory<Integer, BookSearchResult> search(String expr) {
    return _search(buildSearchQuery(expr));
  }

  List<BookSearchResult> searchSynchronous(String expr) {
    return _searchSynchronous(buildSearchQuery(expr));
  }

  private SimpleSQLiteQuery buildSearchQuery(String expr) {
    return new SimpleSQLiteQuery("SELECT sequence, snippet(booksearch) AS snippet FROM booksearch WHERE prose MATCH ? ORDER BY sequence ASC",
      new Object[] {expr});
  }

The Repository

Our BookDatabase and BookStore are wrapped in a BookRepository:

package com.commonsware.android.room.fts;

import android.arch.paging.DataSource;
import android.content.Context;
import io.reactivex.Single;

public class BookRepository {
  private static volatile BookRepository INSTANCE=null;
  private final Context ctxt;

  synchronized static BookRepository get(Context ctxt) {
    if (INSTANCE==null) {
      INSTANCE=new BookRepository(ctxt);
    }

    return INSTANCE;
  }

  private BookRepository(Context ctxt) {
    this.ctxt=ctxt.getApplicationContext();
  }

  Single<DataSource.Factory<Integer, ParagraphEntity>> paragraphs() {
    return Single.create(emitter ->
      emitter.onSuccess(BookDatabase.get(ctxt).store().paragraphs()));
  }

  Single<DataSource.Factory<Integer, BookSearchResult>> search(String expr) {
    return Single.create(emitter ->
      emitter.onSuccess(BookDatabase.get(ctxt).store().search(expr)));
  }
}

DataSource.Factory is asynchronous, in that our paragraphs() and search() methods do not perform any database I/O immediately. However, due to our lazy-loading of the book contents into the database, the get() method on BookDatabase does do database I/O, and potentially a fair bit of it (if this is the first run of the app). To avoid StrictMode violations and jank, we need to move BookDatabase.get() onto a background thread. So, the BookRepository wraps the BookStore edition of the paragraphs() and search() methods in its own, returning a Single, so that we can perform the BookDatabase.get() call on a background thread.
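
RxJava is not required to see the underlying idea. As a rough plain-Java analogy, and not what the sample does, the sketch below defers an expensive, lazily-initialized get() to a single background thread using CompletableFuture.supplyAsync(). BackgroundGetDemo, SlowResource, and the "db-thread" name are invented stand-ins; SlowResource plays the role of BookDatabase.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical analogy for the Single wrapper; SlowResource stands in for BookDatabase.
public class BackgroundGetDemo {
  static class SlowResource {
    private static SlowResource INSTANCE=null;
    final String openedOn; // name of the thread that paid the setup cost

    private SlowResource() {
      openedOn=Thread.currentThread().getName();
    }

    synchronized static SlowResource get() {
      if (INSTANCE==null) {
        INSTANCE=new SlowResource(); // imagine first-run database population here
      }

      return INSTANCE;
    }
  }

  // Nothing touches SlowResource.get() until the supplier runs on the executor.
  static CompletableFuture<String> openedOn(ExecutorService background) {
    return CompletableFuture.supplyAsync(() -> SlowResource.get().openedOn,
      background);
  }

  public static void main(String[] args) {
    ExecutorService background=
      Executors.newSingleThreadExecutor(r -> new Thread(r, "db-thread"));

    try {
      System.out.println(openedOn(background).join()); // prints "db-thread"
    }
    finally {
      background.shutdown();
    }
  }
}
```

The caller only receives a handle to a future result, much as the ViewModels receive a Single, so the thread that asks for the data is never the one that pays for opening and populating the database.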

The ViewModels

The BookRepository is used by two AndroidViewModel subclasses.

One — BookViewModel — will be used by the UI that displays the entire book contents:

package com.commonsware.android.room.fts;

import android.app.Application;
import android.arch.lifecycle.AndroidViewModel;
import android.arch.lifecycle.LiveData;
import android.arch.lifecycle.LiveDataReactiveStreams;
import android.arch.lifecycle.Transformations;
import android.arch.paging.DataSource;
import android.arch.paging.LivePagedListBuilder;
import android.arch.paging.PagedList;
import io.reactivex.Single;
import io.reactivex.schedulers.Schedulers;

public class BookViewModel extends AndroidViewModel {
  final LiveData<PagedList<ParagraphEntity>> paragraphs;

  public BookViewModel(Application app) {
    super(app);

    Single<DataSource.Factory<Integer, ParagraphEntity>> liveParagraphs=
      BookRepository.get(app).paragraphs().subscribeOn(Schedulers.single());

    paragraphs=Transformations
      .switchMap(LiveDataReactiveStreams.fromPublisher(liveParagraphs.toFlowable().cache()),
        factory -> new LivePagedListBuilder<>(factory, 50).build());
  }
}

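Here, we:

Ask the BookRepository for the Single wrapping our DataSource.Factory, subscribing on Schedulers.single() so that BookDatabase.get() runs on a background thread

Convert that Single to a Flowable, cache() it, and adapt it to LiveData via LiveDataReactiveStreams.fromPublisher()

Use Transformations.switchMap() to turn the DataSource.Factory into a LiveData of a PagedList, built by a LivePagedListBuilder with a page size of 50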

There is a similar SearchViewModel for processing the results of a search:

package com.commonsware.android.room.fts;

import android.app.Application;
import android.arch.lifecycle.AndroidViewModel;
import android.arch.lifecycle.LiveData;
import android.arch.lifecycle.LiveDataReactiveStreams;
import android.arch.lifecycle.Transformations;
import android.arch.paging.DataSource;
import android.arch.paging.LivePagedListBuilder;
import android.arch.paging.PagedList;
import io.reactivex.Single;
import io.reactivex.schedulers.Schedulers;

public class SearchViewModel extends AndroidViewModel {
  LiveData<PagedList<BookSearchResult>> results;

  public SearchViewModel(Application app) {
    super(app);
  }

  LiveData<PagedList<BookSearchResult>> search(String expr) {
    Single<DataSource.Factory<Integer, BookSearchResult>> liveSearch=
      BookRepository.get(getApplication()).search(expr).subscribeOn(Schedulers.single());

    results=Transformations
      .switchMap(LiveDataReactiveStreams.fromPublisher(liveSearch.toFlowable().cache()),
        factory -> new LivePagedListBuilder<>(factory, 50).build());

    return results;
  }
}

The UI

Our MainActivity is pretty simple. It just loads up a BookFragment on startup. It also offers a search() method so that the BookFragment can request a search, which will result in a SearchFragment being placed on the back stack:

package com.commonsware.android.room.fts;

import android.os.Bundle;
import android.support.v4.app.FragmentActivity;
import android.text.TextUtils;

public class MainActivity extends FragmentActivity {
  @Override
  public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);

    if (getSupportFragmentManager().findFragmentById(android.R.id.content)==null) {
      getSupportFragmentManager().beginTransaction()
        .add(android.R.id.content, new BookFragment())
        .commit();
    }
  }

  public void search(String expr) {
    if (!TextUtils.isEmpty(expr)) {
      getSupportFragmentManager().beginTransaction()
        .replace(android.R.id.content, SearchFragment.newInstance(expr))
        .addToBackStack(null)
        .commit();
    }
  }
}

BookFragment has an action bar with a SearchView in it:

<?xml version="1.0" encoding="utf-8"?>
<menu xmlns:android="http://schemas.android.com/apk/res/android">

  <item
    android:id="@+id/search"
    android:actionViewClass="android.widget.SearchView"
    android:icon="@drawable/ic_search_white_24dp"
    android:showAsAction="ifRoom|collapseActionView"
    android:title="@string/search">
  </item>

</menu>

BookFragment combines that SearchView in the action bar with a RecyclerView to show the paragraphs of the book:

package com.commonsware.android.room.fts;

import android.arch.lifecycle.ViewModelProviders;
import android.arch.paging.PagedListAdapter;
import android.os.Bundle;
import android.support.annotation.Nullable;
import android.support.v7.util.DiffUtil;
import android.support.v7.widget.LinearLayoutManager;
import android.support.v7.widget.RecyclerView;
import android.view.LayoutInflater;
import android.view.Menu;
import android.view.MenuInflater;
import android.view.MenuItem;
import android.view.View;
import android.view.ViewGroup;
import android.widget.SearchView;
import android.widget.TextView;

public class BookFragment extends RecyclerViewFragment implements
  SearchView.OnQueryTextListener, SearchView.OnCloseListener {
  private SearchView sv=null;

  @Override
  public void onCreate(@Nullable Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);

    setHasOptionsMenu(true);
  }

  @Override
  public void onViewCreated(View view, Bundle savedInstanceState) {
    super.onViewCreated(view, savedInstanceState);

    setLayoutManager(new LinearLayoutManager(getActivity()));

    BookViewModel vm=ViewModelProviders.of(this).get(BookViewModel.class);
    final ParagraphAdapter adapter=new ParagraphAdapter(getActivity().getLayoutInflater());

    vm.paragraphs.observe(this, adapter::submitList);

    setAdapter(adapter);
  }

  @Override
  public void onCreateOptionsMenu(Menu menu, MenuInflater inflater) {
    inflater.inflate(R.menu.actions, menu);

    configureSearchView(menu);

    super.onCreateOptionsMenu(menu, inflater);
  }

  @Override
  public boolean onQueryTextChange(String newText) {
    return false;
  }

  @Override
  public boolean onQueryTextSubmit(String query) {
    search(query);

    return true;
  }

  @Override
  public boolean onClose() {
    return true;
  }

  private void configureSearchView(Menu menu) {
    MenuItem search=menu.findItem(R.id.search);

    sv=(SearchView)search.getActionView();
    sv.setOnQueryTextListener(this);
    sv.setOnCloseListener(this);
    sv.setSubmitButtonEnabled(true);
    sv.setIconifiedByDefault(true);
  }

  private void search(String expr) {
    ((MainActivity)getActivity()).search(expr);
  }

  private static class ParagraphAdapter extends PagedListAdapter<ParagraphEntity, RowHolder> {
    private final LayoutInflater inflater;

    ParagraphAdapter(LayoutInflater inflater) {
      super(PARA_DIFF);
      this.inflater=inflater;
    }

    @Override
    public RowHolder onCreateViewHolder(ViewGroup parent, int viewType) {
      return(new RowHolder(inflater.inflate(R.layout.row, parent, false)));
    }

    @Override
    public void onBindViewHolder(RowHolder holder, int position) {
      ParagraphEntity paragraph=getItem(position);

      if (paragraph==null) {
        holder.clear();
      }
      else {
        holder.bind(paragraph);
      }
    }
  }

  private static class RowHolder extends RecyclerView.ViewHolder {
    private final TextView prose;

    RowHolder(View itemView) {
      super(itemView);

      prose=itemView.findViewById(R.id.prose);
    }

    void bind(ParagraphEntity paragraph) {
      prose.setText(paragraph.prose);
    }

    void clear() {
      prose.setText(null);
    }
  }

  static final DiffUtil.ItemCallback<ParagraphEntity> PARA_DIFF=
    new DiffUtil.ItemCallback<ParagraphEntity>() {
      @Override
      public boolean areItemsTheSame(ParagraphEntity oldItem,
                                     ParagraphEntity newItem) {
        return oldItem.sequence==newItem.sequence;
      }

      @Override
      public boolean areContentsTheSame(ParagraphEntity oldItem,
                                        ParagraphEntity newItem) {
        return oldItem.prose.equals(newItem.prose);
      }
  };
}

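Of note:

onViewCreated() obtains the BookViewModel, observes its paragraphs LiveData, and hands each PagedList to the ParagraphAdapter via submitList()

onQueryTextSubmit() forwards the query to search() on MainActivity, which shows the SearchFragment

onBindViewHolder() clears the row when getItem() returns null, which is how the paging library represents a placeholder for a row that has not been loaded yet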

SearchFragment has a similar structure, just without the SearchView:

package com.commonsware.android.room.fts;

import android.arch.lifecycle.ViewModelProviders;
import android.arch.paging.PagedListAdapter;
import android.os.Bundle;
import android.support.v7.util.DiffUtil;
import android.support.v7.widget.DividerItemDecoration;
import android.support.v7.widget.LinearLayoutManager;
import android.support.v7.widget.RecyclerView;
import android.text.Html;
import android.view.LayoutInflater;
import android.view.Menu;
import android.view.MenuInflater;
import android.view.View;
import android.view.ViewGroup;
import android.widget.TextView;

public class SearchFragment extends RecyclerViewFragment {
  private static final String ARG_EXPR="expr";

  static SearchFragment newInstance(String expr) {
    SearchFragment result=new SearchFragment();
    Bundle args=new Bundle();

    args.putString(ARG_EXPR, expr);
    result.setArguments(args);

    return result;
  }

  @Override
  public void onViewCreated(View view, Bundle savedInstanceState) {
    super.onViewCreated(view, savedInstanceState);

    setLayoutManager(new LinearLayoutManager(getActivity()));
    getRecyclerView()
      .addItemDecoration(new DividerItemDecoration(getActivity(),
        LinearLayoutManager.VERTICAL));

    SearchViewModel vm=ViewModelProviders.of(this).get(SearchViewModel.class);
    BookSearchAdapter adapter=
      new BookSearchAdapter(getActivity().getLayoutInflater());

    vm.search(getArguments().getString(ARG_EXPR))
      .observe(this, adapter::submitList);

    setAdapter(adapter);
  }

  private static class BookSearchAdapter extends PagedListAdapter<BookSearchResult, RowHolder> {
    private final LayoutInflater inflater;

    BookSearchAdapter(LayoutInflater inflater) {
      super(SEARCH_DIFF);
      this.inflater=inflater;
    }

    @Override
    public RowHolder onCreateViewHolder(ViewGroup parent, int viewType) {
      return(new RowHolder(inflater.inflate(R.layout.row, parent, false)));
    }

    @Override
    public void onBindViewHolder(RowHolder holder, int position) {
      BookSearchResult result=getItem(position);

      if (result==null) {
        holder.clear();
      }
      else {
        holder.bind(result);
      }
    }
  }

  private static class RowHolder extends RecyclerView.ViewHolder {
    private final TextView prose;

    RowHolder(View itemView) {
      super(itemView);

      prose=itemView.findViewById(R.id.prose);
    }

    void bind(BookSearchResult result) {
      prose.setText(Html.fromHtml(result.snippet));
    }

    void clear() {
      prose.setText(null);
    }
  }

  static final DiffUtil.ItemCallback<BookSearchResult> SEARCH_DIFF=
    new DiffUtil.ItemCallback<BookSearchResult>() {
      @Override
      public boolean areItemsTheSame(BookSearchResult oldItem,
                                     BookSearchResult newItem) {
        // Compare by key, not object identity: each new PagedList holds new instances
        return oldItem.sequence==newItem.sequence;
      }

      @Override
      public boolean areContentsTheSame(BookSearchResult oldItem,
                                        BookSearchResult newItem) {
        return oldItem.snippet.equals(newItem.snippet);
      }
  };
}

In its onViewCreated(), SearchFragment gets the search expression out of the arguments Bundle (placed there via newInstance()) and passes that to the search() method on the SearchViewModel, then observes the results and routes them to the PagedListAdapter.

The Results

When the app is initially launched, you see the first few paragraphs of the book:

BookFragment, As Initially Launched

If the user taps on the search icon and types in a search:

BookFragment, with Search Expression

…then submits it, the search results are shown in a SearchFragment:

SearchFragment, with Search Results

The search expressions can be simple words or anything supported by SQLite’s FTS query syntax, including AND/OR/NOT/NEAR operators.
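
User input can also be massaged before it reaches MATCH. As one illustration, and not something the sample does, the hypothetical helper below turns free-form input into a prefix query by appending * to each whitespace-separated word, so that mach would also match machine. It assumes the input contains plain words only; quoting or escaping FTS operators is out of scope for this sketch.

```java
// Hypothetical helper; not part of the sample project.
public class PrefixQueryDemo {
  // "time mach" -> "time* mach*": each word becomes an FTS prefix term,
  // and FTS treats space-separated terms as an implicit AND.
  static String toPrefixQuery(String userInput) {
    StringBuilder expr=new StringBuilder();

    for (String word : userInput.trim().split("\\s+")) {
      if (!word.isEmpty()) {
        if (expr.length()>0) {
          expr.append(' ');
        }

        expr.append(word).append('*');
      }
    }

    return expr.toString();
  }

  public static void main(String[] args) {
    System.out.println(toPrefixQuery("  time   mach ")); // prints "time* mach*"
  }
}
```

The result could then be passed to search() on the SearchViewModel in place of the raw text from the SearchView.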



This book is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license.