Book Excerpt: Full-Text Indexing and Searching (Part 1)

The following sections are excerpts from Version 6.9 of “The Busy Coder’s Guide to Android Development”, with slight modifications to fit the blog format. Due to the length of the material, it will be split over a few posts.

Standard SQL databases are great for ordinary queries. In particular, when it comes to text, SQL databases are great for finding rows where a certain column value matches a particular string. They are usually pretty good about finding when a column value matches a particular string prefix, if there is an index on that column. Things start to break down when you want to search for an occurrence of a string in a column, as this usually requires a “table scan” (i.e., iteratively examining each row to see if this matches). And getting more complex than that is often impossible, or at least rather difficult.

SQLite, in its stock form, inherits all those capabilities and limitations. However, SQLite also offers full-text indexing, where we can search our database much like how we use a search engine (e.g., “find all rows where this column has both foo and bar in it somewhere”). While a full-text index takes up additional disk space, the speed of the full-text searching is quite impressive.

For example, if you are reading this book using the Android APK edition (instead of the PDF, EPUB, or Kindle/MOBI editions), tap on the SearchView action bar item and search for FTS4. You will get a list of matches back almost instantaneously, despite the fact that you are searching a multi-megabyte book. That is because this book ships a SQLite-powered full-text index of the book’s contents, specifically to power your use of SearchView.

In this series of blog posts, we will review how you can add full-text indexing to your SQLite database.

First, a Word About SQLite Versions

SQLite has evolved since Android’s initial production release in 2008.

In many cases, Android does not incorporate updates to third-party code, for backwards-compatibility reasons (e.g., Apache’s HttpClient). In the case of SQLite, newer Android versions do take on newer versions of SQLite… but the exact version of SQLite that a given version of Android uses is undocumented. Worse, some device manufacturers replace the stock SQLite for a version of Android with a different one.

This Stack Overflow answer contains a mapping of Android OS releases to SQLite versions, including various “anomalies” where manufacturers have elected to ship something else.

In many cases, the SQLite version does not matter. Core SQLite capabilities will have existed since the earliest days of Android. However, full-text indexing did not exist in the first SQLite used by Android, meaning that you will have to pay attention to your minSdkVersion and aim high enough that devices should support the full-text indexing option you choose.

Note that you could use an external SQLite implementation, one that gives you a newer SQLite engine than what might be on the device. For example, SQLCipher for Android ships its own copy of SQLite (with the SQLCipher extensions compiled in), one that is often newer than the one that is baked into the firmware of any given device.

FTS3 and FTS4

There are two full-text indexing options available in SQLite: FTS3 and FTS4. FTS4 can be much faster on certain queries, though overall the speed of the two implementations should be similar. FTS4 has two key limitations:

  1. It may take a bit more disk space for its indexes.

  2. It was added to SQLite 3.7.4, which was only introduced into standard Android in API Level 11.

The sample app for these posts will demonstrate FTS4, as that is available on most Android devices.

Note that the Android developer documentation does not cover FTS3 or FTS4 full-text indexing. The details for the SQL syntax to support these options can be found in the SQLite documentation.

In tomorrow’s post, we will see how to create, populate, and query an FTS3 or FTS4 table.

Nervous about how the newest version of Android affects your app? Consider subscribing, then asking questions in the office hours chats!