How I Deal With Collections in the Framework

Sun Jul 20 10:30:13 EDT 2014

Tags: java

A post on Christian Güdemann's blog and a followup on Nathan Freeman's made me figure it could be useful to discuss how I deal with the job of working with document collections in the frostillic.us Framework.

First of all: I am not solving the same original problem Christian had. His post discussed strategies for actually opening and processing every document and secondarily building a collection using inadvisable selection formulas for views (namely, @Today). I don't generally do that stuff, so our paths diverge.

The collection code in the Framework is very focused on dancing on top of existing view indexes: using them for collecting documents, for sorting/categorization, and accessing summary data. In many ways, the main Framework collections can be thought of as a re-implementation of the Domino view data source in XPages, except without the same cheating they do and with some side benefits.

Core

The core of a Framework collection - a DominoModelList - is that it stores information about how to access the underlying view, which category filter to use, and which model class to use to create objects. This can be seen in its constructor; it grabs important metadata about the view and that's about it. It's not until it's required that the list does the dirty work of actually fetching a ViewNavigator (or re-using one in the current request). As much as possible, I aim to use ViewNavigators - which is to say, whenever there's not an FT search involved. Navigators are (for Domino) highly efficient, particularly compared to the shockingly-bad performance characteristics of ViewEntryCollection.

The primary consumer of the navigator is the get(int) method (among other things, collections implement List). If you'll kindly ignore some workaround code for a bug I haven't reliably been able to pin on either my code or the OpenNTF API yet, you can see the navigator use in action: unless there's a search in place (in which case it falls back to VEC), the method fetches an active navigator, skips to the appropriate requested entry, and uses getCurrent() to retrieve it. Though the skip/retrieve mix is odd when the average case will be iterating over successive entries, the performance is speedy enough that I haven't felt the need to try to cover both random and sequential access differently.
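To make the lazy-fetch-then-skip pattern concrete, here is a minimal sketch. It is not the Framework's actual code: EntryNavigator, LazyNavigatorList, and InMemoryNavigator are hypothetical names invented for illustration, and the navigator interface is only loosely modeled on the idea of a view navigator so that the example runs without a Domino server.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a view navigator; NOT the real Domino ViewNavigator API.
interface EntryNavigator<E> {
    void gotoFirst();     // reset the cursor to the first entry
    int skip(int count);  // move forward; returns how far the cursor actually moved
    E getCurrent();       // entry under the cursor
    int count();          // total number of entries
}

// A List that only remembers how to open a navigator until an entry is requested.
class LazyNavigatorList<E> extends AbstractList<E> {
    private final Supplier<EntryNavigator<E>> opener;
    private EntryNavigator<E> navigator; // stays null until first access

    LazyNavigatorList(Supplier<EntryNavigator<E>> opener) {
        this.opener = opener;
    }

    private EntryNavigator<E> navigator() {
        if (navigator == null) {
            navigator = opener.get(); // the expensive step, performed once
        }
        return navigator;
    }

    @Override
    public E get(int index) {
        EntryNavigator<E> nav = navigator();
        if (index < 0 || index >= nav.count()) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        // Random access: rewind, skip to the requested position, read the entry.
        nav.gotoFirst();
        nav.skip(index);
        return nav.getCurrent();
    }

    @Override
    public int size() {
        return navigator().count();
    }
}

// In-memory navigator so the sketch can run anywhere (assumption, for demo only).
class InMemoryNavigator<E> implements EntryNavigator<E> {
    private final List<E> entries;
    private int position = 0;

    InMemoryNavigator(List<E> entries) {
        this.entries = entries;
    }

    @Override
    public void gotoFirst() {
        position = 0;
    }

    @Override
    public int skip(int count) {
        int target = Math.max(0, Math.min(position + count, entries.size() - 1));
        int moved = target - position;
        position = target;
        return moved;
    }

    @Override
    public E getCurrent() {
        return entries.get(position);
    }

    @Override
    public int count() {
        return entries.size();
    }
}
```

Constructing a LazyNavigatorList costs nothing beyond storing the supplier; the navigator is opened on the first get(int) or size() call and re-used afterwards.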

Sorting

Since 8.5.0, we have the ability to use click-to-sort columns in the back-end API, and I make use of this to expose sortable columns through TabularDataModel's sorting methods. Once a sorted column is chosen, I pass that along to the underlying view when fetching it. If there's an FT search query specified, I use the surfaced-in-8.5.3 FTSearchSorted method to retain the sorting.

Collapsible Categories

In 9.0.1, IBM added the ability to collapse categories in navigators via the setAutoExpandGuidance method paired with the view's setEnableNoteIDsForCategories. Using TabularDataModel's expand/collapse methods, I maintain a Set of the faux category note IDs generated when setEnableNoteIDsForCategories is enabled and pass them to the navigator when appropriate. This allows my code to deal with arbitrarily-collapsed categories without containing any other special code - considering how uselessly buggy my implementation of this was before 9.0.1, I'm quite happy those methods are there.
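The bookkeeping side of this can be sketched in a few lines. This is an illustrative assumption rather than the Framework's code: CollapsedCategories and its method names are hypothetical, and the step of handing the resulting IDs to the navigator is deliberately left out.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper; the class and method names are illustrative only.
class CollapsedCategories {
    // Faux category note IDs currently collapsed, in the order they were collapsed.
    private final Set<Integer> collapsedNoteIds = new LinkedHashSet<>();

    void collapse(int categoryNoteId) {
        collapsedNoteIds.add(Integer.valueOf(categoryNoteId));
    }

    void expand(int categoryNoteId) {
        collapsedNoteIds.remove(Integer.valueOf(categoryNoteId));
    }

    boolean isCollapsed(int categoryNoteId) {
        return collapsedNoteIds.contains(Integer.valueOf(categoryNoteId));
    }

    // Snapshot of the collapsed IDs, ready to pass along when a navigator is opened.
    int[] toNoteIdArray() {
        return collapsedNoteIds.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

Because the state is just a set of IDs, the collection itself needs no special-case logic for any particular combination of collapsed categories.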

The combination of the sorting and categorization capabilities means that my collections are able to support the same xp:viewPanel UI features that standard xp:dominoView data sources are.

Deferred Data Access

The final key concept in the Framework is the deferral of actually accessing a document until it's necessary. Each model object can be constructed in one of three ways: as a new document in a database, as a wrapper around an existing document, or as a wrapper around a view entry. In the view entry case, the model object doesn't touch the underlying document. Instead, it makes a note of the database path and the UNID (for a non-category entry) in case it DOES need to access the document later, and then stores the entry's column values in a map. While the model objects don't make a user-side distinction between view entries and documents (you can request any item value whether or not it's in the view), they DO use these cached column values first. So if your code only requests values that are present in the view, the underlying document is never accessed at all. This leads to exceedingly-efficient (comparatively) data access without making the user of the objects worry about manually accessing the document if the value isn't in the view.
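A minimal sketch of the view-entry case follows, under stated assumptions: DeferredModel is a hypothetical class, the document is represented as a plain Map, and the loader function stands in for whatever would really open the document by database path and UNID. None of these names come from the Framework itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical model wrapper around a view entry; names are illustrative only.
class DeferredModel {
    private final String databasePath;
    private final String unid;
    private final Map<String, Object> columnValues;
    // Stand-in for opening the real document: (databasePath, unid) -> item values.
    private final BiFunction<String, String, Map<String, Object>> documentLoader;
    private Map<String, Object> document; // null until a non-view value is requested

    DeferredModel(String databasePath, String unid, Map<String, Object> columnValues,
                  BiFunction<String, String, Map<String, Object>> documentLoader) {
        this.databasePath = databasePath;
        this.unid = unid;
        this.columnValues = new HashMap<>(columnValues); // cache a copy of the entry's columns
        this.documentLoader = documentLoader;
    }

    Object getValue(String name) {
        // Cached column values win: no document access if the view already has the value.
        if (columnValues.containsKey(name)) {
            return columnValues.get(name);
        }
        // Otherwise fall back to the document, loading it at most once.
        if (document == null) {
            document = documentLoader.apply(databasePath, unid);
        }
        return document.get(name);
    }

    boolean isDocumentLoaded() {
        return document != null;
    }
}
```

Callers only ever use getValue; whether the answer came from the cached columns or from the document is invisible to them, which is the point of the design.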


The result of all this is that the Framework collections share all the advantages and pitfalls of the underlying views. Some things are easy and fast (categorization, sorting, multi-entry documents, summary data) while some are still impractical or slow (Rich Text, MIMEBeans, arbitrary queries). But you go to production with the database indexer you have, and so far this method has been serving me well.

Cameron Gregor - Sun Jul 20 19:01:52 EDT 2014

Thanks for this post, Jesse. It is good to see how you have adapted your DataModel to fit the viewPanel features.

With deferred loading from viewEntry, this looks like a great idea. I am hoping to implement this in our own framework.
 

Paul Withers - Mon Jul 21 04:07:27 EDT 2014

Great post. I've previously been using dominoView still, but you've won me over. It sounds like this has all the benefits of dominoView with the advantage of separation of data and display layers.

One question, to which I think I know the answer: how do you deal with entries in the view being added / removed / moved from their current position? I suspect there's no difference to the core view index, so still a bit of an elephant in the room.

Jesse Gallagher - Mon Jul 21 11:13:16 EDT 2014

@Paul: For the most part, the answer is "I don't (intentionally)". In my previous big attempt at model collections, I cached the collections like crazy in the application scope and then had to worry about cache invalidation - each object type had to purge any collection that may reference it when an individual model was saved. That got crazy to do reliably and I eventually switched to a model where any change to the data database invalidated all of the cache.

In my current system, it's basically like "view.setAutoUpdate(false)": the collections are meant to be stored loosely and then re-fetched as needed. So I use #{}-binding frequently when using the collections with repeats and other iterator controls. Because it uses the NIF indexes so heavily, the performance ends up good enough even without an extra layer of cache.
